109 Commits (1ee8879c787c19b6a6c092def2016e76e93ffefd)

Author SHA1 Message Date
  Martin Kroeker 9d6df1dd3e Merge pull request #5422 from ChipKerchner/addRVVVectorizedPacking 5 months ago
  Chip Kerchner 64401b4417 Disable vectorized packing for DGEMM - since it is slower than scalar. 5 months ago
  Chip Kerchner c00afc86a6 Add and use vectorized packing to ZVL128B and ZVL256B. Up to 3x+ faster than generic scalar functions. 5 months ago
  Chip Kerchner 72f082f31d Fix bad vector zero initializer and other compiler warnings for RISC-V. 6 months ago
  Martin Kroeker e2d941e9af Declare the "small" kernel static in addition to inline 6 months ago
  Martin Kroeker 8214700930 Declare the "small" kernel static in addition to inline 6 months ago
  Martin Kroeker d96daa220d Merge pull request #5290 from Srangrang/develop 7 months ago
  Srangrang ec14e1648c fix: resolve non-RISCV host build failed issue 7 months ago
  Martin Kroeker 73af02b89f use dummy2 as Inf/NAN handling flag 7 months ago
  Martin Kroeker f18b7a46bf add dummy2 flag handling for inf/nan agnostic zeroing 7 months ago
  guoyuanplct 2ae019161a fixed the performance problem in RISCV64_ZVL256 when OPENBLAS_K is small 8 months ago
  Srangrang fb89820f20 Merge branch 'develop' of https://github.com/Srangrang/OpenBLAS into develop 8 months ago
  Srangrang 4e1a381e5b fix: resolve the compilation failure without zfh instruction 8 months ago
  gkdddd 670ec6f757 Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B 8 months ago
  guoyuanplct d2003dc886 del lines 8 months ago
  guoyuanplct 45fd2d9b07 Optimized the axpby function. 8 months ago
  Srangrang 2996c25c94 add shgemm for RISCV_ZVL128B 8 months ago
  guoyuanplct be9f7550b5 Format Code 8 months ago
  guoyuanplct 4d213653d8 kernel/riscv64:Added support for omatcopy on riscv64. 8 months ago
  guoyuanplct 9a7e3f102b kernel/riscv64:Fixed the bug of openblas_utest_ext failing in c/zgemv and some c/zgbmv tests: 8 months ago
  guoyuanplct 11ffc8680e Format the code 9 months ago
  guoyuanplct 7616c42095 Optimized RVV_ZVL256B Implementation of zgemv_n 9 months ago
  lglglglgy 1ff303f36e Optimizing the Implementation of GEMV on the RISC-V V Extension 10 months ago
  Martin Kroeker 180ba5e7d0 Merge pull request #5069 from tingboliao/dev_rotm_20250107 1 year ago
  tingbo.liao 3c8df6358f Further rearranged the rotm kernel for the different architectures. 1 year ago
  tingbo.liao ef7f54b357 Optimized the gemm_tcopy_8_rvv to be compatible with the vlens 128 and 256. 1 year ago
  tingbo.liao 0a5dbf13d3 Optimize the omatcopy_cn and zomatcopy_cn kernels with RVV 1.0 intrinsic. 1 year ago
  tingbo.liao c37509c213 Optimize the nrm2_rvv function to further improve performance. 1 year ago
  tingbo.liao 0bea1cfd9d Optimize the zgemm_tcopy_4_rvv function to be compatible with the situations where the vector lengths(vlens) are 128 and 256. 1 year ago
  tingbo.liao d00cc400b1 Replaced the __riscv_vid_v_i32m2 and __riscv_vid_v_i64m2 with __riscv_vid_v_u32m2 and __riscv_vid_v_u64m2 for riscv64-unknown-linux-gnu-gcc compiling. 1 year ago
  Martin Kroeker a875304eb0 fix inverted conditional for NAN handling 1 year ago
  Martin Kroeker f5d04318e3 Merge branch 'OpenMathLib:develop' into scalfixes 1 year ago
  Martin Kroeker a815594fd1 Merge pull request #4801 from markdryan/markdryan/riscv-dynamic-arch 1 year ago
  Martin Kroeker 2020569705 fix NAN handling and make it depend on dummy2 parameter 1 year ago
  Martin Kroeker 3870995f01 make NAN handling depend on dummy2 parameter 1 year ago
  Martin Kroeker 7284c533b5 make NAN handling depend on dummy2 parameter 1 year ago
  Mark Ryan 67bf4b6998 Fix axpby_rvv kernels for cases where inc_y = 0 1 year ago
  Mark Ryan 3b715e6162 Add autodetection for riscv64 1 year ago
  Martin Kroeker c1019d5832 Handle INF and NAN in inputs 1 year ago
  Martin Kroeker 516743f7dc fix other instances of mishandling INF 1 year ago
  Martin Kroeker cf80bd8500 Update nrm2_rvv.c 1 year ago
  Martin Kroeker 9baa757905 Update nrm2_vector.c 1 year ago
  Martin Kroeker 18a6db6862 Update nrm2_vector.c 1 year ago
  Martin Kroeker 3752e73919 handle incx < 0 1 year ago
  Martin Kroeker db70c7f7fb handle incx < 0 1 year ago
  Martin Kroeker dee8557d58 handle incx < 0 1 year ago
  Martin Kroeker d9dff17aec handle incx < 0 1 year ago
  Martin Kroeker 6b89e1f1d7 fix loop condition for incx < 0 1 year ago
  Martin Kroeker 20016a0096 fix loop condition for incx < 0 1 year ago
  Sergei Lewis ba17758c02 fix axpy implementations where y has a stride of 0 1 year ago