110 Commits (ce79fe12fdacdfd5d48c4a61a08f86aa6170eae9)

Author SHA1 Message Date
  yuanjia c2cc7a3602 riscv64: optimize gemv_t_vector.c 8 months ago
  Martin Kroeker 9d6df1dd3e Merge pull request #5422 from ChipKerchner/addRVVVectorizedPacking 9 months ago
  Chip Kerchner 64401b4417 Disable vectorized packing for DGEMM - since it is slower than scalar. 9 months ago
  Chip Kerchner c00afc86a6 Add and use vectorized packing to ZVL128B and ZVL256B. Up to 3x+ faster than generic scalar functions. 9 months ago
  Chip Kerchner 72f082f31d Fix bad vector zero initializer and other compiler warnings for RISC-V. 9 months ago
  Martin Kroeker e2d941e9af Declare the "small" kernel static in addition to inline 10 months ago
  Martin Kroeker 8214700930 Declare the "small" kernel static in addition to inline 10 months ago
  Martin Kroeker d96daa220d Merge pull request #5290 from Srangrang/develop 10 months ago
  Srangrang ec14e1648c fix: resolve non-RISCV host build failed issue 11 months ago
  Martin Kroeker 73af02b89f use dummy2 as Inf/NAN handling flag 11 months ago
  Martin Kroeker f18b7a46bf add dummy2 flag handling for inf/nan agnostic zeroing 11 months ago
  guoyuanplct 2ae019161a fixed the performance problem in RISCV64_ZVL256 when OPENBLAS_K is small 11 months ago
  Srangrang fb89820f20 Merge branch 'develop' of https://github.com/Srangrang/OpenBLAS into develop 11 months ago
  Srangrang 4e1a381e5b fix: resolve the compilation failure without zfh instruction 11 months ago
  gkdddd 670ec6f757 Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B 11 months ago
  guoyuanplct d2003dc886 del lines 11 months ago
  guoyuanplct 45fd2d9b07 Optimized the axpby function. 11 months ago
  Srangrang 2996c25c94 add shgemm for RISCV_ZVL128B 11 months ago
  guoyuanplct be9f7550b5 Format Code 1 year ago
  guoyuanplct 4d213653d8 kernel/riscv64:Added support for omatcopy on riscv64. 1 year ago
  guoyuanplct 9a7e3f102b kernel/riscv64:Fixed the bug of openblas_utest_ext failing in c/zgemv and some c/zgbmv tests: 1 year ago
  guoyuanplct 11ffc8680e Format the code 1 year ago
  guoyuanplct 7616c42095 Optimized RVV_ZVL256B Implementation of zgemv_n 1 year ago
  lglglglgy 1ff303f36e Optimizing the Implementation of GEMV on the RISC-V V Extension 1 year ago
  Martin Kroeker 180ba5e7d0 Merge pull request #5069 from tingboliao/dev_rotm_20250107 1 year ago
  tingbo.liao 3c8df6358f Further rearranged the rotm kernel for the different architectures. 1 year ago
  tingbo.liao ef7f54b357 Optimized the gemm_tcopy_8_rvv to be compatible with the vlens 128 and 256. 1 year ago
  tingbo.liao 0a5dbf13d3 Optimize the omatcopy_cn and zomatcopy_cn kernels with RVV 1.0 intrinsic. 1 year ago
  tingbo.liao c37509c213 Optimize the nrm2_rvv function to further improve performance. 1 year ago
  tingbo.liao 0bea1cfd9d Optimize the zgemm_tcopy_4_rvv function to be compatible with the situations where the vector lengths(vlens) are 128 and 256. 1 year ago
  tingbo.liao d00cc400b1 Replaced the __riscv_vid_v_i32m2 and __riscv_vid_v_i64m2 with __riscv_vid_v_u32m2 and __riscv_vid_v_u64m2 for riscv64-unknown-linux-gnu-gcc compiling. 1 year ago
  Martin Kroeker a875304eb0 fix inverted conditional for NAN handling 1 year ago
  Martin Kroeker f5d04318e3 Merge branch 'OpenMathLib:develop' into scalfixes 1 year ago
  Martin Kroeker a815594fd1 Merge pull request #4801 from markdryan/markdryan/riscv-dynamic-arch 1 year ago
  Martin Kroeker 2020569705 fix NAN handling and make it depend on dummy2 parameter 1 year ago
  Martin Kroeker 3870995f01 make NAN handling depend on dummy2 parameter 1 year ago
  Martin Kroeker 7284c533b5 make NAN handling depend on dummy2 parameter 1 year ago
  Mark Ryan 67bf4b6998 Fix axpby_rvv kernels for cases where inc_y = 0 1 year ago
  Mark Ryan 3b715e6162 Add autodetection for riscv64 1 year ago
  Martin Kroeker c1019d5832 Handle INF and NAN in inputs 1 year ago
  Martin Kroeker 516743f7dc fix other instances of mishandling INF 1 year ago
  Martin Kroeker cf80bd8500 Update nrm2_rvv.c 2 years ago
  Martin Kroeker 9baa757905 Update nrm2_vector.c 2 years ago
  Martin Kroeker 18a6db6862 Update nrm2_vector.c 2 years ago
  Martin Kroeker 3752e73919 handle incx < 0 2 years ago
  Martin Kroeker db70c7f7fb handle incx < 0 2 years ago
  Martin Kroeker dee8557d58 handle incx < 0 2 years ago
  Martin Kroeker d9dff17aec handle incx < 0 2 years ago
  Martin Kroeker 6b89e1f1d7 fix loop condition for incx < 0 2 years ago
  Martin Kroeker 20016a0096 fix loop condition for incx < 0 2 years ago