2437 Commits (7b66330deada3236e2bcd0ccf3e919ab33a88e28)

Author SHA1 Message Date
  Martin Kroeker b30dc9701f Merge pull request #5215 from annop-w/gemv_t 9 months ago
  Martin Kroeker 2893d0add4 Merge pull request #5211 from guoyuanplct/develop 9 months ago
  Annop Wongwathanarat ec146157d3 Use SVE kernel for S/DGEMVT for SVE machines 10 months ago
  Martin Kroeker 70865a894e Merge pull request #5180 from ywwry66/openmp_use_cmake 9 months ago
  lglglglgy 1ff303f36e Optimizing the Implementation of GEMV on the RISC-V V Extension 9 months ago
  ColumbusAI 7bf848454d Update zsum.c -- fixed spelling error to successfully compile 9 months ago
  Egbert Eich ea6515c4b3 On zarch don't produce objects from assembler with a writable stack section 10 months ago
  Ruiyang Wu 02fd1df10b CMake: Pass `OpenMP` compiler and linker flags through CMake targets 10 months ago
  Ye Tao f27ba5efd1 fix bugs in aarch64 sbgemv_n kernel 10 months ago
  Annop Wongwathanarat edef2e4441 Fix bug in ARM64 sbgemv_t 10 months ago
  Martin Kroeker b55ca71d5b Merge pull request #5182 from annop-w/sgemm_ncopy 10 months ago
  Martin Kroeker 2f778554b8 Merge pull request #5181 from taoye9/change_sbgemn_cast_bf16 10 months ago
  Annop Wongwathanarat 9807f56580 Optimize aarch64 sgemm_ncopy 10 months ago
  Martin Kroeker a3e7b16072 Merge pull request #5157 from manaalmj/feature 10 months ago
  Ye Tao 4c00099ed6 replace customize bf16_to_fp32 with arm neon vcvtah_f32_bf16 10 months ago
  Annop Wongwathanarat a085b6c9ec Fix aarch64 sbgemv_t compilation error for GCC < 13 10 months ago
  manjam01 5c4e38ab17 Optimize gemv_n_sve kernel 11 months ago
  Martin Kroeker 1d5ed5c46b Merge pull request #5168 from taoye9/add_sbgemvn_on_neonversen2 11 months ago
  Ye Tao 6b8b35cdf2 fix minior issues of redeclaration of float x0,x1 in sbgemv_n_neon.c 11 months ago
  Ye Tao 38ee7c9301 Add dispatch of SBGEMVNKERNEL for NEOVERSEN2 and NEOVERSEV2 11 months ago
  Martin Kroeker 2b941c44b5 Merge branch 'develop' into sbgemv_n_neon 11 months ago
  Ye Tao 35bdbca153 Add sbgemv_n_neon kernel for arm64. 11 months ago
  Annop Wongwathanarat edaf51dd99 Add sbgemv_t_bfdot kernel for ARM64 11 months ago
  Martin Kroeker 77fba0f400 Fix "dummy2" flag handling 11 months ago
  Martin Kroeker eb84aac7ad Merge pull request #5084 from quic/topic/sgemm_direct_sme1 11 months ago
  Martin Kroeker b9ae246f20 define USE_TRMM for RISCV64 targets as well 11 months ago
  Vaisakh K V f66ca05b31 Merge branch 'develop' into topic/sgemm_direct_sme1 11 months ago
  Vaisakh K V d23eb3b93e Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API 1 year ago
  Martin Kroeker 8d487ef6eb Merge pull request #5124 from XiWeiGu/LoongArch64-LA264-lapack-fixed 11 months ago
  Martin Kroeker 81eed868b6 Restore the non-vectorized code from before PR4880 for POWER8 11 months ago
  Martin Kroeker 98b5ef929c Restore the non-vectorized code from before PR4880 for POWER8 11 months ago
  gxw 2c4a5cc6e6 LoongArch64: Fixed snrm2_lsx.S and cnrm2_lsx.S 11 months ago
  gxw 9e75d6b3d1 LoongArch64: Fixed swap_lsx.S 11 months ago
  gxw e8c740368c LoongArch64: Fixed rot_lsx.S ane crot_lsx.S 11 months ago
  Hao Chen c2212d0abd LoongArch64: Fixed copy_lsx.S 11 months ago
  Hao Chen 7f1ebc7ae6 LoongArch64: Fixed iamax_lsx.S 11 months ago
  Hao Chen 31d326f895 LoongArch64: Fixed dot_lsx.S 1 year ago
  Hao Chen 5d6356bc16 LoongArch64: Fixed amax_lsx.S 1 year ago
  Ye Tao c748e6a338 optimized sbgemm kernel for neoverse-v1 (sve-256) 1 year ago
  Aditya Tewari 4379a6fbe3 * checkpoint sbgemm for SVE-256 1 year ago
  Martin Kroeker d7036cfd74 Remove trailing blanks that break the cmake parser 1 year ago
  Martin Kroeker 6e393a5599 Merge branch 'develop' into gemv_t 1 year ago
  Martin Kroeker 876ba58e28 Merge pull request #5091 from goplanid/develop 1 year ago
  Martin Kroeker 180ba5e7d0 Merge pull request #5069 from tingboliao/dev_rotm_20250107 1 year ago
  Deeksha Goplani d1bfa979f7 small gemm kernel packing modifications 1 year ago
  Martin Kroeker 1a6a9fb22f add another generator line for rotm 1 year ago
  Martin Kroeker 4924319c50 fix position of srotm, qrotm 1 year ago
  Martin Kroeker b58cba9eb6 fix qrotm build rules 1 year ago
  tingbo.liao 3c8df6358f Further rearranged the rotm kernel for the different architectures. 1 year ago
  Annop Wongwathanarat c0318cea6e Simplify gemv_t_sve_v1x3 kernel 1 year ago