326 Commits (57bb46bedfca77fdbce2a480cb3949ca5ea9ab91)

Author SHA1 Message Date
  abhishek-fujitsu 9c02cdb073 optimise dot using thread throttling for NEOVERSE V1 10 months ago
  Martin Kroeker d0e8fd6d40 Merge pull request #5239 from annop-w/gemv_n_sve 9 months ago
  Iha, Taisei 08b5c18d70 fixed a potential out-of-bounds on gemv. 9 months ago
  Annop Wongwathanarat e11744a411 Use SVE kernel for S/DGEMVN for SVE machines 9 months ago
  Martin Kroeker dd38b4e811 Merge pull request #5225 from annop-w/gemv_n 9 months ago
  Martin Kroeker 0241d516f6 Merge pull request #5220 from iha-taisei/sdgemv_n_unroll 9 months ago
  Annop Wongwathanarat d535728803 Improve performance for SGEMVN on NEONVERSEN1 9 months ago
  Usui, Tetsuzo d711906e3e Add symv kernels for arm64 9 months ago
  Iha, Taisei f1e628b889 Further performance improvements to [SD]GEMV. 9 months ago
  Annop Wongwathanarat ec146157d3 Use SVE kernel for S/DGEMVT for SVE machines 10 months ago
  Ye Tao f27ba5efd1 fix bugs in aarch64 sbgemv_n kernel 10 months ago
  Annop Wongwathanarat edef2e4441 Fix bug in ARM64 sbgemv_t 10 months ago
  Martin Kroeker b55ca71d5b Merge pull request #5182 from annop-w/sgemm_ncopy 10 months ago
  Martin Kroeker 2f778554b8 Merge pull request #5181 from taoye9/change_sbgemn_cast_bf16 10 months ago
  Annop Wongwathanarat 9807f56580 Optimize aarch64 sgemm_ncopy 10 months ago
  Martin Kroeker a3e7b16072 Merge pull request #5157 from manaalmj/feature 10 months ago
  Ye Tao 4c00099ed6 replace customize bf16_to_fp32 with arm neon vcvtah_f32_bf16 10 months ago
  Annop Wongwathanarat a085b6c9ec Fix aarch64 sbgemv_t compilation error for GCC < 13 10 months ago
  manjam01 5c4e38ab17 Optimize gemv_n_sve kernel 11 months ago
  Martin Kroeker 1d5ed5c46b Merge pull request #5168 from taoye9/add_sbgemvn_on_neonversen2 11 months ago
  Ye Tao 6b8b35cdf2 fix minior issues of redeclaration of float x0,x1 in sbgemv_n_neon.c 11 months ago
  Ye Tao 38ee7c9301 Add dispatch of SBGEMVNKERNEL for NEOVERSEN2 and NEOVERSEV2 11 months ago
  Martin Kroeker 2b941c44b5 Merge branch 'develop' into sbgemv_n_neon 11 months ago
  Ye Tao 35bdbca153 Add sbgemv_n_neon kernel for arm64. 11 months ago
  Annop Wongwathanarat edaf51dd99 Add sbgemv_t_bfdot kernel for ARM64 11 months ago
  Vaisakh K V f66ca05b31 Merge branch 'develop' into topic/sgemm_direct_sme1 11 months ago
  Vaisakh K V d23eb3b93e Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API 1 year ago
  Ye Tao c748e6a338 optimized sbgemm kernel for neoverse-v1 (sve-256) 1 year ago
  Aditya Tewari 4379a6fbe3 * checkpoint sbgemm for SVE-256 1 year ago
  Martin Kroeker 6e393a5599 Merge branch 'develop' into gemv_t 1 year ago
  Martin Kroeker 876ba58e28 Merge pull request #5091 from goplanid/develop 1 year ago
  Martin Kroeker 180ba5e7d0 Merge pull request #5069 from tingboliao/dev_rotm_20250107 1 year ago
  Deeksha Goplani d1bfa979f7 small gemm kernel packing modifications 1 year ago
  tingbo.liao 3c8df6358f Further rearranged the rotm kernel for the different architectures. 1 year ago
  Annop Wongwathanarat c0318cea6e Simplify gemv_t_sve_v1x3 kernel 1 year ago
  Martin Kroeker 87083fdbf6 [WIP] Work around assembler limitations in current LLVM for Windows on Arm (#5076) 1 year ago
  Martin Kroeker 229d8a025e Merge pull request #4959 from CDAC-Bengaluru/level-1-sve 1 year ago
  SushilPratap04 3368a4e697 Update swap_kernel_sve.c 1 year ago
  CDAC-SSDG dd71e4234a Added Updated swap and rot sve kernels. 1 year ago
  CDAC-SSDG 06ffd411a5 Update KERNEL.ARMV8SVE 1 year ago
  CDAC-SSDG 765850194e Delete kernel/arm64/swap_kernel_sve.c 1 year ago
  CDAC-SSDG c17c19fbcf Delete kernel/arm64/swap_kernel_c.c 1 year ago
  CDAC-SSDG f6416c0e37 Delete kernel/arm64/swap.c 1 year ago
  CDAC-SSDG 3b7b74664c Delete kernel/arm64/scal_kernel_sve.c 1 year ago
  CDAC-SSDG 95a97012e8 Delete kernel/arm64/scal_kernel_c.c 1 year ago
  CDAC-SSDG 5540f2121e Delete kernel/arm64/scal.c 1 year ago
  CDAC-SSDG f62519cc87 Delete kernel/arm64/rot_kernel_sve.c 1 year ago
  CDAC-SSDG 10857c9df4 Delete kernel/arm64/rot_kernel_c.c 1 year ago
  CDAC-SSDG b9f51a5cf7 Delete kernel/arm64/rot.c 1 year ago
  Martin Kroeker 81666de4ef Merge pull request #5007 from martin-frbg/issue5006 1 year ago