334 Commits (be68ef03b4b7ec03c268dfc68aaa6b0b87f89ae9)

Author SHA1 Message Date
  davidz-ampere be68ef03b4 Add support for Ampere processors 7 months ago
  Martin Kroeker 58eeb9041c fix handling of dummy2 8 months ago
  Martin Kroeker 1589d0b21e Merge pull request #5281 from martin-frbg/zscal_arm64 8 months ago
  Arne Juul 5442aff218 Accumulate results in output register explicitly 8 months ago
  Martin Kroeker 28f8fdaf0f support flag for NaN/Inf handling and fix scaling of NaN/Inf values 8 months ago
  Martin Kroeker 5141a90993 Fix ARMV9SME target in DYNAMIC_ARCH and add SME query code for MacOS (#5222) 9 months ago
  Martin Kroeker 151b74284e Merge pull request #5203 from quic/fix-sgemmdirect-sme1 9 months ago
  abhishek-fujitsu 9c02cdb073 optimise dot using thread throttling for NEOVERSE V1 10 months ago
  Martin Kroeker d0e8fd6d40 Merge pull request #5239 from annop-w/gemv_n_sve 9 months ago
  Iha, Taisei 08b5c18d70 fixed a potential out-of-bounds on gemv. 9 months ago
  Annop Wongwathanarat e11744a411 Use SVE kernel for S/DGEMVN for SVE machines 9 months ago
  Martin Kroeker dd38b4e811 Merge pull request #5225 from annop-w/gemv_n 9 months ago
  Martin Kroeker 0241d516f6 Merge pull request #5220 from iha-taisei/sdgemv_n_unroll 9 months ago
  Annop Wongwathanarat d535728803 Improve performance for SGEMVN on NEONVERSEN1 10 months ago
  Usui, Tetsuzo d711906e3e Add symv kernels for arm64 10 months ago
  Iha, Taisei f1e628b889 Further performance improvements to [SD]GEMV. 10 months ago
  Annop Wongwathanarat ec146157d3 Use SVE kernel for S/DGEMVT for SVE machines 10 months ago
  Vaisakh K V 04915be829 Add vector registers to clobber list to prevent compiler optimization. 10 months ago
  Ye Tao f27ba5efd1 fix bugs in aarch64 sbgemv_n kernel 11 months ago
  Annop Wongwathanarat edef2e4441 Fix bug in ARM64 sbgemv_t 11 months ago
  Martin Kroeker b55ca71d5b Merge pull request #5182 from annop-w/sgemm_ncopy 11 months ago
  Martin Kroeker 2f778554b8 Merge pull request #5181 from taoye9/change_sbgemn_cast_bf16 11 months ago
  Annop Wongwathanarat 9807f56580 Optimize aarch64 sgemm_ncopy 11 months ago
  Martin Kroeker a3e7b16072 Merge pull request #5157 from manaalmj/feature 11 months ago
  Ye Tao 4c00099ed6 replace customize bf16_to_fp32 with arm neon vcvtah_f32_bf16 11 months ago
  Annop Wongwathanarat a085b6c9ec Fix aarch64 sbgemv_t compilation error for GCC < 13 11 months ago
  manjam01 5c4e38ab17 Optimize gemv_n_sve kernel 11 months ago
  Martin Kroeker 1d5ed5c46b Merge pull request #5168 from taoye9/add_sbgemvn_on_neonversen2 11 months ago
  Ye Tao 6b8b35cdf2 fix minior issues of redeclaration of float x0,x1 in sbgemv_n_neon.c 11 months ago
  Ye Tao 38ee7c9301 Add dispatch of SBGEMVNKERNEL for NEOVERSEN2 and NEOVERSEV2 11 months ago
  Martin Kroeker 2b941c44b5 Merge branch 'develop' into sbgemv_n_neon 11 months ago
  Ye Tao 35bdbca153 Add sbgemv_n_neon kernel for arm64. 11 months ago
  Annop Wongwathanarat edaf51dd99 Add sbgemv_t_bfdot kernel for ARM64 11 months ago
  Vaisakh K V f66ca05b31 Merge branch 'develop' into topic/sgemm_direct_sme1 1 year ago
  Vaisakh K V d23eb3b93e Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API 1 year ago
  Ye Tao c748e6a338 optimized sbgemm kernel for neoverse-v1 (sve-256) 1 year ago
  Aditya Tewari 4379a6fbe3 * checkpoint sbgemm for SVE-256 1 year ago
  Martin Kroeker 6e393a5599 Merge branch 'develop' into gemv_t 1 year ago
  Martin Kroeker 876ba58e28 Merge pull request #5091 from goplanid/develop 1 year ago
  Martin Kroeker 180ba5e7d0 Merge pull request #5069 from tingboliao/dev_rotm_20250107 1 year ago
  Deeksha Goplani d1bfa979f7 small gemm kernel packing modifications 1 year ago
  tingbo.liao 3c8df6358f Further rearranged the rotm kernel for the different architectures. 1 year ago
  Annop Wongwathanarat c0318cea6e Simplify gemv_t_sve_v1x3 kernel 1 year ago
  Martin Kroeker 87083fdbf6 [WIP] Work around assembler limitations in current LLVM for Windows on Arm (#5076) 1 year ago
  Martin Kroeker 229d8a025e Merge pull request #4959 from CDAC-Bengaluru/level-1-sve 1 year ago
  SushilPratap04 3368a4e697 Update swap_kernel_sve.c 1 year ago
  CDAC-SSDG dd71e4234a Added Updated swap and rot sve kernels. 1 year ago
  CDAC-SSDG 06ffd411a5 Update KERNEL.ARMV8SVE 1 year ago
  CDAC-SSDG 765850194e Delete kernel/arm64/swap_kernel_sve.c 1 year ago
  CDAC-SSDG c17c19fbcf Delete kernel/arm64/swap_kernel_c.c 1 year ago