290 Commits (2891fd8d6df7d634374ee46743deb96da130ab52)

Author SHA1 Message Date
  Martin Kroeker 229d8a025e Merge pull request #4959 from CDAC-Bengaluru/level-1-sve 1 year ago
  SushilPratap04 3368a4e697 Update swap_kernel_sve.c 1 year ago
  CDAC-SSDG dd71e4234a Added Updated swap and rot sve kernels. 1 year ago
  CDAC-SSDG 06ffd411a5 Update KERNEL.ARMV8SVE 1 year ago
  CDAC-SSDG 765850194e Delete kernel/arm64/swap_kernel_sve.c 1 year ago
  CDAC-SSDG c17c19fbcf Delete kernel/arm64/swap_kernel_c.c 1 year ago
  CDAC-SSDG f6416c0e37 Delete kernel/arm64/swap.c 1 year ago
  CDAC-SSDG 3b7b74664c Delete kernel/arm64/scal_kernel_sve.c 1 year ago
  CDAC-SSDG 95a97012e8 Delete kernel/arm64/scal_kernel_c.c 1 year ago
  CDAC-SSDG 5540f2121e Delete kernel/arm64/scal.c 1 year ago
  CDAC-SSDG f62519cc87 Delete kernel/arm64/rot_kernel_sve.c 1 year ago
  CDAC-SSDG 10857c9df4 Delete kernel/arm64/rot_kernel_c.c 1 year ago
  CDAC-SSDG b9f51a5cf7 Delete kernel/arm64/rot.c 1 year ago
  Martin Kroeker 81666de4ef Merge pull request #5007 from martin-frbg/issue5006 1 year ago
  Martin Kroeker 3345007d8f retire the thunderx2 NRM2 kernels due to reported inaccuracies and NAN 1 year ago
  Martin Kroeker 5fe983db29 retire the thunderx2 nrm2 kernels for now due to NAN and inaccuracies 1 year ago
  Iha, Taisei 4918beecbe Loop-unrolled transposed [SD]GEMV kernels for A64FX and Neoverse V1 1 year ago
  Juliya32 3b2421cba0 Add files via upload 1 year ago
  Juliya32 012fe4da36 Delete kernel/arm64/rot_kernel_sve.c 1 year ago
  Juliya32 d90ee00f85 Delete kernel/arm64/rot_kernel_c.c 1 year ago
  Juliya32 668e28adc4 Delete kernel/arm64/rot.c 1 year ago
  SushilPratap04 fa880ab1cf Update KERNEL.ARMV8SVE 1 year ago
  SushilPratap04 7822ae9617 Added sve kernels for rot routine. 1 year ago
  SushilPratap04 b8bc2a752e Added sve optimized kernels for swap routine 1 year ago
  CDAC-SSDG 0667cf6c92 Added optimized scal routine files 1 year ago
  Deeksha Goplani 4894c54055 Improve TN case with further unrolling 1 year ago
  Chris Sidebottom ba2e989c67 Add accumulators to AArch64 GEMV Kernels 1 year ago
  Martin Kroeker fb7c53c5e5 Merge pull request #4807 from martin-frbg/scalfixes 1 year ago
  Martin Kroeker a4e56e0452 Merge pull request #4806 from Mousius/small-gemm 1 year ago
  yamazaki-mitsufumi 88caf02f62 Fix ambiguous error on Mac OS 1 year ago
  Chris Sidebottom ea4ab3b310 Better header guard around bridge 1 year ago
  Chris Sidebottom 7311d93016 Unroll TT further 1 year ago
  Chris Sidebottom a9edddb695 Unroll TN further 1 year ago
  Chris Sidebottom 9984c5ce9d Clean up k2 removal more and unroll SGEMM more 1 year ago
  Chris Sidebottom b1c9fafabb Remove k2 loop from DGEMM TN and use a more conservative heuristic for SGEMM 1 year ago
  Martin Kroeker eb4879e04c make NAN handling depend on the dummy2 parameter 1 year ago
  iha fujitsu 0985fdc82b A64FX: Add support for SVE to SGEMV/DGEMV kernels. 1 year ago
  Martin Kroeker 3677b3886c Merge pull request #4702 from bashimao/detect-nv-grace 1 year ago
  Chris Sidebottom 8c472ef7e3 Further tweak small GEMM for AArch64 1 year ago
  Martin Kroeker a2ee4b1966 Merge branch 'OpenMathLib:develop' into issue4728 1 year ago
  Martin Kroeker 3ec59922b6 Add a clobber list to fix utest errors seen with gcc13 on Apple M 1 year ago
  Martin Kroeker 3d8054fb16 add clobber list 1 year ago
  Martin Kroeker c7cacd9b38 disable the shortcut for da=0 to ensure proper handling of INF and NAN 1 year ago
  Matthias Langer 0050a9660b Correctly detect ARM Neoverse V2 CPUs. 1 year ago
  Martin Kroeker 7cfd433d0c revert the C/Z NRM2 kernels to the base NEON kernel as well 1 year ago
  Martin Kroeker 441c81026e Add support for Cortex-A76 1 year ago
  Martin Kroeker 9ead81bd39 Revert S/DNRM2 to the base NEON kernel to fix precision loss 1 year ago
  Martin Kroeker 552c521353 remove another early exit for incx < 0 1 year ago
  Martin Kroeker ed532dc75b remove another early exit for incx < 0 1 year ago
  Martin Kroeker e41d01bad9 remove early exit on negative inc_x 1 year ago