1955 Commits (72caceb324aaa79ea41a71a07f535cefc7d9fb76)

Author SHA1 Message Date
  Martin Kroeker 72caceb324 Merge pull request #4009 from Mousius/sve-gemm 2 years ago
  Martin Kroeker c9174ae8d7 Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
  Martin Kroeker c2fe9cb91f Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
  Martin Kroeker 66b39b835c Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
  Martin Kroeker bb6d6735bf Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
  Martin Kroeker d18efaed20 Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
  Martin Kroeker 99f6d31ed5 Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
  Martin Kroeker 7de9335c56 Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
  Martin Kroeker 437c0bf2b4 Merge pull request #3843 from Mousius/switch-ratio 2 years ago
  Chris Sidebottom ec334e69dc Use SVE kernel for SGEMM/DGEMM on Arm(R) Neoverse(TM) V1 2 years ago
  Chris Sidebottom 32f2fafde7 Propagate SWITCH_RATIO to DYNAMIC_ARCH builds 3 years ago
  Martin Kroeker 44164e3a3d revert "move alpha out of register 18" (out of PR scope, no SVE on Apple hw) 2 years ago
  Martin Kroeker 8be68fa7f4 move declaration of sca to really keep the compiler from throwing it out (for now) 2 years ago
  Martin Kroeker 3727672a74 Improve workaround and keep compilers from optimizing it out 2 years ago
  Martin Kroeker 108a21e47a Move ALPHA out of register 18 (reserved on OSX) 2 years ago
  Martin Kroeker 0b1acb0ba3 Move ALPHA_I out of register 18 (reserved on OSX) 2 years ago
  Martin Kroeker c7bbad09ad Move ALPHA_I out of register 18 (reserved on OSX) 2 years ago
  Martin Kroeker cda29633a3 move ALPHA_I out of register 18 (reserved on OSX) 2 years ago
  Martin Kroeker 09ace3cf23 Merge pull request #3846 from lilh9598/sbgemm_opt 2 years ago
  Sergei Lewis cb0a70e0e2 dot.c early bail fix 2 years ago
  Martin Kroeker 38d6fb4225 Fix dependencies in builds with specified subsets of precision types 2 years ago
  Martin Kroeker e412bee313 fix GEMM kernel dependencies in builds that use only a subset of precisions 2 years ago
  Martin Kroeker d80adf253e make SSYMV available to BUILD_DOUBLE-only builds 2 years ago
  Martin Kroeker 5481c328e8 fix DYNAMIC_ARCH builds that use only a subset of precisions 2 years ago
  Martin Kroeker 5a9cd87794 Merge pull request #3868 from Mousius/sve-prefetch 3 years ago
  Chris Sidebottom 1361229291 Remove prefetches from SVE kernels 3 years ago
  Bart Oldeman 60e49b851c Fix typo in clobber list, should be xmm14 instead of ymm14. 3 years ago
  Bart Oldeman 4afe1439a1 Fix skylake fallback kernel name for old compilers. 3 years ago
  Bart Oldeman 5ceca1a4d8 Add sscal.c + microkernels for Haswell, Zen, Skylake and newer. 3 years ago
  lilianhuang 729af6406f bugfix for sbgemm_ncopy_8_neoversen2 3 years ago
  Martin Kroeker 042e3c0e7c Merge pull request #3848 from bartoldeman/dscal-haswell-ymm 3 years ago
  Bart Oldeman 5c3169ecd8 dscal: use ymm registers in Haswell microkernel 3 years ago
  Chris Sidebottom eea006a688 Wrap SVE header with __has_include check 3 years ago
  Chris Sidebottom fd4f52c797 Add SVE implementation for sdot/ddot 3 years ago
  lilianhuang fdac8a97c1 Add sbgemm_ncopy_8 and sbgemm_tcopy_4 3 years ago
  lilianhuang 135718eafc Improve the performance of sbgemm_tcopy on neoversen2 3 years ago
  Chris Sidebottom 4f7b77e08a Remove unnecessary instructions from Advanced SIMD dot 3 years ago
  Martin Kroeker f73cfb7e2c change line endings from CRLF to LF 3 years ago
  Martin Kroeker 1688c7da43 change line endings from CRLF to LF 3 years ago
  Bart Oldeman 6c1043eb41 Add [cz]scal microkernels for SKYLAKEX 3 years ago
  Martin Kroeker c9d78dc3b2 Remove excess initializer (leftover from rework of PR 3793) 3 years ago
  Martin Kroeker 65338a9493 Merge pull request #3799 from bartoldeman/cscal-zscal-no-fma 3 years ago
  Honglin Zhu 79066b6bf3 Change file name to match the norm and delete useless code. 3 years ago
  Bart Oldeman e7e3aa2948 x86_64: prevent GCC and Clang from generating FMAs in cscal/zscal. 3 years ago
  Honglin Zhu 4989e039a5 Define SBGEMM_ALIGN_K for DYNAMIC_ARCH build 3 years ago
  Honglin Zhu 843e9fd0b9 Fix typo error 3 years ago
  Honglin Zhu b00d5b9746 New sbgemm implementation for Neoverse N2 3 years ago
  Martin Kroeker f6f35a4288 fix copyobj declarations to work with DYNAMIC_ARCH 3 years ago
  Martin Kroeker b1d69fb3ac Add MIPS64_GENERIC as a copy of GENERIC 3 years ago
  gxw edea1bcfaf MIPS64: Fixed failed utest dsdot:dsdot_n_1 when TARGET=I6500 3 years ago