569 Commits (a0a3bf7c81ed7bbd98c8e6effd0af2ecfc15577b)

Author SHA1 Message Date
  Martin Kroeker ea8eec5d17
Merge pull request #2422 from wjc404/develop 6 years ago
  wjc404 dd22eb7621
Update cgemm_kernel_8x2_haswell.c 6 years ago
  wjc404 2352331e60
Update zgemm_kernel_4x2_haswell.c 6 years ago
  wjc404 1b980001dd
Update zgemm_kernel_4x2_haswell.c 6 years ago
  wjc404 2515e1152f
Update cgemm_kernel_8x2_haswell.c 6 years ago
  wjc404 903854c168
Add files via upload 6 years ago
  wjc404 a2ff577a30
Update KERNEL.ZEN 6 years ago
  wjc404 97a32cb0a5
Update KERNEL.HASWELL 6 years ago
  Martin Liska aeea14ee40
Come up with LOAD_AND_COMPARE_TO_MXX macro in iamax_sse.S. 6 years ago
  Martin Liska 18bcc36a69
Fix implementation of iamax_sse.S as reported in #2116. 6 years ago
  wjc404 f566787e6e
Update KERNEL.SKYLAKEX 6 years ago
  wjc404 e3368cbf18
AVX512 STRMM kernel 6 years ago
  Bart Oldeman 7ea5e07d1c
Fix inline asm in dscal: mark x, x1 as clobbered. Fixes #2408 6 years ago
  wjc404 3447d04eaf
Update dgemm_kernel_16x2_skylakex.c 6 years ago
  wjc404 8b5cdcc64c
Update sgemm_kernel_8x4_haswell.c 6 years ago
  wjc404 4e00d96a78
Update dgemm_kernel_16x2_skylakex.c 6 years ago
  wjc404 096da2f51a
Update dgemm_kernel_16x2_skylakex.c 6 years ago
  wjc404 081b188529
Update KERNEL.SKYLAKEX 6 years ago
  wjc404 8019e70211
AVX512 16x2 DGEMM kernel 6 years ago
  wjc404 e5dcdeb550
Update sgemm_direct_skylakex.c 6 years ago
  wjc404 952cc2ba38
Update sgemm_kernel_16x4_skylakex_2.c 6 years ago
  wjc404 feaafbedd3
make skylakex sgemm code more friendly for readers 6 years ago
  wjc404 3a100b2797
Update KERNEL.SKYLAKEX 6 years ago
  wjc404 bd4c032f52
Update sgemm_kernel_8x4_haswell.c 6 years ago
  wjc404 9dc9b7b95e
Update sgemm_kernel_8x4_haswell.c 6 years ago
  wjc404 92b10212de
optimize AVX2 SGEMM 6 years ago
  wjc404 b73bf01378
optimize AVX2 SGEMM 6 years ago
  wjc404 eb3c9f1db9
optimize AVX2 SGEMM 6 years ago
  wjc404 a0f0a802fc
Update zgemm3m_kernel_4x4_haswell.c 6 years ago
  wjc404 700fe5b5ee
Add files via upload 6 years ago
  wjc404 f60840c420
Update KERNEL.ZEN 6 years ago
  wjc404 109e18cd96
Update KERNEL.HASWELL 6 years ago
  wjc404 ae1579be13
Create zgemm3m_kernel_4x4_haswell.c 6 years ago
  wjc404 cd765f094b
Update cgemm3m_kernel_8x4_haswell.c 6 years ago
  wjc404 3a66c8cac1
Update KERNEL.ZEN 6 years ago
  wjc404 ed9af2f7da
Update KERNEL.HASWELL 6 years ago
  wjc404 5fd1edead9
Create cgemm3m_kernel_8x4_haswell.c 6 years ago
  wjc404 eeecd623d8
Update cgemm_kernel_8x2_haswell.c 6 years ago
  wjc404 2cd9306bb5
Update KERNEL.ZEN 6 years ago
  wjc404 c418c81224
Update KERNEL.HASWELL 6 years ago
  wjc404 025741f16a
Fast Haswell CGEMM kernel 6 years ago
  wjc404 f41d52665d
Fast Haswell ZGEMM kernel 6 years ago
  wjc404 d573d24de7
Fast Haswell ZGEMM kernel 6 years ago
  Isuru Fernando b863b32ac5
Workaround an ICE in clang 9.0.0 6 years ago
  wjc404 934e601e93
Update dgemm_kernel_4x8_skylakex_2.c 6 years ago
  wjc404 eb1e9c8c92
some optimizations 6 years ago
  Wang, Long bfb5fbdb4d
revised fix windows compatible for #2313 6 years ago
  Wang, Long 1191db1a49
For the sake of windows compatible, used "unsigned long long" to ensure 64-bit length 6 years ago
  Wang, Long 0caf1434c9
Fix the integer overflow issue for large matrix size 6 years ago
  wjc404 819e852ae7
AVX512 CGEMM & ZGEMM kernels 6 years ago