2356 Commits (a53a197934702089800d3fa05fbddbb1fdf69187)

Author SHA1 Message Date
  Chip Kerchner c8f53b85ce Merge remote-tracking branch 'origin/develop' into vectorizeBF16GEMV 1 year ago
  Martin Kroeker e52d9b4cf1 Merge pull request #4928 from austinpagan/czgemm_in_c 1 year ago
  Gordon Fossum 0b7fb5c791 CGEMM & ZGEMM using C code. 1 year ago
  Martin Kroeker 9783dd07ab Rename KERNEL.LOONGSONGENERIC to KERNEL.LA64_GENERIC 1 year ago
  Chip Kerchner d6bb8dcfd1 Common code. 1 year ago
  Martin Kroeker c9e92348a6 Handle inf/nan if dummy2 flag is set 1 year ago
  Chip Kerchner 9ac0fb0111 Merge branch 'develop' into vectorizeBF16GEMV 1 year ago
  Martin Kroeker d714013ab9 change sgemm kernel to 4x4 as the 16x4 altivec goes out of bounds 1 year ago
  Chip Kerchner 915a6d6e44 Add casting. 1 year ago
  Chip Kerchner 7ec3c16d82 Remove beta from optimized functions. 1 year ago
  Martin Kroeker de421b7764 Merge pull request #4904 from XiWeiGu/la64_cross_cmake 1 year ago
  Chip Kerchner 7cc00f68c9 Remove more duplicate. 1 year ago
  Chip Kerchner e238a68c03 Remove duplicate. 1 year ago
  Chip Kerchner 32095b0cbb Remove parameter. 1 year ago
  gxw 30af9278dc LoongArch64: Enable cmake cross-compilation 1 year ago
  gxw 48698b2b1d LoongArch64: Rename core 1 year ago
  Chip Kerchner c8788208c8 Fixing block issue with transpose version. 1 year ago
  Chip Kerchner d7c0d87cd1 Small changes. 1 year ago
  Chip Kerchner eb6f3a05ef Common MMA code. 1 year ago
  Chip Kerchner fb287d17fc Common code. 1 year ago
  Chip Kerchner 8ab6245771 Small change. 1 year ago
  Chip Kerchner df19375560 Almost final code for MMA. 1 year ago
  Chip Kerchner 05aa63e738 More MMA BF16 GEMV code. 1 year ago
  Chip Kerchner c9ce37d527 Force vector pairs in clang. 1 year ago
  Chip Kerchner 89a12fa083 MMA BF16 GEMV code. 1 year ago
  Chip Kerchner 7947970f9d Move common code. 1 year ago
  Chip Kerchner 72216d28c2 Fix bug with inc_y adding results twice. 1 year ago
  Chip Kerchner 2f142ee857 More common code. 1 year ago
  Chip Kerchner 39fd29f1de Minor improvement and turn off BF16 GEMV forwarding by default. 1 year ago
  Chip Kerchner 8541b25e1d Special case beta is one. 1 year ago
  Chip Kerchner 76227e2948 Initial commit for vectorized BF16 GEMV. Added GEMM_GEMV_FORWARD_BF16 to enable using BF16 GEMV for one dimension matrices. Updated unit test to support inc_x != 1 or inc_y for GEMV. 1 year ago
  Deeksha Goplani 4894c54055 Improve TN case with further unrolling 1 year ago
  Martin Kroeker e05d98d00a expressly use fld.d/fst.d for floating point registers instead of LD/ST macros 1 year ago
  Chip Kerchner a0aeba631d Merge branch 'develop' into betterPowerGEMVTail 1 year ago
  Chip Kerchner 083faf7556 Merge branch 'develop' into betterPowerGEMVTail 1 year ago
  Chip Kerchner 75472b830a Merge branch 'develop' into betterPowerGEMVTail 1 year ago
  Henry Chen ef94b96530 Use ldc1 and sdc1 for the prologue and epilogue on LOONGSON3A 1 year ago
  Martin Kroeker 7ca835a82c address clang array overflow warning 1 year ago
  Martin Kroeker 46e331a917 remove the unworkable GEMM3M restriction from GENERIC again 1 year ago
  Martin Kroeker ccc23338d7 have the dummy GEMM3M kernel at least forward to regular GEMM 1 year ago
  Martin Kroeker f1c9803f9a add proper return statement 1 year ago
  Martin Kroeker 60abcc3991 add proper return statement 1 year ago
  Chip Kerchner 1a7b8c650d Merge branch 'develop' into betterPowerGEMVTail 1 year ago
  Martin Kroeker 9afd0c8afd Merge pull request #4814 from Mousius/gemv-proxy 1 year ago
  Martin Kroeker edbf093c98 Update zarch SCAL kernels to handle INF and NAN arguments (#4829) 1 year ago
  Chris Sidebottom ba2e989c67 Add accumulators to AArch64 GEMV Kernels 1 year ago
  Martin Kroeker a875304eb0 fix inverted conditional for NAN handling 1 year ago
  Martin Kroeker 24acdd6bbb correct offset 1 year ago
  Martin Kroeker fb7c53c5e5 Merge pull request #4807 from martin-frbg/scalfixes 1 year ago
  Martin Kroeker 15c53dd2e0 Merge pull request #4794 from XiWeiGu/Fixed_Numpy_CI_Test 1 year ago