428 Commits (d36093d08419be26c7995c329bb9c8bb6ed45420)

Author SHA1 Message Date
  Martin Kroeker 11ff18bb0f Merge pull request #5081 from XiWeiGu/kernel_generic_fixed_cscal_zscal 11 months ago
  Martin Kroeker 42b7d1f897 Fix addressing of alpha in CBLAS 11 months ago
  Martin Kroeker 6680e0592f Fix conditional inclusion of SGEMM_KERNEL_DIRECT 11 months ago
  Martin Kroeker 70865a894e Merge pull request #5180 from ywwry66/openmp_use_cmake 1 year ago
  Ruiyang Wu 02fd1df10b CMake: Pass `OpenMP` compiler and linker flags through CMake targets 1 year ago
  Martin Kroeker 51c1fb1f93 Fix ?spmv build and misinterpretation of NO_LAPACK=0 1 year ago
  shubham.chaudhari 8e289ecddc Simplified thread throttling function in gemv 1 year ago
  shubham.chaudhari 189dbbc04f Add thread throttling for dynamic arch neoversev1 1 year ago
  shubham.chaudhari b6cb5ece58 Add thread throttling profile for DGEMV on NEOVERSEV1 1 year ago
  Martin Kroeker 7338a473a7 Merge pull request #5150 from Harishmcw/WoA-Experiments 1 year ago
  Martin Kroeker 09ba099461 make throttling code conditional on SMP 1 year ago
  Harishmcw 030ae1fd97 Redefined threading logic for WoA 1 year ago
  Martin Kroeker c03a81b927 Merge pull request #5141 from michalowski-arm/fork-throttle 1 year ago
  Martin Kroeker 75b958a018 Transform the B array back if necessary before returning 1 year ago
  Marek Michalowski 650a062e19 Add thread throttling profile for SGEMV on `NEOVERSEV2` 1 year ago
  Marek Michalowski b723c1b7b7 Add thread throttling profile for SGEMM on `NEOVERSEV2` 1 year ago
  Vaisakh K V f66ca05b31 Merge branch 'develop' into topic/sgemm_direct_sme1 1 year ago
  Vaisakh K V d23eb3b93e Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API 1 year ago
  Harish-Gits daf16b8229 Adjusted GESV threading logic for optimal performance on WoA 1 year ago
  Martin Kroeker 60d0be0e97 Update nrm2.c 1 year ago
  Martin Kroeker 0fd5448b2c Handle INCX=0 1 year ago
  Martin Kroeker db7e5f1fa7 Update gemmt.c 1 year ago
  Martin Kroeker ff30ac9666 Update Makefile 1 year ago
  Martin Kroeker 7c3e169b67 Update gemmt.c 1 year ago
  Martin Kroeker 09414a4187 Ensure that GEMMTR name appears in XERBLA if gemmt was called as such 1 year ago
  Marek Michalowski 838bb57e27 Merge branch 'develop' into develop 1 year ago
  Martin Kroeker a54f9a9c69 Merge pull request #5071 from annop-w/sgemm_throttling 1 year ago
  Marek Michalowski 4d5b13f765 Add thread throttling profile for SGEMV on `NEOVERSEV1` 1 year ago
  tingbo.liao 3c8df6358f Further rearranged the rotm kernel for the different architectures. 1 year ago
  gxw e114880dc4 kernel/generic: Fixed cscal and zscal 1 year ago
  Annop Wongwathanarat c8cd8da496 Add thread throttling profile for SGEMM on NEOVERSEV1 1 year ago
  Martin Kroeker a1075477c3 Merge pull request #4994 from martin-frbg/issue4886 1 year ago
  Martin Kroeker 0c440f8a27 disable multithreading for small workloads 1 year ago
  Martin Kroeker 2a290dfc2c forward GEMM3M calls for GENERIC targets to the regular C/ZGEMM for now 1 year ago
  Martin Kroeker 0cf656fd3e Add copies of GEMMT under its new name GEMMTR 1 year ago
  Chris Daley cb48505251 optimize gemv forwarding on ARM64 systems 1 year ago
  Chip Kerchner 36bd3eeddf Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power). 1 year ago
  Chip Kerchner 1d51ca5798 Change multi-threading logic for SBGEMV to be the same as SGEMV. 1 year ago
  Martin Kroeker 9762464718 Fix CBLAS interface filling in the wrong triangle for Row-Major 1 year ago
  gxw 48698b2b1d LoongArch64: Rename core 1 year ago
  Martin Kroeker 7878976236 disable forwarding from SBGEMM to SBGEMV for now 1 year ago
  Chris Sidebottom b26424c6a2 Allow opt into GEMM -> GEMV forwarding 1 year ago
  Chris Sidebottom 90eb863d4b Re-add accidental removal 1 year ago
  Chris Sidebottom 28b5334f22 Complete implementation of GEMV forwarding 1 year ago
  Martin Kroeker 3db5dbc88e forward to GEMV when one argument is actually a vector 1 year ago
  gxw f3cebb3ca3 x86: Fixed numpy CI failure when the target is ZEN. 1 year ago
  Martin Kroeker 2f12a47405 fix build options for CAXPYC/ZAXPYC 1 year ago
  Martin Kroeker db9f7bc552 fix float array types to include bfloat16 1 year ago
  Martin Kroeker 076766df4e Update CMakeLists.txt 1 year ago
  Martin Kroeker ff6670cb83 don't generate non-cblas files for gemm_batch 1 year ago