783 Commits (45aa27b64b3cf923ca403f4e6fcd5f52d86a14c3)

Author SHA1 Message Date
  Ye Tao 63ce52ee77 change data type of bgemm alpha and beta from bfloat16 to fp32 and add makefiles changes for bgemm interface 11 months ago
  Ye Tao 082a9d28c3 Resolve symbol conflicts when building sbgemm and bgemm together 11 months ago
  Ye Tao 1eb0815b09 support mutithreaded bgemm interface 11 months ago
  Martin Kroeker 20f2ba0141 Move declaration of i for pre-C99 compilers 11 months ago
  Masato Nakagawa 2351a98005 Update 2D thread-partitioned GEMM for M << N case. 11 months ago
  Martin Kroeker 5141a90993 Fix ARMV9SME target in DYNAMIC_ARCH and add SME query code for MacOS (#5222) 1 year ago
  Ruiyang Wu 02fd1df10b CMake: Pass `OpenMP` compiler and linker flags through CMake targets 1 year ago
  Masato Nakagawa 80d3c2ad95 Add Improving Load Imbalance in Thread-Parallel GEMM 1 year ago
  Martin Kroeker 39eb43d441 Improve thread safety of pthreads builds that rely on C11 atomic operations for locking (#5170) 1 year ago
  Martin Kroeker 1533fe49be Merge pull request #5144 from taoye9/dispatch_neoversve2_to_neoversven2 1 year ago
  Ye Tao f0bea79a6e dispatch NEOVERSEV2 to NEOVERSEN2 under dynamic setting 1 year ago
  Martin Kroeker eb84aac7ad Merge pull request #5084 from quic/topic/sgemm_direct_sme1 1 year ago
  Martin Kroeker 77c638db67 Revert "Fix potential inaccuracy in multithreaded level3 related to SWITCH_RATIO" 1 year ago
  Vaisakh K V f66ca05b31 Merge branch 'develop' into topic/sgemm_direct_sme1 1 year ago
  Vaisakh K V d23eb3b93e Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API 1 year ago
  John Hein 6cd9bbe531 fix signedness of pointer to integer type passed to blas_lock() 1 year ago
  Martin Kroeker a182251284 fix typo 1 year ago
  Martin Kroeker ed95791618 fix conflicting variables 1 year ago
  Martin Kroeker 3c3d1c4849 Identify all cores and select the most performant one as TARGET 1 year ago
  Ralf Gommers 765ad8bcd2 Fix guard around `alloc_hugetlb`, fixes compile warning 1 year ago
  Ralf Gommers 48caf2303d Fix build warning about discarding volatile qualifier in memory.c 1 year ago
  Martin Kroeker 4060dd43e3 Add dummy implementations of openblas_get/set_affinity 1 year ago
  Martin Kroeker 8a1710dd0d don't apply switch_ratio to tail of loop 1 year ago
  Martin Kroeker de421b7764 Merge pull request #4904 from XiWeiGu/la64_cross_cmake 1 year ago
  gxw 30af9278dc LoongArch64: Enable cmake cross-compilation 1 year ago
  gxw 48698b2b1d LoongArch64: Rename core 1 year ago
  Martin Kroeker 3ee9e9d8d0 Merge pull request #4879 from martin-frbg/issue4868-2 1 year ago
  Martin Kroeker a8d6b0219a Merge pull request #4877 from XiWeiGu/fixed_undefined_blas_set_parameter 1 year ago
  Martin Kroeker d24b3cf393 properly fix buffer allocation and assignment 1 year ago
  gxw fd033467ac Fixed the undefined reference to blas_set_parameter 1 year ago
  Martin Kroeker 23b5d66a86 Ensure a memory buffer has been allocated for each thread before invoking it 1 year ago
  Martin Kroeker 753c7ebe17 Merge pull request #4835 from martin-frbg/revertwin4359 1 year ago
  Martin Kroeker 50397e017a Merge pull request #4838 from martin-frbg/fix4662-3 1 year ago
  Martin Kroeker 5257f807a9 fix invalid ifdef syntax in HUGETLB handling 1 year ago
  Martin Kroeker 2aed90171a Add riscv sources for DYNAMIC_ARCH 1 year ago
  Martin Kroeker 6468dc1142 restore the coarse locking of the pre-4359 version 1 year ago
  yamazaki-mitsufumi 821ef34635 Add A64FX to the list of CPUs supported by DYNAMIC_ARCH 1 year ago
  Martin Kroeker a815594fd1 Merge pull request #4801 from markdryan/markdryan/riscv-dynamic-arch 1 year ago
  Martin Kroeker a373d0f107 Improve the error message for thread creation failure 1 year ago
  Mark Ryan 3b715e6162 Add autodetection for riscv64 1 year ago
  Martin Kroeker d0b9948b23 Guard against invalid thread_status.queue 1 year ago
  Martin Kroeker 7e9a4ba427 Merge pull request #4741 from shivammonaka/Pthread_Scalability_Improvement 1 year ago
  Martin Kroeker 9b2a0c79cb Add Zhaoxin KX7000 1 year ago
  shivammonaka 9e22d70957 Dynamic locking in Pthread Backend to allow multiple BLAS calls to be executed parallelly 1 year ago
  Martin Kroeker db070a9223 add gemm_batch drivers 1 year ago
  Martin Kroeker d0794f88dc add gemm_batch driver 1 year ago
  Martin Kroeker 0073affe63 Merge pull request #4693 from goplanid/locks-improvement 1 year ago
  Martin Kroeker 6ca9ffa7f5 Merge pull request #4655 from yamazakimitsufumi/update_2d_thread_distribution 2 years ago
  Deeksha Goplani 0dc80a5c8d locks improvement 2 years ago
  Martin Kroeker 8da6f7e5f2 Merge pull request #4686 from XiWeiGu/loongarch64_dgemm_kernel_16x6 2 years ago