0228d3621
move -fopenmp to CFLAGS by
2024-09-30 21:38:05 +0200
7087b0a7d
ARM64: Enable SMALL_MATRIX_OPT when compiling with CMake by
2024-09-29 10:31:26 +0800
30af9278d
(refs/pull/4904/head)
LoongArch64: Enable cmake cross-compilation by
2024-09-26 16:55:06 +0800
48698b2b1
(refs/pull/4900/head)
LoongArch64: Rename core by
2024-09-18 17:20:43 +0800
c8788208c
Fixing block issue with transpose version. by
2024-09-27 13:27:03 -0500
d7c0d87cd
Small changes. by
2024-09-26 15:21:29 -0500
eb6f3a05e
Common MMA code. by
2024-09-26 09:28:56 -0500
fb287d17f
Common code. by
2024-09-25 16:31:36 -0500
8ab624577
Small change. by
2024-09-24 16:50:21 -0500
df1937556
Almost final code for MMA. by
2024-09-24 16:30:01 -0500
05aa63e73
More MMA BF16 GEMV code. by
2024-09-24 12:54:02 -0500
c9ce37d52
Force vector pairs in clang. by
2024-09-23 08:43:58 -0500
89a12fa08
MMA BF16 GEMV code. by
2024-09-23 06:32:14 -0500
e9824ae79
deploy: 92f7a2dc3e by
2024-09-19 12:15:38 +0000
92f7a2dc3
Merge pull request #4899 from martin-frbg/flangmtune by
2024-09-19 14:15:06 +0200
969bb949b
(refs/pull/4899/head)
Strip any mtune option from FFLAGS if the compiler is flang-new by
2024-09-19 11:10:28 +0200
30733e7d6
deploy: fca86e359c by
2024-09-16 09:17:50 +0000
fca86e359
Merge pull request #4887 from goplanid/develop by
2024-09-16 11:17:19 +0200
7947970f9
Move common code. by
2024-09-13 06:22:13 -0500
60c1519e0
Merge pull request #4896 from martin-frbg/update_azure_mac_hpc by
2024-09-12 21:09:28 +0200
c8313d9d8
Merge pull request #4895 from martin-frbg/update_homebrewjob by
2024-09-12 21:09:10 +0200
b588e922a
(refs/pull/4896/head)
Update oneAPI download location for Mac to final by
2024-09-12 18:13:46 +0200
4178905fa
(refs/pull/4895/head)
Update version of upload-artifacts following deprecation by
2024-09-12 16:39:20 +0200
70ea109d6
deploy: 5f70e245a2 by
2024-09-12 13:10:29 +0000
5f70e245a
Merge pull request #4894 from martin-frbg/issue4893 by
2024-09-12 15:09:54 +0200
383e0b133
(refs/pull/4894/head)
remove suppression of gcc14's incompatible pointer error by
2024-09-11 22:21:09 +0200
869a169c5
Fix ZAXPYTEST prototype by
2024-09-11 22:18:14 +0200
72216d28c
Fix bug with inc_y adding results twice. by
2024-09-11 08:47:32 -0500
2f142ee85
More common code. by
2024-09-09 14:41:55 -0500
39fd29f1d
Minor improvement and turn off BF16 GEMV forwarding by default. by
2024-09-08 18:28:31 -0500
8541b25e1
Special case beta is one. by
2024-09-06 14:48:48 -0500
76227e294
Initial commit for vectorized BF16 GEMV. Added GEMM_GEMV_FORWARD_BF16 to enable using BF16 GEMV for one-dimensional matrices. Updated unit test to support inc_x != 1 or inc_y != 1 for GEMV. by
2024-09-06 14:03:31 -0500
4894c5405
(refs/pull/4887/head)
Improve TN case with further unrolling by
2024-09-02 22:22:49 +0530
060c86351
(refs/pull/4885/head)
BLD: Add Windows build by
2024-08-25 18:10:15 +0000
6ce99e314
MAINT: Add a configuration for meson format by
2024-08-17 16:37:15 -0500
3f9ffecf8
MAINT: Fixup hardcoded build folder by
2024-08-17 16:32:41 -0500
4ee4873c2
deploy: 485027563e by
2024-08-17 09:47:57 +0000
485027563
Merge pull request #4883 from ChipKerchner/fixSGEMMUnitTestZeroSize by
2024-08-17 11:47:26 +0200
89702e1f4
(refs/pull/4883/head)
Fix zero element GEMV test. by
2024-08-16 11:37:39 -0500
77f85c7c0
GEMV tests don't like zero elements. by
2024-08-16 11:15:32 -0500
868aa857b
Change malloc zero to return one byte and update the SBGEMM test to again use sizes of zero. by
2024-08-16 10:28:10 -0500
b1802f4dc
Fix unit test to start at 1 instead of 0 - since malloc zero bytes fails on some systems. by
2024-08-16 09:51:37 -0500
f61930eb1
Merge pull request #4882 from martin-frbg/issue4805-3 by
2024-08-16 11:24:51 +0200
dfba3f884
(refs/pull/4882/head)
restore the pragma as it is reportedly still needed on 3C6000/gcc14.2 by
2024-08-16 11:23:19 +0200
54b868f71
deploy: 7129a64d87 by
2024-08-16 06:47:47 +0000
7129a64d8
Merge pull request #4881 from martin-frbg/issue4805-2 by
2024-08-16 08:47:12 +0200
49080b631
(refs/pull/4881/head)
remove optimizer pragma again by
2024-08-15 22:15:27 +0200
e05d98d00
expressly use fld.d/fst.d for floating point registers instead of LD/ST macros by
2024-08-15 22:14:29 +0200
3ee9e9d8d
Merge pull request #4879 from martin-frbg/issue4868-2 by
2024-08-15 22:06:54 +0200
dd71df8fa
Merge pull request #4880 from ChipKerchner/betterPowerGEMVTail by
2024-08-15 20:36:22 +0200
a8d6b0219
Merge pull request #4877 from XiWeiGu/fixed_undefined_blas_set_parameter by
2024-08-15 15:35:26 +0200
d24b3cf39
(refs/pull/4879/head)
properly fix buffer allocation and assignment by
2024-08-15 15:32:58 +0200
a0aeba631
(refs/pull/4880/head)
Merge branch 'develop' into betterPowerGEMVTail by
2024-08-15 08:00:00 -0500
eba8615c1
Merge pull request #4876 from martin-frbg/granite by
2024-08-15 13:50:54 +0200
bc80e7f02
Merge pull request #4878 from martin-frbg/cirrus-androidndk by
2024-08-15 13:50:09 +0200
94c9e0b7a
(refs/pull/4878/head)
Update ndk version number by
2024-08-15 11:30:23 +0200
ed0321563
fix installation of NDK in armv7 crossbuild by
2024-08-15 11:11:07 +0200
fd033467a
(refs/pull/4877/head)
Fixed the undefined reference to blas_set_parameter by
2024-08-15 16:48:48 +0800
1b8e40874
(refs/pull/4876/head)
Add autodetection support for Intel Granite Rapids as Sapphire Rapids by
2024-08-15 09:33:42 +0200
cbfe72ca7
deploy: 4944148e66 by
2024-08-15 07:32:47 +0000
4944148e6
Merge pull request #4875 from ChipKerchner/addGEMVtoBF16Test by
2024-08-15 09:32:11 +0200
a388c4b83
Merge pull request #4872 from chenx97/ls3a-fix-stack-fpr-len by
2024-08-15 00:10:16 +0200
f24b52170
Merge pull request #4787 from vlad0x00/patch-1 by
2024-08-15 00:09:53 +0200
2d84ed7e7
(refs/pull/4787/head)
Update README.md by
2024-08-14 14:31:35 -0700
083faf755
Merge branch 'develop' into betterPowerGEMVTail by
2024-08-14 15:56:03 -0500
c23897f58
(refs/pull/4875/head)
Add GEMV testing to SBGEMx vs SGEMx testing. by
2024-08-14 15:55:23 -0500
0d8ee96f1
Merge pull request #4874 from martin-frbg/issue4869 by
2024-08-14 22:49:12 +0200
b80671d89
Merge pull request #4871 from martin-frbg/issue4868 by
2024-08-14 20:53:39 +0200
6452f7b46
Merge pull request #4873 from ChipKerchner/fixSBGEMMDefaults by
2024-08-14 19:22:03 +0200
75472b830
Merge branch 'develop' into betterPowerGEMVTail by
2024-08-14 10:52:46 -0500
9842a6cf2
deploy: ca7777de18 by
2024-08-14 15:37:07 +0000
ca7777de1
Merge pull request #4870 from chenx97/fix-recursive-make-var by
2024-08-14 16:03:50 +0200
f6469e21b
(refs/pull/4874/head)
move gelqs and geqrs to lapack-deprecated by
2024-08-14 16:00:43 +0200
31226740d
(refs/pull/4873/head)
Cleanup of SBGEMM unit test. by
2024-08-14 08:10:25 -0500
070183571
Merge pull request #24 from HaoZeke/sharedLib by
2024-08-14 06:03:04 -0700
04d9a533b
BLD: Use `both_libraries` to build libs by
2024-08-14 10:45:26 +0000
ef94b9653
(refs/pull/4872/head)
Use ldc1 and sdc1 for the prologue and epilogue on LOONGSON3A by
2024-08-13 14:53:37 +0800
23b5d66a8
(refs/pull/4871/head)
Ensure a memory buffer has been allocated for each thread before invoking it by
2024-08-14 10:35:44 +0200
20bdb6588
(refs/pull/4870/head)
Fix recursive variable expansion in Makefiles for LOONGSON3A by
2024-08-12 16:22:31 +0800
adea56954
BLD: Create OpenBLAS shared object by
2024-08-13 09:42:46 +0000
b1737698d
Fix DEFAULTS in SBGEMM for POWER10. Also, comparisons for the SBGEMM unit test cannot be exact due to epsilon differences. by
2024-08-13 07:01:21 -0500
62d1a3cf3
deploy: e5525036e7 by
2024-08-13 05:20:43 +0000
e5525036e
Merge pull request #4865 from martin-frbg/issue4856 by
2024-08-13 07:20:06 +0200
fd52d0949
Merge pull request #4864 from martin-frbg/issue4862 by
2024-08-13 00:16:45 +0200
f332ecbf1
deploy: 35dd625adf by
2024-08-12 20:06:18 +0000
35dd625ad
Merge pull request #4859 from martin-frbg/cooper_sb by
2024-08-12 22:05:43 +0200
a48b11763
Update version information for 0.3.28 by
2024-08-12 18:22:20 +0200
da6393ab9
(refs/pull/4866/head)
set larger threshold for POWER10 by
2024-08-12 09:13:01 -0400
d8f740791
(refs/pull/4865/head)
tweak threshold a little more to cover POWER10 fma by
2024-08-12 14:50:49 +0200
73e13b027
(refs/pull/4864/head)
flesh out HERK prototype by
2024-08-12 14:45:40 +0200
824306baa
flesh out HERK prototype by
2024-08-12 14:44:13 +0200
cf98f7afc
Merge pull request #23 from HaoZeke/mesonDocs by
2024-08-12 11:27:20 +0000
ff42a9f4f
DOC: Meson build docs by
2024-08-09 14:52:38 +0200
05a72c7a7
(refs/pull/4860/head)
Update azure-pipelines.yml by
2024-08-11 10:42:17 +0200
7ca835a82
(refs/pull/4859/head)
address clang array overflow warning by
2024-08-10 13:44:56 +0200
a87c4d26d
Merge pull request #4857 from nekopsykose/ppc by
2024-08-10 00:15:28 +0200
1265eee85
(refs/pull/4857/head)
fix cmake typo for power10 cc version check by
2024-08-09 20:38:05 +0200
6d31ff0b1
Merge pull request #17 from HaoZeke/multiArch by
2024-08-09 08:22:53 +0000
f0e9e93a2
deploy: cb38d666da by
2024-08-09 01:41:29 +0000
cd3945b99
Update version to 0.3.28.dev by
2024-08-08 23:09:45 +0200