732144466
(refs/pull/5260/head)
enable sbgemm to be forward to sbgemv on arm64 by
2025-05-12 13:41:21 +0000
d3ebb85d3
more cleanup by
2025-05-12 05:26:32 -0700
1db585c19
Fix missing return by
2025-05-11 23:51:52 -0700
4eb65e25f
Fix up f2c conversions by
2025-05-11 14:29:49 -0700
b167ae1ba
Update c_zblat3c.c by
2025-05-11 18:29:33 +0200
47f7d3694
Update c_cblat3c.c by
2025-05-11 18:28:35 +0200
11004a77d
Update c_dblat3c.c by
2025-05-11 18:27:31 +0200
13aa7d81d
Update c_sblat3c.c by
2025-05-11 18:20:59 +0200
77b2f15c4
Merge branch 'OpenMathLib:develop' into gemmt_tests by
2025-05-11 07:01:19 -0700
917ca9f13
deploy: cf9e34c1f4 by
2025-05-11 07:23:55 +0000
cf9e34c1f
Merge pull request #5258 from martin-frbg/issue5255 by
2025-05-11 00:23:28 -0700
5141a9099
Fix ARMV9SME target in DYNAMIC_ARCH and add SME query code for MacOS (#5222) by
2025-05-10 13:39:32 -0700
b283b77cb
deploy: 2320e0b757 by
2025-05-10 16:49:46 +0000
2320e0b75
Merge pull request #5244 from chitao1234/develop by
2025-05-10 09:49:18 -0700
0d69a2930
(refs/pull/5258/head)
Fix empty prototypes of select/selctg by
2025-05-10 08:39:57 -0700
ebbe682f7
Fix function prototypes by
2025-05-10 08:38:06 -0700
8e4751228
(refs/pull/5256/head)
c910v: use xtheadvector instead of v0p7 by
2025-05-10 16:20:34 +0800
341fbd6c5
Bump xuantie toolchains V3.0.2 for c910v by
2025-05-10 16:18:08 +0800
3c74c63c4
deploy: 3c878f3e70 by
2025-05-09 12:58:34 +0000
a0bf09f25
(refs/pull/5222/head)
Merge branch 'OpenMathLib:develop' into fix_dyn_armv9sme by
2025-05-09 05:47:12 -0700
151b74284
Merge pull request #5203 from quic/fix-sgemmdirect-sme1 by
2025-05-09 05:39:47 -0700
3c878f3e7
Cirrus CI: Update xcode version in the Apple crossbuilds (#5254) by
2025-05-09 05:38:53 -0700
9e8dff519
(refs/pull/5254/head)
Update .cirrus.yml by
2025-05-09 14:16:44 +0200
0fb94b002
Update xcode version in the Apple crossbuilds by
2025-05-09 14:08:54 +0200
1e48d04aa
Update dynamic_arm64.c by
2025-05-09 11:59:28 +0200
7241c570d
always build sgemm_direct kernel on arm64, even if just as dummy by
2025-05-09 00:06:18 +0200
c81c1c8b2
sgemm_direct has to be unconditionally present on all arm64 by
2025-05-08 16:48:13 +0200
f912c4ccf
provide a dummy implementation for non-SME targets by
2025-05-08 16:44:23 +0200
6e210c6e3
build a (dummy) sgemm_direct kernel on all arm64 by
2025-05-08 16:40:40 +0200
27a4084a1
make sgemm_direct unconditionally available on all arm64 by
2025-05-08 16:30:37 +0200
6546da6f2
Merge branch 'OpenMathLib:develop' into fix_dyn_armv9sme by
2025-05-08 07:28:09 -0700
7b47c9790
deploy: 3e961c2771 by
2025-05-05 12:57:27 +0000
3e961c277
Merge pull request #5251 from martin-frbg/issue5250 by
2025-05-05 05:49:06 -0700
0ea9205a6
Merge pull request #5249 from scottt/fix-build-on-intel-arrow-lake by
2025-05-01 13:17:34 -0700
cba32d001
Merge pull request #5245 from guoyuanplct/develop by
2025-05-01 03:04:38 -0700
5c958dfe1
(refs/pull/5251/head)
Avoid out of bounds accesses in SCAL when INFO<0 by
2025-05-01 11:58:21 +0200
4c0445aed
Avoid out of bounds accesses in SCAL when INFO <0 by
2025-05-01 11:56:07 +0200
d48a2fc46
Avoid out of bounds accesses in SCAL when INFO<0 by
2025-05-01 11:53:50 +0200
47b43054f
Avoid out of bounds accesses in SCAL when INFO<0 by
2025-05-01 11:52:22 +0200
982baaa1e
deploy: 52367eac67 by
2025-05-01 08:10:06 +0000
52367eac6
Merge pull request #5248 from ErnstPeng/fix-lasx by
2025-05-01 01:09:38 -0700
4bee135cc
(refs/pull/5249/head)
cpuid_x86: improve Intel Arrow Lake detection by
2025-04-30 20:53:11 +0800
f19e72c40
(refs/pull/5248/head)
Loongarch64: fixed swap_lasx by
2025-04-30 16:42:52 +0800
b471fa337
Loongarch64: fixed snrm2_lasx by
2025-04-30 16:42:36 +0800
57bb46bed
Loongarch64: fixed rot_lasx by
2025-04-30 16:42:22 +0800
6dc4ca239
Loongarch64: fixed icamax_lasx by
2025-04-30 16:42:12 +0800
b528b1b8e
Loongarch64: fixed iamax_lasx by
2025-04-30 16:41:58 +0800
ba9569e38
Loongarch64: fixed dot_lasx by
2025-04-30 16:41:48 +0800
dc5fa2985
Loongarch64: fixed cscal_lasx by
2025-04-30 16:41:39 +0800
a98dd6d91
Loongarch64: fixed copy_lasx by
2025-04-30 16:41:28 +0800
d49319c2d
Loongarch64: fixed cnrm2_lasx by
2025-04-30 16:41:18 +0800
74c97ef81
Loongarch64: fixed cdot_lasx by
2025-04-30 16:41:05 +0800
be525521a
Loongarch64: fixed asum_lasx by
2025-04-30 16:40:55 +0800
0cd5ca552
Loongarch64: fixed amax_lasx by
2025-04-30 16:40:44 +0800
11ffc8680
(refs/pull/5245/head)
Format the code by
2025-04-25 00:27:27 +0800
7616c4209
Optimized RVV_ZVL256B Implementation of zgemv_n by
2025-04-25 00:05:15 +0800
e1bd63159
(refs/pull/5244/head)
allow the use of LAPACK_COMPLEX_CPP when using MSVC compiler by
2025-04-24 18:59:10 +0800
31b828b09
deploy: 70dff3b84f by
2025-04-23 20:54:27 +0000
70dff3b84
Merge pull request #5242 from abhishek-iitmadras/abhishekk_dot by
2025-04-23 13:53:55 -0700
0c239c9d4
(refs/pull/5242/head)
update contribution list by
2025-04-22 21:56:05 +0530
9c02cdb07
optimise dot using thread throttling for NEOVERSE V1 by
2025-03-24 01:00:50 +0530
d0e8fd6d4
Merge pull request #5239 from annop-w/gemv_n_sve by
2025-04-22 10:19:49 -0700
ddfefd9bf
Merge pull request #5240 from iha-taisei/fixedIssue5231 by
2025-04-22 06:04:26 -0700
08b5c18d7
(refs/pull/5240/head)
fixed a potential out-of-bounds on gemv. by
2025-04-22 19:56:44 +0900
e11744a41
(refs/pull/5239/head)
Use SVE kernel for S/DGEMVN for SVE machines by
2025-04-22 09:40:13 +0000
db0abfa90
Merge pull request #5238 from martin-frbg/revert5125 by
2025-04-22 02:12:19 -0700
1eadb0ad6
deploy: 7389b6c483 by
2025-04-22 06:36:51 +0000
7389b6c48
Merge pull request #5237 from martin-frbg/revert5219 by
2025-04-21 23:36:23 -0700
4ec62d7f7
(refs/pull/5238/head)
remove non-vectorized code path for power8, restoring PR4880 by
2025-04-21 23:14:10 +0200
1df8738f2
Merge pull request #5235 from quickwritereader/issue_unaligned_ppc64le by
2025-04-21 14:03:56 -0700
99d9f1ff3
(refs/pull/5237/head)
Fix conditional by
2025-04-21 22:55:45 +0200
96d80801b
Reinstate the CooperLake microkernel by
2025-04-21 22:53:26 +0200
f5bc97c37
Merge pull request #5227 from zanpeeters/develop by
2025-04-21 10:28:53 -0700
dff68ecd8
deploy: 050c3b26ae by
2025-04-21 10:04:46 +0000
050c3b26a
Merge pull request #5236 from ywwry66/apple_workaround by
2025-04-21 03:04:14 -0700
9aa7a0b2a
(refs/pull/5236/head)
Follow-up to d659f3c by
2025-04-20 22:55:19 -0400
94fceaeac
Merge pull request #5233 from ywwry66/apple_workaround by
2025-04-20 13:38:43 -0700
d659f3c3f
(refs/pull/5233/head)
Fix "Argument list too long" compilation error for Intel macOS by
2025-04-16 12:15:44 -0400
2e4309315
Merge pull request #5219 from martin-frbg/sbgemvn_cooper by
2025-04-20 07:29:20 -0700
d4334573b
deploy: afc1dc69cd by
2025-04-20 14:28:52 +0000
afc1dc69c
Merge pull request #5234 from RevySR/bump-xuantie-qemu by
2025-04-20 07:28:21 -0700
0cc248559
(refs/pull/5235/head)
Explicit unaligned vector load/stores in PPC64LE GEMV kernels by
2025-04-20 07:50:04 +0000
1f687b2f6
(refs/pull/5234/head)
Bump xuantie qemu for c910v by
2025-04-20 14:20:49 +0800
dd38b4e81
Merge pull request #5225 from annop-w/gemv_n by
2025-04-17 01:54:10 -0700
c14a329a6
deploy: 3a088de2d1 by
2025-04-17 07:51:55 +0000
3a088de2d
Merge pull request #5228 from martin-frbg/cmakecrossarm by
2025-04-17 00:51:26 -0700
0241d516f
Merge pull request #5220 from iha-taisei/sdgemv_n_unroll by
2025-04-16 12:55:55 -0700
b26453802
deploy: afb664527f by
2025-04-16 17:02:09 +0000
afb664527
Merge pull request #5221 from tetsuzo-usui/tune_symv_for_arm64 by
2025-04-16 10:01:38 -0700
d53572880
(refs/pull/5225/head)
Improve performance for SGEMVN on NEOVERSEN1 by
2025-04-09 12:54:57 +0000
d9369bda1
(refs/pull/5228/head)
Update and amend parameters for Neoverse cpus by
2025-04-16 01:09:57 -0700
acef78c77
(refs/pull/5227/head)
Reset buffer length before every call to sysctlbyname. by
2025-04-15 17:17:17 -0700
d1c2528ae
Add L1_DATA_LINESIZE for ifdef __APPLE__ by
2025-04-15 17:14:19 -0700
7b66330de
hw.perflevel[01].cpusperl changed to hw.perflevel[01].cpusperl2 by
2025-04-15 17:12:03 -0700
e5ffb7c0a
Fix ARMV9SME target and add support_sme1 code for MacOS by
2025-04-11 08:09:52 -0700
d711906e3
(refs/pull/5221/head)
Add symv kernels for arm64 by
2025-04-11 20:39:52 +0900
f1e628b88
(refs/pull/5220/head)
Further performance improvements to [SD]GEMV. by
2025-04-11 20:00:33 +0900
931b60617
deploy: 39718cd28e by
2025-04-11 05:40:38 +0000
39718cd28
Merge pull request #5218 from martin-frbg/lapacke_mangling by
2025-04-10 22:40:10 -0700
211dfd075
(refs/pull/5219/head)
disable the CooperLake microkernel as it produces wrong results by
2025-04-10 22:21:57 +0200