0d1f30a29
Merge pull request #81 from xianyi/develop by
2020-09-05 12:47:03 +0200
70a254d50
Merge pull request #2822 from martin-frbg/issue2821 by
2020-09-05 12:39:32 +0200
330044d82
(refs/pull/2822/head)
Fix potential domain error in sqrt by
2020-09-05 09:44:33 +0200
97636b2c8
Merge pull request #2819 from h-vetinari/carry_lapack_437 by
2020-09-04 23:50:43 +0200
4d3671154
Merge pull request #2820 from RajalakshmiSR/clang by
2020-09-04 23:09:31 +0200
718f67421
(refs/pull/2820/head)
POWER9: Fix mcpu option with clang by
2020-09-04 10:36:19 -0500
3426519ae
(refs/pull/2819/head)
adapt ?ggsv?-functions to ambient code style in LAPACKE/include/lapack.h by
2020-09-02 22:46:47 +0200
1c6c71fa8
Follow-up to lapack#434 & lapack#409: add missing 'const' in signatures by
2020-09-02 22:41:50 +0200
860247b5d
Follow-up to lapack#434 & lapack#409: fix signature mismatches by
2020-09-02 22:38:56 +0200
c61771e33
Merge pull request #2778 from martin-frbg/lapackeig by
2020-09-04 10:06:02 +0200
deaeb6c5b
(refs/pull/2796/head)
Add bfloat16 based dot and conversion with single/double by
2020-08-27 06:42:28 +0800
c7ef7174e
Merge pull request #2817 from martin-frbg/lapack436 by
2020-09-03 17:10:23 +0200
775a87242
(refs/pull/2816/head)
Rename KERNEL.SILICON to KERNEL.VORTEX by
2020-09-03 08:44:20 +0200
af5bc9550
Rename SILICON to VORTEX and fix duplicate numbering by
2020-09-03 08:43:26 +0200
ea3a58c84
Rename SILICON to VORTEX by
2020-09-03 08:38:53 +0200
17dca035d
rename SILICON to VORTEX by
2020-09-03 08:38:08 +0200
1b0f17eee
(refs/pull/2803/head)
align to 64, using SSE when input size is small by
2020-09-01 15:41:48 +0800
c31b72965
(refs/pull/2817/head)
Fix data type of work array in zgesvdq prototype by
2020-09-02 23:44:44 +0200
0ce2aa316
Fix data type of rwork array by
2020-09-02 23:41:51 +0200
80794fe8f
Create KERNEL.SILICON by
2020-09-02 22:56:58 +0200
4a4d1ca6e
Add AppleSilicon cpu by
2020-09-02 22:52:12 +0200
b37d17382
Add Apple Silicon by
2020-09-02 22:48:49 +0200
029fd01cf
Detect AppleSilicon cpu on OSX by
2020-09-02 22:47:38 +0200
9d1ea75aa
Merge pull request #80 from xianyi/develop by
2020-09-02 22:16:41 +0200
776d005f4
Merge pull request #2815 from mhillenibm/clang_s390x by
2020-09-02 16:56:01 +0200
2ee5b899c
(refs/pull/2815/head)
s390x: enable S/DGEMM block with explicit loop unrolling + interleaving with clang by
2020-09-01 16:16:53 +0200
095f4e696
s390x: allow clang to emit fused multiply-adds (replicates gcc's default behavior) by
2020-09-01 15:09:32 +0200
87e5bbd88
s390x: avoid variable-length arrays in struct for asm operands by
2020-09-01 12:08:05 +0200
b9b3265ec
s390x: avoid inline assembly for vector loads for clang by
2020-09-01 12:04:28 +0200
a1616a0b8
s390x: replace nop with "nop 0" in inline assembly by
2020-09-01 11:58:48 +0200
60ef19325
s390x: use "lghi" for immediate values to fix build with clang by
2020-09-01 13:59:06 +0200
18bfb6d6f
Merge pull request #2813 from martin-frbg/issue2804-2 by
2020-09-01 23:39:46 +0200
e4900caa1
(refs/pull/2813/head)
Fix c_check misinterpreting arm64 in uname output to mean armv7 by
2020-09-01 19:54:08 +0200
be9a20fb6
(refs/pull/2812/head)
Accept uname output of arm64 as such by
2020-09-01 17:45:41 +0200
68b1713c3
Merge pull request #2811 from martin-frbg/issue2806 by
2020-09-01 17:19:14 +0200
4074770d0
Merge pull request #2797 from martin-frbg/relafixes1 by
2020-09-01 16:04:03 +0200
88bf71d02
(refs/pull/2811/head)
Fix accidental deletion of Cooperlake entries with the preceding commit by
2020-09-01 14:15:20 +0200
a76e56f91
Report NO_AVX512 being set (as it is already done for NO_AVX, NO_AVX2) by
2020-09-01 13:34:59 +0200
47e75f0ac
Allow overriding the AVX512 check with a NO_AVX512 define by
2020-09-01 12:09:25 +0200
b87a77da0
Merge pull request #79 from xianyi/develop by
2020-09-01 12:03:53 +0200
f42e84d46
Fix misnaming of LAPACK_?ggsvp function prototypes as LAPACKE_ (#2808) by
2020-09-01 10:44:48 +0200
0a4c5c4c4
Merge pull request #2807 from martin-frbg/issue2804 by
2020-08-31 23:44:56 +0200
a5f7626bd
(refs/pull/2808/head)
missing comma by
2020-08-31 23:29:20 +0200
3aaa6a47b
fix argument lists of LAPACK_?ggsvp prototypes by
2020-08-31 23:18:04 +0200
16b805b25
Update lapack.h by
2020-08-31 22:53:41 +0200
deb119992
Update lapack.h by
2020-08-31 22:38:15 +0200
defb15e71
Update lapack.h by
2020-08-31 21:36:21 +0200
cd4fbf124
Need to drop the LAPACKE matrix_layout parameter for LAPACK ?ggsvp as well by
2020-08-31 21:18:45 +0200
1a0552370
Fix misnaming of LAPACK_?ggsvp function prototypes as LAPACKE_ by
2020-08-31 20:08:35 +0200
3210a4273
(refs/pull/2807/head)
Report cpu as ARMV8 instead of just giving up on non-Linux hosts by
2020-08-31 20:03:21 +0200
5feb087c0
Handle Apple labeling armv8 as arm64 rather than aarch64 by
2020-08-31 20:02:08 +0200
448152cdd
define __AVX2__ to ensure the haswell code compiled with avx2 by
2020-08-31 14:39:08 +0800
cb3c190a3
Implementation of dasum, sasum with AVX2 & AVX512 intrinsic by
2020-08-21 14:44:36 +0800
59e01b1ae
Merge pull request #2799 from RajalakshmiSR/p10_ger by
2020-08-28 22:52:11 +0200
317ff27cd
(refs/pull/2799/head)
POWER10: Avoid setting accumulators to zero in gemm kernels by
2020-08-28 10:42:54 -0500
4130d1732
Refs #2587 fix small matrix c/zgemm bug. by
2020-08-28 22:36:36 +0800
255b6dd0f
Merge branch 'develop' into small_matrices by
2020-08-28 21:38:58 +0800
741d6c5cb
Refs #2587 Add small matrix optimization reference kernel for c/zgemm. by
2020-08-28 21:00:54 +0800
514a3d7d6
Merge pull request #2798 from kadler/aix-cpuid by
2020-08-28 08:30:59 +0200
085aae8bd
(refs/pull/2798/head)
Fix compile error on AIX cpuid detection by
2020-08-27 23:08:33 -0500
712ca4306
Change a1b0 gemm to b0 gemm. by
2020-08-28 07:55:27 +0800
de6367571
(refs/pull/2797/head)
Add early returns and fix sign errors in workspace calculations by
2020-08-27 11:25:18 +0200
d64cc2be8
Add early returns by
2020-08-27 11:22:50 +0200
c9b67141f
Add early returns by
2020-08-27 11:20:31 +0200
6797a3a1e
Add early returns by
2020-08-27 11:15:12 +0200
936966a42
Make ILAENV and xGETRF2 functions available by
2020-08-27 10:59:08 +0200
5c6c2cd4f
Merge pull request #2775 from Guobing-Chen/Fix_OMP_threads_specify by
2020-08-24 20:18:09 +0200
e54be4ba1
Merge pull request #2792 from pkubaj/patch-1 by
2020-08-24 08:03:39 +0200
48a1364e1
(refs/pull/2792/head)
Add aliases for armv6, armv7 by
2020-08-23 18:50:19 +0000
0c1c903f1
(refs/pull/2775/head)
Fix OMP num specify issue by
2020-08-12 03:28:25 +0800
a073fa870
Merge pull request #2791 from martin-frbg/issue2787 by
2020-08-23 19:33:03 +0200
b2053239f
(refs/pull/2791/head)
Fix missing dummy parameter (imag part of alpha) of zdot_thread_function by
2020-08-23 15:08:16 +0200
b11bb6e72
Merge pull request #2790 from martin-frbg/issue2789 by
2020-08-23 14:42:35 +0200
1840bc5b5
(refs/pull/2790/head)
Add OpenMP dependency to pkgconfig file if needed by
2020-08-22 13:55:18 +0200
7c0977c26
Add OpenMP dependency to pkgconfig file if needed by
2020-08-22 13:53:44 +0200
fb3d80c42
Merge pull request #78 from xianyi/develop by
2020-08-22 13:52:29 +0200
9ee21a0a3
Merge pull request #2780 from Guobing-Chen/CPL_build_support by
2020-08-20 19:54:29 +0200
35557ec92
(refs/pull/2788/head)
Add R benchmarks at higher core counts by
2020-08-20 16:42:27 +0200
bd3207b4b
(refs/pull/2780/head)
Update system.cmake by
2020-08-19 22:51:10 +0200
b8ebfc933
Update system.cmake by
2020-08-19 22:30:19 +0200
7c1986640
fallback from cooperlake to skylake if gcc<10 by
2020-08-19 20:48:39 +0200
71d33c952
Typo fix by
2020-08-19 17:44:23 +0200
6a3c07478
-march=cooperlake requires gcc10 by
2020-08-19 17:22:12 +0200
430f741b3
-march=cooperlake requires gcc10 by
2020-08-19 17:17:53 +0200
6f4dc7445
Fix typo by
2020-08-19 16:36:55 +0200
81fbe8d08
-march=cooperlake only available in gcc >= 10 by
2020-08-19 16:10:15 +0200
bb9cf766f
make march=cooperlake option conditional on gcc >= 10.1 by
2020-08-19 15:06:30 +0200
75eeb265d
[WIP] Refactor the driver code for direct SGEMM (#2782) by
2020-08-19 14:51:09 +0200
2c7297257
Merge pull request #2785 from albertziegenhagel/always-generate-pkg-config by
2020-08-19 14:42:58 +0200
416ee2602
(refs/pull/2782/head)
revert the unrelated drone.io CI config change by
2020-08-18 16:17:19 +0200
a7fc14c50
Limit direct sgemm to x86_64 by
2020-08-18 14:13:15 +0200
b86214e43
Limit direct sgemm to x86_64 by
2020-08-18 14:12:19 +0200
1ba18212d
Update common_s.h by
2020-08-18 09:36:59 +0200
6b731d917
(refs/pull/2785/head)
Do not require pkg-config to generate the *.pc file by
2020-08-18 08:48:48 +0200
5a0e9e8de
Update setparam-ref.c by
2020-08-17 22:38:02 +0200
e46d761bc
Update setparam-ref.c by
2020-08-17 22:16:20 +0200
6c279ef55
Update setparam-ref.c by
2020-08-17 21:55:54 +0200
7996458ea
Update common_s.h by
2020-08-17 20:06:59 +0200
5dcf47cd9
Merge pull request #2784 from martin-frbg/issue2783 by
2020-08-17 19:06:13 +0200
7fe38daee
use macros for sgemm_direct to support dynamic_arch naming via common_s.h by
2020-08-17 18:56:05 +0200