ee90f3038
Increase BUFFERSIZE for POWER8-10 and use same value for POWER6 by
2020-10-22 18:47:07 +0200
2e48d560b
(refs/pull/2936/head)
Fix compiler version check by
2020-10-22 16:23:29 +0200
ab7f46646
Merge pull request #106 from xianyi/develop by
2020-10-22 16:21:09 +0200
f95031204
(refs/pull/2935/head)
Fix macro used in argument conversion (LAPACK PR 458) by
2020-10-22 16:19:26 +0200
909068fac
Merge pull request #2932 from RajalakshmiSR/copyp10 by
2020-10-22 00:29:46 +0200
5b7438fdd
Merge pull request #2934 from thrasibule/improve_version_check by
2020-10-22 00:29:02 +0200
47696b43e
(refs/pull/2934/head)
actually check that version is greater than 4.7 by
2020-10-21 16:42:37 -0400
ad745c0ba
(refs/pull/2932/head)
Optimize scopy/ccopy for POWER10 by
2020-10-21 09:53:45 -0500
17c46bf06
Merge pull request #2930 from ismail/fix-no-return by
2020-10-21 11:43:01 +0200
28242096c
Merge pull request #2928 from martin-frbg/issue2917 by
2020-10-21 10:11:02 +0200
4a1d00f58
(refs/pull/2930/head)
Fix build with -Werror=return-type: dgemm_tcopy_16_skylakex.c CNAME function should return an int; add a return 0 similar to other files. by
2020-10-21 08:43:39 +0200
00813363b
(refs/pull/2928/head)
Enable -mavx2 for flang as well by
2020-10-20 23:56:30 +0200
336e35469
Merge pull request #105 from xianyi/develop by
2020-10-20 23:48:53 +0200
29668458f
Merge pull request #2925 from martin-frbg/issue2911-2 by
2020-10-20 11:27:36 +0200
ee83e2904
Merge pull request #2926 from bartoldeman/vzeroupper-clobber-all by
2020-10-20 09:24:47 +0200
1a0f57c8f
(refs/pull/2925/head)
Fix missing backquotes by
2020-10-20 08:37:53 +0200
b073d759d
(refs/pull/2926/head)
x86_64: clobber all xmm registers after vzeroupper by
2020-10-20 02:16:47 +0000
eddc65c7b
Add POWER10 support flag (unconditionally for now) by
2020-10-20 01:09:49 +0200
bb8c3f686
Add ld/binutils version check for POWER10 support by
2020-10-20 01:04:20 +0200
ff65952e4
Move HAVE_P10_SUPPORT to the build system by
2020-10-20 00:55:41 +0200
6208c9899
Merge pull request #104 from xianyi/develop by
2020-10-20 00:52:08 +0200
8e20ab21c
Merge pull request #2924 from martin-frbg/issue2920 by
2020-10-19 23:33:45 +0200
dc6e44c3f
Merge pull request #2916 from martin-frbg/issue2911 by
2020-10-19 23:33:31 +0200
4ad33c46b
(refs/pull/2924/head)
Add back symbols that got dropped when splitting by type by
2020-10-19 20:37:52 +0200
fe2a922ad
(refs/pull/2916/head)
Add POWER10 compiler options to CCOMMON_OPT rather than COMMON_OPT by
2020-10-19 17:43:53 +0200
9cac37965
Merge pull request #103 from xianyi/develop by
2020-10-19 15:56:20 +0200
a61c08640
Fix spurious trailing whitespace in comment by
2020-10-19 09:12:12 +0200
5b9ebe4f8
Merge pull request #2919 from isuruf/export by
2020-10-19 08:14:27 +0200
7eddaf0d6
Remove -mmma again (redundant with cpu=power10) and add override statements by
2020-10-19 08:11:22 +0200
14b1d3393
(refs/pull/2919/head)
Fix exporting some lapack and cblas by
2020-10-18 21:42:32 -0500
77669b019
Merge pull request #2915 from bartoldeman/no-empty_sgemm_direct_skylakex by
2020-10-19 00:09:54 +0200
5e8ddc900
Merge pull request #2913 from martin-frbg/issue2910 by
2020-10-18 23:04:56 +0200
03e781b76
(refs/pull/2915/head)
sgemm_direct_skylakex: fix 75eeb26 regression. by
2020-10-18 19:50:38 +0000
f1a4071d8
Clean up STACKSIZE redefinition by
2020-10-18 19:41:43 +0200
97cf10062
Clean up STACKSIZE redefinition by
2020-10-18 19:39:18 +0200
17e288e18
Clean up STACKSIZE redefinition by
2020-10-18 19:37:04 +0200
c1422f3e4
Clean up STACKSIZE redefinition by
2020-10-18 19:31:01 +0200
d85b24e10
Clean up STACKSIZE redefinition by
2020-10-18 19:29:45 +0200
7d6c85f9d
Add compiler option -mmma for POWER10 by
2020-10-18 19:27:51 +0200
2e7ee7c71
(refs/pull/2913/head)
Fix naming of L2 cache size item reported for Vortex by
2020-10-18 19:22:05 +0200
efd47b010
Merge pull request #2909 from isuruf/patch-1 by
2020-10-18 19:16:08 +0200
f5902ab0a
Support cross-compiling for Apple Vortex by
2020-10-18 19:10:58 +0200
bf1f1c66b
(refs/pull/2912/head)
VORTEX by
2020-10-18 12:08:35 -0500
1a0c18512
Support cross-compiling for Apple Vortex by
2020-10-18 18:54:54 +0200
89eea6b45
Merge pull request #102 from xianyi/develop by
2020-10-18 18:49:59 +0200
a5c667b55
(refs/pull/2909/head)
Need a space when redirecting to file by
2020-10-18 09:40:31 -0500
0ac610270
Update version string to 0.3.11.dev by
2020-10-17 22:40:47 +0200
26a701f4a
Update version string to 0.3.11.dev by
2020-10-17 22:40:06 +0200
fcd0fa1a3
Merge pull request #2908 from xianyi/release-0.3.0 by
2020-10-17 22:38:58 +0200
51c22612e
(tag: v0.3.11, refs/pull/2908/head)
Merge pull request #2907 from xianyi/develop by
2020-10-17 22:14:12 +0200
b8f689200
(refs/pull/2907/head)
Update version number to 0.3.11 by
2020-10-17 22:11:34 +0200
fe9015b61
Update version for 0.3.11 release by
2020-10-17 22:10:50 +0200
f99b8c150
Merge pull request #2906 from martin-frbg/changelog-0311 by
2020-10-17 22:07:14 +0200
5381a1805
(refs/pull/2906/head)
Update Changelog.txt with the 0.3.11 changes by
2020-10-17 22:05:36 +0200
e35576c6f
Merge pull request #2905 from martin-frbg/aocc-clang by
2020-10-17 09:45:22 +0200
f1bb85d37
(refs/pull/2905/head)
Add AVX flags for clang/aocc as well by
2020-10-16 20:52:15 +0200
25907e672
Merge pull request #101 from xianyi/develop by
2020-10-16 20:48:58 +0200
d7ba7679b
Merge branch 'develop' into risc-v by
2020-10-16 23:27:38 +0800
978937538
Merge pull request #2900 from martin-frbg/fixcmake_sse by
2020-10-16 16:17:36 +0200
0eda7ac2c
(refs/pull/2903/head)
Merge 'origin/release-0.3.0' into develop to get the 0.3.10 tag by
2020-10-16 13:15:43 +0300
f64243ff5
(refs/pull/2900/head)
Add compiler options for sse/sse2/ssse3/sse4.1 by
2020-10-16 10:47:06 +0200
786c0a3ce
Add sse options for use of intrinsics with older compilers by
2020-10-16 10:41:53 +0200
df7066704
fix core list for sse/sse2 by
2020-10-16 09:55:48 +0200
e6c5b13a1
Merge pull request #2898 from martin-frbg/morefixes by
2020-10-16 07:26:39 +0200
f071d1207
(refs/pull/2898/head)
add sse2 by
2020-10-15 22:10:32 +0200
dc6cefd2f
Expressly enable -msse for 32bit DYNAMIC_ARCH kernels by
2020-10-15 20:16:15 +0200
c339c40c0
Silence a redefinition warning by
2020-10-15 19:08:12 +0200
ac8af9cec
Add -msse where supported, apparently required for older gcc by
2020-10-15 19:06:45 +0200
10379fc83
Use ifdef instead of if by
2020-10-15 19:05:37 +0200
a85ac7163
Merge pull request #100 from xianyi/develop by
2020-10-15 18:54:20 +0200
4c25910da
Merge pull request #2896 from martin-frbg/intrin-double by
2020-10-15 11:12:35 +0200
ef8e7d027
(refs/pull/2899/head)
Add the support for RISC-V Vector. by
2020-10-15 16:05:37 +0800
9b9ee92d5
Merge pull request #2897 from Qiyu8/usimd-double by
2020-10-15 08:38:24 +0200
ae6ac8399
(refs/pull/2896/head)
Revert "add double precision SSE" by
2020-10-15 08:37:02 +0200
4fac91ef3
(refs/pull/2897/head)
adapt arm platform by
2020-10-15 11:08:10 +0800
bfdf4b56d
Add double precision universal intrinsics for X86/ARM by
2020-10-15 10:29:42 +0800
ebf0470fc
add sse4.1 for DYNAMIC_ARCH kernels by
2020-10-14 20:34:33 +0200
ca160bb44
Add -msse4.1 when SSE4.1 is supported by
2020-10-14 19:18:07 +0200
c9c3ae07a
Add double precision operations by
2020-10-14 18:10:45 +0200
a897bc3bd
Merge pull request #99 from xianyi/develop by
2020-10-14 18:09:20 +0200
756802df6
Merge pull request #2890 from martin-frbg/s-d-sum by
2020-10-14 09:02:03 +0200
01492decf
Merge pull request #2895 from martin-frbg/sb-tests by
2020-10-14 09:01:16 +0200
bd0752444
Merge pull request #2894 from RajalakshmiSR/bf16_packing by
2020-10-14 08:12:08 +0200
c1f4f5d4e
(refs/pull/2895/head)
Replace Makefile with simplified version again by
2020-10-14 01:08:50 +0200
75e3a92df
(refs/pull/2890/head)
Add express -mavx and -msse options (and fix a stray = for cooperlake) by
2020-10-14 01:01:58 +0200
2a329baa8
Add the BFLOAT16 functions to cmake builds by
2020-10-13 23:21:38 +0200
0826d68f9
(refs/pull/2894/head)
POWER10: Change the packing format for bfloat16 by
2020-10-13 16:05:10 -0500
4bb73c017
Rename "HALF" type to "BFLOAT16" by
2020-10-13 20:07:19 +0200
bc5c7f957
Cleanup by
2020-10-13 19:56:09 +0200
437b7fe26
sh prefix renamed to sb by
2020-10-13 19:55:14 +0200
a0ada4bcb
Merge pull request #98 from xianyi/develop by
2020-10-13 18:50:30 +0200
602a0c7a6
Merge pull request #2892 from RajalakshmiSR/bf16_make by
2020-10-13 18:48:37 +0200
b5d30b390
(refs/pull/2892/head)
Fix build issues with bfloat16 by
2020-10-13 11:00:22 -0500
137ae618d
Fix typo by
2020-10-13 15:02:17 +0200
9e3cff5cf
Expressly enable -mavx2 on Zen, SkylakeX and Cooperlake as well by
2020-10-13 14:41:25 +0200
d85b96842
Merge pull request #2891 from martin-frbg/fix-2886 by
2020-10-13 13:46:17 +0200
5f60a32ca
Add -mssse3 if supported by the hardware by
2020-10-13 11:57:04 +0200
fecedc9c6
Add -mssse3 by
2020-10-13 11:55:41 +0200
0eacbca85
Add Haswell and Zen to temporary sse3 whitelist by
2020-10-13 11:42:39 +0200
6999086a2
whitelist SANDYBRIDGE for SSE3 by
2020-10-13 10:32:19 +0200