213c0e7ab
(refs/pull/3021/head)
Added special unrolled vectorized versions of "Solve" for specific sizes, in DTRSM and STRSM, to improve performance in Power9 and Power10. by
2020-12-04 17:07:06 -0600
f21618684
Merge pull request #3018 from martin-frbg/issue3015 by
2020-12-04 22:08:17 +0100
441c08c9f
Merge pull request #3016 from xiegengxin/complex-asum by
2020-12-04 22:07:16 +0100
66302b3c0
Merge pull request #3013 from martin-frbg/gcc46 by
2020-12-04 08:54:11 +0100
07e9a1234
Merge pull request #3011 from cyyever/fix_link by
2020-12-04 08:50:59 +0100
dd1adbdec
Merge pull request #3019 from RajalakshmiSR/dgemm_param by
2020-12-04 08:49:28 +0100
a1eecccda
(refs/pull/3018/head)
Update f_check by
2020-12-03 23:43:17 +0100
41fe6e864
(refs/pull/3019/head)
POWER10: Update param.h by
2020-12-03 14:40:11 -0600
74b585058
Add libomp to the LAPACK(-test) dependencies in clang/gfortran builds by
2020-12-03 21:28:10 +0100
da0c94c76
Avoid linking both GNU libgomp and LLVM libomp in clang/gfortran builds by
2020-12-03 21:25:57 +0100
a6692dc12
(refs/pull/3013/head)
use gfortran-10 with xcode 12 by
2020-12-03 14:32:21 +0100
72a553f5b
Update .travis.yml by
2020-12-03 09:17:27 +0100
dcbb3b5ef
fix misplaced lines by
2020-12-02 23:13:13 +0100
57456c248
fix gfortran requirement in osx interface64 test by
2020-12-02 15:56:21 +0100
c36131356
Disable deprecated 32bit xcode by
2020-12-02 07:49:43 +0100
0cb7a403b
(refs/pull/3016/head)
fix error declare function blas_level1_thread_with_return_value by
2020-12-02 09:51:52 +0800
77a538d4b
Update an overlooked instance of xcode 10.0 as well by
2020-12-01 22:05:35 +0100
9621062eb
Update OSX xcode version to 11.5 by
2020-12-01 12:23:30 +0100
b766c1e9b
Improve the performance of zasum and casum with AVX512 intrinsic by
2020-12-01 16:49:26 +0800
22574b474
Suppress -mfma as well for gcc 4.6 by
2020-11-30 21:41:51 +0100
f66202299
Move the version check to avoid overwriting unprocessed compiler data by
2020-11-30 17:24:27 +0100
5e81e8147
Merge pull request #3014 from RajalakshmiSR/dgemvnp10 by
2020-11-30 08:18:24 +0100
7d46e31de
(refs/pull/3014/head)
POWER10: Optimize dgemv_n by
2020-11-29 15:28:28 -0600
62a2eb884
Add SSE flags for x86 by
2020-11-29 15:33:07 +0100
2e99e2699
Add workaround for gcc 4.6 miscompiling assembly kernels with -mavx by
2020-11-29 15:32:17 +0100
006b13299
Merge pull request #3012 from martin-frbg/restore-getarch by
2020-11-29 13:27:47 +0100
ca17d3dc3
(refs/pull/3012/head)
Restore RISCV entries accidentally trashed by my PR 3005 by
2020-11-29 13:19:51 +0100
52ed2741c
Merge pull request #3010 from ggouaillardet/topic/fj_compilers by
2020-11-29 11:36:43 +0100
3b4c01611
(refs/pull/3011/head)
link math lib on FreeBSD by
2020-11-29 17:17:07 +0800
358100ec1
(refs/pull/3010/head)
add Fujitsu compilers by
2020-11-29 13:57:57 +0900
3e6d10612
(refs/pull/3008/head)
Do not pass -mavx for gcc 4.6 by
2020-11-28 23:22:26 +0100
24c52ff34
Add -msse2 by
2020-11-27 19:45:56 +0100
6e8439143
remove DYNAMIC_ARCH restriction on -msse3 by
2020-11-27 13:30:38 +0100
30db556da
export NO_AVX2 by
2020-11-27 10:39:04 +0100
c903518e8
Downgrade HASWELL/ZEN targets to SANDYBRIDGE if no AVX2 support by
2020-11-27 10:07:53 +0100
ca793f6db
Make -mavx2 -mfma conditional on compiler support by
2020-11-27 10:05:47 +0100
18a5520a3
Add check for pre-AVX2 gcc versions on x86 by
2020-11-27 10:04:45 +0100
953c4ae1a
(refs/pull/3007/head)
remove quiet to debug piledriver build failure by
2020-11-23 17:07:24 +0100
9fb80b9e4
try to update the ancient binutils in Ubuntu Precise for fma support by
2020-11-23 14:57:36 +0100
3788b6d15
Merge pull request #3005 from martin-frbg/ssefix by
2020-11-23 08:35:32 +0100
bc5b1ddf0
Merge pull request #3004 from martin-frbg/bsd_getauxval by
2020-11-23 08:35:12 +0100
2f42d2310
Merge pull request #3002 from martin-frbg/issue3000 by
2020-11-22 22:51:26 +0100
b72dd007d
Merge pull request #3001 from martin-frbg/issue2996 by
2020-11-22 22:50:41 +0100
11ebe5fa2
(refs/pull/3005/head)
Avoid redefinition warning by
2020-11-22 21:16:07 +0100
01f01dae9
Add -msse if supported by
2020-11-22 21:15:08 +0100
e7bf8ced6
(refs/pull/3004/head)
Build fix for systems that do not support getauxval by
2020-11-22 20:20:28 +0100
5df09f845
(refs/pull/3003/head)
define inf if needed by
2020-11-22 19:35:43 +0100
c38bb5d51
Add utest for NRM2 behaviour with an inf value in the input by
2020-11-22 19:08:40 +0100
025629492
(refs/pull/3002/head)
Fix syntax mixup by
2020-11-22 17:41:44 +0100
2b114c3f3
(refs/pull/3001/head)
Restore proper Makefile by
2020-11-22 17:16:22 +0100
60e1fddca
Ensure that the same (large) BUFFERSIZE is used for all cpus in DYNAMIC_ARCH builds by
2020-11-22 16:48:22 +0100
ebb878869
Use ifneq instead of ifdef for CROSS option by
2020-11-22 16:33:34 +0100
857afcc41
Use ifeq instead of ifdef for user-definable build options by
2020-11-22 16:31:44 +0100
5fa305172
Use ifeq instead of ifdef for user-definable options by
2020-11-22 16:29:56 +0100
d3ff1f889
Convert ifndefs to ifneq by
2020-11-22 16:27:17 +0100
65eb7afaf
Change ifndef CROSS to ifneq by
2020-11-22 16:25:36 +0100
8a6b17f97
Change ifndefs to ifneq by
2020-11-22 16:19:31 +0100
0f863f96e
Merge pull request #112 from xianyi/develop by
2020-11-22 16:17:19 +0100
437702e0e
Merge pull request #2965 from epsilon-0/develop by
2020-11-22 12:25:33 +0100
f1bf040b2
Merge pull request #2988 from xiegengxin/smp-asum by
2020-11-22 12:24:13 +0100
613e3b2ba
Merge pull request #2997 from Flamefire/reproduce_crash by
2020-11-22 12:22:57 +0100
05a0ea234
Merge branch 'risc-v' into develop by
2020-11-22 16:05:32 +0800
703784949
Merge branch 'develop' into risc-v by
2020-11-22 16:04:50 +0800
c6c9c24d1
Update doc for C910. by
2020-11-22 16:02:19 +0800
bed01f47c
(refs/pull/2999/head)
Cast arguments of `_mm512_abs_pd` to `__m512` by
2020-11-21 15:02:59 +0000
6dd71af0c
Merge pull request #2995 from Flamefire/fix_thread_buffer_init by
2020-11-20 09:42:10 +0100
a05dc6e62
(refs/pull/2997/head)
Add reproducer test for crash after fork by
2020-11-19 15:24:57 +0100
60005eb47
(refs/pull/2995/head)
Don't overwrite blas_thread_buffer if already set by
2020-11-19 14:39:00 +0100
043f3d6fa
(refs/pull/2994/head)
POWER10: Use POWER9 as a fallback by
2020-11-19 21:04:10 +1100
fdf71d66b
POWER10: Fix ld version detection by
2020-11-19 20:50:42 +1100
8917203eb
(refs/pull/2992/head)
Update common_thread.h by
2020-11-17 20:57:16 +0100
1592c1f70
Compare environment variables for NUM_THREADS against compile-time maximum by
2020-11-17 19:21:12 +0100
4639c9ae4
Update common_thread.h by
2020-11-17 18:49:59 +0100
811629963
Handle runtime OMP thread count exceeding build-time NUM_THREADS by
2020-11-17 18:18:35 +0100
26ce2705f
reduce num_threads by
2020-11-17 17:56:10 +0100
cfe35efbf
activate testcase by
2020-11-17 15:43:42 +0100
906b23638
typo by
2020-11-17 15:18:04 +0100
c8a32d0a9
Add alternative OpenMP thread safety test from old issue 602 by
2020-11-17 14:47:51 +0100
1748f40cb
Add testcase from issue 602 by
2020-11-17 14:45:20 +0100
e607d8de1
Add C version of testcase from issue 602 by
2020-11-17 14:43:26 +0100
c1f52d358
Add original testcase from issue 602 by
2020-11-17 14:42:15 +0100
eead529d3
Create test_dgemm_f90.f by
2020-11-17 14:41:29 +0100
4293b4b65
(refs/pull/2974/head)
Create test_dgemm_omp.c by
2020-11-17 00:02:49 +0100
7e9cb39a2
Merge pull request #2981 from Qiyu8/fix-sum by
2020-11-16 08:40:46 +0100
be075d53c
Merge pull request #2983 from Qiyu8/optimize-srot by
2020-11-16 08:38:37 +0100
b00a0de13
(refs/pull/2983/head)
remove the -mfma flag when the host has AVX. by
2020-11-16 09:14:56 +0800
1425abc27
(refs/pull/2990/head)
Reduce the default BUFFERSIZE for x86_64 to its 0.3.9 value by
2020-11-15 19:39:18 +0100
d341a0fea
Merge pull request #2989 from martin-frbg/cmake-fma by
2020-11-13 12:35:09 +0100
ec4d77c47
(refs/pull/2989/head)
Add -mfma for HAVE_FMA3 in the non-DYNAMIC_ARCH case as well by
2020-11-13 09:16:34 +0100
02699226d
Merge pull request #111 from xianyi/develop by
2020-11-13 09:14:23 +0100
d6e7e05bb
(refs/pull/2988/head)
Improve the performance of dasum and sasum when SMP is defined by
2020-11-13 14:20:52 +0800
ae0b1dea1
modify system.cmake to enable fma flag by
2020-11-13 10:20:24 +0800
e0dac6b53
fix the CI failure of target specific option mismatch by
2020-11-12 20:31:03 +0800
e5c2ceb67
fix the CI failure of lack the head by
2020-11-12 17:35:17 +0800
a87e537b8
modify macro by
2020-11-11 15:53:48 +0800
5bc0a7583
only FMA3 and vector larger than 128 have positive effects. by
2020-11-11 15:18:01 +0800
8c0b206d4
Optimize the performance of rot by using universal intrinsics by
2020-11-11 14:33:12 +0800
7c71f9448
(refs/pull/2982/head)
Revert "Lazyly reinit threads after a fork in OMP mode" by
2020-11-10 17:00:28 -0800
c4c591ac5
(refs/pull/2981/head)
fix sum optimize issues by
2020-11-10 16:16:38 +0800
1ea6cfefd
Refs #2899. Merge branch 'damonyu1989-openblas-open-910' into risc-v by
2020-11-10 09:38:43 +0800