Martin Kroeker
952541e840
Need to use filter-out to handle NOFORTRAN not set
7 years ago
Martin Kroeker
9369d3e6e5
Modify NOFORTRAN tests to always check the value; fix rewriting of NO_FORTRAN
7 years ago
Martin Kroeker
10b70c904d
Handle erroneous user settings NOFORTRAN=0 and NO_FORTRAN
7 years ago
Martin Kroeker
6a5ab083b7
Handle special case of gfortran+clang+OpenMP
7 years ago
Martin Kroeker
1f9e4f3193
Handle special case of gfortran+clang+OpenMP
7 years ago
Martin Kroeker
26e1cfb653
Merge pull request #1607 from martin-frbg/dynarch
Move some x86_64 DYNAMIC_ARCH targets to new DYNAMIC_OLDER option
7 years ago
Martin Kroeker
c628c6fa59
Merge pull request #1612 from oon3m0oo/cpus
Fixed a few more unnecessary calls to num_cpu_avail.
7 years ago
Martin Kroeker
67d81ab49d
Merge pull request #1609 from martin-frbg/issue1529
Create OpenBLASConfig.cmake in cmake builds as well
7 years ago
Martin Kroeker
2f957947a6
Merge pull request #1613 from xianyi/revert-1600-noyield
Revert "Use usleep instead of sched_yield by default"
7 years ago
Martin Kroeker
de8fff671d
Revert "Use usleep instead of sched_yield by default"
7 years ago
Martin Kroeker
6f71c0fce4
Return a somewhat sane default value for L2 cache size if cpuid retur… ( #1611 )
* Return a somewhat sane default value for L2 cache size if cpuid returned something unexpected
Fixes #1610, the KVM hypervisor on Google Chromebooks returning zero for CPUID 0x80000006, causing DYNAMIC_ARCH builds of OpenBLAS to hang
7 years ago
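The fallback described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `l2_size_or_default` and the 256 KB fallback value are assumptions made for the example, since the log does not show which default was chosen.

```c
#include <assert.h>

/* Hypothetical fallback in KB; the value used by the actual commit
   is not shown in this log. */
#define L2_FALLBACK_KB 256

/* Return the L2 size reported by CPUID, or the fallback when the
   reported value is implausible (zero or negative), as can happen
   under a hypervisor that does not fill in CPUID 0x80000006. */
int l2_size_or_default(int reported_kb)
{
    assert(L2_FALLBACK_KB > 0);
    if (reported_kb <= 0) {
        return L2_FALLBACK_KB;
    }
    return reported_kb;
}
```

With this guard, code that derives blocking parameters from the L2 size never sees zero, which is the kind of input that could otherwise lead to a non-terminating loop.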
Craig Donner
c2545b0fd6
Fixed a few more unnecessary calls to num_cpu_avail.
I don't have as many benchmarks for these as for gemm, but it should still
make a difference for small matrices.
7 years ago
Martin Kroeker
e65f451409
include CMakePackageConfigHelpers
7 years ago
Martin Kroeker
02634b549b
Add template for OpenBLASConfig.cmake
7 years ago
Martin Kroeker
0bea6bb9e7
Create OpenBLASConfig.cmake from cmake as well
7 years ago
Martin Kroeker
3313e4b946
Merge pull request #1608 from martin-frbg/issue874
Enable parallel make on MS Windows by default
7 years ago
Martin Kroeker
e9cd11768c
Enable parallel make on MS Windows by default
fixes #874
7 years ago
Martin Kroeker
63f7395fb4
Move some DYNAMIC_ARCH targets to new DYNAMIC_OLDER option
7 years ago
Martin Kroeker
1cbd8f3ae4
Move some DYNAMIC_ARCH targets to new DYNAMIC_OLDER option
7 years ago
Martin Kroeker
6c2d90ba77
Move some DYNAMIC_ARCH targets to new DYNAMIC_OLDER option
7 years ago
Martin Kroeker
0297b3211a
Merge pull request #1605 from oon3m0oo/develop
Improve performance of GEMM for small matrices when SMP is defined.
7 years ago
Craig Donner
66316b9f4c
Improve performance of GEMM for small matrices when SMP is defined.
Always checking num_cpu_avail() regardless of whether threading will actually
be used adds noticeable overhead for small matrices. Most other uses of
num_cpu_avail() do so only if threading will be used, so do the same here.
7 years ago
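The idea in the commit above, querying the CPU count only when threading will actually be used, might look like the sketch below. Everything named here is hypothetical: `query_cpu_count` stands in for num_cpu_avail(), and the work estimate and `SMALL_WORK_LIMIT` threshold are assumptions for illustration, not the values OpenBLAS uses.

```c
#include <assert.h>

/* Counts calls to the stand-in query so the saving is observable. */
int cpu_query_calls = 0;

/* Hypothetical stand-in for num_cpu_avail(); returns a fixed count. */
int query_cpu_count(void)
{
    cpu_query_calls++;
    return 4;
}

/* Hypothetical threshold below which threading is not worthwhile. */
#define SMALL_WORK_LIMIT 10000.0

/* Decide how many threads to use for an m x n x k GEMM.
   Small problems return 1 without touching the CPU query at all. */
int choose_threads(long m, long n, long k)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    /* Assumption: the product m*n*k serves as a rough work estimate. */
    double work = (double)m * (double)n * (double)k;
    if (work <= SMALL_WORK_LIMIT) {
        return 1; /* single-threaded path, no CPU query */
    }
    return query_cpu_count();
}
```

For a 4x4x4 problem `choose_threads` returns 1 and leaves `cpu_query_calls` at zero; only a larger problem such as 100x100x100 pays for the query.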
Martin Kroeker
6adc4b7b36
Merge pull request #1601 from martin-frbg/zaxpy
Use a single thread for small input size in zaxpy
7 years ago
Martin Kroeker
2ade0ef085
Merge pull request #1600 from martin-frbg/noyield
Use usleep instead of sched_yield by default
7 years ago
Martin Kroeker
e8880c1699
Use a single thread for small input size
copies daxpy improvement from #27, see #1560
7 years ago
Martin Kroeker
ed7c4a043b
Use usleep instead of sched_yield by default
sched_yield only burns cpu cycles, fixes #900, see also #923, #1560
7 years ago
Martin Kroeker
cf234a0561
Merge pull request #1589 from fenrus75/skylakex
Initial support for SkylakeX / AVX512
7 years ago
Martin Kroeker
ae2a33128b
Merge pull request #1599 from martin-frbg/c_check_avx512
Improved AVX512 test case for c_check
7 years ago
Martin Kroeker
e4718b1fee
Better AVX512 test case
7 years ago
Martin Kroeker
9b87b64262
Improve AVX512 testcase
clang 3.4 managed to accept the original test code, only to fail on the actual Skylake asm later
7 years ago
Martin Kroeker
0218b884c1
Merge pull request #1598 from martin-frbg/issue1593-2
Restore _Atomic define before stdatomic.h for old gcc
7 years ago
Martin Kroeker
83da278093
Update common.h
7 years ago
Martin Kroeker
358d4df2bd
Merge branch 'develop' into issue1593-2
7 years ago
Martin Kroeker
06d43760e4
Restore _Atomic define before stdatomic.h for old gcc
see #1593
7 years ago
Martin Kroeker
a4af8861ff
Merge pull request #1597 from martin-frbg/cmake-avx512
Check build system support for AVX512 instructions
7 years ago
Martin Kroeker
7fb62aed7e
Check build system support for AVX512 instructions
7 years ago
Martin Kroeker
f6021c798d
Re-enable QUIET_MAKE
7 years ago
Martin Kroeker
e8002536ec
disable quiet_make for the moment
7 years ago
Martin Kroeker
ce6317f6c0
Merge pull request #1594 from martin-frbg/issue1593
Fix inverted condition in _Atomic declaration
7 years ago
Martin Kroeker
15a78d6b66
export NO_AVX512 setting
7 years ago
Martin Kroeker
354a976a59
Fix inverted condition in _Atomic declaration
fixes #1593
7 years ago
Martin Kroeker
38ad05bd04
Extend loop range to find SkylakeX in force_coretype
7 years ago
Martin Kroeker
b7feded85a
Propagate NO_AVX512 via CCOMMON_OPT
7 years ago
Martin Kroeker
dc9fe05ab5
Update cpuid_x86.c
7 years ago
Martin Kroeker
8be027e4c6
Update dynamic.c
7 years ago
Martin Kroeker
ac7b6e3e9a
Fix misplaced endif
7 years ago
Martin Kroeker
fc66a0ec0b
Merge pull request #1590 from martin-frbg/avx512_check
Disable AVX512 (Skylake X) support if the build system is too old
7 years ago
Arjan van de Ven
89372e0993
Use AVX512 also for DGEMM
this required switching to the generic gemm_beta code (which is faster anyway on SKX) for both DGEMM and SGEMM
Performance for the not-retuned version is in the 30% range
7 years ago
Martin Kroeker
ef626c6824
typo fix
7 years ago
Martin Kroeker
83fec56a3f
Disable AVX512 (Skylake X) support if the build system is too old
7 years ago