Martin Kroeker
7c1925acec
Use .p2align instead of .align for compatibility on Sandybridge as well
8 years ago
Martin Kroeker
2359c7c1a9
Use .p2align instead of .align for portability
The OSX assembler apparently mishandles the argument to decimal .align, leading to a significant loss of performance,
as observed in #730, #901 and most recently #1470.
8 years ago
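
The alignment issue behind commit 2359c7c1a9 can be illustrated with a small C sketch. GNU as documents .p2align n as always requesting 2^n-byte alignment, while the meaning of the argument to .align depends on the target: some treat it as a byte count, others as a power of two. The code below is not taken from OpenBLAS; the function names are hypothetical and it only computes how much padding each reading of the argument would produce.

```c
#include <stddef.h>

/* Hypothetical helper: padding bytes needed to advance `offset` to the
   next multiple of `alignment` (which must be a power of two). */
size_t pad_to_alignment(size_t offset, size_t alignment) {
    return (alignment - (offset & (alignment - 1))) & (alignment - 1);
}

/* Hypothetical helper: padding when the argument is read as an exponent,
   which is how .p2align n is defined (alignment of 2^n bytes). */
size_t pad_p2align(size_t offset, unsigned n) {
    return pad_to_alignment(offset, (size_t)1 << n);
}
```

With an offset of 5, pad_to_alignment(5, 16) and pad_p2align(5, 4) both yield 11 bytes, whereas reading the same 16 as an exponent, pad_p2align(5, 16), yields 65531 bytes. Writing .p2align 4 removes that ambiguity.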
Martin Kroeker
e3a80e6aa8
Merge pull request #1466 from xianyi/revert-1464-issue1461
Revert "Add locks only for non-OPENMP multithreading"
8 years ago
Martin Kroeker
8866e393a2
Revert "Add locks only for non-OPENMP multithreading"
8 years ago
Martin Kroeker
9e87e6f3d8
Merge pull request #1464 from martin-frbg/issue1461
Add locks only for non-OPENMP multithreading
8 years ago
Martin Kroeker
3119b2ab4c
Add locks only for non-OPENMP multithreading
to mitigate performance problems caused by #1052 and #1299 as seen in #1461
8 years ago
Martin Kroeker
e7366a4161
Restore the remaining utests (#1462)
* Restore the remaining utests
* Try fork test on Cygwin and Linux only, it hangs on at least ARMv8/Android as well
* Use generic sswap/dswap kernels for NEHALEM 32bit to fix fault found by the restored swap utest
* Disable zdotu test for MS cl to work around runtime error -1073741819 on AppVeyor for now
(probably coding error in the initialization of the complex numbers or wrong choice of zdotu API)
8 years ago
Martin Kroeker
6c0f79b787
Merge pull request #1463 from martin-frbg/rotmg2
Fix wrong conditionals in scaling loops of rotmg and update BLAS1 tests from netlib
8 years ago
Martin Kroeker
7cd7acf71e
Merge pull request #1460 from martin-frbg/issue1425
Revert insidious suppression of the -fopenmp flag in the LAPACK subtree
8 years ago
Martin Kroeker
72f14a0363
Fix conditionals in the rescaling against GAMSQ
8 years ago
Martin Kroeker
53026dc63a
Update single and double precision BLAS1 tests from LAPACK 3.8.0
adding tests for SROTMG, SROTM, SDSDOT, DROTMG, DROTM, DSDOT
8 years ago
Martin Kroeker
798f1595d5
Fix condition in both second scaling loops
8 years ago
Martin Kroeker
eaab622f03
Make "OMP task depend" sections conditional on OpenMP4, not just OpenMP
To allow compiling with gcc versions older than 4.9
8 years ago
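
As a sketch of what the guard in commit eaab622f03 might look like: the _OPENMP macro expands to the yyyymm date of the supported specification, and 201307 corresponds to OpenMP 4.0, which introduced the depend clause for tasks. The exact condition used in OpenBLAS is not shown in this log, so the check below is an assumption, and chained_sum is a hypothetical function written only for illustration.

```c
/* Hypothetical example: run two dependent steps as OpenMP tasks when the
   compiler supports OpenMP 4.0 or later, and fall back to plain
   sequential code otherwise (older OpenMP, or no OpenMP at all). */
int chained_sum(void) {
    int x = 0;
#if defined(_OPENMP) && (_OPENMP >= 201307)
    #pragma omp parallel
    #pragma omp single
    {
        /* The second task depends on the first through x. */
        #pragma omp task shared(x) depend(out: x)
        x = 1;
        #pragma omp task shared(x) depend(inout: x)
        x += 41;
    }
#else
    x = 1;
    x += 41;
#endif
    return x;
}
```

Either path returns 42; only the first one needs a compiler that understands depend, which is why testing for _OPENMP alone is not enough for compilers older than gcc 4.9.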
Martin Kroeker
3cda1ce50a
Revert insidious suppression of the -fopenmp flag in the LAPACK subtree
This was added in #1046 citing a problem with mingw, but in effect it quietly reduces thread safety on all non-Windows platforms (while -fopenmp is already disabled for Windows builds through the toplevel Makefile.system). Removing the filter fixes #1425
8 years ago
Martin Kroeker
0391c07b17
Merge pull request #1457 from martin-frbg/issue1456
test_fork is not meant to be run (nor expected to work) with OpenMP
8 years ago
Martin Kroeker
f4b095b1bb
test_fork is not meant (nor expected) to be run with OpenMP
Fixes #1456
8 years ago
Martin Kroeker
6940c59a88
Merge pull request #1454 from martin-frbg/issue1452
Keep the flag handling separate from the scaling loops in rotmg
8 years ago
Martin Kroeker
42514fa24c
Merge pull request #1449 from martin-frbg/armv8
Enable assembly kernels for the generic ARMV8 target and treat CortexA53,A72 as A57
8 years ago
Martin Kroeker
650077074a
Add tests for rotmg
8 years ago
Martin Kroeker
fe16a94fc2
Add rotmg tests for CMAKE MSVC+CLANG build
8 years ago
Martin Kroeker
632b8e0f05
Merge current Makefile from develop
8 years ago
Martin Kroeker
a1bc0fcf07
Resurrect utest for rotmg and add testcase for issue 1452
8 years ago
Martin Kroeker
0464aa6784
Remove debug printfs
8 years ago
Martin Kroeker
55840f0bc9
Keep the flag handling separate from the scaling loops
Fixes #1452 and is more in line with how ATLAS does it. The earlier fix from #356 only moved the bug elsewhere, but we will never want the iterative rescaling to change the dflag setting and variable associations with each cycle.
8 years ago
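
The idea described in commit 55840f0bc9 can be sketched roughly as follows. This is a simplified illustration, not the OpenBLAS or ATLAS code: the names rescale and flag_after_rescale are hypothetical, the constants are assumed to be GAM = 4096 with RGAMSQ taken as 1/GAMSQ, and the flag-dependent fill-in of the implicit entries of H is omitted. The scaling loops only touch the numerical values, and the flag is decided once afterwards.

```c
#define GAM    4096.0
#define GAMSQ  (GAM * GAM)
#define RGAMSQ (1.0 / GAMSQ)   /* assumption: exact reciprocal of GAMSQ */

static double absd(double v) { return v < 0.0 ? -v : v; }

/* Hypothetical sketch: bring *d into the range (RGAMSQ, GAMSQ) while
   keeping d*x*x unchanged, scaling one row of H (h_a, h_b) along with it.
   Returns the number of scaling steps; the flag is not touched here. */
int rescale(double *d, double *x, double *h_a, double *h_b) {
    int steps = 0;
    if (*d == 0.0) return 0;
    while (absd(*d) <= RGAMSQ) {
        *d *= GAMSQ; *x /= GAM; *h_a /= GAM; *h_b /= GAM;
        steps++;
    }
    while (absd(*d) >= GAMSQ) {
        *d /= GAMSQ; *x *= GAM; *h_a *= GAM; *h_b *= GAM;
        steps++;
    }
    return steps;
}

/* Hypothetical sketch: the flag changes at most once, after the loops,
   instead of being rewritten on every rescaling cycle. */
double flag_after_rescale(double flag, int steps) {
    return steps > 0 ? -1.0 : flag;
}
```

Keeping the flag update outside the loops means that repeated rescaling cycles cannot change which form of H is in effect halfway through.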
Martin Kroeker
a7485f3222
Merge pull request #1453 from martin-frbg/netlib228
Remove spurious EXTERNAL reference
8 years ago
Martin Kroeker
d31f62cf02
Merge pull request #1450 from embray/cygwin/forking
Fix issues with forking on Cygwin
8 years ago
Martin Kroeker
150c7294a6
Remove spurious EXTERNAL reference
From Reference-LAPACK issue 228, remove spurious EXTERNAL reference to unused and nonexistent function xLACGV that could cause linking problems.
8 years ago
Erik M. Bray
8f5f614615
On Cygwin use mmap instead of Windows native allocation functions, which are not fork-safe.
8 years ago
Erik M. Bray
f5fc109fbd
Perform blas_thread_shutdown with pthread_atfork() on Cygwin
Even if we're directly using the win32 threading driver and not pthreads,
pthread_atfork still works fine to register a pre-fork handler, and is
necessary to restore the threading server to a pre-initialized state.
8 years ago
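
The mechanism in commit f5fc109fbd can be sketched with plain POSIX calls. pthread_atfork() registers a prepare handler that runs in the parent just before fork(), so state that cannot survive a fork is torn down first. The server_* functions and the server_running variable below are hypothetical stand-ins for the threading server, not OpenBLAS symbols.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the state of a threading server. */
static int server_running = 0;

static void server_start(void)    { server_running = 1; }
static void server_shutdown(void) { server_running = 0; }

/* Prepare handler: runs in the parent immediately before fork(). */
static void before_fork(void) { server_shutdown(); }

/* Register the pre-fork handler; returns 0 on success. */
int install_fork_handler(void) {
    return pthread_atfork(before_fork, NULL, NULL);
}

/* Fork once and report whether the child saw the server in its
   pre-initialized state: 1 = yes, 0 = no, -1 = fork or wait failed. */
int child_sees_clean_state(void) {
    server_start();
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(server_running ? 1 : 0);
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

After install_fork_handler() succeeds, child_sees_clean_state() should report 1, because the prepare handler resets the state before the child is created. In this sketch the parent would have to call server_start() again if it still needs the server after forking.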
Erik M. Bray
ce2028b425
Rewrite this test to work with ctest and re-enable it on the appropriate platforms (including Cygwin, which has fork())
8 years ago
Martin Kroeker
b47e6822aa
Enable most assembly kernels in the generic ARMV8 target
ref #1439
8 years ago
Martin Kroeker
0ae5e14923
Detect CORTEX A53 and A72 as CORTEXA57
8 years ago
Martin Kroeker
95f2cea45b
Merge pull request #1447 from martin-frbg/sparcfix
Generate CHAR_CORENAME for SPARC
8 years ago
Martin Kroeker
e3c50643bb
Fix my copypaste blunder with get_corename
8 years ago
Martin Kroeker
efa84afd00
Use get_corename for SPARC as well
8 years ago
Martin Kroeker
d1b512a01a
Return a corename for SPARC
8 years ago
Martin Kroeker
e0b02789ff
Merge pull request #1445 from quickwritereader/develop
Small fix inside ifdef z13mvc (z13mvc code is not used in production)
8 years ago
Abdelrauf
60596a1abc
Merge branch 'develop' into develop
8 years ago
Abdelrauf
afd514c25d
Small fix inside ifdef z13mvc (z13mvc code is not used in production)
8 years ago
Martin Kroeker
25c7e3992f
Merge pull request #1443 from martin-frbg/sparcfix
Also #define SPARC in config.h when autodetecting
8 years ago
Martin Kroeker
f45776ec1f
Merge pull request #1440 from quickwritereader/develop
small corrections
8 years ago
Martin Kroeker
e388459a27
Merge pull request #1419 from brada4/develop
Initialize uninitialized values for repeated calls
8 years ago
Martin Kroeker
0ac824f6a5
Also #define SPARC in config.h when autodetecting
Fixes #1442
8 years ago
Abdelrauf
f653e7a18d
small fix
Small fix inside ifdef z13mvc (z13mvc code is not used in production)
8 years ago
Martin Kroeker
09e397e4f1
Merge pull request #1434 from xoviat/flang-wall
CMake: Remove unused wall option when FC=flang
8 years ago
Martin Kroeker
913e9546c0
Merge pull request #1436 from martin-frbg/cmaketrmm
Make USE_TRMM depend on TARGET_CORE not TARGET
8 years ago
the mslm
f946a89432
zscal (case: real alpha=0) microkernel shift & mem fix, da_i as input reg. Small typo fixes
8 years ago
Martin Kroeker
485df77612
Make USE_TRMM depend on TARGET_CORE not TARGET
Fixes #1432 (and possibly other DTRMM-related failures on Haswell and related architectures when built with cmake)
8 years ago
xoviat
038bfbb86c
CMake: Remove unused wall option when FC=flang
8 years ago
Martin Kroeker
114fc0bae3
Merge pull request #1429 from martin-frbg/override_omp
When forcing USE_THREAD to zero, override USE_OPENMP as well
8 years ago