Martin Kroeker
1e49771660
Fix INTERFACE64 builds on Loongarch64
2 years ago
Evgeni Burovski
af1e6fab50
DOC: add a readme for benchmarks/pybench
2 years ago
Evgeni Burovski
e15a90a055
BENCH: add benchmarks using codspeed.io
2 years ago
Martin Kroeker
c22c4e3510
remove spurious brace
2 years ago
Chip Kerchner
0dac9ae7d3
POWER: Fixing endianness issue in cswap/zswap kernel for AIX
2 years ago
Martin Kroeker
cb38cceb8b
Use CMAKE_C_COMPILER_VERSION instead of dumpversion calls (#4698)
* Use CMAKE_C_COMPILER_VERSION throughout
2 years ago
Martin Kroeker
a6d17da568
Fix CMAKE syntax in kernel file parsing of IFNEQ conditionals (#4695)
* Fix syntax in parsing of IFNEQ
2 years ago
frjohnst
05e494051b
Revert "fix conlict between PR 4515 and AIX shared obj support"
This reverts commit bdaa6705ca.
It turns out that PRs 4515 and 4520 break the tests under
lapack-netlib/TESTING which require SECOND and DSECND. IBM
has decided this is a bigger problem than the conflict
between the LAPACK second_ and the xlf runtime.
2 years ago
frjohnst
7ab11bb730
Revert "resolve second_ conflict which breaks xlf timef"
This reverts commit 9b24b31419.
It turns out that PRs 4515 and 4520 break the tests under
lapack-netlib/TESTING which require SECOND and DSECND. IBM
has decided this is a bigger problem than the conflict
between the LAPACK second_ and the xlf runtime.
2 years ago
Martin Kroeker
775ab942dc
Introduce a lower limit for multithreading
2 years ago
Martin Kroeker
8e1e4582d8
Introduce a lower limit for multithreading
2 years ago
Martin Kroeker
ab05a25cd5
fix zdotu argument passing in utest_ext on windows (#4691)
* fix passing of results on windows
2 years ago
Amrita H S
d82494f484
Fix SAXPY regression when compiled with the OpenXL compiler.
SAXPY built with OpenXL regresses when compared to SAXPY
built with gcc. The OpenXL compiler doesn't know that the
SAXPY inner kernel assembly is a 64-element loop, so it
treats the remainder loop as the main loop. It vectorizes
and interleaves the remainder into a loop that handles 48
elements per iteration. With a maximum of 63 iterations, the
48-element loop is mostly not going to get executed, so the
1-element scalar loop that handles whatever is left after it
is probably what gets executed most of the time.
This can be fixed by adding a pragma, loop interleave_count(2),
which results in an 8-element loop.
Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>
2 years ago
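The remainder-loop fix described in the SAXPY commit above can be sketched roughly as follows. This is a minimal illustration, not the actual OpenBLAS kernel: `saxpy_kernel_64` is a hypothetical plain-C stand-in for the hand-written 64-element assembly kernel, and the `#pragma clang loop interleave_count(2)` spelling assumes a clang-based compiler such as OpenXL (other compilers will typically ignore the pragma, possibly with a warning).

```c
/* Hypothetical stand-in for the assembly inner kernel.
 * The caller guarantees that n is a multiple of 64. */
static void saxpy_kernel_64(long n, const float *x, float *y, float alpha)
{
    for (long i = 0; i < n; i++)
        y[i] += alpha * x[i];
}

/* y := alpha * x + y for n contiguous elements (illustrative only). */
static void saxpy(long n, float alpha, const float *x, float *y)
{
    /* Largest multiple of 64 that does not exceed n (n >= 0 assumed). */
    long n1 = n & -64L;

    if (n1 > 0)
        saxpy_kernel_64(n1, x, y, alpha);

    /* Remainder loop: at most 63 iterations. The hint asks the
     * vectorizer to interleave only two vector iterations instead of
     * choosing a wide unroll that would rarely run for such a short
     * trip count. */
#pragma clang loop interleave_count(2)
    for (long i = n1; i < n; i++)
        y[i] += alpha * x[i];
}
```

The hint only changes how the compiler schedules the remainder loop; the computed result is the same with or without it.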
Matti Picus
a1d124094c
use blasint instead of int to quiet warnings
2 years ago
Martin Kroeker
03933483be
Support compilation without CBLAS
2 years ago
Martin Kroeker
8f6ccfa6f4
forward NO_CFLAGS to the CFLAGS, if set
2 years ago
gxw
cb1b6bbcf9
loongarch64: Fixed GCC14 compilation issue
2 years ago
gxw
b77a239b01
loongarch64: Fixed icamax_lsx
2 years ago
gxw
b0d990a1d3
loongarch64: Fixed utest fork:safety
2 years ago
gxw
995cbcf0e4
loongarch64: Add buffer offset for target LOONGSON3R5
2 years ago
Martin Kroeker
95fb386a33
remove stray comma
2 years ago
gxw
f85afaae6d
loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6
2 years ago
Andrew Robbins
2bfb0cbb89
Expose whether locking is enabled in get_config
2 years ago
yamazaki-mitsufumi
91ce6dc9c1
Expanding the scope of 2D thread distribution
2 years ago
Martin Kroeker
d66aa63478
Merge pull request #4681 from martin-frbg/fix4662-2
fix HUGETLB allocation for TLS mode as well
2 years ago
Martin Kroeker
f0f1ff7820
fix HUGETLB allocation for TLS mode as well
2 years ago
Martin Kroeker
edeb5259a1
Merge pull request #4679 from martin-frbg/fix4662
Restore Loongson LA64ARCH handling
2 years ago
Martin Kroeker
4376b6f7d2
Restore Loongson LA64ARCH handling
2 years ago
Martin Kroeker
8735b54fa8
Merge pull request #4662 from martin-frbg/hugetlb-doc
Fix and document the two HUGETLB options for buffer allocation in Makefile.rule
2 years ago
Martin Kroeker
fc10673fd3
Merge branch 'develop' into hugetlb-doc
2 years ago
Martin Kroeker
c20189cc82
Merge pull request #4677 from martin-frbg/issue4676
Add autodetection of Intel Meteor Lake and Emerald Rapids
2 years ago
Martin Kroeker
bbd227ce4a
Add Intel Meteor Lake and Emerald Rapids
2 years ago
Martin Kroeker
f034745ce6
Merge pull request #4675 from martin-frbg/issue4619
Mention LD_LIBRARY_PATH in user documentation
2 years ago
Martin Kroeker
a82ecadc11
mention LD_LIBRARY_PATH
2 years ago
Martin Kroeker
b859f6f191
Merge pull request #4617 from cyk2018/patch-1
[Doc]Update user_manual.md for static linker
2 years ago
Martin Kroeker
dc99b61380
sort unwanted interdependencies of alloc_shm and alloc_hugetlb
2 years ago
Martin Kroeker
9c4e10fbd1
sort hugetlb and shm alloc options
2 years ago
Martin Kroeker
a63d71129c
Merge pull request #4671 from martin-frbg/issue4668
Silence a GCC14 warning/error in the f2c-converted LAPACK
2 years ago
Martin Kroeker
3d26837a35
Suppress GCC14 error exit in the f2c-converted LAPACK
2 years ago
Martin Kroeker
7c915e64ca
Silence a GCC14 warning/error in the f2c-converted LAPACK
2 years ago
Martin Kroeker
edacf9b397
Work around spurious BLAS3 test errors on LOONGSON3R3/4 (#4667)
Force compilation with gfortran to use O0 on older Loongson hardware to avoid spurious test failures
2 years ago
Martin Kroeker
89e3fd0821
Merge pull request #4666 from martin-frbg/issue4633
Fix spurious errors in the extended utest for INTERFACE64=1 on big-endian systems
2 years ago
Martin Kroeker
b1d722fc0c
Fix cast to work with INTERFACE64 (especially on big-endian)
2 years ago
Martin Kroeker
1031d161f6
Merge pull request #4663 from ayappanec/develop
Fix openblas_utest_ext build in AIX
2 years ago
Ayappan P
f4ee0a423b
Fix openblas_utest_ext build in AIX
2 years ago
Martin Kroeker
faf7b3d1bb
Document the two HUGETLB options for buffer allocation
2 years ago
Martin Kroeker
ab5882ebf0
Merge pull request #4661 from martin-frbg/issue4660
Fix CMAKE builds for Loongarch64
2 years ago
Martin Kroeker
69aa93e34f
Fix Loongson compiler flag check
2 years ago
Martin Kroeker
015042f7b5
Fix Loongson compiler flag test
2 years ago
Martin Kroeker
992b71fea2
remove stray comma
2 years ago