Martin Kroeker
e5ffb7c0a3
Fix ARMV9SME target and add support_sme1 code for MacOS
9 months ago
Martin Kroeker
39718cd28e
Merge pull request #5218 from martin-frbg/lapacke_mangling
lapacke_mangling.h is no longer generated, so don't delete on make clean
9 months ago
Martin Kroeker
fd3afef122
lapacke_mangling.h is no longer generated, so don't delete on make clean
9 months ago
Martin Kroeker
b30dc9701f
Merge pull request #5215 from annop-w/gemv_t
Use SVE kernel for S/DGEMVT for SVE machines
9 months ago
Martin Kroeker
2893d0add4
Merge pull request #5211 from guoyuanplct/develop
Optimizing the Implementation of GEMV on the RISC-V V Extension
9 months ago
Martin Kroeker
ed1e470663
Merge pull request #5217 from haampie/hs/fix/darwin-gcc
test_potrs.c: do not use GCC pragma on darwin-aarch64
9 months ago
Harmen Stoppels
3d6d026fe1
no-gcse when loongarch64
9 months ago
Harmen Stoppels
51ba70f47b
test_potrs.c: remove pragma darwin-aarch64 support
Using GCC 14.2.0 on Darwin, the pragma ultimately causes a linker error
"ld: invalid r_symbolnum=". The current workaround is to use the old
linker, but (a) it's deprecated and (b) it can produce libraries that
are subsequently not linkable with the newer linker in dependents: the
new ld64 does not link to libraries with duplicate rpaths created by the
classic linker.
9 months ago
Annop Wongwathanarat
ec146157d3
Use SVE kernel for S/DGEMVT for SVE machines
10 months ago
Martin Kroeker
de2380e5a6
Merge pull request #5214 from martin-frbg/issue5200
Remove spurious cast from Alpha and Cell's DEFAULT_ALIGN
9 months ago
Martin Kroeker
a34b487f22
Remove spurious cast from Alpha and Cell's DEFAULT_ALIGN
9 months ago
Martin Kroeker
1b3e7cc491
Merge pull request #5212 from martin-frbg/lapack1119
Fix incomplete error message in EIG test (Reference-LAPACK PR 1119)
9 months ago
Martin Kroeker
4270d5bc43
Merge pull request #5204 from martin-frbg/issue4692
Repeat the libs target's "ln" in the all target to ensure completeness of copy on Windows
9 months ago
Martin Kroeker
880e43ee54
Merge pull request #5198 from martin-frbg/woadlldebug
Fix pdb file creation in debug dll builds with CMake on Windows/WoA
9 months ago
Martin Kroeker
70865a894e
Merge pull request #5180 from ywwry66/openmp_use_cmake
CMake: Pass `OpenMP` compiler and linker flags through CMake targets
9 months ago
Martin Kroeker
f0f274725d
Merge pull request #5207 from martin-frbg/issue5202
Fix MacOS compilation with xcode16.3/clang17/gcc14
9 months ago
Martin Kroeker
94fb7033a4
Fix incomplete error message (Reference-LAPACK PR 1119)
9 months ago
lglglglgy
1ff303f36e
Optimizing the Implementation of GEMV on the RISC-V V Extension
Specialized some scenarios, performed loop unrolling, and reduced the
number of multiplications.
9 months ago
Martin Kroeker
fc8090b607
Move additional omp dependency to EXTRALIB
10 months ago
Martin Kroeker
1c5d0d5539
move libomp to extralib
10 months ago
Martin Kroeker
67c5bdd639
Azure CI: Update flang call in OSX_LLVM_flangnew job (#5208)
* Update flang call in OSX_LLVM_flangnew job
10 months ago
Martin Kroeker
1ed962d259
Fix compilation with xcode16.3/clang17/gcc14
10 months ago
Martin Kroeker
f0008f50cc
Merge pull request #5206 from ColumbusAI/develop
Update zsum.c -- fixed spelling error to successfully compile
10 months ago
ColumbusAI
7bf848454d
Update zsum.c -- fixed spelling error to successfully compile
Spelling error: zsum_kernel is used where it should be zasum_kernel. Will not compile without the fix.
10 months ago
Martin Kroeker
0aa5ef29ec
Repeat the libs target's "ln" in the all target to ensure completeness
10 months ago
Martin Kroeker
f90eff306d
Merge pull request #5197 from e4t/z-arch-exec-stack
On zarch don't produce objects from assembler with a writable stack s…
10 months ago
Martin Kroeker
3fc15ad81c
Fix pdb file creation in debug dll builds with CMake on Windows/WoA
10 months ago
Egbert Eich
61b9339d3a
getarch/cpuid.S: Fix warning about executable stack
When using the GNU toolchain, a warning is printed about an executable
stack:
/usr/lib64/gcc/.../x86_64-suse-linux/bin/ld: warning: /tmp/ccyG3xBB.o: missing .note.GNU-stack section implies executable stack
/usr/lib64/gcc/.../x86_64-suse-linux/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
To prevent this warning, add:
```
.section .note.GNU-stack,"",@progbits
```
Signed-off-by: Egbert Eich <eich@suse.com>
10 months ago
Egbert Eich
ea6515c4b3
On zarch don't produce objects from assembler with a writable stack section
On z-series, the current version of the GNU toolchain produces warnings
such as:
```
/usr/lib64/gcc/[...]/s390x-suse-linux/bin/ld: warning: ztrmm_kernel_RC_Z14.o: missing .note.GNU-stack section implies executable stack
/usr/lib64/[...]/s390x-suse-linux/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
```
To prevent this message and make sure we are future-proof, add
```
.section .note.GNU-stack,"",@progbits
```
Also add the `.size` directive to give the asm-defined functions a proper size
in the symbol table.
Signed-off-by: Egbert Eich <eich@suse.com>
10 months ago
Martin Kroeker
f33943d73e
Merge pull request #5196 from martin-frbg/issue5193
Fix misinterpretation of NO_LAPACK=0 and SPMV settings in CMake builds
10 months ago
Ruiyang Wu
251c3f857d
gh m1: fix mixed linkage when built with OpenMP and clang+gfortran
10 months ago
Ruiyang Wu
1b0c0f00e9
CMake: Avoid mixed OpenMP linkage
10 months ago
Ruiyang Wu
02fd1df10b
CMake: Pass `OpenMP` compiler and linker flags through CMake targets
Using `OpenMP::OpenMP_LANG` targets for CMake is less error-prone than
passing the compiler and linker flags manually. Furthermore, it allows
the user to customize those flags by setting `OpenMP_LANG_FLAGS`,
`OpenMP_LANG_LIB_NAMES`, and `OpenMP_omp_LIBRARY`.
10 months ago
Martin Kroeker
8b35534201
Merge pull request #5195 from martin-frbg/update-gensymbolpl
Re-synchronize gensymbol.pl with the posix shell version
10 months ago
Martin Kroeker
51c1fb1f93
Fix ?spmv build and misinterpretation of NO_LAPACK=0
10 months ago
Martin Kroeker
3ca1ba1be3
resynchronize with the posix shell version
10 months ago
Martin Kroeker
72f0abeed5
Merge pull request #5191 from Harishmcw/CMake_Symbol_Fix
Fix DLL symbol name pre/postfixing in CMake builds on Windows
10 months ago
Harishmcw
1724b3f104
DLL symbol pre/postfixing in CMake builds
10 months ago
Harishmcw
c2e7ab5351
DLL symbol pre/postfixing in CMake builds
10 months ago
Martin Kroeker
200771078f
Merge pull request #5190 from Harishmcw/develop
Fix missing commas in gensymbol.pl and DLL symbol pre/postfixing in CMake builds
10 months ago
Martin Kroeker
4e3afa7beb
Merge pull request #5175 from shubhamsvc/dgemv_thread_throttling
Add thread throttling profile for DGEMV on NEOVERSEV1
10 months ago
Harishmcw
c0a5c9655e
Fix missing commas in gensymbol.pl
10 months ago
shubham.chaudhari
8e289ecddc
Simplified thread throttling function in gemv
10 months ago
shubham.chaudhari
189dbbc04f
Add thread throttling for dynamic arch neoversev1
11 months ago
shubham.chaudhari
b6cb5ece58
Add thread throttling profile for DGEMV on NEOVERSEV1
11 months ago
Martin Kroeker
51c244a098
Merge pull request #5184 from taoye9/fix_sbgemv_n_bug
fix bugs in aarch64 sbgemv_n kernel
10 months ago
Ye Tao
f27ba5efd1
fix bugs in aarch64 sbgemv_n kernel
10 months ago
Martin Kroeker
e9fbe0a838
Merge pull request #5183 from annop-w/fix_sbgemv_t
Fix bug in ARM64 sbgemv_t
10 months ago
Annop Wongwathanarat
edef2e4441
Fix bug in ARM64 sbgemv_t
10 months ago
Martin Kroeker
b55ca71d5b
Merge pull request #5182 from annop-w/sgemm_ncopy
Optimize aarch64 sgemm_ncopy
10 months ago