Martin Kroeker
70865a894e
Merge pull request #5180 from ywwry66/openmp_use_cmake
CMake: Pass `OpenMP` compiler and linker flags through CMake targets
10 months ago
Martin Kroeker
f0f274725d
Merge pull request #5207 from martin-frbg/issue5202
Fix MacOS compilation with xcode16.3/clang17/gcc14
10 months ago
Martin Kroeker
fc8090b607
Move additional omp dependency to EXTRALIB
10 months ago
Martin Kroeker
1c5d0d5539
move libomp to extralib
10 months ago
Martin Kroeker
67c5bdd639
Azure CI: Update flang call in OSX_LLVM_flangnew job (#5208)
* Update flang call in OSX_LLVM_flangnew job
10 months ago
Martin Kroeker
1ed962d259
Fix compilation with xcode16.3/clang17/gcc14
10 months ago
Martin Kroeker
f0008f50cc
Merge pull request #5206 from ColumbusAI/develop
Update zsum.c -- fixed spelling error to successfully compile
10 months ago
ColumbusAI
7bf848454d
Update zsum.c -- fixed spelling error to successfully compile
Fixes a spelling error: `zsum_kernel` was used where `zasum_kernel` was intended. The file does not compile without this fix.
10 months ago
Martin Kroeker
f90eff306d
Merge pull request #5197 from e4t/z-arch-exec-stack
On zarch don't produce objects from assembler with a writable stack s…
10 months ago
Egbert Eich
61b9339d3a
getarch/cpuid.S: Fix warning about executable stack
When using the GNU toolchain, a warning is printed about an executable
stack:
/usr/lib64/gcc/.../x86_64-suse-linux/bin/ld: warning: /tmp/ccyG3xBB.o: missing .note.GNU-stack section implies executable stack
/usr/lib64/gcc/.../x86_64-suse-linux/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
To prevent this warning, add:
```
.section .note.GNU-stack,"",@progbits
```
Signed-off-by: Egbert Eich <eich@suse.com>
10 months ago
Egbert Eich
ea6515c4b3
On zarch don't produce objects from assembler with a writable stack section
On z-series, the current version of the GNU toolchain produces warnings
such as:
```
/usr/lib64/gcc/[...]/s390x-suse-linux/bin/ld: warning: ztrmm_kernel_RC_Z14.o: missing .note.GNU-stack section implies executable stack
/usr/lib64/[...]/s390x-suse-linux/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
```
To prevent this message and stay future-proof, add
```
.section .note.GNU-stack,"",@progbits
```
Also add the `.size` directive to give the asm-defined functions a proper size
in the symbol table.
Signed-off-by: Egbert Eich <eich@suse.com>
10 months ago
Martin Kroeker
f33943d73e
Merge pull request #5196 from martin-frbg/issue5193
Fix misinterpretation of NO_LAPACK=0 and SPMV settings in CMake builds
10 months ago
Ruiyang Wu
251c3f857d
gh m1: fix mixed linkage when built with OpenMP and clang+gfortran
10 months ago
Ruiyang Wu
1b0c0f00e9
CMake: Avoid mixed OpenMP linkage
11 months ago
Ruiyang Wu
02fd1df10b
CMake: Pass `OpenMP` compiler and linker flags through CMake targets
Using `OpenMP::OpenMP_LANG` targets for CMake is less error-prone than
passing the compiler and linker flags manually. Furthermore, it allows
the user to customize those flags by setting `OpenMP_LANG_FLAGS`,
`OpenMP_LANG_LIB_NAMES`, and `OpenMP_omp_LIBRARY`.
11 months ago
Martin Kroeker
8b35534201
Merge pull request #5195 from martin-frbg/update-gensymbolpl
Re-synchronize gensymbol.pl with the posix shell version
10 months ago
Martin Kroeker
51c1fb1f93
Fix ?spmv build and misinterpretation of NO_LAPACK=0
10 months ago
Martin Kroeker
3ca1ba1be3
resynchronize with the posix shell version
10 months ago
Martin Kroeker
72f0abeed5
Merge pull request #5191 from Harishmcw/CMake_Symbol_Fix
Fix DLL symbol name pre/postfixing in CMake builds on Windows
10 months ago
Harishmcw
1724b3f104
DLL symbol pre/postfixing in CMake builds
10 months ago
Harishmcw
c2e7ab5351
DLL symbol pre/postfixing in CMake builds
10 months ago
Martin Kroeker
200771078f
Merge pull request #5190 from Harishmcw/develop
Fix missing commas in gensymbol.pl and DLL symbol pre/postfixing in CMake builds
10 months ago
Martin Kroeker
4e3afa7beb
Merge pull request #5175 from shubhamsvc/dgemv_thread_throttling
Add thread throttling profile for DGEMV on NEOVERSEV1
10 months ago
Harishmcw
c0a5c9655e
Fix missing commas in gensymbol.pl
10 months ago
shubham.chaudhari
8e289ecddc
Simplified thread throttling function in gemv
11 months ago
shubham.chaudhari
189dbbc04f
Add thread throttling for dynamic arch neoversev1
11 months ago
shubham.chaudhari
b6cb5ece58
Add thread throttling profile for DGEMV on NEOVERSEV1
11 months ago
Martin Kroeker
51c244a098
Merge pull request #5184 from taoye9/fix_sbgemv_n_bug
fix bugs in aarch64 sbgemv_n kernel
11 months ago
Ye Tao
f27ba5efd1
fix bugs in aarch64 sbgemv_n kernel
11 months ago
Martin Kroeker
e9fbe0a838
Merge pull request #5183 from annop-w/fix_sbgemv_t
Fix bug in ARM64 sbgemv_t
11 months ago
Annop Wongwathanarat
edef2e4441
Fix bug in ARM64 sbgemv_t
11 months ago
Martin Kroeker
b55ca71d5b
Merge pull request #5182 from annop-w/sgemm_ncopy
Optimize aarch64 sgemm_ncopy
11 months ago
Martin Kroeker
2f778554b8
Merge pull request #5181 from taoye9/change_sbgemn_cast_bf16
Replace custom bf16_to_fp32 with Arm NEON vcvtah_f32_bf16
11 months ago
Martin Kroeker
66e0f1e621
Merge pull request #5178 from martin-frbg/lapack_cplx_dummy
Add dummy implementations of make_complex_(float/double) to simplify Windows DLL linking
11 months ago
Annop Wongwathanarat
9807f56580
Optimize aarch64 sgemm_ncopy
11 months ago
Martin Kroeker
1ba02656e6
Merge pull request #5177 from martin-frbg/cmakelapacke
Fix omission of LAPACKE interfaces for cgesvdq, strsyl3 and deprecated functions in CMake builds
11 months ago
Martin Kroeker
8a418b1aab
Add dummy implementations for the LAPACK_COMPLEX_CUSTOM case
11 months ago
Martin Kroeker
b34235ca66
Fix inclusion of deprecated interfaces and cgesvdq/strsyl3
11 months ago
Martin Kroeker
37b854769b
Merge pull request #5173 from nakagawa-fj/gemm_load_imbalance
Improving Load Imbalance in Thread-Parallel GEMM
11 months ago
Martin Kroeker
a3e7b16072
Merge pull request #5157 from manaalmj/feature
Optimize gemv_n_sve kernel
11 months ago
Martin Kroeker
8865850496
Merge pull request #5176 from annop-w/fix_sbgemv_t
Fix aarch64 sbgemv_t compilation error for GCC < 13
11 months ago
Ye Tao
4c00099ed6
Replace custom bf16_to_fp32 with Arm NEON vcvtah_f32_bf16
11 months ago
Annop Wongwathanarat
a085b6c9ec
Fix aarch64 sbgemv_t compilation error for GCC < 13
11 months ago
Masato Nakagawa
80d3c2ad95
Improve load balance in thread-parallel GEMM
11 months ago
manjam01
5c4e38ab17
Optimize gemv_n_sve kernel
11 months ago
Martin Kroeker
39eb43d441
Improve thread safety of pthreads builds that rely on C11 atomic operations for locking (#5170)
* Tighten memory orders for C11 atomic operations
11 months ago
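The commit message above only says that memory orders were tightened; it does not show the change itself. As a generic illustration of what acquire/release ordering looks like for a C11 lock, here is a minimal sketch. All names (`demo_lock_t`, `demo_trylock`, `demo_lock_acquire`, `demo_lock_release`, `demo_guarded_increment`) are hypothetical and are not taken from the OpenBLAS sources.

```c
#include <stdatomic.h>

/* Hypothetical spinlock type for illustration; not OpenBLAS code. */
typedef struct {
    atomic_flag held;
} demo_lock_t;

/* Try once to take the lock. Returns 1 on success, 0 if already held.
 * memory_order_acquire keeps later reads/writes in the critical
 * section from being reordered before the lock is taken. */
static int demo_trylock(demo_lock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spin until the lock is taken. */
static void demo_lock_acquire(demo_lock_t *l)
{
    while (!demo_trylock(l)) {
        /* busy-wait; a real implementation would likely yield or pause */
    }
}

/* Release the lock. memory_order_release makes all writes done inside
 * the critical section visible to the next thread that acquires it. */
static void demo_lock_release(demo_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Example critical section: increment a shared counter under the lock
 * and return the new value. */
static long demo_guarded_increment(demo_lock_t *l, long *counter)
{
    long v;
    demo_lock_acquire(l);
    v = ++(*counter);
    demo_lock_release(l);
    return v;
}
```

A lock would be initialised with `demo_lock_t l = { ATOMIC_FLAG_INIT };`. Pairing an acquire on lock with a release on unlock is the usual minimum for a mutex-style lock; relaxed ordering on either side would allow accesses to leak out of the critical section.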
Martin Kroeker
1d5ed5c46b
Merge pull request #5168 from taoye9/add_sbgemvn_on_neonversen2
Add dispatch of SBGEMVNKERNEL for NEOVERSEN2 and NEOVERSEV2
11 months ago
Martin Kroeker
7338a473a7
Merge pull request #5150 from Harishmcw/WoA-Experiments
Redefined threading logic for GESV and GEMV on WoA
11 months ago
Martin Kroeker
5f200dca54
Merge pull request #5166 from martin-frbg/issue5158
Expose the option to build without LAPACKE to ccmake
11 months ago
Martin Kroeker
8b98db13e3
Merge pull request #5167 from taoye9/fix_sbgemv_n_kernel_typo
Fix minor issues with redeclaration of float x0, x1 in sbgemv_n_neon.c
11 months ago