Martin Kroeker
3bb70b8ca4
Merge pull request #4205 from martin-frbg/fixintmain
Fix missing type declaration for main() in converted LAPACK files
2 years ago
Martin Kroeker
88435104c8
Merge pull request #4204 from martin-frbg/llvm17-2
Work around LLVM17 miscompiling the AVX512 microkernels for CASUM/ZASUM
2 years ago
Martin Kroeker
be57c595aa
Merge pull request #4203 from martin-frbg/issue4201
Add support for building arm64 SVE kernels with the NVIDIA HPC compiler
2 years ago
Martin Kroeker
7a6203ffa1
restore default Neoverse SVE build instructions for non-NVIDIA compilers
2 years ago
Martin Kroeker
7f7d3896dd
Fix missing type declaration for main
2 years ago
Martin Kroeker
2c3034ff7f
Disable the C/ZASUM AVX512 microkernels when compiling with LLVM17 as well
2 years ago
Martin Kroeker
49689fbef7
Add support for compiling SVE kernels with the NVIDIA HPC compiler
2 years ago
Martin Kroeker
8794544b43
Add support for compiling the Neoverse SVE kernels with the NVIDIA HPC compiler
2 years ago
Martin Kroeker
e9f1b2d26f
Expand the SVE compatibility check for the NVIDIA HPC compiler
2 years ago
Martin Kroeker
d69f57c8c2
Merge pull request #4200 from XiWeiGu/loongarch64_sgemm
LoongArch64: Add sgemm_kernel
2 years ago
gxw
553cc1372f
LoongArch64: Add sgemm_kernel
2 years ago
Martin Kroeker
12ede72ab7
Merge pull request #4192 from imciner2/im/clangfix
Fix cooperlake and sapphire rapids march flags on clang
2 years ago
Martin Kroeker
8d9f701fbf
Merge pull request #4195 from TiborGY/BF16_ignore
Add junk from BF16 test to .gitignore
2 years ago
Martin Kroeker
7f67ba9147
Merge pull request #4198 from martin-frbg/issue4197
Correct INFO returned for too small lda in non-CBLAS s/dgeadd
2 years ago
Martin Kroeker
214be14c1d
Correct INFO returned for lda in non-CBLAS s/dgeadd
2 years ago
Martin Kroeker
1b09f4b2bb
Merge pull request #4193 from imciner2/im/ppcgnu
Fix power10 gcc intrinsic check
2 years ago
Ian McInerney
79c15db348
Fix power10 gcc intrinsic check
__builtin_vsx_assemble_pair was only in GCC 10-11.2 and was replaced by
__builtin_vsx_build_pair thereafter.
2 years ago
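The version window described in this commit message can be sketched as a small lookup. This is an illustration only, not the check used in the actual build files: the function name `vsx_pair_builtin` and the `(major, minor)` tuple argument are hypothetical, and the boundaries (GCC 10 through 11.2 inclusive, nothing before GCC 10) are read directly from the message above.

```python
def vsx_pair_builtin(gcc_version):
    """Return the VSX pair builtin name for a (major, minor) GCC version.

    Hypothetical helper: boundaries follow the commit message, which says
    __builtin_vsx_assemble_pair existed only in GCC 10 through 11.2 and
    was replaced by __builtin_vsx_build_pair afterwards.
    """
    if gcc_version < (10, 0):
        # Assumption: neither builtin is available before GCC 10.
        return None
    if gcc_version <= (11, 2):
        return "__builtin_vsx_assemble_pair"
    return "__builtin_vsx_build_pair"
```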
TiborGY
0d30daa772
Add junk from BF16 test to .gitignore
2 years ago
Ian McInerney
8a8a8479be
Fix cooperlake and sapphire rapids march flags on clang
The march=cooperlake and march=sapphirerapids flags were never getting
added when building with Clang targeting those architectures. Instead
it was falling back to the skylake AVX512 implementation.
Clang added support for these two architectures in Clang 9 and Clang 12,
so introduce new checks for those versions to enable the appropriate
march flag, and fall back to skylake otherwise.
2 years ago
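The selection logic described in this commit message can be sketched roughly as follows. It is a sketch under stated assumptions, not the code from the pull request: the function name `clang_march_flag` and the target strings are hypothetical, the mapping of Clang 9 to cooperlake and Clang 12 to sapphirerapids assumes the message lists the versions in the same order as the architectures, and `-march=skylake-avx512` is assumed as the spelling of the skylake fallback.

```python
def clang_march_flag(target, clang_major):
    """Pick a -march flag for an AVX512 target given the Clang major version.

    Hypothetical helper illustrating the version gating in the commit
    message; the real build system logic may differ.
    """
    if target == "COOPERLAKE" and clang_major >= 9:
        return "-march=cooperlake"
    if target == "SAPPHIRERAPIDS" and clang_major >= 12:
        return "-march=sapphirerapids"
    # Older Clang (or any other target in this sketch): skylake AVX512.
    return "-march=skylake-avx512"
```

With this sketch, a Sapphire Rapids build on Clang 11 gets the skylake flag, while the same build on Clang 12 gets the dedicated one.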
Martin Kroeker
562ef5fdca
Merge pull request #4169 from felixonmars/patch-1
Use defined variable for riscv64 in arch.cmake
2 years ago
Martin Kroeker
0e5d56ae4a
Merge pull request #4170 from felixonmars/patch-2
Fix 64-bit fortran options for riscv64
2 years ago
Martin Kroeker
ebc157fcc9
Merge pull request #4190 from martin-frbg/issue4186-2
Allow negative INCX in the ?NRM2 kernels
2 years ago
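As a rough illustration of what a negative INCX means for a 2-norm routine, the sketch below walks the vector backwards from its last addressed element. It is not the OpenBLAS kernel: the function name `nrm2` and its Python signature are stand-ins, it assumes the usual BLAS convention that a negative increment starts at offset `(n - 1) * |incx|`, and it uses a naive sum of squares rather than the scaled accumulation a production kernel would need to avoid overflow.

```python
import math


def nrm2(n, x, incx):
    """Naive Euclidean norm of n elements of x taken with stride incx."""
    if n <= 0:
        return 0.0
    # For a negative stride, start at the far end so that the same
    # elements are visited as for |incx|, only in reverse order.
    ix = (n - 1) * (-incx) if incx < 0 else 0
    total = 0.0
    for _ in range(n):
        total += x[ix] * x[ix]
        ix += incx
    return math.sqrt(total)
```

Under that convention the result for `incx` and `-incx` is the same, since only the traversal order changes.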
Martin Kroeker
34da1a067d
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
Martin Kroeker
07e32c4cb8
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
Martin Kroeker
c211da0688
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
Martin Kroeker
a34a0a7abc
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
Martin Kroeker
54d3246fc6
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
Martin Kroeker
7dd441d5db
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
Martin Kroeker
f692178792
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
Martin Kroeker
d15ffb7fdf
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
Martin Kroeker
a2d867f4d1
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
Martin Kroeker
9a0e9c8b69
Merge pull request #4171 from boomanaiden154/clang-libomp-fixes
Fix build with some clang installations when openmp is enabled
2 years ago
Martin Kroeker
7af0f41762
Merge pull request #4189 from martin-frbg/issue4186
Prepare the interface for INCX < 0 in the new NRM2 implementation from BLAS 3.10
2 years ago
Martin Kroeker
4cc804c754
Prepare for INCX < 0 in new NRM2 implementation from BLAS 3.10
2 years ago
Martin Kroeker
afdc56a421
Merge pull request #4158 from XiWeiGu/loongarch64_update_dgemm_kernel
LoongArch64: Update dgemm kernel
2 years ago
Martin Kroeker
91e5513f3b
Merge pull request #4184 from XiWeiGu/dgemv
LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S V2
2 years ago
gxw
e8b571d245
LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S V2
2 years ago
gxw
71fcee6eef
LoongArch64: Update dgemm kernel
2 years ago
Martin Kroeker
0f521ece25
Merge pull request #4183 from martin-frbg/issue4181
Apply USE_TRMM to MIPS64_GENERIC as to GENERIC in gmake builds
2 years ago
Martin Kroeker
232420bdf5
Merge pull request #4182 from xianyi/revert-4153-dgemv
Revert "LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S"
2 years ago
Martin Kroeker
41c31bc1d4
Revert "LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S"
2 years ago
Martin Kroeker
61d803547a
Apply USE_TRMM to MIPS64_GENERIC as to GENERIC
2 years ago
Martin Kroeker
f8ee309402
Merge pull request #4153 from XiWeiGu/dgemv
LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S
2 years ago
Martin Kroeker
12e98482e9
Merge pull request #4179 from martin-frbg/jenkinsfix
Run "make clean" on Jenkins first to remove stale objects
2 years ago
Martin Kroeker
51c218d17a
Update Jenkinsfile
2 years ago
Martin Kroeker
df978c90cd
Update Jenkinsfile.pwr
2 years ago
Martin Kroeker
ef4a7e3fca
Merge pull request #4127 from XiWeiGu/LoongArch64-CI
LoongArch64 CI
2 years ago
Martin Kroeker
b63e4581a3
Merge pull request #4016 from mmuetzel/ci-msys2
Add support for LLVM Flang
2 years ago
Markus Mützel
53378296c8
CI: Build with NO_AVX512 for the runners that use Flang 16.
2 years ago
Markus Mützel
1c3fcaaf42
CI (MSYS2): Re-run failed tests verbosely.
2 years ago