Honglin Zhu
123e0dfb62
Neoverse N2 sbgemm:
1. Modify the algorithm to resolve multithreading failures
2. No memory allocation in sbgemm kernel
3. Optimize when alpha == 1.0f
3 years ago
Honglin Zhu
bc3728475f
format code
3 years ago
Honglin Zhu
55d686d41e
neoverse n2 sbgemm:
implement ncopy tcopy kernel_8x4
3 years ago
Honglin Zhu
04593bb27c
neoverse n2 sbgemm: init file
3 years ago
Martin Kroeker
1fb4259077
Merge pull request #3673 from martin-frbg/azuredynmingw
AzureCI: drop cpus from the DYNAMIC_LIST for Windows/mingw to save time
3 years ago
Martin Kroeker
47a0e53196
mingw-dynamic arch: drop Haswell too
3 years ago
Martin Kroeker
c7b3ce010e
drop NEHALEM from the DYNLIST for Windows/mingw to save time
3 years ago
Martin Kroeker
be5500e704
Merge pull request #3669 from VFerrari/fix_small_matrix_kernel
POWER: fix issues with the small matrix kernel
3 years ago
Martin Kroeker
92275a7902
Merge pull request #3642 from nursik/develop
Add ARM64 support for Windows
3 years ago
Martin Kroeker
914c4d0fe8
Add C versions of the CBLAS test sources (#3656)
* Add C conversions of the CBLAS tests for NOFORTRAN=1 builds
* Enable CTEST without Fortran and fix passing of BUILD_vartype options to exports/gensymbol
3 years ago
VFerrari
2062280c6f
Power: Enable SMALL_MATRIX OPT as default for dynamic arch
3 years ago
VFerrari
cac634fce3
POWER10: Fix multithreading check when USE_THREAD=0
This patch fixes an issue that occurs when OpenBLAS is compiled for
TARGET=POWER10 with the flag USE_THREAD set to 0.
The function `num_cpu_avail` is only available when USE_THREAD=1,
that is, when SMP is defined.
3 years ago
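The guard described in commit cac634fce3 can be illustrated with a minimal C sketch. This is not the actual patch: `num_cpu_avail_sketch`, `should_use_threads`, and the `threshold` parameter are hypothetical names introduced here, and the only detail taken from the commit message is that the CPU-count helper exists solely when SMP is defined.

```c
/* Build with -DSMP to mimic a threaded (USE_THREAD=1) build. */

#ifdef SMP
/* Hypothetical stand-in for the real helper; it returns a fixed
 * CPU count purely for illustration. */
static int num_cpu_avail_sketch(void) { return 4; }
#endif

/* Returns 1 if the multithreaded path should be taken, 0 otherwise.
 * Both the function and the threshold are assumptions of this sketch. */
static int should_use_threads(long work_size, long threshold)
{
#ifdef SMP
    /* The helper is referenced only inside the SMP guard. */
    if (num_cpu_avail_sketch() == 1) return 0;
    return work_size >= threshold;
#else
    /* Single-threaded build: the helper does not exist, so it must
     * not be referenced here at all. */
    (void)work_size;
    (void)threshold;
    return 0;
#endif
}
```

Keeping every reference to the helper inside the `#ifdef SMP` branch is what lets the same source compile in both configurations.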
Martin Kroeker
9283c7c0b5
Merge pull request #3655 from RajalakshmiSR/zgemmasmp10
POWER10: Fix ZGEMM testcase failures
3 years ago
Martin Kroeker
9777c59d98
Merge pull request #3653 from RajalakshmiSR/dgemvp10
POWER10: convert dgemv inline assembly
3 years ago
Rajalakshmi Srinivasaraghavan
f191bc652b
POWER10: Fix ZGEMM testcase failures
This patch fixes storing and restoring of non-volatile registers
in the ZGEMM POWER10 kernel.
3 years ago
Martin Kroeker
7060ca5002
Merge pull request #3647 from martin-frbg/exports_3.10.0
Amend gensymbol with some LAPACK 3.10.0 additions
3 years ago
Martin Kroeker
72ea19d187
Amend some LAPACK 3.10.0 additions
3 years ago
Nursultan Zarlyk
1dfc4e6150
Replace with ARM64 intrinsics
3 years ago
Rajalakshmi Srinivasaraghavan
8419d538ff
POWER10: convert dgemv inline assembly
This patch uses compiler builtins and matches the performance of the
assembly version. Tested with clang14 and gcc12.
3 years ago
Martin Kroeker
bfd9c1b58c
Merge pull request #3645 from martin-frbg/issue3644
Fix quotes around compiler args in C11 check
3 years ago
Martin Kroeker
79d98327e4
Fix quotes around compiler args in C11 check
3 years ago
Martin Kroeker
eb1faada19
Merge pull request #3643 from martin-frbg/fixgensymbol
Fix LAPACK path in new gensymbol script
3 years ago
Xianyi Zhang
5e9a912591
Merge branch 'develop' into risc-v
3 years ago
Xianyi Zhang
f9715605ac
Add PLCT to contributors.
3 years ago
Xianyi Zhang
3f88429bcf
Merge branch 'risc-v_fix_intrinsic' into risc-v
3 years ago
Xianyi Zhang
968e1f51d8
Update RISC-V Intrinsic API.
3 years ago
Martin Kroeker
e9c3535208
Fix LAPACK path in new gensymbol script
3 years ago
Martin Kroeker
f150c97ceb
Merge pull request #3641 from RajalakshmiSR/ppc_build
power10: Fix build issues due to perl scripts conversion
3 years ago
Nursultan Zarlyk
1bb7993a97
Fix MSVC ARM64 build. Add generic kernel for ARM64
3 years ago
Rajalakshmi Srinivasaraghavan
c98d63b637
power10: Fix build issues due to perl scripts conversion
Due to the recent Perl script conversion, there are some build
errors when compiling OpenBLAS with Advance Toolchain compilers.
3 years ago
Martin Kroeker
28a24a4d4f
Merge pull request #3637 from martin-frbg/issue3636
Add fallback value for bogus sc_nprocessors_conf in getarch
3 years ago
Martin Kroeker
14ae22bf7a
Add fallback value for bogus sc_nprocessors_conf
3 years ago
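As a rough illustration of the fallback idea in commit 14ae22bf7a, the following C sketch replaces a non-positive processor count with a caller-supplied default. The function names and the fallback value are assumptions made for this sketch and are not taken from getarch; `sysconf(_SC_NPROCESSORS_CONF)` is a widely available extension (glibc, macOS, the BSDs) rather than a strict POSIX requirement.

```c
#include <unistd.h>

/* Hypothetical helper: accept the reported count only if it is
 * positive, otherwise return the supplied fallback. */
static int sanitize_cpu_count(long reported, int fallback)
{
    return reported > 0 ? (int)reported : fallback;
}

/* Hypothetical wrapper: query the configured processor count and
 * guard against a bogus (zero or negative) answer. */
static int configured_cpus_or(int fallback)
{
    long reported = sysconf(_SC_NPROCESSORS_CONF);
    return sanitize_cpu_count(reported, fallback);
}
```

Separating the check from the system call keeps the fallback logic testable on its own, independent of what the host reports.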
Martin Kroeker
771dc6a8d8
Merge pull request #3635 from martin-frbg/issue3634
Support compilation with the Intel ifx compiler
3 years ago
Martin Kroeker
19413624d0
Add Intel ifx compiler
3 years ago
Martin Kroeker
f56e4b620f
Merge pull request #3633 from martin-frbg/perl_fallback
Add back original PERL-based build scripts and add option USE_PERL
3 years ago
Martin Kroeker
5cb0d23027
Support USE_PERL fallback for gensymbol
3 years ago
Martin Kroeker
f5a379bf77
Add USE_PERL fallback option for gensymbol script
3 years ago
Martin Kroeker
cfc1a9ed8d
Add back original PERL-based script under new name
3 years ago
Martin Kroeker
a3e02742f2
Add USE_PERL fallback option for create script used with FUNCTION_PROFILE
3 years ago
Martin Kroeker
f1c570a5f1
Add back original PERL-based script under new name
3 years ago
Martin Kroeker
181b96257c
Add back PERL-based scripts under new name
3 years ago
Martin Kroeker
7093a34a34
Add fallback option USE_PERL for original PERL-based build scripts
3 years ago
Martin Kroeker
c4b52ef46e
Merge pull request #3624 from ioraff/no-perl
rewrite perl scripts in universal shell
3 years ago
Martin Kroeker
d0c3504255
Merge pull request #3631 from martin-frbg/revertdynskx
Revert selection of a different DGEMM kernel for SkylakeX in DYNAMIC_ARCH builds
3 years ago
Martin Kroeker
dac14a5f7d
revert "switch DGEMM parameters for SkylakeX if DYNAMIC_ARCH"
3 years ago
Martin Kroeker
dc49edd4e6
Revert "roll back DGEMM kernel ... for DYNAMIC_ARCH"
3 years ago
Martin Kroeker
faf58d2b3f
Merge pull request #3630 from martin-frbg/fixpr3629
Fix compilation of cpuid_riscv
3 years ago
Martin Kroeker
30df29c0b3
Fix compilation
3 years ago
Zhang Xianyi
a720e2ca8a
Merge pull request #3629 from Rabenda/riscv-c910
riscv: Fix machine recognition for c910v
3 years ago
Han Gao
8dd4579480
riscv: Fix machine recognition for c910v
Signed-off-by: Han Gao <gaohan@uniontech.com>
3 years ago