Martin Kroeker
7f0b14eaff
move pragma
3 years ago
Martin Kroeker
49589a6304
move pragma
3 years ago
Martin Kroeker
8c6261c1f7
move pragma
3 years ago
Martin Kroeker
002df7cdfd
Add guards around the pragma
3 years ago
Martin Kroeker
7c59443ea2
Add guards around the pragma
3 years ago
Martin Kroeker
07ef20ea76
Add guards around the pragma
3 years ago
Martin Kroeker
7a93103b47
Add guards around the pragma
3 years ago
Martin Kroeker
c4b0767785
Add guards around the pragma
3 years ago
Martin Kroeker
746f6c0230
Add guards around the pragma
3 years ago
Martin Kroeker
b2b31a8024
Add guards around the pragma
3 years ago
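The "disable gcc tree vectorizer" and "Add guards around the pragma" entries describe switching off one optimization pass per source file and then guarding the directive. A minimal sketch of that pattern follows. The guard condition is an assumption (the commits do not state which one was used), and `sum_sq` is a hypothetical function included only so the pragma has something to apply to.

```c
/* Sketch only: the exact guard used in the commits is not shown in this log.
   Clang also defines __GNUC__ but does not implement this pragma, hence the
   extra check. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC optimize("no-tree-vectorize")
#endif

/* Hypothetical kernel-like loop; with the pragma active, GCC compiles it
   as if -fno-tree-vectorize had been passed for this file. */
double sum_sq(const double *x, int n)
{
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc += x[i] * x[i];
    return acc;
}
```

Placing the pragma at file scope, before the first function definition, makes it apply to every function that follows it in the translation unit.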
Martin Kroeker
8e16408219
revert
3 years ago
Martin Kroeker
8cedac9bf8
revert
3 years ago
Martin Kroeker
83568f37e4
disable gcc tree vectorizer
3 years ago
Martin Kroeker
ed030a0b0e
disable gcc tree vectorizer
3 years ago
Martin Kroeker
ce4cdd52a3
disable gcc tree vectorizer
3 years ago
Martin Kroeker
801fe19f21
disable gcc tree vectorizer
3 years ago
Martin Kroeker
45b1dc7113
disable gcc tree vectorizer
3 years ago
Martin Kroeker
62eef86c37
Update Makefile.L1
3 years ago
Martin Kroeker
e8a94437d9
try pragma to disable tree vectorizer
3 years ago
Martin Kroeker
e5d0749420
try pragma to disable tree vectorizer
3 years ago
Martin Kroeker
6d7ec34c5f
Update Makefile.L2
3 years ago
Martin Kroeker
ef4110ddda
Update Makefile.L2
3 years ago
Martin Kroeker
2a0caf8e32
Update Makefile.L1
3 years ago
Martin Kroeker
74f2da79a8
Update Makefile
3 years ago
Martin Kroeker
9b4a579158
Update Makefile
3 years ago
Martin Kroeker
bd30120ba7
Merge pull request #3720 from FlyGoat/mips64
Make it work on general MIPS64 processors
3 years ago
Jiaxun Yang
a50b29c540
Provide a fallback MIPS64_GENERIC target
It is really dangerous to fall back to the Loongson core on other
MIPS64 processors.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
3 years ago
Jiaxun Yang
50c4eeb97d
alpha: Remove include of version.h
It will be defined by a preprocessor argument.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
3 years ago
Ivan Pribec
802e71bf05
Add const attribute to lsame
3 years ago
gxw
fbfe1daf6e
LoongArch64: Add DYNAMIC_ARCH support
3 years ago
Martin Kroeker
cd8e57040c
Merge pull request #3691 from martin-frbg/issue3679-sparc
SPARC: fix DNRM2 returning INF instead of zero due to intermediate overflow
3 years ago
Martin Kroeker
6c118b7977
Fix DNRM2 returning INF instead of zero due to intermediate overflow
3 years ago
Martin Kroeker
c43ec53bdd
Merge pull request #3690 from RajalakshmiSR/cdotp10
POWER: Fix complex dot function failures
3 years ago
Martin Kroeker
b7c65d08cb
Merge pull request #3689 from RajalakshmiSR/dgemvgcc10
POWER10: dgemv builtin rename
3 years ago
Martin Kroeker
06ef015234
fix DNRM2 returning INF instead of zero due to intermediate overflow
3 years ago
Rajalakshmi Srinivasaraghavan
a612e78a97
POWER: Fix complex dot function failures
The complex dot functions fail some tests when compiled with gcc12.
The machine constraints currently in use do not update all four elements of the
expected result array. This is fixed by reducing the optimization level.
The change does not affect performance numbers; the code will be converted to C in the future.
3 years ago
Rajalakshmi Srinivasaraghavan
432fd99445
POWER10: dgemv builtin rename
Add a check to use the correct builtin name for older versions
of the gcc10 compiler.
3 years ago
gxw
4dd05e526b
LoongArch64: Fix dnrm2_tiny testcase failure
3 years ago
gxw
cce4b1d956
MIPS64: Fix dnrm2_tiny testcase failure
3 years ago
Martin Kroeker
e12d474780
Eliminate uses of CREAL on left-hand side of assignments
3 years ago
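The entry above removes assignments that use a real-part accessor as their target. In standard C, `creal()` returns a value and cannot be assigned to, so the portable form builds a new complex number instead. The sketch below shows that idea with a hypothetical helper `with_real`; it is not the code from the commit.

```c
#include <complex.h>

/* Hypothetical helper: return z with its real part replaced by re.
   Writing "creal(z) = re;" would not compile in standard C because
   creal() yields a value, not an assignable object. */
double complex with_real(double complex z, double re)
{
    /* Adequate for finite parts; C11 also provides CMPLX(re, im). */
    return re + cimag(z) * I;
}
```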
Martin Kroeker
9e29598575
work around fault with ssq=inf, scale=0
3 years ago
Honglin Zhu
123e0dfb62
Neoverse N2 sbgemm:
1. Modify the algorithm to resolve multithreading failures
2. No memory allocation in sbgemm kernel
3. Optimize when alpha == 1.0f
3 years ago
Honglin Zhu
bc3728475f
format code
3 years ago
Honglin Zhu
55d686d41e
neoverse n2 sbgemm:
implement ncopy tcopy kernel_8x4
3 years ago
Honglin Zhu
04593bb27c
neoverse n2 sbgemm: init file
3 years ago
Martin Kroeker
be5500e704
Merge pull request #3669 from VFerrari/fix_small_matrix_kernel
POWER: fix issues with the small matrix kernel
3 years ago
Martin Kroeker
92275a7902
Merge pull request #3642 from nursik/develop
Add ARM64 support for Windows
3 years ago
VFerrari
cac634fce3
POWER10: Fix multithreading check when USE_THREAD=0
This patch fixes an issue that occurs when OpenBLAS is compiled for TARGET=POWER10
with USE_THREAD set to 0.
The function `num_cpu_avail` is only available when USE_THREAD=1,
that is, when SMP is defined.
3 years ago
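The entry above turns on a function that exists only in threaded builds. A minimal sketch of the usual remedy is to wrap the call in the same macro that controls the function's availability. Everything here is illustrative: `threads_for_kernel`, the size threshold, and the stand-in body of `num_cpu_avail` are assumptions, not the patched code.

```c
/* Stand-in for the library's thread query. In this sketch, as in the
   situation described above, it only exists when SMP is defined. */
#ifdef SMP
static int num_cpu_avail(int level)
{
    (void)level;
    return 4; /* placeholder value for illustration */
}
#endif

/* Hypothetical caller: decide how many threads a kernel may use. */
int threads_for_kernel(long m)
{
#ifdef SMP
    /* Threaded build: ask the runtime, but only for larger problems. */
    if (m > 100)
        return num_cpu_avail(3);
#else
    /* Single-threaded build: the query function is not compiled in. */
    (void)m;
#endif
    return 1;
}
```

Built without SMP, the call is never seen by the compiler, so the single-threaded configuration no longer depends on a symbol that is not there.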
Martin Kroeker
9283c7c0b5
Merge pull request #3655 from RajalakshmiSR/zgemmasmp10
POWER10: Fix ZGEMM testcase failures
3 years ago
Rajalakshmi Srinivasaraghavan
f191bc652b
POWER10: Fix ZGEMM testcase failures
This patch fixes the saving and restoring of non-volatile registers
in the zgemm POWER10 kernel.
3 years ago