Martin Kroeker
c2ffd90e8c
make NAN handling depend on dummy2 parameter
1 year ago
Martin Kroeker
dd6c33d34d
make NAN handling depend on dummy2 parameter
1 year ago
Martin Kroeker
2020569705
fix NAN handling and make it depend on dummy2 parameter
1 year ago
Martin Kroeker
3870995f01
make NAN handling depend on dummy2 parameter
1 year ago
Martin Kroeker
7284c533b5
make NAN handling depend on dummy2 parameter
1 year ago
Martin Kroeker
73751218a4
make NAN handling depend on dummy2 parameter
1 year ago
Martin Kroeker
b9bfc8ce09
make NAN handling depend on dummy2 parameter
1 year ago
Martin Kroeker
eb4879e04c
make NAN handling depend on the dummy2 parameter
1 year ago
Martin Kroeker
ee87cb90d0
Merge pull request #4803 from iha-taisei/SVESupportSDGEMV
A64FX: Add support for SVE to SGEMV/DGEMV kernels.
1 year ago
iha fujitsu
0985fdc82b
A64FX: Add support for SVE to SGEMV/DGEMV kernels.
1 year ago
Mark Ryan
67bf4b6998
Fix axpby_rvv kernels for cases where inc_y = 0
The following openblas_utest tests fail when the RISCV64_ZVL128B target is
enabled:
TEST 89/103 axpby:zaxpby_inc_0 [FAIL]
TEST 92/103 axpby:caxpby_inc_0 [FAIL]
TEST 95/103 axpby:daxpby_inc_0 [FAIL]
TEST 98/103 axpby:saxpby_inc_0 [FAIL]
The issue is that the vectorized kernels do not work when inc_y == 0.
This patch updates the kernels to fall back to the scalar algorithms
when inc_y == 0, fixing the failing tests.
Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
1 year ago
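The inc_y == 0 fallback described in the commit above can be illustrated with a minimal sketch. None of this is the actual RVV kernel code: `axpby_scalar`, `axpby_blocked`, and `axpby_sketch` are hypothetical names, the blocked loop is only a plain-C stand-in for a vectorized kernel, and non-negative increments are assumed. With inc_y == 0 every update targets the same element, so each step depends on the previous one. A kernel that loads a block of y values before storing any of them loses that dependency.

```c
#include <stddef.h>

#define BLOCK 4 /* hypothetical vector width used by the stand-in kernel */

/* Scalar reference: each update sees the result of the previous one. */
static void axpby_scalar(size_t n, double alpha, const double *x, ptrdiff_t inc_x,
                         double beta, double *y, ptrdiff_t inc_y)
{
    ptrdiff_t ix = 0, iy = 0;
    for (size_t i = 0; i < n; i++) {
        y[iy] = alpha * x[ix] + beta * y[iy];
        ix += inc_x;
        iy += inc_y;
    }
}

/* Stand-in for a vectorized kernel: loads a whole block of y, computes,
 * then stores. Correct only when the y elements of a block are distinct. */
static void axpby_blocked(size_t n, double alpha, const double *x, ptrdiff_t inc_x,
                          double beta, double *y, ptrdiff_t inc_y)
{
    for (size_t i = 0; i < n; i += BLOCK) {
        size_t len = (n - i < BLOCK) ? (n - i) : BLOCK;
        double tmp[BLOCK];
        for (size_t k = 0; k < len; k++)
            tmp[k] = alpha * x[(ptrdiff_t)(i + k) * inc_x]
                   + beta * y[(ptrdiff_t)(i + k) * inc_y];
        for (size_t k = 0; k < len; k++)
            y[(ptrdiff_t)(i + k) * inc_y] = tmp[k];
    }
}

/* Dispatch: fall back to the scalar loop when inc_y == 0. */
void axpby_sketch(size_t n, double alpha, const double *x, ptrdiff_t inc_x,
                  double beta, double *y, ptrdiff_t inc_y)
{
    if (inc_y == 0) {
        axpby_scalar(n, alpha, x, inc_x, beta, y, inc_y);
        return;
    }
    axpby_blocked(n, alpha, x, inc_x, beta, y, inc_y);
}
```

Calling `axpby_sketch` with inc_y == 0 then produces the same value as the sequential recurrence, while the blocked path alone would keep only the last lane's result.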
Martin Kroeker
5d08ec7ff3
Merge pull request #4782 from martin-frbg/azurewincl
Fix NAN handling in ARM/generic SCAL; have AzureCI Windows show errors on failure
1 year ago
Chip Kerchner
cb154832f8
Vectorize SBGEMM incopy - 4x faster.
1 year ago
Martin Kroeker
a5c04e326a
Update scal.c
1 year ago
Martin Kroeker
536200bc9e
fix handling of INF or NAN
1 year ago
Martin Kroeker
3677b3886c
Merge pull request #4702 from bashimao/detect-nv-grace
Correctly detect ARM Neoverse V2 CPUs.
1 year ago
Martin Kroeker
f3c364c2cc
temporarily(?) disable the alpha=0 branch as it fails to handle INF,NAN
1 year ago
Martin Kroeker
2a5fe97e3b
temporarily(?) disable the alpha=0 branch as it does not handle INF,NAN
1 year ago
Martin Kroeker
c1019d5832
Handle INF and NAN in inputs
1 year ago
Martin Kroeker
9e24121e7e
temporarily(?) disable da=0 shortcut to handle x=Inf or NAN
1 year ago
Martin Kroeker
a11f086c17
Update sscal_msa.c
1 year ago
Martin Kroeker
541e1b6959
disable the fast path for inc=1, alpha=0 as it does not handle x=NaN or Inf
1 year ago
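Why an alpha=0 fast path mishandles NaN or Inf in x can be shown with a small sketch. The function names `scal_with_shortcut` and `scal_plain` are hypothetical and unit stride is assumed; this is not the code of any particular kernel touched by these commits. Under IEEE 754 arithmetic, 0 * NaN and 0 * Inf both yield NaN, so writing zeros without multiplying hides such inputs.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical fast path: for alpha == 0 just store zeros.
 * This discards any NaN or Inf that was present in x. */
void scal_with_shortcut(size_t n, double alpha, double *x)
{
    if (alpha == 0.0) {
        for (size_t i = 0; i < n; i++)
            x[i] = 0.0;
        return;
    }
    for (size_t i = 0; i < n; i++)
        x[i] *= alpha;
}

/* Without the fast path: always multiply, so 0 * NaN and 0 * Inf
 * produce NaN as IEEE 754 arithmetic requires. */
void scal_plain(size_t n, double alpha, double *x)
{
    for (size_t i = 0; i < n; i++)
        x[i] *= alpha;
}
```

Note that a compiler running in a fast-math mode may not preserve this behaviour; the sketch assumes default IEEE-conforming floating point.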
Martin Kroeker
c08113c279
fix special cases of x= NAN or INF
1 year ago
Martin Kroeker
bd47630bcf
exclude the alpha=0 branch as it does not handle NaN or Inf in x
1 year ago
Martin Kroeker
68f2501958
temporarily(?) disable the alpha=0 branch to handle Inf/NaN in x
1 year ago
Martin Kroeker
0a744a939a
temporarily(?) disable the alpha=0 branch to handle NaN/Inf in x
1 year ago
Martin Kroeker
7f8f037a36
handle INF and NAN in input
1 year ago
Martin Kroeker
f1248b849d
handle INF and NAN in input
1 year ago
Martin Kroeker
a2ee4b1966
Merge branch 'OpenMathLib:develop' into issue4728
1 year ago
Martin Kroeker
3ec59922b6
Add a clobber list to fix utest errors seen with gcc13 on Apple M
1 year ago
Martin Kroeker
3d8054fb16
add clobber list
1 year ago
Martin Kroeker
dd7efcf9ef
Avoid exceeding the configured thread count in x86_64 TOBF16 (#4748)
* avoid setting nthreads higher than available
1 year ago
Martin Kroeker
6ffaf99817
disable da=0 shortcut to handle NAN and INF correctly
1 year ago
Martin Kroeker
c7cacd9b38
disable the shortcut for da=0 to ensure proper handling of INF and NAN
1 year ago
Martin Kroeker
5ed4f24d6e
Handle corner cases with INF and NAN arguments
1 year ago
Martin Kroeker
2bd43ad0eb
Merge branch 'OpenMathLib:develop' into issue4728
1 year ago
Martin Kroeker
1abafcd9b2
handle corner cases involving NAN and/or INF
1 year ago
Martin Kroeker
442dec28df
Merge pull request #4738 from martin-frbg/issue4737
Disable GEMM3M for generic targets (not implemented)
1 year ago
Martin Kroeker
2787c9f8e4
Disable GEMM3M for generic targets (not implemented)
1 year ago
gxw
af73ae6208
LoongArch: Fixed issue 4728
1 year ago
gxw
8ab2e9ec65
LoongArch: DGEMM small matrix opt
2 years ago
Martin Kroeker
83bc8d5dd8
Merge pull request #4712 from RajalakshmiSR/zscalp10
POWER: Fix issues in zscal to address lapack failures
1 year ago
Martin Kroeker
020b3e1682
fix handling of INF arguments
1 year ago
Martin Kroeker
8c05765a5a
fix other corner cases where x=INF
1 year ago
Martin Kroeker
516743f7dc
fix other instances of mishandling INF
1 year ago
Martin Kroeker
9ff4e9714e
additional fixes for handling INF arguments
1 year ago
Martin Kroeker
ce130f11d2
Update zscal.c
1 year ago
Martin Kroeker
ab13cfef93
more fixes for infinite x
1 year ago
Martin Kroeker
ad2b5c67c8
fix another corner case involving infinity
1 year ago
Bart Oldeman
62f7b244ff
Replace use of FLT_MAX in x86_64 zscal.c by isinf()
Commit def4996 fixed issues with inf and nan values in zscal,
but used FLT_MAX, where DBL_MAX or isinf() is more appropriate,
as FLT_MAX is for single precision only.
Using FLT_MAX caused test case failures in the LAPACK tests.
isinf() is consistent with the later fix 969601a1.
1 year ago
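The FLT_MAX problem described in the commit above can be sketched as follows. The log does not show the exact comparison the original code used, so the `fabs(v) > FLT_MAX` form and both function names are assumptions made for illustration. The point is only that a finite double can exceed FLT_MAX, so a threshold meant for single precision misclassifies it as infinite, whereas isinf() does not.

```c
#include <float.h>
#include <math.h>

/* Assumed form of the problematic check: treats any magnitude above the
 * single-precision maximum as "infinite", even for double inputs. */
int looks_infinite_flt_max(double v)
{
    return fabs(v) > FLT_MAX;
}

/* Precision-independent check using the standard isinf() macro. */
int is_infinite(double v)
{
    return isinf(v) != 0;
}
```

For example, 1e300 is a perfectly finite double, yet the FLT_MAX comparison flags it, while `is_infinite` reports it as finite.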