Rohit Goswami
18732402d3
MAINT: Move -m64 out to cpu_family()
1 year ago
Rohit Goswami
2da1c2444f
MAINT: Add simd flags
1 year ago
Rohit Goswami
553ca0fb67
MAINT: Generalize and setup F_INTERFACE
1 year ago
Rohit Goswami
32567edbcc
MAINT: Rework make defines to meson arguments
For SMALL_MATRIX_OPT and MAX_STACK_ALLOC
1 year ago
Rohit Goswami
e06834c5cc
TMP: Focus on getting a single test example up
Use:
nm -gC bbdir/libopenblas.a | grep drot
❯ gcc trial.c -o trail -I$(pwd)/tmpmake/include -L$(pwd)/bbdir -lopenblas -Wl,--verbose | grep openblas
❯ ./trail
Resulting vectors:
x: 3.000000 4.000000 5.000000 6.000000
y: 2.000000 2.000000 2.000000 2.000000
1 year ago
Rohit Goswami
8947604447
BLD: Add generic BLAS2 modes
1 year ago
Rohit Goswami
844cb7a68f
ENH: Add more L2 flags
1 year ago
Rohit Goswami
bd43398df8
BLD: Add swap and refactor a bit
1 year ago
Rohit Goswami
d5da5164e4
TMP: Be more DRY
1 year ago
Rohit Goswami
ade3f82c73
ENH: Start abstracting rules for kernels
1 year ago
Martin Kroeker
5d08ec7ff3
Merge pull request #4782 from martin-frbg/azurewincl
Fix NAN handling in ARM/generic SCAL; have AzureCI Windows show errors on failure
1 year ago
Chip Kerchner
cb154832f8
Vectorize SBGEMM incopy - 4x faster.
1 year ago
Martin Kroeker
a5c04e326a
Update scal.c
1 year ago
Martin Kroeker
536200bc9e
fix handling of INF or NAN
1 year ago
Martin Kroeker
3677b3886c
Merge pull request #4702 from bashimao/detect-nv-grace
Correctly detect ARM Neoverse V2 CPUs.
1 year ago
Martin Kroeker
f3c364c2cc
temporarily(?) disable the alpha=0 branch as it fails to handle INF,NAN
1 year ago
Martin Kroeker
2a5fe97e3b
temporarily(?) disable the alpha=0 branch as it does not handle INF,NAN
1 year ago
Martin Kroeker
c1019d5832
Handle INF and NAN in inputs
1 year ago
Martin Kroeker
9e24121e7e
temporarily(?) disable da=0 shortcut to handle x=Inf or NAN
1 year ago
Martin Kroeker
a11f086c17
Update sscal_msa.c
1 year ago
Martin Kroeker
541e1b6959
disable the fast path for inc=1, alpha=0 as it does not handle x=NaN or Inf
1 year ago
Martin Kroeker
c08113c279
fix special cases of x= NAN or INF
1 year ago
Martin Kroeker
bd47630bcf
exclude the alpha=0 branch as it does not handle NaN or Inf in x
1 year ago
Martin Kroeker
68f2501958
temporarily(?) disable the alpha=0 branch to handle Inf/NaN in x
1 year ago
Martin Kroeker
0a744a939a
temporarily(?) disable the alpha=0 branch to handle NaN/Inf in x
1 year ago
Martin Kroeker
7f8f037a36
handle INF and NAN in input
1 year ago
Martin Kroeker
f1248b849d
handle INF and NAN in input
1 year ago
Martin Kroeker
a2ee4b1966
Merge branch 'OpenMathLib:develop' into issue4728
1 year ago
Martin Kroeker
3ec59922b6
Add a clobber list to fix utest errors seen with gcc13 on Apple M
1 year ago
Martin Kroeker
3d8054fb16
add clobber list
1 year ago
Martin Kroeker
dd7efcf9ef
Avoid exceeding the configured thread count in x86_64 TOBF16 (#4748)
* avoid setting nthreads higher than available
1 year ago
Martin Kroeker
6ffaf99817
disable da=0 shortcut to handle NAN and INF correctly
1 year ago
Martin Kroeker
c7cacd9b38
disable the shortcut for da=0 to ensure proper handling of INF and NAN
1 year ago
Martin Kroeker
5ed4f24d6e
Handle corner cases with INF and NAN arguments
1 year ago
Martin Kroeker
2bd43ad0eb
Merge branch 'OpenMathLib:develop' into issue4728
1 year ago
Martin Kroeker
1abafcd9b2
handle corner cases involving NAN and/or INF
1 year ago
Martin Kroeker
442dec28df
Merge pull request #4738 from martin-frbg/issue4737
Disable GEMM3M for generic targets (not implemented)
1 year ago
Martin Kroeker
2787c9f8e4
Disable GEMM3M for generic targets (not implemented)
1 year ago
gxw
af73ae6208
LoongArch: Fixed issue 4728
1 year ago
gxw
8ab2e9ec65
LoongArch: DGEMM small matrix opt
2 years ago
Martin Kroeker
83bc8d5dd8
Merge pull request #4712 from RajalakshmiSR/zscalp10
POWER: Fix issues in zscal to address lapack failures
1 year ago
Martin Kroeker
020b3e1682
fix handling of INF arguments
1 year ago
Martin Kroeker
8c05765a5a
fix other corner cases where x=INF
1 year ago
Martin Kroeker
516743f7dc
fix other instances of mishandling INF
1 year ago
Martin Kroeker
9ff4e9714e
additional fixes for handling INF arguments
1 year ago
Martin Kroeker
ce130f11d2
Update zscal.c
1 year ago
Martin Kroeker
ab13cfef93
more fixes for infinite x
1 year ago
Martin Kroeker
ad2b5c67c8
fix another corner case involving infinity
1 year ago
Bart Oldeman
62f7b244ff
Replace use of FLT_MAX in x86_64 zscal.c by isinf()
Commit def4996 fixed issues with inf and nan values in zscal,
but used FLT_MAX, where DBL_MAX or isinf() is more appropriate,
as FLT_MAX is for single precision only.
Using FLT_MAX caused test case failures in the LAPACK tests.
isinf() is consistent with the later fix 969601a1
1 year ago
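The point made in the commit message above can be sketched as follows. How the earlier commit actually used FLT_MAX is not shown here, so the comparison in `looks_infinite_flt_max` is an assumed pattern for illustration only; both helper names are hypothetical. FLT_MAX is roughly 3.4e38, so a finite double such as 1e300 exceeds it and would be misclassified, whereas `isinf()` only reports true infinities.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Assumed (hypothetical) check using the single-precision limit.
 * Any finite double larger than FLT_MAX is wrongly treated as infinite. */
static bool looks_infinite_flt_max(double v)
{
    return fabs(v) > FLT_MAX;
}

/* Check based on isinf(), which is true only for +Inf and -Inf. */
static bool is_infinite(double v)
{
    return isinf(v) != 0;
}
```

For a double-precision routine, comparing against DBL_MAX would also avoid the misclassification, but `isinf()` states the intent directly.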
Rajalakshmi Srinivasaraghavan
e112191b54
POWER: Fix issues in zscal to address lapack failures
This patch fixes the following LAPACK failures with the clang compiler on POWER.
zed.out: ZVX: 18 out of 5190 tests failed to pass the threshold
zgd.out: ZGV drivers: 25 out of 1092 tests failed to pass the threshold
zgd.out: ZGV drivers: 6 out of 1092 tests failed to pass the threshold
1 year ago