OpenBLAS

Commit Graph

Author	SHA1	Message	Date
Martin Kroeker	73751218a4	make NAN handling depend on dummy2 parameter	1 year ago
Martin Kroeker	b9bfc8ce09	make NAN handling depend on dummy2 parameter	1 year ago
Martin Kroeker	eb4879e04c	make NAN handling depend on the dummy2 parameter	1 year ago
Martin Kroeker	ee87cb90d0	Merge pull request #4803 from iha-taisei/SVESupportSDGEMV A64FX: Add support for SVE to SGEMV/DGEMV kernels.	1 year ago
gxw	34b80ce03f	mips64: Fixed numpy CI failure	1 year ago
gxw	f6d6c14a96	mips: Fixed numpy CI failure	1 year ago
Chip Kerchner	ba47c7f4f3	Vectorize reduction stage of sgemv_t.	1 year ago
iha fujitsu	0985fdc82b	A64FX: Add support for SVE to SGEMV/DGEMV kernels.	1 year ago
Mark Ryan	67bf4b6998	Fix axpby_rvv kernels for cases where inc_y = 0 The following openblas_utest tests fail when the RISCV64_ZVL128B is enabled. TEST 89/103 axpby:zaxpby_inc_0 [FAIL] TEST 92/103 axpby:caxpby_inc_0 [FAIL] TEST 95/103 axpby:daxpby_inc_0 [FAIL] TEST 98/103 axpby:saxpby_inc_0 [FAIL] The issue is that the vectorized kernels do not work when inc_y == 0. This patch updates the kernels to fall back to the scalar algorithms when inc_y == 0, fixing the failing tests. Signed-off-by: Mark Ryan <markdryan@rivosinc.com>	1 year ago
Mark Ryan	3b715e6162	Add autodetection for riscv64 Implement DYNAMIC_ARCH support for riscv64. Three cpu types are supported, riscv64_generic, riscv64_zvl256b, riscv64_zvl128b. The two non-generic kernels require CPU support for RVV 1.0 to function correctly. Detecting that a riscv64 device supports RVV 1.0 is a little complicated as there are some boards on the market that advertise support for V via hwcap but only support RVV 0.7.1, which is not binary compatible with RVV 1.0. The approach taken is to first try hwprobe. If hwprobe is not available, we fall back to hwcap + an additional check to distinguish between RVV 1.0 and RVV 0.7.1. Tested on a VM with VLEN=256, a CanMV K230 with VLEN=128 (with only the big core enabled), a Lichee Pi with RVV 0.7.1 and a VF2 with no vector. A compiler with RVV 1.0 support must be used to build OpenBLAS for riscv64 when DYNAMIC_ARCH=1. Signed-off-by: Mark Ryan <markdryan@rivosinc.com>	1 year ago
gxw	3f39c8f94f	LoongArch: Fixed numpy CI failure	1 year ago
gxw	f3cebb3ca3	x86: Fixed numpy CI failure when the target is ZEN.	1 year ago
Martin Kroeker	5d08ec7ff3	Merge pull request #4782 from martin-frbg/azurewincl Fix NAN handling in ARM/generic SCAL; have AzureCI Windows show errors on failure	1 year ago
Chip Kerchner	cb154832f8	Vectorize SBGEMM incopy - 4x faster.	1 year ago
Martin Kroeker	a5c04e326a	Update scal.c	1 year ago
Martin Kroeker	536200bc9e	fix handling of INF or NAN	1 year ago
Martin Kroeker	3677b3886c	Merge pull request #4702 from bashimao/detect-nv-grace Correctly detect ARM Neoverse V2 CPUs.	1 year ago
Martin Kroeker	f3c364c2cc	temporarily(?) disable the alpha=0 branch as it fails to handle INF,NAN	1 year ago
Martin Kroeker	2a5fe97e3b	temporarily(?) disable the alpha=0 branch as it does not handle INF,NAN	1 year ago
Martin Kroeker	c1019d5832	Handle INF and NAN in inputs	1 year ago
Chris Sidebottom	8c472ef7e3	Further tweak small GEMM for AArch64	1 year ago
Martin Kroeker	9e24121e7e	temporarily(?) disable da=0 shortcut to handle x=Inf or NAN	1 year ago
Martin Kroeker	a11f086c17	Update sscal_msa.c	1 year ago
Martin Kroeker	541e1b6959	disable the fast path for inc=1, alpha=0 as it does not handle x=NaN or Inf	1 year ago
Martin Kroeker	c08113c279	fix special cases of x= NAN or INF	1 year ago
Martin Kroeker	bd47630bcf	exclude the alpha=0 branch as it does not handle NaN or Inf in x	1 year ago
Martin Kroeker	68f2501958	temporarily(?) disable the alpha=0 branch to handle Inf/NaN in x	1 year ago
Martin Kroeker	0a744a939a	temporarily(?) disable the alpha=0 branch to handle NaN/Inf in x	1 year ago
Martin Kroeker	7f8f037a36	handle INF and NAN in input	1 year ago
Martin Kroeker	f1248b849d	handle INF and NAN in input	1 year ago
Martin Kroeker	a2ee4b1966	Merge branch 'OpenMathLib:develop' into issue4728	1 year ago
Martin Kroeker	3ec59922b6	Add a clobber list to fix utest errors seen with gcc13 on Apple M	1 year ago
Martin Kroeker	3d8054fb16	add clobber list	1 year ago
Martin Kroeker	dd7efcf9ef	Avoid exceeding the configured thread count in x86_64 TOBF16 (#4748) * avoid setting nthreads higher than available	1 year ago
Martin Kroeker	6ffaf99817	disable da=0 shortcut to handle NAN and INF correctly	1 year ago
Martin Kroeker	c7cacd9b38	disable the shortcut for da=0 to ensure proper handling of INF and NAN	1 year ago
Martin Kroeker	5ed4f24d6e	Handle corner cases with INF and NAN arguments	1 year ago
Martin Kroeker	2bd43ad0eb	Merge branch 'OpenMathLib:develop' into issue4728	1 year ago
Martin Kroeker	1abafcd9b2	handle corner cases involving NAN and/or INF	1 year ago
Martin Kroeker	442dec28df	Merge pull request #4738 from martin-frbg/issue4737 Disable GEMM3M for generic targets (not implemented)	1 year ago
Martin Kroeker	2787c9f8e4	Disable GEMM3M for generic targets (not implemented)	1 year ago
gxw	af73ae6208	LoongArch: Fixed issue 4728	1 year ago
gxw	8ab2e9ec65	LoongArch: DGEMM small matrix opt	2 years ago
Martin Kroeker	83bc8d5dd8	Merge pull request #4712 from RajalakshmiSR/zscalp10 POWER: Fix issues in zscal to address lapack failures	1 year ago
Martin Kroeker	020b3e1682	fix handling of INF arguments	1 year ago
Martin Kroeker	8c05765a5a	fix other corner cases where x=INF	1 year ago
Martin Kroeker	516743f7dc	fix other instances of mishandling INF	1 year ago
Martin Kroeker	9ff4e9714e	additional fixes for handling INF arguments	1 year ago
Martin Kroeker	ce130f11d2	Update zscal.c	1 year ago
Martin Kroeker	ab13cfef93	more fixes for infinite x	1 year ago
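
Many of the commits listed above disable an alpha=0 (da=0) shortcut in SCAL-type kernels because it does not handle NaN or Inf in x. The sketch below illustrates that general issue and is not code from the OpenBLAS tree: the function names `scal_shortcut` and `scal_propagate` are hypothetical, and the kernels are simplified to a positive stride. Under IEEE 754 arithmetic, 0 * NaN and 0 * Inf both yield NaN, so a shortcut that stores zeros whenever alpha is zero hides non-finite inputs.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical shortcut variant: when alpha == 0 it stores 0.0 without
 * reading x, so NaN and Inf entries are silently replaced by zero. */
static void scal_shortcut(long n, double alpha, double *x, long inc_x)
{
    assert(inc_x > 0); /* simplification made for this sketch */
    for (long i = 0; i < n; i++) {
        double *p = &x[i * inc_x];
        *p = (alpha == 0.0) ? 0.0 : alpha * *p;
    }
}

/* Hypothetical propagating variant: always performs the multiplication,
 * so 0 * NaN and 0 * Inf produce NaN as IEEE 754 arithmetic specifies. */
static void scal_propagate(long n, double alpha, double *x, long inc_x)
{
    assert(inc_x > 0); /* simplification made for this sketch */
    for (long i = 0; i < n; i++)
        x[i * inc_x] *= alpha;
}
```

With alpha = 0 and x = {1.0, NAN, INFINITY}, the shortcut variant leaves three zeros, while the propagating variant leaves a zero followed by two NaN values. Which of the two behaviours a given OpenBLAS kernel exposes after these commits is not something this sketch asserts.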


2337 Commits (bb31bbef522b8b105ae2cb1bfce484ded5839b22)
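
Commit 67bf4b6998 in the list above states that the vectorized axpby kernels do not work when inc_y == 0 and that the fix falls back to the scalar algorithm in that case. The sketch below shows why such a fallback is needed, under loud assumptions: it is not the RVV kernel, `BLOCK` is a hypothetical lane count, and `axpby_blocked` only imitates a vector kernel by loading a whole block of y before storing any result. With inc_y == 0 every iteration reads and writes the same element, so a block-wise load misses the updates made by earlier lanes.

```c
#include <assert.h>

#define BLOCK 4 /* hypothetical lane count standing in for a vector register */

/* Scalar reference: y[i*inc_y] = alpha * x[i*inc_x] + beta * y[i*inc_y],
 * evaluated strictly one element at a time. */
static void axpby_scalar(long n, double alpha, const double *x, long inc_x,
                         double beta, double *y, long inc_y)
{
    for (long i = 0; i < n; i++)
        y[i * inc_y] = alpha * x[i * inc_x] + beta * y[i * inc_y];
}

/* Imitation of a vector kernel: compute a whole block from the old y
 * values, then store the block. Correct only if the lanes of y are distinct. */
static void axpby_blocked(long n, double alpha, const double *x, long inc_x,
                          double beta, double *y, long inc_y)
{
    for (long i = 0; i < n; i += BLOCK) {
        long len = (n - i < BLOCK) ? (n - i) : BLOCK;
        double tmp[BLOCK];
        for (long k = 0; k < len; k++) /* "vector load" and compute */
            tmp[k] = alpha * x[(i + k) * inc_x] + beta * y[(i + k) * inc_y];
        for (long k = 0; k < len; k++) /* "vector store" */
            y[(i + k) * inc_y] = tmp[k];
    }
}

/* Dispatcher in the spirit of the commit message: use the scalar loop
 * when inc_y == 0, otherwise the blocked path. */
static void axpby(long n, double alpha, const double *x, long inc_x,
                  double beta, double *y, long inc_y)
{
    assert(n >= 0);
    if (inc_y == 0)
        axpby_scalar(n, alpha, x, inc_x, beta, y, inc_y);
    else
        axpby_blocked(n, alpha, x, inc_x, beta, y, inc_y);
}
```

For x = {1, 2, 3, 4}, alpha = beta = 1, a single y initialised to 0 and inc_y = 0, the scalar loop accumulates 1, 3, 6, 10 in turn, whereas the blocked path computes all four lanes from the initial 0 and ends with 4. The dispatcher returns the scalar result.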