OpenBLAS

Commit Graph

Author	SHA1	Message	Date
Martin Kroeker	f4194fc65f	Merge branch 'develop' into la64_fixed_cscal_zscal	7 months ago
pengxu	0ccb050583	Loongarch64: fixed cgemm_ncopy_16_lasx	8 months ago
pengxu	f19e72c402	Loongarch64: fixed swap_lasx	9 months ago
pengxu	b471fa337b	Loongarch64: fixed snrm2_lasx	9 months ago
pengxu	57bb46bedf	Loongarch64: fixed rot_lasx	9 months ago
pengxu	6dc4ca2391	Loongarch64: fixed icamax_lasx	9 months ago
pengxu	b528b1b8ea	Loongarch64: fixed iamax_lasx	9 months ago
pengxu	ba9569e382	Loongarch64: fixed dot_lasx	9 months ago
pengxu	dc5fa29851	Loongarch64: fixed cscal_lasx	9 months ago
pengxu	a98dd6d911	Loongarch64: fixed copy_lasx	9 months ago
pengxu	d49319c2d2	Loongarch64: fixed cnrm2_lasx	9 months ago
pengxu	74c97ef814	Loongarch64: fixed cdot_lasx	9 months ago
pengxu	be525521ad	Loongarch64: fixed asum_lasx	9 months ago
pengxu	0cd5ca5527	Loongarch64: fixed amax_lasx	9 months ago
gxw	2c4a5cc6e6	LoongArch64: Fixed snrm2_lsx.S and cnrm2_lsx.S. When the data type is single-precision real or single-precision complex, converting it to double precision does not prevent overflow (as exposed in LAPACK tests). The only solution is to follow C's approach: find the maximum value in the array and divide each element by that maximum to avoid this issue.	11 months ago
gxw	9e75d6b3d1	LoongArch64: Fixed swap_lsx.S. Fixed the error when the stride is zero.	11 months ago
gxw	e8c740368c	LoongArch64: Fixed rot_lsx.S and crot_lsx.S. Do not check whether the input parameters c and s are zero, as this may cause errors with special values (same as scal). Although OpenBLAS's own test suite doesn't catch this, it will cause LAPACK test cases to fail.	11 months ago
Hao Chen	c2212d0abd	LoongArch64: Fixed copy_lsx.S Fixed incorrect store operation Signed-off-by: gxw <guxiwei-hf@loongson.cn>	11 months ago
Hao Chen	7f1ebc7ae6	LoongArch64: Fixed iamax_lsx.S Fixed index retrieval issue when there are identical maximum absolute values Signed-off-by: Hao Chen <chenhao@loongson.cn> Signed-off-by: gxw <guxiwei-hf@loongson.cn>	11 months ago
Hao Chen	31d326f895	LoongArch64: Fixed dot_lsx.S Fixed incorrect register usage in instructions Signed-off-by: gxw <guxiwei-hf@loongson.cn>	1 year ago
Hao Chen	5d6356bc16	LoongArch64: Fixed amax_lsx.S Fixed register zeroing operation Signed-off-by: Hao Chen <chenhao@loongson.cn> Signed-off-by: gxw <guxiwei-hf@loongson.cn>	1 year ago
Martin Kroeker	180ba5e7d0	Merge pull request #5069 from tingboliao/dev_rotm_20250107 Further rearranged the rotm kernel for the different architectures.	1 year ago
gxw	2da86b80c9	LoongArch64: Fixed scalar version of cscal and zscal	1 year ago
gxw	5392f6df69	LoongArch64: Fixed LASX version of cscal and zscal	1 year ago
tingbo.liao	3c8df6358f	Further rearranged the rotm kernel for the different architectures. Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>	1 year ago
gxw	b2117bb2ca	LoongArch64: Fixed LSX version of cscal and zscal	1 year ago
gxw	e0a8216554	LoongArch64: Update dsymv LSX version	1 year ago
gxw	a9070ba3f9	LoongArch64: Update ssymv LSX version	1 year ago
Xi Ruoyao	af10c132b8	LoongArch64: Fix dsymv and ssymv LASX version "fmov.d $f2, $f4" leaves all the bits higher than the 63-th bit unpredictable but it's obvious that the following code uses the value of those high bits. We actually want to replicate the lower 64 bits here, so we should use xvreplve0.d instead. LA464 (Loongson 3[A-Z]-5000) happens to replicate them for us due to some uarch internal details so the issue was not detected, but for LA664 (Loongson 3[A-Z]-6000) and future uarch we need to do things correctly or we end up getting a lot of test failures. Closes: https://bbs.aosc.io/t/topic/302 Signed-off-by: Xi Ruoyao <xry111@xry111.site>	1 year ago
gxw	20a8e48f25	LoongArch64: Update ssymv LASX version	1 year ago
gxw	e0748588b8	LoongArch64: Update dsymv LASX version	1 year ago
gxw	bb31bbef52	LoongArch64: Opt somatcopy_ct with LASX	1 year ago
gxw	b37129341b	LoongArch64: Opt somatcopy_cn with LASX	1 year ago
gxw	acf6cab304	LoongArch64: Opt somatcopy_rn with LASX	1 year ago
gxw	15edb441bf	LoongArch64: Opt somatcopy_rt with LASX	1 year ago
Martin Kroeker	9783dd07ab	Rename KERNEL.LOONGSONGENERIC to KERNEL.LA64_GENERIC	1 year ago
Martin Kroeker	de421b7764	Merge pull request #4904 from XiWeiGu/la64_cross_cmake LoongArch64: Enable cmake cross-compilation	1 year ago
gxw	30af9278dc	LoongArch64: Enable cmake cross-compilation	1 year ago
gxw	48698b2b1d	LoongArch64: Rename core. Use microarchitecture name instead of meaningless strings to name the core; the legacy core is still retained. 1. Rename LOONGSONGENERIC to LA64_GENERIC. 2. Rename LOONGSON3R5 to LA464. 3. Rename LOONGSON2K1000 to LA264.	1 year ago
Martin Kroeker	e05d98d00a	expressly use fld.d/fst.d for floating point registers instead of LD/ST macros	1 year ago
gxw	3f39c8f94f	LoongArch: Fixed numpy CI failure	1 year ago
gxw	af73ae6208	LoongArch: Fixed issue 4728	1 year ago
gxw	8ab2e9ec65	LoongArch: DGEMM small matrix opt	2 years ago
Martin Kroeker	8da6f7e5f2	Merge pull request #4686 from XiWeiGu/loongarch64_dgemm_kernel_16x6 Loongarch64: Improving the Performance and Stability of dgemm	1 year ago
gxw	f9a26240a7	loongarch64: Fixed icamax_lsx	1 year ago
gxw	cb0f707409	loongarch64: Fixed utest fork:safety	1 year ago
Martin Kroeker	b45d8e1ab2	remove stray comma	1 year ago
gxw	6017ad7146	loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6	1 year ago
Martin Kroeker	992b71fea2	remove stray comma	1 year ago
gxw	7cd438a5ac	loongarch64: Fixed clang compilation issues	1 year ago
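
Commit 7f1ebc7ae6 in the table above mentions an index retrieval issue when several elements share the same maximum absolute value. As a rough illustration of the behaviour a scalar reference would be expected to have, the sketch below returns the 1-based index of the first such element, which is the reference BLAS convention for isamax. The function name first_iamax is hypothetical and is not an OpenBLAS symbol; the actual LSX kernel is vectorized assembly and is not reproduced here.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical scalar reference, not the OpenBLAS kernel.
 * Returns the 1-based index of the FIRST element whose absolute value
 * is the largest, or 0 when n == 0 or incx <= 0. */
size_t first_iamax(size_t n, const float *x, ptrdiff_t incx)
{
    if (n == 0 || incx <= 0)
        return 0;

    size_t best = 1;                 /* 1-based index of current maximum */
    float best_abs = fabsf(x[0]);

    for (size_t i = 1; i < n; i++) {
        float a = fabsf(x[(ptrdiff_t)i * incx]);
        /* Strict '>' keeps the earliest index when values tie. */
        if (a > best_abs) {
            best_abs = a;
            best = i + 1;
        }
    }
    return best;
}
```

The strict comparison is the design point: using '>=' instead would return the last of the tied elements, which is the kind of discrepancy a vectorized kernel has to avoid when it reduces several lanes to one index.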


139 Commits (cdebb4fd4b2bbbf856e5abdcedbe9a5cf348ef8e)