Martin Kroeker
9faebb3c97
fix lost indentation in the rules for the thread safety test
1 year ago
Martin Kroeker
262018f14c
Merge pull request #5092 from XiWeiGu/la64_fixed_cmake
LoongArch64: Fixed cmake
1 year ago
Martin Kroeker
180ba5e7d0
Merge pull request #5069 from tingboliao/dev_rotm_20250107
Further rearranged the rotm kernel for the different architectures.
1 year ago
gxw
1ebcbdbab3
LoongArch64: Fixed the issue of using the old-style TARGET in cmake builds
1 year ago
Martin Kroeker
1a6a9fb22f
add another generator line for rotm
1 year ago
Martin Kroeker
518e376820
Merge pull request #5090 from martin-frbg/cmakeutils
Fix CMake interpretation of KERNEL file variables relevant to WoA
1 year ago
Martin Kroeker
111c9b0733
Add translations for C_COMPILER and OSNAME
1 year ago
Martin Kroeker
4924319c50
fix position of srotm, qrotm
1 year ago
Martin Kroeker
b58cba9eb6
fix qrotm build rules
1 year ago
tingbo.liao
3c8df6358f
Further rearranged the rotm kernel for the different architectures.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
1 year ago
Martin Kroeker
76db346f7e
Merge pull request #5082 from martin-frbg/woa_cpuid
Get ARM64 TARGET information from the registry on Windows
1 year ago
Martin Kroeker
5f7b03a441
Merge pull request #5083 from martin-frbg/fixmips64ci
MIPS64 CI: fix breakage from inadvertent line join in yml file
1 year ago
Martin Kroeker
100e74d4d6
restore deleted line break
1 year ago
Martin Kroeker
ca3e1c8f9c
Get TARGET information from the registry on Windows
1 year ago
Martin Kroeker
87083fdbf6
[WIP] Work around assembler limitations in current LLVM for Windows on Arm (#5076)
* Protect align directives in assembly files that are currently problematic with LLVM on WoA
* use the armv8 zdot on WoA to work around other LLVM issues
1 year ago
Martin Kroeker
2954dc1a70
CI: Add NeoverseN2 build on the new Cobalt-100 (#5080)
* Add NeoverseN2 build
1 year ago
Martin Kroeker
7c3a920a81
CI: Update ubuntu-latest runners to fix side effects of switch to 24.04 (#5079)
1 year ago
Martin Kroeker
a7483d181b
Merge pull request #5074 from tingboliao/develop
Optimize the gemm_tcopy_8_rvv to be compatible with the vlens 128 and 256.
1 year ago
tingbo.liao
ef7f54b357
Optimized the gemm_tcopy_8_rvv to be compatible with the vlens 128 and 256.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
1 year ago
Martin Kroeker
eba7338484
Merge pull request #5073 from XiWeiGu/la64_update_symv_lsx_version
LoongArch64: Update symv lsx version
1 year ago
gxw
e0a8216554
LoongArch64: Update dsymv LSX version
1 year ago
gxw
a9070ba3f9
LoongArch64: Update ssymv LSX version
1 year ago
Martin Kroeker
9b981035db
Merge pull request #5070 from xry111/xry111/lasx-la664
LoongArch64: Fix dsymv and ssymv LASX version
1 year ago
Martin Kroeker
fee353e63d
Merge pull request #5072 from martin-frbg/azureosx13
Azure CI: update deprecated macos-12 jobs to macos-13 image
1 year ago
Martin Kroeker
0c0112dfef
update deprecated macos-12 jobs to macos-13 image
1 year ago
Xi Ruoyao
af10c132b8
LoongArch64: Fix dsymv and ssymv LASX version
"fmov.d $f2, $f4" leaves all the bits higher than the 63-th bit
unpredictable but it's obvious that the following code uses the value of
those high bits. We actually want to replicate the lower 64 bits here,
so we should use xvreplve0.d instead.
LA464 (Loongson 3[A-Z]-5000) happens to replicate them for us due to
some uarch internal details so the issue was not detected, but for LA664
(Loongson 3[A-Z]-6000) and future uarch we need to do things correctly
or we end up getting a lot of test failures.
Closes: https://bbs.aosc.io/t/topic/302
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
1 year ago
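The reasoning in the LASX fix above (af10c132b8) can be illustrated with a toy model. This is only a sketch, not the kernel code: it assumes a 256-bit LASX register can be pictured as a list of four 64-bit lanes, and the helper names `fmov_d`, `xvreplve0_d` and `lane_sum` are hypothetical stand-ins for the two instructions and for the vector code that follows them. Random bits stand in for the "unpredictable" upper lanes; real hardware may leave anything there, including (as on LA464) a copy of lane 0.

```python
import random

LANES = 4                 # assumption: one 256-bit register = four 64-bit lanes
MASK64 = (1 << 64) - 1


def fmov_d(src, rng):
    """Toy model of a scalar move: only lane 0 is defined afterwards.

    The upper lanes are filled with random bits to represent
    'unpredictable' contents.
    """
    return [src[0]] + [rng.getrandbits(64) for _ in range(LANES - 1)]


def xvreplve0_d(src):
    """Toy model of the replicate: every lane receives the value of lane 0."""
    return [src[0]] * LANES


def lane_sum(reg):
    """Stand-in for later vector code that consumes all four lanes."""
    return sum(reg) & MASK64


rng = random.Random(0)
src = [7, 0, 0, 0]

replicated = xvreplve0_d(src)
moved = fmov_d(src, rng)

# With replication, every lane is well defined, so the result is too.
print(replicated, lane_sum(replicated))

# With the scalar move, only lane 0 can be relied upon.
print(moved[0] == replicated[0])
```

In this model, any consumer that reads more than lane 0 only gets a dependable answer from the replicated register, which is the commit's argument for using xvreplve0.d.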
Martin Kroeker
4e817f804c
Update version to 0.3.29.dev
1 year ago
Martin Kroeker
8a316e68a5
Update version to 0.3.29.dev
1 year ago
Martin Kroeker
07756abb3e
Merge pull request #5067 from OpenMathLib/release-0.3.0
merge release 0.3.29 back into develop to copy tag
1 year ago
Martin Kroeker
8795fc7985
set version to 0.3.29
1 year ago
Martin Kroeker
e0c134e1f6
set version to 0.3.29
1 year ago
Martin Kroeker
9207052d85
Merge pull request #5066 from OpenMathLib/develop
Merge changes from develop in preparation of the 0.3.29 release
1 year ago
Martin Kroeker
7f5b703a80
Merge pull request #5065 from martin-frbg/changelog0329
Update the Changelog for version 0.3.29
1 year ago
Martin Kroeker
20f6114e98
add descriptions of build/runtime vars to 0.3.29 improvements
1 year ago
Martin Kroeker
f422845b6d
Merge pull request #5064 from martin-frbg/lapack1080
Replace LAPACK ?LARFT with a recursive implementation (Reference-LAPACK PR 1080)
1 year ago
Martin Kroeker
ce66ffe7bb
Update the Changelog for version 0.3.29
1 year ago
Martin Kroeker
d035e80d33
move the original non-recursive ?LARFT here (Reference-LAPACK PR 1080)
1 year ago
Martin Kroeker
459fa8102b
Create subdirectory for the old non-recursive ?larft
1 year ago
Martin Kroeker
0c4b4cd78c
move the non-recursive original ?larft here (Reference-LAPACK PR 1080)
1 year ago
Martin Kroeker
ed516994d6
replace ?larft with a recursive implementation (Reference-LAPACK PR 1080)
1 year ago
Martin Kroeker
5527eda561
Merge pull request #5063 from martin-frbg/lapack1062
Remove comparison that is always false (Reference-LAPACK PR 1062)
1 year ago
Martin Kroeker
4c1a23673a
Remove comparison that is always false (Reference-LAPACK PR 1062)
1 year ago
Martin Kroeker
d74eb02954
Merge pull request #5057 from martin-frbg/issue5050
Replace while loop in generic C/ZGEMM_BETA to avoid going out of bounds
1 year ago
Martin Kroeker
30f7a4120b
Merge pull request #5056 from tingboliao/dev_omatcopy_20250108
Optimize the omatcopy_cn/zomatcopy_cn kernels with RVV 1.0 intrinsic.
1 year ago
Martin Kroeker
0b9de3ef7d
Merge pull request #5042 from tingboliao/develop
Add the test cases of rot to improve the unit tests for rot_rvv.
1 year ago
Martin Kroeker
c31f148c76
Merge pull request #5061 from XiWeiGu/la64_update_symv
LoongArch64: Update symv
1 year ago
gxw
20a8e48f25
LoongArch64: Update ssymv LASX version
1 year ago
gxw
e0748588b8
LoongArch64: Update dsymv LASX version
1 year ago
Martin Kroeker
d91d4fa6e9
convert the beta=0 branch to a for loop as well
1 year ago
Martin Kroeker
8cc32f5461
Merge branch 'OpenMathLib:develop' into issue5050
1 year ago