Martin Kroeker
1546599a13
Merge pull request #5315 from loss-and-quick/arm-exec-stack
Add .note.GNU-stack in ARM epilogue to avoid writable stack
11 months ago
minicx
79b4dd0fb0
fix(arm): add .note.GNU-stack to ARM assembly to prevent writable-stack warnings
Add .section .note.GNU-stack in ARM assembly epilogue on Linux/ELF targets to
avoid warnings about a writable/executable stack and ensure shared objects do
not require an executable stack.
Signed-off-by: minicx <minicx@disroot.org>
11 months ago
Martin Kroeker
c2342fc2d0
Merge pull request #5314 from martin-frbg/dynampere1
Support AmpereOne/OneA as NeoverseN1 in DYNAMIC_ARCH builds
11 months ago
Martin Kroeker
e541bf68f5
support AmpereOne/OneA as NeoverseN1
11 months ago
Martin Kroeker
5ad6435660
Merge pull request #5312 from martin-frbg/x86cdot
Work around X86 POTRS/CDOT bug on old systems and add CI job for 32bit manylinux
11 months ago
Martin Kroeker
e684e36377
Add 32bit manylinux to match what python wheel build tests use
11 months ago
Martin Kroeker
3318a2b904
override CDOT and ZDOT with the generic C kernel
11 months ago
Martin Kroeker
85337c5160
Merge pull request #5310 from nakagawa-fj/bugfix/identify_cpu_part_for_arm64
Bug Fix: Problem with identifying some ARM64 processors
11 months ago
Masato Nakagawa
1dd396033a
Fix: Problem with identifying some ARM64 processors.
11 months ago
Martin Kroeker
f1097d1cba
Merge pull request #5306 from martin-frbg/lapack1131
Fix missing initialization leading to bypassing corner cases in C/ZGEQP3RK (Reference-LAPACK PR #1131)
11 months ago
Martin Kroeker
bad47bd024
Fix too strict leading dimensions check in LAPACKE_?gesdd_work (Reference-LAPACK PR #1126) (#5307)
* relax leading dimensions check (Reference-LAPACK PR #1126)
11 months ago
Martin Kroeker
7f3093a0ad
Merge pull request #5305 from martin-frbg/lapack1135
Fix 2nd dimension used by LAPACKE_c/zunmlq in NaN check and transposition (Reference-LAPACK PR #1135)
11 months ago
Martin Kroeker
1804ff58d7
fix missing initialization
11 months ago
Martin Kroeker
906b9df316
fix missing initialization
11 months ago
Martin Kroeker
f4e5177050
fix dimension used in nancheck (Reference-LAPACK PR 1135)
11 months ago
Martin Kroeker
2a6beac88f
fix dimension used in transposition (Reference-LAPACK PR 1135)
11 months ago
Martin Kroeker
d8a2324699
fix dimension used in nancheck (Reference-LAPACK PR 1135)
11 months ago
Martin Kroeker
874744976c
fix dimension used in nancheck (Reference-LAPACK PR 1135)
11 months ago
Martin Kroeker
0ea173ec8c
Merge pull request #5304 from martin-frbg/fixgemmtr_if
fix source file used for sbgemmt/sbgemmtr in CMake builds
11 months ago
Martin Kroeker
5e393f207c
fix source file used for sbgemmt/sbgemmtr
11 months ago
Martin Kroeker
dbd5643d37
Merge pull request #5302 from martin-frbg/zscal_mips_3
mips64 SICORTEX: temporarily change default C/ZSCAL to the non-asm implementation
11 months ago
Martin Kroeker
e338d34ce1
fix path
11 months ago
Martin Kroeker
d36093d084
temporarily change default C/ZSCAL to the non-asm implementation
11 months ago
Martin Kroeker
cc4b04a684
Merge pull request #5301 from martin-frbg/zscal_mips_2
kernel/mips(64): Fix cscal and zscal
11 months ago
Martin Kroeker
b3c90564d7
resync with the generic arm version for inf/nan handling
11 months ago
Martin Kroeker
6bdc7f9eb7
Merge pull request #5300 from martin-frbg/fixup5296
kernel/riscv64: Fix cscal/zscal for riscv64_generic
11 months ago
Martin Kroeker
63272b6c82
Merge pull request #5299 from martin-frbg/x86_64-ssezscal
Disable the default SSE kernels for x86_64 CSCAL/ZSCAL for now
11 months ago
Martin Kroeker
73af02b89f
use dummy2 as Inf/NAN handling flag
11 months ago
Martin Kroeker
549a9f1dbb
Disable the default SSE kernels for CSCAL/ZSCAL for now
11 months ago
Martin Kroeker
ca1ce84ee5
Merge pull request #5298 from martin-frbg/fixup5281
Fix PR5281 "kernel/arm64: fix cscal/zscal"
11 months ago
Martin Kroeker
58eeb9041c
fix handling of dummy2
11 months ago
Martin Kroeker
7c77537b25
Merge pull request #5297 from martin-frbg/zscal_x86_sparc
kernel/(x86|sparc): Fix cscal and zscal by reverting to the generic C kernels
11 months ago
Martin Kroeker
63287e1855
Merge pull request #5296 from martin-frbg/zscal_riscv
kernel/riscv64: Fix cscal and zscal
11 months ago
Martin Kroeker
d2855d3dab
Merge pull request #5285 from martin-frbg/zscal_zarch
kernel/zarch: Fix cscal and zscal
11 months ago
Martin Kroeker
1408be5fe0
Merge pull request #5282 from martin-frbg/zscal_power
kernel/power: Fixed cscal and zscal
11 months ago
Martin Kroeker
1589d0b21e
Merge pull request #5281 from martin-frbg/zscal_arm64
kernel/arm64: fixed cscal and zscal
11 months ago
Martin Kroeker
a86419fb66
Merge pull request #5280 from martin-frbg/zscal_x86_64
kernel/x86_64: fixed cscal and zscal
11 months ago
Martin Kroeker
11ff18bb0f
Merge pull request #5081 from XiWeiGu/kernel_generic_fixed_cscal_zscal
kernel/generic: Fixed cscal and zscal
11 months ago
Martin Kroeker
2e2691b34b
Merge pull request #5078 from XiWeiGu/la64_fixed_cscal_zscal
LoongArch64: fixed cscal and zscal
11 months ago
Martin Kroeker
f4194fc65f
Merge branch 'develop' into la64_fixed_cscal_zscal
11 months ago
Martin Kroeker
e12132abd4
Use generic C/ZSCAL kernels to address inf/nan handling for now
11 months ago
Martin Kroeker
1cefbea7ea
Use generic SCAL kernels to address inf/nan handling for now
11 months ago
Martin Kroeker
f18b7a46bf
add dummy2 flag handling for inf/nan agnostic zeroing
11 months ago
Martin Kroeker
fe220a0d7d
Merge pull request #5291 from guoyuanplct/develop
kernel/riscv64: fixed the performance problem in RISCV64_ZVL256 when OPENBLAS_K is small
11 months ago
Martin Kroeker
bbdc265798
Merge pull request #5294 from arnej27959/arnej/fix-arm64-register
Accumulate results in output register explicitly
11 months ago
Arne Juul
5442aff218
Accumulate results in output register explicitly
11 months ago
guoyuanplct
83fcab7578
Merge branch 'develop' of https://github.com/guoyuanplct/OpenBLAS into develop
11 months ago
guoyuanplct
2ae019161a
fixed the performance problem in RISCV64_ZVL256 when OPENBLAS_K is small
11 months ago
Martin Kroeker
02267d86f5
Merge pull request #5288 from guoyuanplct/develop
kernel/riscv64: Optimized the implementation of axpby on TARGET=RISCV64_ZVL256B.
11 months ago
guoyuanplct
d2003dc886
del lines
11 months ago