Martin Kroeker | 73af02b89f | use dummy2 as Inf/NAN handling flag | 10 months ago
Martin Kroeker | ca1ce84ee5 | Merge pull request #5298 from martin-frbg/fixup5281: Fix PR5281 "kernel/arm64: fix cscal/zscal" | 10 months ago
Martin Kroeker | 58eeb9041c | fix handling of dummy2 | 10 months ago
Martin Kroeker | 7c77537b25 | Merge pull request #5297 from martin-frbg/zscal_x86_sparc: kernel/(x86|sparc): Fix cscal and zscal by reverting to the generic C kernels | 10 months ago
Martin Kroeker | 63287e1855 | Merge pull request #5296 from martin-frbg/zscal_riscv: kernel/riscv64: Fix cscal and zscal | 10 months ago
Martin Kroeker | d2855d3dab | Merge pull request #5285 from martin-frbg/zscal_zarch: kernel/zarch: Fix cscal and zscal | 10 months ago
Martin Kroeker | 1408be5fe0 | Merge pull request #5282 from martin-frbg/zscal_power: kernel/power: Fixed cscal and zscal | 10 months ago
Martin Kroeker | 1589d0b21e | Merge pull request #5281 from martin-frbg/zscal_arm64: kernel/arm64: fixed cscal and zscal | 10 months ago
Martin Kroeker | a86419fb66 | Merge pull request #5280 from martin-frbg/zscal_x86_64: kernel/x86_64: fixed cscal and zscal | 10 months ago
Martin Kroeker | 11ff18bb0f | Merge pull request #5081 from XiWeiGu/kernel_generic_fixed_cscal_zscal: kernel/generic: Fixed cscal and zscal | 10 months ago
Martin Kroeker | 2e2691b34b | Merge pull request #5078 from XiWeiGu/la64_fixed_cscal_zscal: LoongArch64: fixed cscal and zscal | 10 months ago
Martin Kroeker | f4194fc65f | Merge branch 'develop' into la64_fixed_cscal_zscal | 11 months ago
Martin Kroeker | e12132abd4 | Use generic C/ZSCAL kernels to address inf/nan handling for now | 11 months ago
Martin Kroeker | 1cefbea7ea | Use generic SCAL kernels to address inf/nan handling for now | 11 months ago
Martin Kroeker | f18b7a46bf | add dummy2 flag handling for inf/nan agnostic zeroing | 11 months ago
Martin Kroeker | fe220a0d7d | Merge pull request #5291 from guoyuanplct/develop: kernel/riscv64:fixed the performance problem in RISCV64_ZVL256 when OPENBLAS_K is small | 11 months ago
Martin Kroeker | bbdc265798 | Merge pull request #5294 from arnej27959/arnej/fix-arm64-register: Accumulate results in output register explicitly | 11 months ago
Arne Juul | 5442aff218 | Accumulate results in output register explicitly | 11 months ago
guoyuanplct | 83fcab7578 | Merge branch 'develop' of https://github.com/guoyuanplct/OpenBLAS into develop | 11 months ago
guoyuanplct | 2ae019161a | fixed the performance problem in RISCV64_ZVL256 when OPENBLAS_K is small | 11 months ago
Martin Kroeker | 02267d86f5 | Merge pull request #5288 from guoyuanplct/develop: kernel/riscv64:Optimized the implementation of axpby on TARGET=RISCV64_ZVL256B. | 11 months ago
guoyuanplct | d2003dc886 | del lines | 11 months ago
guoyuanplct | 45fd2d9b07 | Optimized the axpby function. | 11 months ago
Martin Kroeker | fb8dc8ff5c | Add dummy2 flag handling | 11 months ago
Martin Kroeker | cf06250d36 | add handling of dummy2 flag | 11 months ago
Martin Kroeker | 28f8fdaf0f | support flag for NaN/Inf handling and fix scaling of NaN/Inf values | 11 months ago
Martin Kroeker | 669c847ceb | support extra flag for NaN handling | 11 months ago
Martin Kroeker | 0163143fdd | Merge pull request #5278 from martin-frbg/fixup5276: Fix compilation with pre-C99 compilers | 11 months ago
Martin Kroeker | 20f2ba0141 | Move declaration of i for pre-C99 compilers | 11 months ago
Martin Kroeker | e2e6a4d90a | Merge pull request #5276 from nakagawa-fj/gemm_2d_thread_partitioning: Improvement of 2D thread-partitioned GEMM for M << N case | 11 months ago
Martin Kroeker | 9ef5995c22 | Merge pull request #5277 from martin-frbg/fixmingw32: Fix building with mingw32-gcc15 | 11 months ago
Martin Kroeker | 42b7d1f897 | Fix addressing of alpha in CBLAS | 11 months ago
Martin Kroeker | bd573a9d38 | Expand mingw32 gfortran workaround to all versions after 14.1 | 11 months ago
Masato Nakagawa | 2351a98005 | Update 2D thread-partitioned GEMM for M << N case. | 11 months ago
Martin Kroeker | a5f701c4ab | Merge pull request #5274 from martin-frbg/issue5247: Expressly provide a shared libs option in CMakelists.txt | 11 months ago
Martin Kroeker | 4ca76d9de4 | Expressly provide a shared libs option | 11 months ago
Martin Kroeker | 846a5436e7 | Merge pull request #5273 from martin-frbg/issue5259: CMAKE: Do not suffix the library with a 64 if LIBNAMESUFFIX already contains it | 11 months ago
Martin Kroeker | 8779eac3b8 | Do not add a 64 suffix to the library name if the user-provided suffix already contains it | 11 months ago
Martin Kroeker | 3473118213 | Merge pull request #5272 from martin-frbg/issue5271: Fix compiler options for NeoverseN1 and CortexX2/A?10 in CMake builds | 11 months ago
Martin Kroeker | f2022c23ac | Remove sve capability from NeoverseN1 and specify CortexX2/A?10 as arm8.4a | 11 months ago
Martin Kroeker | b5456c1b41 | Merge pull request #5260 from taoye9/enable_bf16_gemm_gemv_forward_on_arm64: enable sbgemm to be forward to sbgemv on arm64 | 11 months ago
Martin Kroeker | 5a322f21af | Merge pull request #5268 from martin-frbg/fix-dyn-sgemmdirect: Fix conditional inclusion of SGEMM_KERNEL_DIRECT | 11 months ago
Martin Kroeker | 6680e0592f | Fix conditional inclusion of SGEMM_KERNEL_DIRECT | 11 months ago
Martin Kroeker | 0b0bb9951d | Merge pull request #5265 from guoyuanplct/develop: kernel/riscv64:Added support for omatcopy on RISCV64_ZVL256B | 11 months ago
guoyuanplct | 7732a55200 | Add retry mechanism after deadlock timeout for c910v. | 11 months ago
guoyuanplct | be9f7550b5 | Format Code | 11 months ago
guoyuanplct | 4d213653d8 | kernel/riscv64:Added support for omatcopy on riscv64. | 11 months ago
Martin Kroeker | 8afddc1a81 | Merge pull request #5262 from guoyuanplct/develop: kernel/riscv64:Fixed the bug of openblas_utest_ext failing in c/zgemv and some c/zgbmv tests: | 11 months ago
guoyuanplct | 9a7e3f102b | kernel/riscv64:Fixed the bug of openblas_utest_ext failing in c/zgemv and some c/zgbmv tests: | 11 months ago
Martin Kroeker | 5366902f9d | Merge pull request #5261 from ErnstPeng/fix-lasx: Fix cgemm_ncopy_16_lasx function for lapack-test and add it C function | 11 months ago