Martin Kroeker
eb4879e04c
make NAN handling depend on the dummy2 parameter
1 year ago
Martin Kroeker
ee87cb90d0
Merge pull request #4803 from iha-taisei/SVESupportSDGEMV
A64FX: Add support for SVE to SGEMV/DGEMV kernels.
1 year ago
Martin Kroeker
e9f6aa46a4
Merge pull request #4800 from vlad0x00/patch-2
Add missing parentheses
1 year ago
Martin Kroeker
b1aa2e1768
Merge pull request #4802 from markdryan/markdryan/rvv_axpby_incy0
Fix axpby_rvv kernels for cases where inc_y = 0
1 year ago
iha fujitsu
0985fdc82b
A64FX: Add support for SVE to SGEMV/DGEMV kernels.
1 year ago
Vladimir Nikolić
56e1782ffb
Add another missing parenthesis
1 year ago
Vladimir Nikolić
127ea5d0d9
Add missing parenthesis
1 year ago
Martin Kroeker
a3c10c6c25
Merge pull request #4799 from martin-frbg/issue4762
Improve the error message for (p)thread creation failure
1 year ago
Martin Kroeker
a373d0f107
Improve the error message for thread creation failure
1 year ago
Mark Ryan
67bf4b6998
Fix axpby_rvv kernels for cases where inc_y = 0
The following openblas_utest tests fail when RISCV64_ZVL128B is
enabled.
TEST 89/103 axpby:zaxpby_inc_0 [FAIL]
TEST 92/103 axpby:caxpby_inc_0 [FAIL]
TEST 95/103 axpby:daxpby_inc_0 [FAIL]
TEST 98/103 axpby:saxpby_inc_0 [FAIL]
The issue is that the vectorized kernels do not work when inc_y == 0.
This patch updates the kernels to fall back to the scalar algorithms
when inc_y == 0, fixing the failing tests.
Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
1 year ago
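The fallback described in the commit message above can be illustrated with a minimal sketch. This is not the OpenBLAS RVV kernel: the names `axpby_scalar`, `axpby_lanes`, `axpby` and `AXPBY_LANES` are hypothetical, and `axpby_lanes` only emulates a "load all lanes, compute, store all lanes" vector step in plain C. It also assumes that the scalar algorithm updates y one element at a time, so with inc_y == 0 each update of y[0] sees the previous one, while a lane-parallel step computes every lane from the same old value.

```c
#include <stddef.h>

#define AXPBY_LANES 4u  /* hypothetical vector width, for illustration only */

/* Scalar reference: y[i*inc_y] = alpha*x[i*inc_x] + beta*y[i*inc_y],
 * applied strictly one element after another. */
static void axpby_scalar(size_t n, double alpha, const double *x, size_t inc_x,
                         double beta, double *y, size_t inc_y)
{
    for (size_t i = 0; i < n; i++)
        y[i * inc_y] = alpha * x[i * inc_x] + beta * y[i * inc_y];
}

/* Emulated vector step: all lanes are loaded and computed before any
 * lane is stored. With inc_y == 0 every lane reads the same old y[0]. */
static void axpby_lanes(size_t n, double alpha, const double *x, size_t inc_x,
                        double beta, double *y, size_t inc_y)
{
    for (size_t base = 0; base < n; base += AXPBY_LANES) {
        size_t len = (n - base < AXPBY_LANES) ? n - base : AXPBY_LANES;
        double tmp[AXPBY_LANES];
        for (size_t l = 0; l < len; l++)   /* strided load and compute */
            tmp[l] = alpha * x[(base + l) * inc_x]
                   + beta * y[(base + l) * inc_y];
        for (size_t l = 0; l < len; l++)   /* strided store */
            y[(base + l) * inc_y] = tmp[l];
    }
}

/* Dispatcher: fall back to the scalar loop when inc_y == 0. */
static void axpby(size_t n, double alpha, const double *x, size_t inc_x,
                  double beta, double *y, size_t inc_y)
{
    if (inc_y == 0)
        axpby_scalar(n, alpha, x, inc_x, beta, y, inc_y);
    else
        axpby_lanes(n, alpha, x, inc_x, beta, y, inc_y);
}
```

With n = 3, alpha = beta = 1, x = {1, 2, 3} and y[0] = 10, the scalar path accumulates 11, 13, 16 into y[0], whereas calling the emulated lane path directly with inc_y = 0 leaves 13. For non-zero inc_y the two paths agree, which is why the dispatcher only special-cases inc_y == 0.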
Martin Kroeker
6013b36b16
Merge pull request #4796 from martin-frbg/ppcbuf
Suffix BUFFER_SIZEs on POWER as UL to prevent int overflow in computations
1 year ago
Martin Kroeker
9789034281
Merge branch 'OpenMathLib:develop' into ppcbuf
1 year ago
Martin Kroeker
5d08ec7ff3
Merge pull request #4782 from martin-frbg/azurewincl
Fix NAN handling in ARM/generic SCAL; have AzureCI Windows show errors on failure
1 year ago
Martin Kroeker
dfc11ef248
Merge pull request #4791 from ChipKerchner/vectorizeSBGEMMincopy
Vectorize SBGEMM incopy for Power10 - 4x faster.
1 year ago
Martin Kroeker
2fefdfa2b8
Merge branch 'OpenMathLib:develop' into azurewincl
1 year ago
Martin Kroeker
475bd2452b
Suffix BUFFERSIZEs as UL to prevent int overflow in computations
1 year ago
Martin Kroeker
b70227ad62
Merge pull request #4795 from pkubaj/patch-1
Fix build on FreeBSD/powerpc64*
1 year ago
Martin Kroeker
8277828fdc
Merge pull request #4785 from rgommers/docs-install
Rewrite "Install OpenBLAS" docs page
1 year ago
Martin Kroeker
f0fc7249f1
Merge pull request #4792 from martin-frbg/issue4790
Fix core assignment in cpu detection for Intel family 15
1 year ago
Martin Kroeker
362856fece
Merge pull request #4778 from JAicewizard/develop
Add support for RISCV64_GENERIC in cmake
1 year ago
Martin Kroeker
1d77647d1b
Merge pull request #4769 from drupol/fix-buffersize-value
openblas: fix `BUFFERSIZE` value
1 year ago
Piotr Kubaj
4c12090776
Fix build on FreeBSD/powerpc64*
1 year ago
Chip Kerchner
f708944fea
Add all 4 variations of the SBGEMM to compare_sgemm_sbgemm
1 year ago
Martin Kroeker
e706bc1ec0
Fix core assignment for Intel family 15
1 year ago
Chip Kerchner
cb154832f8
Vectorize SBGEMM incopy - 4x faster.
1 year ago
Martin Kroeker
a5c04e326a
Update scal.c
1 year ago
Ralf Gommers
268dcd8f45
docs: convert remaining install sections (Android, iOS, FreeBSD, Cortex-M)
1 year ago
Ralf Gommers
452014341e
docs: rework building from source on Windows section
1 year ago
Ralf Gommers
4547908901
docs: rewrite "Install OpenBLAS" page (part 1: binaries, basic from source)
1 year ago
Martin Kroeker
e1eef56e05
Merge pull request #4783 from martin-frbg/cpuid_meteor
Add another CPUID for Intel Meteor Lake
1 year ago
Martin Kroeker
536200bc9e
fix handling of INF or NAN
1 year ago
Martin Kroeker
3063d03021
Add another CPUID for Meteor Lake
1 year ago
Martin Kroeker
b422742899
collect error output from ctest, if any
1 year ago
Jaap Aarts
cea4abcac0
Fix compiling on mingw
1 year ago
Martin Kroeker
f729013d2e
Merge pull request #4781 from rgommers/fix-docs-deployment
fix CI job to deploy docs, and make it run on pull requests too
1 year ago
Ralf Gommers
6ede8b14c6
ci: fix CI job to deploy docs, and make it run on pull requests too
1 year ago
Martin Kroeker
9836883ee9
Merge pull request #4780 from martin-frbg/azureosx12
AzureCI: Update OSX jobs to use the macos-12 image
1 year ago
Martin Kroeker
df81b159e8
Merge pull request #4774 from rgommers/improve-docs
Improve documentation content, formatting, and html theme
1 year ago
Martin Kroeker
2df4007425
Update compiler and sdk versions for osx
1 year ago
Martin Kroeker
acf0c3ccaf
Merge pull request #4777 from ev-br/sgesdd_ci_err
ignore the gesdd failure on codspeed
1 year ago
Martin Kroeker
74f059a3ce
Update OSX jobs to use the macos-12 image
1 year ago
Evgeni Burovski
cd3c167c28
ignore sgesdd failure on codspeed
In https://github.com/OpenMathLib/OpenBLAS/issues/4776
we're hitting
** On entry to SLASCL parameter number 4 had an illegal value
on codspeed, but not outside (either locally or on github runners)
1 year ago
Jaap Aarts
9d0abe2d26
Add support for RISCV64_GENERIC in cmake
1 year ago
Evgeni Burovski
5b385fd453
WIP: fish out the gesdd failure?
1 year ago
Ralf Gommers
c1c0dbfd60
docs: address review comments on PR 4774
1 year ago
Martin Kroeker
bdb6069051
Merge pull request #4775 from martin-frbg/issue4770
Guard against invalid thread_status.queue
1 year ago
Martin Kroeker
4052b312b2
Merge pull request #4763 from ev-br/sync-codspeed
BENCH: sync codspeed-benchmarks with BLAS-benchmarks
1 year ago
Martin Kroeker
3677b3886c
Merge pull request #4702 from bashimao/detect-nv-grace
Correctly detect ARM Neoverse V2 CPUs.
1 year ago
Martin Kroeker
d0b9948b23
Guard against invalid thread_status.queue
1 year ago
Ralf Gommers
ca9a0c28e8
docs: improve extensions page
1 year ago