Martin Kroeker
1742decdcb
Merge pull request #5375 from lowkeyrossi/CI_for_WoA
Add CI support for building and validating OpenBLAS on WoA
10 months ago
Martin Kroeker
08df0f02d9
Merge pull request #5382 from martin-frbg/issue5379
Update cross-compilation instructions for the Android NDK
10 months ago
Martin Kroeker
7d7757acd1
Update cross-compilation instructions for the Android NDK
10 months ago
Martin Kroeker
666e1081ac
Merge pull request #5378 from martin-frbg/cpuid_lunarlake
Add ID data for Intel Lunar Lake ("Core Ultra 200V series")
10 months ago
Martin Kroeker
3ea6322eff
Merge pull request #5377 from Mousius/test-fixes
Improve bgemm and sbgemm testing
10 months ago
Martin Kroeker
848e9e6ba7
Add ID data for Intel Lunar Lake ("Core Ultra 200V series")
10 months ago
Chris Sidebottom
3f110c8272
Improve bgemm and sbgemm testing
- Fixes wrong return type for `is_close`
- Adds stricter compiler flags for test files so we don't see the above issue again
- Re-uses test helper functions between compare_sgemm_sbgemm/bgemm.c
10 months ago
newyork_loki
cb2c726716
Add CI support for OpenBLAS on WoA
10 months ago
newyork_loki
c8d41e4a32
Add CI support for OpenBLAS on WoA
10 months ago
Martin Kroeker
81b30d4538
Merge pull request #5374 from martin-frbg/fixup-5373
Fix compilation of the new bgemm test
10 months ago
Martin Kroeker
aad97c7763
Fix return type declaration
10 months ago
Martin Kroeker
7acb122a98
Merge pull request #5373 from Mousius/bgemm-optimized
Add optimized BGEMM kernel for NEOVERSEV1 target
10 months ago
Chris Sidebottom
740efd71c4
Add optimized BGEMM kernel for NEOVERSEV1 target
This also improves the testing and generic kernel by re-using the BF16 conversion functions.
Built on top of https://github.com/OpenMathLib/OpenBLAS/pull/5357 and derived from https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
10 months ago
Martin Kroeker
e927373f62
Merge pull request #5371 from martin-frbg/fixup-5357
Complete the infrastructure changes for adding BGEMM
10 months ago
Martin Kroeker
9a272fece6
Re-enable the BGEMM tests
10 months ago
Martin Kroeker
b54aec804e
remove spurious include
10 months ago
Martin Kroeker
343830c26f
Add BGEMM parameter tables
10 months ago
Martin Kroeker
b37516add6
Add BGEMM parameters
10 months ago
Martin Kroeker
d030f81380
Merge pull request #5369 from martin-frbg/lapack1144
Fix workspace allocation in LAPACKE strsen/dtrsen (Reference-LAPACK PR 1144)
10 months ago
Martin Kroeker
b746f0eda3
Allocate IWORK to hold at least the one element for workspace queries
10 months ago
Martin Kroeker
b8f66ba0ee
Merge pull request #5367 from Mousius/bgemm-init
Temporarily disable test_bgemm
10 months ago
Martin Kroeker
cdebb4fd4b
Merge pull request #5365 from martin-frbg/issue5324
Fix arm64 HAVE_SME setting for DYNAMIC_ARCH builds using CMake
10 months ago
Martin Kroeker
ff614575c9
Fix arm64 HAVE_SME setting for DYNAMIC_ARCH builds
10 months ago
Martin Kroeker
0e11537cab
Merge pull request #5357 from Mousius/bgemm-init
Add infrastructure for BGEMM
10 months ago
Chris Sidebottom
8cd4be8d47
Temporarily disable test_bgemm
10 months ago
Chris Sidebottom
66d9185ebe
Fix CMake support
10 months ago
Martin Kroeker
98aefb70b4
Merge pull request #5292 from isharif168/optimized_gemv_n_1x3
Optimize gemv_n_sve_v1x3 kernel
10 months ago
Martin Kroeker
fd37406817
Merge branch 'develop' into optimized_gemv_n_1x3
10 months ago
Chris Sidebottom
48394384ef
Use correct constants for per-target BGEMM/SBGEMM
This fixes the build and tests on the `NEOVERSEV1` target, which were failing with specific constants for `SBGEMM`.
Co-authored-by: Ye Tao <ye.tao@arm.com>
10 months ago
Chris Sidebottom
73bf0b941a
Add bgemm to gensymbol
10 months ago
Chris Sidebottom
f95e7b0e32
Add infrastructure for BGEMM
Setting up all the infrastructure for BGEMM support in OpenBLAS; hopefully I found all the right places.
Derived mostly from the previous work done in https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
10 months ago
Martin Kroeker
15d6e58510
Merge pull request #5364 from martin-frbg/blashalf
change BLAS_HALF to BLAS_BFLOAT16 in parallelized POTRF (another missed rename)
10 months ago
Martin Kroeker
04bb5acd79
change BLAS_HALF to BLAS_BFLOAT16 (another missed rename)
10 months ago
Martin Kroeker
3d31887073
Merge pull request #5362 from Mousius/fix-bf16
Fix SBGEMM BFLOAT16 build
10 months ago
Martin Kroeker
0ddf8ebd42
Merge pull request #5354 from pratiklp00/p11
Add Support for POWER11
10 months ago
Martin Kroeker
d2ea9bbb6d
Merge pull request #5363 from guoyuanplct/develop
Update CONTRIBUTORS.md
10 months ago
guoyuanplct
4ff549a450
Update CONTRIBUTORS.md
10 months ago
guoyuanplct
309c48e327
Update CONTRIBUTORS.md
10 months ago
Chris Sidebottom
552e1c7a7a
Correct compiler flags for NEOVERSEV1 target
10 months ago
Chris Sidebottom
46b9b7a080
Also enable BFLOAT16 for make cirun
10 months ago
Chris Sidebottom
eaaa628af2
Enable BUILD_BFLOAT16 in cirun
10 months ago
Chris Sidebottom
7a97c4ca97
Rename HALF -> BFLOAT16 in some more places
10 months ago
Martin Kroeker
ee6560c89f
Merge pull request #5360 from sertonix/cpuid-arm
Fix cpuid.S on arm
10 months ago
Sertonix
8d11e4630c
Fix cpuid.S on arm
The ARM assembly syntax differs a bit
Fixes 61b9339d3a getarch/cpuid.S: Fix warning about executable stack
Signed-off-by: Sertonix <sertonix@posteo.net>
10 months ago
Martin Kroeker
03a4afcf14
Merge pull request #5359 from martin-frbg/gitign_isnan
update gitignore configuration
10 months ago
Martin Kroeker
901de8f33a
remove lapacke_mangling.h and add la_xisnan.mod
10 months ago
Martin Kroeker
ce6991780a
Merge pull request #5356 from ilina-linaro/ilina-woa
Update README.md to include Windows on Arm64
10 months ago
Martin Kroeker
df013c5e28
Merge pull request #5358 from iha-taisei/dot_unroll
Performance improvements of [SD]DOT with loop-unrolling on A64FX
10 months ago
Iha, Taisei
f7ad906b49
Performance improvements of [SD]DOT with loop-unrolling on A64FX
10 months ago
Lina Iyer
7f360001f9
Update README.md to include Windows on Arm64
Update README.md to indicate that binaries are available for Windows on ARM64
10 months ago