Martin Kroeker
1742decdcb
Merge pull request #5375 from lowkeyrossi/CI_for_WoA
Add CI support for building and validating OpenBLAS on WoA
6 months ago
Martin Kroeker
08df0f02d9
Merge pull request #5382 from martin-frbg/issue5379
Update cross-compilation instructions for the Android NDK
6 months ago
Martin Kroeker
7d7757acd1
Update cross-compilation instructions for the Android NDK
6 months ago
Martin Kroeker
666e1081ac
Merge pull request #5378 from martin-frbg/cpuid_lunarlake
Add ID data for Intel Lunar Lake ("Core Ultra 200V series")
6 months ago
Martin Kroeker
3ea6322eff
Merge pull request #5377 from Mousius/test-fixes
Improve bgemm and sbgemm testing
6 months ago
Martin Kroeker
848e9e6ba7
Add ID data for Intel Lunar Lake ("Core Ultra 200V series")
6 months ago
Chris Sidebottom
3f110c8272
Improve bgemm and sbgemm testing
- Fixes wrong return type for `is_close`
- Adds stricter compiler flags for test files so we don't see the above issue again
- Re-uses test helper functions between compare_sgemm_sbgemm/bgemm.c
6 months ago
newyork_loki
cb2c726716
Add CI support for OpenBLAS on WoA
6 months ago
newyork_loki
c8d41e4a32
Add CI support for OpenBLAS on WoA
6 months ago
Martin Kroeker
81b30d4538
Merge pull request #5374 from martin-frbg/fixup-5373
Fix compilation of the new bgemm test
6 months ago
Martin Kroeker
aad97c7763
Fix return type declaration
6 months ago
Martin Kroeker
7acb122a98
Merge pull request #5373 from Mousius/bgemm-optimized
Add optimized BGEMM kernel for NEOVERSEV1 target
6 months ago
Chris Sidebottom
740efd71c4
Add optimized BGEMM kernel for NEOVERSEV1 target
This also improves the testing and generic kernel by re-using the BF16
conversion functions.
Built on top of https://github.com/OpenMathLib/OpenBLAS/pull/5357 and derived from https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
6 months ago
Martin Kroeker
e927373f62
Merge pull request #5371 from martin-frbg/fixup-5357
Complete the infrastructure changes for adding BGEMM
6 months ago
Martin Kroeker
9a272fece6
Re-enable the BGEMM tests
6 months ago
Martin Kroeker
b54aec804e
remove spurious include
6 months ago
Martin Kroeker
343830c26f
Add BGEMM parameter tables
6 months ago
Martin Kroeker
b37516add6
Add BGEMM parameters
6 months ago
Martin Kroeker
d030f81380
Merge pull request #5369 from martin-frbg/lapack1144
Fix workspace allocation in LAPACKE strsen/dtrsen (Reference-LAPACK PR 1144)
6 months ago
Martin Kroeker
b746f0eda3
Allocate IWORK to hold at least one element for workspace queries
6 months ago
Martin Kroeker
b8f66ba0ee
Merge pull request #5367 from Mousius/bgemm-init
Temporarily disable test_bgemm
6 months ago
Martin Kroeker
cdebb4fd4b
Merge pull request #5365 from martin-frbg/issue5324
Fix arm64 HAVE_SME setting for DYNAMIC_ARCH builds using CMake
6 months ago
Martin Kroeker
ff614575c9
Fix arm64 HAVE_SME setting for DYNAMIC_ARCH builds
6 months ago
Martin Kroeker
0e11537cab
Merge pull request #5357 from Mousius/bgemm-init
Add infrastructure for BGEMM
6 months ago
Chris Sidebottom
8cd4be8d47
Temporarily disable test_bgemm
6 months ago
Chris Sidebottom
66d9185ebe
Fix CMake support
6 months ago
Martin Kroeker
98aefb70b4
Merge pull request #5292 from isharif168/optimized_gemv_n_1x3
Optimize gemv_n_sve_v1x3 kernel
6 months ago
Martin Kroeker
fd37406817
Merge branch 'develop' into optimized_gemv_n_1x3
6 months ago
Chris Sidebottom
48394384ef
Use correct constants for per-target BGEMM/SBGEMM
This fixes the build and tests on `NEOVERSEV1` target, which was failing
with specific constants for `SBGEMM`
Co-authored-by: Ye Tao <ye.tao@arm.com>
6 months ago
Chris Sidebottom
73bf0b941a
Add bgemm to gensymbol
6 months ago
Chris Sidebottom
f95e7b0e32
Add infrastructure for BGEMM
Sets up all the infrastructure for BGEMM support in OpenBLAS; hopefully I found all the right places.
Derived mostly from the previous work done in https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
7 months ago
Martin Kroeker
15d6e58510
Merge pull request #5364 from martin-frbg/blashalf
change BLAS_HALF to BLAS_BFLOAT16 in parallelized POTRF (another missed rename)
6 months ago
Martin Kroeker
04bb5acd79
change BLAS_HALF to BLAS_BFLOAT16 (another missed rename)
6 months ago
Martin Kroeker
3d31887073
Merge pull request #5362 from Mousius/fix-bf16
Fix SBGEMM BFLOAT16 build
6 months ago
Martin Kroeker
0ddf8ebd42
Merge pull request #5354 from pratiklp00/p11
Add Support for POWER11
6 months ago
Martin Kroeker
d2ea9bbb6d
Merge pull request #5363 from guoyuanplct/develop
Update CONTRIBUTORS.md
6 months ago
guoyuanplct
4ff549a450
Update CONTRIBUTORS.md
6 months ago
guoyuanplct
309c48e327
Update CONTRIBUTORS.md
6 months ago
Chris Sidebottom
552e1c7a7a
Correct compiler flags for NEOVERSEV1 target
6 months ago
Chris Sidebottom
46b9b7a080
Also enable BFLOAT16 for make cirun
6 months ago
Chris Sidebottom
eaaa628af2
Enable BUILD_BFLOAT16 in cirun
6 months ago
Chris Sidebottom
7a97c4ca97
Rename HALF -> BFLOAT16 in some more places
6 months ago
Martin Kroeker
ee6560c89f
Merge pull request #5360 from sertonix/cpuid-arm
Fix cpuid.S on arm
6 months ago
Sertonix
8d11e4630c
Fix cpuid.S on arm
The ARM assembly syntax differs a bit
Fixes: 61b9339d3a ("getarch/cpuid.S: Fix warning about executable stack")
Signed-off-by: Sertonix <sertonix@posteo.net>
6 months ago
Martin Kroeker
03a4afcf14
Merge pull request #5359 from martin-frbg/gitign_isnan
update gitignore configuration
6 months ago
Martin Kroeker
901de8f33a
remove lapacke_mangling.h and add la_xisnan.mod
6 months ago
Martin Kroeker
ce6991780a
Merge pull request #5356 from ilina-linaro/ilina-woa
Update README.md to include Windows on Arm64
6 months ago
Martin Kroeker
df013c5e28
Merge pull request #5358 from iha-taisei/dot_unroll
Performance improvements of [SD]DOT with loop-unrolling on A64FX
6 months ago
Iha, Taisei
f7ad906b49
Performance improvements of [SD]DOT with loop-unrolling on A64FX
6 months ago
Lina Iyer
7f360001f9
Update README.md to include Windows on Arm64
Update README.md to indicate that binaries are available for Windows on ARM64
7 months ago