abhishek-fujitsu
05fc88180c
ARM64: Enable bfloat16 kernels by default
8 months ago
Martin Kroeker
4272cf8c7f
Merge pull request #5398 from martin-frbg/fixup-5394
Update ?GEMM-to-?GEMV forwarding settings for CMake
6 months ago
Martin Kroeker
a5b55f6fe3
remove CBLAS restriction on GEMM_GEMV forwarding
6 months ago
Martin Kroeker
a4f4662459
Merge pull request #5397 from omegacoleman/fix-cblas-bgemm
Fix cmake building with cblas_bgemm
6 months ago
Martin Kroeker
82954ba4ca
Update ?GEMM-to-?GEMV forwarding settings
6 months ago
Martin Kroeker
392d38168e
Merge pull request #5394 from Mousius/optimize-bgemv
Optimized BGEMV for NEOVERSEV1 target
6 months ago
youcai
41f9701ebc
Fix cmake building with cblas_bgemm
6 months ago
Martin Kroeker
f4caa61e47
Merge pull request #5395 from martin-frbg/fixloongsonCI
Fix libffi6 download in the Loongarch64_clang CI job (for now)
6 months ago
Martin Kroeker
444d03db9c
switch to another site that still has libffi6 (for now)
6 months ago
Chris Sidebottom
2c3cdaf74e
Optimized BGEMV for NEOVERSEV1 target
- Adds bgemv T based off of sbgemv T kernel
- Adds bgemv N, which is slightly altered to not use Y as an
accumulator, because the output is bf16 and accumulating in it would
lose precision
- Enables BGEMM_GEMV_FORWARD to proxy BGEMM to BGEMV with new kernels
6 months ago
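The precision note in commit 2c3cdaf74e (not using Y as the accumulator when the output is bf16) can be illustrated with a small scalar sketch. This is not the NEOVERSEV1 kernel: the names `bf16_t`, `bgemv_n_acc_in_y`, `bgemv_n_acc_in_f32` and the demo functions are hypothetical, the matrix is assumed column-major, alpha and beta are left out, and NaN handling in the conversion is omitted. It only shows why rounding to bf16 after every column update loses information that a float32 accumulator keeps.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t bf16_t; /* raw bfloat16 bits (hypothetical type name) */

/* Widen bf16 to float32: bf16 is the upper 16 bits of a float32. */
static float bf16_to_f32(bf16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Narrow float32 to bf16 with round-to-nearest-even (NaN handling omitted). */
static bf16_t f32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return (bf16_t)(bits >> 16);
}

/* y = A*x with y itself as the accumulator: rounds to bf16 after every column. */
static void bgemv_n_acc_in_y(size_t m, size_t n, const bf16_t *a, size_t lda,
                             const bf16_t *x, bf16_t *y) {
    for (size_t i = 0; i < m; i++) y[i] = 0; /* bf16 +0.0 */
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < m; i++)
            y[i] = f32_to_bf16(bf16_to_f32(y[i]) +
                               bf16_to_f32(a[i + j * lda]) * bf16_to_f32(x[j]));
}

/* y = A*x with a float32 scratch accumulator: rounds to bf16 once at the end. */
static void bgemv_n_acc_in_f32(size_t m, size_t n, const bf16_t *a, size_t lda,
                               const bf16_t *x, bf16_t *y, float *acc) {
    for (size_t i = 0; i < m; i++) acc[i] = 0.0f;
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < m; i++)
            acc[i] += bf16_to_f32(a[i + j * lda]) * bf16_to_f32(x[j]);
    for (size_t i = 0; i < m; i++) y[i] = f32_to_bf16(acc[i]);
}

enum { DEMO_N = 512 };

/* 1 x 512 matrix of ones times a vector of ones, accumulated in bf16. */
float demo_acc_in_y(void) {
    bf16_t a[DEMO_N], x[DEMO_N], y[1];
    for (size_t j = 0; j < DEMO_N; j++) { a[j] = 0x3F80; x[j] = 0x3F80; } /* 1.0 */
    bgemv_n_acc_in_y(1, DEMO_N, a, 1, x, y);
    return bf16_to_f32(y[0]);
}

/* Same product, accumulated in float32 and rounded once. */
float demo_acc_in_f32(void) {
    bf16_t a[DEMO_N], x[DEMO_N], y[1];
    float acc[1];
    for (size_t j = 0; j < DEMO_N; j++) { a[j] = 0x3F80; x[j] = 0x3F80; }
    bgemv_n_acc_in_f32(1, DEMO_N, a, 1, x, y, acc);
    return bf16_to_f32(y[0]);
}
```

With 8 significant bits, the bf16 accumulator stalls at 256 (256 + 1 rounds back to 256), while the float32 accumulator reaches the exact sum of 512 before the single final rounding.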
Martin Kroeker
2f81d6e60c
Merge pull request #5390 from martin-frbg/issue5388-2
Declare the "small" complex DOT and AXPY kernels for RISCV-ZVL256B static in addition to inline
6 months ago
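As background for the `static` plus `inline` change above: in C99 and later, a function defined with plain `inline` (no `static`, no `extern`) provides only an inline definition, so if the compiler decides not to inline a call, the linker looks for an external definition that may not exist. Declaring the helper `static inline` gives each translation unit its own private copy. Whether this was the exact failure behind the change is an assumption here, and `small_dot` and `dot3_demo` below are hypothetical stand-ins, not the RISCV-ZVL256B kernels.

```c
#include <stddef.h>

/* Hypothetical "small" helper. With `static inline`, the compiler may inline
 * it or emit a private out-of-line copy; either way no external symbol is
 * required at link time. */
static inline float small_dot(size_t n, const float *x, const float *y) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i] * y[i];
    return acc;
}

/* Caller in the same translation unit. */
float dot3_demo(void) {
    const float x[3] = {1.0f, 2.0f, 3.0f};
    const float y[3] = {4.0f, 5.0f, 6.0f};
    return small_dot(3, x, y); /* 1*4 + 2*5 + 3*6 */
}
```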
Martin Kroeker
e2d941e9af
Declare the "small" kernel static in addition to inline
6 months ago
Martin Kroeker
8214700930
Declare the "small" kernel static in addition to inline
6 months ago
Martin Kroeker
4ae8707b54
Merge pull request #5389 from martin-frbg/issue5388
Add cross-compilation parameters for RISCV64 targets in CMake
6 months ago
Martin Kroeker
b24212f5df
fix numbers
6 months ago
Martin Kroeker
6ff06f5483
Add cross-compilation data for RISCV64 targets
6 months ago
Martin Kroeker
d92f151634
Merge pull request #5386 from martin-frbg/issue5384
Fixes for some gcc warnings
6 months ago
Martin Kroeker
30dbca5051
fix misleading indentation to silence a gcc warning
6 months ago
Martin Kroeker
38e6999295
format cleanup
6 months ago
Martin Kroeker
3df503cafd
portability fix and cleanup
6 months ago
Martin Kroeker
39c90f9859
Merge pull request #5380 from quic/topic/sgemm_direct_sme1_alpha_beta
SME1 based direct kernel (with alpha and beta) for cblas_sgemm level 3
6 months ago
Rajendra Prasad Matcha
eae0abfdb6
SME1 based direct kernel with alpha and beta for cblas_sgemm level 3 API.
6 months ago
Martin Kroeker
ac8cbfdd8e
Merge pull request #5381 from Mousius/bgemv-infrastructure
Add infrastructure for BGEMV
6 months ago
Martin Kroeker
1742decdcb
Merge pull request #5375 from lowkeyrossi/CI_for_WoA
Add CI support for building and validating OpenBLAS on WoA
6 months ago
Martin Kroeker
08df0f02d9
Merge pull request #5382 from martin-frbg/issue5379
Update cross-compilation instructions for the Android NDK
6 months ago
Martin Kroeker
7d7757acd1
Update cross-compilation instructions for the Android NDK
6 months ago
Chris Sidebottom
947d7af4c9
Fix CMake references to bscal and bgemv
6 months ago
Chris Sidebottom
72d2ebb4dd
Re-add GEMV fallback for Level3
6 months ago
Chris Sidebottom
e105411460
Add infrastructure for bgemv/bscal
- Sets up all the various entrypoints for `bgemv`
- Adds `bscal` for use in the `bgemv` interface
- Adds test cases for comparing `sgemv` and `bgemv`
- Adds generic kernels for `bgemv_n` and `bgemv_t` which are accurate
enough to pass the above tests
6 months ago
Martin Kroeker
666e1081ac
Merge pull request #5378 from martin-frbg/cpuid_lunarlake
Add ID data for Intel Lunar Lake ("Core Ultra 200V series")
6 months ago
Martin Kroeker
3ea6322eff
Merge pull request #5377 from Mousius/test-fixes
Improve bgemm and sbgemm testing
6 months ago
Martin Kroeker
848e9e6ba7
Add ID data for Intel Lunar Lake ("Core Ultra 200V series")
6 months ago
Chris Sidebottom
09a016fdf6
Split sbgemv test from sbgemm test
6 months ago
Chris Sidebottom
3f110c8272
Improve bgemm and sbgemm testing
- Fixes wrong return type for `is_close`
- Adds stricter compiler flags for test files so we don't see the above
issue again
- Re-uses test helper functions between compare_sgemm_sbgemm/bgemm.c
6 months ago
newyork_loki
cb2c726716
Add CI support for OpenBLAS on WoA
6 months ago
newyork_loki
c8d41e4a32
Add CI support for OpenBLAS on WoA
6 months ago
Martin Kroeker
81b30d4538
Merge pull request #5374 from martin-frbg/fixup-5373
Fix compilation of the new bgemm test
6 months ago
Martin Kroeker
aad97c7763
Fix return type declaration
6 months ago
Martin Kroeker
7acb122a98
Merge pull request #5373 from Mousius/bgemm-optimized
Add optimized BGEMM kernel for NEOVERSEV1 target
6 months ago
Chris Sidebottom
740efd71c4
Add optimized BGEMM kernel for NEOVERSEV1 target
This also improves the testing and generic kernel by re-using the BF16
conversion functions.
Built on top of https://github.com/OpenMathLib/OpenBLAS/pull/5357 and derived from https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
6 months ago
Martin Kroeker
e927373f62
Merge pull request #5371 from martin-frbg/fixup-5357
Complete the infrastructure changes for adding BGEMM
6 months ago
Martin Kroeker
9a272fece6
Re-enable the BGEMM tests
6 months ago
Martin Kroeker
b54aec804e
remove spurious include
6 months ago
Martin Kroeker
343830c26f
Add BGEMM parameter tables
6 months ago
Martin Kroeker
b37516add6
Add BGEMM parameters
6 months ago
Martin Kroeker
d030f81380
Merge pull request #5369 from martin-frbg/lapack1144
Fix workspace allocation in LAPACKE strsen/dtrsen (Reference-LAPACK PR 1144)
6 months ago
Martin Kroeker
b746f0eda3
Allocate IWORK to hold at least the one element for workspace queries
6 months ago
Martin Kroeker
b8f66ba0ee
Merge pull request #5367 from Mousius/bgemm-init
Temporarily disable test_bgemm
6 months ago
Martin Kroeker
cdebb4fd4b
Merge pull request #5365 from martin-frbg/issue5324
Fix arm64 HAVE_SME setting for DYNAMIC_ARCH builds using CMake
6 months ago
Martin Kroeker
ff614575c9
Fix arm64 HAVE_SME setting for DYNAMIC_ARCH builds
6 months ago