Harishmcw
|
030ae1fd97
|
Redefined threading logic for WoA
|
11 months ago |
Harish-Gits
|
daf16b8229
|
Adjusted GESV threading logic for optimal performance on WoA
|
11 months ago |
Martin Kroeker
|
f42ce7067f
|
Merge pull request #5116 from martin-frbg/issue5110
Handle INCX=0 in ?NRM2
|
11 months ago |
Martin Kroeker
|
7478c10268
|
Merge branch 'OpenMathLib:develop' into issue5110
|
11 months ago |
Martin Kroeker
|
c54f5417cc
|
Merge pull request #5118 from martin-frbg/zrot_utestext
Disable extended utests for CSROT/ZDROT that invoke undefined behavior
|
11 months ago |
Martin Kroeker
|
57208b8bce
|
Disable tests with incx,incy=0 (undefined behavior)
|
11 months ago |
Martin Kroeker
|
3a4a9b21eb
|
Disable tests with incx,incy=0 (undefined behavior)
|
11 months ago |
Martin Kroeker
|
60d0be0e97
|
Update nrm2.c
|
11 months ago |
Martin Kroeker
|
0fd5448b2c
|
Handle INCX=0
|
11 months ago |
Martin Kroeker
|
1b85b6a396
|
Merge pull request #5108 from taoye9/sbgemm_neoversev1
Add SBGEMM for arm neoversev1
|
11 months ago |
Martin Kroeker
|
cae480683a
|
Merge pull request #5113 from martin-frbg/issue5112
Ensure that GEMMTR name appears in XERBLA if GEMMT was called as such
|
11 months ago |
Martin Kroeker
|
db7e5f1fa7
|
Update gemmt.c
|
11 months ago |
Martin Kroeker
|
ff30ac9666
|
Update Makefile
|
11 months ago |
Martin Kroeker
|
7c3e169b67
|
Update gemmt.c
|
11 months ago |
Martin Kroeker
|
09414a4187
|
Ensure that GEMMTR name appears in XERBLA if gemmt was called as such
|
11 months ago |
Ye Tao
|
c748e6a338
|
optimized sbgemm kernel for neoverse-v1 (sve-256)
Signed-off-by: Ye Tao <ye.tao@arm.com>
|
1 year ago |
Aditya Tewari
|
4379a6fbe3
|
* checkpoint sbgemm for SVE-256
|
1 year ago |
Martin Kroeker
|
c139b63342
|
Merge pull request #5107 from jhgit/develop
fix signedness of pointer to integer type passed to blas_lock()
|
1 year ago |
John Hein
|
6cd9bbe531
|
fix signedness of pointer to integer type passed to blas_lock()
|
1 year ago |
Martin Kroeker
|
5de5072940
|
Improve flang-new identification and add CI job for it on OSX-x86_64 (#5103)
* AzureCI: Add LLVM/flang-new build on OSX-x86_64
* distinguish classic flang from flang-new in name based recognition
|
1 year ago |
Martin Kroeker
|
1f74fb9a07
|
Merge pull request #5101 from martin-frbg/issue5100
Fix CMake build for PPCG4 breaking due to unparsable KERNEL file
|
1 year ago |
Martin Kroeker
|
d7036cfd74
|
Remove trailing blanks that break the cmake parser
|
1 year ago |
Martin Kroeker
|
3375a0c990
|
Merge pull request #5099 from martin-frbg/issue5097-2
Simplify build instructions for Windows on Arm
|
1 year ago |
Martin Kroeker
|
7a27e2b00d
|
Simplify build instructions for Windows on Arm
|
1 year ago |
Martin Kroeker
|
fdeac17237
|
Merge pull request #5098 from martin-frbg/issue5095
Fix compilation with BUILD_BFLOAT16 enabled
|
1 year ago |
Martin Kroeker
|
1829ac5b44
|
Add (dummy) declaration of SBROT_M
|
1 year ago |
Martin Kroeker
|
53d20a83f3
|
Merge pull request #5089 from annop-w/gemv_t
Simplify gemv_t_sve_v1x3 kernel
|
1 year ago |
Martin Kroeker
|
6e393a5599
|
Merge branch 'develop' into gemv_t
|
1 year ago |
Martin Kroeker
|
9b11fd5802
|
Merge pull request #5088 from michalowski-arm/develop
Add thread throttling profile for SGEMV on `NEOVERSEV1`
|
1 year ago |
Martin Kroeker
|
5930c162ef
|
Merge pull request #5097 from matthew-brett/fix-woa-cmd
Fix Windows on ARM build instructions
|
1 year ago |
Marek Michalowski
|
838bb57e27
|
Merge branch 'develop' into develop
|
1 year ago |
Matthew Brett
|
252c43265d
|
Fix Windows on ARM build instructions
The command as merged uses the compiler target as the compiler path.
I have run and tested a build with this command.
@Mugundanmcw - is this correct?
|
1 year ago |
Martin Kroeker
|
876ba58e28
|
Merge pull request #5091 from goplanid/develop
Small gemm kernel improvements for AArch64
|
1 year ago |
Martin Kroeker
|
a54f9a9c69
|
Merge pull request #5071 from annop-w/sgemm_throttling
Add thread throttling profile for SGEMM on NEOVERSEV1
|
1 year ago |
Martin Kroeker
|
9f2319b46d
|
Merge pull request #5094 from martin-frbg/issue5093
Fix "make install" operation when CPP_THREAD_SAFETY_TEST is selected
|
1 year ago |
Martin Kroeker
|
9faebb3c97
|
fix lost indentation in the rules for the thread safety test
|
1 year ago |
Martin Kroeker
|
262018f14c
|
Merge pull request #5092 from XiWeiGu/la64_fixed_cmake
LoongArch64: Fixed cmake
|
1 year ago |
Martin Kroeker
|
180ba5e7d0
|
Merge pull request #5069 from tingboliao/dev_rotm_20250107
Further rearranged the rotm kernel for the different architectures.
|
1 year ago |
gxw
|
1ebcbdbab3
|
LoongArch64: Fixed the issue of using the old-style TARGET in cmake builds
|
1 year ago |
Deeksha Goplani
|
d1bfa979f7
|
small gemm kernel packing modifications
|
1 year ago |
Martin Kroeker
|
1a6a9fb22f
|
add another generator line for rotm
|
1 year ago |
Martin Kroeker
|
518e376820
|
Merge pull request #5090 from martin-frbg/cmakeutils
Fix CMake interpretation of KERNEL file variables relevant to WoA
|
1 year ago |
Martin Kroeker
|
111c9b0733
|
Add translations for C_COMPILER and OSNAME
|
1 year ago |
Martin Kroeker
|
4924319c50
|
fix position of srotm, qrotm
|
1 year ago |
Martin Kroeker
|
b58cba9eb6
|
fix qrotm build rules
|
1 year ago |
Marek Michalowski
|
4d5b13f765
|
Add thread throttling profile for SGEMV on `NEOVERSEV1`
|
1 year ago |
tingbo.liao
|
3c8df6358f
|
Further rearranged the rotm kernel for the different architectures.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
|
1 year ago |
Annop Wongwathanarat
|
c0318cea6e
|
Simplify gemv_t_sve_v1x3 kernel
|
1 year ago |
Martin Kroeker
|
76db346f7e
|
Merge pull request #5082 from martin-frbg/woa_cpuid
Get ARM64 TARGET information from the registry on Windows
|
1 year ago |
Martin Kroeker
|
5f7b03a441
|
Merge pull request #5083 from martin-frbg/fixmips64ci
MIPS64 CI: fix breakage from inadvertent line join in yml file
|
1 year ago |