Martin Kroeker
db7e5f1fa7
Update gemmt.c
1 year ago
Martin Kroeker
ff30ac9666
Update Makefile
1 year ago
Martin Kroeker
7c3e169b67
Update gemmt.c
1 year ago
Martin Kroeker
09414a4187
Ensure that GEMMTR name appears in XERBLA if gemmt was called as such
1 year ago
Martin Kroeker
c139b63342
Merge pull request #5107 from jhgit/develop
fix signedness of pointer to integer type passed to blas_lock()
1 year ago
John Hein
6cd9bbe531
fix signedness of pointer to integer type passed to blas_lock()
1 year ago
Martin Kroeker
5de5072940
Improve flang-new identification and add CI job for it on OSX-x86_64 (#5103)
* AzureCI: Add LLVM/flang-new build on OSX-x86_64
* distinguish classic flang from flang-new in name based recognition
1 year ago
Martin Kroeker
1f74fb9a07
Merge pull request #5101 from martin-frbg/issue5100
Fix CMake build for PPCG4 breaking due to unparsable KERNEL file
1 year ago
Martin Kroeker
d7036cfd74
Remove trailing blanks that break the cmake parser
1 year ago
Martin Kroeker
3375a0c990
Merge pull request #5099 from martin-frbg/issue5097-2
Simplify build instructions for Windows on Arm
1 year ago
Martin Kroeker
7a27e2b00d
Simplify build instructions for Windows on Arm
1 year ago
Martin Kroeker
fdeac17237
Merge pull request #5098 from martin-frbg/issue5095
Fix compilation with BUILD_BFLOAT16 enabled
1 year ago
Martin Kroeker
1829ac5b44
Add (dummy) declaration of SBROT_M
1 year ago
Martin Kroeker
53d20a83f3
Merge pull request #5089 from annop-w/gemv_t
Simplify gemv_t_sve_v1x3 kernel
1 year ago
Martin Kroeker
6e393a5599
Merge branch 'develop' into gemv_t
1 year ago
Martin Kroeker
9b11fd5802
Merge pull request #5088 from michalowski-arm/develop
Add thread throttling profile for SGEMV on `NEOVERSEV1`
1 year ago
Martin Kroeker
5930c162ef
Merge pull request #5097 from matthew-brett/fix-woa-cmd
Fix Windows on ARM build instructions
1 year ago
Marek Michalowski
838bb57e27
Merge branch 'develop' into develop
1 year ago
Matthew Brett
252c43265d
Fix Windows on ARM build instructions
The command as merged uses the compiler target as the compiler path.
I have run and tested a build with this command.
@Mugundanmcw - is this correct?
1 year ago
Martin Kroeker
876ba58e28
Merge pull request #5091 from goplanid/develop
Small gemm kernel improvements for AArch64
1 year ago
Martin Kroeker
a54f9a9c69
Merge pull request #5071 from annop-w/sgemm_throttling
Add thread throttling profile for SGEMM on NEOVERSEV1
1 year ago
Martin Kroeker
9f2319b46d
Merge pull request #5094 from martin-frbg/issue5093
Fix "make install" operation when CPP_THREAD_SAFETY_TEST is selected
1 year ago
Martin Kroeker
9faebb3c97
fix lost indentation in the rules for the thread safety test
1 year ago
Martin Kroeker
262018f14c
Merge pull request #5092 from XiWeiGu/la64_fixed_cmake
LoongArch64: Fixed cmake
1 year ago
Martin Kroeker
180ba5e7d0
Merge pull request #5069 from tingboliao/dev_rotm_20250107
Further rearranged the rotm kernel for the different architectures.
1 year ago
gxw
1ebcbdbab3
LoongArch64: Fixed the issue of using the old-style TARGET in cmake builds
1 year ago
Deeksha Goplani
d1bfa979f7
small gemm kernel packing modifications
1 year ago
Martin Kroeker
1a6a9fb22f
add another generator line for rotm
1 year ago
Martin Kroeker
518e376820
Merge pull request #5090 from martin-frbg/cmakeutils
Fix CMake interpretation of KERNEL file variables relevant to WoA
1 year ago
Martin Kroeker
111c9b0733
Add translations for C_COMPILER and OSNAME
1 year ago
Martin Kroeker
4924319c50
fix position of srotm, qrotm
1 year ago
Martin Kroeker
b58cba9eb6
fix qrotm build rules
1 year ago
Marek Michalowski
4d5b13f765
Add thread throttling profile for SGEMV on `NEOVERSEV1`
1 year ago
tingbo.liao
3c8df6358f
Further rearranged the rotm kernel for the different architectures.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
1 year ago
Annop Wongwathanarat
c0318cea6e
Simplify gemv_t_sve_v1x3 kernel
1 year ago
Martin Kroeker
76db346f7e
Merge pull request #5082 from martin-frbg/woa_cpuid
Get ARM64 TARGET information from the registry on Windows
1 year ago
Martin Kroeker
5f7b03a441
Merge pull request #5083 from martin-frbg/fixmips64ci
MIPS64 CI: fix breakage from inadvertent line join in yml file
1 year ago
Martin Kroeker
100e74d4d6
restore deleted line break
1 year ago
Martin Kroeker
ca3e1c8f9c
Get TARGET information from the registry on Windows
1 year ago
Martin Kroeker
87083fdbf6
[WIP] Work around assembler limitations in current LLVM for Windows on Arm (#5076)
* Protect align directives in assembly files that are currently problematic with LLVM on WoA
* use the armv8 zdot on WoA to work around other LLVM issues
1 year ago
Martin Kroeker
2954dc1a70
CI: Add NeoverseN2 build on the new Cobalt-100 (#5080)
* Add NeoverseN2 build
1 year ago
Martin Kroeker
7c3a920a81
CI: Update ubuntu-latest runners to fix side effects of switch to 24.04 (#5079)
1 year ago
Martin Kroeker
a7483d181b
Merge pull request #5074 from tingboliao/develop
Optimize the gemm_tcopy_8_rvv to be compatible with the vlens 128 and 256.
1 year ago
tingbo.liao
ef7f54b357
Optimized the gemm_tcopy_8_rvv to be compatible with the vlens 128 and 256.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
1 year ago
Martin Kroeker
eba7338484
Merge pull request #5073 from XiWeiGu/la64_update_symv_lsx_version
LoongArch64: Update symv lsx version
1 year ago
gxw
e0a8216554
LoongArch64: Update dsymv LSX version
1 year ago
gxw
a9070ba3f9
LoongArch64: Update ssymv LSX version
1 year ago
Martin Kroeker
9b981035db
Merge pull request #5070 from xry111/xry111/lasx-la664
LoongArch64: Fix dsymv and ssymv LASX version
1 year ago
Martin Kroeker
fee353e63d
Merge pull request #5072 from martin-frbg/azureosx13
Azure CI: update deprecated macos-12 jobs to macos-13 image
1 year ago
Martin Kroeker
0c0112dfef
update deprecated macos-12 jobs to macos-13 image
1 year ago