Egbert Eich
51c11cf4bf
Create independent kernel Makefile & configuration when building DYNAMIC_ARCH
- For 'classic' builds, generate separate config_kernel_<TARGET>.h,
Makefile_<TARGET>.conf and getarch-<TARGET> files/binaries
- For cmake builds, generate separate getarch-<TARGET> binaries
for better debugging.
Signed-off-by: Egbert Eich <eich@suse.com>
4 years ago
Caroline Newcombe
5cc1111383
fix unsafe read of Y in assembly kernel
4 years ago
Wangyang Guo
225683218c
Small Matrix: use proper inline asm input constraint for AVX512 mask
4 years ago
Martin Kroeker
9c626e466e
really fix definition of SHUFFLE_MAGIC_NO
4 years ago
Martin Kroeker
0698212c8c
Remove stray $
4 years ago
Martin Kroeker
9d7429406f
Declare SHUFFLE_MAGIC_NO as const to placate clang
4 years ago
Martin Kroeker
d9894f45d3
Define sbgemm_r to fix DYNAMIC_ARCH builds
4 years ago
Martin Kroeker
522f809825
Merge pull request #3542 from martin-frbg/issue3540
Fix compilation for CooperLake on Windows/clang
4 years ago
Mosè Giordano
abbc947edb
Fix compilation of Skylake AVX512 kernels with GCC 6
4 years ago
Martin Kroeker
c62f8e2c01
Prevent compiler attempts to use k0 as mask register
4 years ago
Martin Kroeker
80eb581c83
Fix non-portable u_int64_t
4 years ago
Martin Kroeker
73ffabe6ba
Guard uses of _mm512_reduce_add_p?
4 years ago
Martin Kroeker
7656aba00e
Merge pull request #3493 from martin-frbg/casts+cleanup
WIP casts and cleanups
4 years ago
Martin Kroeker
addc2a7aaa
Add proper defaults for IMIN/IMAX
4 years ago
Martin Kroeker
299d4d70a3
Add default KERNEL file for Elbrus E2K arch
4 years ago
Martin Kroeker
3492bea602
Create Makefile
4 years ago
Martin Kroeker
898cf5faf3
Add Elbrus e2k architecture support
4 years ago
Martin Kroeker
c1c0d5ce1d
Merge pull request #3492 from binebrank/arm_sve_zgemm
SVE zgemm&cgemm (and other BLAS 3 complex)
4 years ago
Bine Brank
19d435b1b3
update armv8sve + contributors
4 years ago
Bine Brank
f158d59087
adapt CMake
4 years ago
Bine Brank
b6a445cfd8
adapt Makefile for SVE trsm
4 years ago
Bine Brank
0fb6cc07bf
fix ztrsm lt/ut copy
4 years ago
Bine Brank
f1315288a8
add sve ztrsm
4 years ago
Bine Brank
aaa2b1a861
fix sve dtrsm kernels
4 years ago
Bine Brank
8071e179f1
add remaining sve trsm copy kernels
4 years ago
Bine Brank
f87468ac91
trsm_lncopy_sve
4 years ago
Bine Brank
e8939b3d30
sve trsmRN and trsmRT
4 years ago
Bine Brank
098672b51b
add trsm_kernel_LT_sve
4 years ago
Bine Brank
be7e55880c
sve trsm_kernel_LN
4 years ago
Martin Kroeker
b6b024232d
Merge pull request #3508 from snadampal/v1_n2
OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics
4 years ago
Sunita Nadampalli
19c8f615dc
OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics
4 years ago
Bine Brank
bb33446b40
fix makefile.L3
4 years ago
Bine Brank
f33543d029
combine zchemm into single file
4 years ago
Bine Brank
0c91d043ae
adapt CMake for SVE
4 years ago
Bine Brank
39ab219704
sve copy functions for cgemm chemm zsymm
4 years ago
Bine Brank
18102ae8c3
add cgemm ctrmm sve kernels
4 years ago
Bine Brank
87537b8c55
modify sve zgemmcopy kernels
4 years ago
Bine Brank
d30157d891
update configuration of kernels for A64FX and ARMV8SVE
4 years ago
Bine Brank
07fa6fa3b1
configure Makefile for sve
4 years ago
Bine Brank
2e2c02b762
fix sve ztrmm kernel
4 years ago
Bine Brank
68c414d3a6
ztrmm sve copy functions
4 years ago
Bine Brank
ce329ab686
add sve zhemm copy routines
4 years ago
Bine Brank
0140373802
add sve ztrmm
4 years ago
Bine Brank
f7b6912868
ztrmm sve copy kernels
4 years ago
Bine Brank
40b14e4957
fix zgemm kernel
4 years ago
Bine Brank
6ec4aab875
zgemm sve copy routines
4 years ago
Bine Brank
878064f394
sve zgemm kernel
4 years ago
Bine Brank
683a7548bf
added macros for sve zgemm kernels
4 years ago
Martin Kroeker
7b146e590c
fix function typecast
4 years ago
Martin Kroeker
e9a0e52201
fix function typecast
4 years ago