Annop Wongwathanarat
e11744a411
Use SVE kernel for S/DGEMVN for SVE machines
9 months ago
Usui, Tetsuzo
d711906e3e
Add symv kernels for arm64
9 months ago
Annop Wongwathanarat
ec146157d3
Use SVE kernel for S/DGEMVT for SVE machines
10 months ago
manjam01
5c4e38ab17
Optimize gemv_n_sve kernel
11 months ago
Martin Kroeker
229d8a025e
Merge pull request #4959 from CDAC-Bengaluru/level-1-sve
SVE Implementation for Level-1 BLAS Routines
1 year ago
CDAC-SSDG
06ffd411a5
Update KERNEL.ARMV8SVE
1 year ago
Martin Kroeker
5fe983db29
retire the thunderx2 nrm2 kernels for now due to NAN and inaccuracies
1 year ago
SushilPratap04
fa880ab1cf
Update KERNEL.ARMV8SVE
Updated KERNEL.ARMV8SVE for the level 1 SVE (swap, rot and scal) kernels.
1 year ago
Chris Sidebottom
7a6fa699f2
Small GEMM for AArch64
This is a fairly conservative addition of small matrix kernels using SVE.
1 year ago
Martin Kroeker
12787775d9
add csum/zsum kernels (trivially derived from the asum ones)
1 year ago
Chris Sidebottom
84a268b6ca
Use SVE zgemm/cgemm on Arm(R) Neoverse(TM) V1 core
This patch removes the prefetches from cgemm/zgemm, which improves performance much as the same change did for sgemm/dgemm in #3868, so I'm happy to enable this on any applicable cores.
I also replicated the copy unrolling from sgemm and dgemm.
2 years ago
Chris Sidebottom
aea2a4622b
Use latest non-SVE kernels in ARMV8SVE
These are generally better and, in some cases, include threading, which helps on the cores we're targeting here.
2 years ago
Chris Sidebottom
ec334e69dc
Use SVE kernel for SGEMM/DGEMM on Arm(R) Neoverse(TM) V1
This re-spins #3869 with some additional copy unrolling, which helps maintain SYRK performance.
After #3868, the SVE kernels represent a pretty good boost.
This re-uses ARMV8SVE as a base, and I'm going to incrementally move everything to use ARMV8SVE in additional patches (as well as fix up anything that's not already in ARMV8SVE).
2 years ago
Bine Brank
19d435b1b3
update armv8sve + contributors
4 years ago
Bine Brank
f33543d029
combine zchemm into single file
4 years ago
Bine Brank
d30157d891
update configuration of kernels for A64FX and ARMV8SVE
4 years ago
Bine Brank
a8f62a347b
fix UNROLL_MN and add to targets for SVE
4 years ago
Bine Brank
9b9cb90bb1
modify Makefile for SVE copy
4 years ago
Bine Brank
b58d4f31ab
some clean-up & commentary
4 years ago
Bine Brank
7093372e32
add ARMV8SVE target
4 years ago