Chris Sidebottom
fd4f52c797
Add SVE implementation for sdot/ddot
This adds an SVE implementation to sdot/ddot when available, falling back to the previous Advanced SIMD kernel where there's no SVE implementation for the kernel.
All the targets were essentially treating `dot_thunderx2t99.c` as the Advanced SIMD implementation so I've renamed it to better fit with the feature detection.
3 years ago
Ashwin Sekhar T K
d5aeff636f
ARM64: Enable DYNAMIC_ARCH
Enable DYNAMIC_ARCH feature on ARM64. This patch uses the cpuid
feature in linux kernel to detect the core type at runtime
(https://www.kernel.org/doc/Documentation/arm64/cpu-feature-registers.txt ).
If this feature is missing in kernel, then the user should use the
OPENBLAS_CORETYPE env variable to select the desired core type.
7 years ago
Ashwin Sekhar T K
caf339412f
ARM64: Remove dependency of THUNDERX2T99 Makefile on CORTEXA57 Makefile
7 years ago
Ashwin Sekhar T K
19ba133383
THUNDERX2T99: Add Optimized ZGEMM Implementation
9 years ago
Ashwin Sekhar T K
a3935f0dfb
THUNDERX2T99: Add Optimized D/Z NRM2 Implementation
9 years ago
Ashwin Sekhar T K
ab3ffab96a
THUNDERX2T99: Add Optimized C/Z DOT Implementation
9 years ago
Ashwin Sekhar T K
f036be9ce2
THUNDERX2T99: Add Optimized SDOT Implementation
9 years ago
Ashwin Sekhar T K
172a62d73e
THUNDERX2T99: Add Optimized C/Z IAMAX Implementation
9 years ago
Ashwin Sekhar T K
228c75a69c
THUNDERX2T99: Add parallel SCNRM2 Implementation
9 years ago
Ashwin Sekhar T K
f63deae9de
THUNDERX2T99: Add Optimized S/D IAMAX Implementation
9 years ago
Ashwin Sekhar T K
071a830e8b
THUNDERX2T99: Add optimized S/D/C/Z SWAP Implementations
9 years ago
Ashwin Sekhar T K
d09f88192c
THUNDERX2T99: Add optimized S/D/C/Z COPY Implementations
9 years ago
Ashwin Sekhar T K
e58233460a
THUDNERX2T99: Add optimized D/C/Z ASUM Implementations
9 years ago
Ashwin Sekhar T K
99bd2892bf
THUNDERX2T99: Add optimized CASUM Implementation
9 years ago
Ashwin Sekhar T K
e0dc5f58c5
THUNDERX2T99: Remove Duplicate Code
9 years ago
Ashwin Sekhar T K
2757b49767
THUNDERX2T99: Add Optimized CGEMM Implementation
9 years ago
Ashwin Sekhar T K
907e286eb6
THUNDERX2T99: Add threaded SNRM2 Implementation
9 years ago
Ashwin Sekhar T K
cde3aee08b
ARM64: Rename kernel files to have consistent naming
9 years ago
Ashwin Sekhar T K
ee6ea7e988
THUNDERX2T99: Add Optimized CNRM2 Implementation
9 years ago
Ashwin Sekhar T K
ca0b36b012
THUNDERX2T99: Add Optimized SNRM2 Implementation
9 years ago
Ashwin Sekhar T K
d0a79ca6e0
THUNDERX2T99: Add threaded DDOT Implementation
9 years ago
Ashwin Sekhar T K
0c07003ccf
THUNDERX2T99: Add Optimized DDOT Implementation
9 years ago
Ashwin Sekhar T K
981064acc6
THUNDERX2T99: Add Optimized DAXPY Implementation
9 years ago
Ashwin Sekhar T K
f279ff4789
THUNDERX2T99: Add Optimized SGEMM Implementation
9 years ago
Ashwin Sekhar T K
759f37feba
ARM64: Let target VULCAN inherit THUNDERX2T99 properties
9 years ago
Ashwin Sekhar T K
4b55fae337
ARM64: Add Cavium THUNDERX2T99 Target
9 years ago