Martin Kroeker
0f27a03607
Add workaround for NVIDIA HPC mishandling of the asm DOT kernels
5 years ago
Qiyu8
60e6c68e38
Adapt ARM architect
5 years ago
Ashwin Sekhar T K
d5aeff636f
ARM64: Enable DYNAMIC_ARCH
Enable DYNAMIC_ARCH feature on ARM64. This patch uses the cpuid
feature in linux kernel to detect the core type at runtime
(https://www.kernel.org/doc/Documentation/arm64/cpu-feature-registers.txt ).
If this feature is missing in kernel, then the user should use the
OPENBLAS_CORETYPE env variable to select the desired core type.
7 years ago
Ashwin Sekhar T K
162e312832
ARM64: Remove dependency of CORTEXA57 Makefile on ARMV8 Makefile
7 years ago
Martin Kroeker
c9d408064a
Use dot.S also for DSDOT on CORTEXA57
8 years ago
Ashwin Sekhar T K
6085386b10
CORTEXA57: Add assembly kernels for copy routines
9 years ago
Ashwin Sekhar T K
7aa1ad4923
Functional Assembly Kernels for CortexA57
Adding functional (non-optimized) kernels for Cortex-A57
with the following layouts.
SGEMM - 16x4, 8x8
CGEMM - 8x4
DGEMM - 8x4, 4x8
10 years ago
Ashwin Sekhar T K
318f0949c3
lapack-test fixes in nrm2 kernels for Cortex A57
10 years ago
Ashwin Sekhar T K
98965da2e8
lapack-test fixes for Cortex A57
10 years ago
Ashwin Sekhar T K
c99c43d51e
Optimized trmm kernels for CORTEXA57
10 years ago
Ashwin Sekhar T K
1397b47197
Optimized zgemm kernel for CORTEXA57
10 years ago
Ashwin Sekhar T K
45f78963ac
Optimized cgemm kernel for CORTEXA57
Also, add a generic ztrmm 4x4 kernel
10 years ago
Ashwin Sekhar T K
402443bf9c
Optimized dgemm kernel for CORTEXA57
10 years ago
Ashwin Sekhar T K
19fdbee291
Improve the sgemm kernel for CORTEXA57
10 years ago
Ashwin Sekhar T K
3b0cdfab1e
Optimized gemv kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
46efa6a1da
Optimized swap kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
ea1465cdf8
Optimized scal kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
fb4be3b3eb
Optimized rot kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
6c2f4ddbcd
Optimized nrm2 kernels for CORTEXA57
10 years ago
Ashwin Sekhar T K
870c4d49c0
Optimized dot kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
cd7684097c
Optimized copy kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
2690b71b1f
Optimized axpy kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
3e4acedf0e
Optimized asum kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
2610752dbb
Optimized iamax kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
dbb213655e
Optimized amax kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago
Ashwin Sekhar T K
f2f8a0fe8b
Adding arm64 target CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
10 years ago