Martin Kroeker
002e646476
Add new functions from LAPACK 3.6.1
9 years ago
Martin Kroeker
3dad87bbb5
Merge pull request #1093 from martin-frbg/restore-cmakeinstall
Restore cmake install target
9 years ago
Martin Kroeker
bdd51cdabc
Add cmake install target
Add CMake install target (based on the patch provided by PrimarchOfTheSpaceWolves in #957).
This was originally merged as #988 but accidentally reverted by my subsequent PR the following day.
9 years ago
Martin Kroeker
8a83daf4bf
Merge pull request #1084 from isuruf/develop
Install pkg-config files
9 years ago
Martin Kroeker
39abb079fb
Merge pull request #1087 from grisuthedragon/enable-a12
Enable EXCAVATOR kernels for A12-9800
9 years ago
Martin Koehler
76c6e33e54
Enable EXCAVATOR kernels for A12-9800
9 years ago
Martin Kroeker
a9594e8072
Merge pull request #1085 from vladimir-ch/lapacke_laswp_work
LAPACKE: fix incorrect value of lda_t in lapacke_?laswp_work
9 years ago
Vladimir Chalupecky
4c2b713ce5
LAPACKE: fix incorrect value of lda_t in lapacke_?laswp_work
Fixed in Reference LAPACK in commit 07e1fbd897
9 years ago
Isuru Fernando
cdc954675c
Install pkg-config files
9 years ago
Martin Kroeker
60eea75409
Merge pull request #1076 from ashwinyes/develop_20170130_thunderx2t99
More optimized implementations for ThunderX2T99
9 years ago
Ashwin Sekhar T K
d09f88192c
THUNDERX2T99: Add optimized S/D/C/Z COPY Implementations
9 years ago
Ashwin Sekhar T K
e58233460a
THUNDERX2T99: Add optimized D/C/Z ASUM Implementations
9 years ago
Ashwin Sekhar T K
3918d17025
LAPACK: Fix lapack-test errors in ARM64 threaded version
9 years ago
Ashwin Sekhar T K
99bd2892bf
THUNDERX2T99: Add optimized CASUM Implementation
9 years ago
Ashwin Sekhar T K
ff6f572f2e
THUNDERX2T99: Rename labels for DDOT and SNRM2
9 years ago
Ashwin Sekhar T K
e0dc5f58c5
THUNDERX2T99: Remove Duplicate Code
9 years ago
Ashwin Sekhar T K
2757b49767
THUNDERX2T99: Add Optimized CGEMM Implementation
9 years ago
Zhang Xianyi
ff41e13385
Merge pull request #1074 from ashwinyes/develop_20170116_thunderx2t99_sgemm
Add more THUNDERX2T99 Optimized APIs
9 years ago
Ashwin Sekhar T K
1de6fa0f50
Update .gitignore
9 years ago
Ashwin Sekhar T K
efda640723
Benchmark: Add MFlops print in iamax benchmark
9 years ago
Ashwin Sekhar T K
1530e78cfe
Benchmarks: Avoid building lapack benchmarks when NO_LAPACK=1
9 years ago
Ashwin Sekhar T K
907e286eb6
THUNDERX2T99: Add threaded SNRM2 Implementation
9 years ago
Ashwin Sekhar T K
cde3aee08b
ARM64: Rename kernel files to have consistent naming
9 years ago
Ashwin Sekhar T K
ee6ea7e988
THUNDERX2T99: Add Optimized CNRM2 Implementation
9 years ago
Ashwin Sekhar T K
ca0b36b012
THUNDERX2T99: Add Optimized SNRM2 Implementation
9 years ago
Ashwin Sekhar T K
01e1d85339
Update .gitignore
9 years ago
Ashwin Sekhar T K
d0a79ca6e0
THUNDERX2T99: Add threaded DDOT Implementation
9 years ago
Ashwin Sekhar T K
0c07003ccf
THUNDERX2T99: Add Optimized DDOT Implementation
9 years ago
Ashwin Sekhar T K
f33fcedb30
THUNDERX2T99: Improve SGEMM
9 years ago
Ashwin Sekhar T K
0f1d6e8b39
THUNDERX2T99: Improve DGEMM
9 years ago
Ashwin Sekhar T K
981064acc6
THUNDERX2T99: Add Optimized DAXPY Implementation
9 years ago
Zhang Xianyi
ab2033f2db
Merge pull request #1068 from sva-img/develop
Added MSA optimised rot functions.
9 years ago
Shivraj Patil
a4d97d980f
Added rot functions.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Ashwin Sekhar T K
f279ff4789
THUNDERX2T99: Add Optimized SGEMM Implementation
9 years ago
Ashwin Sekhar T K
759f37feba
ARM64: Let target VULCAN inherit THUNDERX2T99 properties
9 years ago
Martin Kroeker
e8d0e66982
Merge pull request #1067 from martin-frbg/msysinst
Fix DESTDIR support for cygwin/msys2 install
9 years ago
Martin Kroeker
331fd51260
Fix DESTDIR support for cygwin/msys2 install
fixes #1066
9 years ago
Zhang Xianyi
0863a0d4b4
Merge pull request #1061 from ashwinyes/develop_aarch64_vulcan_thunderx_patch
Add new targets for ARM64
9 years ago
Martin Kroeker
2e5f906f41
Update Makefile.install (#1064)
* Update Makefile.install to reflect name change of lapacke_mangling.h source
9 years ago
Werner Saar
d1a97bad39
Merge pull request #1063 from wernsaar/develop
Prepared kernel/setparam-ref.c for UNROLL values that are not a power of two
9 years ago
Werner Saar
28e2fab33e
Prepared kernel/setparam-ref.c for UNROLL values that are not a power of two
9 years ago
Werner Saar
752fdc6f82
Merge pull request #1062 from wernsaar/develop
Prepared parameter.c for UNROLL values that are not a power of two
9 years ago
Werner Saar
c1c5a63d3c
Prepared parameter.c for UNROLL values that are not a power of two
9 years ago
Werner Saar
209b63197e
Prepared lapack/lauum for UNROLL values that are not a power of two
9 years ago
Ashwin Sekhar T K
4b55fae337
ARM64: Add Cavium THUNDERX2T99 Target
9 years ago
Ashwin Sekhar T K
738d622feb
ARM64: Fix auto detect of ARM64 cpus
9 years ago
Andrew Pinski
95649dee28
THUNDERX: Add optimized version of daxpy
This is better for single core but does not change anything for multiple cores
10 years ago
Martin Kroeker
3a8c5180b9
Merge pull request #1060 from martin-frbg/lapacke-mingw
Split LAPACKE 3.7.0 obj list (take 2, missed splitting the actual ar command invocation)
9 years ago
Martin Kroeker
7611a41f40
Split LAPACKE 3.7.0 obj list (take 2)
Missed the splitting of the actual ar call
9 years ago
Werner Saar
1a39b92b1d
Merge pull request #1059 from wernsaar/develop
Updated some level1 functions that are not thread safe
9 years ago