Isuru Fernando
|
cdc954675c
|
Install pkg-config files
|
9 years ago |
Zhang Xianyi
|
ff41e13385
|
Merge pull request #1074 from ashwinyes/develop_20170116_thunderx2t99_sgemm
Add more THUNDERX2T99 Optimized APIs
|
9 years ago |
Ashwin Sekhar T K
|
ee6ea7e988
|
THUNDERX2T99: Add Optimized CNRM2 Implementation
|
9 years ago |
Ashwin Sekhar T K
|
ca0b36b012
|
THUNDERX2T99: Add Optimized SNRM2 Implementation
|
9 years ago |
Ashwin Sekhar T K
|
01e1d85339
|
Update .gitignore
|
9 years ago |
Ashwin Sekhar T K
|
d0a79ca6e0
|
THUNDERX2T99: Add threaded DDOT Implementation
|
9 years ago |
Ashwin Sekhar T K
|
0c07003ccf
|
THUNDERX2T99: Add Optimized DDOT Implementation
|
9 years ago |
Ashwin Sekhar T K
|
f33fcedb30
|
THUNDERX2T99: Improve SGEMM
|
9 years ago |
Ashwin Sekhar T K
|
0f1d6e8b39
|
THUNDERX2T99: Improve DGEMM
|
9 years ago |
Ashwin Sekhar T K
|
981064acc6
|
THUNDERX2T99: Add Optimized DAXPY Implementation
|
9 years ago |
Zhang Xianyi
|
ab2033f2db
|
Merge pull request #1068 from sva-img/develop
Added MSA optimised rot functions.
|
9 years ago |
Shivraj Patil
|
a4d97d980f
|
Added rot functions.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
9 years ago |
Ashwin Sekhar T K
|
f279ff4789
|
THUNDERX2T99: Add Optimized SGEMM Implementation
|
9 years ago |
Ashwin Sekhar T K
|
759f37feba
|
ARM64: Let target VULCAN inherit THUNDERX2T99 properties
|
9 years ago |
Martin Kroeker
|
e8d0e66982
|
Merge pull request #1067 from martin-frbg/msysinst
Fix DESTDIR support for cygwin/msys2 install
|
9 years ago |
Martin Kroeker
|
331fd51260
|
Fix DESTDIR support for cygwin/msys2 install
fixes #1066
|
9 years ago |
Zhang Xianyi
|
0863a0d4b4
|
Merge pull request #1061 from ashwinyes/develop_aarch64_vulcan_thunderx_patch
Add new targets for ARM64
|
9 years ago |
Martin Kroeker
|
2e5f906f41
|
Update Makefile.install (#1064)
* Update Makefile.install to reflect name change of lapacke_mangling.h source
|
9 years ago |
Werner Saar
|
d1a97bad39
|
Merge pull request #1063 from wernsaar/develop
prepared kernel/setparam-ref.c for UNROLL values, that are not a power of two
|
9 years ago |
Werner Saar
|
28e2fab33e
|
prepared kernel/setparam-ref.c for UNROLL values, that are not a power of two
|
9 years ago |
Werner Saar
|
752fdc6f82
|
Merge pull request #1062 from wernsaar/develop
prepared parameter.c for UNROLL values, that are not a power of two
|
9 years ago |
Werner Saar
|
c1c5a63d3c
|
prepared parameter.c for UNROLL values, that are not a power of two
|
9 years ago |
Werner Saar
|
209b63197e
|
prepared lapack/lauum for UNROLL values, that are not a power of two
|
9 years ago |
Ashwin Sekhar T K
|
4b55fae337
|
ARM64: Add Cavium THUNDERX2T99 Target
|
9 years ago |
Ashwin Sekhar T K
|
738d622feb
|
ARM64: Fix auto detect of ARM64 cpus
|
9 years ago |
Andrew Pinski
|
95649dee28
|
THUNDERX: Add optimized version of daxpy
This is better for single core but does not change anything for multiple cores
|
10 years ago |
Martin Kroeker
|
3a8c5180b9
|
Merge pull request #1060 from martin-frbg/lapacke-mingw
Split LAPACKE 3.7.0 obj list (take 2, missed splitting the actual ar command invocation)
|
9 years ago |
Martin Kroeker
|
7611a41f40
|
Split LAPACKE 3.7.0 obj list (take 2)
Missed the splitting of the actual ar call
|
9 years ago |
Werner Saar
|
1a39b92b1d
|
Merge pull request #1059 from wernsaar/develop
updated some level1 functions that are not thread safe
|
9 years ago |
Werner Saar
|
dd6212e684
|
updated some level1 functions that are not thread safe
|
9 years ago |
Werner Saar
|
9bcf50872b
|
Merge pull request #1058 from wernsaar/develop
prepared lapack/potrf functions for UNROLL values, that are not a pow…
|
9 years ago |
Werner Saar
|
c81dc6322f
|
prepared lapack/potrf functions for UNROLL values, that are not a power of two
|
9 years ago |
Andrew Pinski
|
8fdb0655e9
|
THUNDERX: Add an optimized version of ddot
|
10 years ago |
Andrew Pinski
|
fb200c7245
|
ARM64: Add Cavium THUNDERX Target
|
9 years ago |
Ashwin Sekhar T K
|
0b8e876d89
|
VULCAN: Add optimized DGEMM implementation
|
9 years ago |
Ashwin Sekhar T K
|
4713e7c47f
|
ARM64: Add the VULCAN Target
|
9 years ago |
Ashwin Sekhar T K
|
6085386b10
|
CORTEXA57: Add assembly kernels for copy routines
|
9 years ago |
Zhang Xianyi
|
002b41f024
|
Merge pull request #1055 from ksraste/develop
Add msa optimization for AXPY, COPY, SCALE, SWAP
|
9 years ago |
jiahaipeng
|
84b8170bfb
|
Adding multi-threading for copy, dot, rot, and asum functions
|
9 years ago |
jiahaipeng
|
1aa1e6cb54
|
modify blas_l1_thread.c to support multi-threading for L1 functions with a return value
|
9 years ago |
Martin Kroeker
|
cbd2bf1f6e
|
Merge pull request #1057 from martin-frbg/lapacke-mingw
Split the obj list of LAPACKE 3.7.0
|
9 years ago |
Martin Kroeker
|
9f5cfd43dc
|
Split the obj list of LAPACKE 3.7.0
Split obj list to allow building with mingw (argument list too long for the msys ar)
|
9 years ago |
kaustubh
|
1480f3df71
|
Add msa optimization for AXPY, COPY, SCALE, SWAP
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
|
9 years ago |
kaustubh
|
88afb3bc94
|
Add msa optimization for AXPY, COPY, SCALE, SWAP
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
|
9 years ago |
Werner Saar
|
2ffbbb54f6
|
Merge pull request #1054 from wernsaar/develop
prepared lapack/getrf functions for UNROLL values, that are not a pow…
|
9 years ago |
Werner Saar
|
3e1bbd6b5f
|
prepared lapack/getrf functions for UNROLL values, that are not a power of two
|
9 years ago |
Zhang Xianyi
|
b678471d65
|
Merge branch 'z13' into develop
Conflicts:
CONTRIBUTORS.md
|
9 years ago |
Zhang Xianyi
|
864e202afd
|
Add USE_TRMM=1 for IBM z13 in kernel/Makefile.L3
|
9 years ago |
Werner Saar
|
b9bb009236
|
Merge pull request #1053 from wernsaar/develop
prepared driver/level3 functions for UNROLL values, that are not a po…
|
9 years ago |
Werner Saar
|
a2672d5589
|
prepared driver/level3 functions for UNROLL values, that are not a power of two
|
9 years ago |