Andreas Arnez
d117dfd505
Change bad usage of "asum" to "sum" in ZARCH versions of ?sum
The ZARCH implementations of ?sum contain a cut-and-paste error: an inline
assembly argument is named "sum", but the assembly references "asum"
instead. The mismatch causes a build error. This commit fixes it.
6 years ago
Martin Kroeker
246ca29679
Add ZARCH implementation of ?sum
as trivial copies of the respective ?asum kernels with the ABS and vflpsb calls removed
6 years ago
maamountki
0a54c98b9d
[ZARCH] Modify constraints
7 years ago
maamountki
bec54ae366
[ZARCH] Fix caxpy
7 years ago
maamountki
f583674109
[ZARCH] Fix cgemv_t_4
7 years ago
maamountki
77fe70019f
[ZARCH] Fix constraints and source code formatting
7 years ago
maamountki
7039770165
[ZARCH] Undo the last commit
7 years ago
maamountki
11a43e8116
[ZARCH] Set alignment hint for vl/vst
7 years ago
maamountki
61526480f9
[ZARCH] Fix copy constraint
7 years ago
maamountki
81daf6bc38
[ZARCH] Format source code, Fix constraints
7 years ago
Martin Kroeker
874df65491
Fix incorrect sgemv results for IBM z14
part of PR #1993 that was inadvertently misplaced into the toplevel directory
7 years ago
Martin Kroeker
877023e1e1
Fix precision of zarch DSDOT
from patch provided by aarnez in #991
7 years ago
Martin Kroeker
265142edd5
Fix typo in the zarch min/max kernels
from patch provided by aarnez in #991
7 years ago
maamountki
29416cb5a3
[ZARCH] Add Z13 version for max/min functions
7 years ago
maamountki
48b9b94f7f
[ZARCH] Improve loading performance for camax/icamax
7 years ago
maamountki
fcd814a8d2
[ZARCH] Fix bug in max/min functions
7 years ago
maamountki
dc4d3bccd5
[ZARCH] Fix icamax/icamin
7 years ago
maamountki
c7143c1019
[ZARCH] Fix iamax/imax single precision
7 years ago
maamountki
04873bb174
[ZARCH] Undo the last commit
7 years ago
maamountki
c8ef9fb220
[ZARCH] Fix bug in iamax/iamin/imax/imin
7 years ago
maamountki
b111829226
[ZARCH] Update max/min functions
7 years ago
maamountki
b815a04c87
[ZARCH] fix a bug in max/min functions
7 years ago
maamountki
1a7925b3a3
[ZARCH] Update dgemv_n_4.c
7 years ago
maamountki
406f835f00
[ZARCH] update cgemv_n_4.c
7 years ago
maamountki
621dedb37b
[ZARCH] Update cgemv_t_4.c
7 years ago
maamountki
b731e8246f
Update sgemv_t_4.c
7 years ago
maamountki
ecc31b743f
Update dgemv_t_4.c
7 years ago
maamountki
5d89d6b143
[ZARCH] fix sgemv_n_4.c
7 years ago
maamountki
67432b23c2
[ZARCH] fix cgemv_n_4.c
7 years ago
maamountki
be66f5d5c2
[ZARCH] fix data prefetch type in sdot
7 years ago
maamountki
c2ffef8156
[ZARCH] fix data prefetch type in ddot
7 years ago
maamountki
e7455f500c
[ZARCH] fix dsdot.c
7 years ago
maamountki
3eafcfa650
[ZARCH] fix cgemv_n_4.c
7 years ago
maamountki
94cd946b96
[ZARCH] fix cgemv_n_4.c
7 years ago
maamountki
1aa840a0a2
[ZARCH] fix sgemv_t_4.c
7 years ago
maamountki
e6c0e39492
Optimize Zgemv
7 years ago
maamountki
23229011db
[ZARCH] Z14 support, BLAS 1/2 single precision implementations, Some missing double precision implementations, Gemv optimization
7 years ago
Martin Kroeker
c7b55b6082
Merge pull request #1499 from quickwritereader/develop
Implemented missing vsx simd kernels for power8 blas1/2 double. z13 modifications
7 years ago
QWR QWR
28ca97015d
power8: Added initial zgemv_(t|n), i(d|z)amax, i(d|z)amin, dgemv_t (transposed), zrot
z13: improved zgemv_(t|n)_4, zscal, zaxpy
8 years ago
Martin Kroeker
22167170b3
Merge pull request #1477 from quickwritereader/develop
Power8 blas3 copy-pack routines
8 years ago
Martin Kroeker
58f236ad73
Use generic/dot.c for DSDOT on zarch
8 years ago
Martin Kroeker
e207107150
Use generic/dot.c for DSDOT on z13
The implementation in arm/dot.c has lower precision, as shown by the utest for dsdot.
8 years ago
the mslm
c5425daa6b
power8 ?gemm_tcopy save/restore
8 years ago
Abdelrauf
60596a1abc
Merge branch 'develop' into develop
8 years ago
Abdelrauf
afd514c25d
small fix inside ifdef z13mvc (z13mvc code is not used in production)
8 years ago
Martin Kroeker
f45776ec1f
Merge pull request #1440 from quickwritereader/develop
small corrections
8 years ago
Abdelrauf
f653e7a18d
small fix
small fix inside ifdef z13mvc (z13mvc code is not used in production)
8 years ago
the mslm
f946a89432
zscal (case: real alpha=0) microkernel shift & mem fix, da_i as input reg. Small typo fixes
8 years ago
Martin Kroeker
e4c71a799a
Merge pull request #1426 from quickwritereader/develop
(Z13) Blas1 microkernels can be inlined by gcc. Refactoring, fixes, tunings
8 years ago
the mslm
2619ad7ea5
Blas1 microkernels can be inlined by gcc. Refactoring (symbolic operand
names). Some fixes and tunings
8 years ago