65 Commits (c1f7a81663ae9e172d82b91c7ffbd482d71ceeac)

Author SHA1 Message Date
  Chip Kerchner 2bb7ea64a1 Only vectorize 64-bit version for Power8. 2 years ago
  Chip Kerchner 09bb48d1b9 Vectorize in-copy packing/copying for SGEMM - 4X faster. 2 years ago
  Martin Kroeker d3555d2e50 Add workaround for LAPACK test failures with the NVIDIA HPC compiler 4 years ago
  Martin Kroeker 251a09ec90 Typo fix 5 years ago
  Martin Kroeker 95d37e1575 Regroup the 32 and 64bit sections and restore 64bit CAXPY 5 years ago
  Martin Kroeker f308e741b2 remove debug output and revert changes to cdot and crot 5 years ago
  Martin Kroeker f8c2697701 Use POWER6 GEMM, TRMM and DTRSM on 32bit POWER8 5 years ago
  Rajalakshmi Srinivasaraghavan bd9ff820bc Fix cmake compilation issue - POWER9 5 years ago
  Martin Kroeker 06208c8d01 Limit this fix to ELFv2 builds 5 years ago
  Martin Kroeker f5c4c28b98 Work around POWER8BE bugs on FreeBSD (ELFv2) 5 years ago
  Martin Kroeker 0b39cf95b0 Fix endianness conditionals 6 years ago
  Martin Kroeker 9f39f0a2c3 Specify ismin/ismax assembly kernels for POWER8 directly 6 years ago
  Martin Kroeker d483e9270a Update KERNEL.POWER8 6 years ago
  Martin Kroeker 01834aee33 Merge pull request #29 from xianyi/develop 6 years ago
  Martin Kroeker d92bd5be24 Update KERNEL.POWER8 6 years ago
  Martin Kroeker 46e4b12946 Update KERNEL.POWER8 6 years ago
  Martin Kroeker dc345d84df Fix syntax of endianness conditional and add gcc version check for workaround 6 years ago
  Martin Kroeker cad0d150db Define alternate kernels for big-endian POWER8 6 years ago
  Martin Kroeker 673e5a0495 Replace several POWER8/9 C kernels with their gcc7-generated assembly versions (#2263) 6 years ago
  Rashmica Gupta bcdf1d4917 Add in runtime CPU detection for POWER. 6 years ago
  Ubuntu 4abc375a91 sgemv cgemv pairs 7 years ago
  Ubuntu 8c3386be87 Added missing Blas1 single fp {saxpy, caxpy, cdot, crot(refactored version of srot),isamax ,isamin, icamax, icamin}, 7 years ago
  QWR QWR 28ca97015d power8:Added initial zgemv_(t|n) ,i(d|z)amax,i(d|z)amin,dgemv_t(transposed),zrot 8 years ago
  martin 7a4b3cfbf8 Add trivially optimized DSDOT for POWER8 8 years ago
  Zhang Xianyi 515bc56ea9 Refs #946. Use nrm2 reference implementation for Power8. 9 years ago
  Zhang Xianyi ae70b916f4 Refs #929. Deal with zero and NaNs for scale. 9 years ago
  Werner Saar 8fb5a1aaff added optimized dtrsm_LT kernel for POWER8 9 years ago
  Werner Saar 56948dbf0f optimized dgemm for POWER8 9 years ago
  Werner Saar 0d0c6f7d7d optimized dgemm for POWER8 9 years ago
  Werner Saar a3da10662f added sgemm_tcopy_8_power8.S 9 years ago
  Werner Saar d46f07bb4e added cgemm_tcopy_8_power8.S 9 years ago
  Werner Saar 879a51165f Optimized zgemm and tested zgemm again 9 years ago
  Werner Saar 9276c9012f Optimized sgemm and dgemm and tested again. 9 years ago
  Werner Saar 3c6294ca3d added optimized sgemm_tcopy for power8 9 years ago
  Werner Saar 68a69c5b50 added optimized dgemv_n kernel for POWER8 9 years ago
  Werner Saar c2464a7c4a added optimized casum kernel for POWER8 9 years ago
  Werner Saar 294f933869 added optimized zasum kernel for POWER8 9 years ago
  Werner Saar f59c9bd6ef added optimized sasum kernel for POWER8 9 years ago
  Werner Saar c53be46d78 added optimized dasum kernel for POWER8 9 years ago
  Werner Saar 659ed16591 added otimized cswap and zswap kernels for POWER8 9 years ago
  Werner Saar 35c98a3556 added optimized zscal kernel for POWER8 9 years ago
  Werner Saar f1a5dd06c5 added optimized sscal kernel for POWER8 9 years ago
  Werner Saar 35f1f21a7f added drot- and srot-kernel optimimized for POWER8 9 years ago
  Werner Saar 3d9a50e841 added optimized sswap kernel for POWER8 10 years ago
  Werner Saar 828c849b44 added optimized ccopy kernel for POWER8 10 years ago
  Werner Saar ecc0bc9813 added optimized scopy kernel for POWER8 10 years ago
  Werner Saar 12f209b7b0 added optimized zswap kernel for POWER8 10 years ago
  Werner Saar 7316a87930 added optimized dswap kernel for POWER8 10 years ago
  Werner Saar 0bff057a87 added optimized dcopy kernel for POWER8 10 years ago
  Werner Saar 1e6cf9808c added optimized dscal kernel for POWER8 10 years ago