106 Commits (7eecd8e39cfd3bf3f8eddc1154b8b2bfec19ea33)

Author SHA1 Message Date
  Martin Kroeker 6b6c9b1441 Merge pull request #2172 from quickwritereader/develop 6 years ago
  AbdelRauf a97b301aaa cgemm/ctrmm power9 6 years ago
  Piotr Kubaj eebfeba768 Fix build on FreeBSD/powerpc64. 6 years ago
  kavanabhat a575f1e4c7 Update dtrmm_kernel_16x4_power8.S 6 years ago
  AbdelRauf cdbfb891da new sgemm 8x16 6 years ago
  Martin Kroeker a17cf36225 Merge pull request #2153 from quickwritereader/develop 6 years ago
  AbdelRauf 148c4cc5fd conflict resolve 6 years ago
  AbdelRauf d0c3543c3f power9 zgemm ztrmm optimized 6 years ago
  AbdelRauf a469b32cf4 sgemm pipeline improved, zgemm rewritten without inner packs, ABI lxvx v20 fixed with vs52 6 years ago
  AbdelRauf 8fe794f059 improved zgemm power9 based on power8 6 years ago
  Martin Kroeker 3f427c0cf9 Merge pull request #2107 from quickwritereader/develop 6 years ago
  AbdelRauf 47f892198c conflict resolve 6 years ago
  AbdelRauf 628b335e83 Merge branch 'develop' of https://github.com/quickwritereader/OpenBLAS into develop 6 years ago
  AbdelRauf 0f105dd8a5 sgemm/strmm 6 years ago
  Martin Kroeker ccfb7ead15 Merge pull request #2072 from martin-frbg/sum 6 years ago
  Rashmica Gupta bcdf1d4917 Add in runtime CPU detection for POWER. 6 years ago
  Martin Kroeker 706dfe263b Add POWER implementation of ?sum 6 years ago
  Martin Kroeker 7c51cc8527 Merge branch 'develop' into develop 6 years ago
  AbdelRauf 853a18bc17 power9 makefile. dgemm based on power8 kernel with following changes : 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). improvement from 17 to 22-23gflops. dtrmm cases were added into dgemm itself 6 years ago
  Martin Kroeker 718efcec6f Fix out-of-bounds memory access in gemm_beta 7 years ago
  Martin Kroeker f9d67bb5e8 Fix out-of-bounds memory access in gemm_beta 7 years ago
  Ubuntu 498ac98581 Note for unused kernels 7 years ago
  Ubuntu cd9ea45463 NBMAX=4096 for gemvn, added sgemvn 8x8 for future 7 years ago
  Ubuntu 4abc375a91 sgemv cgemv pairs 7 years ago
  Ubuntu 43a4572038 crot fix 7 years ago
  Abdelrauf a034e65512 Merge branch 'develop' into develop 7 years ago
  Ubuntu 8c3386be87 Added missing Blas1 single fp {saxpy, caxpy, cdot, crot(refactored version of srot),isamax ,isamin, icamax, icamin}, 7 years ago
  Martin Kroeker 961d25e9c7 Use the new zrot.c on POWER8 for crot as well 7 years ago
  Martin Kroeker 8a3b6fa108 Use generic zrot.c on ppc64/POWER6 to work around utest failure from … (#1535) 7 years ago
  QWR QWR 28ca97015d power8:Added initial zgemv_(t|n) ,i(d|z)amax,i(d|z)amin,dgemv_t(transposed),zrot 8 years ago
  the mslm 2c0a008281 dgemm_ncopy_4_ save/restore 8 years ago
  the mslm c5425daa6b power8 ?gemm_tcopy save/restore 8 years ago
  martin 7a4b3cfbf8 Add trivially optimized DSDOT for POWER8 8 years ago
  Martin Kroeker 9c017a2218 Save and restore VSX registers 8 years ago
  Matt Brown bd831a03a8 Optimise sscal for POWER9 8 years ago
  Matt Brown edc97918f8 Optimise srot for POWER9 8 years ago
  Matt Brown e0034de22d Optimise sdot for POWER9 8 years ago
  Matt Brown 32c7fe6bff Optimise sasum for POWER9 8 years ago
  Matt Brown 19bdf9d52b Optimise casum for POWER9 8 years ago
  Matt Brown 4f09030fdc Optimise cswap for POWER9 8 years ago
  Matt Brown 6f4eca5ea4 Optimise sswap for POWER9 8 years ago
  Matt Brown be55f96cbd Optimise scopy for POWER9 8 years ago
  Matt Brown 96dd0ef4f7 Optimise ccopy for POWER9 8 years ago
  Alan Modra dc40bc7368 Power8 inline assembly tweaks 8 years ago
  Martin Kroeker 9e2f316ede Power8 inline assembly fixes 9 years ago
  Martin Kroeker a6e9e0b94b Remove explicit include of complex.h 9 years ago
  Zhang Xianyi 515bc56ea9 Refs #946. Use nrm2 reference implementation for Power8. 9 years ago
  Zhang Xianyi ae70b916f4 Refs #929. Deal with zero and NaNs for scale. 9 years ago
  Werner Saar 412bcd187a optimized dtrsm_logic_LT_16x4_power8.S and dtrsm_macros_LT_16x4_power8.S 9 years ago
  Werner Saar 8b140220c8 optimized dtrsm_kernel_LT for POWER8 9 years ago