96 Commits (ad20ceaa680e555e6f4e5e6d199f4c158ef1b6df)

Author SHA1 Message Date
  Martin Kroeker 3f427c0cf9 Merge pull request #2107 from quickwritereader/develop 7 years ago
  AbdelRauf 47f892198c conflict resolve 7 years ago
  AbdelRauf 628b335e83 Merge branch 'develop' of https://github.com/quickwritereader/OpenBLAS into develop 7 years ago
  AbdelRauf 0f105dd8a5 sgemm/strmm 7 years ago
  Martin Kroeker ccfb7ead15 Merge pull request #2072 from martin-frbg/sum 7 years ago
  Rashmica Gupta bcdf1d4917 Add in runtime CPU detection for POWER. 7 years ago
  Martin Kroeker 706dfe263b Add POWER implementation of ?sum 7 years ago
  Martin Kroeker 7c51cc8527 Merge branch 'develop' into develop 7 years ago
  AbdelRauf 853a18bc17 power9 makefile. dgemm based on power8 kernel with following changes : 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). improvement from 17 to 22-23gflops. dtrmm cases were added into dgemm itself 7 years ago
  Martin Kroeker 718efcec6f Fix out-of-bounds memory access in gemm_beta 7 years ago
  Martin Kroeker f9d67bb5e8 Fix out-of-bounds memory access in gemm_beta 7 years ago
  Ubuntu 498ac98581 Note for unused kernels 7 years ago
  Ubuntu cd9ea45463 NBMAX=4096 for gemvn, added sgemvn 8x8 for future 7 years ago
  Ubuntu 4abc375a91 sgemv cgemv pairs 7 years ago
  Ubuntu 43a4572038 crot fix 7 years ago
  Abdelrauf a034e65512 Merge branch 'develop' into develop 7 years ago
  Ubuntu 8c3386be87 Added missing Blas1 single fp {saxpy, caxpy, cdot, crot(refactored version of srot),isamax ,isamin, icamax, icamin}, 7 years ago
  Martin Kroeker 961d25e9c7 Use the new zrot.c on POWER8 for crot as well 8 years ago
  Martin Kroeker 8a3b6fa108 Use generic zrot.c on ppc64/POWER6 to work around utest failure from … (#1535) 8 years ago
  QWR QWR 28ca97015d power8:Added initial zgemv_(t|n) ,i(d|z)amax,i(d|z)amin,dgemv_t(transposed),zrot 8 years ago
  the mslm 2c0a008281 dgemm_ncopy_4_ save/restore 8 years ago
  the mslm c5425daa6b power8 ?gemm_tcopy save/restore 8 years ago
  martin 7a4b3cfbf8 Add trivially optimized DSDOT for POWER8 8 years ago
  Martin Kroeker 9c017a2218 Save and restore VSX registers 8 years ago
  Matt Brown bd831a03a8 Optimise sscal for POWER9 9 years ago
  Matt Brown edc97918f8 Optimise srot for POWER9 9 years ago
  Matt Brown e0034de22d Optimise sdot for POWER9 9 years ago
  Matt Brown 32c7fe6bff Optimise sasum for POWER9 9 years ago
  Matt Brown 19bdf9d52b Optimise casum for POWER9 9 years ago
  Matt Brown 4f09030fdc Optimise cswap for POWER9 9 years ago
  Matt Brown 6f4eca5ea4 Optimise sswap for POWER9 9 years ago
  Matt Brown be55f96cbd Optimise scopy for POWER9 9 years ago
  Matt Brown 96dd0ef4f7 Optimise ccopy for POWER9 9 years ago
  Alan Modra dc40bc7368 Power8 inline assembly tweaks 9 years ago
  Martin Kroeker 9e2f316ede Power8 inline assembly fixes 9 years ago
  Martin Kroeker a6e9e0b94b Remove explicit include of complex.h 9 years ago
  Zhang Xianyi 515bc56ea9 Refs #946. Use nrm2 reference implementation for Power8. 9 years ago
  Zhang Xianyi ae70b916f4 Refs #929. Deal with zero and NaNs for scale. 9 years ago
  Werner Saar 412bcd187a optimized dtrsm_logic_LT_16x4_power8.S and dtrsm_macros_LT_16x4_power8.S 10 years ago
  Werner Saar 8b140220c8 optimized dtrsm_kernel_LT for POWER8 10 years ago
  Werner Saar 8fb5a1aaff added optimized dtrsm_LT kernel for POWER8 10 years ago
  Werner Saar 6a2bde7a2d optimized dgemm and dgetrf for POWER8 10 years ago
  Werner Saar 8310d4d3f7 optimized dgemm for 20 threads 10 years ago
  Werner Saar 56948dbf0f optimized dgemm for POWER8 10 years ago
  Werner Saar 0d0c6f7d7d optimized dgemm for POWER8 10 years ago
  Werner Saar a3da10662f added sgemm_tcopy_8_power8.S 10 years ago
  Werner Saar d46f07bb4e added cgemm_tcopy_8_power8.S 10 years ago
  Werner Saar 879a51165f Optimized zgemm and tested zgemm again 10 years ago
  Werner Saar 9276c9012f Optimized sgemm and dgemm and tested again. 10 years ago
  Werner Saar 0001260f4b optimized sgemm 10 years ago