287 Commits (670ec6f7576ecc74fff96be7c00ec8fffed8647b)

Author SHA1 Message Date
  Martin Kroeker 4ec62d7f73 remove non-vectorized code path for power8, restoring PR4880 9 months ago
  Ubuntu 0cc2485594 Explicit unaligned vector load/stores in PPC64LE GEMV kernels 9 months ago
  Martin Kroeker 77fba0f400 Fix "dummy2" flag handling 11 months ago
  Martin Kroeker 81eed868b6 Restore the non-vectorized code from before PR4880 for POWER8 11 months ago
  Martin Kroeker 98b5ef929c Restore the non-vectorized code from before PR4880 for POWER8 11 months ago
  Martin Kroeker d7036cfd74 Remove trailing blanks that break the cmake parser 1 year ago
  tingbo.liao 3c8df6358f Further rearranged the rotm kernel for the different architectures. 1 year ago
  Sergey Fedorov 229efa42ff scal.S: use r11 on 32-bit Darwin on powerpc 1 year ago
  Sergey Fedorov 81e1be8d90 Revert "temporarily disable the default S/DSCAL kernel" 1 year ago
  Martin Kroeker 9b9c0aa5c9 temporarily disable the default S/DSCAL kernel 1 year ago
  Ayappan Perumal 020cce1068 Fix build issues with gcc compiler as well 1 year ago
  Ayappan Perumal b6ec73e77c Fix AIX build 1 year ago
  Chip Kerchner ab71a1edf2 Better VSX. 1 year ago
  Chip Kerchner 36bd3eeddf Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power). 1 year ago
  Martin Kroeker e52d9b4cf1 Merge pull request #4928 from austinpagan/czgemm_in_c 1 year ago
  Gordon Fossum 0b7fb5c791 CGEMM & ZGEMM using C code. 1 year ago
  Martin Kroeker c9e92348a6 Handle inf/nan if dummy2 flag is set 1 year ago
  Martin Kroeker d714013ab9 change sgemm kernel to 4x4 as the 16x4 altivec goes out of bounds 1 year ago
  Chip Kerchner 1a7b8c650d Merge branch 'develop' into betterPowerGEMVTail 1 year ago
  Martin Kroeker f5d04318e3 Merge branch 'OpenMathLib:develop' into scalfixes 1 year ago
  Martin Kroeker 73f8866ffb make NAN handling depend on DUMMY2 parameter 1 year ago
  Hong Bo Peng db98f8753f Try to fix LAPACK testing failures on P7. 1 year ago
  Martin Kroeker b9bfc8ce09 make NAN handling depend on dummy2 parameter 1 year ago
  Chip Kerchner ba47c7f4f3 Vectorize reduction stage of sgemv_t. 1 year ago
  Chip Kerchner cb154832f8 Vectorize SBGEMM incopy - 4x faster. 1 year ago
  Martin Kroeker 2a5fe97e3b temporarily(?) disable the alpha=0 branch as it does not handle INF,NAN 1 year ago
  Martin Kroeker 7f8f037a36 handle INF and NAN in input 1 year ago
  Martin Kroeker f1248b849d handle INF and NAN in input 1 year ago
  Rajalakshmi Srinivasaraghavan e112191b54 POWER: Fix issues in zscal to address lapack failures 1 year ago
  Martin Kroeker aa259b141d Merge pull request #4704 from amritahs-ibm/saxpy_perf_fix 1 year ago
  Chip Kerchner 3a1417671a POWER: Fixing endianness issue in cswap/zswap kernel for AIX 1 year ago
  Amrita H S 87b3d9054f Fix regression SAXPY when compiler with OpenXL compiler. 1 year ago
  Chip-Kerchner 99384933ff Revert "Merge pull request #4532 from austinpagan/cgemm_zgemm_c_code" 1 year ago
  Martin Kroeker accea15551 Merge pull request #4532 from austinpagan/cgemm_zgemm_c_code 1 year ago
  austinpagan 87ba528d8b Changed C files to straighten out indentation. Removed commented lines from other file. 2 years ago
  austinpagan ddac75e0ef Adding .C versions of CGEMM and ZGEMM 2 years ago
  Chip Kerchner 2bb7ea64a1 Only vectorize 64-bit version for Power8. 2 years ago
  Chip Kerchner 09bb48d1b9 Vectorize in-copy packing/copying for SGEMM - 4X faster. 2 years ago
  Chip-Kerchner 058dd2a4cb Replace two vector loads with one vector pair load and fix endianess of stores - DGEMM versions. 2 years ago
  barracuda156 d9653af018 KERNEL.PPC970, KERNEL.PPCG4: unbreak CMake parsing 2 years ago
  Chip-Kerchner 4e738e561a Replace two vector loads with one vector pair load and fix endianess of stores. 2 years ago
  Rajalakshmi Srinivasaraghavan 980f702f72 POWER: AIX: Make use of power10 optimization 2 years ago
  Rajalakshmi Srinivasaraghavan 82fc29a57a POWER10: Fallback to POWER8 functions 2 years ago
  Martin Kroeker 8e6d93359d Merge pull request #4196 from TiborGY/obsolete_inlines 2 years ago
  Ian McInerney 79c15db348 Fix power10 gcc intrinsic check 2 years ago
  TGY b5ba95a6c0 Modernize obsolete inline order 2 years ago
  Martin Kroeker 54d3246fc6 Allow negative INCX (API change from version 3.10 of the reference implementation) 2 years ago
  Manjul Mohan 58b88aa5f0 POWER10: Fix compiler warnings 2 years ago
  Martin Kroeker 1688c7da43 change line endings from CRLF to LF 3 years ago
  Martin Kroeker 6c118b7977 Fix DNRM2 returning INF instead of zero due to intermediate overflow 3 years ago