296 Commits (c8f53b85ceb885bd0f49fcba0ac888419fb6d3bd)

Author SHA1 Message Date
  Chip Kerchner c8f53b85ce Merge remote-tracking branch 'origin/develop' into vectorizeBF16GEMV 1 year ago
  Martin Kroeker e52d9b4cf1 Merge pull request #4928 from austinpagan/czgemm_in_c 1 year ago
  Gordon Fossum 0b7fb5c791 CGEMM & ZGEMM using C code. 1 year ago
  Chip Kerchner d6bb8dcfd1 Common code. 1 year ago
  Martin Kroeker c9e92348a6 Handle inf/nan if dummy2 flag is set 1 year ago
  Chip Kerchner 9ac0fb0111 Merge branch 'develop' into vectorizeBF16GEMV 1 year ago
  Martin Kroeker d714013ab9 change sgemm kernel to 4x4 as the 16x4 altivec goes out of bounds 1 year ago
  Chip Kerchner 915a6d6e44 Add casting. 1 year ago
  Chip Kerchner 7ec3c16d82 Remove beta from optimized functions. 1 year ago
  Chip Kerchner 7cc00f68c9 Remove more duplicate. 1 year ago
  Chip Kerchner e238a68c03 Remove duplicate. 1 year ago
  Chip Kerchner 32095b0cbb Remove parameter. 1 year ago
  Chip Kerchner c8788208c8 Fixing block issue with transpose version. 1 year ago
  Chip Kerchner d7c0d87cd1 Small changes. 1 year ago
  Chip Kerchner eb6f3a05ef Common MMA code. 1 year ago
  Chip Kerchner fb287d17fc Common code. 1 year ago
  Chip Kerchner 8ab6245771 Small change. 1 year ago
  Chip Kerchner df19375560 Almost final code for MMA. 1 year ago
  Chip Kerchner 05aa63e738 More MMA BF16 GEMV code. 1 year ago
  Chip Kerchner c9ce37d527 Force vector pairs in clang. 1 year ago
  Chip Kerchner 89a12fa083 MMA BF16 GEMV code. 1 year ago
  Chip Kerchner 7947970f9d Move common code. 1 year ago
  Chip Kerchner 72216d28c2 Fix bug with inc_y adding results twice. 1 year ago
  Chip Kerchner 2f142ee857 More common code. 1 year ago
  Chip Kerchner 39fd29f1de Minor improvement and turn off BF16 GEMV forwarding by default. 1 year ago
  Chip Kerchner 8541b25e1d Special case beta is one. 1 year ago
  Chip Kerchner 76227e2948 Initial commit for vectorized BF16 GEMV. Added GEMM_GEMV_FORWARD_BF16 to enable using BF16 GEMV for one dimension matrices. Updated unit test to support inc_x != 1 or inc_y for GEMV. 1 year ago
  Chip Kerchner 1a7b8c650d Merge branch 'develop' into betterPowerGEMVTail 1 year ago
  Martin Kroeker f5d04318e3 Merge branch 'OpenMathLib:develop' into scalfixes 1 year ago
  Martin Kroeker 73f8866ffb make NAN handling depend on DUMMY2 parameter 1 year ago
  Hong Bo Peng db98f8753f Try to fix LAPACK testing failures on P7. 1 year ago
  Martin Kroeker b9bfc8ce09 make NAN handling depend on dummy2 parameter 1 year ago
  Chip Kerchner ba47c7f4f3 Vectorize reduction stage of sgemv_t. 1 year ago
  Chip Kerchner cb154832f8 Vectorize SBGEMM incopy - 4x faster. 1 year ago
  Martin Kroeker 2a5fe97e3b temporarily(?) disable the alpha=0 branch as it does not handle INF,NAN 1 year ago
  Martin Kroeker 7f8f037a36 handle INF and NAN in input 1 year ago
  Martin Kroeker f1248b849d handle INF and NAN in input 1 year ago
  Rajalakshmi Srinivasaraghavan e112191b54 POWER: Fix issues in zscal to address lapack failures 1 year ago
  Martin Kroeker aa259b141d Merge pull request #4704 from amritahs-ibm/saxpy_perf_fix 1 year ago
  Chip Kerchner 3a1417671a POWER: Fixing endianness issue in cswap/zswap kernel for AIX 1 year ago
  Amrita H S 87b3d9054f Fix regression SAXPY when compiler with OpenXL compiler. 1 year ago
  Chip-Kerchner 99384933ff Revert "Merge pull request #4532 from austinpagan/cgemm_zgemm_c_code" 1 year ago
  Martin Kroeker accea15551 Merge pull request #4532 from austinpagan/cgemm_zgemm_c_code 1 year ago
  austinpagan 87ba528d8b Changed C files to straighten out indentation. Removed commented lines from other file. 2 years ago
  austinpagan ddac75e0ef Adding .C versions of CGEMM and ZGEMM 2 years ago
  Chip Kerchner 2bb7ea64a1 Only vectorize 64-bit version for Power8. 2 years ago
  Chip Kerchner 09bb48d1b9 Vectorize in-copy packing/copying for SGEMM - 4X faster. 2 years ago
  Chip-Kerchner 058dd2a4cb Replace two vector loads with one vector pair load and fix endianess of stores - DGEMM versions. 2 years ago
  barracuda156 d9653af018 KERNEL.PPC970, KERNEL.PPCG4: unbreak CMake parsing 2 years ago
  Chip-Kerchner 4e738e561a Replace two vector loads with one vector pair load and fix endianess of stores. 2 years ago