285 Commits (32095b0cbbfbf2a9db382931cacbc400ae975603)

Author SHA1 Message Date
  Chip Kerchner 32095b0cbb Remove parameter. 1 year ago
  Chip Kerchner c8788208c8 Fixing block issue with transpose version. 1 year ago
  Chip Kerchner d7c0d87cd1 Small changes. 1 year ago
  Chip Kerchner eb6f3a05ef Common MMA code. 1 year ago
  Chip Kerchner fb287d17fc Common code. 1 year ago
  Chip Kerchner 8ab6245771 Small change. 1 year ago
  Chip Kerchner df19375560 Almost final code for MMA. 1 year ago
  Chip Kerchner 05aa63e738 More MMA BF16 GEMV code. 1 year ago
  Chip Kerchner c9ce37d527 Force vector pairs in clang. 1 year ago
  Chip Kerchner 89a12fa083 MMA BF16 GEMV code. 1 year ago
  Chip Kerchner 7947970f9d Move common code. 1 year ago
  Chip Kerchner 72216d28c2 Fix bug with inc_y adding results twice. 1 year ago
  Chip Kerchner 2f142ee857 More common code. 1 year ago
  Chip Kerchner 39fd29f1de Minor improvement and turn off BF16 GEMV forwarding by default. 1 year ago
  Chip Kerchner 8541b25e1d Special case beta is one. 1 year ago
  Chip Kerchner 76227e2948 Initial commit for vectorized BF16 GEMV. Added GEMM_GEMV_FORWARD_BF16 to enable using BF16 GEMV for one dimension matrices. Updated unit test to support inc_x != 1 or inc_y for GEMV. 1 year ago
  Chip Kerchner 1a7b8c650d Merge branch 'develop' into betterPowerGEMVTail 1 year ago
  Martin Kroeker f5d04318e3 Merge branch 'OpenMathLib:develop' into scalfixes 1 year ago
  Martin Kroeker 73f8866ffb make NAN handling depend on DUMMY2 parameter 1 year ago
  Hong Bo Peng db98f8753f Try to fix LAPACK testing failures on P7. 1 year ago
  Martin Kroeker b9bfc8ce09 make NAN handling depend on dummy2 parameter 1 year ago
  Chip Kerchner ba47c7f4f3 Vectorize reduction stage of sgemv_t. 1 year ago
  Chip Kerchner cb154832f8 Vectorize SBGEMM incopy - 4x faster. 1 year ago
  Martin Kroeker 2a5fe97e3b temporarily(?) disable the alpha=0 branch as it does not handle INF,NAN 1 year ago
  Martin Kroeker 7f8f037a36 handle INF and NAN in input 1 year ago
  Martin Kroeker f1248b849d handle INF and NAN in input 1 year ago
  Rajalakshmi Srinivasaraghavan e112191b54 POWER: Fix issues in zscal to address lapack failures 1 year ago
  Martin Kroeker aa259b141d Merge pull request #4704 from amritahs-ibm/saxpy_perf_fix 1 year ago
  Chip Kerchner 3a1417671a POWER: Fixing endianness issue in cswap/zswap kernel for AIX 1 year ago
  Amrita H S 87b3d9054f Fix regression SAXPY when compiler with OpenXL compiler. 1 year ago
  Chip-Kerchner 99384933ff Revert "Merge pull request #4532 from austinpagan/cgemm_zgemm_c_code" 1 year ago
  Martin Kroeker accea15551 Merge pull request #4532 from austinpagan/cgemm_zgemm_c_code 1 year ago
  austinpagan 87ba528d8b Changed C files to straighten out indentation. Removed commented lines from other file. 2 years ago
  austinpagan ddac75e0ef Adding .C versions of CGEMM and ZGEMM 2 years ago
  Chip Kerchner 2bb7ea64a1 Only vectorize 64-bit version for Power8. 2 years ago
  Chip Kerchner 09bb48d1b9 Vectorize in-copy packing/copying for SGEMM - 4X faster. 2 years ago
  Chip-Kerchner 058dd2a4cb Replace two vector loads with one vector pair load and fix endianess of stores - DGEMM versions. 2 years ago
  barracuda156 d9653af018 KERNEL.PPC970, KERNEL.PPCG4: unbreak CMake parsing 2 years ago
  Chip-Kerchner 4e738e561a Replace two vector loads with one vector pair load and fix endianess of stores. 2 years ago
  Rajalakshmi Srinivasaraghavan 980f702f72 POWER: AIX: Make use of power10 optimization 2 years ago
  Rajalakshmi Srinivasaraghavan 82fc29a57a POWER10: Fallback to POWER8 functions 2 years ago
  Martin Kroeker 8e6d93359d Merge pull request #4196 from TiborGY/obsolete_inlines 2 years ago
  Ian McInerney 79c15db348 Fix power10 gcc intrinsic check 2 years ago
  TGY b5ba95a6c0 Modernize obsolete inline order 2 years ago
  Martin Kroeker 54d3246fc6 Allow negative INCX (API change from version 3.10 of the reference implementation) 2 years ago
  Manjul Mohan 58b88aa5f0 POWER10: Fix compiler warnings 2 years ago
  Martin Kroeker 1688c7da43 change line endings from CRLF to LF 3 years ago
  Martin Kroeker 6c118b7977 Fix DNRM2 returning INF instead of zero due to intermediate overflow 3 years ago
  Martin Kroeker c43ec53bdd Merge pull request #3690 from RajalakshmiSR/cdotp10 3 years ago
  Rajalakshmi Srinivasaraghavan a612e78a97 POWER: Fix complex dot function failures 3 years ago