Commit Graph

  • * af8084906 Add sgemm_direct by Martin Kroeker 2020-08-17 18:54:28 +0200
  • | * aa286e301 (refs/pull/2784/head) Add typedef for bfloat16 if needed by Martin Kroeker 2020-08-17 15:32:14 +0200
  • | * 9f0ef9cdf Merge pull request #77 from xianyi/develop by Martin Kroeker 2020-08-17 15:28:15 +0200
  • | |\
  • | * | 6bfc66663 (refs/pull/2781/head) revert by Martin Kroeker 2020-08-17 15:20:41 +0200
  • | * | a8c6fb9e1 revert by Martin Kroeker 2020-08-17 15:20:16 +0200
  • | * | 5ec8f716c revert by Martin Kroeker 2020-08-17 15:19:40 +0200
  • * | | 54e02aaf1 Update gemm.c by Martin Kroeker 2020-08-16 20:45:20 +0200
  • * | | a83cb3966 Refactor sgemm_direct by Martin Kroeker 2020-08-16 19:01:43 +0200
  • * | | 5a74bd45f remove include as sgemm_direct is handled at the makefile level now by Martin Kroeker 2020-08-16 09:20:44 +0200
  • * | | 56d4d4f84 Move sgemm_direct_performant helper to separate file by Martin Kroeker 2020-08-16 09:19:34 +0200
  • * | | 2586b26e2 Add direct_sgemm by Martin Kroeker 2020-08-16 09:16:52 +0200
  • * | | 86e3455d0 Add sgemm_direct targets by Martin Kroeker 2020-08-16 09:15:56 +0200
  • * | | 774029af3 move sgemm_direct function declarations by Martin Kroeker 2020-08-16 09:13:39 +0200
  • |/ /
  • * | 82f8a0aeb Update .drone.yml by Martin Kroeker 2020-08-15 15:46:18 +0200
  • * | d57d503c1 Update Makefile by Martin Kroeker 2020-08-15 14:46:26 +0200
  • * | 37ac23e8a Add simple MT sgemm precision test and INTERFACE64 build by Martin Kroeker 2020-08-15 13:38:05 +0200
  • * | 6a93e3b2b Add simple sgemm preicsion test by Martin Kroeker 2020-08-15 13:33:52 +0200
  • * | 47ce1dd08 Update gemm64.cpp by Martin Kroeker 2020-08-15 13:31:28 +0200
  • * | f5fcc5bae Add trivial gemm test for multithread consistency by Martin Kroeker 2020-08-15 13:30:29 +0200
  • | | * 597010a96 (refs/pull/2778/head) Fix incorrect argument to SLASET by Martin Kroeker 2020-08-14 00:41:56 +0200
  • | | * d64f1ef26 Fix incorrect argument to SLASET by Martin Kroeker 2020-08-14 00:40:24 +0200
  • | | * c62aad62e Fix incorrect calls to DLASET by Martin Kroeker 2020-08-14 00:35:45 +0200
  • | |/
  • |/|
  • | | * e740c4873 Enable COOPERLAKE build target by Chen, Guobing 2020-08-13 06:17:34 +0800
  • | |/
  • | * efdd237a9 Add a dedicated POWER9 build to the Travis CI (#2774) by Martin Kroeker 2020-08-12 23:08:38 +0200
  • | | * 8f1111f4c (refs/pull/2774/head) Update .travis.yml by Martin Kroeker 2020-08-12 22:35:29 +0200
  • | | * b05289dd2 Switch p9 to Ubuntu 18 container to ensure P9 hosting by Martin Kroeker 2020-08-12 19:57:38 +0200
  • | | * 7632a561d use autodetection for power9 in case there are still power8 boxes in the mix by Martin Kroeker 2020-08-12 18:05:14 +0200
  • | | * 941339824 Update .travis.yml by Martin Kroeker 2020-08-12 16:54:06 +0200
  • | | * ef2db95f5 add the script back... by Martin Kroeker 2020-08-12 13:57:39 +0200
  • | | * 5137146d5 use plain apt commands rather than addon on ppc64le by Martin Kroeker 2020-08-12 12:50:55 +0200
  • | | * 072f68dbc Update .travis.yml by Martin Kroeker 2020-08-12 10:54:10 +0200
  • | | * f7bd46483 Update .travis.yml by Martin Kroeker 2020-08-11 21:13:48 +0200
  • | | | * 93e748d67 (refs/pull/2771/head) Change BFLOAT16 data type/API support naming by Chen, Guobing 2020-08-11 09:27:29 +0800
  • | | |/
  • | |/|
  • | * | 4573cb2f4 Merge pull request #2765 from martin-frbg/issue2760 by Martin Kroeker 2020-08-11 22:40:17 +0200
  • | |\ \
  • | * \ \ 2a4bb797d Merge pull request #2773 from martin-frbg/issue2770 by Martin Kroeker 2020-08-11 21:02:55 +0200
  • | |\ \ \
  • | | | | * 72f8d8f44 Update .travis.yml by Martin Kroeker 2020-08-11 18:34:22 +0200
  • | * | | | cbbe38bb8 Merge pull request #2772 from mhillenibm/s390x_gemm_tuning by Martin Kroeker 2020-08-11 18:14:09 +0200
  • | |\ \ \ \
  • | | | | | * 4f9fb930e Update .travis.yml by Martin Kroeker 2020-08-11 18:06:18 +0200
  • | | | | | * 22f746786 Update .travis.yml by Martin Kroeker 2020-08-11 17:57:16 +0200
  • | | | | | * 780bd896b Update .travis.yml by Martin Kroeker 2020-08-11 17:49:59 +0200
  • | | | | | * 7dd3ccf79 Bump gcc version for POWER9 build by Martin Kroeker 2020-08-11 17:37:36 +0200
  • | | | | | * 8ccd6831d Add dedicated POWER9 build by Martin Kroeker 2020-08-11 16:12:49 +0200
  • | |_|_|_|/
  • |/| | | |
  • | | | * | 619343278 (refs/pull/2773/head) Fix mishandling of NO_CBLAS=0 and NO_LAPACKE=0 by Martin Kroeker 2020-08-11 13:40:40 +0200
  • | | | * | fee361ae6 fix another source of NO_CBLAS=0 surprise by Martin Kroeker 2020-08-11 13:27:19 +0200
  • | |_|/ /
  • |/| | |
  • * | | | 62f4c84f2 Merge pull request #76 from xianyi/develop by Martin Kroeker 2020-08-11 13:25:12 +0200
  • |\| | |
  • | | * | e115c97e0 (refs/pull/2772/head) s390x/SGEMM: adjust default P and Q to multiples of M by Marius Hillenbrand 2020-08-11 12:55:59 +0200
  • | | * | 07c334e7b s390x: Factor out small block sizes for SGEMM/DGEMM on z14 by Marius Hillenbrand 2020-08-11 12:55:53 +0200
  • | | * | e2828e30a s390x: Optimize SGEMM/DGEMM blocks for z14 with explicit loop unrolling/interleaving by Marius Hillenbrand 2020-08-11 12:55:42 +0200
  • | * | | 7219c9cb8 Merge pull request #2764 from martin-frbg/lapacktests by Martin Kroeker 2020-08-10 13:27:51 +0200
  • | |\ \ \
  • | | | | * c9d32674e (refs/pull/2765/head) Add memory barrier to the blas_lock implementation for Linux by Martin Kroeker 2020-08-09 19:17:04 +0200
  • | |_|_|/
  • |/| | |
  • | | * | 64259d521 (refs/pull/2764/head) Fix use of unallocated array in workspace query and wrong type of argument to xSCAL by Martin Kroeker 2020-08-09 13:02:27 +0200
  • | | * | 6f5ca44c1 Expand TAU array as SGEMQR/DGEMQR read elements 2 and 3 by Martin Kroeker 2020-08-09 12:59:20 +0200
  • | |/ /
  • |/| |
  • | * | d28b3f277 Create Jenkinsfile for OSUOSL PowerCI by Martin Kroeker 2020-08-08 18:05:20 +0200
  • | * | ba3f7b3ac Merge pull request #2761 from RajalakshmiSR/Makefile_err by Martin Kroeker 2020-08-08 12:20:04 +0200
  • | |\ \
  • | | |/
  • | |/|
  • | | * 475b5c95b (refs/pull/2761/head) Remove extra symbol in Makefile by Rajalakshmi Srinivasaraghavan 2020-08-07 15:27:44 -0500
  • | |/
  • | * cd60080d4 Merge pull request #2758 from martin-frbg/undef_shift by Martin Kroeker 2020-08-03 23:30:26 +0200
  • | |\
  • | * \ 4847bfddd Merge pull request #2757 from martin-frbg/cmake64 by Martin Kroeker 2020-08-02 23:05:21 +0200
  • | |\ \
  • | | | * 81dcfdcf3 (refs/pull/2758/head) Multiply by 2 instead of left-shifting a potentially negative number by Martin Kroeker 2020-08-02 18:29:56 +0200
  • | | | * 0ef4b3f1f Multiply instead of doing a left shift of a potentially negative number by Martin Kroeker 2020-08-02 18:27:40 +0200
  • | | | * aa53a8a5c Multiply by two instead of left-shifting one place by Martin Kroeker 2020-08-02 18:25:09 +0200
  • | | | * aa3a1e7d8 Multiply by two rather than left shift by one place by Martin Kroeker 2020-08-02 18:22:31 +0200
  • | |_|/
  • |/| |
  • | | * aaf1a1716 (refs/pull/2757/head) Apply current library name suffix by Martin Kroeker 2020-08-02 17:58:33 +0200
  • | | * 53add6a80 Apply library name suffix to openblas if any by Martin Kroeker 2020-08-02 17:57:12 +0200
  • | |/
  • |/|
  • * | 9eb897cc0 Merge pull request #75 from xianyi/develop by Martin Kroeker 2020-08-02 17:50:06 +0200
  • |\|
  • | * 7cead5625 Merge pull request #2753 from martin-frbg/issue2751 by Martin Kroeker 2020-08-02 15:32:46 +0200
  • | |\
  • | | * 6794ac341 (refs/pull/2753/head) Add SYMBOLPREFIX and/or -SUFFIX to cblas.h if needed by Martin Kroeker 2020-08-02 11:20:08 +0200
  • | | * ecf4b9e0f Improve substitution rules for SYMBOLPREFIX and -SUFFIX addition by Martin Kroeker 2020-08-01 17:06:03 +0200
  • | * | dfe5d0964 Merge pull request #2756 from martin-frbg/issue2755 by Martin Kroeker 2020-08-01 15:19:02 +0200
  • | |\ \
  • | | * | 60cd5e55f (refs/pull/2756/head) Protect against inadvertent activation of USE_CUDA by Martin Kroeker 2020-08-01 12:31:39 +0200
  • | |/ /
  • |/| |
  • | | * da9e2a7ad Add SYMBOLPREFIX and/or SYMBOLSUFFIX to cblas prototypes by Martin Kroeker 2020-07-31 16:03:33 +0200
  • | |/
  • |/|
  • | * c88cbc5e0 Merge pull request #2752 from kadler/cpuid_aix by Martin Kroeker 2020-07-31 12:52:24 +0200
  • | |\
  • | | * 589c74aed (refs/pull/2752/head) Use systemcfg APIs for CPU detection on AIX by Kevin Adler 2020-07-30 20:52:16 -0500
  • | |/
  • | * 104aa678b Fix inadvertent version number reversal to 0.3.9.dev caused by #2710 by Martin Kroeker 2020-07-30 11:40:52 +0200
  • | * c6b48e039 Merge pull request #2749 from martin-frbg/make_ppc by Martin Kroeker 2020-07-30 11:35:53 +0200
  • | |\
  • | * \ 492725129 Merge pull request #2750 from RajalakshmiSR/dgemv_p10 by Martin Kroeker 2020-07-30 10:13:19 +0200
  • | |\ \
  • | | * | f77b6a83f (refs/pull/2750/head) dgemv optimization for POWER10 by Rajalakshmi Srinivasaraghavan 2020-07-29 18:59:32 -0500
  • | |/ /
  • | | * 39724e812 (refs/pull/2749/head) Separate OpenMP handling and allow compilation of Power9 code with older gcc by Martin Kroeker 2020-07-30 01:14:08 +0200
  • | |/
  • |/|
  • * | 525db5401 Merge pull request #74 from xianyi/develop by Martin Kroeker 2020-07-30 01:04:09 +0200
  • |\|
  • | * cb097beba Merge pull request #2741 from martin-frbg/issue2739 by Martin Kroeker 2020-07-29 10:01:14 +0200
  • | |\
  • | * \ 7c02f4b1f Merge pull request #2744 from martin-frbg/issue2738 by Martin Kroeker 2020-07-28 19:32:04 +0200
  • | |\ \
  • | * \ \ 383262035 Merge pull request #2740 from RajalakshmiSR/clang-power by Martin Kroeker 2020-07-28 18:15:25 +0200
  • | |\ \ \
  • | * | | | 5fa581c87 Put hint to use git develop rather than master branch in README by Martin Kroeker 2020-07-28 14:22:41 +0000
  • | | | * | 12918358a (refs/pull/2744/head) Add AMD Renoir/Matisse and preliminary support for Zen3 as Zen2 by Martin Kroeker 2020-07-28 13:53:17 +0000
  • | | | * | 200f5c44c Add AMD Renoir models and preliminary support for ZEN3 as ZEN2 by Martin Kroeker 2020-07-28 13:45:23 +0000
  • | |_|/ /
  • |/| | |
  • | | | | * c4176105d (refs/pull/2743/head) Fix accidental deletion by Martin Kroeker 2020-07-28 10:08:41 +0000
  • | | | | * ba27936ce Add cpuid detection of AMD Zen2 Matisse and Renoir by Martin Kroeker 2020-07-28 09:03:52 +0000
  • | | | | * afdca268a Add AMD Matisse and Renoir Zen2 variants by Martin Kroeker 2020-07-28 09:00:12 +0000
  • | |_|_|/
  • |/| | |
  • | | | * 64e2e4aaf (refs/pull/2741/head) missing braces by Martin Kroeker 2020-07-27 20:19:22 +0000
  • | | | * 921ec4e9e Adjust A53 SGEMM parameters to reflect move to 8x8 kernel by Martin Kroeker 2020-07-27 19:54:46 +0000
  • | |_|/
  • |/| |
  • | | * d557584b7 (refs/pull/2740/head) Fix compilation issues with clang on POWER by Rajalakshmi Srinivasaraghavan 2020-07-27 14:11:07 -0500
  • | |/
  • | * a4ceb1ade Merge pull request #2737 from ashwinyes/add_thunderx3_target by Martin Kroeker 2020-07-27 15:19:47 +0200
  • | |\
  • | | * 4e1be0e48 (refs/pull/2737/head) ARM64: Add THUNDERX3T110 Target by Ashwin Sekhar T K 2020-06-11 04:12:49 -0700
  • | |/
  • | * 49b83e00b Merge pull request #2735 from martin-frbg/move_potrf by Martin Kroeker 2020-07-26 19:54:11 +0200
  • | |\
  • | * \ 769ed9ffa Merge pull request #2734 from RajalakshmiSR/p10_fix by Martin Kroeker 2020-07-25 09:02:32 +0200
  • | |\ \
  • | | | * f194ad59e (refs/pull/2735/head) Use _Atomic instead of volatile where available (file moved from ../getrf) by Martin Kroeker 2020-07-25 08:52:24 +0200
  • | | | * 4fda217f9 Delete potrf_parallel.c (moving it to ../potrf) by Martin Kroeker 2020-07-25 06:42:39 +0000
  • | |_|/
  • |/| |
  • | | * 9be2688c7 (refs/pull/2734/head) Fix to store results in correct order for POWER10 GEMM kernels by Rajalakshmi Srinivasaraghavan 2020-07-24 23:08:11 -0500
  • | |/
  • | * 6a2a60038 Merge pull request #2720 from martin-frbg/issue2694 by Martin Kroeker 2020-07-24 23:19:45 +0200
  • | |\
  • | | * 251a09ec9 (refs/pull/2720/head) Typo fix by Martin Kroeker 2020-07-24 16:04:58 +0000
  • | | * 95d37e157 Regroup the 32 and 64bit sections and restore 64bit CAXPY by Martin Kroeker 2020-07-24 10:13:46 +0000