Commit Graph

  • * ea5bdc3f7 split cortex-a53 param to match 8x8 kernel by 张丹枫 2020-05-20 22:34:47 +0800
  • * 9df79ae9a update sgemm and strmm kernel selecting strategy by 张丹枫 2020-05-20 21:57:12 +0800
  • * a1fc6041c use general register to speedup by 张丹枫 2020-05-20 21:55:32 +0800
  • * edb423d77 align general register using to strmm_kernel_8x8 by 张丹枫 2020-05-20 21:52:49 +0800
  • * 0e6eb8c24 sgemm kernel use sgemm_kernel_8x8_cortexa53 by zhangdanfeng 2020-05-18 16:51:33 +0800
  • * d475db29c optimized for cortex-a53 by zhangdanfeng 2020-05-18 16:47:33 +0800
  • * 729ac6bd4 Merge pull request #2623 from mhillenibm/zarch_dgemm_z14 by Martin Kroeker 2020-05-20 14:51:04 +0200
  • |\
  • | * 89fe17f20 (refs/pull/2623/head) s390x: Use new sgemm kernel also for DGEMM and DTRMM on Z14 by Marius Hillenbrand 2020-05-19 14:56:34 +0200
  • | * bdd795ed0 s390x/GEMM: replace 0-init with peeled first iteration by Marius Hillenbrand 2020-05-19 14:30:44 +0200
  • |/
  • * e1038ea83 Merge pull request #2622 from martin-frbg/issue2619 by Martin Kroeker 2020-05-19 23:07:22 +0200
  • |\
  • | * 6baa9a778 (refs/pull/2622/head) Improve declaration of LAPACKE_get_nancheck by Martin Kroeker 2020-05-19 17:59:31 +0200
  • * | cf46c9f84 Merge pull request #2617 from martin-frbg/issue2616 by Martin Kroeker 2020-05-18 13:23:58 +0200
  • |\ \
  • | * | 55602fce5 (refs/pull/2617/head) Ignore spurious all-numeric library names derived from mishandled jobserver flags by Martin Kroeker 2020-05-17 15:28:14 +0200
  • | * | 3d5e159e7 Ignore spurious all-numeric library names derived from mishandled jobserver flags by Martin Kroeker 2020-05-17 15:26:57 +0200
  • | |/
  • | * 2931feb57 Merge pull request #58 from xianyi/develop by Martin Kroeker 2020-05-17 15:23:32 +0200
  • | |\
  • | |/
  • |/|
  • * | 20245ded5 Merge pull request #2615 from mhillenibm/z14_alignment_hints by Martin Kroeker 2020-05-14 21:06:34 +0200
  • |\ \
  • | * | 2840432e4 (refs/pull/2615/head) s390x: improvise vector alignment hints for older compilers by Marius Hillenbrand 2020-05-13 17:48:50 +0200
  • |/ /
  • * | ea78106c7 Merge pull request #2614 from mhillenibm/gemm_vec_z14 by Martin Kroeker 2020-05-13 15:09:23 +0200
  • |\ \
  • | * | cb9dc36dd (refs/pull/2614/head) Update CONTRIBUTORS.md by Marius Hillenbrand 2020-05-12 16:14:00 +0200
  • | * | 1b0b4349a s390x/Z14: Change register blocking for SGEMM to 16x4 by Marius Hillenbrand 2020-05-12 15:06:38 +0200
  • | * | 71b6eaf45 s390x: Use new sgemm kernel also for strmm on Z14 and newer by Marius Hillenbrand 2020-05-12 14:40:30 +0200
  • | * | 43c0d4f31 s390x: Add vectorized sgemm kernel for Z14 and newer by Marius Hillenbrand 2020-05-12 14:13:54 +0200
  • | * | d7c1677c2 (refs/pull/2613/head) Update CONTRIBUTORS.md, adding myself by Marius Hillenbrand 2020-05-12 11:09:28 +0200
  • | * | 0dbe61a61 s390x: choose SIMD kernels at run-time based on OS and compiler support by Marius Hillenbrand 2020-05-11 13:00:10 +0200
  • | * | 62cf391cb s390x: only build kernels supported by gcc with dynamic arch support by Marius Hillenbrand 2020-05-11 18:37:04 +0200
  • | * | 8c338616f s390x: gate dynamic arch detection on gcc version and add generic by Marius Hillenbrand 2020-05-11 12:37:21 +0200
  • |/ /
  • * | f94c53ec0 Merge pull request #2612 from RajalakshmiSR/testshgemm by Martin Kroeker 2020-05-12 08:34:02 +0200
  • |\ \
  • | * | 8efba9b7c (refs/pull/2612/head) Improve shgemm test by Rajalakshmi Srinivasaraghavan 2020-05-11 17:15:10 -0500
  • |/ /
  • * | 4fffa556d Merge pull request #2611 from RajalakshmiSR/bench_half by Martin Kroeker 2020-05-11 21:08:41 +0200
  • |\ \
  • | * | ce90e2bd3 (refs/pull/2611/head) Include shgemm in benchtest by Rajalakshmi Srinivasaraghavan 2020-05-11 09:57:46 -0500
  • |/ /
  • * | 948b6712b Merge pull request #2610 from martin-frbg/issue2552-3 by Martin Kroeker 2020-05-10 13:10:31 +0200
  • |\ \
  • | * | 2271c3506 (refs/pull/2610/head) Work around excessive LAPACK test failures on Skylake-X by Martin Kroeker 2020-05-09 23:49:18 +0200
  • | |/
  • * | db00b2144 Merge pull request #2609 from martin-frbg/issue2552-2 by Martin Kroeker 2020-05-09 21:33:02 +0200
  • |\ \
  • | * | 58d26b444 (refs/pull/2609/head) Correct ifort options by Martin Kroeker 2020-05-09 17:15:36 +0200
  • | |/
  • * | 8e47d1405 Merge pull request #2608 from martin-frbg/issue2604 by Martin Kroeker 2020-05-09 16:36:14 +0200
  • |\ \
  • | * | cd10b35fe (refs/pull/2608/head) Handle trailing spaces and empty condition variables by Martin Kroeker 2020-05-09 13:42:33 +0200
  • | |/
  • | * 9472dd99c Merge pull request #57 from xianyi/develop by Martin Kroeker 2020-05-09 13:20:44 +0200
  • | |\
  • | |/
  • |/|
  • * | 718166545 Merge pull request #2605 from RajalakshmiSR/cmake-power by Martin Kroeker 2020-05-09 11:29:28 +0200
  • |\ \
  • | * | bd9ff820b (refs/pull/2605/head) Fix cmake compilation issue - POWER9 by Rajalakshmi Srinivasaraghavan 2020-05-08 20:31:56 -0500
  • |/ /
  • * | 63e45def7 Merge pull request #2603 from martin-frbg/issue2552 by Martin Kroeker 2020-05-08 22:08:39 +0200
  • |\ \
  • | * | ec0f22863 (refs/pull/2603/head) Add FFLAGS_DRV to the generated make.inc to fix lapack-test on x86_64 with icc/ifort by Martin Kroeker 2020-05-08 18:06:12 +0200
  • | |/
  • | * 90e2941c6 Merge pull request #56 from xianyi/develop by Martin Kroeker 2020-05-07 22:43:48 +0200
  • | |\
  • | |/
  • |/|
  • * | 10d5f3c87 Merge pull request #2602 from ashwinyes/thunderx2_develop by Martin Kroeker 2020-05-07 22:06:41 +0200
  • |\ \
  • | * | 8353cb245 (refs/pull/2602/head) ARM64: Improve DAXPY for ThunderX2 by Ashwin Sekhar T K 2020-05-07 09:14:05 -0700
  • |/ /
  • * | ec2dd7b87 Merge pull request #2601 from martin-frbg/issue818 by Martin Kroeker 2020-05-07 10:12:33 +0200
  • |\ \
  • | * | 4e82eb9f8 (refs/pull/2601/head) Undefine ASMNAME/NAME/CNAME before defining them by Martin Kroeker 2020-05-07 00:31:32 +0200
  • | |/
  • | * 61300bb73 Merge pull request #55 from xianyi/develop by Martin Kroeker 2020-05-07 00:27:14 +0200
  • | |\
  • | |/
  • |/|
  • * | 33e9b1246 Merge pull request #2597 from martin-frbg/appleclang by Martin Kroeker 2020-05-05 13:55:08 +0200
  • |\ \
  • | * | 90dba9f71 (refs/pull/2597/head) Duplicate earlier Clang 9.0.0 workaround for corresponding Apple Clang version by Martin Kroeker 2020-05-05 10:44:50 +0200
  • | |/
  • | | * 4d0fd365a (refs/pull/2594/head) Update common_x86_64.h by Martin Kroeker 2020-05-02 20:29:25 +0200
  • | | * 4abb651af fix format specifier for unsigned by Martin Kroeker 2020-05-02 16:10:49 +0200
  • | | * b5d3e46e6 more debugging by Martin Kroeker 2020-05-02 15:21:13 +0200
  • | | * ccdf81ecc and back to unsigned to run another test... by Martin Kroeker 2020-05-02 14:22:32 +0200
  • | | * 20f2f6fc8 revert last change, blas_quickdivide returns a signed int again by Martin Kroeker 2020-05-01 21:12:11 +0200
  • | | * 6b96e6dfa make blas_quickdivide actually return unsigned (to placate clang) by Martin Kroeker 2020-05-01 16:01:42 +0200
  • | | * 94487c02d Delete extra semicolon after brace to make clang happy by Martin Kroeker 2020-05-01 15:56:17 +0200
  • | | * c3c00380d Delete spurious copy of common_param.h by Martin Kroeker 2020-05-01 15:34:56 +0200
  • | | * 2de3fff4f Move some declarations for pre-C99 compatibility by Martin Kroeker 2020-05-01 15:25:32 +0200
  • | |/
  • | * 424d551e0 Merge pull request #53 from xianyi/develop by Martin Kroeker 2020-05-01 15:18:46 +0200
  • | |\
  • | |/
  • |/|
  • * | 596f5df9e Merge pull request #2591 from RajalakshmiSR/testhalf by Martin Kroeker 2020-05-01 09:59:39 +0200
  • |\ \
  • * | | 5dd14e3d4 Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) by Martin Kroeker 2020-05-01 09:58:30 +0200
  • | | | * 924cc7e58 (refs/pull/2590/head) typo fix by Martin Kroeker 2020-04-29 22:11:42 +0200
  • | | | * 4297e2ed8 fix shgemm parameter references in arm64 branch by Martin Kroeker 2020-04-29 22:09:23 +0200
  • * | | | a54e35e78 Merge pull request #2586 from martin-frbg/miscfixes by Martin Kroeker 2020-04-29 22:01:41 +0200
  • |\ \ \ \
  • | | * | | 564b0d39e (refs/pull/2591/head) Add test for shgemm by Rajalakshmi Srinivasaraghavan 2020-04-29 13:40:34 -0500
  • | |/ / /
  • |/| | |
  • | | | * 254a934b5 ifdef another group of shgemm parameters by Martin Kroeker 2020-04-29 20:25:33 +0200
  • | | | * 9acf45c67 Fix overlooked shgemm parameters by Martin Kroeker 2020-04-29 19:25:13 +0200
  • | | | * 8d4042d89 Make shgemm parameters conditional on BUILD_HALF by Martin Kroeker 2020-04-29 18:46:16 +0200
  • | | | * 33059ad1d make bfloat16 functions conditional on BUILD_HALF by Martin Kroeker 2020-04-29 18:31:24 +0200
  • | | | * 137781096 fix endif by Martin Kroeker 2020-04-29 18:30:41 +0200
  • | | | * b2f6f76a5 Pass BUILD_HALF as a compiler define for dynamic_arch builds by Martin Kroeker 2020-04-29 18:30:10 +0200
  • | | | * 84e5b0c4f typo by Martin Kroeker 2020-04-29 16:07:27 +0200
  • | | | * 75e0495a7 Make shgemm kernels conditional on BUILD_HALF by Martin Kroeker 2020-04-29 15:58:59 +0200
  • | | | * fd267b58b make shgemm kernels conditional on BUILD_HALF by Martin Kroeker 2020-04-29 14:48:37 +0200
  • | | | * f881c697f pass the BUILD_HALF option to gensymbol by Martin Kroeker 2020-04-29 14:47:09 +0200
  • | | | * 48e26bc31 make bfloat16 functions conditional on BUILD_HALF by Martin Kroeker 2020-04-29 14:46:13 +0200
  • | | | * 34e64d57a make shgemm functions conditional on BUILD_HALF by Martin Kroeker 2020-04-29 14:44:53 +0200
  • | | | * 45881fab5 make shgemm functions conditional on BUILD_HALF by Martin Kroeker 2020-04-29 14:44:07 +0200
  • | | | * 7bf186565 make building the bfloat16 functions conditional on BUILD_HALF by Martin Kroeker 2020-04-29 14:42:35 +0200
  • | | | * 3c37071ee make bfloat16 kernels conditional on BUILD_HALF by Martin Kroeker 2020-04-29 14:40:17 +0200
  • | | |/
  • | | * 5d58b1110 Merge pull request #52 from xianyi/develop by Martin Kroeker 2020-04-29 14:36:15 +0200
  • | | |\
  • | |_|/
  • |/| |
  • * | | d394d4e67 Merge pull request #2585 from martin-frbg/mips64fix by Martin Kroeker 2020-04-28 19:47:55 +0200
  • |\ \ \
  • | | | | * 9d3a317ab Refs #2587 Fix typos. by Xianyi Zhang 2020-04-29 00:19:19 +0800
  • | | | | * 92372c70f Fix gemm interface bug for small matrix. by Xianyi Zhang 2020-04-28 23:15:20 +0800
  • | | | | * 43bef4aaa Add alpha=1.0 beta=0.0 for small gemm. by Xianyi Zhang 2020-04-28 22:35:36 +0800
  • | | | | * aae6af94b Add small marix optimization kernel interface. by Xianyi Zhang 2020-04-28 19:01:36 +0800
  • | | * | | f4248af26 (refs/pull/2586/head) Fix compiler warnings by Martin Kroeker 2020-04-28 10:43:12 +0200
  • | | |/ /
  • | * / / 2d89603e9 (refs/pull/2585/head) Increase BUFFER_SIZE on mips64 to match SGEMM parameters by Martin Kroeker 2020-04-28 10:40:40 +0200
  • | |/ /
  • | * | 26bc15258 Merge pull request #51 from xianyi/develop by Martin Kroeker 2020-04-28 10:38:50 +0200
  • | |\ \
  • | |/ /
  • |/| |
  • * | | 141998dce Merge pull request #2584 from martin-frbg/issue2583 by Martin Kroeker 2020-04-28 10:35:12 +0200
  • |\ \ \
  • | * | | 3bd56846b (refs/pull/2584/head) Silence a debug message by Martin Kroeker 2020-04-27 16:27:09 +0200
  • | * | | e7bbdfdf8 Have CMAKE parse conditional lines in KERNEL files by Martin Kroeker 2020-04-27 15:20:03 +0200
  • | |/ /
  • * | | b6795db73 Merge pull request #2582 from martin-frbg/mips32fix by Martin Kroeker 2020-04-27 09:18:34 +0200
  • |\ \ \
  • | * | | 5e0dbf8df (refs/pull/2582/head) Increase default BUFFER_SIZE to accomodate SGEMM parameters by Martin Kroeker 2020-04-26 22:21:05 +0200
  • | |/ /
  • | * | 955d73127 Merge pull request #50 from xianyi/develop by Martin Kroeker 2020-04-26 22:17:56 +0200
  • | |\ \
  • | |/ /
  • |/| |
  • * | | a8c1bea7a Merge pull request #2581 from martin-frbg/raji by Martin Kroeker 2020-04-25 19:57:10 +0200
  • |\ \ \
  • | * | | e43b49e06 (refs/pull/2581/head) Drop the set -e from travis scripts by Martin Kroeker 2020-04-25 16:18:54 +0200
  • | * | | 3e28db7f3 Update CONTRIBUTORS.md by Martin Kroeker 2020-04-25 13:51:44 +0200
  • | |/ /
  • * | | 4b69ee31a Merge pull request #2580 from martin-frbg/issue2538-3 by Martin Kroeker 2020-04-25 00:28:18 +0200
  • |\ \ \
  • | * | | 03ff213c5 (refs/pull/2580/head) Increase POWER8 ZGEMM_R and use same R values for POWER9 by Martin Kroeker 2020-04-24 21:46:54 +0200
  • | |/ /