Commit Graph

  • * 0d1f30a29 Merge pull request #81 from xianyi/develop by Martin Kroeker 2020-09-05 12:47:03 +0200
  • |\
  • | * 70a254d50 Merge pull request #2822 from martin-frbg/issue2821 by Martin Kroeker 2020-09-05 12:39:32 +0200
  • | |\
  • | | * 330044d82 (refs/pull/2822/head) Fix potentiol domain error in sqrt by Martin Kroeker 2020-09-05 09:44:33 +0200
  • | |/
  • |/|
  • | * 97636b2c8 Merge pull request #2819 from h-vetinari/carry_lapack_437 by Martin Kroeker 2020-09-04 23:50:43 +0200
  • | |\
  • | * \ 4d3671154 Merge pull request #2820 from RajalakshmiSR/clang by Martin Kroeker 2020-09-04 23:09:31 +0200
  • | |\ \
  • | | * | 718f67421 (refs/pull/2820/head) POWER9: Fix mcpu option with clang by Rajalakshmi Srinivasaraghavan 2020-09-04 10:36:19 -0500
  • | |/ /
  • | | * 3426519ae (refs/pull/2819/head) adapt ?ggsv?-functions to ambient code style in LAPACKE/include/lapack.h by H. Vetinari 2020-09-02 22:46:47 +0200
  • | | * 1c6c71fa8 Follow-up to lapack#434 & lapack#409: add missing 'const' in signatures by H. Vetinari 2020-09-02 22:41:50 +0200
  • | | * 860247b5d Follow-up to lapack#434 & lapack#409: fix signature mismatches by H. Vetinari 2020-09-02 22:38:56 +0200
  • | |/
  • | * c61771e33 Merge pull request #2778 from martin-frbg/lapackeig by Martin Kroeker 2020-09-04 10:06:02 +0200
  • | |\
  • | | | * deaeb6c5b (refs/pull/2796/head) Add bfloat16 based dot and conversion with single/double by Chen, Guobing 2020-08-27 06:42:28 +0800
  • | | |/
  • | |/|
  • | * | c7ef7174e Merge pull request #2817 from martin-frbg/lapack436 by Martin Kroeker 2020-09-03 17:10:23 +0200
  • | |\ \
  • | | | | * 775a87242 (refs/pull/2816/head) Rename KERNEL.SILICON to KERNEL.VORTEX by Martin Kroeker 2020-09-03 08:44:20 +0200
  • | | | | * af5bc9550 Rename SILICON to VORTEX and fix duplicate numbering by Martin Kroeker 2020-09-03 08:43:26 +0200
  • | | | | * ea3a58c84 Rename SILICON to VORTEX by Martin Kroeker 2020-09-03 08:38:53 +0200
  • | | | | * 17dca035d rename SILICON to VORTEX by Martin Kroeker 2020-09-03 08:38:08 +0200
  • | | | | | * 1b0f17eee (refs/pull/2803/head) align to 64, using SSE when input size is small by Gengxin Xie 2020-09-01 15:41:48 +0800
  • | | * | | | c31b72965 (refs/pull/2817/head) Fix data type of work array in zgesvdq prototype by Martin Kroeker 2020-09-02 23:44:44 +0200
  • | | * | | | 0ce2aa316 Fix data type of rwork array by Martin Kroeker 2020-09-02 23:41:51 +0200
  • | |/ / / /
  • |/| | | |
  • | | | * | 80794fe8f Create KERNEL.SILICON by Martin Kroeker 2020-09-02 22:56:58 +0200
  • | | | * | 4a4d1ca6e Add AppleSIlicon cpu by Martin Kroeker 2020-09-02 22:52:12 +0200
  • | | | * | b37d17382 Add Apple Silicon by Martin Kroeker 2020-09-02 22:48:49 +0200
  • | | | * | 029fd01cf Detect AppleSilicon cpu on OSX by Martin Kroeker 2020-09-02 22:47:38 +0200
  • | |_|/ /
  • |/| | |
  • * | | | 9d1ea75aa Merge pull request #80 from xianyi/develop by Martin Kroeker 2020-09-02 22:16:41 +0200
  • |\| | |
  • | * | | 776d005f4 Merge pull request #2815 from mhillenibm/clang_s390x by Martin Kroeker 2020-09-02 16:56:01 +0200
  • | |\ \ \
  • | | * | | 2ee5b899c (refs/pull/2815/head) s390x: enable S/DGEMM block with explicit loop unrolling + interleaving with clang by Marius Hillenbrand 2020-09-01 16:16:53 +0200
  • | | * | | 095f4e696 s390x: allow clang to emit fused multiply-adds (replicates gcc's default behavior) by Marius Hillenbrand 2020-09-01 15:09:32 +0200
  • | | * | | 87e5bbd88 s390x: avoid variable-length arrays in struct for asm operands by Marius Hillenbrand 2020-09-01 12:08:05 +0200
  • | | * | | b9b3265ec s390x: avoid inline assembly for vector loads for clang by Marius Hillenbrand 2020-09-01 12:04:28 +0200
  • | | * | | a1616a0b8 s390x: replace nop with "nop 0" in inline assembly by Marius Hillenbrand 2020-09-01 11:58:48 +0200
  • | | * | | 60ef19325 s390x: use "lghi" for immediate values to fix build with clang by Marius Hillenbrand 2020-09-01 13:59:06 +0200
  • | |/ / /
  • | * | | 18bfb6d6f Merge pull request #2813 from martin-frbg/issue2804-2 by Martin Kroeker 2020-09-01 23:39:46 +0200
  • | |\ \ \
  • | | * | | e4900caa1 (refs/pull/2813/head) Fix c_check misinterpreting arm64 in uname output to mean armv7 by Martin Kroeker 2020-09-01 19:54:08 +0200
  • | |/ / /
  • |/| | |
  • | | | | * be9a20fb6 (refs/pull/2812/head) Accept uname output of arm64 as such by Martin Kroeker 2020-09-01 17:45:41 +0200
  • | |_|_|/
  • |/| | |
  • | * | | 68b1713c3 Merge pull request #2811 from martin-frbg/issue2806 by Martin Kroeker 2020-09-01 17:19:14 +0200
  • | * | | 4074770d0 Merge pull request #2797 from martin-frbg/relafixes1 by Martin Kroeker 2020-09-01 16:04:03 +0200
  • | |\ \ \
  • | | | | | * 88bf71d02 (refs/pull/2811/head) Fix accidental deletion of Cooperlake entries with the preceding commit by Martin Kroeker 2020-09-01 14:15:20 +0200
  • | | | | | * a76e56f91 Report NO_AVX512 being set (as it is already done for NO_AVX, NO_AVX2) by Martin Kroeker 2020-09-01 13:34:59 +0200
  • | | | | | * 47e75f0ac Allow overriding the AVX512 check with a NO_AVX512 define by Martin Kroeker 2020-09-01 12:09:25 +0200
  • | |_|_|_|/
  • |/| | | |
  • * | | | | b87a77da0 Merge pull request #79 from xianyi/develop by Martin Kroeker 2020-09-01 12:03:53 +0200
  • |\| | | |
  • | * | | | f42e84d46 Fix misnaming of LAPACK_?ggsvp function prototypes as LAPACKE_ (#2808) by Martin Kroeker 2020-09-01 10:44:48 +0200
  • | * | | | 0a4c5c4c4 Merge pull request #2807 from martin-frbg/issue2804 by Martin Kroeker 2020-08-31 23:44:56 +0200
  • | |\ \ \ \
  • | | | | | | * a5f7626bd (refs/pull/2808/head) missing comma by Martin Kroeker 2020-08-31 23:29:20 +0200
  • | | | | | | * 3aaa6a47b fix argument lists of LAPACK_?ggsvp prototypes by Martin Kroeker 2020-08-31 23:18:04 +0200
  • | | | | | | * 16b805b25 Update lapack.h by Martin Kroeker 2020-08-31 22:53:41 +0200
  • | | | | | | * deb119992 Update lapack.h by Martin Kroeker 2020-08-31 22:38:15 +0200
  • | | | | | | * defb15e71 Update lapack.h by Martin Kroeker 2020-08-31 21:36:21 +0200
  • | | | | | | * cd4fbf124 Need to drop the LAPACKE matrix_layout parameter for LAPACK ?ggsvp as well by Martin Kroeker 2020-08-31 21:18:45 +0200
  • | | | | | | * 1a0552370 Fix misnaming of LAPACK_?ggsvp function prototypes as LAPACKE_ by Martin Kroeker 2020-08-31 20:08:35 +0200
  • | |_|_|_|_|/
  • |/| | | | |
  • | | * | | | 3210a4273 (refs/pull/2807/head) Report cpu as ARMV8 instead of just giving up on non-Linux hosts by Martin Kroeker 2020-08-31 20:03:21 +0200
  • | | * | | | 5feb087c0 Handle Apple labeling armv8 as arm64 rather than aarch64 by Martin Kroeker 2020-08-31 20:02:08 +0200
  • | |/ / / /
  • |/| | | |
  • | | | | * 448152cdd define __AVX2__ to ensure the haswell code compiled with avx2 by Gengxin Xie 2020-08-31 14:39:08 +0800
  • | | | | * cb3c190a3 Implementaion of dasum, sasum with AVX2 & AVX512 intrinsic by Gengxin Xie 2020-08-21 14:44:36 +0800
  • | * | | | 59e01b1ae Merge pull request #2799 from RajalakshmiSR/p10_ger by Martin Kroeker 2020-08-28 22:52:11 +0200
  • | |\ \ \ \
  • | | * | | | 317ff27cd (refs/pull/2799/head) POWER10: Avoid setting accumulators to zero in gemm kernels by Rajalakshmi Srinivasaraghavan 2020-08-28 10:42:54 -0500
  • | |/ / / /
  • | | | | | * 4130d1732 Refs #2587 fix small matrix c/zgemm bug. by Xianyi Zhang 2020-08-28 22:36:36 +0800
  • | | | | | * 255b6dd0f Merge branch 'develop' into small_matrices by Xianyi Zhang 2020-08-28 21:38:58 +0800
  • | | | | | |\
  • | | |_|_|_|/
  • | |/| | | |
  • | | | | | * 741d6c5cb Refs #2587 Add small matrix optimization reference kernel for c/zgemm. by Xianyi Zhang 2020-08-28 21:00:54 +0800
  • | * | | | | 514a3d7d6 Merge pull request #2798 from kadler/aix-cpuid by Martin Kroeker 2020-08-28 08:30:59 +0200
  • | |\ \ \ \ \
  • | | * | | | | 085aae8bd (refs/pull/2798/head) Fix compile error on AIX cpuid detection by Kevin Adler 2020-08-27 23:08:33 -0500
  • | |/ / / / /
  • | | | | | * 712ca4306 Change a1b0 gemm to b0 gemm. by Xianyi Zhang 2020-08-28 07:55:27 +0800
  • | | * | | | de6367571 (refs/pull/2797/head) Add early returns and fix sign errors in workspace calculations by Martin Kroeker 2020-08-27 11:25:18 +0200
  • | | * | | | d64cc2be8 Add early returns by Martin Kroeker 2020-08-27 11:22:50 +0200
  • | | * | | | c9b67141f Add early returns by Martin Kroeker 2020-08-27 11:20:31 +0200
  • | | * | | | 6797a3a1e Add early returns by Martin Kroeker 2020-08-27 11:15:12 +0200
  • | | * | | | 936966a42 Make ILAENV and xGETRF2 functions available by Martin Kroeker 2020-08-27 10:59:08 +0200
  • | |/ / / /
  • |/| | | |
  • | * | | | 5c6c2cd4f Merge pull request #2775 from Guobing-Chen/Fix_OMP_threads_specify by Martin Kroeker 2020-08-24 20:18:09 +0200
  • | |\ \ \ \
  • | * \ \ \ \ e54be4ba1 Merge pull request #2792 from pkubaj/patch-1 by Martin Kroeker 2020-08-24 08:03:39 +0200
  • | |\ \ \ \ \
  • | | * | | | | 48a1364e1 (refs/pull/2792/head) Add aliases for armv6, armv7 by pkubaj 2020-08-23 18:50:19 +0000
  • | |/ / / / /
  • | | * / / / 0c1c903f1 (refs/pull/2775/head) Fix OMP num specify issue by Chen, Guobing 2020-08-12 03:28:25 +0800
  • | |/ / / /
  • | * | | | a073fa870 Merge pull request #2791 from martin-frbg/issue2787 by Martin Kroeker 2020-08-23 19:33:03 +0200
  • | |\ \ \ \
  • | | * | | | b2053239f (refs/pull/2791/head) Fix mssing dummy parameter (imag part of alpha) of zdot_thread_function by Martin Kroeker 2020-08-23 15:08:16 +0200
  • | |/ / / /
  • |/| | | |
  • | * | | | b11bb6e72 Merge pull request #2790 from martin-frbg/issue2789 by Martin Kroeker 2020-08-23 14:42:35 +0200
  • | |\ \ \ \
  • | | * | | | 1840bc5b5 (refs/pull/2790/head) Add OpenMP dependency to pkgconfig file if needed by Martin Kroeker 2020-08-22 13:55:18 +0200
  • | | * | | | 7c0977c26 Add OpenMP dependency to pkgconfig file if needed by Martin Kroeker 2020-08-22 13:53:44 +0200
  • | |/ / / /
  • |/| | | |
  • * | | | | fb3d80c42 Merge pull request #78 from xianyi/develop by Martin Kroeker 2020-08-22 13:52:29 +0200
  • |\| | | |
  • | * | | | 9ee21a0a3 Merge pull request #2780 from Guobing-Chen/CPL_build_support by Martin Kroeker 2020-08-20 19:54:29 +0200
  • | |\ \ \ \
  • | | |_|/ /
  • | |/| | |
  • | | | | | * 35557ec92 (refs/pull/2788/head) Add R benchmarks at higher core counts by Martin Kroeker 2020-08-20 16:42:27 +0200
  • | |_|_|_|/
  • |/| | | |
  • | | * | | bd3207b4b (refs/pull/2780/head) Update system.cmake by Martin Kroeker 2020-08-19 22:51:10 +0200
  • | | * | | b8ebfc933 Update system.cmake by Martin Kroeker 2020-08-19 22:30:19 +0200
  • | | * | | 7c1986640 fallback from cooperlake to skylake if gcc<10 by Martin Kroeker 2020-08-19 20:48:39 +0200
  • | | * | | 71d33c952 Typo fix by Martin Kroeker 2020-08-19 17:44:23 +0200
  • | | * | | 6a3c07478 -march=cooperlake requires gcc10 by Martin Kroeker 2020-08-19 17:22:12 +0200
  • | | * | | 430f741b3 -march=cooperlake requires gcc10 by Martin Kroeker 2020-08-19 17:17:53 +0200
  • | | * | | 6f4dc7445 Fix typo by Martin Kroeker 2020-08-19 16:36:55 +0200
  • | | * | | 81fbe8d08 -march=cooperlake only available in gcc >= 10 by Martin Kroeker 2020-08-19 16:10:15 +0200
  • | | * | | bb9cf766f make march=cooperlake option conditional on gcc >= 10.1 by Martin Kroeker 2020-08-19 15:06:30 +0200
  • | * | | | 75eeb265d [WIP] Refactor the driver code for direct SGEMM (#2782) by Martin Kroeker 2020-08-19 14:51:09 +0200
  • | * | | | 2c7297257 Merge pull request #2785 from albertziegenhagel/always-generate-pkg-config by Martin Kroeker 2020-08-19 14:42:58 +0200
  • | |\ \ \ \
  • | | | | | | * 416ee2602 (refs/pull/2782/head) revert the unrelated drone.io CI config change by Martin Kroeker 2020-08-18 16:17:19 +0200
  • | | | | | | * a7fc14c50 Limit direct sgemm to x86_64 by Martin Kroeker 2020-08-18 14:13:15 +0200
  • | | | | | | * b86214e43 Limit direct sgemm to x86_64 by Martin Kroeker 2020-08-18 14:12:19 +0200
  • | | | | | | * 1ba18212d Update common_s.h by Martin Kroeker 2020-08-18 09:36:59 +0200
  • | | * | | | | 6b731d917 (refs/pull/2785/head) Do not require pkg-config to generate the *.pc file by Albert Ziegenhagel 2020-08-18 08:48:48 +0200
  • | |/ / / / /
  • | | | | | * 5a0e9e8de Update setparam-ref.c by Martin Kroeker 2020-08-17 22:38:02 +0200
  • | | | | | * e46d761bc Update setparam-ref.c by Martin Kroeker 2020-08-17 22:16:20 +0200
  • | | | | | * 6c279ef55 Update setparam-ref.c by Martin Kroeker 2020-08-17 21:55:54 +0200
  • | | | | | * 7996458ea Update common_s.h by Martin Kroeker 2020-08-17 20:06:59 +0200
  • | * | | | | 5dcf47cd9 Merge pull request #2784 from martin-frbg/issue2783 by Martin Kroeker 2020-08-17 19:06:13 +0200
  • | |\ \ \ \ \
  • | | | | | | * 7fe38daee use macros for sgemm_direct to support dynamic_arch naming via common_s,h by Martin Kroeker 2020-08-17 18:56:05 +0200