Commit Graph

  (Branch and merge topology graph; its rows were detached from the commit entries during extraction. The commits are listed below.)
  • ee90f3038 Increase BUFFERSIZE for POWER8-10 and use same value for POWER6 by Martin Kroeker 2020-10-22 18:47:07 +0200
  • 2e48d560b (refs/pull/2936/head) Fix compiler version check by Martin Kroeker 2020-10-22 16:23:29 +0200
  • ab7f46646 Merge pull request #106 from xianyi/develop by Martin Kroeker 2020-10-22 16:21:09 +0200
  • f95031204 (refs/pull/2935/head) Fix macro used in argument conversion (LAPACK PR 458) by Martin Kroeker 2020-10-22 16:19:26 +0200
  • 909068fac Merge pull request #2932 from RajalakshmiSR/copyp10 by Martin Kroeker 2020-10-22 00:29:46 +0200
  • 5b7438fdd Merge pull request #2934 from thrasibule/improve_version_check by Martin Kroeker 2020-10-22 00:29:02 +0200
  • 47696b43e (refs/pull/2934/head) actually check that version is greater than 4.7 by Guillaume Horel 2020-10-21 16:42:37 -0400
  • ad745c0ba (refs/pull/2932/head) Optimize scopy/ccopy for POWER10 by Rajalakshmi Srinivasaraghavan 2020-10-21 09:53:45 -0500
  • 17c46bf06 Merge pull request #2930 from ismail/fix-no-return by Martin Kroeker 2020-10-21 11:43:01 +0200
  • 28242096c Merge pull request #2928 from martin-frbg/issue2917 by Martin Kroeker 2020-10-21 10:11:02 +0200
  • 4a1d00f58 (refs/pull/2930/head) Fix build with -Werror=return-type dgemm_tcopy_16_skylakex.c CNAME function should return an int, add a return 0 similar to other files. by İsmail Dönmez 2020-10-21 08:43:39 +0200
  • 00813363b (refs/pull/2928/head) Enable -mavx2 for flang as well by Martin Kroeker 2020-10-20 23:56:30 +0200
  • 336e35469 Merge pull request #105 from xianyi/develop by Martin Kroeker 2020-10-20 23:48:53 +0200
  • 29668458f Merge pull request #2925 from martin-frbg/issue2911-2 by Martin Kroeker 2020-10-20 11:27:36 +0200
  • ee83e2904 Merge pull request #2926 from bartoldeman/vzeroupper-clobber-all by Martin Kroeker 2020-10-20 09:24:47 +0200
  • 1a0f57c8f (refs/pull/2925/head) Fix missing backquotes by Martin Kroeker 2020-10-20 08:37:53 +0200
  • b073d759d (refs/pull/2926/head) x86_64: clobber all xmm registers after vzeroupper by Bart Oldeman 2020-10-20 02:16:47 +0000
  • eddc65c7b Add POWER10 support flag (unconditionally for now) by Martin Kroeker 2020-10-20 01:09:49 +0200
  • bb8c3f686 Add ld/binutils version check for POWER10 support by Martin Kroeker 2020-10-20 01:04:20 +0200
  • ff65952e4 Move HAVE_P10_SUPPORT to the build system by Martin Kroeker 2020-10-20 00:55:41 +0200
  • 6208c9899 Merge pull request #104 from xianyi/develop by Martin Kroeker 2020-10-20 00:52:08 +0200
  • 8e20ab21c Merge pull request #2924 from martin-frbg/issue2920 by Martin Kroeker 2020-10-19 23:33:45 +0200
  • dc6e44c3f Merge pull request #2916 from martin-frbg/issue2911 by Martin Kroeker 2020-10-19 23:33:31 +0200
  • 4ad33c46b (refs/pull/2924/head) Add back symbols that got dropped when splitting by type by Martin Kroeker 2020-10-19 20:37:52 +0200
  • fe2a922ad (refs/pull/2916/head) Add POWER10 compiler options to CCOMMON_OPT rather than COMMON_OPT by Martin Kroeker 2020-10-19 17:43:53 +0200
  • 9cac37965 Merge pull request #103 from xianyi/develop by Martin Kroeker 2020-10-19 15:56:20 +0200
  • a61c08640 Fix spurious trailing whitespace in comment by Martin Kroeker 2020-10-19 09:12:12 +0200
  • 5b9ebe4f8 Merge pull request #2919 from isuruf/export by Martin Kroeker 2020-10-19 08:14:27 +0200
  • 7eddaf0d6 Remove -mmma again (reduntant with cpu=power10) and add override statements by Martin Kroeker 2020-10-19 08:11:22 +0200
  • 14b1d3393 (refs/pull/2919/head) Fix exporting some lapack and cblas by Isuru Fernando 2020-10-18 21:42:32 -0500
  • 77669b019 Merge pull request #2915 from bartoldeman/no-empty_sgemm_direct_skylakex by Martin Kroeker 2020-10-19 00:09:54 +0200
  • 5e8ddc900 Merge pull request #2913 from martin-frbg/issue2910 by Martin Kroeker 2020-10-18 23:04:56 +0200
  • 03e781b76 (refs/pull/2915/head) sgemm_direct_skylakex: fix 75eeb26 regression. by Bart Oldeman 2020-10-18 19:50:38 +0000
  • f1a4071d8 Clean up STACKSIZE redefinition by Martin Kroeker 2020-10-18 19:41:43 +0200
  • 97cf10062 Clean up STACKSIZE redefinition by Martin Kroeker 2020-10-18 19:39:18 +0200
  • 17e288e18 Clean up STACKSIZE redefinition by Martin Kroeker 2020-10-18 19:37:04 +0200
  • c1422f3e4 Clean up STACKSIZE redefinition by Martin Kroeker 2020-10-18 19:31:01 +0200
  • d85b24e10 Clean up STACKSIZE redefinition by Martin Kroeker 2020-10-18 19:29:45 +0200
  • 7d6c85f9d Add compiler option -mmma for POWER10 by Martin Kroeker 2020-10-18 19:27:51 +0200
  • 2e7ee7c71 (refs/pull/2913/head) Fix naming of L2 cache size item reported for Vortex by Martin Kroeker 2020-10-18 19:22:05 +0200
  • efd47b010 Merge pull request #2909 from isuruf/patch-1 by Martin Kroeker 2020-10-18 19:16:08 +0200
  • f5902ab0a Support cross-compiling for Apple Vortex by Martin Kroeker 2020-10-18 19:10:58 +0200
  • bf1f1c66b (refs/pull/2912/head) VORTEX by Isuru Fernando 2020-10-18 12:08:35 -0500
  • 1a0c18512 Support cross-compiling for Apple Vortex by Martin Kroeker 2020-10-18 18:54:54 +0200
  • 89eea6b45 Merge pull request #102 from xianyi/develop by Martin Kroeker 2020-10-18 18:49:59 +0200
  • a5c667b55 (refs/pull/2909/head) Need a space when redirecting to file by Isuru Fernando 2020-10-18 09:40:31 -0500
  • 0ac610270 Update version string to 0.3.11.dev by Martin Kroeker 2020-10-17 22:40:47 +0200
  • 26a701f4a Update version string to 0.3.11.dev by Martin Kroeker 2020-10-17 22:40:06 +0200
  • fcd0fa1a3 Merge pull request #2908 from xianyi/release-0.3.0 by Martin Kroeker 2020-10-17 22:38:58 +0200
  • 51c22612e (tag: v0.3.11, refs/pull/2908/head) Merge pull request #2907 from xianyi/develop by Martin Kroeker 2020-10-17 22:14:12 +0200
  • b8f689200 (refs/pull/2907/head) Update version number to 0.3.11 by Martin Kroeker 2020-10-17 22:11:34 +0200
  • fe9015b61 Update version for 0.3.11 release by Martin Kroeker 2020-10-17 22:10:50 +0200
  • f99b8c150 Merge pull request #2906 from martin-frbg/changelog-0311 by Martin Kroeker 2020-10-17 22:07:14 +0200
  • 5381a1805 (refs/pull/2906/head) Update Changelog.txt with the 0.3.11 changes by Martin Kroeker 2020-10-17 22:05:36 +0200
  • e35576c6f Merge pull request #2905 from martin-frbg/aocc-clang by Martin Kroeker 2020-10-17 09:45:22 +0200
  • f1bb85d37 (refs/pull/2905/head) Add AVX flags for clang/aocc as well by Martin Kroeker 2020-10-16 20:52:15 +0200
  • 25907e672 Merge pull request #101 from xianyi/develop by Martin Kroeker 2020-10-16 20:48:58 +0200
  • d7ba7679b Merge branch 'develop' into risc-v by Zhang Xianyi 2020-10-16 23:27:38 +0800
  • 978937538 Merge pull request #2900 from martin-frbg/fixcmake_sse by Martin Kroeker 2020-10-16 16:17:36 +0200
  • 0eda7ac2c (refs/pull/2903/head) Merge 'origin/release-0.3.0' into develop to get the 0.3.10 tag by mattip 2020-10-16 13:15:43 +0300
  • f64243ff5 (refs/pull/2900/head) Add compiler options for sse/sse2/ssse3/sse4.1 by Martin Kroeker 2020-10-16 10:47:06 +0200
  • 786c0a3ce Add sse options for use of intrinics with older compilers by Martin Kroeker 2020-10-16 10:41:53 +0200
  • df7066704 fix core list for sse/sse2 by Martin Kroeker 2020-10-16 09:55:48 +0200
  • e6c5b13a1 Merge pull request #2898 from martin-frbg/morefixes by Martin Kroeker 2020-10-16 07:26:39 +0200
  • f071d1207 (refs/pull/2898/head) add sse2 by Martin Kroeker 2020-10-15 22:10:32 +0200
  • dc6cefd2f Expressly enable -msse for 32bit DYNAMIC_ARCH kernels by Martin Kroeker 2020-10-15 20:16:15 +0200
  • c339c40c0 Silence a redefinition warning by Martin Kroeker 2020-10-15 19:08:12 +0200
  • ac8af9cec Add -msse where supported, apparently required for older gcc by Martin Kroeker 2020-10-15 19:06:45 +0200
  • 10379fc83 Use ifdef instead of if by Martin Kroeker 2020-10-15 19:05:37 +0200
  • a85ac7163 Merge pull request #100 from xianyi/develop by Martin Kroeker 2020-10-15 18:54:20 +0200
  • 4c25910da Merge pull request #2896 from martin-frbg/intrin-double by Martin Kroeker 2020-10-15 11:12:35 +0200
  • ef8e7d027 (refs/pull/2899/head) Add the support for RISC-V Vector. by damonyu 2020-10-15 16:05:37 +0800
  • 9b9ee92d5 Merge pull request #2897 from Qiyu8/usimd-double by Martin Kroeker 2020-10-15 08:38:24 +0200
  • ae6ac8399 (refs/pull/2896/head) Revert "add double precision SSE" by Martin Kroeker 2020-10-15 08:37:02 +0200
  • 4fac91ef3 (refs/pull/2897/head) adapt arm platform by Qiyu8 2020-10-15 11:08:10 +0800
  • bfdf4b56d Add double precision universal intrinsics for X86/ARM by Qiyu8 2020-10-15 10:29:42 +0800
  • ebf0470fc add sse4.1 for DYNAMIC_ARCH kernels by Martin Kroeker 2020-10-14 20:34:33 +0200
  • ca160bb44 Add -msse4.1 when SSE4.1 is supported by Martin Kroeker 2020-10-14 19:18:07 +0200
  • c9c3ae07a Add double precision operations by Martin Kroeker 2020-10-14 18:10:45 +0200
  • a897bc3bd Merge pull request #99 from xianyi/develop by Martin Kroeker 2020-10-14 18:09:20 +0200
  • 756802df6 Merge pull request #2890 from martin-frbg/s-d-sum by Martin Kroeker 2020-10-14 09:02:03 +0200
  • 01492decf Merge pull request #2895 from martin-frbg/sb-tests by Martin Kroeker 2020-10-14 09:01:16 +0200
  • bd0752444 Merge pull request #2894 from RajalakshmiSR/bf16_packing by Martin Kroeker 2020-10-14 08:12:08 +0200
  • c1f4f5d4e (refs/pull/2895/head) Replace Makefile with simplified version again by Martin Kroeker 2020-10-14 01:08:50 +0200
  • 75e3a92df (refs/pull/2890/head) Add express -mavx and -msse options (and fix a stray = for cooperlake) by Martin Kroeker 2020-10-14 01:01:58 +0200
  • 2a329baa8 Add the BFLOAT16 functions to cmake builds by Martin Kroeker 2020-10-13 23:21:38 +0200
  • 0826d68f9 (refs/pull/2894/head) POWER10: Change the packing format for bfloat16 by Rajalakshmi Srinivasaraghavan 2020-10-13 16:05:10 -0500
  • 4bb73c017 Rename "HALF" type to "BFLOAT16" by Martin Kroeker 2020-10-13 20:07:19 +0200
  • bc5c7f957 Cleanup by Martin Kroeker 2020-10-13 19:56:09 +0200
  • 437b7fe26 sh prefix renamed to sb by Martin Kroeker 2020-10-13 19:55:14 +0200
  • a0ada4bcb Merge pull request #98 from xianyi/develop by Martin Kroeker 2020-10-13 18:50:30 +0200
  • 602a0c7a6 Merge pull request #2892 from RajalakshmiSR/bf16_make by Martin Kroeker 2020-10-13 18:48:37 +0200
  • b5d30b390 (refs/pull/2892/head) Fix build issues with bfloat16 by Rajalakshmi Srinivasaraghavan 2020-10-13 11:00:22 -0500
  • 137ae618d Fix typo by Martin Kroeker 2020-10-13 15:02:17 +0200
  • 9e3cff5cf Expressly enable -mavx2 on Zen, SkylakeX and Cooperlake as well by Martin Kroeker 2020-10-13 14:41:25 +0200
  • d85b96842 Merge pull request #2891 from martin-frbg/fix-2886 by Martin Kroeker 2020-10-13 13:46:17 +0200
  • 5f60a32ca Add -mssse3 if supported by the hardware by Martin Kroeker 2020-10-13 11:57:04 +0200
  • fecedc9c6 Add -mssse3 by Martin Kroeker 2020-10-13 11:55:41 +0200
  • 0eacbca85 Add Haswell and Zen to temporary sse3 whitelist by Martin Kroeker 2020-10-13 11:42:39 +0200
  • 6999086a2 whitelist SANDYBRIDGE for SSE3 by Martin Kroeker 2020-10-13 10:32:19 +0200