Commit Graph

  • 9dca578c7 (refs/pull/2891/head) Cleanup by Martin Kroeker 2020-10-13 10:14:08 +0200
  • 1e7eb7b7a Fix typos in currently unused sections by Martin Kroeker 2020-10-13 09:17:15 +0200
  • 84949754a Fix bfloat16 conditional by Martin Kroeker 2020-10-13 09:11:36 +0200
  • 2ae878560 Add a POWER9 build with BFLOAT16 enabled by Martin Kroeker 2020-10-13 09:07:50 +0200
  • e05af6575 Fix some overlooked "SHBLAS" entries by Martin Kroeker 2020-10-13 09:05:04 +0200
  • c1643006a Merge pull request #97 from xianyi/develop by Martin Kroeker 2020-10-13 09:01:49 +0200
  • 8d2df7d06 Revert special handling of Windows xNRM2 and enable C+intrinsics kernel for SSUM/DSUM by Martin Kroeker 2020-10-13 00:14:29 +0200
  • 08929430c Merge pull request #2886 from martin-frbg/issue_2767 by Martin Kroeker 2020-10-13 00:04:35 +0200
  • 0c84ffe05 Merge pull request #2881 from mattip/fninit by Martin Kroeker 2020-10-12 23:50:41 +0200
  • 36bd6ba6c (refs/pull/2887/head) Use the new universal intrinsics for s/dSUM across all platforms, and generic C c/zSUM on Windows by Martin Kroeker 2020-10-12 23:45:49 +0200
  • cb4274e3a Merge pull request #2888 from Qiyu8/usimd-sum by Martin Kroeker 2020-10-12 23:22:08 +0200
  • fac9afe64 (refs/pull/2889/head) Reset the FPU stack on Windows to work around a bug in Windows10.19041 by Martin Kroeker 2020-10-12 19:04:01 +0200
  • 403eb513a (refs/pull/2881/head) use emms instead, add WIN guards by Matti Picus 2020-10-12 18:15:01 +0300
  • cb839575e (refs/pull/2886/head) Convert the prototypes of the unimplemented BFLOAT16 functions to the new naming scheme by Martin Kroeker 2020-10-12 14:44:33 +0200
  • 0ed1f0766 (refs/pull/2888/head) Optimize the performance of sum by using universal intrinsics by Qiyu8 2020-10-12 19:48:53 +0800
  • 600054b0a Use generic kernels for xSUM on Windows by Martin Kroeker 2020-10-12 08:24:51 +0200
  • bb74dd29d Restore -msse3 by Martin Kroeker 2020-10-12 00:42:05 +0200
  • 629c497b6 common_sh.h renamed to common_sb.h by Martin Kroeker 2020-10-12 00:27:11 +0200
  • 2c552f107 Change "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-12 00:11:31 +0200
  • 7ae9e8960 Change "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-12 00:08:29 +0200
  • e3a29f6b5 Change "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-12 00:07:37 +0200
  • 006c7f667 Change "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-12 00:06:06 +0200
  • 85154c2e1 Change "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-12 00:05:05 +0200
  • ae1ab5bfd Change "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-12 00:03:21 +0200
  • 052f31bc3 Change "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-12 00:02:16 +0200
  • 3aecafad8 Change "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-12 00:00:55 +0200
  • 756062afa Rename "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-11 23:56:17 +0200
  • 2061f7fdf Rename "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-11 23:54:53 +0200
  • dc8a1afa6 Rename "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-11 23:53:50 +0200
  • 32733ded0 Rename "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-11 23:52:45 +0200
  • 3bc8e8c33 Rename "HALF" and "sh" to "BFLOAT16"and "sb" by Martin Kroeker 2020-10-11 23:51:34 +0200
  • 573508f0e Rename common_sh.h to common_sb.h by Martin Kroeker 2020-10-11 23:50:54 +0200
  • ca31c3269 Rename "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-11 23:49:22 +0200
  • 5800758b4 Rename "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-11 23:44:38 +0200
  • 924fd806d Rename "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-11 23:43:36 +0200
  • 4db09c6ce Rename compare_sgemm_shgemm.c to compare_sgemm_sbgemm.c by Martin Kroeker 2020-10-11 23:42:45 +0200
  • fd9423604 Rename "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-11 23:42:07 +0200
  • 68ce719fa Rename shdot_microk_cooperlake.c to sbdot_microk_cooperlake.c by Martin Kroeker 2020-10-11 23:41:13 +0200
  • d7dd9b396 Rename shdot.c to sbdot.c by Martin Kroeker 2020-10-11 23:40:43 +0200
  • 9ae80490e rename "HALF" and "sh" to "BFLOAT16" and "sb" by Martin Kroeker 2020-10-11 23:39:42 +0200
  • d314d1f49 Rename shgemm_kernel_power10.c to sbgemm_kernel_power10.c by Martin Kroeker 2020-10-11 23:37:38 +0200
  • f0883740e Merge pull request #96 from xianyi/develop by Martin Kroeker 2020-10-11 23:34:36 +0200
  • 1c0b03efb Merge branch 'develop' into develop by Martin Kroeker 2020-10-11 23:34:14 +0200
  • c589c3e2a Merge pull request #2882 from martin-frbg/issue2709 by Martin Kroeker 2020-10-11 22:22:30 +0200
  • ec638a82b Merge pull request #2852 from martin-frbg/issue2588-cmake by Martin Kroeker 2020-10-11 22:21:33 +0200
  • caa0d757c (refs/pull/2852/head) repair TABs by Martin Kroeker 2020-10-11 18:29:34 +0200
  • 6154f72d6 Copy BUILD_ settings to the LAPACK make.inc by Martin Kroeker 2020-10-11 18:25:16 +0200
  • ae8b0d257 Set BUILD_ options to 1 instead of just defining them by Martin Kroeker 2020-10-11 18:08:21 +0200
  • 1da32cc1f Add cblas_xerbla interface by Martin Kroeker 2020-10-11 17:45:41 +0200
  • 8c5e08076 If none of the BUILD_ options is set, enable them all by Martin Kroeker 2020-10-11 17:33:51 +0200
  • 5f23bdf43 remove debug output by Martin Kroeker 2020-10-11 17:23:08 +0200
  • b593e6b65 Merge pull request #2885 from martin-frbg/ifexists by Martin Kroeker 2020-10-11 15:45:24 +0200
  • 082c86a53 Merge pull request #2884 from martin-frbg/sse_fixup by Martin Kroeker 2020-10-11 15:14:03 +0200
  • e396ec8b5 Allow building support for only a subset of variable types by Martin Kroeker 2020-10-11 15:11:15 +0200
  • 68e6823d3 Adapt for supporting only a subset of variable types by Martin Kroeker 2020-10-11 15:01:32 +0200
  • 887e00fd7 Adapt for supporting only a subset of variable types by Martin Kroeker 2020-10-11 14:58:57 +0200
  • 886a8e319 Adapt for supporting only a subset of variable types by Martin Kroeker 2020-10-11 14:57:32 +0200
  • 0f7d73ff6 Allow supporting only a subset of variable types by Martin Kroeker 2020-10-11 14:53:26 +0200
  • 6b6adf8a4 Allow compiling only a subset of kernels for specific variable types by Martin Kroeker 2020-10-11 14:52:09 +0200
  • a6570108c Add Makefile support for enabling only some variable types by Martin Kroeker 2020-10-11 14:49:58 +0200
  • ef552bc57 Add Makefile support for enabling only some variable types by Martin Kroeker 2020-10-11 14:49:06 +0200
  • efe1ad470 Add Makefile support for enabling only some variable types by Martin Kroeker 2020-10-11 14:48:23 +0200
  • b27ca78a2 Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:46:24 +0200
  • 93454022a Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:45:40 +0200
  • 20cf1d773 Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:44:56 +0200
  • 5c657fffa Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:44:13 +0200
  • b26205805 Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:43:13 +0200
  • bc319cee8 Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:42:26 +0200
  • e5966f860 Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:41:43 +0200
  • 9df12eb08 Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:40:51 +0200
  • cf53970bc Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:40:06 +0200
  • dcd51d5c7 Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:39:19 +0200
  • b8f95354c Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:38:25 +0200
  • d33de97d6 Adapt to having only a subset of variable types supported by Martin Kroeker 2020-10-11 14:36:45 +0200
  • 6a83c591d Adapt for having only a subset of variable types by Martin Kroeker 2020-10-11 14:34:12 +0200
  • f6d2827d0 Adapt ctests to having only a subset of types in the build by Martin Kroeker 2020-10-11 14:32:00 +0200
  • 08f4749eb Adapt tests to having only a subset of types in the build by Martin Kroeker 2020-10-11 14:25:24 +0200
  • 63d7dad04 Adapt utests for builds supportin only some variable types by Martin Kroeker 2020-10-11 14:15:35 +0200
  • ac653c94f Merge branch 'develop' into issue2588-cmake by Martin Kroeker 2020-10-11 13:57:07 +0200
  • 190b74dd2 Add files via upload by Martin Kroeker 2020-10-11 13:26:05 +0200
  • 9d43140d6 (refs/pull/2885/head) Improve check for conflicting config_kernel.h by Martin Kroeker 2020-10-11 12:58:17 +0200
  • 8ef600f1a Merge pull request #95 from xianyi/develop by Martin Kroeker 2020-10-11 12:53:18 +0200
  • 88928650c Merge pull request #2883 from martin-frbg/issue2872 by Martin Kroeker 2020-10-11 10:30:33 +0200
  • 7a5312848 (refs/pull/2884/head) Add whitelist of DYNAMIC_ARCH kernels for which -msse3 needs to be enabled by Martin Kroeker 2020-10-11 01:06:46 +0200
  • 0c773b820 Do not rely on HAVE_SSE3 in DYNAMIC_ARCH builds by Martin Kroeker 2020-10-11 01:04:57 +0200
  • fbda20c85 Merge pull request #94 from xianyi/develop by Martin Kroeker 2020-10-11 01:03:00 +0200
  • 82a497ec5 (refs/pull/2883/head) restore PRESCOTT default for DYNAMIC_LIST by Martin Kroeker 2020-10-11 00:43:09 +0200
  • de27e4f5f Stop DYNAMIC_ARCH build if the toplevel source contains a stray config_kernel.h from a gmake build by Martin Kroeker 2020-10-11 00:40:22 +0200
  • e1b7123bb Merge pull request #2867 from Qiyu8/usimd-floatdot by Martin Kroeker 2020-10-10 12:10:25 +0200
  • f32d34a01 (refs/pull/2867/head) add sse3 compiler flag by Qiyu8 2020-10-10 10:36:15 +0800
  • 599777ecb Merge pull request #2879 from martin-frbg/issue2839 by Martin Kroeker 2020-10-06 23:26:52 +0200
  • 781248609 (refs/pull/2882/head) Use generic C for D/Z nrm2 kernels on Windows to work around fpu exception bug by Martin Kroeker 2020-10-06 21:33:16 +0200
  • a5b164946 add fninit to reset fpu registers before assembler routines by Matti Picus 2020-10-05 22:13:25 +0300
  • a5feea661 (refs/pull/2879/head) make BLAS3_MEM_ALLOC_THRESHOLD configurable on non-Windows by Martin Kroeker 2020-10-04 23:01:06 +0200
  • dc8e4e195 Reduce the BLAS3 heap allocation threshold to 32 and mark it as configurable by Martin Kroeker 2020-10-04 22:59:24 +0200
  • cccd1438d Merge pull request #93 from xianyi/develop by Martin Kroeker 2020-10-04 22:57:11 +0200
  • f032d8966 Merge pull request #2874 from Flamefire/memory_fixes by Martin Kroeker 2020-10-04 15:16:51 +0200
  • f6e4cf2f9 Merge pull request #2876 from Flamefire/omp_fork_fix by Martin Kroeker 2020-10-03 22:52:17 +0200
  • 9828343e1 Merge pull request #2878 from brada4/asms by Martin Kroeker 2020-10-03 22:51:49 +0200
  • d2333e784 (refs/pull/2878/head) aarch64 fix std=c18 compilation by User User-User 2020-10-03 18:00:34 +0300