Commit Graph

  • * 0228d3621 move -fopenmp to CFLAGS by Martin Kroeker 2024-09-30 21:38:05 +0200
  • | * 7087b0a7d ARM64: Enable SMALL_MATRIX_OPT when compiling with CMake by gxw 2024-09-29 10:31:26 +0800
  • | | * 30af9278d (refs/pull/4904/head) LoongArch64: Enable cmake cross-compilation by gxw 2024-09-26 16:55:06 +0800
  • | |/
  • | | * 48698b2b1 (refs/pull/4900/head) LoongArch64: Rename core by gxw 2024-09-18 17:20:43 +0800
  • | |/
  • | | * c8788208c Fixing block issue with transpose version. by Chip Kerchner 2024-09-27 13:27:03 -0500
  • | | * d7c0d87cd Small changes. by Chip Kerchner 2024-09-26 15:21:29 -0500
  • | | * eb6f3a05e Common MMA code. by Chip Kerchner 2024-09-26 09:28:56 -0500
  • | | * fb287d17f Common code. by Chip Kerchner 2024-09-25 16:31:36 -0500
  • | | * 8ab624577 Small change. by Chip Kerchner 2024-09-24 16:50:21 -0500
  • | | * df1937556 Almost final code for MMA. by Chip Kerchner 2024-09-24 16:30:01 -0500
  • | | * 05aa63e73 More MMA BF16 GEMV code. by Chip Kerchner 2024-09-24 12:54:02 -0500
  • | | * c9ce37d52 Force vector pairs in clang. by Chip Kerchner 2024-09-23 08:43:58 -0500
  • | | * 89a12fa08 MMA BF16 GEMV code. by Chip Kerchner 2024-09-23 06:32:14 -0500
  • | | | * e9824ae79 deploy: 92f7a2dc3e by martin-frbg 2024-09-19 12:15:38 +0000
  • * | | | 92f7a2dc3 Merge pull request #4899 from martin-frbg/flangmtune by Martin Kroeker 2024-09-19 14:15:06 +0200
  • |\ \ \ \
  • | |/ / /
  • |/| | |
  • | * | | 969bb949b (refs/pull/4899/head) Strip any mtune option from FFLAGS is the compiler is flang-new by Martin Kroeker 2024-09-19 11:10:28 +0200
  • |/ / /
  • | | * 30733e7d6 deploy: fca86e359c by martin-frbg 2024-09-16 09:17:50 +0000
  • * | | fca86e359 Merge pull request #4887 from goplanid/develop by Martin Kroeker 2024-09-16 11:17:19 +0200
  • |\ \ \
  • | | * | 7947970f9 Move common code. by Chip Kerchner 2024-09-13 06:22:13 -0500
  • * | | | 60c1519e0 Merge pull request #4896 from martin-frbg/update_azure_mac_hpc by Martin Kroeker 2024-09-12 21:09:28 +0200
  • |\ \ \ \
  • * \ \ \ \ c8313d9d8 Merge pull request #4895 from martin-frbg/update_homebrewjob by Martin Kroeker 2024-09-12 21:09:10 +0200
  • |\ \ \ \ \
  • | | * | | | b588e922a (refs/pull/4896/head) Update oneAPI download location for Mac to final by Martin Kroeker 2024-09-12 18:13:46 +0200
  • | |/ / / /
  • |/| | | |
  • | * | | | 4178905fa (refs/pull/4895/head) Update version of upload-artifacts following deprecation by Martin Kroeker 2024-09-12 16:39:20 +0200
  • |/ / / /
  • | | | * 70ea109d6 deploy: 5f70e245a2 by martin-frbg 2024-09-12 13:10:29 +0000
  • * | | | 5f70e245a Merge pull request #4894 from martin-frbg/issue4893 by Martin Kroeker 2024-09-12 15:09:54 +0200
  • |\ \ \ \
  • | * | | | 383e0b133 (refs/pull/4894/head) remove suppression of gcc14's incompatible pointer error by Martin Kroeker 2024-09-11 22:21:09 +0200
  • | * | | | 869a169c5 Fix ZAXPYTEST prototype by Martin Kroeker 2024-09-11 22:18:14 +0200
  • |/ / / /
  • | | * | 72216d28c Fix bug with inc_y adding results twice. by Chip Kerchner 2024-09-11 08:47:32 -0500
  • | | * | 2f142ee85 More common code. by Chip Kerchner 2024-09-09 14:41:55 -0500
  • | | * | 39fd29f1d Minor improvement and turn off BF16 GEMV forwarding by default. by Chip Kerchner 2024-09-08 18:28:31 -0500
  • | | * | 8541b25e1 Special case beta is one. by Chip Kerchner 2024-09-06 14:48:48 -0500
  • | | * | 76227e294 Initial commit for vectorized BF16 GEMV. Added GEMM_GEMV_FORWARD_BF16 to enable using BF16 GEMV for one dimension matrices. Updated unit test to support inc_x != 1 or inc_y for GEMV. by Chip Kerchner 2024-09-06 14:03:31 -0500
  • | |/ /
  • |/| |
  • | * | 4894c5405 (refs/pull/4887/head) Improve TN case with further unrolling by Deeksha Goplani 2024-09-02 22:22:49 +0530
  • |/ /
  • | | * 060c86351 (refs/pull/4885/head) BLD: Add Windows build by Mateusz Sokół 2024-08-25 18:10:15 +0000
  • | | * 6ce99e314 MAINT: Add a configuration for meson format by Rohit Goswami 2024-08-17 16:37:15 -0500
  • | | * 3f9ffecf8 MAINT: Fixup hardcoded build folder by Rohit Goswami 2024-08-17 16:32:41 -0500
  • | * | 4ee4873c2 deploy: 485027563e by martin-frbg 2024-08-17 09:47:57 +0000
  • * | | 485027563 Merge pull request #4883 from ChipKerchner/fixSGEMMUnitTestZeroSize by Martin Kroeker 2024-08-17 11:47:26 +0200
  • |\ \ \
  • | * | | 89702e1f4 (refs/pull/4883/head) Fix zero element GEMV test. by Chip Kerchner 2024-08-16 11:37:39 -0500
  • | * | | 77f85c7c0 GEMV tests don't like zero elements. by Chip Kerchner 2024-08-16 11:15:32 -0500
  • | * | | 868aa857b Change malloc zero to return one byte and update the SBGEMM test to again use sizes of zero. by Chip Kerchner 2024-08-16 10:28:10 -0500
  • | * | | b1802f4dc Fix unit test to start at 1 instead of 0 - since malloc zero bytes fails on some systems. by Chip Kerchner 2024-08-16 09:51:37 -0500
  • * | | | f61930eb1 Merge pull request #4882 from martin-frbg/issue4805-3 by Martin Kroeker 2024-08-16 11:24:51 +0200
  • |\ \ \ \
  • | * | | | dfba3f884 (refs/pull/4882/head) restore the pragma as it is reportedly still needed on 3C6000/gcc14.2 by Martin Kroeker 2024-08-16 11:23:19 +0200
  • |/ / / /
  • | | * | 54b868f71 deploy: 7129a64d87 by martin-frbg 2024-08-16 06:47:47 +0000
  • * | | | 7129a64d8 Merge pull request #4881 from martin-frbg/issue4805-2 by Martin Kroeker 2024-08-16 08:47:12 +0200
  • |\ \ \ \
  • | * | | | 49080b631 (refs/pull/4881/head) remove optimizer pragma again by Martin Kroeker 2024-08-15 22:15:27 +0200
  • | * | | | e05d98d00 expressly use fld.d/fst.d for floating point registers instead of LD/ST macros by Martin Kroeker 2024-08-15 22:14:29 +0200
  • |/ / / /
  • * | | | 3ee9e9d8d Merge pull request #4879 from martin-frbg/issue4868-2 by Martin Kroeker 2024-08-15 22:06:54 +0200
  • |\ \ \ \
  • * \ \ \ \ dd71df8fa Merge pull request #4880 from ChipKerchner/betterPowerGEMVTail by Martin Kroeker 2024-08-15 20:36:22 +0200
  • |\ \ \ \ \
  • * \ \ \ \ \ a8d6b0219 Merge pull request #4877 from XiWeiGu/fixed_undefined_blas_set_parameter by Martin Kroeker 2024-08-15 15:35:26 +0200
  • |\ \ \ \ \ \
  • | |_|_|/ / /
  • |/| | | | |
  • | | | * | | d24b3cf39 (refs/pull/4879/head) properly fix buffer allocation and assignment by Martin Kroeker 2024-08-15 15:32:58 +0200
  • | |_|/ / /
  • |/| | | |
  • | | * | | a0aeba631 (refs/pull/4880/head) Merge branch 'develop' into betterPowerGEMVTail by Chip Kerchner 2024-08-15 08:00:00 -0500
  • | | |\ \ \
  • | |_|/ / /
  • |/| | | |
  • * | | | | eba8615c1 Merge pull request #4876 from martin-frbg/granite by Martin Kroeker 2024-08-15 13:50:54 +0200
  • |\ \ \ \ \
  • * \ \ \ \ \ bc80e7f02 Merge pull request #4878 from martin-frbg/cirrus-androidndk by Martin Kroeker 2024-08-15 13:50:09 +0200
  • |\ \ \ \ \ \
  • | * | | | | | 94c9e0b7a (refs/pull/4878/head) Update ndk version number by Martin Kroeker 2024-08-15 11:30:23 +0200
  • | * | | | | | ed0321563 fix installation of NDK in armv7 crossbuild by Martin Kroeker 2024-08-15 11:11:07 +0200
  • |/ / / / / /
  • | | * | | | fd033467a (refs/pull/4877/head) Fixed the undefined reference to blas_set_parameter by gxw 2024-08-15 16:48:48 +0800
  • | * | | | | 1b8e40874 (refs/pull/4876/head) Add autodetection support for Intel Granite Rapids as Sapphire Rapids by Martin Kroeker 2024-08-15 09:33:42 +0200
  • |/ / / / /
  • | | | * | cbfe72ca7 deploy: 4944148e66 by martin-frbg 2024-08-15 07:32:47 +0000
  • * | | | | 4944148e6 Merge pull request #4875 from ChipKerchner/addGEMVtoBF16Test by Martin Kroeker 2024-08-15 09:32:11 +0200
  • |\ \ \ \ \
  • * \ \ \ \ \ a388c4b83 Merge pull request #4872 from chenx97/ls3a-fix-stack-fpr-len by Martin Kroeker 2024-08-15 00:10:16 +0200
  • |\ \ \ \ \ \
  • * \ \ \ \ \ \ f24b52170 Merge pull request #4787 from vlad0x00/patch-1 by Martin Kroeker 2024-08-15 00:09:53 +0200
  • |\ \ \ \ \ \ \
  • | * | | | | | | 2d84ed7e7 (refs/pull/4787/head) Update README.md by Vladimir Nikolić 2024-08-14 14:31:35 -0700
  • | | | | | * | | 083faf755 Merge branch 'develop' into betterPowerGEMVTail by Chip Kerchner 2024-08-14 15:56:03 -0500
  • | | | | | |\ \ \
  • | | | * | | | | | c23897f58 (refs/pull/4875/head) Add GEMV testing to SBGEMx vs SGEMx testing. by Chip Kerchner 2024-08-14 15:55:23 -0500
  • | | | | |_|/ / /
  • | | | |/| | | |
  • * | | | | | | | 0d8ee96f1 Merge pull request #4874 from martin-frbg/issue4869 by Martin Kroeker 2024-08-14 22:49:12 +0200
  • |\ \ \ \ \ \ \ \
  • * \ \ \ \ \ \ \ \ b80671d89 Merge pull request #4871 from martin-frbg/issue4868 by Martin Kroeker 2024-08-14 20:53:39 +0200
  • |\ \ \ \ \ \ \ \ \
  • | |_|_|_|/ / / / /
  • |/| | | | | | | |
  • * | | | | | | | | 6452f7b46 Merge pull request #4873 from ChipKerchner/fixSBGEMMDefaults by Martin Kroeker 2024-08-14 19:22:03 +0200
  • |\ \ \ \ \ \ \ \ \
  • | | | | | | | * \ \ 75472b830 Merge branch 'develop' into betterPowerGEMVTail by Chip Kerchner 2024-08-14 10:52:46 -0500
  • | | | | | | | |\ \ \
  • | | | | | | | | | * | 9842a6cf2 deploy: ca7777de18 by martin-frbg 2024-08-14 15:37:07 +0000
  • * | | | | | | | | | | ca7777de1 Merge pull request #4870 from chenx97/fix-recursive-make-var by Martin Kroeker 2024-08-14 16:03:50 +0200
  • |\ \ \ \ \ \ \ \ \ \ \
  • | | | | * | | | | | | | f6469e21b (refs/pull/4874/head) move gelqs and geqrs to lapack-deprecated by Martin Kroeker 2024-08-14 16:00:43 +0200
  • | |_|_|/ / / / / / / /
  • |/| | | | | | | | | |
  • | | * | | | | | | | | 31226740d (refs/pull/4873/head) Cleanup of SBGEMM unit test. by Chip Kerchner 2024-08-14 08:10:25 -0500
  • | | | | | | | | | | * 070183571 Merge pull request #24 from HaoZeke/sharedLib by Rohit Goswami 2024-08-14 06:03:04 -0700
  • | | | | | | | | | | |\
  • | | | | | | | | | | | * 04d9a533b BLD: Use `both_libraries` to build libs by Mateusz Sokół 2024-08-14 10:45:26 +0000
  • | | | | | * | | | | | | ef94b9653 (refs/pull/4872/head) Use ldc1 and sdc1 for the prologue and epilogue on LOONGSON3A by Henry Chen 2024-08-13 14:53:37 +0800
  • | | | * | | | | | | | | 23b5d66a8 (refs/pull/4871/head) Ensure a memory buffer has been allocated for each thread before invoking it by Martin Kroeker 2024-08-14 10:35:44 +0200
  • | |_|/ / / / / / / / /
  • |/| | | | | | | | | |
  • | * | | | | | | | | | 20bdb6588 (refs/pull/4870/head) Fix recursive variable expansion in Makefiles for LOONGSON3A by Henry Chen 2024-08-12 16:22:31 +0800
  • | | |_|/ / / / / / /
  • | |/| | | | | | | |
  • | | | | | | | | | * adea56954 BLD: Create OpenBLAS shared object by Mateusz Sokół 2024-08-13 09:42:46 +0000
  • | | | | | | | | |/
  • | | * | | | | | / b1737698d Fix DEFAULTS in SBGEMM for POWER10. Also comparisons for SBGEMM unit test can be exactly due to epilison differences. by Chip Kerchner 2024-08-13 07:01:21 -0500
  • | | | |_|_|/ / /
  • | | |/| | | | |
  • | | | | | | * | 62d1a3cf3 deploy: e5525036e7 by martin-frbg 2024-08-13 05:20:43 +0000
  • * | | | | | | | e5525036e Merge pull request #4865 from martin-frbg/issue4856 by Martin Kroeker 2024-08-13 07:20:06 +0200
  • |\ \ \ \ \ \ \ \
  • | |/ / / / / / /
  • |/| | | | | | |
  • * | | | | | | | fd52d0949 Merge pull request #4864 from martin-frbg/issue4862 by Martin Kroeker 2024-08-13 00:16:45 +0200
  • |\ \ \ \ \ \ \ \
  • | | | | | | | * | f332ecbf1 deploy: 35dd625adf by martin-frbg 2024-08-12 20:06:18 +0000
  • * | | | | | | | | 35dd625ad Merge pull request #4859 from martin-frbg/cooper_sb by Martin Kroeker 2024-08-12 22:05:43 +0200
  • |\ \ \ \ \ \ \ \ \
  • | | | | | | | | * | a48b11763 Update version information for 0.3.28 by Martin Kroeker 2024-08-12 18:22:20 +0200
  • | | | | | | | | | | * da6393ab9 (refs/pull/4866/head) set larger threshold for POWER10 by Hong Bo Peng 2024-08-12 09:13:01 -0400
  • | |_|_|_|_|_|_|_|_|/
  • |/| | | | | | | | |
  • | | | * | | | | | | d8f740791 (refs/pull/4865/head) tweak threshold a little more to cover POWER10 fma by Martin Kroeker 2024-08-12 14:50:49 +0200
  • | |_|/ / / / / / /
  • |/| | | | | | | |
  • | | * | | | | | | 73e13b027 (refs/pull/4864/head) flesh out HERK prototype by Martin Kroeker 2024-08-12 14:45:40 +0200
  • | | * | | | | | | 824306baa flesh out HERK prototype by Martin Kroeker 2024-08-12 14:44:13 +0200
  • | |/ / / / / / /
  • |/| | | | | | |
  • | | | | | | | * cf98f7afc Merge pull request #23 from HaoZeke/mesonDocs by Rohit Goswami 2024-08-12 11:27:20 +0000
  • | | | | | | | |\
  • | | | | | | | | * ff42a9f4f DOC: Meson build docs by Mateusz Sokół 2024-08-09 14:52:38 +0200
  • | | | | | | | |/
  • | | | | | | | | * 05a72c7a7 (refs/pull/4860/head) Update azure-pipelines.yml by Martin Kroeker 2024-08-11 10:42:17 +0200
  • | |_|_|_|_|_|_|/
  • |/| | | | | | |
  • | * | | | | | | 7ca835a82 (refs/pull/4859/head) address clang array overflow warning by Martin Kroeker 2024-08-10 13:44:56 +0200
  • |/ / / / / / /
  • * | | | | | | a87c4d26d Merge pull request #4857 from nekopsykose/ppc by Martin Kroeker 2024-08-10 00:15:28 +0200
  • |\ \ \ \ \ \ \
  • | |/ / / / / /
  • |/| | | | | |
  • | * | | | | | 1265eee85 (refs/pull/4857/head) fix cmake typo for power10 cc version check by psykose 2024-08-09 20:38:05 +0200
  • |/ / / / / /
  • | | | | | * 6d31ff0b1 Merge pull request #17 from HaoZeke/multiArch by Rohit Goswami 2024-08-09 08:22:53 +0000
  • | | | | | |\
  • | | | | * | | f0e9e93a2 deploy: cb38d666da by martin-frbg 2024-08-09 01:41:29 +0000
  • * | | | | | | cd3945b99 Update version to 0.3.28.dev by Martin Kroeker 2024-08-08 23:09:45 +0200