Commit Graph

  • *
  • |\
  • | | *
  • | |/|
  • |/| |
  • | | | *
  • | |_|/|
  • |/| | |
  • | | | | *
  • | |_|_|/|
  • |/| | | |
  • * | | | |
  • |\ \ \ \ \
  • | | | | | | *
  • | * | | | | |
  • | |\ \ \ \ \ \
  • | |/ / / / / /
  • |/| | | | | |
  • * | | | | | |
  • |\ \ \ \ \ \ \
  • | * | | | | | |
  • | | | | | | | | *
  • | |_|_|_|_|_|_|/|
  • |/| | | | | | | |
  • | * | | | | | | |
  • | * | | | | | | |
  • |/ / / / / / / /
  • | * / / / / / /
  • |/ / / / / / /
  • | | | * | | |
  • | | | * | | |
  • | |_|/ / / /
  • |/| | | | |
  • | | | | | | *
  • | |_|_|_|_|/|
  • |/| | | | | |
  • | | | | | | | *
  • | |_|_|_|_|_|/|
  • |/| | | | | | |
  • | | | | | | | | *
  • | |_|_|_|_|_|_|/|
  • |/| | | | | | | |
  • | | | | | | | | | *
  • | |_|_|_|_|_|_|_|/|
  • |/| | | | | | | | |
  • | | | | | | | | | | *
  • | |_|_|_|_|_|_|_|_|/|
  • |/| | | | | | | | | |
  • | | | | | | | | | | | *
  • | |_|_|_|_|_|_|_|_|_|/|
  • |/| | | | | | | | | | |
  • | | | | | | | | | | | | *
  • | |_|_|_|_|_|_|_|_|_|_|/|
  • |/| | | | | | | | | | | |
  • | | | | | | | | | | | | | *
  • | |_|_|_|_|_|_|_|_|_|_|_|/|
  • |/| | | | | | | | | | | | |
  • | * | | | | | | | | | | | |
  • |/ / / / / / / / / / / / /
  • | | * / / / / / / / / / /
  • | |/ / / / / / / / / / /
  • |/| | | | | | | | | | |
  • | | | | | | | | | | | | *
  • | |_|_|_|_|_|_|_|_|_|_|/|
  • |/| | | | | | | | | | | |
  • | | | | | | | | | | | | | *
  • | |_|_|_|_|_|_|_|_|_|_|_|/|
  • |/| | | | | | | | | | | | |
  • | | | | | | | | | | | | | | *
  • | |_|_|_|_|_|_|_|_|_|_|_|_|/|
  • |/| | | | | | | | | | | | | |
  • | | | | | | | | | | | | | | | *
  • | |_|_|_|_|_|_|_|_|_|_|_|_|_|/|
  • |/| | | | | | | | | | | | | | |
  • | | | | | | | | | | | | | | | | *
  • | |_|_|_|_|_|_|_|_|_|_|_|_|_|_|/|
  • |/| | | | | | | | | | | | | | | |
  • | | * | | | | | | | | | | | | | |
  • * | | | | | | | | | | | | | | | |
  • |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
  • | | | * | | | | | | | | | | | | | |
  • * | | | | | | | | | | | | | | | | |
  • |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
  • | | | | | | | | | | | | | | | | | * |
  • | | | | | | | | | | | | | | | | | * |
  • | | | | | | | | | | | | | | | | | * |
  • | | | | | | | | | | | | | | | | | * |
  • | | | | | | | | | | | | | | | | | * |
  • | | | | | | | | | | | | | | | | | | | *
  • | |_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|/|
  • |/| | | | | | | | | | | | | | | | | | |
  • | * | | | | | | | | | | | | | | | | | |
  • |/ / / / / / / / / / / / / / / / / / /
  • | * | | | | | | | | | | | | | | | | |
  • | |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
  • | |/ / / / / / / / / / / / / / / / / /
  • |/| | | | | | | | | | | | | | | | | |
  • | | | * | | | | | | | | | | | | | | |
  • * | | | | | | | | | | | | | | | | | |
  • |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
  • | | * | | | | | | | | | | | | | | | | |
  • | |/ / / / / / / / / / / / / / / / / /
  • |/| | | | | | | | | | | | | | | | | |
  • | * | | | | | | | | | | | | | | | | |
  • |/ / / / / / / / / / / / / / / / / /
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | * | |
  • | | | | | | | | | | | | | | | | | | *
  • | |_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|/|
  • |/| | | | | | | | | | | | | | | | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | | | | | | | | | | | | | | | * | | |
  • | |_|_|_|_|_|_|_|_|_|_|_|_|_|/ / / /
  • |/| | | | | | | | | | | | | | | | |
  • | | * | | | | | | | | | | | | | | |
  • * | | | | | | | | | | | | | | | | |
  • |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
  • * \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
  • |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
  • | | | | * | | | | | | | | | | | | | | |
  • * | | | | | | | | | | | | | | | | | | |
  • |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
  • | | * | | | | | | | | | | | | | | | | | |
  • | | | | | * | | | | | | | | | | | | | | |
  • * | | | | | | | | | | | | | | | | | | | |
  • |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
  • | | | * | | | | | | | | | | | | | | | | | |
  • | | | | * | | | | | | | | | | | | | | | | |
  • | |_|_|/ / / / / / / / / / / / / / / / / /
  • |/| | | | | | | | | | | | | | | | | | | |
  • | | * | | | | | | | | | | | | | | | | | |
  • | |/ / / / / / / / / / / / / / / / / / /
  • |/| | | | | | | | | | | | | | | | | | |
  • | * | | | | | | | | | | | | | | | | | |
  • | * | | | | | | | | | | | | | | | | | |
  • |/ / / / / / / / / / / / / / / / / / /
  • | | | * | | | | | | | | | | | | | | |
  • * | | | | | | | | | | | | | | | | | |
  • | | | | | | | | | | | | | | | | | | | *
  • | | | | | | | | | | | | | | | | | | | *
  • | | | | | | | | | | | | | * | | | | | |
  • | |_|_|_|_|_|_|_|_|_|_|_|/ / / / / / /
  • |/| | | | | | | | | | | | | | | | | |
  • | | | | | | | | | | | | | | | | | | *
  • | | | | | | | | | | | | | | | | | | *
  • | | | * | | | | | | | | | | | | | | |
  • * | | | | | | | | | | | | | | | | | |
  • |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
  • | * | | | | | | | | | | | | | | | | | |
  • 27304fb29 (refs/pull/5432/merge) Merge 7fcad02dc2 into c31861ea62 by Mark Ryan 2025-09-03 13:36:58 -0400
  • 9fd0a04e5 (refs/pull/5187/merge) Merge b537c1be49 into c31861ea62 by Martin Kroeker 2025-09-02 18:57:29 -0600
  • 01ab5e059 (refs/pull/5434/merge) Merge ed6c223105 into c31861ea62 by Ruiyang Wu 2025-09-02 18:44:21 -0400
  • e42106ab7 (refs/pull/5431/merge) Merge ce79fe12fd into c31861ea62 by Mark Ryan 2025-09-02 23:12:13 +0200
  • c31861ea6 (HEAD -> develop) Merge pull request #5435 from martin-frbg/update_rvv_ci by Martin Kroeker 2025-09-02 14:11:16 -0700
  • 9b28fed6a (gh-pages) deploy: 6d070820fc by martin-frbg 2025-09-02 19:28:31 +0000
  • 57c2936a4 (refs/pull/5435/head) Merge branch 'OpenMathLib:develop' into update_rvv_ci by Martin Kroeker 2025-09-02 12:09:30 -0700
  • 6d070820f Merge pull request #5436 from martin-frbg/update_osx_ci by Martin Kroeker 2025-09-02 12:09:09 -0700
  • 1c7251ca2 (refs/pull/5436/head) remove the -llto_library option for any osx fortran compiler by Martin Kroeker 2025-09-02 18:36:02 +0200
  • 728c1e7c3 (refs/pull/3748/merge) Merge c1a5a71d1c into 06c09deee9 by Markus Mützel 2025-09-02 11:18:14 -0400
  • a1331406a drop (re)installation of cmake on osx runners by Martin Kroeker 2025-09-02 15:39:08 +0200
  • c42fccccb Drop installation of cmake by Martin Kroeker 2025-09-02 15:36:32 +0200
  • 4c1a4e60a Update toolchain to its latest nightly build by Martin Kroeker 2025-09-02 14:54:08 +0200
  • ed6c22310 (refs/pull/5434/head) CMake: Improve the wording of the OpenMP mixed linkage check by Ruiyang Wu 2025-09-01 22:33:52 -0400
  • fd8f0d4f8 CMake: Demote the OpenMP mixed linkage check to NOTICE by Ruiyang Wu 2025-08-31 22:21:01 -0400
  • cef77f8e3 (refs/pull/4833/merge) Merge 5f8744d4e4 into 06c09deee9 by Christopher Sidebottom 2025-08-29 18:09:13 +0800
  • 96210915c (refs/pull/4080/merge) Merge 806073ccbc into 06c09deee9 by aitap 2025-08-29 18:07:43 +0800
  • 5a27c6cf6 (refs/pull/5256/merge) Merge 8e47512286 into 06c09deee9 by Han Gao 2025-08-29 06:06:28 -0400
  • 488ecb444 (refs/pull/5326/merge) Merge ed457343d5 into 06c09deee9 by مهدي شينون (Mehdi Chinoune) 2025-08-29 18:05:46 +0800
  • 8f0fbfe79 (refs/pull/5413/merge) Merge 52792f6da7 into 06c09deee9 by Martin Kroeker 2025-08-29 10:24:11 +0200
  • 7e9193509 (refs/pull/5318/merge) Merge 2049628f22 into 06c09deee9 by مهدي شينون (Mehdi Chinoune) 2025-08-29 14:50:47 +0900
  • 1e581357f (refs/pull/4270/merge) Merge 4da1a0b1da into 06c09deee9 by Markus Mützel 2025-08-29 14:50:21 +0900
  • a80d1d3cc (refs/pull/5010/merge) Merge dc68a48ddd into 06c09deee9 by Martin Kroeker 2025-08-28 10:52:09 -0400
  • 7fcad02dc (refs/pull/5432/head) fix RVV 1.0 detection code by Mark Ryan 2025-08-28 14:08:46 +0000
  • ce79fe12f (refs/pull/5431/head) disable fp16 flags on RISC-V unless BUILD_HFLOAT16=1 by Mark Ryan 2025-08-27 10:15:09 +0000
  • bcbfbde10 (refs/pull/5418/merge) Merge b04ac31f6e into 06c09deee9 by Alexandru Ardelean 2025-08-27 22:34:20 +0530
  • 59b189099 (refs/pull/5338/merge) Merge fe783000d8 into 06c09deee9 by Menno Deij - van Rijswijk 2025-08-27 17:52:09 +0530
  • 2d406ebde (refs/pull/5393/merge) Merge 06ced6da16 into 06c09deee9 by xctan 2025-08-27 09:23:12 +0100
  • ef8a44d98 (refs/pull/5423/merge) Merge 2b5d8c789d into 06c09deee9 by Martin Kroeker 2025-08-26 16:30:37 +0530
  • d2ae5fe70 (refs/pull/1752/merge) Merge 8450c13fb1 into 06c09deee9 by Sacha 2025-08-26 17:15:44 +0900
  • 620cc8daf deploy: 06c09deee9 by martin-frbg 2025-08-26 08:10:46 +0000
  • 06c09deee Merge pull request #5426 from hideaki-motoki/issue5417_axpy_sve by Martin Kroeker 2025-08-26 01:10:14 -0700
  • 60ef0f758 deploy: da7d0f4a38 by martin-frbg 2025-08-25 13:46:13 +0000
  • da7d0f4a3 Merge pull request #5427 from yuanjia111/develop by Martin Kroeker 2025-08-25 06:45:44 -0700
  • 2b5d8c789 (refs/pull/5423/head) remove debugging printout by Martin Kroeker 2025-08-24 13:50:08 -0700
  • 1b88c9c74 remove debugging printouts by Martin Kroeker 2025-08-24 13:48:22 -0700
  • b4fc09e9e Add registers d8 to d15 to clobber lists as the code does not expressly save them by Martin Kroeker 2025-08-23 14:39:27 -0700
  • 8e50b8d52 Add d8 to d15 to clobber lists as the code does not expressly save them by Martin Kroeker 2025-08-23 14:36:49 -0700
  • 7f89c6f35 smh-based direct sgemm currently requires leading dimensions to be same as matrix dimension by Martin Kroeker 2025-08-23 14:20:15 -0700
  • 765853c88 (refs/pull/4313/merge) Merge d67a534b9e into b3f247ae5a by Martin Kroeker 2025-08-23 22:33:47 +0800
  • c2cc7a360 (refs/pull/5427/head) riscv64: optimize gemv_t_vector.c by yuanjia 2025-08-22 16:14:14 +0800
  • e23f9c664 (refs/pull/5426/head) Merge remote-tracking branch 'upstream/develop' into issue5417_axpy_sve by h-motoki 2025-08-21 22:16:28 +0900
  • af13d97bc deploy: b3f247ae5a by martin-frbg 2025-08-21 12:14:03 +0000
  • b3f247ae5 Merge pull request #5425 from martin-frbg/fixup5389 by Martin Kroeker 2025-08-21 05:13:34 -0700
  • 855945bef Implementing SVE in [SD]AXPY Kernels for A64FX and Graviton3E by h-motoki 2025-08-21 20:56:58 +0900
  • 7c1839899 (refs/pull/5425/head) Increase assumed L2 sizes for RISCV X280 / ZVL256B and for SVE-capable ARM64 by Martin Kroeker 2025-08-21 11:57:07 +0200
  • 1ee8879c7 Add VORTEXM4 by Martin Kroeker 2025-08-20 09:59:32 -0700
  • edaa73fd2 Hide the local 2VLx2VL symbol as static is insufficient for this with gcc by Martin Kroeker 2025-08-20 06:33:28 -0700
  • 501728a35 adjust register 20 accesses to 21 after moving x18 by Martin Kroeker 2025-08-20 06:24:38 -0700
  • 107c883c8 Update SME-related kernels by Martin Kroeker 2025-08-19 05:13:28 -0700
  • 05dbb5436 Delete misplaced file by Martin Kroeker 2025-08-19 05:12:09 -0700
  • 4609732e6 Relax version number requirement for AppleClang by Martin Kroeker 2025-08-18 14:54:20 -0700
  • bf98e448e Add VORTEXM4 to DYNAMIC_ARCH list by Martin Kroeker 2025-08-18 14:43:08 -0700
  • 0bc19a133 Update SME kernel details by Martin Kroeker 2025-08-18 14:38:16 -0700
  • 426b5f23e Add compiler options for VORTEXM4 by Martin Kroeker 2025-08-18 14:35:36 -0700
  • 4328c91e2 relax requirements in compiler SME capability check by Martin Kroeker 2025-08-18 14:34:51 -0700
  • c794d0a4c Add VORTEXM4 by Martin Kroeker 2025-08-18 14:33:24 -0700
  • a4f5fec46 Add compiler options for VORTEXM4 by Martin Kroeker 2025-08-18 14:32:07 -0700
  • ca542f319 Add VORTEXM4 by Martin Kroeker 2025-08-18 08:41:38 -0700
  • 3bbee42dc (refs/pull/4054/merge) Merge 6d05b63bce into 9c43301b6d by Peter Edwards 2025-08-18 20:34:24 +0500
  • 18f9582f3 Add VORTEXM4 by Martin Kroeker 2025-08-18 01:54:09 -0700
  • 4e2a8c18e Split VORTEXM4 from VORTEX target due to SME support by Martin Kroeker 2025-08-18 01:53:04 -0700
  • 30970460b Add VORTEXM4 target by Martin Kroeker 2025-08-18 01:52:05 -0700
  • b0a00fbd6 Add minimal compiler flags for VORTEXM4 by Martin Kroeker 2025-08-18 01:51:10 -0700
  • ccfd0170f Enable SME on MacOS and add VORTEXM4 to DYNAMIC_ARCH list by Martin Kroeker 2025-08-18 01:50:13 -0700
  • ef0b883df Add sgemm_direct_performant for ARM64 by Martin Kroeker 2025-08-18 01:48:08 -0700
  • e76c39099 Add sgemm_direct_performant for ARM64 by Martin Kroeker 2025-08-18 01:47:17 -0700
  • 202a7a0e2 Separate VORTEXM4 from VORTEX and ARMV9SME by Martin Kroeker 2025-08-18 01:45:40 -0700
  • de91afd2a Move SGEMM_DIRECT after the CBLAS parameter check and add sgemm_direct_performant for ARM64 by Martin Kroeker 2025-08-18 01:44:21 -0700
  • 0203657f4 Add sgemm_direct_performant for ARM64 by Martin Kroeker 2025-08-18 01:42:32 -0700
  • e82bcd274 Update ARM64 sgemm_direct object generation by Martin Kroeker 2025-08-18 01:41:13 -0700
  • 731f4dd68 Add VORTEXM4 settings by Martin Kroeker 2025-08-18 01:39:35 -0700
  • 53d3bb50c Get symbol name from build system; change b.first to b.mi for AppleClang compatibility by Martin Kroeker 2025-08-18 01:37:50 -0700
  • 08a00326a Build symbol name from build system variables by Martin Kroeker 2025-08-18 01:35:41 -0700
  • 89898fc49 Add sgemm_direct_performant for switching between direct and regular kernels by Martin Kroeker 2025-08-18 01:31:40 -0700
  • 22c6607db Use ASMNAME to get symbol name from build system; leave x18 unused as reserved on MacOS by Martin Kroeker 2025-08-18 01:30:10 -0700
  • ca22e28ca Rename sgemm_direct_sme1.S to sgemm_direct_sme1_2VLx2VL.S by Martin Kroeker 2025-08-18 01:25:44 -0700
  • c9814eb96 deploy: 9c43301b6d by martin-frbg 2025-08-17 10:03:37 +0000
  • 9c43301b6 Merge pull request #5421 from reibax-marcus/develop by Martin Kroeker 2025-08-17 03:03:05 -0700
  • 9d6df1dd3 Merge pull request #5422 from ChipKerchner/addRVVVectorizedPacking by Martin Kroeker 2025-08-16 13:45:35 -0700
  • 003c8d1aa deploy: f3b2a15fad by martin-frbg 2025-08-16 19:07:26 +0000
  • f3b2a15fa Merge pull request #5420 from yuanjia111/develop by Martin Kroeker 2025-08-16 12:06:53 -0700
  • 64401b441 (refs/pull/5422/head) Disable vectorized packing for DGEMM - since it is slower than scalar. by Chip Kerchner 2025-08-13 13:41:12 +0000
  • b37ea80dd deploy: 5e43ba948c by martin-frbg 2025-08-13 09:39:28 +0000
  • 5e43ba948 Merge pull request #5419 from Mousius/bgemm-optimisation by Martin Kroeker 2025-08-13 02:10:20 -0700
  • c00afc86a Add and use vectorized packing to ZVL128B and ZVL256B. Up to 3x+ faster than generic scalar functions. by Chip Kerchner 2025-08-12 17:18:56 +0000
  • 3a6b79c50 (refs/pull/5421/head) fix: broken cblas installation when using makefile based builds by Xabier Marquiegui 2025-08-12 14:40:15 +0200
  • 803e8d483 (refs/pull/5420/head) Move the value assignment of vector x in gemv_n_sve.c to the outermost loop to reduce repeated data retrieval. 1. Verify correctness using BLAS-Tester. 2. Using the built-in benchmark to verify performance, the performance of float and double types improved by about 60% and about 40% respectively. The test commands are: export OMP_NUM_THREADS=1; numactl -C 10 -l ./sgemv.goto 3000 4000 100 and export OMP_NUM_THREADS=1; numactl -C 10 -l ./dgemv.goto 3000 4000 100 by yuanjia 2025-08-12 18:03:16 +0800
  • 5f47b872f (refs/pull/5419/head) Remove older kernels for BGEMM on NEOVERSEV1 by Chris Sidebottom 2025-08-10 17:56:05 +0000
  • 114316f36 Optimize SBGEMM / BGEMM for NEOVERSEV1 further by Chris Sidebottom 2025-08-10 16:29:03 +0000
  • 50965e597 deploy: 75c6ab4036 by martin-frbg 2025-08-09 10:28:54 +0000
  • 75c6ab403 CI: Update WoA job to use LLVM 20.1.8 and avoid stray preinstalled LLVM19 (#5411) by Martin Kroeker 2025-08-09 03:28:24 -0700
  • fbefd2a52 (refs/pull/5411/head) Update windows_arm64.yml by Martin Kroeker 2025-08-08 14:34:09 +0200
  • 1aed477ee Update windows_arm64.yml by Martin Kroeker 2025-08-08 14:06:20 +0200
  • b04ac31f6 (refs/pull/5418/head) add level3 defaults for x86 by Alexandru Ardelean 2025-08-08 13:14:56 +0300
  • 599e4e372 add modules search path by Martin Kroeker 2025-08-08 11:55:27 +0200
  • c77aec697 Update windows_arm64.yml by Martin Kroeker 2025-08-08 11:04:35 +0200
  • 35fd2c4b8 deploy: 5c5f852ee3 by martin-frbg 2025-08-04 11:29:54 +0000
  • 5c5f852ee Merge pull request #5415 from martin-frbg/Fixum-5399 by Martin Kroeker 2025-08-04 04:29:26 -0700
  • f1ee61ea3 (refs/pull/5415/head) Include NEON header for the bfloat conversion functions by Martin Kroeker 2025-08-04 00:21:39 -0700