Commit Graph

  • * d2b7f0f9c define CGEMM INCOPY/ITCOPY kernels by Martin Kroeker 2023-12-30 20:49:53 +0100
  • * 2327b13b3 define CGEMM INCOPY/ITCOPY kernels by Martin Kroeker 2023-12-30 20:48:40 +0100
  • | * 0f648ebcd (refs/pull/4399/head) use alternate download for the CLFS cross-compiler package by Martin Kroeker 2023-12-30 20:31:32 +0100
  • |/
  • * 519b40fad Merge pull request #4398 from yinshiyou/la-dev by Martin Kroeker 2023-12-30 19:51:08 +0100
  • |\
  • | * a5d0d2137 (refs/pull/4398/head) loongarch64: Add zgemm and cgemm optimization by pengxu 2023-12-29 15:10:01 +0800
  • | * 546f13558 loongarch64: Add {c/z}swap and {c/z}sum optimization by gxw 2023-12-29 11:03:53 +0800
  • | * edabb9366 loongarch64: Refine axpby optimization functions. by Hao Chen 2023-12-29 15:08:10 +0800
  • | * 1ec5dded4 loongarch64: Add c/zrot optimization functions. by Hao Chen 2023-12-28 21:23:59 +0800
  • | * 3c53ded31 loongarch64: Add c/znrm2 optimization functions. by Hao Chen 2023-12-28 20:26:01 +0800
  • | * fbd612f8c loongarch64: Add ic/zamin optimization functions. by Hao Chen 2023-12-28 20:07:58 +0800
  • | * d97272cb3 loongarch64: Add c/zdot optimization functions. by Hao Chen 2023-12-28 19:09:18 +0800
  • | * 65a0aeb12 loongarch64: Add c/zcopy optimization functions. by Hao Chen 2023-12-28 17:45:17 +0800
  • | * 2a34fb4b8 loongarch64: Add and refine scal optimization functions. by Hao Chen 2023-12-27 18:17:51 +0800
  • | * 8785e948b loongarch64: Add camin optimization function. by Hao Chen 2023-12-27 17:04:46 +0800
  • | * 0753848e0 loongarch64: Refine and add axpy optimization functions. by Hao Chen 2023-12-27 16:54:01 +0800
  • | * 06fd5b599 loongarch64: Add and Refine asum optimization functions. by Hao Chen 2023-12-27 10:44:02 +0800
  • | * e771be185 Optimize copy functions with lsx. by guxiwei 2023-12-21 14:28:06 +0800
  • | * 179ed51d3 Add dgemm_kernel_8x4.S file. by Hao Chen 2023-12-21 14:18:39 +0800
  • | * 173a65d4e loongarch64: Add and refine iamax optimization functions. by Hao Chen 2023-12-25 15:11:04 +0800
  • | * ea70e165c loongarch64: Refine rot optimization. by zhoupeng 2023-12-28 20:07:59 +0800
  • | * 116aee752 loongarch64: Refine imin optimization. by zhoupeng 2023-12-28 15:17:28 +0800
  • | * 8be265419 loongarch64: Refine imax optimization. by zhoupeng 2023-12-28 10:24:24 +0800
  • | * 154baad45 loongarch64: Refine iamin optimization. by zhoupeng 2023-12-27 16:04:33 +0800
  • | * 36c12c497 loongarch64: Refine copy,swap,nrm2,sum optimization. by Shiyou Yin 2023-12-27 11:30:17 +0800
  • | * c6996a80e loongarch64: Refine amax,amin,max,min optimization. by Shiyou Yin 2023-12-08 16:06:17 +0800
  • * | 21564bde2 Merge pull request #4394 from martin-frbg/dyn_vortex by Martin Kroeker 2023-12-28 13:35:55 +0100
  • |\ \
  • | | | * 75fe9c21e (refs/pull/4397/head) Scale P and Q with L2 cache size for SVE by Chris Sidebottom 2023-12-27 17:52:19 +0000
  • | |_|/
  • |/| |
  • * | | e9c32ed16 Merge pull request #4384 from yetist/develop by Martin Kroeker 2023-12-27 14:05:01 +0100
  • |\ \ \
  • | | * | e7a895e71 (refs/pull/4394/head) Add Apple M as NeoverseN1 by Martin Kroeker 2023-12-25 12:36:05 +0100
  • | |/ /
  • |/| |
  • * | | 474ce0ace Merge pull request #4393 from martin-frbg/pr4389-2 by Martin Kroeker 2023-12-25 12:30:56 +0100
  • |\ \ \
  • | * | | 1106460bb (refs/pull/4393/head) remove redundant targets from the default ARM64 DYNAMIC_ARCH list by Martin Kroeker 2023-12-25 12:29:56 +0100
  • |/ / /
  • * | | 236acee70 Merge pull request #4389 from Mousius/reduce-dynamic-targets by Martin Kroeker 2023-12-25 12:27:42 +0100
  • |\ \ \
  • | | * | d2f4f1b28 (refs/pull/4384/head) CI: update toolchains for LoongArch64 by Xiaotian Wu 2023-12-20 14:13:04 +0800
  • | | * | 0baf462db Fix: build failed on LoongArch by Wu Xiaotian 2023-12-20 10:34:47 +0800
  • * | | | 63a83939a Merge pull request #4390 from Mousius/reduce-kernel-duplication by Martin Kroeker 2023-12-24 18:04:26 +0100
  • |\ \ \ \
  • * \ \ \ \ dba404055 Merge pull request #4392 from martin-frbg/lapack959 by Martin Kroeker 2023-12-24 10:44:15 +0100
  • |\ \ \ \ \
  • | * | | | | c6fa92102 (refs/pull/4392/head) Add tests for ?GEDMD (Reference-LAPACK PR 959) by Martin Kroeker 2023-12-23 23:39:53 +0100
  • | * | | | | 283713e4c Add tests for ?GEDMD (Reference-LAPACK PR 959) by Martin Kroeker 2023-12-23 23:32:45 +0100
  • | * | | | | 201f22f49 Fix issues related to ?GEDMD (Reference-LAPACK PR 959) by Martin Kroeker 2023-12-23 23:27:38 +0100
  • |/ / / / /
  • * | | | | 05dde8ef0 Merge pull request #4391 from martin-frbg/lapack942 by Martin Kroeker 2023-12-23 23:11:46 +0100
  • |\ \ \ \ \
  • | |_|_|/ /
  • |/| | | |
  • | * | | | 45ef0d736 (refs/pull/4391/head) Handle corner cases of LWORK (Reference-LAPACK PR 942) by Martin Kroeker 2023-12-23 20:16:33 +0100
  • | * | | | c082669ad Handle corner cases of LWORK (Reference-LAPACK PR 942) by Martin Kroeker 2023-12-23 20:05:03 +0100
  • | * | | | 29d6024ec Handle corner cases of LWORK (Reference-LAPACK PR 942) by Martin Kroeker 2023-12-23 19:44:11 +0100
  • | * | | | 0814491d9 Handle corner cases of LWORK (Reference-LAPACK PR 942) by Martin Kroeker 2023-12-23 19:37:03 +0100
  • | * | | | 5c11b2ff4 Handle corner cases of LWORK (Reference-LAPACK PR 942) by Martin Kroeker 2023-12-23 19:27:20 +0100
  • | * | | | 8ce44c18a Handle corner cases of LWORK (Reference-LAPACK PR 942) by Martin Kroeker 2023-12-23 19:24:10 +0100
  • |/ / / /
  • | | * / dc20a7818 (refs/pull/4389/head) Use functionally equivalent dynamic targets by Chris Sidebottom 2023-12-23 12:19:33 +0000
  • | |/ /
  • |/| |
  • | * | ecae1389d (refs/pull/4390/head) Reduce duplication in kernel definitions by Chris Sidebottom 2023-12-23 12:21:48 +0000
  • |/ /
  • * | 68ef2328e Merge pull request #4388 from martin-frbg/issue4387 by Martin Kroeker 2023-12-21 22:21:44 +0100
  • |\ \
  • | * | a7ed60bfe (refs/pull/4388/head) Add lower limit for multithreading by Martin Kroeker 2023-12-21 20:05:23 +0100
  • |/ /
  • * | 67779177b Merge pull request #4383 from martin-frbg/fixlapatest by Martin Kroeker 2023-12-20 14:01:59 +0100
  • |\ \
  • | * | e67a0eaaf (refs/pull/4383/head) Restore OpenBLAS-specific build rule changes by Martin Kroeker 2023-12-19 23:15:11 +0100
  • | * | bb8b91e9f restore OpenBLAS-specific test paths by Martin Kroeker 2023-12-19 23:13:02 +0100
  • |/ /
  • * | fa220b296 Merge pull request #4382 from Mousius/sve-dot-again by Martin Kroeker 2023-12-19 18:46:18 +0100
  • |\ \
  • * \ \ 3f46d0c79 Merge pull request #4381 from darshanp4/issue_4323 by Martin Kroeker 2023-12-19 16:53:53 +0100
  • |\ \ \
  • | |_|/
  • |/| |
  • | | * 60e66725e (refs/pull/4382/head) Use numeric labels to allow repeated inlining by Chris Sidebottom 2023-12-19 13:11:06 +0000
  • | | * 7a4fef4f6 Tweak SVE dot kernel by Chris Sidebottom 2023-12-15 12:50:48 +0000
  • | * | dab0da824 (refs/pull/4381/head) Update GEMM param for NEOVERSEV1 by Darshan Patel 2023-12-19 13:56:55 +0530
  • |/ /
  • | | * 5bdde6299 (refs/pull/4380/head) test loading numpy/openblas on neoversen1 by Martin Kroeker 2023-12-17 18:42:30 +0100
  • | |/
  • |/|
  • * | 3b520a56a Merge pull request #4378 from martin-frbg/issue3871 by Martin Kroeker 2023-12-15 21:58:56 +0100
  • |\ \
  • * \ \ 563daadc9 Merge pull request #4379 from barracuda156/ppc970 by Martin Kroeker 2023-12-15 20:03:44 +0100
  • |\ \ \
  • | |_|/
  • |/| |
  • | * | 8c143331b (refs/pull/4379/head) PPC970: drop -mcpu=970 which seems to produce faulty code by barracuda156 2023-12-15 22:55:52 +0800
  • |/ /
  • * | d2f1594bc Merge pull request #4368 from martin-frbg/issue4073 by Martin Kroeker 2023-12-15 14:49:52 +0100
  • |\ \
  • | | * 544cb8630 (refs/pull/4378/head) Mention C906V instruction set limitation and update DYNAMIC_ARCH lists by Martin Kroeker 2023-12-15 14:03:59 +0100
  • | |/
  • |/|
  • * | 8793601e8 Merge pull request #4375 from martin-frbg/issue4352 by Martin Kroeker 2023-12-15 13:35:18 +0100
  • |\ \
  • | * | f06b53556 (refs/pull/4375/head) Use C kernel for dgemv_t due to limitations of the old assembly one by Martin Kroeker 2023-12-15 09:58:44 +0100
  • |/ /
  • * | 293131d6b Merge pull request #4370 from barracuda156/unbreak_powerpc by Martin Kroeker 2023-12-14 10:30:03 +0100
  • |\ \
  • | * | 981e315b3 (refs/pull/4370/head) cc.cmake: use -force_cpusubtype_ALL for Darwin PPC by barracuda156 2023-12-14 12:01:31 +0800
  • | * | d9653af01 KERNEL.PPC970, KERNEL.PPCG4: unbreak CMake parsing by barracuda156 2023-12-13 19:23:50 +0800
  • |/ /
  • * | 302ca7edc Merge pull request #4371 from barracuda156/970 by Martin Kroeker 2023-12-13 14:32:37 +0100
  • |\ \
  • | * | a8d3619f6 (refs/pull/4371/head) cc.cmake: add optflags for G5 and G4 kernels by barracuda156 2023-12-13 19:42:56 +0800
  • |/ /
  • | * aa46f1e4e (refs/pull/4368/head) revert addition of MSVC-compatible complex (moved to lapacke_config.h) by Martin Kroeker 2023-12-12 23:07:48 +0100
  • | * dcdc35127 Add MSVC-compatible complex types by Martin Kroeker 2023-12-12 23:06:22 +0100
  • * | 55a0718f7 Merge pull request #4369 from ChipKerchner/power10Copies by Martin Kroeker 2023-12-12 18:49:21 +0100
  • |\ \
  • | * \ 93747fb37 (refs/pull/4369/head) Merge remote-tracking branch 'origin/develop' into power10Copies by Chip-Kerchner 2023-12-12 09:32:49 -0600
  • | |\ \
  • | |/ /
  • |/| |
  • | | * dcf6999c4 remove extraneous endif by Martin Kroeker 2023-12-12 11:27:17 +0100
  • | | | * 6bd7c54af introduce MT_TRACE to clean up SMP_DEBUG code by Mark Seminatore 2023-12-11 15:13:04 -0800
  • | | * | 330101e0b Add complex type definitions for MSVC by Martin Kroeker 2023-12-11 21:52:00 +0100
  • | |/ /
  • |/| |
  • * | | d9f147806 Merge pull request #4367 from barracuda156/unbreak_powerpc by Martin Kroeker 2023-12-11 21:38:32 +0100
  • |\ \ \
  • | |_|/
  • |/| |
  • | * | 9dbc8129b (refs/pull/4367/head) cpuid_power.c: add CPU_SUBTYPE_POWERPC_7400 case by barracuda156 2023-12-11 21:09:06 +0800
  • | * | c732f275a system_check.cmake: fix arch detection for Darwin PowerPC by barracuda156 2023-12-11 21:05:31 +0800
  • |/ /
  • * | e60fb0f39 Merge pull request #4359 from mseminatore/win_perf by Martin Kroeker 2023-12-09 23:40:26 +0100
  • |\ \
  • | * \ efa9515a2 (refs/pull/4359/head) Merge branch 'OpenMathLib:develop' into win_perf by Mark Seminatore 2023-12-09 10:09:49 -0800
  • | |\ \
  • | |/ /
  • |/| |
  • | | * 4e738e561 Replace two vector loads with one vector pair load and fix endianess of stores. by Chip-Kerchner 2023-12-08 12:36:08 -0600
  • | | | * 1332f8a82 Merge pull request #4159 from OMaghiarIMG/risc-v-tail-policy by Martin Kroeker 2023-12-08 10:25:41 +0100
  • | | | |\
  • | * | | | edac80d7e some cleanup, dynamically scale threads, add missing WIN_CASE defn by Mark Seminatore 2023-12-07 14:59:27 -0800
  • | | | * | 2d316c292 Merge pull request #4125 from OMaghiarIMG/risc-v by Martin Kroeker 2023-12-07 14:50:58 +0100
  • | | | |\ \
  • * | | | \ \ 5b09833b1 Merge pull request #4019 from uniontech-lilinjie/develop by Martin Kroeker 2023-12-07 14:46:17 +0100
  • |\ \ \ \ \ \
  • * \ \ \ \ \ \ 3193aa9c7 Merge pull request #4362 from yinshiyou/la-dev by Martin Kroeker 2023-12-07 09:15:15 +0100
  • |\ \ \ \ \ \ \
  • | * | | | | | | d32f38fb3 (refs/pull/4362/head) loongarch64: Add optimizations for nrm2. by yancheng 2023-12-07 13:15:55 +0800
  • | * | | | | | | f9b468990 loongarch64: Add optimizations for rot. by yancheng 2023-12-07 13:12:29 +0800
  • | * | | | | | | c80e7e27d loongarch64: Add optimizations for sum and asum. by yancheng 2023-12-07 13:08:03 +0800
  • | * | | | | | | d4c96a35a loongarch64: Add optimizations for axpy and axpby. by yancheng 2023-12-07 13:02:03 +0800
  • | * | | | | | | 360acc0a4 loongarch64: Add optimizations for swap. by yancheng 2023-12-07 12:57:05 +0800
  • | * | | | | | | 174c25766 loongarch64: Add optimizations for copy. by yancheng 2023-12-07 12:15:46 +0800
  • | * | | | | | | 49829b2b7 loongarch64: Add optimizations for iamin. by yancheng 2023-12-07 12:11:30 +0800
  • | * | | | | | | be83f5e4e loongarch64: Add optimizations for iamax. by yancheng 2023-12-07 12:07:30 +0800
  • | * | | | | | | e3fb2b5af loongarch64: Add optimizations for imin. by yancheng 2023-12-07 12:01:05 +0800
  • | * | | | | | | e46b48e37 loongarch64: Add optimizations for imax. by yancheng 2023-12-07 11:56:41 +0800
  • | * | | | | | | 702fc1d56 loongarch64: Add optimization for min. by yancheng 2023-12-07 11:51:19 +0800