1409 Commits (72888497e2ffb6233ffd18ccf0b4d4bb01701b17)

Author SHA1 Message Date
  ZhangDanfeng bc6fd20a40 fix INIT8x4 5 years ago
  Martin Kroeker 89091e6b64 Merge pull request #2645 from martin-frbg/misc_fixes 5 years ago
  Martin Kroeker c3574ffe53 Merge pull request #2646 from wjc404/develop 5 years ago
  wjc404 0e3ac4a06b Add files via upload 5 years ago
  Martin Kroeker 7f60fb6b91 Delete spurious copy of common_param.h 5 years ago
  ZhangDanfeng 9b7877ccf1 sgemm copy source init 5 years ago
  ZhangDanfeng f82fa802d1 Insert prefetch 5 years ago
  Martin Kroeker b1ee81228a Change complex DOT and ROT to generic kernels and switch CGEMM 5 years ago
  张丹枫 9df79ae9a3 update sgemm and strmm kernel selecting strategy 5 years ago
  张丹枫 a1fc6041cd use general register to speedup 5 years ago
  张丹枫 edb423d772 align general register using to strmm_kernel_8x8 5 years ago
  zhangdanfeng 0e6eb8c247 sgemm kernel use sgemm_kernel_8x8_cortexa53 5 years ago
  zhangdanfeng d475db29c6 optimized for cortex-a53 5 years ago
  Marius Hillenbrand 89fe17f20e s390x: Use new sgemm kernel also for DGEMM and DTRMM on Z14 5 years ago
  Marius Hillenbrand bdd795ed03 s390x/GEMM: replace 0-init with peeled first iteration 5 years ago
  Marius Hillenbrand 2840432e49 s390x: improvise vector alignment hints for older compilers 5 years ago
  Marius Hillenbrand 1b0b4349a1 s390x/Z14: Change register blocking for SGEMM to 16x4 5 years ago
  Marius Hillenbrand 71b6eaf459 s390x: Use new sgemm kernel also for strmm on Z14 and newer 5 years ago
  Marius Hillenbrand 43c0d4f312 s390x: Add vectorized sgemm kernel for Z14 and newer 5 years ago
  Martin Kroeker 2271c3506b Work around excessive LAPACK test failures on Skylake-X 5 years ago
  Rajalakshmi Srinivasaraghavan bd9ff820bc Fix cmake compilation issue - POWER9 5 years ago
  Ashwin Sekhar T K 8353cb245a ARM64: Improve DAXPY for ThunderX2 5 years ago
  Martin Kroeker 90dba9f716 Duplicate earlier Clang 9.0.0 workaround for corresponding Apple Clang version 5 years ago
  Martin Kroeker 5dd14e3d48 Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) 5 years ago
  Martin Kroeker 06208c8d01 Limit this fix to ELFv2 builds 5 years ago
  Martin Kroeker f5c4c28b98 Work around POWER8BE bugs on FreeBSD (ELFv2) 5 years ago
  Martin Kroeker fa42588e1f Merge pull request #2565 from martin-frbg/mips24k 5 years ago
  Martin Kroeker e55ec82bb9 Delete KERNEL.1004K 5 years ago
  Martin Kroeker 7353ea5afc Delete KERNEL.24K 5 years ago
  Martin Kroeker 6a04efb122 Rename KERNEL files to include MIPS prefix 5 years ago
  Martin Kroeker d712ea724c Add MIPS24K support 5 years ago
  Rajalakshmi Srinivasaraghavan 22bb50fb81 cmake fixes 5 years ago
  Rajalakshmi Srinivasaraghavan 67cc4b9e16 Fix warnings in clang and export symbol 5 years ago
  Rajalakshmi Srinivasaraghavan a87793e03c Fix DYNAMIC_ARCH compilation errors 5 years ago
  Rajalakshmi Srinivasaraghavan ff010f496e Build shgemm for all architecture 5 years ago
  Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS 5 years ago
  Martin Kroeker 5b0093b5fe Convert aligned moves to unaligned 5 years ago
  Martin Kroeker e9bfa2291a Fix parameter overflow 5 years ago
  gxw 8d07cf9b67 Fix compilation problem on loongson platform 5 years ago
  Martin Kroeker 806f89166e Make ARMV7 compile with xcode and add a CI job for it (#2537) 5 years ago
  Martin Kroeker c6af9bbb32 Merge pull request #2534 from martin-frbg/issue2496 5 years ago
  Martin Kroeker 144be81ca1 fix initialization to zero in the NEON SGEMM_BETA kernel as well 5 years ago
  Martin Kroeker 07cdd5d05c Fix zero initialization for beta=0 case 5 years ago
  Martin Kroeker 567d2760e6 Merge pull request #2520 from wjc404/develop 5 years ago
  wjc404 b8307768e2 Add files via upload 5 years ago
  Martin Kroeker af8a619e1f Merge pull request #2517 from wjc404/develop 5 years ago
  wjc404 62b9608986 Update KERNEL.SKYLAKEX 5 years ago
  Martin Kroeker a1b181cea2 Merge pull request #2516 from wjc404/develop 5 years ago
  wjc404 cdc0e9011e Update KERNEL.ZEN 5 years ago
  wjc404 fa049d49c2 AVX2 STRSM kernel 5 years ago