122 Commits (2df4235e00a73ad61b7997c74497fd86eb278ebf)

Author SHA1 Message Date
  Martin Kroeker 775a87242d Rename KERNEL.SILICON to KERNEL.VORTEX 5 years ago
  Martin Kroeker 80794fe8fd Create KERNEL.SILICON 5 years ago
  Ashwin Sekhar T K 4e1be0e481 ARM64: Add THUNDERX3T110 Target 5 years ago
  ZhangDanfeng bc6fd20a40 fix INIT8x4 5 years ago
  ZhangDanfeng 9b7877ccf1 sgemm copy source init 5 years ago
  ZhangDanfeng f82fa802d1 Insert prefetch 5 years ago
  张丹枫 9df79ae9a3 update sgemm and strmm kernel selecting strategy 5 years ago
  张丹枫 a1fc6041cd use general register to speedup 5 years ago
  张丹枫 edb423d772 align general register using to strmm_kernel_8x8 5 years ago
  zhangdanfeng 0e6eb8c247 sgemm kernel use sgemm_kernel_8x8_cortexa53 5 years ago
  zhangdanfeng d475db29c6 optimized for cortex-a53 5 years ago
  Ashwin Sekhar T K 8353cb245a ARM64: Improve DAXPY for ThunderX2 5 years ago
  Martin Kroeker 144be81ca1 fix initialization to zero in the NEON SGEMM_BETA kernel as well 5 years ago
  Martin Kroeker 07cdd5d05c Fix zero initialization for beta=0 case 5 years ago
  s00548429 bec7923a0d Fix the functional bugs for zamax. 5 years ago
  Ali Saidi c623a965f9 Add Neoverse-N1 core 6 years ago
  Martin Kroeker e57b11acca Add preliminary support for EMAG8180 6 years ago
  Martin Kroeker 456ee2e1f0 Merge pull request #2357 from chenxuqiang/dgemm_beta_zero 6 years ago
  shengyang 80db5f11e1 update 6 years ago
  chenxuqiang 52de4cc8fd kernel/arm64/dgemm_beta.S: add beta == zero branch 6 years ago
  Martin Kroeker 44028581cc Merge pull request #2355 from Zeyiii/dev-zeyi2 6 years ago
  Martin Kroeker 86ab939936 Merge pull request #2354 from ZuoQ3/develop 6 years ago
  shengyang 8d84403205 Use arm neon instructions to optimize ncopy operation 6 years ago
  w00421467 0833a4846a Use arm neon instructions to optimize sgemm_beta operation 6 years ago
  zq 50f7fc1401 [WIP] Use arm neon instructions to optimize tcopy operation 6 years ago
  w00421467 3ccf8885ac prefetching for dgemm_beta 6 years ago
  w00421467 b7cc69ee62 declare DGEMM_BETA in KERNEL.ARMV8 rather than the generic KERNEL 6 years ago
  w00421467 aeef942c4f use arm neon instructions to optimize gemm beta operation 6 years ago
  Martin Kroeker 85ccdce8c4 Remove the IOS fallbacks to generic C kernels 6 years ago
  Martin Kroeker a448884a63 Remove automatic label postfixes from macro included only once 6 years ago
  Martin Kroeker 3a2df19db6 Fix accidental duplication of jump instruction 6 years ago
  Martin Kroeker 56837e9d92 Make local labels in macro compatible with the xcode assembler 6 years ago
  Martin Kroeker 3e3ccb9011 Add ARM64 implementations of ?sum 6 years ago
  maomao194313 783ba8058f HiSilicon tsv110 CPUs optimization branch 7 years ago
  Martin Kroeker 7639f2e1f0 Rewrite the conditional for OSX to fix cmake parsing on others 7 years ago
  Martin Kroeker 6ba30e270d Fix typo that broke CNRM2 on ARMV8 since 0.3.0 7 years ago
  Renato Golin 310ea55f29 Simplifying ARMv8 build parameters 7 years ago
  Ashwin Sekhar T K d5aeff636f ARM64: Enable DYNAMIC_ARCH 7 years ago
  Ashwin Sekhar T K d50abc8903 ARM64: Move parameters from parameter.c to param.h 7 years ago
  Ashwin Sekhar T K 351a0c777c ARM64: Remove XGENE1 references 7 years ago
  Ashwin Sekhar T K 21f46a1cf2 ARM64: Use THUNDERX2T99 Neon Kernels for ARMV8 7 years ago
  Ashwin Sekhar T K caf339412f ARM64: Remove dependency of THUNDERX2T99 Makefile on CORTEXA57 Makefile 7 years ago
  Ashwin Sekhar T K 8001fdcd2a ARM64: Remove dependency of THUNDERX Makefile on ARMV8 Makefile 7 years ago
  Ashwin Sekhar T K 162e312832 ARM64: Remove dependency of CORTEXA57 Makefile on ARMV8 Makefile 7 years ago
  Ashwin Sekhar T K c3d93caa8d ARM64: Remove dependency of XGENE1 Makefile on ARMV8 Makefile 7 years ago
  Martin Kroeker 1cb7b9015e Conditional compilation of assembly files that IOS does not like 7 years ago
  Martin Kroeker a4bd41e9f2 Fix paths to C kernels for nrm2 7 years ago
  Craig Donner c2545b0fd6 Fixed a few more unnecessary calls to num_cpu_avail. 7 years ago
  Ashwin Sekhar T K fa9ca65c0e ARM64: Fix utest dsdot errors 8 years ago
  Martin Kroeker c9d408064a Use dot.S also for DSDOT on CORTEXA57 8 years ago