136 Commits (4dec151d0b0327b59329b780491f41f640792e9c)

Author SHA1 Message Date
  Martin Kroeker 7c51cc8527 Merge branch 'develop' into develop 7 years ago
  AbdelRauf 853a18bc17 power9 makefile. dgemm based on power8 kernel with following changes : 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). improvement from 17 to 22-23gflops. dtrmm cases were added into dgemm itself 7 years ago
  Martin Kroeker 03d7110900 Merge pull request #2042 from maomao194313/develop 7 years ago
  maomao194313 7e3eb9b25d make DYNAMIC_ARCH=1 package work on TSV110 7 years ago
  ken-cunningham-webuse b0c714ef60 param.h : enable defines for PPC970 on DarwinOS 7 years ago
  Martin Kroeker bdc73a49e0 Add parameters for Z14 7 years ago
  Martin Kroeker bbfdd6c0fe Increase Zen SWITCH_RATIO to 16 7 years ago
  Arjan van de Ven b28f75cd7e set GEMM_PREFERED_SIZE for HASWELL 7 years ago
  Arjan van de Ven cdc668d82b Add a "sgemm direct" mode for small matrixes 7 years ago
  Renato Golin 310ea55f29 Simplifying ARMv8 build parameters 7 years ago
  Arjan van de Ven 5b708e5eb1 sgemm/dgemm: add a way for an arch kernel to specify prefered sizes 7 years ago
  Ashwin Sekhar T K d50abc8903 ARM64: Move parameters from parameter.c to param.h 7 years ago
  Ashwin Sekhar T K 21f46a1cf2 ARM64: Use THUNDERX2T99 Neon Kernels for ARMV8 7 years ago
  Martin Kroeker 4cf7315a5d Adjust ARMV8 SGEMM unrolling when using the C fallback kernel_2x2 for IOS 7 years ago
  Arjan van de Ven 6eb4b9ae7c Tune HASWELL SWITCH_RATIO as well 8 years ago
  Arjan van de Ven 5c6f008365 Tune param.h for SkylakeX 8 years ago
  Arjan van de Ven 99c7bba8e4 Initial support for SkylakeX / AVX512 8 years ago
  Martin Kroeker d94d7baf7e Add mips32r2 api target 8 years ago
  Shivraj Patil e3d844b062 Added mips I6500 core 8 years ago
  Gian-Carlo Pascutto 832a272784 Revert Zen param.h to Haswell values (instead of Excavator). 9 years ago
  Denis Steckelmacher c9ff735da6 Add ZEN support (tested for auto-detected static backend) 9 years ago
  Martin Kroeker cd135e2b59 Merge pull request #1130 from quickwritereader/develop 9 years ago
  Abdurrauf 08786c4b95 strmm and ctrmm 9 years ago
  Abdurrauf 82e80fa82b initial strmm(sgemm). not tuned yet 9 years ago
  Martin Kroeker ffc1d6c468 Merge pull request #1108 from ashwinyes/develop_20170203_thunderx2t99 9 years ago
  Ashwin Sekhar T K 19ba133383 THUNDERX2T99: Add Optimized ZGEMM Implementation 9 years ago
  Abdurrauf 0d96b0e2a7 Merge branch 'z13' into develop 9 years ago
  Abdurrauf 848cb27b1e ztrmm kernel. 9 years ago
  Ashwin Sekhar T K 2757b49767 THUNDERX2T99: Add Optimized CGEMM Implementation 9 years ago
  Ashwin Sekhar T K f279ff4789 THUNDERX2T99: Add Optimized SGEMM Implementation 9 years ago
  Ashwin Sekhar T K 4b55fae337 ARM64: Add Cavium THUNDERX2T99 Target 9 years ago
  Andrew Pinski fb200c7245 ARM64: Add Cavium THUNDERX Target 9 years ago
  Ashwin Sekhar T K 4713e7c47f ARM64: Add the VULCAN Target 9 years ago
  Zhang Xianyi b678471d65 Merge branch 'z13' into develop 9 years ago
  Abdurrauf 6418667818 dtrmm and dgemm for z13 9 years ago
  Shivraj Patil 9687437928 MIPS n32 ABI and build time mips simd support check 9 years ago
  Shivraj Patil d1c6469283 MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS 9 years ago
  Shivraj Patil beb1d076a4 Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions 9 years ago
  Zhang Xianyi 8a592ee386 Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714 9 years ago
  Ashwin Sekhar T K 0a5ff9f9f9 Improvements to TRMM and GEMM kernels 9 years ago
  Shivraj Patil 57df7956ee Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM. 10 years ago
  Shivraj Patil c4ba40e308 SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function 10 years ago
  Werner Saar 88011f625d Merge pull request #876 from wernsaar/develop 10 years ago
  Werner Saar 8310d4d3f7 optimized dgemm for 20 threads 10 years ago
  Shivraj Patil 085cf236c2 conflict resolved by syncing with 'xianyi:develop' 10 years ago
  Shivraj Patil b7b3d8ec8e DGEMM optimization for MIPS P5600 and I6400 using MSA 10 years ago
  Zhang Xianyi cd7af5260a Merge pull request #847 from sva-img/develop 10 years ago
  Werner Saar 782f75ba94 optimized param.h for POWER8 10 years ago
  Werner Saar 0d0c6f7d7d optimized dgemm for POWER8 10 years ago
  Werner Saar 40ac64ae4f updated param.h for EXCAVATOR 10 years ago