768 Commits (4b55fae337f2a2cc327ad3e7afe61c2aed2f3ae8)

Author SHA1 Message Date
  Ashwin Sekhar T K 4b55fae337 ARM64: Add Cavium THUNDERX2T99 Target 9 years ago
  Andrew Pinski 95649dee28 THUNDERX: Add optimized version of daxpy 10 years ago
  Andrew Pinski 8fdb0655e9 THUNDERX: Add an optimized version of ddot 10 years ago
  Andrew Pinski fb200c7245 ARM64: Add Cavium THUNDERX Target 9 years ago
  Ashwin Sekhar T K 0b8e876d89 VULCAN: Add optimized DGEMM implementation 9 years ago
  Ashwin Sekhar T K 4713e7c47f ARM64: Add the VULCAN Target 9 years ago
  Ashwin Sekhar T K 6085386b10 CORTEXA57: Add assembly kernels for copy routines 9 years ago
  kaustubh 1480f3df71 Add msa optimization for AXPY, COPY, SCALE, SWAP 9 years ago
  kaustubh 88afb3bc94 Add msa optimization for AXPY, COPY, SCALE, SWAP 9 years ago
  Zhang Xianyi b678471d65 Merge branch 'z13' into develop 9 years ago
  Zhang Xianyi 864e202afd Add USE_TRMM=1 for IBM z13 in kernel/Makefile.L3 9 years ago
  Abdurrauf 6418667818 dtrmm and dgemm for z13 9 years ago
  Shivraj Patil a9bf8a781a Added prefetch to CGEMV and ZGEMV. 9 years ago
  kaustubh 5f93aa5f87 Updated data prefetch in TRSM, ASUM, DOT functions 9 years ago
  kaustubh 9db451acd0 Updated data prefetch in TRSM, ASUM, DOT functions 9 years ago
  kaustubh 3eaff85191 Updated data prefetch in TRSM, ASUM, DOT functions 9 years ago
  kaustubh 00abce3b93 Add data prefetch in DOT and ASUM functions 9 years ago
  Andrew becf8bc7a0 remove dead code 9 years ago
  kaustubh f3419e634c SGEMM, DGEMM, CGEMM, ZGEMM functions data prefetch 9 years ago
  Zhang Xianyi 7472c79ea6 Merge pull request #984 from ksraste/develop 9 years ago
  kaustubh 90e2321ac3 STRSM, DTRSM functions data prefetch 9 years ago
  Martin Kroeker 4998e19869 Change file comments to work around clang 3.9 assembler bug 9 years ago
  Martin Kroeker 91610f3835 Update zdot_msa.c 9 years ago
  Martin Kroeker 6e22ecf102 Update zdot.c 9 years ago
  Martin Kroeker 6221d6df5f Update zdot.c 9 years ago
  Martin Kroeker 16446d1d23 Remove explicit include of complex.h 9 years ago
  Martin Kroeker a6e9e0b94b Remove explicit include of complex.h 9 years ago
  Martin Kroeker 3178e4fea0 Remove explicit include of complex.h 9 years ago
  Martin Kroeker 95c245ddb0 Remove explicit include of complex.h 9 years ago
  Martin Kroeker 4b1b27347f Remove explicit include of complex.h 9 years ago
  Shivraj Patil 54747fe24a DGEMM function split and data prefetch 9 years ago
  Zhang Xianyi 515bc56ea9 Refs #946. Use nrm2 reference implementation for Power8. 9 years ago
  Zhang Xianyi ae70b916f4 Refs #929. Deal with zero and NaNs for scale. 9 years ago
  Shivraj Patil 9687437928 MIPS n32 ABI and build time mips simd support check 9 years ago
  Shivraj Patil d1c6469283 MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS 9 years ago
  Ashwin Sekhar T K c54a29bb48 Cortex A57: Improvements to DGEMM 8x4 kernel 9 years ago
  Shivraj Patil beb1d076a4 Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions 9 years ago
  Zhang Xianyi 8a592ee386 Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714 9 years ago
  Ashwin Sekhar T K 0a5ff9f9f9 Improvements to TRMM and GEMM kernels 9 years ago
  Ashwin Sekhar T K 8a40f1355e Improvements to GEMV kernels 9 years ago
  Ashwin Sekhar T K 78782485b6 Improvements to COPY and IAMAX kernels 9 years ago
  Shivraj Patil 57df7956ee Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM. 9 years ago
  Zhang Xianyi 4a30a2584a Merge pull request #897 from ksraste/develop 9 years ago
  Werner Saar f04af36ad0 Merge pull request #898 from wernsaar/develop 9 years ago
  Kaustubh Raste 011431b9d7 STRSM optimized for MSA 9 years ago
  Kaustubh Raste c8a7860eb3 STRSM optimized 9 years ago
  Zhang Xianyi 2daad2bcb5 Merge pull request #893 from biddisco/develop 9 years ago
  John Biddiscombe 053044ae4d Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR 9 years ago
  Aleksey Kuleshov fca66262c4 mips64/axpy: fix error when INCY == 0 9 years ago
  Werner Saar 412bcd187a optimized dtrsm_logic_LT_16x4_power8.S and dtrsm_macros_LT_16x4_power8.S 9 years ago