5436 Commits (af2b0d0205bfc9637242b73fddacacb787e150ce)
 

Author SHA1 Message Date
  Martin Kroeker af2b0d0205
Merge pull request #3066 from martin-frbg/buffsizefix 5 years ago
  Martin Kroeker 4bf988959a
Merge pull request #3062 from austinpagan/GemmPreferedSize3 5 years ago
  Martin Kroeker a0e4fb3a28
Merge pull request #3061 from martin-frbg/arm64-pgi 5 years ago
  Martin Kroeker 2c445be8ba
Merge pull request #3051 from martin-frbg/rocketlake 5 years ago
  Martin Kroeker 6bbe6d5b92
Make compile-time BUFFERSIZE setting actually reach the compiler/preprocessor 5 years ago
  Martin Kroeker 89ae305e11
Workaround for cmake having its own C_COMPILER variable 5 years ago
  Martin Kroeker da8d7f09f1
try to work around gcc update problems 5 years ago
  Martin Kroeker e18a2c22db
Merge pull request #3060 from martin-frbg/dyn_arm64 5 years ago
  Martin Kroeker b716c0ef01
Add workaround for NVIDIA HPC 5 years ago
  Martin Kroeker 2efa3b70dc
Add workaround for NVIDIA HPC 5 years ago
  Martin Kroeker 49959d4f1c
Add workaround for NVIDIA HPC 5 years ago
  Martin Kroeker 0f27a03607
Add workaround for NVIDIA HPC mishandling of the asm DOT kernels 5 years ago
  Martin Kroeker c2a8ebfe69
Add workaround for NVIDIA HPC mishandling of the asm DOT kernels 5 years ago
  Martin Kroeker 43aac5bacc
Support NVIDIA HPC compiler 5 years ago
  Martin Kroeker bff2b7c94d
Support compilation with NVIDIA HPC compilers (which do not take gcc-style arch options) 5 years ago
  Martin Kroeker 2d45a262d9
Support compilation with nvfortran 5 years ago
  Gordon Fossum ed652d8136
Added definitions for GEMM_PREFERED_SIZE and SWITCH_RATIO to the POWER9 and POWER10 specific sections of param.h. 5 years ago
  Martin Kroeker 6fe0f1fab9
Label get_cpu_ftr as volatile to keep gcc from rearranging the code 5 years ago
  Martin Kroeker 018dec8588
Merge pull request #7 from xianyi/develop 5 years ago
  Martin Kroeker 5d6209e1f9
Merge pull request #3055 from RajalakshmiSR/swapp10 5 years ago
  Rajalakshmi Srinivasaraghavan 601b711c78
Optimize swap function for POWER10 5 years ago
  Martin Kroeker 78702753f2
Merge pull request #3053 from pkubaj/patch-1 5 years ago
  pkubaj 7aa1ff8ff6
Fix build on FreeBSD/powerpc64le 5 years ago
  Martin Kroeker d6c97cf010
Merge pull request #3052 from ashwinyes/arm64_fix_nrm2 5 years ago
  Ashwin Sekhar T K 1b2508362b
arm64: Fix nrm2 for input vectors with Inf 5 years ago
  Martin Kroeker cd898af59f
Merge pull request #3050 from aurel32/riscv64-openblas-supported 5 years ago
  Aurelien Jarno 0a535e58d8
getarch.c: define OPENBLAS_SUPPORTED for riscv64 5 years ago
  Martin Kroeker 9ce9e295fe
Merge pull request #3049 from martin-frbg/readme 5 years ago
  Martin Kroeker 9a38592c79
Add pointers to the netlib documentation and Gilbert Strang's linear algebra primers 5 years ago
  Martin Kroeker 9b3965b08c
Merge pull request #6 from xianyi/develop 5 years ago
  Martin Kroeker 531cb4f673
Merge pull request #3035 from Joshua-Ashton/patch-1 5 years ago
  Martin Kroeker 3559c5d7a2
Merge pull request #3048 from martin-frbg/issue2998 5 years ago
  Martin Kroeker 8631e2976a
Temporarily revert to the old nrm2 kernels 5 years ago
  Martin Kroeker 2768bc1764
Temporarily revert to the old nrm2 kernels 5 years ago
  Martin Kroeker 6f4698ee1f
Temporarily revert to the old nrm2 kernel 5 years ago
  Martin Kroeker 85e5165e98
Merge pull request #3046 from martin-frbg/nvidiasdk-ppc 5 years ago
  Martin Kroeker 17c16f2a71
Implement builtin_cpu_is and limit cpu choices to P8 and P9 for NVIDIA compilers 5 years ago
  Martin Kroeker 91c3f86c2b
NVIDIA compiler does not yet support POWER10 5 years ago
  Martin Kroeker 75b1f3becc
Limit POWERPC DYNAMIC_CORE list to P8 and P9 for NVIDIA compilers 5 years ago
  Martin Kroeker 07c5e549b2
Merge pull request #3045 from martin-frbg/nvidiasdk 5 years ago
  Martin Kroeker 114eb159a4
Disable FMA intrinsics in the srot kernel when the compiler is PGI/NVIDIA 5 years ago
  Martin Kroeker 005cce5507
Amend SkylakeX options to support the NVIDIA compiler 5 years ago
  Martin Kroeker b859b6e79d
Add nvfortran 5 years ago
  Martin Kroeker b212a2fb9f
Add/modify "PGI" compiler options for NVIDIA SDK 20.11 5 years ago
  Martin Kroeker e40416567a
Add version printout for PGI/NVIDIA compiler 5 years ago
  Martin Kroeker b37e5fa2f8
Merge pull request #5 from xianyi/develop 5 years ago
  Martin Kroeker 326469ef4a
Merge pull request #3042 from martin-frbg/develop 5 years ago
  Martin Kroeker c73d8ee40d
Conditionally add -mfma to compiler options where needed 5 years ago
  Martin Kroeker abef2ea770
Move -fma option setting to kernel/Makefile.L1 5 years ago
  Martin Kroeker b26e32c3af
Merge pull request #3040 from martin-frbg/fixfcheck 5 years ago