Xianyi Zhang
7834c10e2f
Add PingTouGe contribution credit.
5 years ago
Martin Kroeker
53e0837809
Merge pull request #3022 from jinboson/develop
Fix test errors reported by cblas_cgemm & cblas_ctrmm
5 years ago
Martin Kroeker
8fef5876d1
Merge pull request #3024 from martin-frbg/sparc
Fix 32 and 64bit builds on SPARC with SolarisStudio compilers
5 years ago
Martin Kroeker
6c7d557a16
Fix compiler options for 32 and 64bit SPARC builds with SolarisStudio
5 years ago
Martin Kroeker
b660008c7e
Work around DOT and SWAP test failures
5 years ago
Martin Kroeker
f8346603cf
Fix compilation with SolarisStudio
5 years ago
Martin Kroeker
93473174d6
Fix utest build with SolarisStudio compilers
5 years ago
Martin Kroeker
b0b14f4e9b
Change comments to C style for compatibility
5 years ago
Martin Kroeker
3a1b1b7c8c
Fix complex ABI for 32bit SolarisStudio builds
5 years ago
Martin Kroeker
da6d5d675c
Fix hostarch detection for sparc
5 years ago
Martin Kroeker
04fa17322c
Fix build options for SolarisStudio compilers
5 years ago
Martin Kroeker
3853014ea1
Merge pull request #1 from xianyi/develop
rebase
5 years ago
Jin Bo
65de6f5957
Fix test errors reported by cblas_cgemm & cblas_ctrmm
The file cgemm_kernel_8x4_msa.c contains the MSA-optimized code
for cblas_cgemm and cblas_ctrmm. It defines two macros,
CGEMM_SCALE_1X2 and CGEMM_TRMM_SCALE_1X2. The pc1 array
indices used in both macros should be 0 and 1.
5 years ago
Martin Kroeker
f21618684b
Merge pull request #3018 from martin-frbg/issue3015
Avoid concurrent inclusion of libgomp and libomp in clang+gfortran builds
5 years ago
Martin Kroeker
441c08c9ff
Merge pull request #3016 from xiegengxin/complex-asum
Improve the performance of zasum and casum with AVX512 intrinsic
5 years ago
Martin Kroeker
66302b3c06
Merge pull request #3013 from martin-frbg/gcc46
Fix 32bit x86 builds and add workaround for x86_64 miscompilations by gcc 4.6 (including our Travis setup)
5 years ago
Martin Kroeker
07e9a12349
Merge pull request #3011 from cyyever/fix_link
link math lib on FreeBSD
5 years ago
Martin Kroeker
dd1adbdec4
Merge pull request #3019 from RajalakshmiSR/dgemm_param
POWER10: Update param.h
5 years ago
Martin Kroeker
a1eecccda2
Update f_check
5 years ago
Rajalakshmi Srinivasaraghavan
41fe6e864e
POWER10: Update param.h
Increasing the values of DGEMM_DEFAULT_P and DGEMM_DEFAULT_Q
improves DGEMM performance by ~10%.
5 years ago
Martin Kroeker
74b5850581
Add libomp to the LAPACK(-test) dependencies in clang/gfortran builds
5 years ago
Martin Kroeker
da0c94c76f
Avoid linking both GNU libgomp and LLVM libomp in clang/gfortran builds
5 years ago
Martin Kroeker
a6692dc129
use gfortran-10 with xcode 12
5 years ago
Martin Kroeker
72a553f5bc
Update .travis.yml
5 years ago
Martin Kroeker
dcbb3b5ef1
fix misplaced lines
5 years ago
Martin Kroeker
57456c248b
fix gfortran requirement in osx interface64 test
5 years ago
Martin Kroeker
c361313564
Disable deprecated 32bit xcode
5 years ago
Gengxin Xie
0cb7a403b2
Fix erroneous declaration of function blas_level1_thread_with_return_value
5 years ago
Martin Kroeker
77a538d4ba
Update an overlooked instance of xcode 10.0 as well
5 years ago
Martin Kroeker
9621062eba
Update OSX xcode version to 11.5
5 years ago
Gengxin Xie
b766c1e9bb
Improve the performance of zasum and casum with AVX512 intrinsic
5 years ago
Martin Kroeker
22574b474e
Suppress -mfma as well for gcc 4.6
5 years ago
Martin Kroeker
f662022994
Move the version check to avoid overwriting unprocessed compiler data
5 years ago
Martin Kroeker
5e81e81478
Merge pull request #3014 from RajalakshmiSR/dgemvnp10
POWER10: Optimize dgemv_n
5 years ago
Rajalakshmi Srinivasaraghavan
7d46e31de1
POWER10: Optimize dgemv_n
Handling as 4x8 with vector pairs gives better performance than
the existing code on POWER10.
5 years ago
Martin Kroeker
62a2eb884f
Add SSE flags for x86
5 years ago
Martin Kroeker
2e99e2699b
Add workaround for gcc 4.6 miscompiling assembly kernels with -mavx
5 years ago
Martin Kroeker
006b13299f
Merge pull request #3012 from martin-frbg/restore-getarch
Restore RISCV entries accidentally trashed by my PR 3005
5 years ago
Martin Kroeker
ca17d3dc3d
Restore RISCV entries accidentally trashed by my PR 3005
5 years ago
Martin Kroeker
52ed2741c5
Merge pull request #3010 from ggouaillardet/topic/fj_compilers
add Fujitsu compilers
5 years ago
cyy
3b4c016110
link math lib on FreeBSD
5 years ago
Gilles Gouaillardet
358100ec15
add Fujitsu compilers
Co-authored-by: Tomoki Karatsu <karatsu.spack@gmail.com>
5 years ago
Martin Kroeker
3788b6d156
Merge pull request #3005 from martin-frbg/ssefix
Add -msse for x86 and silence build warning in getarch
5 years ago
Martin Kroeker
bc5b1ddf0d
Merge pull request #3004 from martin-frbg/bsd_getauxval
ARM64 DYNAMIC_ARCH build fix for BSD/OSX
5 years ago
Martin Kroeker
2f42d23104
Merge pull request #3002 from martin-frbg/issue3000
Ensure that all targets in a DYNAMIC_ARCH build on POWER use the same buffer size
5 years ago
Martin Kroeker
b72dd007dc
Merge pull request #3001 from martin-frbg/issue2996
Fix ambiguous ifdefs in tests for user-defined options in Makefiles
5 years ago
Martin Kroeker
11ebe5fa25
Avoid redefinition warning
5 years ago
Martin Kroeker
01f01dae98
Add -msse if supported
5 years ago
Martin Kroeker
e7bf8ced6c
Build fix for systems that do not support getauxval
5 years ago
Martin Kroeker
0256294921
Fix syntax mixup
5 years ago