Martin Kroeker
4fffa556d8
Merge pull request #2611 from RajalakshmiSR/bench_half
Include shgemm in benchtest
5 years ago
Rajalakshmi Srinivasaraghavan
ce90e2bd3f
Include shgemm in benchtest
This patch enables benchtest for half-precision gemm
when BUILD_HALF is set during make.
5 years ago
Martin Kroeker
948b6712ba
Merge pull request #2610 from martin-frbg/issue2552-3
Temporary workaround for excessive LAPACK test failures with COMPLEX on Skylake-X
5 years ago
Martin Kroeker
2271c3506b
Work around excessive LAPACK test failures on Skylake-X
Something in the plain C parts of x86_64 cscal.c and zscal.c appears to be miscompiled by both gfortran9 and ifort when compiling for skylakex-avx512, even when the optimized Haswell microkernel is not in use.
5 years ago
Martin Kroeker
db00b21445
Merge pull request #2609 from martin-frbg/issue2552-2
Correct ifort options
5 years ago
Martin Kroeker
58d26b4448
Correct ifort options
to match those suggested by reference-lapack
5 years ago
Martin Kroeker
8e47d14053
Merge pull request #2608 from martin-frbg/issue2604
Handle trailing whitespace and empty variables in KERNEL files
5 years ago
Martin Kroeker
cd10b35fe9
Handle trailing spaces and empty condition variables
5 years ago
Martin Kroeker
9472dd99cd
Merge pull request #57 from xianyi/develop
rebase
5 years ago
Martin Kroeker
7181665452
Merge pull request #2605 from RajalakshmiSR/cmake-power
Fix cmake compilation issue - POWER9
5 years ago
Rajalakshmi Srinivasaraghavan
bd9ff820bc
Fix cmake compilation issue - POWER9
This patch removes the extra space in the sgemmotcopy filename
so that an entry for it can be created in the kernel/Makefile
generated by cmake.
5 years ago
Martin Kroeker
63e45def70
Merge pull request #2603 from martin-frbg/issue2552
Add FFLAGS_DRV entry to the generated make.inc to fix lapack-test failure with Intel compilers
5 years ago
Martin Kroeker
ec0f228632
Add FFLAGS_DRV to the generated make.inc to fix lapack-test on x86_64 with icc/ifort
fixes #2552
5 years ago
Martin Kroeker
90e2941c61
Merge pull request #56 from xianyi/develop
rebase
5 years ago
Martin Kroeker
10d5f3c87b
Merge pull request #2602 from ashwinyes/thunderx2_develop
DAXPY Optimizations for ThunderX2
5 years ago
Ashwin Sekhar T K
8353cb245a
ARM64: Improve DAXPY for ThunderX2
Improve performance of DAXPY for ThunderX2
when the vector fits in L1 Cache.
5 years ago
Martin Kroeker
ec2dd7b875
Merge pull request #2601 from martin-frbg/issue818
Undefine NAME/CNAME etc in Makefile.system before defining them
5 years ago
Martin Kroeker
4e82eb9f8a
Undefine ASMNAME/NAME/CNAME before defining them
to avoid redefinition warnings when environment variables like CFLAGS are being used (fixes #818)
5 years ago
Martin Kroeker
61300bb735
Merge pull request #55 from xianyi/develop
rebase
5 years ago
Martin Kroeker
33e9b12464
Merge pull request #2597 from martin-frbg/appleclang
Use Clang 9.0.0 miscompilation fix for corresponding AppleClang version as well
5 years ago
Martin Kroeker
90dba9f716
Duplicate earlier Clang 9.0.0 workaround for corresponding Apple Clang version
As discussed on the original PR #2329, the "Apple Clang 11.0.3" that appears to be based on the same LLVM release produces the same miscompilation of this file.
5 years ago
Martin Kroeker
424d551e01
Merge pull request #53 from xianyi/develop
rebase
5 years ago
Martin Kroeker
596f5df9e8
Merge pull request #2591 from RajalakshmiSR/testhalf
Add test for shgemm
5 years ago
Martin Kroeker
5dd14e3d48
Make building the bfloat16 functions conditional on option BUILD_HALF (#2590)
* make building the bfloat16 BLAS functions conditional on BUILD_HALF
* pass the BUILD_HALF option to gensymbol
* Pass BUILD_HALF as a compiler define for dynamic_arch builds
5 years ago
Martin Kroeker
a54e35e780
Merge pull request #2586 from martin-frbg/miscfixes
Trivial fix for compiler warnings
5 years ago
Rajalakshmi Srinivasaraghavan
564b0d39ef
Add test for shgemm
This patch changes the Makefile to add a test for shgemm that
compares the sgemm and shgemm results.
5 years ago
Martin Kroeker
5d58b11101
Merge pull request #52 from xianyi/develop
rebase
5 years ago
Martin Kroeker
d394d4e677
Merge pull request #2585 from martin-frbg/mips64fix
Increase default BUFFER_SIZE on MIPS64
5 years ago
Martin Kroeker
f4248af26e
Fix compiler warnings
5 years ago
Martin Kroeker
2d89603e9d
Increase BUFFER_SIZE on mips64 to match SGEMM parameters
5 years ago
Martin Kroeker
26bc15258a
Merge pull request #51 from xianyi/develop
rebase
5 years ago
Martin Kroeker
141998dce2
Merge pull request #2584 from martin-frbg/issue2583
[WIP] Have CMAKE parse conditional lines in KERNEL files
5 years ago
Martin Kroeker
3bd56846bb
Silence a debug message
5 years ago
Martin Kroeker
e7bbdfdf84
Have CMAKE parse conditional lines in KERNEL files
Supports ifeq and ifneq, but requires both to have an else branch
5 years ago
Martin Kroeker
b6795db731
Merge pull request #2582 from martin-frbg/mips32fix
Increase BUFFER_SIZE on MIPS32 to accommodate SGEMM requirements
5 years ago
Martin Kroeker
5e0dbf8dfe
Increase default BUFFER_SIZE to accommodate SGEMM parameters
in response to compile-time warning from #2551
5 years ago
Martin Kroeker
955d73127f
Merge pull request #50 from xianyi/develop
rebase
5 years ago
Martin Kroeker
a8c1bea7ae
Merge pull request #2581 from martin-frbg/raji
Fix travis configuration and update CONTRIBUTORS.md
5 years ago
Martin Kroeker
e43b49e064
Drop the set -e from travis scripts
5 years ago
Martin Kroeker
3e28db7f38
Update CONTRIBUTORS.md
5 years ago
Martin Kroeker
4b69ee31af
Merge pull request #2580 from martin-frbg/issue2538-3
Increase POWER8 ZGEMM_R and use same R values for POWER9
5 years ago
Martin Kroeker
03ff213c51
Increase POWER8 ZGEMM_R and use same R values for POWER9
fixes lapack-test zger failures seen in #2299 after application of my PR #2551
5 years ago
Martin Kroeker
299d1c8de0
Merge pull request #2578 from martin-frbg/issue2576
Quote getarch include paths in prebuild.cmake
5 years ago
Martin Kroeker
70869d571f
Quote include paths for getarch to protect any embedded spaces
5 years ago
Martin Kroeker
cba87222b2
Merge pull request #49 from xianyi/develop
rebase
5 years ago
Martin Kroeker
f80dd2151e
xcode 11.4.1 for homebrew?
5 years ago
Martin Kroeker
4412ee1754
Switch homebrew build env to new xcode 11.4
default 11.3.1 in the github image is causing brew to fail with "outdated xcode" message
5 years ago
Martin Kroeker
f6104b68c1
Merge pull request #2571 from martin-frbg/issue2299
Work around IDAMAX/IZAMAX bugs on POWER8BE with ELFv2 FreeBSD
5 years ago
Martin Kroeker
84f2c71e93
Merge pull request #2573 from martin-frbg/issue2572
Enable cblas interfaces to GEMM3M in CMAKE builds
5 years ago
Martin Kroeker
06208c8d01
Limit this fix to ELFv2 builds
5 years ago