Martin Kroeker
68b1713c30
Merge pull request #2811 from martin-frbg/issue2806
Make NO_AVX512 option override the AVX512 compile test in CMAKE builds as well
5 years ago
Martin Kroeker
0a4c5c4c44
Merge pull request #2807 from martin-frbg/issue2804
Work around ARMV8 build-time cpu detection problems on non-Linux systems
5 years ago
Martin Kroeker
5feb087c05
Handle Apple labeling armv8 as arm64 rather than aarch64
5 years ago
Martin Kroeker
7c0977c267
Add OpenMP dependency to pkgconfig file if needed
5 years ago
Martin Kroeker
bd3207b4b4
Update system.cmake
5 years ago
Martin Kroeker
b8ebfc9335
Update system.cmake
5 years ago
Martin Kroeker
7c1986640b
Fall back from cooperlake to skylake if gcc<10
5 years ago
Martin Kroeker
71d33c952d
Typo fix
5 years ago
Martin Kroeker
6a3c074786
-march=cooperlake requires gcc10
5 years ago
Martin Kroeker
430f741b30
-march=cooperlake requires gcc10
5 years ago
Chen, Guobing
e740c4873d
Enable COOPERLAKE build target
Enable new build target platform -- COOPERLAKE. This target platform
supports all the SKYLAKEX supported ISAs + avx512bf16. So all the
SKYLAKEX specific kernels/drivers and related code are now extended
to be also active on COOPERLAKE. Besides, new BF16 related kernels
are active under this target.
5 years ago
Martin Kroeker
cb097beba2
Merge pull request #2741 from martin-frbg/issue2739
Adjust A53 SGEMM parameters to reflect recent switch to 8x8 kernel
5 years ago
Martin Kroeker
64e2e4aaf3
missing braces
5 years ago
Martin Kroeker
921ec4e9e2
Adjust A53 SGEMM parameters to reflect move to 8x8 kernel
5 years ago
Ashwin Sekhar T K
4e1be0e481
ARM64: Add THUNDERX3T110 Target
5 years ago
Martin Kroeker
9e21a100e3
Add trivial check for stdatomic.h
5 years ago
Martin Kroeker
9d000ecaa2
include CheckLanguage module
5 years ago
Martin Kroeker
a847d00366
Handle lack of a Fortran compiler more gracefully
5 years ago
Martin Kroeker
6eaeb01263
Merge pull request #2658 from RajalakshmiSR/p10
powerpc: Add support for future processor
5 years ago
Martin Kroeker
6876221cf3
Remove optimization level limit for flang again and add -fno-unroll-loops for AOCC flang 2.x instead
5 years ago
Martin Kroeker
1dd712131e
Fix spelling of flang option -Mrecursive and add -Kieee
5 years ago
Rajalakshmi Srinivasaraghavan
9fe930f205
powerpc: Add support for future processor
This is the initial patch to support build infrastructure
for POWER10 architecture.
5 years ago
Martin Kroeker
3ce469a34f
Limit optimization level to O1 for flang and add -frecursive
5 years ago
Martin Kroeker
79cd69fea4
Merge pull request #2644 from martin-frbg/cmake-maxstack
Add CMAKE support for MAX_STACK_ALLOC setting
5 years ago
Martin Kroeker
bb12c2c854
Limit MAX_STACK_ALLOC availability to non-Windows
5 years ago
Martin Kroeker
6e97df7b47
Add CMAKE support for MAX_STACK_ALLOC setting
5 years ago
Martin Kroeker
4db00121dc
Disable EXPRECISION and add -lm on OSX (same as the BSDs and Linux)
5 years ago
Martin Kroeker
cd10b35fe9
Handle trailing spaces and empty condition variables
5 years ago
Martin Kroeker
5dd14e3d48
Make building the bfloat16 functions conditional on option BUILD_HALF (#2590)
* make building the bfloat16 BLAS functions conditional on BUILD_HALF
* pass the BUILD_HALF option to gensymbol
* Pass BUILD_HALF as a compiler define for dynamic_arch builds
5 years ago
Martin Kroeker
3bd56846bb
Silence a debug message
5 years ago
Martin Kroeker
e7bbdfdf84
Have CMAKE parse conditional lines in KERNEL files
Supports ifeq and ifneq, but requires both to have an else branch
5 years ago
Martin Kroeker
70869d571f
Quote include paths for getarch to protect any embedded spaces
5 years ago
Martin Kroeker
4f70512b97
Update kernel.cmake
5 years ago
Martin Kroeker
d0737b0142
Update kernel.cmake
5 years ago
Martin Kroeker
a83a59b038
Use generic kernels for ishama,shasum,shdot,shrot
5 years ago
Martin Kroeker
0a19bd813c
Use generic codes for shamax and shcopy
5 years ago
Martin Kroeker
f361de30a3
Use generic axpy.c for SHAXPY as x86 lacks saxpy.c
5 years ago
Martin Kroeker
9f6d6f6cb6
use saxpy.c instead of axpy.S for SHAXPY
5 years ago
Rajalakshmi Srinivasaraghavan
22bb50fb81
cmake fixes
5 years ago
Rajalakshmi Srinivasaraghavan
7eb55504b1
RFC : Add half precision gemm for bfloat16 in OpenBLAS
This patch adds support for bfloat16 data type matrix multiplication kernel.
For architectures that don't support bfloat16, it is defined as unsigned short
(2 bytes). Default unroll sizes can be changed as per architecture as done for
SGEMM and for now 8 and 4 are used for M and N. Size of ncopy/tcopy can be
changed as per architecture requirement and for now, size 2 is used.
Added shgemm in kernel/power/KERNEL.POWER9 and tested in powerpc64le and
powerpc64. For reference, added a small test compare_sgemm_shgemm.c to compare
sgemm and shgemm output.
This patch does not cover OpenBLAS test, benchmark and lapack tests for shgemm.
Complex type implementation can be discussed and added once this is approved.
5 years ago
Martin Kroeker
a05243d0f2
ifort and pgfort need "recursive" for compiling LAPACK as well
as shown in Reference-LAPACK issue 401 (their PR 403)
5 years ago
Martin Kroeker
8c7c1395da
Merge pull request #2521 from martin-frbg/cm-avx512
Use proper extension on the avx512 testcase filename
5 years ago
Martin Kroeker
1d9773b800
Use proper extension on the avx512 testcase filename
The need to call it .tmp existed only when it was generated by a tmpfile call, and the "-x c" option to tell the compiler it is actually a C source is not universally supported (this broke the test with clang-cl at least).
5 years ago
Martin Kroeker
6d54c94760
Make ifort on Windows create lowercase symbols with appended underscore
tentative fix for #2472
5 years ago
مهدي شينون (Mehdi Chinoune)
21f6c4b5a9
fixes #2480
6 years ago
Ali Saidi
c623a965f9
Add Neoverse-N1 core
The implementation is a hybrid of the ARMV8 one with some of the
improved TX2 routines along with specifying -march=v8.2-a
6 years ago
Martin Kroeker
ca4f7dceff
Add parameters for EMAG8180 DYNAMIC_ARCH support with cmake
6 years ago
Martin Kroeker
1ddf9f1067
Add EMAG8180 to arm64 DYNAMIC_ARCH list for cmake
6 years ago
Martin Kroeker
7f0d523b42
Make BUFFER_SIZE configurable
6 years ago
Martin Kroeker
8dc9fd4dfe
Add -march option for AVX512
6 years ago