Vaisakh K V
d23eb3b93e
Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API
* Added ARMV9SME target
* Added SGEMM_DIRECT kernel based on SME1
1 year ago
Martin Kroeker
9db51f790a
Remove any optimization flags from DEBUG builds on POWER architecture
1 year ago
Caroline Newcombe
760bf7aa37
Update Fortran return for complex data types (Cray and Nvidia compilers)
1 year ago
Chip Kerchner
36bd3eeddf
Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power).
1 year ago
Martin Kroeker
a492181665
filter out Loongarch -mabi options for flang-new
1 year ago
Martin Kroeker
a1073f5eed
Merge pull request #4900 from XiWeiGu/la64_core_rename
LoongArch64: Rename core
1 year ago
gxw
48698b2b1d
LoongArch64: Rename core
Use microarchitecture names instead of meaningless strings to name the cores;
the legacy core names are still retained.
1. Rename LOONGSONGENERIC to LA64_GENERIC
2. Rename LOONGSON3R5 to LA464
3. Rename LOONGSON2K1000 to LA264
1 year ago
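The rename listed in the commit above can be read as a simple alias table. The following is a minimal sketch in Python, not the project's actual build logic: the three name pairs come from the commit message, while the helper name `canonical_core` and the pass-through behaviour for unknown names are assumptions made for illustration.

```python
# Legacy-to-new core names, copied from the commit message above.
LEGACY_TO_NEW = {
    "LOONGSONGENERIC": "LA64_GENERIC",
    "LOONGSON3R5": "LA464",
    "LOONGSON2K1000": "LA264",
}


def canonical_core(name: str) -> str:
    """Map a legacy LoongArch64 core name to its new name.

    Hypothetical helper: names that are not legacy names (including the
    new names themselves) are returned unchanged.
    """
    return LEGACY_TO_NEW.get(name.upper(), name)


if __name__ == "__main__":
    for core in ("LOONGSON3R5", "LA464", "LOONGSON2K1000"):
        print(core, "->", canonical_core(core))
```

Since the commit states that the legacy names are retained, a lookup of this kind would let both spellings resolve to the same core.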
Martin Kroeker
969bb949b1
Strip any mtune option from FFLAGS if the compiler is flang-new
1 year ago
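As an illustration of the flag filtering described in the commit above, here is a rough Python sketch. The real change lives in the build system and its exact form is not shown in this log; the function name `strip_mtune` and the string-based compiler check are assumptions for illustration only.

```python
def strip_mtune(fflags: str, compiler: str) -> str:
    """Drop every -mtune option from a flag string when the compiler is flang-new.

    Hypothetical helper: other compilers get their flags back unchanged.
    """
    if compiler != "flang-new":
        return fflags
    # Keep every whitespace-separated flag that is not an -mtune option.
    kept = [flag for flag in fflags.split() if not flag.startswith("-mtune")]
    return " ".join(kept)


if __name__ == "__main__":
    print(strip_mtune("-O2 -mtune=native -fPIC", "flang-new"))
    print(strip_mtune("-O2 -mtune=native -fPIC", "gfortran"))
```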
Martin Kroeker
383e0b133e
remove suppression of gcc14's incompatible pointer error
1 year ago
Martin Kroeker
42d8865234
fix typo
1 year ago
Martin Kroeker
fcb88b9d52
enable GEMM/GEMV forwarding for riscv and ppc
1 year ago
Chris Sidebottom
b26424c6a2
Allow opt into GEMM -> GEMV forwarding
1 year ago
Martin Kroeker
a4e56e0452
Merge pull request #4806 from Mousius/small-gemm
Small GEMM for AArch64 with SVE
1 year ago
yamazaki-mitsufumi
821ef34635
Add A64FX to the list of CPUs supported by DYNAMIC_ARCH
1 year ago
Mark Ryan
3b715e6162
Add autodetection for riscv64
Implement DYNAMIC_ARCH support for riscv64. Three CPU types are
supported: riscv64_generic, riscv64_zvl256b and riscv64_zvl128b.
The two non-generic kernels require CPU support for RVV 1.0 to
function correctly. Detecting that a riscv64 device supports
RVV 1.0 is a little complicated as there are some boards on the
market that advertise support for V via hwcap but only support
RVV 0.7.1, which is not binary compatible with RVV 1.0. The
approach taken is to first try hwprobe. If hwprobe is not
available, we fall back to hwcap + an additional check to distinguish
between RVV 1.0 and RVV 0.7.1.
Tested on a VM with VLEN=256, a CanMV K230 with VLEN=128 (with only
the big core enabled), a Lichee Pi with RVV 0.7.1 and a VF2 with no
vector.
A compiler with RVV 1.0 support must be used to build OpenBLAS for
riscv64 when DYNAMIC_ARCH=1.
Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
1 year ago
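The detection order described in the riscv64 commit above (hwprobe first, then hwcap plus an extra check) can be sketched as a pure decision function. This is a Python model of the logic, not the actual C implementation: the function name, its parameters, and the VLEN thresholds used to choose between the two vector kernels are assumptions; only the three core names and the fallback order come from the commit message.

```python
from typing import Optional


def select_riscv64_core(hwprobe_has_rvv10: Optional[bool],
                        hwcap_has_v: bool,
                        extra_check_is_rvv10: bool,
                        vlen_bits: int) -> str:
    """Pick one of the three riscv64 core names from the commit message.

    hwprobe_has_rvv10: None models "hwprobe is not available".
    hwcap_has_v / extra_check_is_rvv10: model the fallback path, where
    hwcap alone is not trusted because some boards advertise V but only
    implement RVV 0.7.1.
    vlen_bits: vector register length; the thresholds are an assumption.
    """
    if hwprobe_has_rvv10 is None:
        # Fallback: hwcap plus the additional RVV 1.0 vs 0.7.1 check.
        rvv10 = hwcap_has_v and extra_check_is_rvv10
    else:
        rvv10 = hwprobe_has_rvv10

    if not rvv10:
        return "riscv64_generic"
    if vlen_bits >= 256:
        return "riscv64_zvl256b"
    if vlen_bits >= 128:
        return "riscv64_zvl128b"
    return "riscv64_generic"


if __name__ == "__main__":
    # A board that advertises V via hwcap but fails the extra check.
    print(select_riscv64_core(None, True, False, 128))
```

The key design point from the commit is that a positive hwcap bit is never sufficient on its own, so a board with RVV 0.7.1 ends up on the generic kernel.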
gxw
8ab2e9ec65
LoongArch: DGEMM small matrix opt
2 years ago
Martin Kroeker
4376b6f7d2
Restore Loongson LA64ARCH handling
1 year ago
Martin Kroeker
fc10673fd3
Merge branch 'develop' into hugetlb-doc
1 year ago
Martin Kroeker
9c4e10fbd1
sort hugetlb and shm alloc options
1 year ago
Martin Kroeker
7c915e64ca
Silence a GCC14 warning/error in the f2c-converted LAPACK
1 year ago
Martin Kroeker
ae695d4ca0
Merge pull request #4642 from XiWeiGu/loongarch64_clang
CI: Add clang test for loongarch64
1 year ago
gxw
7cd438a5ac
loongarch64: Fixed clang compilation issues
1 year ago
Martin Kroeker
0ec0746ae4
Update Makefile.system
1 year ago
Martin Kroeker
d6b0badc05
Fix declarations for EMBEDDED
1 year ago
Martin Kroeker
00ee5d0367
On ARM, do not assume -marm by default if OS_EMBEDDED=1
1 year ago
Chip Kerchner
1c13cda3fc
Remove -openmp flag from XLF (since it doesn't support it).
1 year ago
Martin Kroeker
52b71a1673
Filter out FFLAGS that flang-new from LLVM18 no longer supports ( #4569 )
1 year ago
Martin Kroeker
a14176440a
Add version macro for GCC12
1 year ago
Martin Kroeker
56fad407d1
Merge pull request #4527 from ChipKerchner/fixAIXBuildIssues
Fix LAPACK unit testing build issues.
1 year ago
Chris Sidebottom
7a6fa699f2
Small GEMM for AArch64
This is a fairly conservative addition of small matrix kernels using
SVE.
1 year ago
Martin Kroeker
d1409407a0
Omit redundant prefixes or suffixes in library naming
1 year ago
Chip-Kerchner
3e030cc5fe
Fix LAPACK unit testing build issues. Limit AIX builds to 32 threads (to eliminate failures on some systems).
1 year ago
Martin Kroeker
2e86faa657
Merge branch 'develop' into issue4468
1 year ago
Ayappan Perumal
892f8ff3e5
Shared library support for AIX
1 year ago
Martin Kroeker
ca6b4961e4
updates to fix option conflicts and config file generation
1 year ago
Martin Kroeker
bb96e466ae
Introduce LIBNAMEPREFIX to avoid messing with the internal LIBPREFIX
1 year ago
Martin Kroeker
1ed69ea1c0
improve naming
2 years ago
Martin Kroeker
63fbffddf8
Add option FIXED_LIBNAME to suppress versioning and softlinking
2 years ago
Dirreke
ec89466e14
Add CSKY support
2 years ago
Chris Sidebottom
dc20a78188
Use functionally equivalent dynamic targets
Similar to `drivers/other/dynamic.c`, I've looked for functionally
equivalent targets and mapped them in the default DYNAMIC_ARCH build.
Users can still build specific cores using DYNAMIC_LIST.
2 years ago
Martin Kroeker
47b03fd4b4
Copy XCode15-specific workaround to Fortran flags to fix build of tests
2 years ago
Martin Kroeker
9c3c1cfbd6
Merge pull request #4304 from martin-frbg/issue4277
Move clang/gfortran OpenMP dependency rewriting out of f_check
2 years ago
Martin Kroeker
1a308a0066
Move OpenMP dependency handling for clang/gfortran combo
2 years ago
Chip Kerchner
206e76187e
Fix FCOMMON_OPT for power. Error out for certain C and Fortran compiler combos in AIX.
2 years ago
Rajalakshmi Srinivasaraghavan
980f702f72
POWER: AIX: Make use of power10 optimization
POWER10 optimizations were disabled when using the default AIX assembler.
Since many issues have been fixed recently, enable the optimization path
for the default assembler.
2 years ago
Martin Kroeker
b41cab0875
Need to use override to actually strip down the already defined FFLAGS for NAG and CCE Fortran
2 years ago
Martin Kroeker
103d6f4e42
Require "classic ld" with XCODE 15.x on Mac
2 years ago
Rajalakshmi Srinivasaraghavan
a11e1e10f4
powerpc: Fix build errors with xlf
This patch fixes errors when using xlf as fortran compiler on Linux.
Tested with gcc/xlf and clang/xlf compiler combinations.
2 years ago
Martin Kroeker
bb47183222
Force -qextname for trailing underscore generation when IBM xlf is used with gcc
2 years ago
Martin Kroeker
09911f077e
Disable SVE targets for DYNAMIC_ARCH when compiling with (homebrew)gcc on macOS/arm64
2 years ago