Martin Kroeker
3785c0e82b
Merge pull request #2663 from martin-frbg/issue2654
Respect predefined defaults for AR, AS, LD and RANLIB
5 years ago
Martin Kroeker
f2d8879af6
Merge pull request #2661 from martin-frbg/issue2660
Report selected DYNAMIC_ARCH kernel rather than one of its aliases in gotoblas_corename
5 years ago
Martin Kroeker
6876221cf3
Remove optimization level limit for flang again and add -fno-unroll-loops for AOCC flang 2.x instead
5 years ago
Martin Kroeker
79cdcde717
Re-enable higher optimization levels for flang while disabling loop unrolling for AOCC flang
5 years ago
Martin Kroeker
18a11137f1
Update BLAS tests to correspond to Reference-LAPACK 3.9.0
Replaces the calculation of machine precision with a call to the epsilon intrinsic and removes the need to delete previous output files before rerunning tests
5 years ago
Martin Kroeker
1dd712131e
Fix spelling of flang option -Mrecursive and add -Kieee
5 years ago
Martin Kroeker
0ed2adf0b2
Fix spelling of flang option -Mrecursive and add -Kieee
5 years ago
Martin Kroeker
abf670757b
Respect predefined defaults for AR, AS, LD and RANLIB
5 years ago
Simon Märtens
41fc6f3cd2
Added missing exported symbols.
5 years ago
Martin Kroeker
007d9f97d7
Make gotoblas_corename report the name of the selected TARGET rather than its aliases
5 years ago
Martin Kroeker
63d26090f5
Merge pull request #64 from xianyi/develop
rebase
5 years ago
Rajalakshmi Srinivasaraghavan
9fe930f205
powerpc: Add support for future processor
This is the initial patch to support build infrastructure
for POWER10 architecture.
5 years ago
Martin Kroeker
3a1b58d54a
Merge pull request #2653 from craft-zhang/cortex-a53
fix INIT8x4 of SGEMM on Arm Cortex-A53
5 years ago
Martin Kroeker
f7659be4a0
Merge pull request #2652 from martin-frbg/flang-fixes
Fixes for compilation with flang binary release 20190329
5 years ago
ZhangDanfeng
bc6fd20a40
fix INIT8x4
Signed-off-by: ZhangDanfeng <467688405@qq.com>
5 years ago
Martin Kroeker
3ce469a34f
Limit optimization level to O1 for flang and add -frecursive
5 years ago
Martin Kroeker
ba2c5b404d
When building with flang, use it also for the final link step to get dependencies right
5 years ago
Martin Kroeker
f07a80354b
Apply previously AOCC-specific workaround to all versions of flang
5 years ago
Martin Kroeker
fdd1b50263
Merge pull request #63 from xianyi/develop
rebase
5 years ago
Leonard Lausen
b98923f33a
Test enforce -O1 for flang
5 years ago
Leonard Lausen
4cb1db0e3b
Test flang build
5 years ago
Martin Kroeker
430e8b45fe
Merge pull request #2648 from martin-frbg/lapack411
Break out of potentially infinite rescaling loop in LAPACK xLARGV/xLARTG/xLARTGP
5 years ago
Martin Kroeker
88fe85f4e0
Merge pull request #2647 from martin-frbg/aocc-flang
Small fixes for flang in general and the AMD AOCC version of it in particular
5 years ago
Martin Kroeker
89091e6b64
Merge pull request #2645 from martin-frbg/misc_fixes
Miscellaneous fixes
5 years ago
Martin Kroeker
522aaf53bf
Break out of potentially infinite rescaling loop in LAPACK xLARGV/xLARTG/xLARTGP
Reference-LAPACK issue 411
5 years ago
Martin Kroeker
c3574ffe53
Merge pull request #2646 from wjc404/develop
Optimize AVX512 parallel DGEMM performance
5 years ago
Martin Kroeker
4e28dc6353
Use only -O1 with AMD AOCC version of flang
to prevent miscompilation of LAPACK codes and tests on Ryzen
5 years ago
Martin Kroeker
13c28889a2
Update "cosmetic fixes for non-C99 compilers"
5 years ago
wjc404
0e3ac4a06b
Add files via upload
5 years ago
Martin Kroeker
28915eed72
Cosmetic fixes for non-C99 compilers
5 years ago
Martin Kroeker
7f60fb6b91
Delete spurious copy of common_param.h
5 years ago
Martin Kroeker
0464e662ad
make blas_quickdivide unsigned and guard against miscompilation
5 years ago
Martin Kroeker
0f9a935a5a
Merge pull request #62 from xianyi/develop
rebase
5 years ago
Martin Kroeker
79cd69fea4
Merge pull request #2644 from martin-frbg/cmake-maxstack
Add CMake support for MAX_STACK_ALLOC setting
5 years ago
Martin Kroeker
bb12c2c854
Limit MAX_STACK_ALLOC availability to non-Windows
5 years ago
Martin Kroeker
32c1c1e125
Update azure-pipelines.yml
5 years ago
Martin Kroeker
f1953b8b81
Update azure-pipelines.yml
5 years ago
Martin Kroeker
6e97df7b47
Add CMake support for MAX_STACK_ALLOC setting
5 years ago
Martin Kroeker
729303e5ed
Merge pull request #2643 from craft-zhang/cortex-a53
Improve performance of SGEMM on Arm Cortex-A53
5 years ago
Martin Kroeker
547965530f
Merge pull request #2638 from leezu/actions
Add GitHub Actions test for DYNAMIC_ARCH builds on Linux and macOS
5 years ago
ZhangDanfeng
9b7877ccf1
sgemm copy source init
Signed-off-by: ZhangDanfeng <467688405@qq.com>
5 years ago
ZhangDanfeng
f82fa802d1
Insert prefetch
Signed-off-by: ZhangDanfeng <467688405@qq.com>
5 years ago
Martin Kroeker
3eda3d34c3
Merge pull request #2641 from martin-frbg/ppcg4
Work around PPC G4 test failures
5 years ago
Martin Kroeker
a8f42ae85c
set cmake build type to Release
5 years ago
Martin Kroeker
e6e2e531bc
revert clang pragma
5 years ago
Martin Kroeker
456dc04441
Update sgemm_kernel_16x4_skylakex_3.c
5 years ago
Martin Kroeker
89323458a9
preset optimization level for apple clang
5 years ago
Martin Kroeker
e153bdeb70
Update dynamic_arch.yml
5 years ago
Martin Kroeker
c2001f7756
Make cmake build verbose to see options in use
5 years ago
Martin Kroeker
c2b3f0b3f6
Revert "keep Apple Clang from optimizing this"
5 years ago