Martin Kroeker
d2df5bd72c
Update param.h
4 years ago
Martin Kroeker
af4d4e55d1
Update param.h
4 years ago
Martin Kroeker
f5e7fe0ec4
Update param.h
4 years ago
Martin Kroeker
93cec29c8f
Update param.h
4 years ago
Martin Kroeker
fca8259062
Update param.h
4 years ago
Martin Kroeker
656b17b4bf
Update param.h
4 years ago
Martin Kroeker
c684cae97c
Update param.h
4 years ago
Martin Kroeker
a7a05b78fe
Update param.h
4 years ago
Martin Kroeker
49878cad51
Update param.h
4 years ago
Martin Kroeker
699c0a0365
Update param.h
4 years ago
Martin Kroeker
3ce413d1db
Update param.h
4 years ago
Martin Kroeker
1049dfefa1
Update param.h
4 years ago
Martin Kroeker
3e409b156d
Update param.h
4 years ago
Martin Kroeker
4217096c92
Update param.h
4 years ago
Martin Kroeker
ceb535c1ea
Update param.h
4 years ago
Martin Kroeker
2b3d2ef789
Update param.h
4 years ago
Martin Kroeker
17376df24f
Update param.h
4 years ago
Martin Kroeker
2cc76cc843
Update param.h
4 years ago
Martin Kroeker
1489e977bf
Update param.h
4 years ago
Martin Kroeker
0a92a783b1
Update param.h
4 years ago
Martin Kroeker
4224f7ee5d
Update param.h
4 years ago
Martin Kroeker
98548457e8
Update param.h
4 years ago
Martin Kroeker
fa7e4d86fc
try 512/512 for neoverse dgemm
4 years ago
Martin Kroeker
4b5c24b45f
double neoverse dgemm p&q again
4 years ago
Martin Kroeker
8a1e00bba8
increase dgemm pq for neoverse
4 years ago
Wangyang Guo
8356a604f0
sbgemm: cooperlake: tuning for block params
4 years ago
Niyas Sait
7cddbf99b1
Make explicit conversion condition on _WIN64 flag
4 years ago
Niyas Sait
d1ed72fa87
[win/arm64]: Explicit casting for GEMM_DEFAULT_ALIGN to create 64-bit value
Win64 uses the LLP64 data model, in which unsigned long is only 32 bits wide.
On a 64-bit architecture we need a 64-bit mask to generate addresses correctly.
4 years ago
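A minimal sketch of the truncation problem that the commit above describes. The helper names and the mask value `0x3fff` are illustrative assumptions, not code taken from param.h, and fixed-width types stand in for a 32-bit and a 64-bit `unsigned long` so the effect can be reproduced on any platform.

```c
#include <stdint.h>

/* Illustrative alignment mask (assumption, not copied from param.h). */
#define EXAMPLE_ALIGN_MASK 0x3fffu

/* Models LLP64: the mask has a 32-bit type, so its complement is
   0xFFFFC000 and is zero-extended when combined with a 64-bit address.
   The upper 32 bits of the address are cleared by the AND. */
static uint64_t align_with_32bit_mask(uint64_t addr)
{
    uint32_t mask32 = EXAMPLE_ALIGN_MASK;
    return (addr + mask32) & (uint32_t)~mask32;
}

/* Models the fix: the mask is widened to 64 bits before it is
   complemented, so the upper address bits survive the AND. */
static uint64_t align_with_64bit_mask(uint64_t addr)
{
    uint64_t mask64 = EXAMPLE_ALIGN_MASK;
    return (addr + mask64) & ~mask64;
}
```

For the address `0x123456789`, the 64-bit variant rounds up to `0x123458000`, while the 32-bit variant yields `0x23458000` and loses the high bits.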
gxw
af0a69f355
Add support for LOONGARCH64
4 years ago
Martin Kroeker
a6351e32f0
Remove BLASLONG casts from SPARC entries
in response to https://github.com/xianyi/OpenBLAS/pull/3266#issuecomment-878637675
4 years ago
User User-User
b7da75e4fd
WiP CORTEX A55 support
4 years ago
Martin Kroeker
7dfc45e840
Remove casts for PPC/POWER and complete parameters for POWER3/4
5 years ago
Gordon Fossum
198adea961
Changed default P/Q values for CGEMM and ZGEMM (Power10 only)
5 years ago
Martin Kroeker
8cdf0825de
Add workaround for older gcc on ppc64be not supporting casts in defines
5 years ago
Martin Kroeker
ecb4babcf4
remove inclusion of common.h again to avoid circular dependency
5 years ago
Martin Kroeker
30d835168a
Merge pull request #3088 from xoviat/msvc
add misc fixes.
5 years ago
austinpagan
9579bd47e5
Modifying a couple of parameters in the "POWER10"-specific section of param.h, for performance enhancements for SGEMM and DGEMM.
5 years ago
Rajalakshmi Srinivasaraghavan
63fa6c832e
Fix build issue on POWER8 with DYNAMIC_ARCH
Running make DYNAMIC_ARCH=1 on POWER8 BE with gcc 10.2 gives
the following error due to the difference in UNROLL_M/N.
'No rule to make target 'dgemm_incopy_POWER10.o', needed by kernel'
5 years ago
xoviat
457ccc42c9
Merge branch 'develop' into msvc
5 years ago
Gordon Fossum
ed652d8136
Added definitions for GEMM_PREFERED_SIZE and SWITCH_RATIO to the POWER9 and POWER10 specific sections of param.h.
5 years ago
Martin Kroeker
83de62c20d
Merge pull request #3026 from martin-frbg/revert747
Revert PR747 - SYRK parameter changes for Haswell and related targets
5 years ago
gxw
4b548857d6
Add msa support for loongson
1. Using core loongson3r3 and loongson3r4 for loongson
2. Add DYNAMIC_ARCH for loongson
Change-Id: I1c6b54dbeca3a0cc31d1222af36a7e9bd6ab54c1
5 years ago
Martin Kroeker
d71fe4ed4e
Remove GEMM_DEFAULT_UNROLL_MN parameters for Haswell and ZEN (introduced in PR747)
5 years ago
Martin Kroeker
b0b14f4e9b
Change comments to C style for compatibility
5 years ago
Rajalakshmi Srinivasaraghavan
41fe6e864e
POWER10: Update param.h
Increasing the values of DGEMM_DEFAULT_P and DGEMM_DEFAULT_Q improves
DGEMM performance by about 10%.
5 years ago
Xianyi Zhang
fc35b72ae1
Refs #2899
Merge branch 'openblas-open-910' of git://github.com/damonyu1989/OpenBLAS into damonyu1989-openblas-open-910
5 years ago
Xianyi Zhang
913cc9a4ca
Merge branch 'develop' into risc-v
5 years ago
Rajalakshmi Srinivasaraghavan
dd7a9cc5bf
POWER10: Change dgemm unroll factors
Changing the unroll factors for dgemm to 8 shows improved performance with
POWER10 MMA feature. Also made some minor changes in sgemm for edge cases.
5 years ago
Zhang Xianyi
d7ba7679b6
Merge branch 'develop' into risc-v
5 years ago
damonyu
ef8e7d0279
Add the support for RISC-V Vector.
Change-Id: Iae7800a32f5af3903c330882cdf6f292d885f266
5 years ago