Rajalakshmi Srinivasaraghavan
63fa6c832e
Fix build issue on POWER8 with DYNAMIC_ARCH
Running make DYNAMIC_ARCH=1 on POWER8 BE with gcc 10.2 gives
the following error due to the difference in UNROLL_M/N:
'No rule to make target 'dgemm_incopy_POWER10.o', needed by kernel'
5 years ago
Martin Kroeker
074d9bff7f
Merge pull request #3104 from martin-frbg/issue3103
Enable optimized Haswell/AVX2 kernels for sasum/dasum and srot/drot on Ryzen
5 years ago
Martin Kroeker
f36862603a
Merge pull request #3101 from jake-arkinstall/issue-3100
Addressed issue #3100 - removing an unnecessary write to the include directory
5 years ago
Martin Kroeker
47691c031f
Use Haswell optimizations for Zen as well
5 years ago
Martin Kroeker
ce7ddd8921
Use Haswell optimizations for Zen as well
5 years ago
Martin Kroeker
950c047b49
Use Haswell optimizations for Zen as well
5 years ago
Martin Kroeker
46509953a9
Use Haswell optimizations for Zen as well
5 years ago
Martin Kroeker
db348dcff2
Enable optimized srot/drot kernels from Haswell
5 years ago
Martin Kroeker
a33f471065
Merge pull request #3102 from martin-frbg/issue3099
Strip pkgversion info from compiler version string before comparing
5 years ago
Martin Kroeker
ece3ce581e
Strip parenthesized (pkgversion) data from GCC version string to avoid misinterpretation
5 years ago
Martin Kroeker
8189a98d85
Merge pull request #12 from xianyi/develop
rebase
5 years ago
Jake Arkinstall
d7a77091a3
Addressed issue #3100, removing an unnecessary write to the include directory
5 years ago
Martin Kroeker
3e1e74fca6
Merge pull request #3094 from xoviat/patch-1
build openmp on appveyor
5 years ago
Martin Kroeker
33b5670122
Merge pull request #3096 from martin-frbg/fixclangcmake
Fix Cooperlake/DYNAMIC_ARCH builds with clang on Windows
5 years ago
Martin Kroeker
95e19e2e23
fix case in compiler name check
Co-authored-by: xoviat <49173759+xoviat@users.noreply.github.com>
5 years ago
Martin Kroeker
99ac042702
remove spurious lines (probably editor malfunction)
5 years ago
Martin Kroeker
774b9f8653
handle AppleClang in Cooperlake support condition
5 years ago
Martin Kroeker
eb1d2344f7
Fix compiler version check for Intel Cooperlake support (clang-cl does not accept -dumpversion)
5 years ago
xoviat
6fa9860dbe
appveyor: cleanup and add openmp run
5 years ago
Martin Kroeker
0cc36770f1
Merge pull request #3073 from xoviat/embedded
add embedded option
5 years ago
Martin Kroeker
558cd543bf
Merge pull request #3093 from martin-frbg/fix3064
fix copy-paste error in build rules for cblas_crotg and cblas_zrotg
5 years ago
Martin Kroeker
bd906e3410
fix copy-paste error in build rules for cblas_crotg and cblas_zrotg
5 years ago
Martin Kroeker
35086cb501
Merge pull request #3092 from RajalakshmiSR/cscal_p10
Optimize cscal function for POWER10
5 years ago
Rajalakshmi Srinivasaraghavan
2056ffc227
Optimize cscal function for POWER10
This patch makes use of new POWER10 vector pair instructions for
loads and stores.
5 years ago
Martin Kroeker
7745439312
Merge pull request #3091 from martin-frbg/lapack477-2
Fix calculation of the non-exceptional shift values in LAPACK complex QZ
5 years ago
Martin Kroeker
c4b5abbe43
fix data type
5 years ago
Martin Kroeker
f87842483e
fix calculation of non-exceptional shift (from Reference-LAPACK PR 477)
5 years ago
Martin Kroeker
3dbb32c734
Merge pull request #11 from xianyi/develop
rebase
5 years ago
Martin Kroeker
00880c720a
Merge pull request #3087 from martin-frbg/lapack477
Apply Reference-LAPACK PR 477 for convergence problems in CHGEQZ/ZHGEQZ
5 years ago
Martin Kroeker
856bc36533
Add exceptional shift to fix rare convergence problems
5 years ago
Martin Kroeker
fe71887b68
Merge pull request #10 from xianyi/develop
rebase
5 years ago
Martin Kroeker
10094bd885
Merge pull request #3076 from martin-frbg/dyn-thunderx
Add CI job for ARM64/gcc10 DYNAMIC_ARCH
5 years ago
Martin Kroeker
eea0c0f2ed
Merge pull request #3085 from alexhenrie/memory_alloc
Fix null pointer check in blas_memory_alloc
5 years ago
Martin Kroeker
85be43e0df
Merge pull request #3083 from martin-frbg/develop
Add DYNAMIC_LIST support for ARM64
5 years ago
Martin Kroeker
0cb9e9fc8d
Remove the VORTEX support bits again for now
5 years ago
Martin Kroeker
cb61d3b46b
Add DYNAMIC_LIST support for ARM64
5 years ago
Alex Henrie
113840da12
Fix null pointer check in blas_memory_alloc
5 years ago
Martin Kroeker
deb2e66bcc
Add DYNAMIC_LIST support for ARM64
5 years ago
Martin Kroeker
9b2d69aa80
Add DYNAMIC_LIST option for ARM64
5 years ago
Martin Kroeker
e3ff4cdd23
Merge pull request #9 from xianyi/develop
rebase
5 years ago
Martin Kroeker
0745ba43a4
Merge pull request #3082 from RajalakshmiSR/scalp10
Optimize s/dscal function for POWER10
5 years ago
Rajalakshmi Srinivasaraghavan
3ede843d50
Optimize s/dscal function for POWER10
This patch makes use of new POWER10 vector pair instructions for
loads and stores.
5 years ago
xoviat
2e8d6e8690
add functions for embedded
5 years ago
Martin Kroeker
69a5558203
Merge pull request #3059 from Guobing-Chen/BF16_gemm
Initial code for Cooperlake BF16 GEMM kernel
5 years ago
Martin Kroeker
d6905403e3
Merge pull request #3068 from alexhenrie/scan-build
scan-build fixes
5 years ago
Martin Kroeker
411926b572
Merge pull request #3079 from RajalakshmiSR/rotp10
Optimize s/drot function for POWER10
5 years ago
Rajalakshmi Srinivasaraghavan
439b93f6d2
Optimize s/drot function for POWER10
This patch makes use of new POWER10 vector pair instructions for
loads and stores.
5 years ago
Martin Kroeker
d6cf67778c
Merge pull request #3075 from martin-frbg/issue3074
Fix DYNAMIC_ARCH compilation on POWER with gcc <11
5 years ago
Martin Kroeker
b94dab5250
patch to support power10 in builtin_cpu_is was backported to gcc 10.2, so allow that as well
5 years ago
Martin Kroeker
6178974cd9
Update .drone.yml
5 years ago