Martin Kroeker
114bbbc6d7
Merge pull request #3212 from martin-frbg/lapack463
Initialize X and Y to zero for N=0 in xGGGLM (Reference-LAPACK PR463)
4 years ago
Martin Kroeker
b67a92c19f
Merge pull request #3211 from martin-frbg/lapack471
Handle norm NaN value in xGESDD (Reference LAPACK PR471)
4 years ago
Martin Kroeker
4bf00da8fb
Avoid allocating the transposed triangular matrix (Reference-LAPACK PR382)
4 years ago
Martin Kroeker
c26780d451
Initialize X and Y to zero for N=0 (Reference-LAPACK PR463)
4 years ago
Martin Kroeker
d77d9bc920
Handle norm NaN value (Reference LAPACK PR471)
4 years ago
Martin Kroeker
37d3e2bd94
Merge pull request #3210 from martin-frbg/lapack502
Fix possible division by zero in LAPACK xTGSJA (Reference-LAPACK PR502)
4 years ago
Martin Kroeker
de8656769c
Fix possible division by zero in xTGSJA (Reference-LAPACK PR502)
4 years ago
Martin Kroeker
d43e07198d
Merge pull request #3208 from martin-frbg/lapack534
Apply MKL team fixes to the LAPACKE interfaces (Reference-LAPACK PR 534)
4 years ago
Martin Kroeker
da16764c7a
Merge pull request #3209 from martin-frbg/issue3160
Add casts to prevent overflow of intermediate results
4 years ago
Martin Kroeker
98ebc8ac59
Add casts to prevent overflow of intermediate result
4 years ago
Martin Kroeker
904b221f03
Add cast to prevent overflow of intermediate result
4 years ago
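The cast commits above address a common C pitfall: when two int values are multiplied, the product is computed in int and can overflow before it is assigned to a wider type. The sketch below shows the general pattern only. The helper name column_offset and its parameters are hypothetical and are not taken from the OpenBLAS sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (an assumption for illustration, not OpenBLAS code):
   offset of column j in a column-major matrix with leading dimension lda. */
static int64_t column_offset(int j, int lda)
{
    assert(j >= 0 && lda >= 0);
    /* Widen one operand first so the multiplication is done in 64 bits.
       Writing (int64_t)(j * lda) would be too late: the product would
       already have been computed in int. */
    return (int64_t)j * lda;
}
```

With the cast in place, column_offset(100000, 100000) yields 10000000000, a value that does not fit in a 32-bit int.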
Martin Kroeker
5cc35abc3d
Apply MKL team fixes to the LAPACKE interfaces (Reference-LAPACK PR 534)
Removed spurious checks for INFO after calls to xLACPY and xLASET, which do not return any, and redundant requirements for ldvt in xGESVD_WORK
4 years ago
Martin Kroeker
254774f5a6
Add const qualifiers
4 years ago
Martin Kroeker
ae9cdee753
Merge pull request #3207 from hjl-tools/hjl/cet/develop
x86: Enable Intel CET
4 years ago
H.J. Lu
53ee0b76bb
x86: Enable Intel CET
When Intel CET is enabled, we need to include <cet.h> in assembly code
to mark Intel CET support and place _CET_ENDBR at the function entry.
4 years ago
Martin Kroeker
dc6b04c375
Merge pull request #3206 from martin-frbg/lapack480535
Import packing improvements to LAPACK xLAQR from Reference-LAPACK (PR 480+535)
4 years ago
pnp
3d4ccd2a13
fix for build error
4 years ago
pnp
c59652f0ce
optimize on sgemv_n for small n
4 years ago
Martin Kroeker
87d2e314db
Import packing improvements in LAPACK xLAQR from Reference-LAPACK PR 480+535
4 years ago
Martin Kroeker
3a30c12019
Merge pull request #25 from xianyi/develop
rebase
4 years ago
Martin Kroeker
c9a82f54d1
Merge pull request #3204 from martin-frbg/lapack506
Correct INFO value returned by SLASQ2/DLASQ2 (Reference-LAPACK 506)
4 years ago
Martin Kroeker
444cb78be5
correct INFO value (Reference-LAPACK 506)
4 years ago
Martin Kroeker
171c20e3b6
Merge pull request #3202 from martin-frbg/issue3201
Fix division by zero in the non-x86 codepath of C/ZROTG
4 years ago
Martin Kroeker
c5fb91f1bc
Fix division by zero in the non-x86 codepath
4 years ago
Martin Kroeker
9a36a283d3
Merge pull request #3199 from martin-frbg/lapack537
Add LAPACKE fixes from Reference-LAPACK PR 537
4 years ago
Martin Kroeker
7e35d25ea0
Merge pull request #3198 from martin-frbg/lapack539
Apply fixes from Reference-LAPACK PR468 and 539 for array declarations in ?ORGBR/?UNGBR
4 years ago
Martin Kroeker
3704f5e5b0
Add missing break statements in the ?lascl functions
4 years ago
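A missing break in a C switch lets execution fall through into the following case labels, so a later case can silently overwrite the result of the one that matched. The sketch below illustrates that effect in general terms. The function names and the type-to-code mapping are hypothetical and do not reproduce the actual ?lascl code.

```c
#include <assert.h>

/* Hypothetical example (not the ?lascl code): map a matrix-type
   character to a small integer code, with a break after every case. */
static int type_code(char type)
{
    int code = -1;
    switch (type) {
    case 'G': code = 0; break;
    case 'L': code = 1; break;
    case 'U': code = 2; break;
    default:  code = -1; break;
    }
    return code;
}

/* The same switch with the breaks omitted, to show the fall-through. */
static int type_code_fallthrough(char type)
{
    int code = -1;
    switch (type) {
    case 'G': code = 0; /* falls through */
    case 'L': code = 1; /* falls through */
    case 'U': code = 2;
    }
    return code;
}

/* Small self-check contrasting the two versions. */
static void check_type_codes(void)
{
    assert(type_code('G') == 0);              /* only the 'G' branch runs */
    assert(type_code_fallthrough('G') == 2);  /* 'L' and 'U' branches run too */
}
```

In the fall-through version every input from 'G' to 'U' ends up with the code of the last case, which is the kind of error a missing break introduces.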
Martin Kroeker
6b76066632
Add const qualifiers
4 years ago
Martin Kroeker
2b01132515
Clean up misdeclaration of the dummy stand-in for A in ?ORGBR/?UNGBR workspace queries (Reference-LAPACK PR 468 and 530)
4 years ago
Martin Kroeker
8e95a1e18d
Merge pull request #3195 from martin-frbg/lapack536
Apply lapack-testing fix from Reference-LAPACK PR536
4 years ago
Wangyang Guo
aa7b3dc3db
GEMM: skylake: improve the performance when m is small
4 years ago
Martin Kroeker
13a29d13fd
Apply lapack-testing fix from Reference-LAPACK PR536
Fixes the switch back from a single OMP thread for error exit testing to the originally requested number of threads for computational tests
4 years ago
Martin Kroeker
a6c2cb8417
Merge pull request #3193 from martin-frbg/lapack538
Apply lapack-testing fixes from Reference-LAPACK PR538
4 years ago
Martin Kroeker
d511a7bb4f
Merge pull request #3191 from martin-frbg/issue3188
Delay creation of the (soft)link until after the library has been built
4 years ago
Martin Kroeker
3526ff2507
Apply fixes from Reference-LAPACK PR538
4 years ago
Martin Kroeker
adcfe7b789
Merge pull request #3190 from martin-frbg/issue3128-2
Replace spurious AVX512 requirement in the Haswell drot microkernel with an AVX2/FMA3 guard
4 years ago
damonyu
ceb44bef14
update the intrinsic api to the official name.
4 years ago
damonyu1989
ed473267df
Merge pull request #1 from xianyi/develop
update
4 years ago
Martin Kroeker
0608bc5d82
delay creation of the softlink until after the library has been created
4 years ago
Martin Kroeker
3d511f0e66
replace spurious avx512 requirement with fma check
4 years ago
Martin Kroeker
0b8a436af9
Add mixed clang/ifort build on OSX to Azure CI (#3185)
* Add mixed clang/ifort build on OSX to the Azure CI config based on https://github.com/oneapi-src/oneapi-ci
(and remove debugging tools from the clang+gfortran job)
* Remove extraneous libgfortran dependency of ifort builds
* Remove FEXTRALIB from the link line of the shared library, as ifort keeps track of dependencies (and they are different for a .dylib than what f_check got for an executable)
4 years ago
Martin Kroeker
352efdd13a
Merge pull request #24 from xianyi/develop
rebase
4 years ago
Martin Kroeker
4855af02a3
Merge pull request #3184 from martin-frbg/ctestfix
Fix obscure ctest crashes on OSX and add OSX builds to Azure CI
4 years ago
Martin Kroeker
94a5a1f0f1
Add OSX build variations to Azure CI
4 years ago
Martin Kroeker
751d127d7c
Include cblas_test.h to achieve int/long size change with INTERFACE64
4 years ago
Martin Kroeker
fc101b67e5
Merge pull request #23 from xianyi/develop
rebase
4 years ago
Martin Kroeker
b0239a05fd
Merge pull request #3183 from martin-frbg/2715-x
Restore __volatile__ keyword in ARM64 DYNAMIC_ARCH detection mechanism
4 years ago
Martin Kroeker
623d580b4c
Restore __volatile__ keyword
4 years ago
Martin Kroeker
974acb39ff
Merge pull request #3181 from RajalakshmiSR/dgemmp10vp
POWER10: Improve dgemm performance
4 years ago
Rajalakshmi Srinivasaraghavan
2379abaa5e
POWER10: Improve dgemm performance
This patch uses a vector pair pointer for the input load operations,
which helps to generate POWER10 lxvp instructions.
4 years ago