Angelika Schwarz
db3a43c8ed
Simplify rotg
* The check da != ZERO is no longer necessary since there
is a special case ada == ZERO, where ada = |da|.
* Add the missing check c != ZERO before the division.
Note that with these two changes the long double code
follows the float/double version of the code.
2 years ago
Angelika Schwarz
6876ae0c3b
Fix division by zero in zrotg
The cases
  [ c  s ] * [ 0      ] = [ |db_i| ]
  [-s  c ]   [ i*db_i ]   [ 0      ]
and
  [ c  s ] * [ 0    ] = [ |db_r| ]
  [-s  c ]   [ db_r ]   [ 0      ]
computed s incorrectly. To flip the entries of the vector,
s should be conjg(db)/|db| and not conjg(db) / da,
where da == 0.0.
2 years ago
Martin Kroeker
42909ce57d
Merge branch 'xianyi:develop' into issue4130
2 years ago
Martin Kroeker
a2a184572c
update zrotg
2 years ago
Martin Kroeker
214be14c1d
Correct INFO returned for lda in non-CBLAS s/dgeadd
2 years ago
Martin Kroeker
4cc804c754
Prepare for INCX < 0 in new NRM2 implementation from BLAS 3.10
2 years ago
Martin Kroeker
04cdf5efb4
fix typo and missing declaration
2 years ago
Martin Kroeker
5e1103b8d7
Update rotg.c
2 years ago
Martin Kroeker
7c75c8b2fe
fix truncated edit
2 years ago
Martin Kroeker
0f2ce93904
typo fix
2 years ago
Martin Kroeker
e08743d977
Update to use safe scaling algorithm from Reference-LAPACK PR 527
2 years ago
Martin Kroeker
7e93ab1b9e
Fix info code returned for invalid ldb
2 years ago
Martin Kroeker
bb862b82d5
Fix integer overflow in multithreading threshold calculation for SYMM/SYRK (#4116)
* Fix potential integer overflow
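
The kind of overflow at issue can be sketched as follows. The function name and the threshold parameter are hypothetical; the point is only that a product of three int dimensions can exceed the int range, while converting each factor to double first keeps the product in range.

```c
/* Hypothetical sketch: decide whether a workload is large enough to
   justify multithreading. Names and threshold are assumptions. */
static int use_multithreading_sketch(int m, int n, int k, double threshold)
{
    /* m * n * k evaluated in int arithmetic may overflow for large
       dimensions; converting each factor first avoids that. */
    double work = (double)m * (double)n * (double)k;
    return work > threshold;
}
```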
2 years ago
Martin Kroeker
c3a2d407a0
Merge pull request #4048 from imzhuhl/spr_sbgemm_fix
Sapphire Rapids sbgemm fix
2 years ago
Angelika Schwarz
899c3a6f6a
Improve input argument checks of gemmt
* Fix return value for invalid info
* Add missing checks for ldA, ldB
* Use reference-LAPACK-like checks (i.e., ld=0 with nrows=0 is invalid)
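
A sketch of leading-dimension checks in that style. The function name is hypothetical, and the returned positions assume the argument order UPLO, TRANSA, TRANSB, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC; neither is taken from the actual gemmt source.

```c
static int max_int(int a, int b) { return a > b ? a : b; }

/* Hypothetical checker: returns 0 if the sizes are valid, otherwise the
   (assumed) 1-based position of the first offending argument. */
static int gemmt_check_sketch(char transa, char transb,
                              int n, int k, int lda, int ldb, int ldc)
{
    int nrowa = (transa == 'N') ? n : k;   /* rows of A as stored */
    int nrowb = (transb == 'N') ? k : n;   /* rows of B as stored */

    if (n < 0) return 4;
    if (k < 0) return 5;
    /* max(1, nrows) makes ld = 0 invalid even when nrows = 0. */
    if (lda < max_int(1, nrowa)) return 8;
    if (ldb < max_int(1, nrowb)) return 10;
    if (ldc < max_int(1, n)) return 13;
    return 0;
}
```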
2 years ago
Honglin Zhu
71e4125795
Fix syscall error on non-x86 platform
2 years ago
Honglin Zhu
90f041e348
Invoke the syscall to allow the use of amx tiles
2 years ago
Ken Ho
df1b1f6a91
More detailed error message in [z]imatcopy.c.
2 years ago
Ken Ho
7a86c437b5
Change some "if" statements to "else if" following suggestion by @mmuetzel.
2 years ago
Ken Ho
33ab415f68
Bug fix and improvements for [z]imatcopy interface.
2 years ago
Martin Kroeker
1f6f7328eb
remove redundant declaration
2 years ago
Martin Kroeker
7152d6b06d
fix cblas_gemmt
2 years ago
Martin Kroeker
38d7a7b562
Fix ?GEMMT
2 years ago
Martin Kroeker
912d713b52
redo lost edit
2 years ago
Martin Kroeker
dc15c18efc
Fix build failures seen with the NO_LAPACK option - cspr/csymv/csyr belong on the LAPACK list
2 years ago
H. Vetinari
f2659516ef
remove unqualified ifdef's for NO_LAPACK(E)
2 years ago
Martin Kroeker
f2d6b1c70e
Add multithreading threshold
2 years ago
Martin Kroeker
a495ffc554
Rework multithreading threshold
2 years ago
Martin Kroeker
244147495a
Do not use multithreading for small workloads
2 years ago
Martin Kroeker
ab32f832a8
fix stray blank on continuation line
2 years ago
Martin Kroeker
e359787e28
restore C/Z SPMV, SPR, SYR, SYMV
2 years ago
Martin Kroeker
f10c266b4d
Fix stride in shortcut path for small N
3 years ago
Martin Kroeker
8c99d5d1b6
Merge pull request #3796 from martin-frbg/gemmt
Add a trivial GEMMT implementation based on a looped GEMV
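
The idea of building GEMMT from a looped GEMV can be sketched as below, for the non-transposed, column-major, double-precision case only. Both function names are hypothetical and the plain-loop GEMV stands in for an optimized kernel; this is an illustration of the approach, not the merged code.

```c
#include <assert.h>
#include <stddef.h>

/* y[0..m) = alpha * A * x + beta * y, with A an m-by-k column-major matrix. */
static void gemv_n_sketch(int m, int k, double alpha, const double *a, int lda,
                          const double *x, double beta, double *y)
{
    for (int i = 0; i < m; i++) {
        double acc = 0.0;
        for (int l = 0; l < k; l++)
            acc += a[i + (size_t)l * lda] * x[l];
        y[i] = alpha * acc + beta * y[i];
    }
}

/* Updates only the uplo triangle of C = alpha * A * B + beta * C, where
   A is n-by-k, B is k-by-n and C is n-by-n. One GEMV per column of C. */
static void gemmt_sketch(char uplo, int n, int k, double alpha,
                         const double *a, int lda, const double *b, int ldb,
                         double beta, double *c, int ldc)
{
    assert(uplo == 'U' || uplo == 'L');

    for (int j = 0; j < n; j++) {
        int first = (uplo == 'U') ? 0 : j;         /* first row touched in column j */
        int rows  = (uplo == 'U') ? j + 1 : n - j; /* number of rows touched */
        gemv_n_sketch(rows, k, alpha, a + first, lda,
                      b + (size_t)j * ldb, beta,
                      c + first + (size_t)j * ldc);
    }
}
```

Entries outside the selected triangle are never written, which is the property that distinguishes GEMMT from a full GEMM.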
3 years ago
Martin Kroeker
e6204d254f
Update CMakeLists.txt
3 years ago
Martin Kroeker
1b77764182
Conditionally leave out bits of LAPACK to be overridden by ReLAPACK
3 years ago
Martin Kroeker
c970717157
fix missing t in xgemmt rule
Co-authored-by: Alexis <35051714+amontoison@users.noreply.github.com>
3 years ago
Martin Kroeker
e7fd8d21a6
Add GEMMT based on looped GEMV
3 years ago
Martin Kroeker
a3e02742f2
Add USE_PERL fallback option for create script used with FUNCTION_PROFILE
3 years ago
Martin Kroeker
f1c570a5f1
Add back original PERL-based script under new name
3 years ago
Owen Rafferty
42c7a27e6b
rewrite perl scripts in universal shell
3 years ago
Martin Kroeker
7656aba00e
Merge pull request #3493 from martin-frbg/casts+cleanup
WIP casts and cleanups
4 years ago
Martin Kroeker
d2b5fbf80f
Exclude some complex (LAPACK) functions when NO_LAPACK is set
4 years ago
Martin Kroeker
64365c919e
fix function typecasts
4 years ago
gxw
25f99fa9f8
Add cblas_{c/z}srot and cblas_{c/z}rotg support
4 years ago
Martin Kroeker
4b3769823a
Revert #3252
4 years ago
Martin Kroeker
2845f54eb8
Remove dangerous optimization from previous #3252 - buffer is never unused here
4 years ago
Martin Kroeker
c35739db5e
Add separate entries for BFLOAT16 functions and fix missing cblas_xerbla
4 years ago
Martin Kroeker
1085775bc6
really remove the unused variable
4 years ago
Martin Kroeker
20581bf303
Remove unused variable
4 years ago
Wangyang Guo
4289cf048d
sbgemm: avoid falling into SGEMM_KERNEL_DIRECT
4 years ago