tingbo.liao
3c8df6358f
Further rearranged the rotm kernel for the different architectures.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
1 year ago
Martin Kroeker
b613754143
Update scal.c
1 year ago
Martin Kroeker
73751218a4
make NAN handling depend on dummy2 parameter
1 year ago
Martin Kroeker
a5c04e326a
Update scal.c
1 year ago
Martin Kroeker
536200bc9e
fix handling of INF or NAN
1 year ago
Martin Kroeker
9ff4e9714e
additional fixes for handling INF arguments
1 year ago
Martin Kroeker
23796f8d31
fix loop condition for incx < 0
1 year ago
Martin Kroeker
bf93459746
fix loop condition for incx < 0
1 year ago
Martin Kroeker
0d2e486edf
Handle NAN and INF
2 years ago
Martin Kroeker
a2d867f4d1
Allow negative INCX (API change from version 3.10 of the reference implementation)
2 years ago
Martin Kroeker
0a4546b742
Typo fix
5 years ago
Martin Kroeker
b1eed27a54
Replace naive omatcopy_rt with 4x4 blocked implementation
as suggested by MigMuc in issue 2532
5 years ago
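As an illustration of what a 4x4 blocked transposed copy can look like, here is a minimal sketch. It assumes row-major storage and the operation B = alpha * A^T; the function name, signature, and tail handling are hypothetical and are not taken from the committed kernel:

```c
#define BLK 4  /* block edge; 4 matches the commit message */

/* Hypothetical sketch: b (cols x rows, leading dimension ldb) receives
 * alpha times the transpose of a (rows x cols, leading dimension lda).
 * Working on 4x4 tiles keeps both the reads and the writes local. */
void omatcopy_rt_sketch(long rows, long cols, double alpha,
                        const double *a, long lda, double *b, long ldb)
{
    for (long i = 0; i < rows; i += BLK) {
        long i_end = (i + BLK < rows) ? i + BLK : rows;
        for (long j = 0; j < cols; j += BLK) {
            long j_end = (j + BLK < cols) ? j + BLK : cols;
            /* copy one tile, clipped at the matrix edge */
            for (long ii = i; ii < i_end; ii++)
                for (long jj = j; jj < j_end; jj++)
                    b[jj * ldb + ii] = alpha * a[ii * lda + jj];
        }
    }
}
```

A naive version would run the two inner loops over the whole matrix, so one of the two arrays is always accessed with a large stride; tiling limits that stride to a small working set.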
Martin Kroeker
43aac5bacc
Support NVIDIA HPC compiler
5 years ago
Martin Kroeker
f8346603cf
Fix compilation with SolarisStudio
5 years ago
Qiyu8
c4c591ac5a
fix sum optimize issues
5 years ago
Martin Kroeker
28d2dfe2b3
Fix macro name used in ifdef
5 years ago
Qiyu8
bfdf4b56da
Add double precision universal intrinsics for X86/ARM
5 years ago
Qiyu8
0ed1f07660
Optimize the performance of sum by using universal intrinsics
5 years ago
Martin Kroeker
bf1f0734ff
Use OPENBLAS_MAKE_COMPLEX_FLOAT on PPC only
5 years ago
Martin Kroeker
7c6e56b5df
Rewrite assignment to complex for better portability
5 years ago
Martin Kroeker
806f89166e
Make ARMV7 compile with xcode and add a CI job for it (#2537)
* Add an ARMV7 iOS build on Travis
* thread_local appears to be unavailable on ARMV7 iOS
* Add no-thumb option for ARMV7 IOS build to get it to accept DMB ISH
* Make local labels in macros of nrm2_vfpv3.S compatible with the xcode assembler
5 years ago
Martin Kroeker
74c10b57c6
Use generic kernels for complex (I)AMAX to support softfp
6 years ago
Martin Kroeker
c5495d2056
Ensure correct output for DAMAX with softfp
6 years ago
Martin Kroeker
c70496b108
Separate implementations of AMAX and IAMAX on arm
As noted in #1912 and a comment on #1942, the combined implementation happens to "do the right thing" on hardfp, but cannot return both value and index on softfp, where they would have to share the return register.
6 years ago
Martin Kroeker
94ab4e6fb2
Add ARM implementations of ?sum
(trivial copies of the respective ?asum with the fabs calls removed)
6 years ago
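The relationship described in this commit message (?sum as ?asum without the fabs calls) can be sketched in plain C. The function names and signatures below are hypothetical and only mirror the idea, not the committed files:

```c
#include <math.h>

/* Hypothetical sketch of an asum-style kernel: sum of absolute values. */
double asum_sketch(long n, const double *x, long inc_x)
{
    double acc = 0.0;
    if (n <= 0 || inc_x <= 0)
        return acc;
    for (long i = 0; i < n; i++)
        acc += fabs(x[i * inc_x]);
    return acc;
}

/* The same loop with the fabs call removed gives the plain sum. */
double sum_sketch(long n, const double *x, long inc_x)
{
    double acc = 0.0;
    if (n <= 0 || inc_x <= 0)
        return acc;
    for (long i = 0; i < n; i++)
        acc += x[i * inc_x];
    return acc;
}
```

For the input {1, -2, 3, -4} with unit stride, the first function accumulates 1 + 2 + 3 + 4 and the second accumulates 1 - 2 + 3 - 4.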
Martin Kroeker
808410c2c7
Fix wrong comparison that made IMIN identical to IMAX
as suggested in #1990
7 years ago
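A bug of the kind fixed here is easy to picture: the two index searches differ only in the direction of one comparison. The sketch below uses hypothetical names and 1-based result indices (the Fortran BLAS convention); it does not reproduce the actual kernel source:

```c
/* Hypothetical sketch: 1-based index of the largest element,
 * or 0 when n or inc_x is not positive. */
long imax_sketch(long n, const double *x, long inc_x)
{
    if (n <= 0 || inc_x <= 0)
        return 0;
    long best = 0;
    for (long i = 1; i < n; i++)
        if (x[i * inc_x] > x[best * inc_x])  /* '>' selects the maximum */
            best = i;
    return best + 1;
}

/* Same loop with the comparison reversed. Writing '>' here by mistake
 * would make this function return the same result as imax_sketch. */
long imin_sketch(long n, const double *x, long inc_x)
{
    if (n <= 0 || inc_x <= 0)
        return 0;
    long best = 0;
    for (long i = 1; i < n; i++)
        if (x[i * inc_x] < x[best * inc_x])  /* '<' selects the minimum */
            best = i;
    return best + 1;
}
```

Both functions return the first occurrence when the extreme value appears more than once, because the comparisons are strict.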
Martin Kroeker
9b2a7ad40d
Convert fldmia/fstmia instructions to UAL syntax for clang7
second part of fix for #1774, containing files missed in #1775
7 years ago
Martin Kroeker
7e5df34e6a
Convert fldmia/fstmia instructions to UAL syntax for clang7
fixes #1774
7 years ago
Martin Kroeker
b83e4c60c7
Remove premature exit for INC_X or INC_Y zero
7 years ago
Martin Kroeker
e344db269b
Remove premature exit for INC_X or INC_Y zero
7 years ago
Martin Kroeker
545b82efd3
Remove premature exit for INC_X or INC_Y zero
7 years ago
Martin Kroeker
e322a951fe
Remove premature exit for INC_X or INC_Y zero
7 years ago
Martin Kroeker
2d0929fa7c
Move the test for zero incx,incy in ARMV7 ROT
to pass the related utest (see #1469)
7 years ago
Martin Kroeker
125343cc88
Drop test for zero incx,incy in armv7 AXPY
...to pass the related utest (see #1469)
7 years ago
Martin Kroeker
6e70287776
Use generic/dot.c for DSDOT on ARMV5 and above
The default arm/dot.c is less precise when used for DSDOT, as shown by utest
8 years ago
Zhang Xianyi
d5ef0dee9a
Merge pull request #1226 from ashwinyes/develop_arm_clang_ual_fix
arm: Fix clang compilation for ARMv7
8 years ago
Ashwin Sekhar T K
f02d535fde
arm: Fix clang compilation for ARMv7
clang is not recognizing some pre-UAL VFP mnemonics like fnmacs, fnmacd,
fnmuls and fnmuld. Replaced them with equivalent UAL mnemonics which are
vmls.f32, vmls.f64, vnmul.f32 and vnmul.f64 respectively.
8 years ago
Ashwin Sekhar T K
97d671eb61
arm: add softfp support in zgemm/ztrmm vfp kernels
8 years ago
Ashwin Sekhar T K
305cd2e8b4
arm: add softfp support in cgemm/ctrmm vfp kernels
8 years ago
Ashwin Sekhar T K
09bc6ebe5b
arm: add softfp support in dgemm/dtrmm vfp kernels
8 years ago
Ashwin Sekhar T K
872a11a2bf
arm: add softfp support in sgemm/strmm vfp kernels
8 years ago
Ashwin Sekhar T K
8f83d3f961
arm: add softfp support in vfp gemv kernels
8 years ago
Ashwin Sekhar T K
83bd547517
arm: add softfp support in kernel/arm/swap_vfp.S
8 years ago
Ashwin Sekhar T K
e25f4c01d6
arm: add softfp support in kernel/arm/nrm2_vfp*.S
8 years ago
Ashwin Sekhar T K
54915ce343
arm: add softfp support in kernel/arm/*dot_vfp.S
8 years ago
Ashwin Sekhar T K
0150fabdb6
arm: add softfp support in kernel/arm/rot_vfp.S
8 years ago
Ashwin Sekhar T K
4f0773f07d
arm: add softfp support in kernel/arm/axpy_vfp.S
8 years ago
Ashwin Sekhar T K
aa5edebc80
arm: add softfp support in kernel/arm/asum_vfp.S
8 years ago
Ashwin Sekhar T K
89924b3d5b
arm: Use assembly implementations based on the ARM abi
In case of the softfp ABI, only the assembly implementations of those APIs
that do not have a floating-point argument or return value are used.
In case of the hard ABI, all assembly implementations are used.
8 years ago
Zhang Xianyi
b5c96fcfcd
Support ARM SOFTFP ABI for saxpy, sdot, snrm2, sscal, sgemv, sger.
8 years ago