kaustubh
f3419e634c
SGEMM, DGEMM, CGEMM, ZGEMM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
9 years ago
Zhang Xianyi
7472c79ea6
Merge pull request #984 from ksraste/develop
STRSM, DTRSM functions data prefetch
9 years ago
kaustubh
90e2321ac3
STRSM, DTRSM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
9 years ago
Martin Kroeker
4998e19869
Change file comments to work around clang 3.9 assembler bug
9 years ago
Martin Kroeker
91610f3835
Update zdot_msa.c
9 years ago
Martin Kroeker
6e22ecf102
Update zdot.c
9 years ago
Martin Kroeker
6221d6df5f
Update zdot.c
9 years ago
Martin Kroeker
16446d1d23
Remove explicit include of complex.h
9 years ago
Martin Kroeker
a6e9e0b94b
Remove explicit include of complex.h
9 years ago
Martin Kroeker
3178e4fea0
Remove explicit include of complex.h
9 years ago
Martin Kroeker
95c245ddb0
Remove explicit include of complex.h
9 years ago
Martin Kroeker
4b1b27347f
Remove explicit include of complex.h
9 years ago
Shivraj Patil
54747fe24a
DGEMM function split and data prefech
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
515bc56ea9
Refs #946 . Use nrm2 reference implementation for Power8.
9 years ago
Zhang Xianyi
ae70b916f4
Refs #929 . Deal with zero and NaNs for scale.
9 years ago
Shivraj Patil
9687437928
MIPS n32 ABI and build time mips simd support check
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Shivraj Patil
d1c6469283
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Ashwin Sekhar T K
c54a29bb48
Cortex A57: Improvements to DGEMM 8x4 kernel
9 years ago
Shivraj Patil
beb1d076a4
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
8a592ee386
Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
Improvements to Aarch64 kernels
9 years ago
Ashwin Sekhar T K
0a5ff9f9f9
Improvements to TRMM and GEMM kernels
9 years ago
Ashwin Sekhar T K
8a40f1355e
Improvements to GEMV kernels
9 years ago
Ashwin Sekhar T K
78782485b6
Improvements to COPY and IAMAX kernels
9 years ago
Shivraj Patil
57df7956ee
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
10 years ago
Zhang Xianyi
4a30a2584a
Merge pull request #897 from ksraste/develop
STRSM optimized for MSA
10 years ago
Werner Saar
f04af36ad0
Merge pull request #898 from wernsaar/develop
added experimental support for optimized lapack fortran functions
10 years ago
Kaustubh Raste
011431b9d7
STRSM optimized for MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
10 years ago
Kaustubh Raste
c8a7860eb3
STRSM optimized
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
10 years ago
Zhang Xianyi
2daad2bcb5
Merge pull request #893 from biddisco/develop
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PRO…
10 years ago
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
10 years ago
Aleksey Kuleshov
fca66262c4
mips64/axpy: fix error when INCY == 0
10 years ago
Werner Saar
412bcd187a
optimized dtrsm_logic_LT_16x4_power8.S and dtrsm_macros_LT_16x4_power8.S
10 years ago
Werner Saar
bd06b246cc
Merge pull request #890 from wernsaar/develop
optimized dtrsm_kernel_LT for POWER8
10 years ago
Werner Saar
8b140220c8
optimized dtrsm_kernel_LT for POWER8
10 years ago
Werner Saar
8fb5a1aaff
added optimized dtrsm_LT kernel for POWER8
10 years ago
Kaustubh Raste
ad9f317870
STRSM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
10 years ago
Shivraj Patil
c4ba40e308
SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
10 years ago
Zhang Xianyi
7a19065369
Merge pull request #878 from ksraste/develop
DTRSM bug fix for MIPS P5600 and I6400
10 years ago
Werner Saar
6a2bde7a2d
optimized dgemm and dgetrf for POWER8
10 years ago
Kaustubh Raste
d7cbc7ac13
DTRSM bug fix for MIPS P5600 and I6400
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
10 years ago
Werner Saar
88011f625d
Merge pull request #876 from wernsaar/develop
optimized dgemm on power8 for 20 threads
10 years ago
Werner Saar
8310d4d3f7
optimized dgemm for 20 threads
10 years ago
Kaustubh Raste
edb5980c13
DTRSM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
10 years ago
Shivraj Patil
085cf236c2
conflict resolved by syncing with 'xianyi:develop'
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
10 years ago
Shivraj Patil
b7b3d8ec8e
DGEMM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
10 years ago
Zhang Xianyi
cd7af5260a
Merge pull request #847 from sva-img/develop
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
10 years ago
Werner Saar
56948dbf0f
optimized dgemm for POWER8
10 years ago
Werner Saar
0d0c6f7d7d
optimized dgemm for POWER8
10 years ago
Werner Saar
298b13bba4
updated some kernel files for EXCAVATOR
10 years ago
Werner Saar
78b05f6476
bugfix for EXCAVATOR and DYNAMIC_ARCH
10 years ago