Zhang Xianyi
1f217a6175
Merge pull request #943 from ibmsoe/IBMMASS_Support
Added support of IBM's MASS library that optimizes performance on Pow…
9 years ago
nishidha@us.ibm.com
78348a2853
Added support of IBM's MASS library that optimizes performance on Power architectures
9 years ago
Zhang Xianyi
b544be914d
Merge pull request #933 from ashwinyes/develop_aarch64_20160726_Dgemm_8x4_Opts
Cortex A57: Improvements to DGEMM 8x4 kernel
9 years ago
Ashwin Sekhar T K
c54a29bb48
Cortex A57: Improvements to DGEMM 8x4 kernel
9 years ago
Zhang Xianyi
ff4c5deafa
Merge pull request #930 from sva-img/develop
P6600/I6400 Build fix.
9 years ago
Shivraj Patil
22b9c2747d
P6600/I6400 build fix. Reverted the changes that were made to support the MIPS n32 ABI
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
27b5211ccd
Merge pull request #927 from sva-img/develop
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
9 years ago
Shivraj Patil
beb1d076a4
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
9e44f3ddd0
Refs #917 Avoid detecting gfortran bug on IBM POWER + Ubuntu
9 years ago
Zhang Xianyi
eece9fd889
Merge pull request #926 from vriera/develop
Complete support for MIPS n32 ABI
9 years ago
Zhang Xianyi
5dfa0712c3
Merge pull request #925 from martin-frbg/develop
Update zgetrf2.f, cpuid_x86.c, dynamic.c
9 years ago
Zhang Xianyi
8a592ee386
Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
Improvements to Aarch64 kernels
9 years ago
Zhang Xianyi
7f2409a8e1
Merge pull request #918 from sva-img/develop
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM.
9 years ago
Vicente Olivert Riera
7f28cd1f88
Complete support for MIPS n32 ABI
Signed-off-by: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
9 years ago
Martin Kroeker
154729908e
Update cpuid_x86.c
9 years ago
Martin Kroeker
97bd1e42c8
Update cpuid_x86.c
9 years ago
Martin Kroeker
7de829f713
Update dynamic.c
Add Braswell (extended model 4, model 12) N3150 as Nehalem
9 years ago
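The "extended model 4, model 12" wording in the Braswell entry above follows the x86 CPUID convention, where leaf 1 returns family, model, and extended-model fields packed into EAX. The sketch below shows how those fields combine into a single display model. It is an illustration only, not the code from dynamic.c. The function name `decode_cpuid_signature` is hypothetical, and the sample value `0x406C3` was constructed to carry family 6, extended model 4, model 12 rather than taken from the commit.

```python
def decode_cpuid_signature(eax):
    """Split a CPUID leaf-1 EAX value into its family/model fields."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    display_family = (family + ext_family) if family == 0xF else family
    # The extended model supplies the high nibble for families 6 and 0xF.
    display_model = ((ext_model << 4) | model) if family in (0x6, 0xF) else model

    return {
        "stepping": stepping,
        "family": family,
        "model": model,
        "ext_model": ext_model,
        "display_family": display_family,
        "display_model": display_model,
    }


# Sample signature (assumption): family 6, extended model 4, model 12.
sig = decode_cpuid_signature(0x406C3)
print(sig["family"], sig["ext_model"], sig["model"], hex(sig["display_model"]))
```

A dispatcher that switches on the extended model first and the base model second, as the commit title suggests, would see this CPU as the pair (4, 12), which corresponds to display model 0x4C.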
Martin Kroeker
9b69d8a8e5
Update zgetrf2.f
Trivial typo correction (ZERBLA => XERBLA) to fix #910
9 years ago
Ashwin Sekhar T K
0a5ff9f9f9
Improvements to TRMM and GEMM kernels
9 years ago
Ashwin Sekhar T K
8a40f1355e
Improvements to GEMV kernels
9 years ago
Ashwin Sekhar T K
78782485b6
Improvements to COPY and IAMAX kernels
9 years ago
Ashwin Sekhar T K
8d86d14d3f
Add time prints in benchmark output
9 years ago
Ashwin Sekhar T K
925d4e1dc6
Add IAMAX and NRM2 benchmarks
9 years ago
Shivraj Patil
57df7956ee
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
10 years ago
Zhang Xianyi
437c7d64f2
Merge pull request #913 from dpfoose/develop
Small change to allow compiling with USE_OPENMP on MSVC
10 years ago
Zhang Xianyi
ca5c25c870
Merge pull request #907 from jeromerobert/bug786
Fix z/ctrmv stack allocation on AMD bulldozer and barcelona target
10 years ago
Zhang Xianyi
4a30a2584a
Merge pull request #897 from ksraste/develop
STRSM optimized for MSA
10 years ago
Daniel Patrick Foose
a94f2b7848
Change to allow compiling with USE_OPENMP on MSVC
MSVC treats the declaration of omp_in_parallel and omp_get_num_procs without the modifiers __declspec(dllimport) and __cdecl as a redefinition.
10 years ago
Jerome Robert
d346c533b1
Fix z/ctrmv stack allocation on AMD bulldozer and barcelona target
* Hopefully, because this was found by trial and error (dark magic)
* Ref #786
10 years ago
Werner Saar
f04af36ad0
Merge pull request #898 from wernsaar/develop
added experimental support for optimized lapack fortran functions
10 years ago
Werner Saar
41000c8443
added directory for optimized lapack fortran codes and added dlaqr5.f
10 years ago
Kaustubh Raste
011431b9d7
STRSM optimized for MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
10 years ago
Kaustubh Raste
c8a7860eb3
STRSM optimized
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
10 years ago
Zhang Xianyi
2daad2bcb5
Merge pull request #893 from biddisco/develop
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PRO…
10 years ago
Zhang Xianyi
bac478d17e
Merge pull request #891 from rndfax/develop
mips64/axpy: fix error when INCY == 0
10 years ago
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
10 years ago
Aleksey Kuleshov
fca66262c4
mips64/axpy: fix error when INCY == 0
10 years ago
Werner Saar
412bcd187a
optimized dtrsm_logic_LT_16x4_power8.S and dtrsm_macros_LT_16x4_power8.S
10 years ago
Werner Saar
bd06b246cc
Merge pull request #890 from wernsaar/develop
optimized dtrsm_kernel_LT for POWER8
10 years ago
Werner Saar
8b140220c8
optimized dtrsm_kernel_LT for POWER8
10 years ago
Werner Saar
318cad9c37
added trsm benchmarks for POWER8 to benchmark/Makefile
10 years ago
Werner Saar
8fb5a1aaff
added optimized dtrsm_LT kernel for POWER8
10 years ago
Zhang Xianyi
7d0358475d
Merge the patch for musl libc.
10 years ago
Zhang Xianyi
b46f680f01
Merge pull request #887 from ksraste/develop
STRSM optimization for MIPS P5600 and I6400 using MSA
10 years ago
Kaustubh Raste
ad9f317870
STRSM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
10 years ago
Zhang Xianyi
a8fcd89d6d
Merge pull request #886 from vriera/develop
Makefile.system: P5600 and I6400 cores need -mmsa
10 years ago
Zhang Xianyi
232335fd49
Merge pull request #885 from sva-img/develop
SGEMM optimization for MIPS P5600 and I6400 using MSA.
10 years ago
Vicente Olivert Riera
e12cff87b8
Makefile.system: P5600 and I6400 cores need -mmsa
Signed-off-by: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
10 years ago
Shivraj Patil
c4ba40e308
SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
10 years ago
Zhang Xianyi
7a19065369
Merge pull request #878 from ksraste/develop
DTRSM bug fix for MIPS P5600 and I6400
10 years ago