Shivraj Patil
57df7956ee
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
437c7d64f2
Merge pull request #913 from dpfoose/develop
Small change to allow compiling with USE_OPENMP on MSVC
9 years ago
Zhang Xianyi
ca5c25c870
Merge pull request #907 from jeromerobert/bug786
Fix z/ctrmv stack allocation on AMD bulldozer and barcelona target
9 years ago
Zhang Xianyi
4a30a2584a
Merge pull request #897 from ksraste/develop
STRSM optimized for MSA
9 years ago
Daniel Patrick Foose
a94f2b7848
Change to allow compiling with USE_OPENMP on MSVC
MSVC treats the declaration of omp_in_parallel and omp_get_num_procs without the modifiers __declspec(dllimport) and __cdecl as a redefinition.
9 years ago
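The redefinition issue described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual OpenBLAS change: the macro `OMP_IMPORT` and the helper `available_procs` are hypothetical names. The idea is that hand-written prototypes for the OpenMP runtime functions carry `__declspec(dllimport)` and `__cdecl` when compiled with MSVC, and stay plain elsewhere.

```c
/* OMP_IMPORT is a hypothetical macro name used only for this sketch. */
#if defined(_MSC_VER)
#define OMP_IMPORT(type) __declspec(dllimport) type __cdecl
#else
#define OMP_IMPORT(type) type
#endif

#if defined(_OPENMP)
/* Hand-written prototypes; on MSVC they carry the modifiers named in the
 * commit message, so the compiler does not report a redefinition. */
OMP_IMPORT(int) omp_in_parallel(void);
OMP_IMPORT(int) omp_get_num_procs(void);
#endif

/* Hypothetical helper: processor count reported by OpenMP, or 1 when the
 * translation unit is compiled without OpenMP support. */
int available_procs(void) {
#if defined(_OPENMP)
    return omp_get_num_procs();
#else
    return 1;
#endif
}
```

Calling `available_procs()` returns at least 1 whether or not OpenMP is enabled at compile time.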
Jerome Robert
d346c533b1
Fix z/ctrmv stack allocation on AMD bulldozer and barcelona target
* Hopefully, because this was found by trial and error (dark magic)
* Ref #786
9 years ago
Werner Saar
f04af36ad0
Merge pull request #898 from wernsaar/develop
added experimental support for optimized lapack fortran functions
9 years ago
Werner Saar
41000c8443
added directory for optimized lapack fortran codes and added dlaqr5.f
9 years ago
Kaustubh Raste
011431b9d7
STRSM optimized for MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Kaustubh Raste
c8a7860eb3
STRSM optimized
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Zhang Xianyi
2daad2bcb5
Merge pull request #893 from biddisco/develop
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PRO…
9 years ago
Zhang Xianyi
bac478d17e
Merge pull request #891 from rndfax/develop
mips64/axpy: fix error when INCY == 0
9 years ago
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
9 years ago
Aleksey Kuleshov
fca66262c4
mips64/axpy: fix error when INCY == 0
9 years ago
Werner Saar
412bcd187a
optimized dtrsm_logic_LT_16x4_power8.S and dtrsm_macros_LT_16x4_power8.S
9 years ago
Werner Saar
bd06b246cc
Merge pull request #890 from wernsaar/develop
optimized dtrsm_kernel_LT for POWER8
9 years ago
Werner Saar
8b140220c8
optimized dtrsm_kernel_LT for POWER8
9 years ago
Werner Saar
318cad9c37
added trsm benchmarks for POWER8 to benchmark/Makefile
9 years ago
Werner Saar
8fb5a1aaff
added optimized dtrsm_LT kernel for POWER8
9 years ago
Zhang Xianyi
7d0358475d
Merge the patch for musl libc.
9 years ago
Zhang Xianyi
b46f680f01
Merge pull request #887 from ksraste/develop
STRSM optimization for MIPS P5600 and I6400 using MSA
9 years ago
Kaustubh Raste
ad9f317870
STRSM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Zhang Xianyi
a8fcd89d6d
Merge pull request #886 from vriera/develop
Makefile.system: P5600 and I6400 cores need -mmsa
9 years ago
Zhang Xianyi
232335fd49
Merge pull request #885 from sva-img/develop
SGEMM optimization for MIPS P5600 and I6400 using MSA.
9 years ago
Vicente Olivert Riera
e12cff87b8
Makefile.system: P5600 and I6400 cores need -mmsa
Signed-off-by: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
9 years ago
Shivraj Patil
c4ba40e308
SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
7a19065369
Merge pull request #878 from ksraste/develop
DTRSM bug fix for MIPS P5600 and I6400
9 years ago
Werner Saar
8a149e6294
Merge pull request #879 from wernsaar/develop
optimized dgemm and dgetrf for POWER8
9 years ago
Werner Saar
956be69e1d
optimized getrf_single.c for POWER8
9 years ago
Werner Saar
6a2bde7a2d
optimized dgemm and dgetrf for POWER8
9 years ago
Kaustubh Raste
d7cbc7ac13
DTRSM bug fix for MIPS P5600 and I6400
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Zhang Xianyi
8bf71e9e06
Merge pull request #877 from jeromerobert/bug873
Disable multi-threading in swap
9 years ago
Jerome Robert
40af513669
Disable multi-threading in swap
* Close #873
9 years ago
Werner Saar
88011f625d
Merge pull request #876 from wernsaar/develop
optimized dgemm on power8 for 20 threads
9 years ago
Werner Saar
8310d4d3f7
optimized dgemm for 20 threads
9 years ago
Zhang Xianyi
5faffc123f
Merge pull request #869 from ksraste/develop
DTRSM optimization for MIPS P5600 and I6400 using MSA
9 years ago
Zhang Xianyi
81794ccb9a
Merge pull request #868 from sva-img/develop
build fix for MIPS 32 bit
9 years ago
Kaustubh Raste
edb5980c13
DTRSM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years ago
Shivraj Patil
573d9218f2
build fix for MIPS 32 bit
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
7e549d5f37
Merge pull request #866 from sva-img/develop
DGEMM optimization for MIPS P5600 and I6400 using MSA
9 years ago
Shivraj Patil
085cf236c2
conflict resolved by syncing with 'xianyi:develop'
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
0d1c695508
Merge pull request #867 from IvanUkhov/space
Wrap CURDIR and DESTDIR in quotes
9 years ago
Ivan Ukhov
efaf30d536
Wrap CURDIR and DESTDIR in quotes
9 years ago
Shivraj Patil
b7b3d8ec8e
DGEMM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
9 years ago
Zhang Xianyi
2df60f7315
Merge pull request #863 from ashwinyes/develop_20160429_update_numa_binding
Update NUMA CPU binding
9 years ago
Zhang Xianyi
cd7af5260a
Merge pull request #847 from sva-img/develop
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
9 years ago
Werner Saar
3a2e8c3537
Merge pull request #864 from wernsaar/develop
optimized dgemm for POWER8
9 years ago
Werner Saar
56948dbf0f
optimized dgemm for POWER8
9 years ago
Ashwin Sekhar T K
0fb380c966
Update NUMA CPU binding
When all the processes can be
accommodated within the current node,
use cores from the current node only.
9 years ago
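The binding rule stated in the commit above can be illustrated with a small sketch. It is not the OpenBLAS implementation: the function `pick_cores` and its parameters are hypothetical, and the fallback used when the processes do not fit on the current node (round-robin over all cores) is an assumption made only to complete the example.

```c
#include <stddef.h>

/* Hypothetical helper: choose a core id for each of nprocs processes.
 * node_cores lists the cores of the current NUMA node, all_cores lists
 * every core in the machine. Results are written to out[0..nprocs-1]. */
void pick_cores(const int *node_cores, size_t n_node,
                const int *all_cores, size_t n_all,
                size_t nprocs, int *out) {
    size_t i;
    if (nprocs <= n_node) {
        /* Everything fits: stay on the current node only. */
        for (i = 0; i < nprocs; i++)
            out[i] = node_cores[i];
    } else {
        /* Assumed fallback, not from the commit: spread over all cores. */
        for (i = 0; i < nprocs; i++)
            out[i] = all_cores[i % n_all];
    }
}
```

With a current node holding cores 4 to 7 on an 8-core machine, three processes would be bound to cores 4, 5, and 6; six processes would fall through to the assumed fallback.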
Zhang Xianyi
c95f5008fe
Merge pull request #858 from buffer51/develop
Fixed cross-suffix detection for path that contains dashes
9 years ago