Zhang Xianyi
79120bf9a0
Refs #205 . Merge boegel's code for downloading LAPACK.
12 years ago
Zhang Xianyi
acb11905d5
Fixed #199 . Saved USE_THREAD switch for make install.
12 years ago
Zhang Xianyi
109500178c
Refs #220 . Support Power7 using the old Power6 kernels.
12 years ago
Zhang Xianyi
e50a664865
Refs #215 . Fixed the compatibility between <complex.h> and <complex> in C++.
12 years ago
Zhang Xianyi
357078b93e
Refs #216 . Revert the default value of GEMM_MULTITHREAD_THRESHOLD to 4.
12 years ago
Zhang Xianyi
5d96e4f224
Refs #210 . Disable checking /lib/libpthread.so*.
12 years ago
Xianyi Zhang
dbbda55e67
Updated the mailing list for OpenBLAS.
12 years ago
Xianyi Zhang
6c34a7f43c
Updated the mailing list for OpenBLAS.
12 years ago
Zhang Xianyi
3326f3152c
Merge pull request #213 from wernsaar/develop
Merged some improvements into dgemm_kernel_4x4_bulldozer.S.
12 years ago
wernsaar
7641f6e253
Merged some improvements into dgemm_kernel_4x4_bulldozer.S.
Changed the copy functions to generic to solve prefetch conflicts
12 years ago
Zhang Xianyi
48bdc1ad3b
Added NO_PARALLEL_MAKE flag to disable parallel make.
12 years ago
Zhang Xianyi
3ad29452d1
Merge pull request #211 from wernsaar/develop
New version of dgemm_kernel_4x4_bulldozer.S
12 years ago
wernsaar
6e3f6f25a5
New version of dgemm_kernel_4x4_bulldozer.S
The peak performance with 8 cores is now 90 GFlops
12 years ago
Zhang Xianyi
a068d54981
Refs #209 . Export the missing cblas_cdotc_sub functions.
12 years ago
Zhang Xianyi
e029242870
Merge pull request #206 from wlbksy/patch-1
Fix #204 wget in mingw/msys sometimes download file with trailing name,
13 years ago
wlbksy
7a9b94b519
Fix #204
13 years ago
Kenneth Hoste
66b919d99f
Adjusted Makefile to allow providing the required LAPACK source files rather than downloading them
13 years ago
Zhang Xianyi
f4846afbad
Merge pull request #201 from Explorer09/develop
13 years ago
Explorer09
53588bc786
getarch.c: Minor re-ordering of architecture list
13 years ago
Explorer09
b47f13ee4c
getarch.c: Minor re-ordering of architecture list
13 years ago
Explorer09
309f90e563
TargetList.txt: minor re-ordering
13 years ago
Explorer09
773c01f496
Typo correction in README.md
13 years ago
Zhang Xianyi
d831b2ff8b
Override CFLAGS in LAPACK make.in.
13 years ago
Zhang Xianyi
724ae159ce
Fixed the Windows x86_64 ABI bug in s/daxpy kernels.
13 years ago
Zhang Xianyi
2c9a203bd1
Merge pull request #198 from wernsaar/develop
new optimization of dgemm kernel for bulldozer: 10% performance increase
13 years ago
wernsaar
f300ce3df5
new optimization of dgemm kernel for bulldozer: 10% performance increase
13 years ago
Zhang Xianyi
e2c7c75715
Merge pull request #197 from wernsaar/develop
optimized again bulldozer dgemm kernel
13 years ago
wernsaar
66e64131ed
optimized again bulldozer dgemm kernel
13 years ago
Zhang Xianyi
5900b1462e
Merge pull request #195 from wernsaar/develop
Develop dgemm for bulldozer
13 years ago
wernsaar
9405f26f4b
new dgemm_kernel for bulldozer
13 years ago
Zhang Xianyi
54e7b37630
Merge branch 'develop'
13 years ago
Zhang Xianyi
529f1b5006
Refs #194 . Export the missing LAPACK s/dlamc3 functions.
13 years ago
Zhang Xianyi
e5ac3007e0
Merge branch 'develop'
13 years ago
Zhang Xianyi
0d0405b434
Updated the doc for 0.2.6 version.
13 years ago
Zhang Xianyi
f1ce74ffdd
Improved the message printed when the OS doesn't support AVX.
13 years ago
Zhang Xianyi
d744c9590a
In OpenMP threading, preallocate the thread buffer instead of allocating the buffer every time. This patch improved the performance slightly.
13 years ago
Zhang Xianyi
3cc6ae793e
Refs #174 . Return the sb pointer when using OpenMP or on Windows.
13 years ago
Zhang Xianyi
4c2123c334
Fixed the overflow bug in single-threaded Cholesky factorization.
13 years ago
Zhang Xianyi
5155e3f509
Refs #174 . Fixed the buffer overflow bug in multithreaded hbmv and sbmv.
Instead of using the thread 0 buffer, each thread uses its own sb buffer.
This avoids overflowing the thread 0 buffer.
13 years ago
Zhang Xianyi
5c8bf6ae0e
Merge branch 'bulldozer' into develop
13 years ago
Zaheer Chothia
a9500d0079
Missing line continuation -- follow-up to last commit ( 64ad8b9809).
13 years ago
Zaheer Chothia
64ad8b9809
Refs #193 . Don't use C99 complex numbers when building C++ code.
13 years ago
Zaheer Chothia
875d520ccf
Refs #193 . cblas: move #include out of extern "C" block.
Standard headers may contain C++ templates which are not permitted inside an
extern "C" block. This might be the case when we include <complex.h>.
13 years ago
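The include-ordering rule described in commit 875d520ccf can be sketched roughly as follows. This is a minimal illustration, not the actual contents of <cblas.h>: the function name `sketch_dot` is hypothetical and <stddef.h> stands in for whichever standard header is needed. The point is only that the #include sits before the extern "C" block is opened, so any C++ templates in the standard header are never placed inside it.

```c
/* Hypothetical header-style sketch; sketch_dot is NOT an OpenBLAS function. */

/* Standard headers go first, OUTSIDE the extern "C" block. */
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Only our own C declarations live inside the linkage block. */
double sketch_dot(size_t n, const double *x, const double *y);

#ifdef __cplusplus
}
#endif

/* Plain C definition: sum of element-wise products. */
double sketch_dot(size_t n, const double *x, const double *y)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        acc += x[i] * y[i];
    }
    return acc;
}
```

The same file compiles as C (the guards vanish) and as C++ (the declaration gets C linkage while the header is parsed with normal C++ linkage).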
Zhang Xianyi
d311236dfd
Refs #189 . Fixed the s/cdot bug involving an invalid read of NaN on x86_64.
13 years ago
Zhang Xianyi
36e0982966
Refs #187 . Use perl to generate cblas_noconst.h instead of sed.
Thanks to Dan Povey for the patch. https://github.com/xianyi/OpenBLAS/issues/187
13 years ago
Zhang Xianyi
8cdb795438
Refs #187 . Use binary code for xgetbv, which is compatible with old compiler.
13 years ago
Zaheer Chothia
4db6660de4
Refs #185 . Add missing 'const' to declarations in <cblas.h>. Thanks to Dan Povey!
The 'const' modifications were done automatically using these scripts:
https://kaldi.svn.sourceforge.net/svnroot/kaldi/sandbox/dan/tools/for_openblas
13 years ago
Zhang Xianyi
0b08f7479e
Refs #154 . Fixed the gemv_t bug that overflowed the 16MB buffer on x86.
13 years ago
Zaheer Chothia
200e4acf15
cblas: typedef enums for improved compatibility with Intel MKL.
Netlib style:
    enum CBLAS_XYZ {X=1, Y=2, Z=3};
Intel MKL style:
    typedef enum {X=1, Y=2, Z=3} CBLAS_XYZ;
With this hybrid style, code written in the latter form won't need any
modifications to be built with OpenBLAS. This change should not affect existing
code, although a warning may be emitted for C code which does the following
(does not occur with C++):
    typedef enum CBLAS_XYZ CBLAS_XYZ;
    warning: redefinition of typedef 'CBLAS_XYZ' [-pedantic]
13 years ago
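A minimal sketch of what the "hybrid style" from commit 200e4acf15 might look like, assuming it means keeping the enum tag and adding a typedef of the same name. It reuses the placeholder `CBLAS_XYZ` from the commit message rather than any real OpenBLAS enum, and the helper functions are hypothetical, added only to show that both spellings name the same type.

```c
#include <assert.h>

/* Hybrid declaration (assumed form): the tag keeps Netlib-style
 * "enum CBLAS_XYZ" working, the typedef makes MKL-style "CBLAS_XYZ" work. */
typedef enum CBLAS_XYZ {X = 1, Y = 2, Z = 3} CBLAS_XYZ;

/* Hypothetical helper written in Netlib style. */
static int as_int_netlib(enum CBLAS_XYZ v)
{
    return (int)v;
}

/* Hypothetical helper written in Intel MKL style. */
static int as_int_mkl(CBLAS_XYZ v)
{
    return (int)v;
}

/* Both spellings refer to the same type, so values pass between them freely. */
static int styles_agree(void)
{
    enum CBLAS_XYZ a = Y;
    CBLAS_XYZ b = a;
    assert(as_int_netlib(b) == as_int_mkl(a));
    return as_int_netlib(b) == as_int_mkl(a);
}
```

Code that then repeats `typedef enum CBLAS_XYZ CBLAS_XYZ;` itself is the case the commit message says may trigger the pedantic redefinition warning in C.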
Zhang Xianyi
99d1978df7
Fixed #180 . the typos in kernel/x86_64/sgemv_t.S
13 years ago