Zhang Xianyi
3cc6ae793e
Refs #174 . Return sb pointer when OpenMP or Windows.
13 years ago
Zhang Xianyi
4c2123c334
Fixed the overflow bug in single-threaded Cholesky factorization.
13 years ago
Zhang Xianyi
5155e3f509
Refs #174 . Fixed the buffer overflow bug in multithreaded hbmv and sbmv.
Instead of using the thread 0 buffer, each thread uses its own sb buffer,
which avoids overflowing the thread 0 buffer.
13 years ago
Zhang Xianyi
5c8bf6ae0e
Merge branch 'bulldozer' into develop
13 years ago
Zhang Xianyi
6ae2f868fd
Set the affinity. Only use 1 core of each module on bulldozer.
13 years ago
Zhang Xianyi
a1ead62f28
Disable the warning of sgemm bulldozer kernel.
13 years ago
Zhang Xianyi
0133580148
Used sgemm bulldozer kernel on 64 bit.
13 years ago
Zhang Xianyi
274246651d
Merge branch 'bulldozer' of git://github.com/wernsaar/OpenBLAS into bulldozer
13 years ago
Zhang Xianyi
299b5a44dc
Merge branch 'develop' of github.com:xianyi/OpenBLAS into bulldozer
13 years ago
Zaheer Chothia
a9500d0079
Missing line continuation -- follow-up to last commit ( 64ad8b9809).
13 years ago
Zaheer Chothia
64ad8b9809
Refs #193 . Don't use C99 complex numbers when building C++ code.
13 years ago
Zaheer Chothia
875d520ccf
Refs #193 . cblas: move #include out of extern "C" block.
Standard headers may contain C++ templates which are not permitted inside an
extern "C" block. This might be the case when we include <complex.h>.
13 years ago
Zhang Xianyi
d311236dfd
Refs #189 . Fixed the s/cdot bug that caused an invalid read of NaN on x86_64.
13 years ago
Zhang Xianyi
36e0982966
Refs #187 . Use perl to generate cblas_noconst.h instead of sed.
Thanks to Dan Povey for the patch. https://github.com/xianyi/OpenBLAS/issues/187
13 years ago
Zhang Xianyi
8cdb795438
Refs #187 . Use binary code for xgetbv, which is compatible with old compilers.
13 years ago
Zaheer Chothia
4db6660de4
Refs #185 . Add missing 'const' to declarations in <cblas.h>. Thanks to Dan Povey!
The 'const' modifications were done automatically using these scripts:
https://kaldi.svn.sourceforge.net/svnroot/kaldi/sandbox/dan/tools/for_openblas
13 years ago
Zhang Xianyi
0b08f7479e
Refs #154 . Fixed the gemv_t bug that overflowed the 16MB buffer on x86.
13 years ago
Zaheer Chothia
200e4acf15
cblas: typedef enums for improved compatibility with Intel MKL.
Netlib style:
enum CBLAS_XYZ {X=1, Y=2, Z=3};
Intel MKL style:
typedef enum {X=1, Y=2, Z=3} CBLAS_XYZ;
With this hybrid style, code written in the latter form won't need any
modifications to be built with OpenBLAS. This change should not affect existing
code, although a warning may be emitted for C code which does the following
(does not occur with C++):
typedef enum CBLAS_XYZ CBLAS_XYZ;
warning: redefinition of typedef 'CBLAS_XYZ' [-pedantic]
13 years ago
Zhang Xianyi
99d1978df7
Fixed #180 . Corrected the typos in kernel/x86_64/sgemv_t.S
13 years ago
Zhang Xianyi
08bf6674d5
Refs #177 . Fixed sgemv_t compiling bug on Win64.
13 years ago
Zhang Xianyi
8b122ff9dc
Refs #176 . Fixed make.inc overriding RANLIB bug when cross-compiling LAPACK.
13 years ago
Zhang Xianyi
69200884e1
Refs #173 . Fixed the internal buffer overflow bug of gemv_n on x86.
13 years ago
Zhang Xianyi
0d1518add9
Refs #173 . Fixed the internal buffer overflow bug of sgemv_t on x86.
13 years ago
Zhang Xianyi
91ed4e4450
Refs #171 . Prevent loading the dirty number from the buffer in sgemv_t x86 kernel.
13 years ago
Zhang Xianyi
fd3046b32a
Refs #173 . Fixed the internal buffer overflow bug of gemv_t on x86.
13 years ago
Zhang Xianyi
a4ee6f3915
Fixed #172 . Support Intel Xeon E7540.
13 years ago
Zhang Xianyi
a0363e9b48
Merge branch 'master' into develop
13 years ago
Zhang Xianyi
b471d52e61
Merge pull request #170 from juliantaylor/athlon-defaults
set parameters for CORE_ATHLON
13 years ago
Julian Taylor
9fb341a9f8
Set parameters for CORE_ATHLON.
Otherwise dgemm_p is set to zero, leading to a segfault in alloc_mmap
because allocsize is zero.
13 years ago
Zhang Xianyi
fba6b590f2
Merge branch 'master' into develop
13 years ago
Zhang Xianyi
97f68f7f3a
Merge pull request #169 from juliantaylor/sanity-check-cpu
add a sanity check on the detected cpu type
13 years ago
Julian Taylor
1138817dd2
Add a sanity check on the detected CPU type.
If we have 64-bit pointers we can't have a 32-bit CPU, so fall back to
the 64-bit CPU fallback (prescott).
E.g. the CPU detection fails in AMD qemu64 emulation (family 6 model 2),
causing it to use the uninitialized gotoblas_ATHLON.
13 years ago
Zhang Xianyi
13f8fc0b1a
Write FMA4 flag to the configure file.
13 years ago
Zhang Xianyi
bdf8d9411e
Refs #163 . Obtain the build configuration at runtime.
The openblas_get_config function returns the configuration string.
So far, it supports USE64BITINT, NO_CBLAS, NO_LAPACK, NO_LAPACKE,
DYNAMIC_ARCH, NO_AFFINITY.
Example:
#include <stdio.h>
extern char *openblas_get_config(void);
int main(void)
{
    printf("%s\n", openblas_get_config());
    return 0;
}
13 years ago
Zhang Xianyi
bb10cb8442
Refs #165 . Added a fallback for DTB_DEFAULT_ENTRIES on some virtual machines.
13 years ago
wernsaar
d48cff8cf1
Added optimized sgemm_kernel
13 years ago
Zhang Xianyi
f19af5ecc0
Refs #54 . Added AMD Bulldozer x86_64 dgemm kernel developed by Werner Saar <wernsaar at googlemail.com>
Based on the dgemm kernel for AMD Barcelona, he used AVX and FMA4 instructions.
Thanks, Werner Saar!
13 years ago
Zhang Xianyi
bfaaa975e6
Added BULLDOZER target. So far it uses barcelona kernels.
13 years ago
Zhang Xianyi
b7c0fa6bd2
Init AMD Bulldozer codebase.
13 years ago
Zhang Xianyi
7110d17146
Added -lgomp for generating DLL on Windows.
13 years ago
Zhang Xianyi
e01b3d4b54
Merge branch 'develop'
13 years ago
Zhang Xianyi
cea1a885b5
Refs #154 . Fixed the build bug of dgemv_t on MinGW64.
13 years ago
Zhang Xianyi
f78eb335d6
Merge branch 'develop'
13 years ago
Zhang Xianyi
2345bdec68
Update the doc for 0.2.5 version.
13 years ago
Zhang Xianyi
5f0117385e
Refs #154 . Fixed a segfault in dgemv_t when m is very large.
It overflowed the internal buffer, so vector x is now split into blocks when m is very large.
Thanks to @wangqian for this patch.
13 years ago
Zhang Xianyi
6caf1bab73
Fixed #160 . Merge branch 'master' of https://github.com/sebastien-villemot/OpenBLAS into develop
13 years ago
Sébastien Villemot
01e3c984ce
Fix compilation with TARGET=GENERIC
Patch applied to Debian package
13 years ago
Zhang Xianyi
6751f7b9a7
Fixed #157 . Only detect the number of physical CPU cores on Mac OSX.
13 years ago
Zhang Xianyi
d5717a97ea
Compile lapacke with the ILP64 model when INTERFACE64=1
13 years ago
Zhang Xianyi
b45d43d295
Added the patch for lapacke example.
13 years ago