wernsaar
a4dde45f87
optimized sgemv_n kernel for sandybridge
11 years ago
wernsaar
7fa7ea3e1e
updated haswell optimized sgemv_n kernel
11 years ago
wernsaar
3fbc13eb65
modified sgemv_n for haswell
11 years ago
wernsaar
db6917303f
added a better optimized sgemv_n kernel for bulldozer and piledriver
11 years ago
wernsaar
5087096711
optimization of sandybridge cgemm-kernel
11 years ago
wernsaar
46bc4fd50c
optimized cgemm kernel for haswell
11 years ago
wernsaar
1cc02b4337
optimized sgemm kernel for haswell
11 years ago
wernsaar
1d33547222
optimized zgemm kernel for haswell
11 years ago
wernsaar
6acbafe45b
added sgemv_n microkernel for haswell
11 years ago
wernsaar
5392d11b04
optimized sgemv_n_microk_sandy.c
11 years ago
wernsaar
c0fe95fb72
added sgemv_n microkernel for sandybridge
11 years ago
wernsaar
d9d4077c93
added sgemv_t microkernel for haswell
11 years ago
wernsaar
02eb72ac42
bugfix in sgemv_t_microk_sandy.c
11 years ago
wernsaar
c06f9986d4
added sgemv_t microkernel for sandybridge
11 years ago
wernsaar
2cce125c79
added optimized sgemv_t for bulldozer and piledriver
11 years ago
wernsaar
b3938fe371
don't use this sgemv_n on Windows
11 years ago
wernsaar
c8a4a56177
performance optimizations for sgemv_n
11 years ago
wernsaar
3c5732615d
added blocked sgemv_n and microkernel for bulldozer and piledriver
11 years ago
wernsaar
880597b301
segment violation in sgemv kernels
12 years ago
wernsaar
13348b2137
removed reference to daxpy_bulldozer kernel (Windows bug in lapack-test)
12 years ago
wernsaar
d5b976f92d
fallback to zgemm_kernel_4x2_sse.S
12 years ago
wernsaar
e0c080a28c
removed reference to zgemm_kernel_4x2_sse3.S (bug in lapack-test)
12 years ago
wernsaar
b079df9ef4
added optimized sdot- and dsdot-kernel, written in C
12 years ago
wernsaar
01a119abfc
enabled SMP for sbmv and zsbmv, but only for 64bit binaries
12 years ago
Zhang Xianyi
99efbbbad5
Fixed #395 . Enable optimized cgemm for Sandybridge. Added optimized sdot kernel.
Fixed c/zgemm, zgemv computational error of haswell, piledriver, bulldozer, and
barcelona on Windows.
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop
Conflicts:
kernel/Makefile.L1
kernel/x86_64/KERNEL
param.h
12 years ago
wernsaar
22e5aee2dd
fixed zgemv bug for older AMD Processors
12 years ago
wernsaar
35d37e124f
bugfix for barcelona zgemv-kernel
12 years ago
wernsaar
d8ba46efdb
bugfix for bulldozer cgemm-, zgemm- and zgemv-kernel
12 years ago
wernsaar
a15f22a1f6
bugfix for piledriver cgemm-, zgemm- and zgemv-kernel
12 years ago
wernsaar
b94ea89f52
bugfix for haswell cgemm- and zgemm-kernel
12 years ago
wernsaar
35f668bb14
bugfix for cgemm_kernel_8x2_sandy.S
12 years ago
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
12 years ago
wernsaar
365e8de346
added optimized cgemm-kernel for SANDYBRIDGE
12 years ago
wernsaar
578d1b6219
added DSDOT definition and enabled optimized sdot kernel
12 years ago
wernsaar
dabab2b5f4
added new optimized sgemm kernel for SANDYBRIDGE
12 years ago
wernsaar
aa2709c4e0
enabled optimized dgemm kernel for NEHALEM
12 years ago
wernsaar
a13bcc1716
enabled optimized sgemv kernel for barcelona and piledriver
12 years ago
wernsaar
d2c82d7543
enabled optimized sgemv kernel for HASWELL
12 years ago
wernsaar
0517672dd0
enabled optimized sgemv kernels for nehalem, sandybridge and bulldozer
12 years ago
wernsaar
23203d52c1
Ref #380 : lowered stack usage for haswell kernels
12 years ago
wernsaar
73545a79cd
Ref #380 : lowered stack usage for piledriver and bulldozer kernels
12 years ago
wernsaar
5f3b68b4d4
replaced sgemm and cgemm kernels because of lapack bugs
12 years ago
wernsaar
2424af62fd
replaced dgemm-kernel because of a bug in lapack
12 years ago
wernsaar
793509a3b5
replaced files for sdot, sgemv_n and sgemv_t for bug #348
12 years ago
wernsaar
47b22763f8
reduced stack usage on windows to 16K
12 years ago
Zhang Xianyi
9a557e90da
Refs #340 . Fixed SEGFAULT bug of dgemv_n on OSX.
12 years ago
wangqian
2d557eb1e0
Fixed computational error of dgemv_n.
12 years ago
Zhang Xianyi
05bb391c3a
Refs #330 . Fixed the compatibility issue with clang on Mac OSX.
12 years ago
Zhang Xianyi
9b5be29886
Refs #310 . Fixed Segfault bug on nehalem when Julia calls dgeqrt3 on OSX.
Please also check JuliaLang/julia#4099
Julia test script:
A=rand(256, 256)
qrfact(A)
I found this was a bug in kernel/x86_64/dgemm_ncopy_8.S.
However, I cannot use gdb with julia. Thus, this is a workaround fix.
12 years ago
wernsaar
034a5b2083
modified zsymv
12 years ago