Werner Saar
|
298b13bba4
|
updated some kernel files for EXCAVATOR
|
9 years ago |
Zhang Xianyi
|
f24d5307cf
|
Refs #834. Fix zgemv config bug on Steamroller.
|
9 years ago |
Zhang Xianyi
|
d4380c1fe4
|
Refs xianyi/OpenBLAS-CI#10. Fix sdot for scipy test_iterative.test_convergence test failure on AMD bulldozer and piledriver.
|
9 years ago |
Werner Saar
|
faa5e2e5e3
|
FIX: forgot to add the files cgemv_n_4.c and cgemv_t_4.c
|
10 years ago |
Werner Saar
|
fdf291be30
|
Added optimized cgemv_n and cgemv_t kernels for bulldozer, piledriver and steamroller
|
10 years ago |
Werner Saar
|
c99cc41cbd
|
Added optimized zgemv_n kernel for bulldozer, piledriver and steamroller
|
10 years ago |
Werner Saar
|
acdff55a6a
|
Bugfix for ztrmv
|
10 years ago |
Zhang Xianyi
|
7d6b68eb4a
|
Refs #786. Revert to default assembly kernel.
|
10 years ago |
Zhang Xianyi
|
8f758eeff9
|
Refs #786. avoid old assembly c/zgemv kernels.
|
10 years ago |
Zhang Xianyi
|
efa4f5c936
|
Refs #695 #783. Replace default x86_64 cgemv_t asm kernel by C kernel.
|
10 years ago |
Zhang Xianyi
|
6e7be06e07
|
Refs JuliaLang/julia#5728. Fix gemv performance bug on Haswell Mac OSX.
On Mac OS X, it should use .align 4 (equal to .align 16 on Linux).
I didn't get the performance benefit from .align. Thus, I deleted it.
|
10 years ago |
Zhang Xianyi
|
962376664d
|
Refs #768. Swap the result of zdot x87 fp kernel.
|
10 years ago |
Zhang Xianyi
|
c44ff4d648
|
Refs #714. Avoid compiler warnings.
|
10 years ago |
Werner Saar
|
c8f2c5d636
|
added optimized trsm_kernels
|
10 years ago |
Zhang Xianyi
|
69363622a8
|
Fix DYNAMIC_ARCH=1 bug.
|
10 years ago |
Zhang Xianyi
|
f874465bb8
|
Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.
Disable CBLAS and LAPACK.
|
10 years ago |
Zhang Xianyi
|
ab0a0a75fc
|
Merge branch 'develop' into cmake
|
10 years ago |
Zhang Xianyi
|
1cf2b10224
|
Use pure C generic target on x86 and x86_64.
make TARGET=GENERIC
?gemm3m is unimplemented on generic target.
|
10 years ago |
Zhang Xianyi
|
7ac7e147d4
|
Fixed cmake building bugs on Linux. Disable LAPACK by default.
|
10 years ago |
Werner Saar
|
e7c969e164
|
added optimized dtrmm_kernel for haswell
|
10 years ago |
Werner Saar
|
9bd962f655
|
modified haswell parameter dgemm_unroll_n
|
10 years ago |
Werner Saar
|
24f58c8bb1
|
added optimized cscal and zscal kernels for steamroller
|
10 years ago |
Werner Saar
|
95b1faf667
|
added optimized cscal and zscal kernels for steamroller and piledriver
|
10 years ago |
Werner Saar
|
2d9e406050
|
added optimized cscal kernel for sandybridge
|
10 years ago |
Werner Saar
|
59083e3ce1
|
added optimized cscal kernel for bulldozer
|
10 years ago |
wernsaar
|
685be40339
|
Merge pull request #571 from wernsaar/develop
added optimized cscal and zscal functions
|
10 years ago |
Werner Saar
|
31c9e399e9
|
added optimized cscal kernel for haswell
|
10 years ago |
Werner Saar
|
7de6bb9889
|
added optimized zscal kernel for bulldozer
|
10 years ago |
Werner Saar
|
d63034303b
|
added optimized zscal kernel for haswell
|
10 years ago |
Zhang Xianyi
|
51ff17d46e
|
Add AMD Excavator target.
|
10 years ago |
Werner Saar
|
18e90ee2e3
|
bugfix: added static to functions
|
10 years ago |
Werner Saar
|
e00cccc41e
|
added optimized dscal kernel for piledriver
|
10 years ago |
Werner Saar
|
73f09bf64f
|
optimized dscal kernel for increment != 1
|
10 years ago |
Werner Saar
|
02e772c7e4
|
added optimized dscal kernel for haswell
|
10 years ago |
Werner Saar
|
7aee913991
|
added optimized dscal kernel for sandybridge
|
10 years ago |
Werner Saar
|
e50a933037
|
added optimized dscal kernel for bulldozer
|
10 years ago |
Werner Saar
|
133c11a156
|
updated dgemv_n kernel for nehalem
|
10 years ago |
Werner Saar
|
30f52d53df
|
optimized dgemv_n kernel for haswell
|
10 years ago |
Werner Saar
|
5e83d80725
|
optimized dger kernel for sandybridge
|
10 years ago |
Werner Saar
|
b2e1797dc6
|
added optimized sger kernel for sandybridge
|
10 years ago |
Werner Saar
|
e216f686cb
|
optimized saxpy and daxpy for sandybridge
|
10 years ago |
Werner Saar
|
fc0e0391f3
|
bugfixes: replaced int with BLASLONG
|
10 years ago |
Werner Saar
|
c22068c406
|
optimized sdot.c for increments != 1
|
10 years ago |
Werner Saar
|
dee100d0e4
|
optimized saxpy.c for increments != 1
|
10 years ago |
Werner Saar
|
0273966abb
|
optimized daxpy kernel for increments != 1
|
10 years ago |
Werner Saar
|
3a67daa954
|
optimized ddot.c for increments != 1
|
10 years ago |
Werner Saar
|
b4f2153dcd
|
added optimized ssymv kernels for sandybridge
|
10 years ago |
Werner Saar
|
1c4b0eeae3
|
added optimized ssymv kernels for haswell
|
10 years ago |
Werner Saar
|
1bec9abb9a
|
added optimized dsymv kernels for sandybridge
|
10 years ago |
Werner Saar
|
3814bf60d3
|
added optimized dsymv kernels for haswell
|
10 years ago |