Dan Luu
|
6544d30e42
|
Fix segfault when gemm is called immediately after set_num_threads.
|
11 years ago |
Zhang Xianyi
|
1cba8e7b11
|
Merge pull request #446 from grisuthedragon/cblas_matcopy
Add a CBLAS interface for the BLAS extension s/d/c/z*matcopy routines.
|
11 years ago |
Zhang Xianyi
|
d13e92f07e
|
Merge pull request #445 from wernsaar/develop
A lot of optimizations for gemv kernels
|
11 years ago |
wernsaar
|
baa46e4fba
|
added and tested optimized dgemv_n kernel for haswell
|
11 years ago |
wernsaar
|
faab7a181d
|
added optimized dgemv_n kernel for haswell
|
11 years ago |
wernsaar
|
8109d8232c
|
optimized dgemv_t kernel for haswell
|
11 years ago |
wernsaar
|
debc6d1a05
|
bugfix in KERNEL.HASWELL
|
11 years ago |
wernsaar
|
e73a0113ec
|
added optimized gemv kernels
|
11 years ago |
wernsaar
|
44f2bf9bae
|
added optimized dgemv_t kernel for haswell
|
11 years ago |
Martin Koehler
|
a057e5434d
|
add CBLAS interface for s/d/c/zimatcopy
|
11 years ago |
wernsaar
|
cd34e9701b
|
removed obsolete files
|
11 years ago |
Martin Köhler
|
7794766d3c
|
Add cblas_(s/d/c/z)omatcopy in order to have cblas interface for them.
|
11 years ago |
wernsaar
|
658939faaa
|
optimized dgemv_n kernel for small sizes
|
11 years ago |
wernsaar
|
f511807fc0
|
modified multithreading threshold
|
11 years ago |
wernsaar
|
c4d9d4e5f8
|
added haswell optimized kernel
|
11 years ago |
wernsaar
|
7c0a94ff47
|
bugfix in sgemv_n_microk_haswell-4.c
|
11 years ago |
wernsaar
|
cbbc80aad3
|
added optimized sgemv_t kernel for haswell
|
11 years ago |
wernsaar
|
2be5c7a640
|
bugfix for windows
|
11 years ago |
wernsaar
|
80f7786875
|
enabled optimized sgemv kernels for piledriver
|
11 years ago |
wernsaar
|
553e275407
|
optimized sgemv_n kernel for sandybridge
|
11 years ago |
wernsaar
|
7b3932b3f3
|
optimized sgemv_n kernel for nehalem
|
11 years ago |
wernsaar
|
75207b1148
|
optimized sgemv_n for very small size of m
|
11 years ago |
wernsaar
|
274828fa50
|
optimizations for very small sizes
|
11 years ago |
wernsaar
|
5ae1731fe6
|
better optimizations for sgemv_t kernel
|
11 years ago |
wernsaar
|
c8eaf3ae2d
|
optimized sgemv_t_4 kernel for very small sizes
|
11 years ago |
wernsaar
|
3a7ab47ee9
|
optimized sgemv_t
|
11 years ago |
wernsaar
|
cf5544b417
|
optimization for small size
|
11 years ago |
wernsaar
|
d143f84dd2
|
added optimized sgemv_n kernel for haswell
|
11 years ago |
wernsaar
|
7794237475
|
undef WHEREAMI
|
11 years ago |
wernsaar
|
a64fe9bcc9
|
added optimized sgemv_n kernel for sandybridge
|
11 years ago |
wernsaar
|
2021d0f9d6
|
experimentally removed expensive function calls
|
11 years ago |
wernsaar
|
6df7a88930
|
optimized sgemv_t for sandybridge
|
11 years ago |
wernsaar
|
53de943690
|
bugfix for sgemv_n_4.c
|
11 years ago |
wernsaar
|
7f910010a0
|
optimized sgemv_n kernel for small sizes
|
11 years ago |
wernsaar
|
3a5d8dbff9
|
optimized sgemv_n_4.c
|
11 years ago |
wernsaar
|
2a60c6d4b0
|
optimized sgemv_n for small sizes
|
11 years ago |
wernsaar
|
0fc560ba23
|
bugfix for buffer overflow
|
11 years ago |
wernsaar
|
d1800397f5
|
optimized interface/gemv.c for multithreading
|
11 years ago |
wernsaar
|
f4ff889491
|
updated interface/gemv.c for multithreading
|
11 years ago |
wernsaar
|
210bec9111
|
added plot-header to compare multithreading
|
11 years ago |
wernsaar
|
f3b50dcf5b
|
removed obsolete instructions from sgemv_t_4.c
|
11 years ago |
wernsaar
|
93eaba959d
|
optimized sgemv_t for bulldozer
|
11 years ago |
wernsaar
|
9570e56965
|
optimized sgemv_t_4.c for small sizes
|
11 years ago |
wernsaar
|
d7f91f8b4f
|
extended gemv.c benchmark
|
11 years ago |
wernsaar
|
53f1277b6b
|
modified benchmark/gemv.c
|
11 years ago |
wernsaar
|
bc99faef1b
|
optimized sgemv_t_4.c for uneven sizes
|
11 years ago |
wernsaar
|
848c0f16f7
|
optimized sgemv_t_4.c for small size
|
11 years ago |
wernsaar
|
e2fc8c8c2c
|
changed 1 test value (bug in lapack-testing?)
|
11 years ago |
wernsaar
|
53e6dbf6ca
|
optimized sgemv_t kernel for small sizes
|
11 years ago |
Zhang Xianyi
|
868f8a8756
|
Merge pull request #443 from idunham/fix
Workaround PIC limitations in cpuid.
|
11 years ago |