Werner Saar
4319769b79
added target processor STEAMROLLER
11 years ago
Jerome Robert
e9d9a8eae3
Allow gemv and ger buffer allocation on the stack
ger and gemv call blas_memory_alloc/free, which in turn
call blas_lock. blas_lock creates thread contention when matrices
are small and the number of threads is high enough. We avoid
calling blas_memory_alloc by replacing it with stack allocation.
This can be enabled with:
make -DMAX_STACK_ALLOC=2048
The given size (in bytes) must be high enough to avoid thread contention
and small enough to avoid stack overflow.
Fix #478
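
The idea in this commit message can be illustrated with a minimal sketch. It is not the code from the commit: `shared_alloc`, `shared_free`, `heap_allocs`, and `scaled_sum` are hypothetical names, with `shared_alloc`/`shared_free` standing in for the lock-protected `blas_memory_alloc`/`blas_memory_free` pair, and the fixed-size stack array is only one assumed way to honor `MAX_STACK_ALLOC`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Threshold in bytes; 2048 mirrors the value in the commit message. */
#ifndef MAX_STACK_ALLOC
#define MAX_STACK_ALLOC 2048
#endif

/* Hypothetical stand-ins for the shared, lock-protected allocator.
 * heap_allocs counts how often the contended path is taken. */
static int heap_allocs = 0;
static void *shared_alloc(size_t bytes) { heap_allocs++; return malloc(bytes); }
static void shared_free(void *p) { free(p); }

/* Hypothetical routine that needs n doubles of scratch space:
 * it scales x by alpha into the scratch buffer and returns the sum. */
static double scaled_sum(const double *x, size_t n, double alpha)
{
    double stack_buf[MAX_STACK_ALLOC / sizeof(double)];
    size_t bytes = n * sizeof(double);
    int on_stack = bytes <= sizeof(stack_buf);
    double *buf;
    double sum = 0.0;
    size_t i;

    assert(x != NULL);

    /* Small requests use the stack and never touch the shared allocator. */
    buf = on_stack ? stack_buf : (double *)shared_alloc(bytes);
    if (buf == NULL)
        return 0.0; /* allocation failure; a real routine would report it */

    for (i = 0; i < n; i++)
        buf[i] = alpha * x[i];
    for (i = 0; i < n; i++)
        sum += buf[i];

    if (!on_stack)
        shared_free(buf);
    return sum;
}
```

With the default threshold, a call on 4 elements stays on the stack, while a call on 1000 elements (8000 bytes) falls back to the shared allocator. This shows the trade-off the message describes: a larger threshold keeps more calls off the contended path but costs more stack per call.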
11 years ago
Werner Saar
587e16fba3
Ref #458 : Backport, sandybridge uses nehalem zgemm kernel
11 years ago
Werner Saar
6261342de3
small optimization on dgemm_kernel for N=1
11 years ago
Werner Saar
bc5fff7085
changed inline assembler labels to short form
11 years ago
Zhang Xianyi
0cf29ba6d2
Fixed a bug in the sgemm sandy bridge kernel.
Reported by Julia project. JuliaLang/julia#9084
11 years ago
Zhang Xianyi
2fb02626da
Update organization info.
11 years ago
Zhang Xianyi
a85c2785ae
Refs #467 . Added generic kernel file for x86_64.
11 years ago
wernsaar
b7c9566eea
removed obsolete gemv kernel files
11 years ago
wernsaar
6df1b0be81
optimized zgemv_n_microk_sandy-4.c
11 years ago
wernsaar
2ac1e076c1
added optimized zgemv_n kernel for sandybridge
11 years ago
wernsaar
9908b6031c
bugfix in KERNEL.PILEDRIVER
11 years ago
wernsaar
8f100a14f2
optimized cgemv_t kernel for haswell
11 years ago
wernsaar
53b5726b04
added optimized cgemv_t kernel for haswell
11 years ago
wernsaar
1a352b24e6
updated KERNEL.HASWELL
11 years ago
wernsaar
5194818d4b
updated zgemv_t_4.c
11 years ago
wernsaar
8a39cdb1c1
added optimized zgemv_t kernel for haswell
11 years ago
wernsaar
0a1390f2d8
enabled optimized zgemv_t kernel for bulldozer
11 years ago
wernsaar
a8b0812feb
optimized zgemv_t for bulldozer
11 years ago
wernsaar
a0fb68ab42
added optimized zgemv_t kernel for bulldozer
11 years ago
wernsaar
44c11165d5
bugfix in cgemv_t_4.c
11 years ago
wernsaar
564be4eb72
added optimized cgemv_t kernel
11 years ago
wernsaar
107c3ea7d5
added optimized zgemv_t routine
11 years ago
wernsaar
bb8d698335
optimized zgemv_n_microk_haswell-4.c for small size
11 years ago
wernsaar
e0192a6914
bugfix in zgemv_n_4.c
11 years ago
wernsaar
bced4594bb
added optimized zgemv_n kernel
11 years ago
wernsaar
cafba99b6b
bugfix in cgemv_n_microk_haswell-4.c
11 years ago
wernsaar
ac8f232b2a
more optimizations
11 years ago
wernsaar
f98e1244c4
optimized cgemv_n_4.c
11 years ago
wernsaar
be95700b30
added optimized cgemv_kernel for haswell
11 years ago
wernsaar
4aa534ae93
added cgemv_n kernel, optimized for small sizes
11 years ago
wernsaar
baa46e4fba
added and tested optimized dgemv_n kernel for haswell
11 years ago
wernsaar
faab7a181d
added optimized dgemv_n kernel for haswell
11 years ago
wernsaar
8109d8232c
optimized dgemv_t kernel for haswell
11 years ago
wernsaar
debc6d1a05
bugfix in KERNEL.HASWELL
11 years ago
wernsaar
e73a0113ec
added optimized gemv kernels
11 years ago
wernsaar
44f2bf9bae
added optimized dgemv_t kernel for haswell
11 years ago
wernsaar
cd34e9701b
removed obsolete files
11 years ago
wernsaar
658939faaa
optimized dgemv_n kernel for small sizes
11 years ago
wernsaar
c4d9d4e5f8
added haswell optimized kernel
11 years ago
wernsaar
7c0a94ff47
bugfix in sgemv_n_microk_haswell-4.c
11 years ago
wernsaar
cbbc80aad3
added optimized sgemv_t kernel for haswell
11 years ago
wernsaar
2be5c7a640
bugfix for windows
11 years ago
wernsaar
80f7786875
enabled optimized sgemv kernels for piledriver
11 years ago
wernsaar
553e275407
optimized sgemv_n kernel for sandybridge
11 years ago
wernsaar
7b3932b3f3
optimized sgemv_n kernel for nehalem
11 years ago
wernsaar
75207b1148
optimized sgemv_n for very small size of m
11 years ago
wernsaar
274828fa50
optimizations for very small sizes
11 years ago
wernsaar
5ae1731fe6
better optimizations for sgemv_t kernel
11 years ago
wernsaar
c8eaf3ae2d
optimized sgemv_t_4 kernel for very small sizes
11 years ago