Isaac Dunham
db7e6366cd
Work around PIC limitations in cpuid.
cpuid uses register ebx, but ebx is reserved in PIC.
So save ebx, swap ebx & edi, and return edi.
Copied from Igor Pavlov's equivalent fix for 7zip (in CpuArch.c),
which is public domain and thus OK license-wise.
11 years ago
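The register shuffle described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the commit or from 7zip's CpuArch.c. The function name `cpuid_pic` and its parameters are hypothetical, and GCC/Clang-style inline assembly is assumed. On 32-bit x86 with PIC, ebx is saved into edi before `cpuid` runs, then the two are exchanged so that ebx is restored and the value `cpuid` wrote to ebx is returned through edi.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only; not the project's function.
 * Runs CPUID for the given leaf (subleaf 0) and stores the four result
 * registers through the pointers a, b, c, d. */
static void cpuid_pic(uint32_t leaf,
                      uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
#if defined(__i386__) || defined(__x86_64__)
#if defined(__i386__) && defined(__PIC__)
    /* 32-bit PIC: ebx is reserved, so it must not appear as an operand.
     * Save ebx in edi, run cpuid, then swap: ebx gets its old value back
     * and edi ends up holding the ebx result of cpuid. */
    __asm__ __volatile__(
        "mov %%ebx, %%edi\n\t"
        "cpuid\n\t"
        "xchg %%edi, %%ebx\n\t"
        : "=a"(*a), "=D"(*b), "=c"(*c), "=d"(*d)
        : "a"(leaf), "c"(0));
#else
    /* No PIC restriction on ebx here: use it directly. */
    __asm__ __volatile__(
        "cpuid"
        : "=a"(*a), "=b"(*b), "=c"(*c), "=d"(*d)
        : "a"(leaf), "c"(0));
#endif
#else
    /* Not an x86 target: CPUID does not exist, report zeros. */
    (void)leaf;
    *a = *b = *c = *d = 0;
#endif
}
```

Calling `cpuid_pic(0, &a, &b, &c, &d)` on an x86 machine leaves the highest supported basic leaf in `a` and the vendor string spread across `b`, `d`, and `c`.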
Zhang Xianyi
2702323f7d
Merge pull request #440 from wernsaar/develop
optimizations for level1 and level2 BLAS functions
11 years ago
wernsaar
20cd850125
modification for clang compiler
11 years ago
wernsaar
5fa6158731
removed flag -no-integrated-as, because it does not work on macosx
11 years ago
wernsaar
84badf8086
EXPERIMENTAL: added the flag -no-integrated-as for clang compiler in Makefile.system
11 years ago
Zhang Xianyi
c8cc4a0d22
Fixed the typo in Changelog.txt
11 years ago
wernsaar
3885eebdb8
added optimized zaxpy bulldozer kernel
11 years ago
wernsaar
ee74445155
added optimized caxpy kernel for bulldozer
11 years ago
wernsaar
9d2ace8bac
added optimized daxpy kernel for bulldozer
11 years ago
wernsaar
b55f997302
added optimized daxpy kernel for nehalem
11 years ago
wernsaar
29125864b3
updated gemm.c
11 years ago
wernsaar
e45c960c2c
added optimized saxpy kernel for nehalem
11 years ago
wernsaar
55e81da379
added axpy benchmark-test
11 years ago
wernsaar
ac76b6267f
added optimized dgemv_n kernel for nehalem
11 years ago
wernsaar
f1b96c4846
added optimized ddot kernel for bulldozer
11 years ago
wernsaar
16d6be852d
added optimized ddot kernel for nehalem
11 years ago
wernsaar
53ec5789e2
bugfix for Makefile
11 years ago
wernsaar
95a707ced3
update of KERNEL.BULLDOZER
11 years ago
wernsaar
5d97b0754c
added optimized sdot kernel for nehalem
11 years ago
wernsaar
8a9e868919
added optimized sdot for bulldozer
11 years ago
wernsaar
7e404de3de
bugfix in Makefile
11 years ago
wernsaar
e4472ad850
added sdot and ddot benchmarks
11 years ago
wernsaar
fb0b4552a5
added hemv benchmark
11 years ago
wernsaar
6f73ffc114
added benchmarks for csymv and zsymv
11 years ago
wernsaar
c8b0645266
added optimized symv_L kernels for nehalem
11 years ago
wernsaar
ec05ff3f64
added optimized ssymv_L kernel for bulldozer
11 years ago
wernsaar
f6f9122660
added optimized dsymv_L kernel for bulldozer
11 years ago
wernsaar
8247f38dc1
added optimized dsymv_U kernel for nehalem
11 years ago
wernsaar
ef6374196d
updated optimized dsymv_U kernel for bulldozer
11 years ago
wernsaar
f824c2b751
updated optimized ssymv_U for bulldozer
11 years ago
wernsaar
4ba4ab623f
added optimized ssymv_U kernel for nehalem
11 years ago
wernsaar
4f39447c05
added optimized ssymv_U kernel for bulldozer
11 years ago
wernsaar
74c9465672
added optimized dsymv_U kernel for bulldozer
11 years ago
Zhang Xianyi
a69dd3fbc5
OpenBLAS 0.2.11 version.
11 years ago
wernsaar
101dd08173
add reference in C for symv_U
11 years ago
wernsaar
493d4fe7e5
added reference in C for symv_L
11 years ago
wernsaar
0a22816e70
Ref #433 : removed obsolete lapack entries from common_interface.h
11 years ago
Zhang Xianyi
c3cd6e7e32
Merge pull request #434 from wernsaar/develop
A lot of performance enhancements
11 years ago
wernsaar
11eab4c019
added optimized cgemv_n for haswell
11 years ago
wernsaar
4568d32b6b
added optimized cgemv_t kernel for haswell
11 years ago
wernsaar
c1a6374c6f
optimized zgemv_n kernel for sandybridge
11 years ago
wernsaar
dc05937313
added additional test values
11 years ago
wernsaar
2470129132
added fast return, if m or n < 1
11 years ago
wernsaar
8c582d362d
optimized zgemv_t_microk_haswell-2.c
11 years ago
wernsaar
11e34ddd1b
bugfix for zgemv_n_microk_haswell-2.c
11 years ago
wernsaar
9528f0d9ee
bugfix in zgemv_n_microk_sandy-2.c
11 years ago
wernsaar
b06550519e
added optimized cgemv_t c-kernel
11 years ago
wernsaar
6093ee5363
bugfix in zgemv_n_microk_haswell-2.c
11 years ago
wernsaar
07c66b1960
modified algorithm for better numerical stability
11 years ago
wernsaar
58b075daef
added optimized zgemv_t kernel for haswell
11 years ago