wernsaar
|
0b6e13b689
|
Merge remote branch 'origin/develop' into haswell
|
12 years ago |
wernsaar
|
5c648a8984
|
Merge remote branch 'origin/develop' into haswell
|
12 years ago |
wernsaar
|
f1db386211
|
changes for compatibility with Pathscale compiler
|
12 years ago |
Zhang Xianyi
|
2f5fdd2000
|
Refs #314. Fixed clang compiling bug on OSX.
|
12 years ago |
wernsaar
|
6216ab8a7e
|
removed obsolete gemm_kernels from haswell branch
|
12 years ago |
wernsaar
|
afe44b0241
|
tests and code cleanup of gemm_kernels for HASWELL
|
12 years ago |
wernsaar
|
a77c71eaf5
|
added highly optimized dgemm_kernel for HASWELL
|
12 years ago |
wernsaar
|
fe8c5666f9
|
optimized dgemm_kernel for HASWELL
|
12 years ago |
wernsaar
|
f6b50057e2
|
corrected and tested FMA3 code
|
12 years ago |
wangqian
|
beffee7d91
|
Fixed buffer overflow bug in kernel/x86_64/dgemv_t.S file.
|
12 years ago |
wernsaar
|
067e8417fd
|
removed unnecessary instructions from zgemm_kernel_2x2_bulldozer.S
|
12 years ago |
wernsaar
|
a82da3d069
|
removed unnecessary instructions
|
12 years ago |
Zhang Xianyi
|
1569bf14f8
|
Refs #282. Fixed zgemv_n typo bug on Win64.
|
12 years ago |
Zhang Xianyi
|
f51a849d91
|
Merge pull request #278 from wernsaar/haswell
Merge wernsaar's Haswell gemm kernels.
|
12 years ago |
wernsaar
|
44ef70420c
|
added cgemm_kernel_8x2_haswell.S
|
12 years ago |
wernsaar
|
d488b1b1aa
|
added zgemm_kernel_4x2_haswell.S
|
12 years ago |
wernsaar
|
4070d9a123
|
added dgemm_kernel_16x2_haswell.S
|
12 years ago |
wernsaar
|
0b90c0ec64
|
added sgemm_kernel_16x4_haswell.S
|
12 years ago |
wernsaar
|
2b8ab8f55b
|
sgemm_kernel_16x4_haswell.S minor changes
|
12 years ago |
wernsaar
|
1cb9579cd0
|
added zgemm_kernel_4x2_haswell.S and fixed a bug in sgemm_kernel_16x4_haswell.S
|
12 years ago |
Zhang Xianyi
|
2638370844
|
Init code base for Intel Haswell.
|
12 years ago |
wernsaar
|
89637f87c8
|
added sgemm- and dgemm-kernel for HASWELL processor
|
12 years ago |
Zhang Xianyi
|
c0159d44a3
|
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop
|
12 years ago |
wernsaar
|
c17a850c1c
|
modified KERNEL.BULLDOZER
|
12 years ago |
wernsaar
|
099853fff6
|
added dtrsm_kernel_RN_8x2_bulldozer.S
|
12 years ago |
wernsaar
|
44d23881b5
|
dtrsm_kernel_LT_8x2_bulldozer.S performance optimization
|
12 years ago |
Zhang Xianyi
|
32fb6b9bb2
|
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop
|
12 years ago |
wernsaar
|
aaeb8eaecd
|
modified dtrsm_kernel_LT_8x2_bulldozer.S
|
12 years ago |
wernsaar
|
8aeec32ea0
|
modified dtrsm_kernel_LT_8x2_bulldozer.S
|
12 years ago |
wernsaar
|
87fc9de572
|
added dtrsm_kernel_LT_8x2_bulldozer.S
|
12 years ago |
wernsaar
|
564aa60fec
|
removed dtrsm_kernel_LT_8x2_bulldozer.S
|
12 years ago |
wernsaar
|
f645665dd6
|
fixed bug in dgemv_t_bulldozer.S
|
12 years ago |
wernsaar
|
e45a347cd2
|
repaired trmm bug in sgemm_kernel_16x2_bulldozer.S
|
12 years ago |
wernsaar
|
99727ac013
|
repaired trmm bug in cgemm_kernel_4x2_bulldozer.S
|
12 years ago |
wernsaar
|
6e0a2fbc0c
|
repaired trmm bug in zgemm_kernel_2x2_bulldozer.S
|
12 years ago |
wernsaar
|
0a22f99c58
|
repaired trmm bug in dgemm_kernel_8x2_bulldozer.S
|
12 years ago |
wernsaar
|
cff70a666d
|
added generic trmm kernels and modified Makefile.L3
|
12 years ago |
wernsaar
|
84bd0aabaa
|
added dtrsm_kernel_LT_8x2_bulldozer.S
|
12 years ago |
Zhang Xianyi
|
72b1edaf1b
|
Merge branch 'develop' into bulldozer
Conflicts:
kernel/x86_64/KERNEL.BULLDOZER
|
12 years ago |
wangqian
|
1b3b9e841d
|
Fixed a computational error in zgemm_kernel_4x4_sandy.S file.
|
12 years ago |
Zhang Xianyi
|
2ed0f6ab60
|
Fixed the typo.
|
12 years ago |
Zhang Xianyi
|
886cbaf4e4
|
Support AMD Piledriver by bulldozer kernels.
|
12 years ago |
Zhang Xianyi
|
57944538b6
|
Use ALIGN_5 instead of .align 32 in assembly kernel. Added ALIGN_5 for 32-bit OSX.
|
12 years ago |
Zhang Xianyi
|
fa916a0fac
|
Fixed #238 bug in lsame on x86.
|
12 years ago |
Zhang Xianyi
|
fb298b34ae
|
Merge pull request #235 from wernsaar/develop
Added ddot, daxpy, dcopy kernels for AMD bulldozer.
|
12 years ago |
wernsaar
|
16012767f4
|
added dcopy_bulldozer.S
|
12 years ago |
wernsaar
|
bcbac31b47
|
added ddot_bulldozer.S
|
12 years ago |
wernsaar
|
8dc0c72583
|
added daxpy_bulldozer.S
|
12 years ago |
wernsaar
|
89405a1a0b
|
cleanup of dgemm_ncopy_8_bulldozer.S
|
12 years ago |
wernsaar
|
4f2b12b8a8
|
added dgemv_t_bulldozer.S
|
12 years ago |