Commit Graph

  • a135f5d9e added gemm_tcopy_2_bulldozer.S by wernsaar 2013-06-18 11:01:33 +0200
  • d0b6299b1 added dgemm_tcopy_8_bulldozer.S by wernsaar 2013-06-17 14:19:09 +0200
  • 9e58dd509 added gemm_ncopy_2_bulldozer.S by wernsaar 2013-06-17 12:55:12 +0200
  • 7c8227101 cleanup of dgemv_n_bulldozer.S and optimization of inner loop by wernsaar 2013-06-16 12:50:45 +0200
  • f67fa6285 added dgemv_n_bulldozer.S by wernsaar 2013-06-15 16:42:37 +0200
  • cd1d473ba Merge pull request #230 from wernsaar/develop by Zhang Xianyi 2013-06-13 07:29:27 -0700
  • b2ebf211e (refs/pull/230/merge) Merge 0ded1fcc1c into 56f160134d by wernsaar 2013-06-13 07:29:02 -0700
  • 56f160134 Refs #231. Change the default C compiler to clang on Mac OSX. by Zhang Xianyi 2013-06-13 22:15:19 +0800
  • 0ded1fcc1 (refs/pull/230/head) performance optimizations in sgemm_kernel_16x2_bulldozer.S by wernsaar 2013-06-13 11:35:15 +0200
  • a789b588c added cgemm_kernel_4x2_bulldozer.S by wernsaar 2013-06-12 15:55:27 +0200
  • 8eaa04acb added zgemm_kernel_2x2_bulldozer.S by wernsaar 2013-06-11 12:00:49 +0200
  • d854b30ae Added UNROLL values for 3M to getarch_2nd.c, Makefile.system and Makefile.L3 by wernsaar 2013-06-09 17:26:42 +0200
  • d65bbec99 added new sgemm kernel for BULLDOZER by wernsaar 2013-06-09 15:57:42 +0200
  • e4c39c7c2 changed stack touching by wernsaar 2013-06-08 10:43:08 +0200
  • ba800f088 correct GEMM_THREAD in param.h by wernsaar 2013-06-08 10:03:59 +0200
  • 25491e42f New dgemm kernel for BULLDOZER: dgemm_kernel_8x2_bulldozer.S by wernsaar 2013-06-08 09:40:17 +0200
  • 960b0c88a Refs #227. Detected LLVM/Clang compiler. by Zhang Xianyi 2013-06-06 23:43:40 +0800
  • 65ffead0c Refs #124. Check XSAVE flag on x86 CPU. by Zhang Xianyi 2013-06-06 22:50:43 +0800
  • f2fb8c703 Change LIBSUFFIX from .lib to .a on windows. by Zhang Xianyi 2013-06-04 16:05:28 +0800
  • 9f59f384d Refs #223. Fixed s/dgemv bug on windows. by Zhang Xianyi 2013-06-04 16:01:05 +0800
  • 23965f164 Fixed overflow internal buffer bug of (s/d/c/z)gemv on x86_64. by wangqian 2013-05-29 19:48:31 +0800
  • 6a7284094 Fixed overflow internal buffer bug of (s/d/c/z)gemv on x86. by wangqian 2013-05-29 13:23:12 +0800
  • 947457fb7 Fixed the bug about testing the exist of lapack tar package. by Zhang Xianyi 2013-05-24 15:52:35 +0800
  • 79120bf9a Refs #205. Merge boegel's codes about downloading LAPACK. by Zhang Xianyi 2013-05-24 15:29:10 +0800
  • acb11905d Fixed #199. Saved USE_THREAD switch for make install. by Zhang Xianyi 2013-05-24 15:15:52 +0800
  • 109500178 Refs #220. Support Power7 by old Power6 kernels. by Zhang Xianyi 2013-05-21 22:59:45 +0800
  • e50a66486 Refs #215. Fixed the compatible between <complex.h> and <complex> in C++. by Zhang Xianyi 2013-05-17 16:41:05 +0800
  • 357078b93 Refs #216. Revert the default value of GEMM_MULTITHREAD_THRESHOLD to 4. by Zhang Xianyi 2013-05-03 09:08:54 +0800
  • 731220f87 changed DGEMM_DEFAULT_P and DGEMM_DEFAULT_Q to 248 for BULLDOZER 64bit by wernsaar 2013-04-30 10:07:17 +0200
  • 69aa6c8fb bad performance with some data by wernsaar 2013-04-28 11:14:23 +0200
  • 60b263f3d removed trsm_kernel_RT_4x4_bulldozer.S. wrong results by wernsaar 2013-04-27 17:23:08 +0200
  • 7ac306e0d added trsm_kernel_RT_4x4_bulldozer.S by wernsaar 2013-04-27 16:48:48 +0200
  • 4cb454cdf added trsm_kernel_LT_4x4_bulldozer.S by wernsaar 2013-04-27 14:30:00 +0200
  • 19ad2fb12 prefetch improved. Defined 2 different kernels for inner loop by wernsaar 2013-04-27 13:40:49 +0200
  • 5d96e4f22 Refs #210. Disable checking /lib/libpthread.so*. by Zhang Xianyi 2013-04-27 15:02:04 +0800
  • 682167748 minor improvements and code cleanup by wernsaar 2013-04-26 20:05:42 +0200
  • dbbda55e6 Updated the mailing list for OpenBLAS. by Xianyi Zhang 2013-04-25 00:45:42 +0800
  • 6c34a7f43 Updated the mailing list for OpenBLAS. by Xianyi Zhang 2013-04-25 00:44:22 +0800
  • 3326f3152 Merge pull request #213 from wernsaar/develop by Zhang Xianyi 2013-04-17 23:56:09 -0700
  • c7fdc692c (refs/pull/213/merge) Merge 7641f6e253 into 48bdc1ad3b by wernsaar 2013-04-16 10:15:33 -0700
  • 7641f6e25 (refs/pull/213/head) Merged some improvements into dgemm_kernel_4x4_bulldozer.S. Changed the copy functions to generic to solve prefetch conflicts by wernsaar 2013-04-16 19:05:06 +0200
  • 48bdc1ad3 Added NO_PARALLEL_MAKE flag to disable parallel make. by Zhang Xianyi 2013-04-15 21:37:30 +0800
  • 3ad29452d Merge pull request #211 from wernsaar/develop by Zhang Xianyi 2013-04-15 00:20:55 -0700
  • e2fc2344c (refs/pull/211/merge) Merge 6e3f6f25a5 into a068d54981 by wernsaar 2013-04-12 09:11:40 -0700
  • 6e3f6f25a (refs/pull/211/head) New version of dgemm_kernel_4x4_bulldozer.S The peak performance with 8 cores is now 90 GFlops by wernsaar 2013-04-12 17:55:51 +0200
  • 986d542ac Merge branch 'loongson3a' into loongson3b by Xianyi Zhang 2013-04-11 16:07:59 +0800
  • 990efcab6 Merge branch 'loongson3b' into loongson3a by Zhang Xianyi 2013-04-11 16:11:03 +0000
  • 75a5dc397 Added the configure for the host loongcc compiling on Loongson3. by Zhang Xianyi 2013-04-11 16:10:47 +0000
  • 6958c1a1a Fixed the SEGFAULT bug with Loongcc and Loongson3. by Xianyi Zhang 2013-04-11 15:33:43 +0800
  • a068d5498 Refs #209. Export the missing cblas_cdotc_sub functions. by Zhang Xianyi 2013-04-08 23:21:28 +0800
  • d692ee07f Merge branch 'loongson3a' into loongson3b by Xianyi Zhang 2013-04-08 14:56:39 +0800
  • 1a57717b1 Added the configuration of Loongcc compiler for Loongson 3 CPU. by Xianyi Zhang 2013-04-07 15:42:07 +0800
  • 6b01d5871 Disable the optimization of muli-threading gemm on the Loongson3A. by Xianyi Zhang 2013-03-30 20:12:43 +0000
  • 35b943f17 Merge branch 'develop' into loongson3a by Xianyi Zhang 2013-03-27 14:36:15 +0000
  • e02924287 Merge pull request #206 from wlbksy/patch-1 by Zhang Xianyi 2013-03-23 09:57:41 -0700
  • f8c889529 (refs/pull/206/merge) Merge 7a9b94b519 into f4846afbad by wlbksy 2013-03-22 23:41:47 -0700
  • 7a9b94b51 (refs/pull/206/head) Fix #204 by wlbksy 2013-03-23 14:41:26 +0800
  • e3c21da90 (refs/pull/205/merge) Merge 66b919d99f into f4846afbad by Kenneth Hoste 2013-03-22 11:47:05 -0700
  • 66b919d99 (refs/pull/205/head) adjusted Makefile to allow for provided required LAPACK source files rather than downloading them by Kenneth Hoste 2013-03-22 19:45:11 +0100
  • f4846afba Merge pull request #201 from Explorer09/develop by Zhang Xianyi 2013-03-18 07:31:30 -0700
  • 17176ae7e (refs/pull/201/merge) Merge 53588bc786 into d831b2ff8b by Explorer09 2013-03-17 08:16:26 -0700
  • 53588bc78 (refs/pull/201/head) getarch.c: Minor re-ordering of architecture list by Explorer09 2013-03-17 23:09:23 +0800
  • b47f13ee4 getarch.c: Minor re-ordering of architecture list by Explorer09 2013-03-17 23:07:48 +0800
  • 309f90e56 TargetList.txt: minor re-ordering by Explorer09 2013-03-17 23:03:05 +0800
  • 773c01f49 Typo correction in README.md by Explorer09 2013-03-17 22:48:24 +0800
  • d831b2ff8 Override CFLAGS in LAPACK make.in. by Zhang Xianyi 2013-03-10 01:01:16 +0800
  • 724ae159c Fixed the Windows x86_64 ABI bug in s/daxpy kernels. by Zhang Xianyi 2013-03-08 22:28:34 +0800
  • 2c9a203bd Merge pull request #198 from wernsaar/develop by Zhang Xianyi 2013-03-06 13:39:53 -0800
  • 65e54956d (refs/pull/198/merge) Merge f300ce3df5 into e2c7c75715 by wernsaar 2013-03-06 09:04:11 -0800
  • f300ce3df (refs/pull/198/head) new optimization of dgemm kernel for bulldozer: 10% performance increase by wernsaar 2013-03-06 17:26:03 +0100
  • e2c7c7571 Merge pull request #197 from wernsaar/develop by Zhang Xianyi 2013-03-06 01:11:08 -0800
  • 059e985db (refs/pull/197/merge) Merge 66e64131ed into 5900b1462e by wernsaar 2013-03-05 10:59:43 -0800
  • 66e64131e (refs/pull/197/head) optimized again bulldozer dgemm kernel by wernsaar 2013-03-05 19:51:37 +0100
  • 5900b1462 Merge pull request #195 from wernsaar/develop by Zhang Xianyi 2013-03-05 05:35:42 -0800
  • 901230f0d (refs/pull/195/merge) Merge 9405f26f4b into 529f1b5006 by wernsaar 2013-03-04 08:59:38 -0800
  • 9405f26f4 (refs/pull/195/head) new dgemm_kernel for bulldozer by wernsaar 2013-03-04 17:37:38 +0100
  • 54e7b3763 (tag: v0.2.6) Merge branch 'develop' by Zhang Xianyi 2013-03-02 14:42:06 +0800
  • 529f1b500 Refs#194. Export the missing LAPACK s/dlamc3 functions. by Zhang Xianyi 2013-03-02 14:41:18 +0800
  • e5ac3007e Merge branch 'develop' by Zhang Xianyi 2013-03-02 14:24:23 +0800
  • 0d0405b43 Updated the doc for 0.2.6 version. by Zhang Xianyi 2013-03-02 14:22:27 +0800
  • f1ce74ffd Improved the print when OS don't support AVX. by Zhang Xianyi 2013-03-02 14:15:54 +0800
  • d744c9590 In OpenMP threading, preallocate the thread buffer instead of allocating the buffer every time. This patch improved the performance slightly. by Zhang Xianyi 2013-03-01 14:36:47 +0800
  • 3cc6ae793 Refs #174. Return sb pointer when OpenMP or Windows. by Zhang Xianyi 2013-02-26 00:48:21 +0800
  • 4c2123c33 Fixed the overflowing bug in single thread cholesky factorization. by Zhang Xianyi 2013-02-23 12:51:13 +0800
  • 5155e3f50 Refs #174. Fixed the overflowing buffer bug of multithreading hbmv and sbmv. by Zhang Xianyi 2013-02-13 16:05:58 +0800
  • 5c8bf6ae0 Merge branch 'bulldozer' into develop by Zhang Xianyi 2013-02-10 01:19:42 +0800
  • 6ae2f868f Set the affinity. Only use 1 core of each module on bulldozer. by Zhang Xianyi 2013-02-09 18:18:55 +0100
  • a1ead62f2 Disable the warning of sgemm bulldozer kernel. by Zhang Xianyi 2013-02-09 17:03:13 +0100
  • 013358014 Used sgemm bulldozer kernel on 64 bit. by Zhang Xianyi 2013-02-09 16:29:14 +0100
  • 274246651 Merge branch 'bulldozer' of git://github.com/wernsaar/OpenBLAS into bulldozer by Zhang Xianyi 2013-02-09 16:25:07 +0100
  • 299b5a44d Merge branch 'develop' of github.com:xianyi/OpenBLAS into bulldozer by Zhang Xianyi 2013-02-09 16:22:04 +0100
  • a9500d007 Missing line continuation -- follow-up to last commit (64ad8b9809). by Zaheer Chothia 2013-02-01 09:34:12 +0100
  • 64ad8b980 Refs #193. Don't use C99 complex numbers when building C++ code. by Zaheer Chothia 2013-02-01 09:24:44 +0100
  • 875d520cc Refs #193. cblas: move #include out of extern "C" block. by Zaheer Chothia 2013-01-31 08:48:27 +0100
  • d311236df Refs #189. Fixed the bug of s/cdot about invalid reading NAN on x86_64. by Zhang Xianyi 2013-01-25 16:18:27 +0800
  • 36e098296 Refs #187. Use perl to generate cblas_noconst.h instead of sed. by Zhang Xianyi 2013-01-22 00:29:54 +0800
  • 8cdb79543 Refs #187. Use binary code for xgetbv, which is compatible with old compiler. by Zhang Xianyi 2013-01-22 00:18:21 +0800
  • 4db6660de Refs #185. Add missing 'const' to declarations in <cblas.h>. Thanks to Dan Povey! by Zaheer Chothia 2013-01-20 21:53:52 +0100
  • 0b08f7479 Refs #154. Fixed gemv_t bug about overflow 16MB buffer on x86. by Zhang Xianyi 2013-01-20 21:22:12 +0800
  • 200e4acf1 cblas: typedef enums for improved compatibility with Intel MKL. by Zaheer Chothia 2012-06-25 13:51:46 +0200