Commit Graph

  • 0a696bd4c Improved the makefile for Intel compiler. by Xianyi Zhang 2012-02-20 23:36:58 +0800
  • fda39c6cb (tag: v0.1alpha2.5) Updated the Changelog. by Xianyi Zhang 2012-02-20 09:06:43 +0800
  • 875da22a4 Merge pull request #77 from nolta/master by Xianyi Zhang 2012-02-19 16:44:35 -0800
  • 765387af1 (refs/pull/77/merge) Merge 363a563ec2 into 0caa5616f2 by GitHub Merge Button 2012-02-19 11:08:43 -0800
  • 363a563ec (refs/pull/77/head) fix #49 by Mike Nolta 2012-02-19 14:07:34 -0500
  • 8da6fdc2c Merge branch 'hotfix-0.1alpha2.5' into develop by Xianyi Zhang 2012-02-19 23:11:06 +0800
  • 0caa5616f Merge branch 'hotfix-0.1alpha2.5' by Xianyi Zhang 2012-02-19 22:56:06 +0800
  • 727e6d83c Released 0.1 alpha 2.5. Updated the documents. by Xianyi Zhang 2012-02-19 22:55:31 +0800
  • da3f101a7 Merge branch 'develop' into hotfix-0.1alpha2.5 by Xianyi Zhang 2012-02-19 22:31:09 +0800
  • fe613de8e refs #69. Auto-detect Intel Core i6/i7 (Sandy Bridge) CPU with Nehalem assembly kernels. by Xianyi Zhang 2012-02-13 19:20:35 +0800
  • 142e99d4e Merge branch 'master' into develop by Xianyi Zhang 2012-01-20 21:32:13 +0800
  • 7af0139a0 Modify P Q R size of Loongson3b. by traz 2012-01-11 16:05:39 +0000
  • 8e53b57bb Appending gemmkernel and trmmkernel C code in kernel/generic, this code can be used to execute on a new platform which dose not have optimized assemble kernel. by Wang Qian 2012-01-10 17:16:13 +0000
  • 0d3647c39 Merge pull request #76 from StefanKarpinski/patch-1 by Xianyi Zhang 2012-01-01 05:57:25 -0800
  • 233c7e4be (refs/pull/76/merge) Merge 0d76196a09 into fe7a932ab8 by GitHub Merge Button 2011-12-28 20:54:45 -0800
  • 0d76196a0 (refs/pull/76/head) Fix #68: don't require SystemStubs on OS X. by Stefan Karpinski 2011-12-28 23:53:20 -0500
  • b281f3dee Merge remote branch 'origin/loongson3a' into loongson3b by traz 2011-12-06 13:49:39 +0000
  • a4292976e Adding detection of complex situations in symm.c, otherwise the buffer address of sb will overlap the end of sa. by traz 2011-12-05 14:54:25 +0000
  • c2dad58ad Adding n32 multiple threads condition. by Wang Qian 2011-12-01 16:33:11 +0000
  • d5a6d789e Fixed a typo in Makefile. by Xianyi Zhang 2011-11-28 15:31:46 +0800
  • 875dde437 Merge branch 'lapack_3.4.0' into develop by Xianyi Zhang 2011-11-28 15:28:54 +0800
  • 5be22ca80 Refs #72. Upgraded LAPACK to 3.4.0 version. by Xianyi Zhang 2011-11-28 15:28:22 +0800
  • 66904fc4e BLAS3 used standard MIPS instructions without extensions on Loongson 3B. by Wang Qian 2011-11-25 11:20:25 +0000
  • 8163ab7e5 Change the block size on Loongson 3B. by Wang Qian 2011-11-23 18:40:35 +0000
  • ef6f7f32a Fixed mbind bug on Loongson 3B. Check the return value of my_mbind function. by Xianyi Zhang 2011-11-23 17:17:41 +0000
  • 285e69e2d Disable using simple thread level3 to fix a bug on Loongson 3B. by Xianyi Zhang 2011-11-17 16:46:26 +0000
  • d1baf14a6 Enable thread affinity on Loongson 3B. Fixed the bug of reading cycle counter. by Xianyi Zhang 2011-11-11 17:49:41 +0000
  • 0884f6b78 Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3b by Xianyi Zhang 2011-11-11 14:26:49 +0000
  • 2d78fb05c Add conjugate condition to gemv. by traz 2011-11-10 15:38:48 +0000
  • b95ad4cfa Support detecting ICT Loongson-3B CPU. by Xianyi Zhang 2011-11-09 19:28:22 +0000
  • 3bbe3ddb3 Merge branch 'develop' of github.com:xianyi/OpenBLAS into loongson3b by Xianyi Zhang 2011-11-09 19:08:29 +0000
  • a32e56500 Fix the compute error of gemv when incx and incy are negative numbers. by traz 2011-11-04 19:32:21 +0000
  • c1e618ea2 Add complete gemv function on Loongson3a platform. by traz 2011-11-03 13:53:48 +0000
  • 19f5b5c13 Fixed #66 the bug in zgemv kernel with transpose matrix on 64-bit MingW (Windows). by traits 2011-10-18 18:44:23 +0800
  • c852ce398 Ref #65. Fixed 64-bit Windows calling convention bug in cdot and zdot. by traits 2011-10-18 10:23:17 +0800
  • ba31b19c0 Ref #62. In OpenMP implementation, check the return value of omp_get_max_threads(). It makes sure the return value as same as blas_cpu_numbers which is an internal global variable to store the number of threads in OpenBLAS. by Xianyi Zhang 2011-10-16 22:56:19 +0800
  • 66a3c6df4 Ref #63. Fixed generating DLL bug on ming-w64. by traits 2011-10-09 17:25:44 +0800
  • 57658a8c1 ref #62. Added the user friendly message with USE_OPENMP=1. The users should use OMP_NUM_THREADS. by Xianyi Zhang 2011-10-09 15:14:48 +0800
  • 9fe3049de Adding conditional compilation(#if defined(LOONGSON3A)) to avoid affecting the performance of other platforms. by traz 2011-09-26 15:21:45 +0000
  • 831858b88 Modify aligned address of sa and sb to improve the performance of multi-threads. by traz 2011-09-23 20:59:48 +0000
  • 8de2ba67d Merge branch 'hotfix-0.1alpha2.4' into develop by Xianyi 2011-09-18 17:00:29 +0800
  • fe7a932ab (tag: v0.1alpha2.4) Merge branch 'hotfix-0.1alpha2.4' by Xianyi 2011-09-18 16:57:28 +0800
  • 1d31c79dc Prepared the document for 0.1 alpha 2.4 version. by Xianyi 2011-09-18 05:46:08 +0800
  • d40e5621e Change the installation folder into /include and /lib. by Xianyi 2011-09-18 05:07:00 +0800
  • bcc795621 Refs #57. Continue to fix absolute path issue about shared library on Mac OSX. by Xianyi 2011-09-18 01:35:12 +0800
  • 821cbb299 Updated the document for 0.1 alpha 2.4. by Xianyi 2011-09-17 07:55:59 +0800
  • 74fa79035 Merge branch 'develop' into hotfix-0.1alpha2.4 by Xianyi 2011-09-17 07:32:10 +0800
  • 756477bfe Output the installation tip after building complete. by Xianyi 2011-09-17 07:21:11 +0800
  • 864c68ffc Bump the version number. by Xianyi 2011-09-17 03:05:26 +0800
  • 68cae521d Refs #57. The bug about absolute path of shared library on Mac OSX. by Xianyi 2011-09-17 02:58:01 +0800
  • d0152ec8c Fixed #61 a building bug about setting TARGET and DYNAMIC_ARCH at the same time. by Xianyi 2011-09-17 02:27:56 +0800
  • e08cfaf9c Complete all the complex single-precision functions of level3, but the performance needs further improve. by traz 2011-09-16 17:50:40 +0000
  • ee4bb8bd2 Add ctrmm part in cgemm_kernel_loongson3a_4x2_ps.S. by traz 2011-09-16 16:08:39 +0000
  • 7fa3d23dd Complete cgemm function, but no optimization. by traz 2011-09-15 16:08:23 +0000
  • 9679dd077 Fix some compute error. by traz 2011-09-14 20:00:35 +0000
  • 048742f38 Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a by traz 2011-09-14 16:32:36 +0000
  • 7b410b7f0 Fixed #58 zdot SEGFAULT bug with GCC-4.6. Thank Mr. John for this patch. by Zhang Xiianyi 2011-09-14 23:52:51 +0800
  • d238a768a Use ps instructions in cgemm. by traz 2011-09-14 15:32:25 +0000
  • 260db9fb9 Merge branch 'hotfix-0.1alpha2.3' into develop by traits 2011-09-09 00:57:47 +0800
  • e27b761d7 (tag: v0.1alpha2.3) Merge branch 'hotfix-0.1alpha2.3' by traits 2011-09-09 00:55:04 +0800
  • 16fc08332 Refs #47. Fixed the seting parameter bug on Loongson 3A single thread version. by Xianyi Zhang 2011-09-08 16:39:34 +0000
  • 3c856c0c1 Check the return value of pthread_create. Update the docs with known issue on Loongson 3A. by Xianyi Zhang 2011-09-06 18:27:33 +0000
  • dc9c69db9 Merge branch 'develop' into loongson3a by Xianyi Zhang 2011-09-06 18:19:50 +0000
  • b1fe26c45 refs #55. Changed DTB_ENTRIES to DTB_DEFAULT_ENTRIES in x86 gemv_n kernel codes. by traits 2011-09-06 14:14:07 +0800
  • 0389b631f Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a by traz 2011-09-05 16:31:40 +0000
  • 64fa709d1 Fixed #46. Initialize variables in cblat3.f and zblat3.f. by traz 2011-09-05 16:30:55 +0000
  • 4727fe8ab Refs #47. On Loongson 3A, set DGEMM_R parameter depending on different number of threads. It would improve double precision BLAS3 on multi-threads. by Xianyi Zhang 2011-09-05 15:13:05 +0000
  • 90481ce74 Updated the doc about 0.1alpha2.3. by traits 2011-09-05 17:40:55 +0800
  • 9fc6764fa refs #55. Added DTB_ENTRIES into dynamic arch setting parameters. Now, it can read DTB_ENTRIES on runtime. by traits 2011-09-05 17:37:07 +0800
  • 74d4cdb81 Fix an illegal instruction for strmm_RTLU. by traz 2011-09-02 19:41:06 +0000
  • 790614683 Fix an error for strmm_LLTN. by traz 2011-09-02 16:57:33 +0000
  • 3274ff47b Fix an error for strmm_LLTN. by traz 2011-09-02 16:50:50 +0000
  • a059c553a Fix a compute error for strmm. by traz 2011-09-02 16:00:04 +0000
  • 23e182ca7 Fix stack-pointer bug for strmm. by traz 2011-09-02 15:28:01 +0000
  • a15bc9582 Add strmm part. by traz 2011-09-02 09:15:09 +0000
  • 74a3f6348 Tuning mb, kb, nb size to get the best performance. by traz 2011-09-01 17:15:28 +0000
  • 09f49fa89 Using PS instructions to improve the performance of sgemm and it is 4.2Gflops now. by traz 2011-08-31 21:24:03 +0000
  • b9d89f8aa Fixed the bug about installation. f77blas.h works OK now. by Xianyi Zhang 2011-08-31 18:21:37 +0800
  • cb0214787 Modify compile options. by traz 2011-08-30 20:57:00 +0000
  • 2e8cdd154 Using ps instruction. by traz 2011-08-30 20:54:19 +0000
  • b29d327d1 Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a by traz 2011-07-18 17:06:53 +0000
  • c8360e3ae Complete all the plura single precision functions of level3 on Loongson3a, the performance is 2.3GFlops. by traz 2011-07-18 17:03:38 +0000
  • 19d2ab485 Merge branch 'hotfix-0.1alpha2.2' into develop by traits 2011-07-14 01:09:21 +0800
  • 12d77deee (tag: v0.1alpha2.2) Merge branch 'hotfix-0.1alpha2.2' by traits 2011-07-14 01:03:09 +0800
  • 043927c7d Update the documents for 0.1alpha2.2 version. by traits 2011-07-14 01:02:19 +0800
  • 30947ea2d Fixed #44 a makefile bug when DYNAMIC_ARCH=1 and INTERFACE64=1. by traits 2011-07-14 00:54:23 +0800
  • 33313b022 Merge branch 'develop' into loongson3a by Xianyi Zhang 2011-07-07 14:25:51 +0800
  • a5300420e Merge branch 'hotfix-0.1alpha2.1' into develop by traits 2011-06-28 15:46:55 +0800
  • 9b46bf1eb (tag: v0.1alpha2.1) Merge branch 'hotfix-0.1alpha2.1' by traits 2011-06-28 15:43:08 +0800
  • c06b7be32 Refs #42. Output the error message when detecting fortran compiler failed. by traits 2011-06-28 15:42:09 +0800
  • 68532fa9e Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a by traz 2011-06-24 09:28:12 +0000
  • 708d2b625 Fix compute error in ztrmm. by traz 2011-06-24 09:27:41 +0000
  • e72113f06 Add ztrmm and ztrsm part on loongson3a. The average performance is 2.2G. by traz 2011-06-23 21:11:00 +0000
  • fc21f7ad2 Merge branch 'release-v0.1alpha2' into loongson3a by Xianyi Zhang 2011-06-23 16:08:23 +0800
  • 14f81da37 Change prefetch length of A and B, the performance is 2.1G now. by traz 2011-06-23 10:46:58 +0000
  • ca8bf5abb Merge branch 'release-v0.1alpha2' into develop by Xianyi Zhang 2011-06-23 16:07:34 +0800
  • 4a73f5c5e (tag: v0.1alpha2) Merge branch 'release-v0.1alpha2' by traits 2011-06-23 15:18:40 +0800
  • 6a0762949 Fixed #38. Released v0.1 alpha2. by traits 2011-06-23 15:16:24 +0800
  • 859b71645 Refs #37. Updated REAME about the compatible issue with EKOPath compiler. by traits 2011-06-23 15:09:34 +0800
  • 078bfd0b4 Refs #39. Moved the shared lib (dll) to top directory in MingW64 compiler environment. by Xianyi Zhang 2011-06-22 13:19:39 +0800