
Fixed #27. Temporarily work around axpy's low performance issue with small input sizes and multiple threads.

tags/v0.1alpha2^2
Xianyi Zhang 14 years ago
parent
commit
aeed8d6225
2 changed files with 6 additions and 1 deletions
  1. Changelog.txt (+1 -0)
  2. interface/axpy.c (+5 -1)

Changelog.txt (+1 -0)

@@ -25,6 +25,7 @@ x86/x86_64:
 * Fixed #28 a wrong result of dsdot on x86_64.
 * Fixed #32 a SEGFAULT bug of zdotc with gcc-4.6.
 * Fixed #33 ztrmm bug on Nehalem.
+* Walk round #27 the low performance axpy issue with small imput size & multithreads.


 MIPS64:
 * Fixed #28 a wrong result of dsdot on Loongson3A/MIPS64.


interface/axpy.c (+5 -1)

@@ -85,7 +85,11 @@ void CNAME(blasint n, FLOAT alpha, FLOAT *x, blasint incx, FLOAT *y, blasint inc
   //In that case, the threads would be dependent.
   if (incx == 0 || incy == 0)
       nthreads = 1;

+  //Temporarily walk around the low performance issue with small imput size & multithreads.
+  if (n <= 10000)
+      nthreads = 1;
   if (nthreads == 1) {
 #endif
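
The thread-count decision in this hunk can be sketched in isolation. The following is a minimal illustration only, not the OpenBLAS code: the helper name `axpy_pick_threads`, the macro `AXPY_SERIAL_CUTOFF`, and the `max_threads` parameter are hypothetical. Only the `incx == 0 || incy == 0` check and the `n <= 10000` cutoff come from the diff. The rationale in the comments (threading overhead presumably dominating for small vectors) is an assumption based on the commit message.

```c
#include <assert.h>

/* Hypothetical name for the cutoff; the value 10000 is taken from the diff. */
#define AXPY_SERIAL_CUTOFF 10000

/* Illustrative helper (not an OpenBLAS function): choose how many threads
 * to use for y += alpha * x, given the vector length and strides. */
static int axpy_pick_threads(long n, long incx, long incy, int max_threads)
{
    assert(max_threads >= 1);

    /* With a zero stride, every iteration touches the same element,
     * so the threads would be dependent on each other. */
    if (incx == 0 || incy == 0)
        return 1;

    /* Small inputs: assumed here that the cost of dispatching threads
     * outweighs the work, so fall back to the single-threaded path. */
    if (n <= AXPY_SERIAL_CUTOFF)
        return 1;

    return max_threads;
}
```

Note that the cutoff is inclusive: under this sketch, a vector of exactly 10000 elements still takes the single-threaded path, matching the `n <= 10000` condition in the commit.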



