OSchip / OpenBLAS
Fixed #27. Temporarily work around axpy's low performance with small input sizes and multithreading.
tags/v0.1alpha2^2
Xianyi Zhang
14 years ago
parent b3d1887745
commit aeed8d6225
2 changed files with 6 additions and 1 deletion
Changelog.txt: +1 -0
interface/axpy.c: +5 -1
Changelog.txt
@@ -25,6 +25,7 @@ x86/x86_64:
     * Fixed #28 a wrong result of dsdot on x86_64.
     * Fixed #32 a SEGFAULT bug of zdotc with gcc-4.6.
     * Fixed #33 ztrmm bug on Nehalem.
+    * Worked around #27, the low-performance axpy issue with small input sizes and multithreading.

 MIPS64:
     * Fixed #28 a wrong result of dsdot on Loongson3A/MIPS64.
interface/axpy.c
@@ -85,7 +85,11 @@ void CNAME(blasint n, FLOAT alpha, FLOAT *x, blasint incx, FLOAT *y, blasint inc
   //In that case, the threads would be dependent.
   if (incx == 0 || incy == 0)
       nthreads = 1;
+  //Temporarily work around the low performance issue with small input sizes and multithreading.
+  if (n <= 10000)
+      nthreads = 1;
   if (nthreads == 1) {
 #endif
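
The thread-selection rule in this hunk can be sketched as a standalone function. This is only an illustrative sketch, not code from OpenBLAS: the names `axpy_pick_threads` and `AXPY_SINGLE_THREAD_MAX_N` and the `available` parameter are hypothetical, and the cutoff of 10000 is simply the constant hard-coded in the patch.

```c
#include <assert.h>

/* Hypothetical name for the cutoff that the patch hard-codes as 10000. */
enum { AXPY_SINGLE_THREAD_MAX_N = 10000 };

/*
 * Pick the number of threads for one axpy call (y := alpha * x + y).
 * `available` is an assumed stand-in for the thread count the runtime
 * would otherwise use; it must be at least 1.
 */
static int axpy_pick_threads(long n, long incx, long incy, int available)
{
    assert(available >= 1);

    int nthreads = available;

    /* Zero stride: the threads would be dependent, so run serially. */
    if (incx == 0 || incy == 0)
        nthreads = 1;

    /* Small input: fall back to one thread, as the workaround does. */
    if (n <= AXPY_SINGLE_THREAD_MAX_N)
        nthreads = 1;

    return nthreads;
}
```

With 8 threads available, `axpy_pick_threads(10000, 1, 1, 8)` returns 1 and `axpy_pick_threads(10001, 1, 1, 8)` returns 8. The cutoff is inclusive, matching the `n <= 10000` test in the patch. A fixed constant like this is a blunt heuristic, which fits the commit's description of the change as temporary.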