Bart Oldeman
9e6b060bf3
Fix comment.
It stores the pointer, not an offset (that would be an alternative approach).
3 years ago
Bart Oldeman
9959a60873
Benchmarks: align malloc'ed buffers.
Benchmarks should allocate with cacheline (often 64 bytes) alignment
to avoid unreliable timings. This technique, which stores the original
pointer just before the aligned buffer, avoids C11's aligned_alloc and
so stays compatible with older compilers.
For example, Glibc's x86_64 malloc returns 16-byte aligned buffers, which is
not sufficient for AVX/AVX2 (32-byte preferred) or AVX512 (64-byte).
3 years ago
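A minimal sketch of the alignment technique described in commit 9959a60873, as corrected by 9e6b060bf3 (the original malloc pointer is stored just before the aligned buffer). The names `BENCH_ALIGN`, `bench_aligned_malloc` and `bench_aligned_free` are hypothetical and chosen for illustration; they are not taken from the benchmark sources, and the 64-byte figure is the "often 64 bytes" cache-line assumption from the commit message.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed cache-line size; must be a power of two. */
#define BENCH_ALIGN 64

/* Hypothetical helper: returns a BENCH_ALIGN-aligned buffer of `size` bytes,
 * or NULL on failure. Uses only malloc, so no C11 aligned_alloc is needed. */
static void *bench_aligned_malloc(size_t size)
{
    /* Over-allocate: room for the saved pointer plus worst-case padding. */
    void *raw = malloc(size + sizeof(void *) + BENCH_ALIGN - 1);
    if (raw == NULL)
        return NULL;

    /* Skip the slot reserved for the saved pointer, then round up. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + BENCH_ALIGN - 1) & ~(uintptr_t)(BENCH_ALIGN - 1);

    /* Store the original malloc pointer immediately before the aligned block. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Hypothetical helper: releases a buffer obtained from bench_aligned_malloc. */
static void bench_aligned_free(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]); /* recover the pointer malloc returned */
}
```

Storing the pointer rather than an offset keeps the free path to a single load, at the cost of `sizeof(void *)` extra bytes per allocation.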
Marius Hillenbrand
f119e26354
Fix flipped indices in benchmark for gemv
Fixes #3439
4 years ago
Martin Kroeker
14e33e0f7e
Handle OPENBLAS_LOOPS in SYR2 benchmark
4 years ago
Martin Kroeker
4ed99c2ce3
Merge pull request #3292 from martin-frbg/syrk_limit
Add lower limit for multithreading in xSYRK
4 years ago
Martin Kroeker
a4543e4918
Handle OPENBLAS_LOOP
4 years ago
Martin Kroeker
dcfc5cf714
Handle OPENBLAS_LOOPS for more stable results
4 years ago
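As an illustration of what handling OPENBLAS_LOOPS can look like, the sketch below reads the repetition count from the environment and falls back to a single run. The helper names `bench_parse_loops` and `bench_loops` and the default of 1 are assumptions made for this example, not code from the commits listed here.

```c
#include <stdlib.h>

/* Hypothetical helper: converts the variable's text into a loop count.
 * Missing, non-numeric, or non-positive values fall back to 1. */
static int bench_parse_loops(const char *text)
{
    if (text == NULL)
        return 1;
    int value = atoi(text);
    return value > 0 ? value : 1;
}

/* Hypothetical helper: number of timed repetitions requested by the user. */
static int bench_loops(void)
{
    return bench_parse_loops(getenv("OPENBLAS_LOOPS"));
}
```

A benchmark would then time `bench_loops()` consecutive calls of the routine under test and divide the elapsed time by that count, which averages out run-to-run noise.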
Martin Kroeker
06e3b07ecb
Handle OPENBLAS_LOOPS and OPENBLAS_TEST options
4 years ago
Martin Kroeker
1f8bda71b9
Add OPENBLAS_LOOPS support to potrf/potrs/potri benchmark
4 years ago
Martin Kroeker
d57c681a6d
Fix compilation on older OSX versions
4 years ago
Martin Kroeker
38dcf3454b
Support timing Apple M1
4 years ago
Qiyu8
f917c26e83
Refactoring the remaining benchmark cases.
5 years ago
Qiyu8
dd6ebdfdab
Refactor the performance measurement system
5 years ago
Martin Kroeker
7ae9e8960e
Change "HALF" and "sh" to "BFLOAT16" and "sb"
5 years ago
Martin Kroeker
5464eb13ea
Change ifdef linux to __linux for C11 compatibility
5 years ago
Martin Kroeker
6f8fad87c5
Use POSIX.1-2001 clock_gettime for higher resolution
5 years ago
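A minimal sketch of timing with `clock_gettime`, which reports nanosecond-granularity timestamps. The helper name `bench_now_seconds` and the choice of `CLOCK_MONOTONIC` are assumptions for illustration; the commit message does not say which clock the benchmarks use.

```c
#define _POSIX_C_SOURCE 200112L
#include <time.h>

/* Hypothetical helper: current time in seconds as a double.
 * CLOCK_MONOTONIC is assumed here so that wall-clock adjustments
 * cannot make an interval negative. */
static double bench_now_seconds(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0.0; /* clock unavailable; caller sees a zero timestamp */
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}
```

An elapsed time is the difference between two such readings taken before and after the timed call.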
Martin Kroeker
ced49466f0
Use the fortran compiler to link LAPACK-related benchmarks
to fix linking problems with (at least) the AMD version of flang that creates dependencies on more than just the fortran runtime.
5 years ago
Martin Kroeker
6e270f91ec
add support for RETURN_BY_STACK semantics, e.g. clang
5 years ago
Rajalakshmi Srinivasaraghavan
ce90e2bd3f
Include shgemm in benchtest
This patch is to enable benchtest for half precision gemm
when BUILD_HALF is set during make.
5 years ago
l00536773
6b7ef6543a
[OpenBLAS]: benchmark error of potrf
[description]: when the matrix size goes higher than 5800 during the cpotrf test, error info, such as "Potrf info = 5679", will be returned on ARM64 and x86 machines. Uplo = L & F.
[solution]: changed the func for building the matrix so that the complex Hermitian matrix can stay positive definite during the computation.
[dts]:
5 years ago
Martin Kroeker
717c604aeb
Merge pull request #2515 from zelong-1024/develop
[OpenBLAS]: benchmark for her/her2 LEVEL2 functions
5 years ago
Martin Kroeker
ce33da4cab
Merge pull request #2513 from aaawuanjun/develop
[OpenBlas]: Add benchmark tpsv file and modify benchmark/Makefile
5 years ago
l00536773
d45c53ecf1
[OpenBLAS]: benchmark for her/her2 LEVEL2 functions
[description]: benchmark for her/her2
[solution]: added benchmark for her/her2, modified makefile in benchmark
[dts]:
5 years ago
Martin Kroeker
c2840997db
Merge pull request #2508 from liujingjue/develop
[OpenBLAS]:fix the iamax benchmark error
5 years ago
Martin Kroeker
c0649aa694
Merge pull request #2506 from xiaofengF/develop
Add benchmark for SPMV and fix segmentation fault when data size >= 50000
5 years ago
wuanjun 00447568
2428dc9fd3
[OpenBlas]: Add benchmark tpsv file and modify benchmark/Makefile
[Description]: Solve lack of tpsv benchmark.
5 years ago
l00546269
a0a3bf7c81
[OpenBLAS]:fix the iamax benchmark error
[Description]:the result for i?amax is not MFlops, it is MBytes
5 years ago
jayfely@qq.com
ae3f2c2e49
Remove cspmv and zspmv to remove the error that occurred in Travis CI
5 years ago
jayfely@qq.com
649733ff15
Only keep spmv.goto and spmv.atlas
5 years ago
wuanjun 00447568
3e8f1c6cc5
[OpenBlas]:Add benchmark tpmv.c and modify Makefile
[Description]:Solve the problem of missing tpmv.c benchmark file
5 years ago
jayfely@qq.com
2f4c5bb3a9
Update spmv.c: solve segmentation fault when m and n are larger than 50000
5 years ago
Martin Kroeker
047dfb216d
Merge pull request #2501 from jijiwawa/Fix_mistakes
Fix pr #2487 error
5 years ago
s00527847
cd8871f1a1
Use the correct unit of measure
5 years ago
jayfely@qq.com
08e1d8cbae
Modify Makefile in Benchmark
5 years ago
jayfely@qq.com
ff40a4e726
Add benchmark for SPMV
5 years ago
s00548429
c5bdd21352
Add benchmark for ?amax, ?max, ?amin, ?min, i?max, i?amin and i?min.
5 years ago
Martin Kroeker
b6a6ccbbea
Merge pull request #2495 from ZuoQ3/develop
add benchmark for axpby test
5 years ago
Martin Kroeker
8b720f7365
Merge pull request #2494 from shengyang-3390/develop
add benchmark for csrot and zdrot
5 years ago
Martin Kroeker
14df234edb
Merge pull request #2489 from jijiwawa/brightness
Remove redundant code
5 years ago
s00527847
bbeda55b7b
add trmm.c
5 years ago
s00527847
efcf89aec7
Remove redundant code
5 years ago
zq
0c8162eba6
Add benchmark file axpby.c and modify benchmark/Makefile to test s/d/c/zaxpby
5 years ago
shengyang
09c7a191bd
add benchmark for csrot and zdrot
modified: benchmark/Makefile
modified: benchmark/rot.c
5 years ago
Martin Kroeker
dca3e0cf20
Merge pull request #2491 from chenxuqiang/hbmv_benchmark
benchmark/hpmv&hbmv: add benchmark/hpmv.c and benchmark/hbmv.c
5 years ago
Martin Kroeker
c9f8db979b
Merge pull request #2490 from shengyang-3390/develop
Add benchmark file rotm.c and modify benchmark/Makefile to test s/drotm
5 years ago
Martin Kroeker
97c36ca58c
Merge branch 'develop' into develop
5 years ago
Martin Kroeker
9f5a74f3c7
Merge pull request #2486 from qqqil/develop
add benchmark for trsv
5 years ago
Martin Kroeker
2afb10975d
Merge pull request #2485 from Darkness303/develop
Add syr2 benchmark
5 years ago
chenxuqiang
32c847df45
benchmark/hpmv&hbmv: add benchmark/hpmv.c and benchmark/hbmv.c
Signed-off-by: Xuqiang Chen <chenxuqiang3@hisilicon.com>
5 years ago
shengyang
e0df9485d4
Add benchmark file rotm.c and modify benchmark/Makefile to test s/drotm
modified: benchmark/Makefile
new file: benchmark/rotm.c
5 years ago