Martin Kroeker
6ccbb089c2
Merge pull request #2402 from gxw-loongson/develop
Avoid printing the following information on mips and mips64 when check msa
6 years ago
Martin Kroeker
59ebe3636a
Merge pull request #2399 from martin-frbg/buffersize
Make BUFFER_SIZE configurable at build time
6 years ago
gxw
754433f420
Avoid printing the following information on mips and mips64 when check msa:
"unrecognized command line option ‘-mmsa’"
6 years ago
Martin Kroeker
7f0d523b42
Make BUFFER_SIZE configurable
6 years ago
Martin Kroeker
c353d8b106
Make BUFFER_SIZE configurable
6 years ago
Martin Kroeker
579be3aa9d
Add configuration option for BUFFER_SIZE
6 years ago
Martin Kroeker
449e8ea443
Merge pull request #26 from xianyi/develop
rebase
6 years ago
Martin Kroeker
3bec250cf9
Increment version to 0.3.9.dev
6 years ago
Martin Kroeker
f03dd23e90
Increment version to 0.3.9.dev
6 years ago
Martin Kroeker
fa93d63365
Merge branch 'release-0.3.0' into develop
6 years ago
Martin Kroeker
90e6c66a57
Merge pull request #2397 from martin-frbg/038changes
Update Changelog with changes from 0.3.8
6 years ago
Martin Kroeker
32d97330b3
Update with changes from 0.3.8
6 years ago
Martin Kroeker
29eaf4b6d7
Merge pull request #25 from xianyi/develop
rebase
6 years ago
Martin Kroeker
47c1bf7f4d
typo fixes
6 years ago
Martin Kroeker
2b55f0ad30
Merge pull request #2393 from martin-frbg/issue2388
Provide more documentation in README.md
6 years ago
Martin Kroeker
a5b32ab06c
Merge pull request #2390 from martin-frbg/pgi
Small corrections for compilation with PGI compilers
6 years ago
Martin Kroeker
50545b19d0
Update CPU and OS support and document DYNAMIC_ARCH option in README.md
prompted by #2388
6 years ago
Martin Kroeker
b3cbd60d7a
Remove PGI from list again as it is actually still not capable
6 years ago
Martin Kroeker
70199d1905
Merge pull request #2389 from Zeyiii/develop
Fix bugs in benchmark of gemv
6 years ago
Martin Kroeker
cfe63d8cc2
Remove OpenMP libraries from link list
6 years ago
Martin Kroeker
d55b10830f
Remove OpenMP libraries from link list
6 years ago
Martin Kroeker
c1c10cbb21
Merge pull request #2384 from wjc404/develop
Optimize AVX512 DGEMM (& DTRMM)
6 years ago
Martin Kroeker
5989841524
Add PGI to avx512-supporting compilers
6 years ago
Martin Kroeker
68a43db358
Fix utest compilation with PGI
6 years ago
Martin Kroeker
9694037b23
Set SUFFIX in tempfile commands, fix bad architecture option for PGI compiler in avx512 test
6 years ago
Martin Kroeker
71faa1c1a7
Merge pull request #24 from xianyi/develop
rebase
6 years ago
wjc404
3447d04eaf
Update dgemm_kernel_16x2_skylakex.c
6 years ago
wjc404
8b5cdcc64c
Update sgemm_kernel_8x4_haswell.c
6 years ago
wjc404
4e00d96a78
Update dgemm_kernel_16x2_skylakex.c
6 years ago
w00421467
ce9ea8f826
Fix another branch
6 years ago
w00421467
0b909203cb
Fix bugs in benchmark of gemv
6 years ago
wjc404
096da2f51a
Update dgemm_kernel_16x2_skylakex.c
6 years ago
wjc404
2f96a2c55b
Update trmm_R.c
6 years ago
wjc404
833bd0f8ff
Update trmm_L.c
6 years ago
wjc404
77b8f49556
Update level3_thread.c
6 years ago
wjc404
1c3e20ce48
Update level3.c
6 years ago
wjc404
83b6be7976
Update param.h
6 years ago
wjc404
081b188529
Update KERNEL.SKYLAKEX
6 years ago
wjc404
f3f969f681
Update param.h
6 years ago
wjc404
8019e70211
AVX512 16x2 DGEMM kernel
6 years ago
Martin Kroeker
8d2a796f49
Merge pull request #2378 from martin-frbg/issue2377
Add -march option for AVX512 in cmake as well
6 years ago
Martin Kroeker
8dc9fd4dfe
Add -march option for AVX512
6 years ago
Martin Kroeker
abc67bdd74
Merge pull request #2375 from ewanglong/master
fix a few performance drop in some matrix size per data type
6 years ago
Martin Kroeker
1f62a82789
Merge pull request #2376 from wjc404/develop
Fix remaining bugs in parallel GEMM3M
6 years ago
wjc404
e9fb8f62b1
Update level3_gemm3m_thread.c
6 years ago
Wang,Long
fbf4f48f4a
fix a few performance drop in some matrix size per data type
Signed-off-by: Wang,Long <long1.wang@intel.com>
6 years ago
Martin Kroeker
b9ad450295
Merge pull request #2373 from Qiyu8/optimize#gemmbeta
Optimize general Gemm Beta
6 years ago
Martin Kroeker
e011ad820a
Merge pull request #2372 from martin-frbg/winexit
Do not run any cleanup if the program is exiting anyway
6 years ago
Qiyu8
ff42e68652
Optimize general Gemm Beta
6 years ago
Martin Kroeker
23f322f997
Do not run any cleanup if the program is exiting anyway
From keno's PR #2350 - this avoids the potential hang in blas_thread_shutdown where we may wait for threads to exit while they are waiting on the loader lock from DllMain
6 years ago