2c47e6d15
(refs/pull/1739/head)
Drop travis_wait and replace QUIET_MAKE with make -s to get more information by
2018-08-22 14:56:53 +0200
c39fab348
Support arbitrary numbers of threads for memory allocation. by
2018-08-16 15:18:26 +0100
6d00c674a
(refs/pull/1741/head)
Add files via upload by
2018-08-22 10:03:02 +0200
52d3f7af5
Merge pull request #1738 from sharkcz/s390x by
2018-08-16 09:46:34 +0200
5c6e020f4
(refs/pull/1738/head)
detect z14 arch on s390x by
2018-08-14 12:30:38 +0200
e6c0e3949
Optimize Zgemv by
2018-08-13 12:23:40 +0300
d4d3113ad
Merge pull request #1731 from fenrus75/readme by
2018-08-13 00:01:37 +0200
375dff54f
Merge pull request #1733 from fenrus75/dsymv by
2018-08-12 18:18:36 +0200
a5f165275
Merge pull request #1732 from fenrus75/dgemv by
2018-08-12 18:17:42 +0200
8c13aa495
Merge pull request #1730 from fenrus75/fix-sdot by
2018-08-12 18:17:01 +0200
1ee6d087c
Merge pull request #1729 from fenrus75/dscal by
2018-08-12 18:16:45 +0200
a95a784ab
Merge pull request #1723 from maamountki/develop by
2018-08-11 21:08:45 +0200
9bec34cb6
(refs/pull/1733/head)
Add an AVX512 enabled DSYMV (L) function by
2018-08-11 17:46:24 +0000
87bebdbd8
(refs/pull/1732/head)
Add an AVX512 enabled DGEMV (n) function by
2018-08-11 17:38:12 +0000
9493f2630
(refs/pull/1731/head)
add short blurb about avx512 and needed compiler to README by
2018-08-11 17:21:46 +0000
36add7570
(refs/pull/1730/head)
Fix typo in sdot function by
2018-08-11 17:16:45 +0000
cacacc800
(refs/pull/1729/head)
Add an AVX512 enabled DSCAL function by
2018-08-11 17:14:57 +0000
1a00ef3d2
Merge pull request #1725 from fenrus75/axpy by
2018-08-11 11:01:20 +0200
4c0d832ec
Merge pull request #1724 from fenrus75/sdot by
2018-08-11 11:00:56 +0200
fc33cbc7b
Merge pull request #1728 from martin-frbg/changelog by
2018-08-10 13:24:36 +0200
c52a831ae
(refs/pull/1728/head)
Add changes from the 0.3.x releases by
2018-08-10 13:23:47 +0200
a72952f6e
(refs/pull/1726/head)
Allow overriding USE_COMPILER_TLS (formerly HAS_COMPILER_TLS). by
2018-08-10 09:10:29 +0100
2e99873ff
(refs/pull/1725/head)
Add AVX512 enabled SAXPY/DAXPY functions by
2018-08-10 02:58:32 +0000
00abaa865
(refs/pull/1724/head)
Add an AVX512 enabled SDOT function by
2018-08-10 02:31:48 +0000
33043f563
(refs/pull/1723/head)
Disable scal to benchmark zgemv separately by default by
2018-08-10 01:54:18 +0300
b30b82ce4
(refs/pull/1712/merge)
Merge b1cc69e7a8 into 66da7677bd by
2018-08-09 14:30:05 +0000
9b56e815e
(refs/pull/1718/merge)
Merge 732abce9f1 into 66da7677bd by
2018-08-09 14:28:34 +0000
66da7677b
Merge pull request #1721 from fenrus75/ddot2 by
2018-08-09 15:39:06 +0200
7932ff3ea
(refs/pull/1721/head)
Add an AVX512 enabled DDOT function by
2018-08-08 02:59:11 +0000
732abce9f
(refs/pull/1718/head)
Use intrinsics instead of inline asm by
2018-08-05 14:45:54 +0000
4fb9f3b7a
use named arguments in the inline asm by
2018-08-05 14:22:38 +0000
62f4c6970
Merge pull request #1717 from martin-frbg/issue1708 by
2018-08-06 22:05:47 +0200
453bfa7e7
[ZARCH] Restore detect() function by
2018-08-06 20:03:49 +0300
23229011d
[ZARCH] Z14 support, BLAS 1/2 single precision implementations, Some missing double precision implementations, Gemv optimization by
2018-08-06 18:20:40 +0300
73478664d
(refs/pull/1717/head)
Add workaround for avx512 compilations on Cygwin by
2018-08-06 16:40:32 +0200
ee955757f
Merge pull request #1715 from stevengj/patch-1 by
2018-08-05 22:48:44 +0200
b1cc69e7a
(refs/pull/1712/head)
Convert dscal_haswell to intrinsics and add AVX512 support by
2018-08-05 19:19:49 +0000
93aa18b1a
daxpy_haswell: Change to C+intrinsics + AVX512 to mimic the change to saxpy_haswell by
2018-08-05 18:29:34 +0000
7af8a5445
saxpy_haswell: Go to a more compact intrinsics notation by
2018-08-05 18:28:47 +0000
850b73dbb
saxpy_haswell: Add AVX512 support by
2018-08-05 17:50:16 +0000
06ea72f5a
write saxpy_haswell kernel using C intrinsics and don't disallow inlining by
2018-08-05 17:43:40 +0000
d86604687
saxpy_haswell: Use named arguments in inline asm by
2018-08-05 17:16:14 +0000
ef30a7239
sdot_haswell: similar to ddot: turn into intrinsics based C code that supports AVX512 by
2018-08-05 16:38:19 +0000
21c6220d6
fix typo in dsymv avx512 code path by
2018-08-05 15:16:48 +0000
34d63df4b
Add AVX512 support to DDOT by
2018-08-05 15:16:20 +0000
ae38fa55c
Use intrinsics instead of inline asm by
2018-08-05 14:45:54 +0000
847bbd6f4
use named arguments in the inline asm by
2018-08-05 14:22:38 +0000
48610a452
(refs/pull/1715/head)
fix blasabs for windows by
2018-08-05 08:18:51 -0400
9c29524f5
various code cleanups and comments by
2018-08-05 02:44:40 +0000
f2810beaf
Add AVX512 support to dsymv_L_microk_haswell-2.c by
2018-08-04 23:56:06 +0000
c202e0629
Write dsymv_kernel_4x4 for Haswell using intrinsics by
2018-08-04 23:35:36 +0000
4a553e867
Merge pull request #1713 from martin-frbg/issue1710 by
2018-08-04 23:51:31 +0200
e788102c1
Merge pull request #1709 from stevengj/patch-1 by
2018-08-04 23:51:10 +0200
0faba28ad
dsymv_L haswell: use symbol names for inline asm by
2018-08-04 21:25:53 +0000
df31ec064
Add AVX512 support to the dgemv_n_microk_haswell-4.c kernel by
2018-08-04 20:48:59 +0000
165f00c15
(refs/pull/1709/head)
fabs -> fabsl by
2018-08-04 20:14:51 +0200
40c068a87
(refs/pull/1713/head)
Introduce blasabs() to switch between abs() and labs() for INTERFACE64 by
2018-08-04 20:07:59 +0200
933896a1d
Use blasabs to switch between abs and labs as needed for INTERFACE64 by
2018-08-04 20:06:49 +0200
e52d01cfe
Also make the kernel_4x2 use intrinsics for readability and consistency by
2018-08-04 17:53:55 +0000
4a8ae8b8a
replace the haswell dgemv_kernel_4x4 kernel with the same code written in intrinsics by
2018-08-04 17:25:54 +0000
350531e76
dgemv_n_microk_haswell: Use symbolic names for asm inputs to make the code more readable by
2018-08-04 14:44:04 +0000
a4e321400
fabs -> fabsl by
2018-08-03 13:00:10 -0400
9e6543050
Merge pull request #1703 from wsttiger/cmake_fix by
2018-08-02 23:48:42 +0200
2cfa86b40
Merge pull request #1707 from extrowerk/haiku_support by
2018-08-02 22:27:00 +0200
2a9a9389e
(refs/pull/1703/head)
Added target_include_directories() by
2018-08-02 14:58:52 -0500
6463bffd5
(refs/pull/1707/head)
Haiku supporting patches by
2018-08-02 20:49:14 +0200
8ef7d4fb5
Merge pull request #1706 from oon3m0oo/develop by
2018-08-02 18:53:34 +0200
6400868e5
(refs/pull/1706/head)
Fix #1705 where we incorrectly calculate page locations. by
2018-08-02 16:21:19 +0100
8ebf541e9
Set EXPORT_NAME to match OpenBLASConfig.cmake by
2018-07-30 15:18:29 -0500
b03ae3f4d
Set version to 0.3.3.dev by
2018-07-30 08:23:13 +0200
2cc8fb0ad
Set version to 0.3.3.dev by
2018-07-30 08:22:38 +0200
e8a68ef26
(tag: v0.3.2)
Merge pull request #1702 from xianyi/develop by
2018-07-30 07:25:01 +0200
64826a0d7
(refs/pull/1702/head)
Merge branch 'release-0.3.0' into develop by
2018-07-29 22:37:09 +0200
25f2d25cf
Merge pull request #1697 from martin-frbg/issue1696 by
2018-07-25 19:55:29 +0200
73131fa30
(refs/pull/1697/head)
Do not treat Windows UWP builds as cross-compiling by
2018-07-24 17:46:33 +0200
66fcdd5be
Merge pull request #1695 from martin-frbg/issue1692 by
2018-07-22 16:34:09 +0200
43ac839c1
(refs/pull/1695/head)
Unset memory table entry, not just the temporary pointer to it on shutdown by
2018-07-22 09:19:19 +0200
7ba5936ec
Merge pull request #1688 from martin-frbg/issue1673 by
2018-07-19 19:03:45 +0200
b14f44d2a
(refs/pull/1688/head)
Temporarily disable special handling of OPENMP thread memory allocation by
2018-07-19 08:57:56 +0200
e71d70ba8
Merge pull request #1681 from martin-frbg/issue1671 by
2018-07-16 22:47:05 +0200
d671870f5
Merge pull request #1684 from martin-frbg/issue1672 by
2018-07-16 22:46:49 +0200
4e103c822
(refs/pull/1684/head)
typo fix by
2018-07-16 12:56:39 +0200
d2142760e
Fix precision problem in DSDOT by
2018-07-15 17:11:40 +0200
2fbfc64da
Use C kernels for default c/zAXPY, xROT, c/zSWAP by
2018-07-15 17:09:55 +0200
5e937b602
(refs/pull/1683/merge)
Merge a0bd542648 into 36aea5ce2d by
2018-07-15 14:40:16 +0000
a0bd54264
(refs/pull/1683/head)
Map c/zAXPY, c/zSWAP and xROT to the mips C kernels by
2018-07-15 13:06:46 +0200
35902bfe1
Fix lack of precision in DSDOT by promoting arguments by
2018-07-15 13:02:26 +0200
8d5b33b6b
(refs/pull/1681/head)
Add cpu identification via mfpvr call for the BSDs by
2018-07-12 23:39:00 +0200
36aea5ce2
Merge pull request #1680 from martin-frbg/snprint by
2018-07-12 14:05:13 +0200
1309711e2
(refs/pull/1680/head)
Fix declaration of snprintf for older MSVC by
2018-07-12 11:47:52 +0200
571e9de2a
Fix definition of snprintf for MSVC by
2018-07-12 11:42:25 +0200
448ed1511
Merge pull request #1678 from martin-frbg/issue1677 by
2018-07-12 09:21:34 +0200
045fb5ea2
(refs/pull/1678/head)
Define snprintf for older versions of MSVC by
2018-07-12 07:30:58 +0200
bdb29242a
(refs/pull/1661/merge)
Merge ba586c3d16 into 4dd70d98d7 by
2018-07-04 07:02:39 +0000
4dd70d98d
Merge pull request #1667 from xianyi/revert-1642-develop by
2018-07-04 08:27:21 +0200
504310eeb
Merge pull request #1665 from martin-frbg/cpuid-ryzen2 by
2018-07-04 08:19:40 +0200
ea1f39518
Merge pull request #1663 from martin-frbg/issue1641 by
2018-07-04 08:19:11 +0200
5f2a3c05c
(refs/pull/1667/head, revert-1642-develop)
Revert "Rewrite &= -> = and simplify the initial blocking phase." by
2018-07-03 21:42:28 +0200
d0ec4325c
(refs/pull/1665/head)
Add cpuid for AMD Ryzen 2 by
2018-07-03 21:03:24 +0200
3f73e8b8c
Add cpuid for AMD Ryzen 2 by
2018-07-03 21:01:35 +0200