08fa83aba
Merge pull request #2312 from martin-frbg/power8be by
2019-11-20 15:12:06 +0100
63d3ee8df
Merge pull request #2313 from ewanglong/develop by
2019-11-20 14:49:15 +0100
1191db1a4
(refs/pull/2313/head)
For the sake of Windows compatibility, used "unsigned long long" to ensure 64-bit length by
2019-11-20 21:30:16 +0800
1f6071590
(refs/pull/2314/head)
Fix usage of TerminateThread() causing critical section corruption. by
2019-11-20 12:21:35 +0100
0caf1434c
Fix the integer overflow issue for large matrix size by
2019-11-20 11:50:37 +0800
73128f388
Merge pull request #2310 from martin-frbg/ppc440 by
2019-11-17 23:19:48 +0100
cad0d150d
(refs/pull/2312/head)
Define alternate kernels for big-endian POWER8 by
2019-11-17 23:12:10 +0100
eba0aeb7c
Fix compilation for big-endian POWER8 by
2019-11-17 22:58:32 +0100
0c07c356c
(refs/pull/2310/head)
Define alternate kernels for big-endian PPC440 by
2019-11-17 19:25:08 +0100
82b75f97e
Disable the old QCDOC qalloc by default and copy utility functions from memory.c by
2019-11-17 19:22:04 +0100
7887c4507
Merge pull request #17 from xianyi/develop by
2019-11-17 19:09:49 +0100
3e67017ac
Merge pull request #2309 from martin-frbg/ppc970-be by
2019-11-17 18:22:24 +0100
b3ac6ee22
(refs/pull/2309/head)
Define alternate kernels for big-endian PPC970 by
2019-11-17 15:19:39 +0100
6082e556c
Use "generic" S/CGEMM unroll M on big-endian PPC970 by
2019-11-17 15:10:26 +0100
92315173d
Merge pull request #2308 from martin-frbg/ctestfix by
2019-11-15 08:33:17 +0100
351d12b94
(refs/pull/2308/head)
Fix potential spurious failure from uninitialized variable by
2019-11-15 00:20:36 +0100
bf73aa141
Fix potential spurious failure from uninitialized variable by
2019-11-15 00:19:24 +0100
71e96163d
Merge pull request #2305 from wjc404/develop by
2019-11-12 07:38:37 +0100
819e852ae
(refs/pull/2305/head)
AVX512 CGEMM & ZGEMM kernels by
2019-11-11 20:04:52 +0800
4e466d739
Merge pull request #15 from xianyi/develop by
2019-11-09 18:52:08 +0100
4c6a45735
Merge pull request #2300 from wjc404/develop by
2019-11-06 07:27:33 +0100
836c414e2
(refs/pull/2300/head)
optimizations of software prefetching by
2019-11-05 13:36:56 +0800
d403eb3c2
Merge pull request #2302 from martin-frbg/ppc970 by
2019-11-04 22:55:05 +0100
3cd97f1a8
Merge pull request #2301 from martin-frbg/ppc8be by
2019-11-04 22:54:28 +0100
9955f0996
Merge pull request #2294 from martin-frbg/ios-cleanup by
2019-11-04 22:53:58 +0100
430c11e13
Add files via upload by
2019-11-04 20:10:12 +0800
fbacd2605
optimizations via software prefetches by
2019-11-04 19:37:19 +0800
6fa89b06a
(refs/pull/2302/head)
Use the two-operand form of DCBT on all PPC970 regardless of OS by
2019-11-03 22:55:31 +0100
68597002e
(refs/pull/2301/head)
The assembly microkernel is not safe to use on ELFv1 by
2019-11-03 22:42:46 +0100
d2a628554
The assembly microkernel is not safe to use on ELFv1 by
2019-11-03 22:41:19 +0100
d999688d1
The assembly microkernel is not safe to use on ELFv1 by
2019-11-03 22:39:06 +0100
928fe1b28
The assembly microkernel is not safe to use on ELFv1 by
2019-11-03 22:37:27 +0100
ccc28c6d6
Merge pull request #13 from xianyi/develop by
2019-11-03 22:33:31 +0100
ae43b75a6
Add files via upload by
2019-11-02 10:09:19 +0800
54fc06fd7
Add files via upload by
2019-11-02 10:06:13 +0800
1df9a2013
new sgemm kernel for skylakex by
2019-11-02 00:00:48 +0800
274ff5cdb
update sgemm_q on skylakex cpus by
2019-11-01 23:59:18 +0800
eb2eddf24
Merge pull request #2296 from kdunee/develop by
2019-10-28 13:24:18 +0100
869182594
(refs/pull/2296/head)
Fixed a minor cmake problem, occurring when DYNAMIC_CORE=ON and CMAKE_C_FLAGS was empty by
2019-10-28 08:51:05 +0100
7dc8a76f6
Merge pull request #2293 from martin-frbg/pr2288 by
2019-10-25 23:46:39 +0200
df857551c
(refs/pull/2294/head)
Remove special parameter set for obsolete IOS/ARMV8 workaround by
2019-10-25 23:07:00 +0200
85ccdce8c
Remove the IOS fallbacks to generic C kernels by
2019-10-25 23:02:37 +0200
aeabe0a83
Fix regex to parse -R options with and without whitespace by
2019-10-25 22:52:30 +0200
1b9098966
(refs/pull/2293/head)
Add NetBSD to the xBSD conditionals by
2019-10-25 12:52:49 +0200
e3e8b5cdc
Add NetBSD by
2019-10-25 12:51:06 +0200
69b16a894
Merge pull request #2292 from martin-frbg/g95fixes by
2019-10-25 10:35:17 +0200
6782e5767
Merge pull request #2291 from martin-frbg/gensymbol by
2019-10-25 10:34:50 +0200
48f5a89f9
Merge pull request #2282 from martin-frbg/issue2281 by
2019-10-25 09:56:30 +0200
4ae1610f3
Merge pull request #2290 from martin-frbg/cpuidfixes by
2019-10-24 22:52:15 +0200
911c3e2f4
(refs/pull/2292/head)
Improve support for g95 and non-GNU ld by
2019-10-24 22:43:27 +0200
fab49e49e
(refs/pull/2291/head)
Move most lapack 3.7/3.8 additions to the embedded_underscores list by
2019-10-24 21:26:20 +0200
b687fba5b
(refs/pull/2282/head)
Disable direct clock register access on IOS and Android by
2019-10-24 21:18:17 +0200
46a8c2519
Remove prototype of unused, unimplemented function (#2274) by
2019-10-24 12:56:53 -0400
e9437eebd
(refs/pull/2290/head)
Restore Goldmont ID and improve QEMU support by
2019-10-24 18:45:27 +0200
3a39062cf
Merge pull request #12 from xianyi/develop by
2019-10-24 18:40:13 +0200
0394e1195
(refs/pull/2288/head)
NetBSD fix by
2019-10-22 08:44:39 +0200
eaa0be131
Merge pull request #2286 from wjc404/develop by
2019-10-20 12:44:19 +0200
6ff013bae
(refs/pull/2286/head)
native support for icopy_4 by
2019-10-19 03:54:44 +0800
0d669e04b
Update dgemm_kernel_8x8_skylakex.c by
2019-10-18 15:00:17 +0800
17cdd9f9e
some correction by
2019-10-18 14:58:07 +0800
6bcb06fcb
make further changes to icopy_8 easier by
2019-10-18 10:47:31 +0800
b7315f840
Add files via upload by
2019-10-16 19:23:36 +0800
9b19e9e1b
Update dgemm_kernel_8x8_skylakex.c by
2019-10-16 10:14:51 +0800
6bd67ddba
Update dgemm_kernel_8x8_skylakex.c by
2019-10-16 03:20:08 +0800
5da9484d9
Add files via upload by
2019-10-16 02:01:13 +0800
844629af5
Add files via upload by
2019-10-16 02:00:34 +0800
467c55534
(refs/pull/2274/head)
Remove beta-thread function per request by
2019-10-11 08:04:03 -0400
2beaa82c0
Merge pull request #2283 from martin-frbg/issue2176 by
2019-10-09 22:06:09 +0200
e8a2aed2b
(refs/pull/2283/head)
Support QEMU cpu calling itself 64bit AMD Athlon as well by
2019-10-09 18:24:13 +0200
f26203168
Support QEMU virtual cpu as CORE2 by
2019-10-08 22:30:02 +0200
5f6206fa2
Simplify OSX/IOS cross-compilation and add a CI test for it (#2279) by
2019-10-08 20:13:14 +0200
f2cde2ccf
Update common_arm64.h by
2019-10-08 20:12:08 +0200
ba7838d2e
Merge pull request #2280 from martin-frbg/iosfix by
2019-10-08 10:25:25 +0200
a448884a6
(refs/pull/2280/head)
Remove automatic label postfixes from macro included only once by
2019-10-08 08:37:50 +0200
17609f88f
Merge pull request #11 from xianyi/develop by
2019-10-08 08:32:52 +0200
3a2df19db
Fix accidental duplication of jump instruction by
2019-10-08 08:09:26 +0200
12856eb8e
(refs/pull/2279/head)
Fix PROLOGUE for OSX/IOS by
2019-10-07 23:03:50 +0200
b6af6a5a5
Handle platforms that lack hwcap.h by falling back to ARMV8 by
2019-10-07 20:34:28 +0200
40eb3c22b
Update .travis.yml by
2019-10-07 18:26:56 +0200
6eac49178
Update .travis.yml by
2019-10-07 16:43:32 +0200
a1eb21fbb
Fix indentation by
2019-10-07 15:22:40 +0200
38ad1e4db
Add OSX/IOS cross-compilation test to Travis CI by
2019-10-07 13:57:01 +0200
76c2bf6c8
Add automatic fixups for OSX/IOS cross-compilation by
2019-10-07 13:54:47 +0200
5b65adc5f
Merge pull request #10 from xianyi/develop by
2019-10-07 13:52:34 +0200
d2093a40d
Merge pull request #2277 from martin-frbg/issue2275 by
2019-10-06 23:01:54 +0200
aa04b0925
Merge pull request #2276 from xianyi/revert-2272-thread-sqrt-of-negative by
2019-10-06 11:12:44 +0200
258ac56e0
Move 32bit OSX build back to xcode 8.3 but switch to gcc8 by
2019-10-05 10:52:47 +0200
56837e9d9
(refs/pull/2277/head)
Make local labels in macro compatible with the xcode assembler by
2019-10-04 14:53:23 +0200
bb5413863
Rewrite ARM64 PROLOGUE to make it compatible with xcode/ios by
2019-10-04 14:50:03 +0200
32f5907fe
Update 32bit macOS again to xcode 9.3 by
2019-10-03 01:09:02 +0200
ac10236cc
Update the OSX BINARY=32 test to xcode9.2 by
2019-10-02 22:35:34 +0200
8617d7554
(refs/pull/2276/head, revert-2272-thread-sqrt-of-negative)
Revert "Avoid taking root of negative number in symv_thread.c" by
2019-10-01 23:50:41 +0200
4cb4738f3
Fix source typo by
2019-09-30 09:10:19 -0400
ec7ab144b
(refs/pull/2273/head)
Fix various source comment typos by
2019-09-30 09:08:37 -0400
c07d78b9e
Merge pull request #2272 from seberg/thread-sqrt-of-negative by
2019-09-30 11:27:29 +0200
6355c25dd
(refs/pull/2272/head)
Avoid taking root of negative number in symv_thread.c by
2019-09-29 22:03:12 -0700
5e244d80f
Merge pull request #2271 from quickwritereader/strmm_fix by
2019-09-29 13:53:45 +0200
ede5efeba
(refs/pull/2271/head)
trmm fix by
2019-09-29 02:27:50 +0000
84908d60d
Merge pull request #2269 from martin-frbg/ppc-fixes by
2019-09-27 09:52:19 +0200
596a22325
(refs/pull/2269/head)
Fix prologue of power9 assembly cdot(c) kernel to provide cdotc by
2019-09-27 00:47:18 +0200