Zhang Xianyi
bfe1656b8b
Merge pull request #1225 from martin-frbg/stolen_from_wernsaar_fork
fixed syrk_thread.c taken from wernsaar
8 years ago
Martin Kroeker
49e62c0e77
fixed syrk_thread.c taken from wernsaar
Stride calculation fix copied from https://github.com/wernsaar/OpenBLAS/commit/88900e1
8 years ago
Zhang Xianyi
fa6a920caa
Link -lm or -lm_hard for Android ARMv7.
8 years ago
Zhang Xianyi
a6515bb858
Merge pull request #1218 from m-brow/power9
Optimise loads on Power9 LE
8 years ago
Zhang Xianyi
c66b842d66
Merge pull request #1212 from neilsh-msft/develop
Add Microsoft Windows 10 UWP build support
8 years ago
Martin Kroeker
e5e47cfdb5
Merge pull request #1220 from ashwinyes/develop_aarch64_20170701_t99_options
arm64: Change mtune/mcpu options for THUNDERX2T99 target
8 years ago
Ashwin Sekhar T K
ebf9e9dabe
arm64: Change mtune/mcpu options for THUNDERX2T99 target
8 years ago
Neil Shipp
34513be726
Add Microsoft Windows 10 UWP build support
8 years ago
Zhang Xianyi
482015f8d6
Merge branch 'arm_soft_fp_abi' into develop
8 years ago
Zhang Xianyi
639000e34f
Merge pull request #1211 from neilsh-msft/develop
Add 64bit support for Microsoft Visual Studio
8 years ago
Neil Shipp
5de7727cc7
Reorder dependencies to allow in-place build to succeed the first time.
8 years ago
Neil Shipp
96df4b9b17
Avoid truncating cblas.h when compiling gencblas target
8 years ago
Neil Shipp
29dc8e0c61
Revert changes to sed and awk
8 years ago
Neil Shipp
65e56cb29d
Add 64bit support for Microsoft Visual Studio
8 years ago
Matt Brown
bd831a03a8
Optimise sscal for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
8 years ago
Matt Brown
edc97918f8
Optimise srot for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
8 years ago
Matt Brown
e0034de22d
Optimise sdot for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
8 years ago
Matt Brown
32c7fe6bff
Optimise sasum for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
8 years ago
Matt Brown
19bdf9d52b
Optimise casum for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
8 years ago
Matt Brown
4f09030fdc
Optimise cswap for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
8 years ago
Matt Brown
6f4eca5ea4
Optimise sswap for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
8 years ago
Matt Brown
be55f96cbd
Optimise scopy for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
8 years ago
Matt Brown
96dd0ef4f7
Optimise ccopy for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
8 years ago
Martin Kroeker
8f0d6c06a9
Fix installation of header files with cmake (#1186)
* Fix installation of header files with cmake
Install only the required header files, with openblas_config.h preprocessed like in Makefile.install
Fixes #1184
* Update CMakeLists.txt
Escape remaining semicolons in awk argument list (to get it working on Windows as well)
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Add files via upload
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
see if it is the single quotes that cause the problem on windows
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Use C utility instead of awk for header generation in cmake builds
* Update CMakeLists.txt
* Fix generation and installation of header files
Generate openblas_config.h and f77blas.h with same contents as in plain Makefile builds and install only the public header files
8 years ago
Martin Kroeker
410a07cbec
Merge pull request #1190 from oviradoi/utest_make_complex
Update test to use openblas_make_complex_float and openblas_make_comp…
8 years ago
Ovidiu Radoi
72f95a0acc
Update test to use openblas_make_complex_float and openblas_make_complex_double functions
8 years ago
Martin Kroeker
e545b81e76
Merge pull request #1189 from pawosm-arm/flang
build: Flang has the same interface as PGI
8 years ago
Paul Osmialowski
d7afdf9137
build: Flang has the same interface as PGI
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>
8 years ago
Martin Kroeker
4f4daaa42a
Merge pull request #1188 from pawosm-arm/flang
build: Flang compiler support
8 years ago
Paul Osmialowski
42bbe74791
build: LLVM: Add Flang compiler support and enable OpenMP for Clang
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>
8 years ago
Zhang Xianyi
c8322c65e4
Merge pull request #1187 from mine260309/develop
build: fix libxlmass errors building on Power CPU
8 years ago
Lei YU
87dde1fde6
build: fix libxlmass errors building on Power CPU
IBM MASS library is upgraded to 8.1.5 and 8.1.3 is not available.
Update README.md and Makefile.power to use version 8.1.5 of libxlmass.
8 years ago
Martin Kroeker
42466e54fa
Merge pull request #1182 from martin-frbg/martin-frbg-patch-1
Build shared library on Android without SONAME versioning
8 years ago
Martin Kroeker
3b0624d50f
Build shared library on Android without SONAME versioning
Android does not support versioned SONAME entries, ref. #1173
8 years ago
Martin Kroeker
fd4e68128e
Merge pull request #1178 from jcowgill/mips-fixes
MIPS threading fixes
8 years ago
Martin Kroeker
6464d1723a
Merge pull request #1179 from jcowgill/memory-fixes
Fixes to driver/others/memory.c
8 years ago
James Cowgill
59c97cfee4
memory: Fix buffer overflow when position == NUM_BUFFERS
8 years ago
James Cowgill
de7875ca5d
mips: remove incorrect blas_lock implementations
MIPS 32-bit currently has an empty blas_lock implementation, which is
worse than nothing at all. MIPS 64-bit does have a blas_lock
implementation, but it is broken. Remove them and fall back to the generic
version in common.h, which should do the right thing on MIPS.
8 years ago
James Cowgill
67836c2ab4
mips: implement MB and WMB
The MIPS architecture has weak memory ordering and therefore requires
suitable memory barriers when doing lock-free programming with multiple
threads (just like ARM does). This commit implements those barriers for
MIPS and MIPS64 using GCC builtins, which is probably the easiest way.
8 years ago
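The barrier pattern described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual OpenBLAS definitions: it assumes the GCC builtin `__sync_synchronize()` is used as a full barrier for both macros, and `payload`, `ready`, `publish`, and `consume` are hypothetical names introduced only for this example.

```c
/* Assumption: both macros map to a full barrier via a GCC builtin.
   The real MB/WMB definitions in OpenBLAS may differ. */
#define MB  __sync_synchronize()
#define WMB __sync_synchronize()

static int payload;          /* hypothetical data handed between threads */
static volatile int ready;   /* hypothetical flag guarding the data */

/* Writer side: store the data, then issue a barrier so the store to
   payload is visible before the store to ready. */
void publish(int value) {
    payload = value;
    WMB;
    ready = 1;
}

/* Reader side: wait for the flag, then issue a barrier so the load of
   payload is not reordered before the load of ready. */
int consume(void) {
    while (!ready) {
        /* spin until the writer sets the flag */
    }
    MB;
    return payload;
}
```

On a weakly ordered architecture, omitting either barrier would allow the reader to observe `ready == 1` while still seeing a stale `payload`.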
James Cowgill
5fecfe0f42
memory: switch loop condition around in blas_memory_free
Before this commit, the "position < NUM_BUFFERS" loop condition from
blas_memory_free will be completely optimized away by GCC. This is
because the condition can only be false after undefined behavior has
already been invoked (reading past the end of an array). As a
consequence of this bug, GCC also removes the subsequent if statement
and all the code after the error label because all of it is dead.
This commit switches the loop condition around so it works as intended.
8 years ago
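The ordering issue described in the commit above can be illustrated with a small sketch. It is not the code from driver/others/memory.c: the `memory` table, the `slot` type, the `find_slot` helper, and the value chosen for `NUM_BUFFERS` are all hypothetical, and only the `position < NUM_BUFFERS` condition comes from the commit message.

```c
#include <stddef.h>

#define NUM_BUFFERS 4   /* illustrative size, not the OpenBLAS value */

struct slot { void *addr; };                /* hypothetical table entry */
static struct slot memory[NUM_BUFFERS];     /* hypothetical buffer table */

/* Return the index of the slot holding addr, or -1 if there is none.
   The bounds check is evaluated first, so memory[position] is never
   read with position == NUM_BUFFERS. With the operands swapped, the
   array read would run past the end before the check, and a compiler
   may then assume the check is always true and delete it. */
int find_slot(const void *addr) {
    int position = 0;
    while (position < NUM_BUFFERS && memory[position].addr != addr)
        position++;
    return position < NUM_BUFFERS ? position : -1;
}
```

Because `&&` short-circuits, the not-found path stays reachable and the error handling after the loop is no longer dead code.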
Martin Kroeker
bba6676803
Merge pull request #1175 from martin-frbg/lapack_143
Fix workspace computation in LAPACKE ?tpmqrt
8 years ago
Martin Kroeker
5649b2c53a
Merge pull request #1176 from staticfloat/sf/dynamic_arch
Fix DYNAMIC_ARCH=1 breaking builds on non-x86 platforms
8 years ago
Elliot Saba
6e972994b2
Force `DYNAMIC_ARCH` to empty when `DYNAMIC_CORE` is not set
8 years ago
Elliot Saba
5b04cf7ab4
Add Makefile debugging trick so that we can inspect runtime Makefile variables
8 years ago
Martin Kroeker
d5ea8fd823
Fix workspace computation for side=L
From netlib PR#144
8 years ago
Martin Kroeker
4beffaaa4b
Fix workspace computation for side=L
From netlib PR#144
8 years ago
Martin Kroeker
fb28e4adc9
Fix workspace computation for side=L
From netlib PR#144
8 years ago
Martin Kroeker
26faa3ca47
Fix workspace allocation in lapacke_ctp for side=L
from netlib PR #144
8 years ago
Martin Kroeker
4f75989634
Merge pull request #1169 from martin-frbg/cblas_xerbla
Add trivial implementation of cblas_xerbla
8 years ago
Martin Kroeker
1e06b49854
Update xerbla.c
8 years ago