Andrew
11a627c54e
remove surplus parentheses to silence clang5
8 years ago
Andrew
bfc2a88594
remove unused buffer
8 years ago
Andrew
ef95cd471f
eliminate unread variable, after reiteration 3 of them (clang4)
8 years ago
Martin Kroeker
db72ad8f6a
Merge pull request #1320 from timmoon10/develop
2D thread distribution for multi-threaded GEMMs
8 years ago
Martin Kroeker
514d237257
Merge pull request #1279 from xsacha/develop
CMake improvements
8 years ago
Tim Moon
30486a356c
Reduce number of data partitions in n.
8 years ago
Tim Moon
9de52b489a
Cleaning up and documenting multi-threaded GEMM code.
8 years ago
Tim Moon
860dcfc703
Use 2D thread distribution for small GEMMs.
Allows maximum use of available cores if one of M and N is small and the other is large.
8 years ago
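
A rough sketch of the idea behind a 2D thread distribution, written in Python for brevity. This is not the OpenBLAS code: `choose_thread_grid` and `min_block` are hypothetical names, and the selection rule (keep as many threads busy as possible while each thread still gets at least `min_block` rows or columns) is an assumption made for illustration.

```python
def choose_thread_grid(m, n, nthreads, min_block=4):
    """Pick (threads_m, threads_n) for an M x N output matrix.

    Hypothetical helper: the grid uses at most `nthreads` threads and
    never splits a dimension into pieces smaller than `min_block`.
    """
    # Upper bound on how many ways each dimension can be split.
    max_tm = max(1, min(nthreads, m // min_block))
    max_tn = max(1, min(nthreads, n // min_block))

    best = (1, 1)
    for tm in range(1, max_tm + 1):
        # Give the remaining thread budget to the N dimension.
        tn = min(max_tn, nthreads // tm)
        if tm * tn > best[0] * best[1]:
            best = (tm, tn)
    return best


if __name__ == "__main__":
    # Small M, large N: splitting along M alone would allow only 2 threads.
    print(choose_thread_grid(8, 10000, 8))
    # Large M, small N: both dimensions are split.
    print(choose_thread_grid(10000, 8, 8))
```

Under this rule, with M = 8 and N = 10000 on 8 threads, a split along M alone could keep only 2 threads busy, while the grid search can still use all 8.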
Tim Moon
6aaa107865
Reducing threads for multi-threaded GEMMs on small matrices.
8 years ago
Sacha Refshauge
37858d1146
Fix threading usage in CMake: s/SMP/USE_THREAD/
8 years ago
Isuru Fernando
d245caa49a
Support out-of-source build
8 years ago
Martin Kroeker
49e62c0e77
fixed syrk_thread.c taken from wernsaar
Stride calculation fix copied from https://github.com/wernsaar/OpenBLAS/commit/88900e1
8 years ago
Werner Saar
a2672d5589
prepared driver/level3 functions for UNROLL values that are not a power of two
9 years ago
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
9 years ago
Zhang Xianyi
d06b92906a
Add gemm3m building for CMake.
10 years ago
Werner Saar
b07d733a71
added updates for syrk and syr2k
10 years ago
Zhang Xianyi
055b481386
Fixed CMake bug for single core.
10 years ago
Ralph Campbell
fbc21266e6
Minor C code fixes in driver/
10 years ago
Zhang Xianyi
d8392c1245
Fix CMake config bugs.
10 years ago
Zhang Xianyi
f874465bb8
Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.
Disable CBLAS and LAPACK.
10 years ago
Hank Anderson
9eaea02f33
Added additional gemm defines for complex types.
11 years ago
Hank Anderson
0d8e227ea7
Changed strategy for setting preprocessor definitions.
Instead of generating separate object files for each permutation of
defines for a source file, GenerateNamedObjects now writes an entirely
new source file and inserts the defines as #define c statements.
This solves a problem I ran into with ar.exe where it was refusing to
link objects that had the same filename despite having different paths.
11 years ago
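
A minimal sketch of the generated-source approach described above, in Python rather than CMake. The helper names, the `#include` of the original file, and the file and define names are assumptions for illustration; the commit message does not say how the new source file is laid out.

```python
def wrapper_source(original, defines):
    """Return the text of a generated C file: one #define per entry,
    followed by an #include of the original source (an assumed layout)."""
    lines = [f"#define {name}" for name in defines]
    lines.append(f'#include "{original}"')
    return "\n".join(lines) + "\n"


def wrapper_name(original, code):
    """Build a unique file name so no two objects share a filename."""
    stem, ext = original.rsplit(".", 1)
    return f"{stem}_{code}.{ext}"


if __name__ == "__main__":
    # Hypothetical source file and define names.
    print(wrapper_name("trmm_L.c", "TU"))
    print(wrapper_source("trmm_L.c", ["DOUBLE", "UPPER"]), end="")
```

Because every permutation gets its own file name, the resulting objects no longer collide by filename when archived.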
Hank Anderson
371071d461
Added CONJ defines for trmm/trsm.
11 years ago
Hank Anderson
8a143516e3
Added alternate_name to a couple of the name mangling schemes.
Added zherk_k sources to driver/level3.
11 years ago
Hank Anderson
e5897ecb9b
Added zherk_kernel.c objects to driver/level3.
11 years ago
Hank Anderson
4662a0b13a
Changed generate functions to iterate through a list of float types.
This will generate obj files for SINGLE/DOUBLE/COMPLEX/DOUBLE COMPLEX.
11 years ago
Hank Anderson
e74462a3f5
Moved declarations to start of functions to satisfy MSVC C89 implementation.
11 years ago
Hank Anderson
056ba26755
Changed a number of inline calls to use __inline.
MSVC doesn't implement C99, so can't use the inline keyword. __inline
appears to work in MSVC and GCC.
11 years ago
Hank Anderson
e8c39138c6
Removed return value from GenerateNamedObjects.
It sets DBLAS_OBJS directly to save a bunch of list appending in the
CMakeLists.txt files.
11 years ago
Hank Anderson
627d5e7401
Added SMP objects to driver/level3.
11 years ago
Hank Anderson
943fa2fb58
Fixed object names in level2.
11 years ago
Hank Anderson
461e691127
Codes when define is absent are now a parameter to AllCombinations.
The level3 object names should now be correct.
11 years ago
Hank Anderson
cfaf1c678f
Added option to append define codes with an underscore.
Fixed the code array not getting reset on subsequent AllCombinations
calls.
11 years ago
Hank Anderson
0d7bad1f35
Changed GenerateObjects to append combination codes (e.g. dtrmm_TU).
11 years ago
Hank Anderson
d11bde60d0
DOUBLE define for DBLAS objects is now set in main CMakeLists.txt.
Since the objects are the same, could generate SINGLE/COMPLEX/etc here
without having to rewrite all the object enumeration code again.
11 years ago
Hank Anderson
5057a4b4df
Added openblas add_library call that uses DBLAS_OBJS objects.
11 years ago
Hank Anderson
d3dcdddf75
Moved functions into util cmake file.
11 years ago
Hank Anderson
e5e7595bf9
Added parameter to GenerateObjects for defines that affect all sources.
11 years ago
Hank Anderson
7693887d61
Added empty set to the combinations generated by AllCombinations.
11 years ago
Hank Anderson
8d9b196e0d
Moved loop over define combos into a function.
This function takes a set of sources and a set of preprocessor
definitions. It will iterate over the sources and build an object
file for each combination of preprocessor definitions for each
source file.
11 years ago
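
The loop described above can be sketched as follows, in Python rather than CMake. The function names and the source and define names are hypothetical, and including the empty combination is a choice made here (a later entry in this log mentions adding the empty set).

```python
from itertools import combinations


def all_combinations(defines):
    """Return every subset of `defines`, from the empty set to the full list."""
    result = []
    for size in range(len(defines) + 1):
        result.extend(combinations(defines, size))
    return result


def object_plan(sources, defines):
    """Pair each source with each combination of -D flags.

    Each (source, flags) pair stands for one object file to build.
    """
    plan = []
    for src in sources:
        for combo in all_combinations(defines):
            flags = [f"-D{name}" for name in combo]
            plan.append((src, flags))
    return plan


if __name__ == "__main__":
    # Hypothetical source and define names.
    for src, flags in object_plan(["trmm_L.c"], ["UPPER", "TRANSA"]):
        print(src, " ".join(flags))
```

For one source and two definitions this yields four builds: no flags, each flag alone, and both together.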
Hank Anderson
a6cf8aafc0
Updated level3/CMakeLists with correct defines using all combos.
11 years ago
Hank Anderson
dbdca7bf0c
Added first pass at driver/level3 Makefile conversion.
Added a rather convoluted CMake function to find all combinations
of a given list. This will be useful for the object files that are
compiled multiple times with different combinations of preprocessor
definitions.
11 years ago
wernsaar
7aae4a62e7
enabled use of GEMM3M functions
11 years ago
wernsaar
1d33547222
optimized zgemm kernel for haswell
11 years ago
wernsaar
3ea4dadd30
optimizations for trsm
11 years ago
wernsaar
1b10ff129a
optimizations for trmm
11 years ago
wernsaar
125610d23b
allow setting a custom value for ?GEMM_DEFAULT_UNROLL_MN, optimizations for syrk
11 years ago
wernsaar
be94db096c
disabled *3M functions for x86_64 platforms
11 years ago
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
11 years ago
wernsaar
c947ab85dc
changed level3.c
12 years ago