Martin Kroeker
45333d5793
Fix error introduced during cleanup
7 years ago
Martin Kroeker
78d9910236
Correct range_n limiting
Same bug as seen in #1388, somehow missed in the corresponding PR #1389
7 years ago
Martin Kroeker
5a720cf9ca
Re-enable loop unrolling in trmv and remove the scary warning
fixes #1748 as that half of the fix for #1332 appears to have been an overreaction on my part.
7 years ago
Martin Kroeker
368d14f8c8
Fix harmless typo
fixes #1872
7 years ago
Martin Kroeker
0427277cef
Allow optimization for small m, large n only if it can be made threadsafe
otherwise the introduction of a static array in 8e5a108 to improve #532 breaks concurrent calls from multiple threads as seen in #1844
7 years ago
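
The thread-safety problem named in the commit above (a static array shared by concurrent callers) can be illustrated with a minimal sketch. This is not the OpenBLAS code or its actual fix: `sum_in_parts` and `MAX_PARTS` are hypothetical names, and the sketch simply uses per-call automatic storage, one of several ways to avoid sharing scratch space between threads.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARTS 8  /* hypothetical cap on the number of partial results */

/* Hypothetical example: sum x[0..n) through up to MAX_PARTS partial sums. */
double sum_in_parts(const double *x, size_t n, size_t parts)
{
    assert(parts > 0 && parts <= MAX_PARTS);

    /* Automatic storage: every call, and therefore every calling thread,
     * gets its own scratch array. Declaring this array `static` instead
     * would make all concurrent callers write into the same memory. */
    double partial[MAX_PARTS] = {0.0};

    for (size_t i = 0; i < n; i++)
        partial[i % parts] += x[i];

    /* Reduction step: combine the partial sums. */
    double total = 0.0;
    for (size_t p = 0; p < parts; p++)
        total += partial[p];
    return total;
}
```

Because the scratch array lives on the caller's stack, two threads calling `sum_in_parts` at the same time cannot overwrite each other's partial results.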
Martin Kroeker
cc9500db41
Merge pull request #1403 from brada4/develop
Address a few more warnings
8 years ago
Andrew
bfc2a88594
remove unused buffer
8 years ago
Martin Kroeker
177b78c8b4
Issue1388 (#1389)
* Calculation of chunk range limits was ignoring num_cpu
bug introduced by me in #1262 - should fix #1388
* Calculation of range limits was ignoring num_cpu
bug introduced by me in #1262
8 years ago
Andrew
281a2b952f
warning cleanup (#1380)
* dead increments in driver/level2
* dead increments in kernel/generic
* part dead increments in kernel/x86_64
8 years ago
Martin Kroeker
b414283f48
Disable gemv unrolling
as a (hopefully temporary) workaround for #1332
8 years ago
Andrew
e14d50d86e
eliminate Wunused-const gcc7 warning
8 years ago
Sacha Refshauge
37858d1146
Fix threading usage in CMake: s/SMP/USE_THREAD/
8 years ago
Martin Kroeker
719fcc56b0
Merge pull request #1262 from martin-frbg/xmv_thread-splitting
Make sure that range limit of last thread never exceeds data size
8 years ago
Martin Kroeker
0ba64cee60
Update trmv_thread.c
8 years ago
Martin Kroeker
c4e5ba1bfe
Make sure that range_n of last thread never exceeds the actual data size when splitting the workload
8 years ago
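
The idea behind the range-limit commits above, keeping the last thread's upper bound within the actual data size, can be sketched as follows. This is a simplified illustration under stated assumptions, not the OpenBLAS implementation: `split_range` is a hypothetical helper, and the even ceiling-division split is chosen only for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: split m items over at most num_threads chunks.
 * `range` must have room for num_threads + 1 entries; chunk i covers
 * [range[i], range[i + 1]). Returns the number of chunks actually used. */
size_t split_range(size_t m, size_t num_threads, size_t *range)
{
    assert(num_threads > 0 && range != NULL);

    size_t width = (m + num_threads - 1) / num_threads;  /* ceiling division */
    size_t used = 0;

    range[0] = 0;
    while (used < num_threads && range[used] < m) {
        size_t end = range[used] + width;
        if (end > m)
            end = m;  /* clamp so the last chunk never runs past the data */
        range[used + 1] = end;
        used++;
    }
    return used;
}
```

The loop stops on either condition: all threads have been assigned a chunk, or the data has been exhausted, so fewer chunks than threads are returned when `m` is small.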
Martin Kroeker
a6f533b248
Revert "Fix calculated range limit exceeding actual data size for last thread"
8 years ago
Isuru Fernando
d245caa49a
Support out-of-source build
8 years ago
Martin Kroeker
585c0010a5
Fix range limit exceeding actual data size in last step
8 years ago
Martin Kroeker
857f61bc5d
Fix range limit exceeding data size in last step
8 years ago
Martin Kroeker
9332042d5f
Fix range exceeding actual data size in quick_divide
8 years ago
Andrew
529bfc36ec
Fix write past fixed size buffer
8 years ago
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
9 years ago
Jerome Robert
53ba1a77c8
ztrmv_L.c: no longer need a 4kB buffer
Fix #786
10 years ago
Jerome Robert
78dcf5c3d5
Improve performance of ztrmv on small matrices
* Use stack allocation
* Disable multi-threading
* Ref #727
10 years ago
Ralph Campbell
fbc21266e6
Minor C code fixes in driver/
10 years ago
Zhang Xianyi
d8392c1245
Fix cmake config bugs.
10 years ago
Zhang Xianyi
f8eba3d548
Fixed cmake build bugs on Linux.
10 years ago
Zhang Xianyi
f874465bb8
Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.
Disable CBLAS and LAPACK.
10 years ago
Zhang Xianyi
dcd5ba4443
Merge branch 'cmake' of https://github.com/hpanderson/OpenBLAS into hpanderson_cmake
10 years ago
Zhang Xianyi
8e5a1083bb
Refs #532. Improve gemv parallelism for the small m, large n case.
Split the matrix and perform a reduction.
10 years ago
Hank Anderson
ab7043373f
Fixed bug generating trmv complex source names.
11 years ago
Hank Anderson
0553476fba
Added TRANS defines for complex sources in lapack.
11 years ago
Hank Anderson
2416d9dbac
Fixed TRANSA defines for complex sources in driver/level2.
11 years ago
Hank Anderson
0d8e227ea7
Changed strategy for setting preprocessor definitions.
Instead of generating separate object files for each permutation of
defines for a source file, GenerateNamedObjects now writes an entirely
new source file and inserts the defines as #define c statements.
This solves a problem I ran into with ar.exe where it was refusing to
link objects that had the same filename despite having different paths.
11 years ago
Hank Anderson
1b7f427401
Added conj gemv objects for complex build.
11 years ago
Hank Anderson
fb5d5bb971
Added defines for complex trmv.
11 years ago
Hank Anderson
33c5e8db7f
Added a helper function for setting the L1 kernel defaults.
Added loop to build objects with different KERNEL defines.
11 years ago
Hank Anderson
4662a0b13a
Changed generate functions to iterate through a list of float types.
This will generate obj files for SINGLE/DOUBLE/COMPLEX/DOUBLE COMPLEX.
11 years ago
Hank Anderson
e8c39138c6
Removed return value from GenerateNamedObjects.
It sets DBLAS_OBJS directly to save a bunch of list appending in the
CMakeLists.txt files.
11 years ago
Hank Anderson
2f59135eb6
Added gemv to level2 CMakeLists.txt.
11 years ago
Hank Anderson
6b5d26e07b
Added SMP sources to level2 CMakeLists.txt.
11 years ago
Hank Anderson
943fa2fb58
Fixed object names in level2.
11 years ago
Hank Anderson
0d7bad1f35
Changed GenerateObjects to append combination codes (e.g. dtrmm_TU).
11 years ago
Hank Anderson
5057a4b4df
Added openblas add_library call that uses DBLAS_OBJS objects.
11 years ago
Hank Anderson
a6cf8aafc0
Updated level3/CMakeLists with correct defines using all combos.
11 years ago
Hank Anderson
8c23965da3
prebuild.cmake now reads the output from getarch into CMake vars.
11 years ago
Hank Anderson
8ede4a8da4
getarch now compiles and sets config.h defines properly.
Still isn't parsed into CMake variables, and getarch_2 needs to
get the same treatment.
11 years ago
Hank Anderson
1c5b6bb4f7
Added CORE define to config.h in prebuild.cmake (temporarily).
11 years ago
Hank Anderson
9a508abdc7
Added first pass at driver/level2 makefile conversion.
11 years ago
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
11 years ago