Maarten Baert
b37889e52d
Merge branch 'OpenMathLib:develop' into fix-dlasd7
10 months ago
Martin Kroeker
11ce79a4f0
Merge pull request #5329 from foxtran/fix/docs
Update FAQ
10 months ago
Maarten Baert
0904a42fa4
Fix documentation error and ordering bug in DLASD7
10 months ago
Martin Kroeker
d24195e9a1
Merge pull request #5295 from Pengzhou0810/develop
Fix some hyperthreading errors.
10 months ago
zhoupeng
134b21ae60
Fix some hyperthreading errors.
When there are multiple NUMA nodes and hyper-threading causes adjacent logical cores to share a physical core (e.g., common->avail[i] = 0x5555555555555555UL), the numa_mapping function should not filter with the bitmask, as this would duplicate the masking done by the subsequent local_cpu_map function.
11 months ago
Martin Kroeker
d96daa220d
Merge pull request #5290 from Srangrang/develop
Add support for FP16 to OpenBLAS and shgemm on RISCV
10 months ago
Martin Kroeker
fdc1c32340
Merge pull request #5336 from martin-frbg/issue5332
Use response files on old PPC/Intel Macs in single-target builds too
10 months ago
Martin Kroeker
5aa483e16c
Use response files on old PPC/Intel Macs in single-target builds too
10 months ago
Martin Kroeker
12591caa91
Merge pull request #5334 from azuresky01/develop
Fix INTERFACE64 builds on Loongarch64 with LLVM
10 months ago
Martin Kroeker
ee26caffb3
Merge pull request #5309 from davidz-ampere/dev-ampereone
Add support for Ampere AmpereOne processors
10 months ago
Martin Kroeker
8b08df5c5a
Merge pull request #5335 from martin-frbg/issue5330
Remove non-portable option from objcopy calls in the CMake build
10 months ago
Martin Kroeker
3bba35b8f7
Remove non-portable option from objcopy calls
10 months ago
azuresky01
8953ba9c2f
Fix INTERFACE64 builds on Loongarch64 with LLVM
fix https://github.com/OpenMathLib/OpenBLAS/issues/5331
10 months ago
davidz-ampere
aa90ab4142
Add support for Ampere AmpereOne processors
10 months ago
Igor S. Gerasimov
46b0dfef8f
Use links to issues
10 months ago
Igor S. Gerasimov
83efceb3cd
Keep dgemm_snb_1thread.png in repo
10 months ago
Martin Kroeker
b4945057b7
Merge pull request #5319 from imciner2/im/armtypes
Update SBGEMM neoversev1 kernel to use standard C types
10 months ago
Martin Kroeker
b3904aeed7
Merge pull request #5323 from imciner2/im/ofast
Switch power to use O3 instead of Ofast
10 months ago
Ian McInerney
721c80644b
Switch power to use O3 instead of Ofast
Ofast enables possibly unsafe optimizations in addition to O3. This
appears to have been added and then just continually copied into later
Power architectures, and it wasn't included in the CMake build system
when that was introduced.
Replace this with O3 so that the same level of optimization is done by
the compiler.
10 months ago
Ian McInerney
badef1d32e
Update sbgemm_tcopy_4_neoversev1 kernel to use standard C types
10 months ago
Martin Kroeker
4e6da5ed34
Update version to 0.3.30.dev
10 months ago
Martin Kroeker
8dff37827e
Update version to 0.3.30.dev
10 months ago
Martin Kroeker
c055c36b40
Merge pull request #5317 from OpenMathLib/release-0.3.0
merge back from 0.3.0 to copy tag
10 months ago
Martin Kroeker
993fad6aeb
Update version to 0.3.30
10 months ago
Martin Kroeker
3382763df6
Update version to 0.3.30
10 months ago
Martin Kroeker
e81fca06dd
Merge pull request #5316 from OpenMathLib/develop
Update from develop for 0.3.30 release
10 months ago
Martin Kroeker
d339bd5515
Merge pull request #5308 from martin-frbg/changelog0330
Update the Changelog for version 0.3.30
10 months ago
Martin Kroeker
157273fda0
another round of last minute updates for 0.3.30
10 months ago
Martin Kroeker
1546599a13
Merge pull request #5315 from loss-and-quick/arm-exec-stack
Add .note.GNU-stack in ARM epilogue to avoid writable stack
10 months ago
minicx
79b4dd0fb0
fix(arm): add .note.GNU-stack to ARM assembly to prevent writable-stack warnings
Add .section .note.GNU-stack in ARM assembly epilogue on Linux/ELF targets to
avoid warnings about a writable/executable stack and ensure shared objects do
not require an executable stack.
Signed-off-by: minicx <minicx@disroot.org>
10 months ago
Martin Kroeker
c2342fc2d0
Merge pull request #5314 from martin-frbg/dynampere1
Support AmpereOne/OneA as NeoverseN1 in DYNAMIC_ARCH builds
10 months ago
Martin Kroeker
e541bf68f5
support AmpereOne/OneA as NeoverseN1
10 months ago
Martin Kroeker
5ad6435660
Merge pull request #5312 from martin-frbg/x86cdot
Work around X86 POTRS/CDOT bug on old systems and add CI job for 32bit manylinux
10 months ago
Martin Kroeker
e684e36377
Add 32bit manylinux to match what python wheel build tests use
10 months ago
Martin Kroeker
3318a2b904
override CDOT and ZDOT with the generic C kernel
10 months ago
davidz-ampere
84730068af
reduce duplicate kernel code
10 months ago
Martin Kroeker
85337c5160
Merge pull request #5310 from nakagawa-fj/bugfix/identify_cpu_part_for_arm64
Bug Fix: Problem with identifying some ARM64 processors
10 months ago
Martin Kroeker
53cd6e7ff7
Update Changelog.txt
10 months ago
Masato Nakagawa
1dd396033a
Fix: Problem with identifying some ARM64 processors.
10 months ago
Srangrang
3b1ac29b77
disable BUILD_HFLOAT16
10 months ago
davidz-ampere
be68ef03b4
Add support for Ampere processors
10 months ago
Martin Kroeker
f1097d1cba
Merge pull request #5306 from martin-frbg/lapack1131
Fix missing initialization leading to bypassing corner cases in C/ZGEQP3RK (Reference-LAPACK PR #1131)
10 months ago
Martin Kroeker
3fe7f196e6
Update the Changelog for version 0.3.30
10 months ago
Martin Kroeker
bad47bd024
Fix too strict leading dimensions check in LAPACKE_?gesdd_work (Reference-LAPACK PR #1126) (#5307)
* relax leading dimensions check (Reference-LAPACK PR #1126)
10 months ago
Martin Kroeker
7f3093a0ad
Merge pull request #5305 from martin-frbg/lapack1135
Fix 2nd dimension used by LAPACKE_c/zunmlq in NaN check and transposition (Reference-LAPACK PR #1135)
10 months ago
Martin Kroeker
1804ff58d7
fix missing initialization
10 months ago
Martin Kroeker
906b9df316
fix missing initialization
10 months ago
Martin Kroeker
f4e5177050
fix dimension used in nancheck (Reference-LAPACK PR 1135)
10 months ago
Martin Kroeker
2a6beac88f
fix dimension used in transposition (Reference-LAPACK PR 1135)
10 months ago
Martin Kroeker
d8a2324699
fix dimension used in nancheck (Reference-LAPACK PR 1135)
10 months ago