Martin Kroeker
67bbde71e5
Update harmonyos.yml
1 year ago
Martin Kroeker
108bf599ae
Create harmonyos.yml
1 year ago
Martin Kroeker
e4f83d4485
Merge pull request #5041 from martin-frbg/issue2715
Identify all cores in ARM64 autodetection, return fastest TARGET and performance group sizes
1 year ago
Martin Kroeker
7fd73a40dc
Fix accidentally dropped cpu ids and add MacOS performance groups
1 year ago
Martin Kroeker
a182251284
fix typo
1 year ago
Martin Kroeker
ed95791618
fix conflicting variables
1 year ago
Martin Kroeker
3c3d1c4849
Identify all cores and select the most performant one as TARGET
1 year ago
Martin Kroeker
be807c98a6
Identify all cores, group by performance and report the fastest TARGET
1 year ago
Martin Kroeker
a63282a688
Merge pull request #5037 from tingboliao/develop
Optimize the nrm2_rvv function to further improve performance.
1 year ago
Martin Kroeker
2f86913209
Merge pull request #5040 from martin-frbg/issue922
Add an install_tests target to facilitate testing cross-compiles
1 year ago
Martin Kroeker
e9ff70b394
Add an install_tests target to facilitate testing on cross-compiled targets
1 year ago
Martin Kroeker
85a33326a1
Merge pull request #5039 from martin-frbg/fixgmakenaming
Fix "make install" creating incorrect names for suffixed libraries in the cmake and pkgconfig files
1 year ago
Martin Kroeker
6ad793d65e
Fix naming of suffixed libraries in the cmake and pkgconfig files
1 year ago
Martin Kroeker
0a2d9aaf32
Merge pull request #4982 from svillemot/develop
Restore libsuffix support in pkg-config file
1 year ago
Martin Kroeker
9297c46dfb
Merge pull request #5036 from martin-frbg/issue4032
Add a paragraph on "benign" LAPACK-TEST errors to the FAQ document
1 year ago
tingbo.liao
c37509c213
Optimize the nrm2_rvv function to further improve performance.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
1 year ago
Martin Kroeker
a1075477c3
Merge pull request #4994 from martin-frbg/issue4886
Disable multithreading in ?TRTRI for small workloads
1 year ago
Martin Kroeker
fff2e214ca
Add LAPACK-TEST errors topic
1 year ago
Martin Kroeker
718fb73bd8
Merge pull request #4976 from martin-frbg/m3m_exprec
[WIP]Add better workaround for GEMM3M on GENERIC and re-enable EXPRECISION for x86/x86_64 targets
1 year ago
Martin Kroeker
73527aab3c
Merge pull request #5030 from tingboliao/develop
Optimize the zgemm_tcopy_4_rvv function to be compatible with the situations where the vector lengths(vlens) are 128 and 256.
1 year ago
Martin Kroeker
c1258662db
Merge branch 'OpenMathLib:develop' into m3m_exprec
1 year ago
Martin Kroeker
36b0fb3aff
Merge pull request #5035 from martin-frbg/issue4396
Improve OpenBLASConfig.cmake contents in gmake builds
1 year ago
Martin Kroeker
d863dcf83c
Merge pull request #5033 from rgommers/doc-port-last-wiki-edits
docs: update extensions and install pages with last wiki edits
1 year ago
Martin Kroeker
d5e255519e
Improve OpenBLASConfig.cmake contents
1 year ago
Ralf Gommers
df42f79c4c
docs: update extensions and install pages with last wiki edits
I went through the wiki pages and found two pages with edits that
weren't reflected in the html docs yet, so syncing that content here.
1 year ago
Martin Kroeker
17803e7901
Merge pull request #5031 from david-cortes/fix_doc_links
Fix invalid link to FAQ
1 year ago
david-cortes
762fa1afa9
fix link to faq
1 year ago
Martin Kroeker
6af4e76f31
Merge pull request #5029 from martin-frbg/issue5020
Add support for compiling with Intel oneAPI 2025.0 on MS Windows
1 year ago
Martin Kroeker
fbf594b62f
Guard against empty CMAKE_Fortran_COMPILER_ID
1 year ago
tingbo.liao
c4c3d9e68a
Merge remote-tracking branch 'refs/remotes/origin/develop' into develop
1 year ago
tingbo.liao
0bea1cfd9d
Optimize the zgemm_tcopy_4_rvv function to be compatible with the situations where the vector lengths(vlens) are 128 and 256.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
1 year ago
Martin Kroeker
e6fd629770
Expressly declare the .S extension for assembly (documented as standard, but current cmake does not set it for icx)
1 year ago
Martin Kroeker
05fe49ddaf
Rename local copy functions to avoid name clash with the standard BLAS ones
1 year ago
Martin Kroeker
64c6c79201
Assume no underline suffixes on symbols when compiling with Intel ifx on Windows
1 year ago
Martin Kroeker
5c9417d306
Assume no underline suffixes on symbols when compiling with ifx on Windows
1 year ago
Martin Kroeker
5d81e514e4
Assume no underline suffixes on symbols when compiling with ifx on Windows
1 year ago
Martin Kroeker
d78fbe425c
Assume no underline suffixes on symbols when compiling with ifx on Windows
1 year ago
Martin Kroeker
30188a55d1
Don't assume underlined symbols for ifx; make cpuid.S inclusion conditional
1 year ago
Martin Kroeker
32319a33ac
Add options for Intel oneAPI 2025.0 ifx on Windows
1 year ago
Martin Kroeker
37a4ca7e46
Merge pull request #5025 from martin-frbg/nvidia_arm64
Add target-specific options to enable ARM64 SVE with the NVIDIA compiler
1 year ago
Martin Kroeker
1c4401ebf1
Add target-specific options to enable SVE with the NVIDIA compiler
1 year ago
Martin Kroeker
f2be482d43
Merge pull request #5024 from martin-frbg/issue5001
Improve the wording of the build instructions for Windows on Arm in the docs
1 year ago
Martin Kroeker
70dddacb9f
Merge pull request #5023 from rgommers/fix-warnings
Fix two compiler warnings in `memory.c`
1 year ago
Martin Kroeker
a93d3db34a
fix formatting of WoA section
1 year ago
Martin Kroeker
e460512685
Update WoA build instructions from rewording in issue #5001
1 year ago
Martin Kroeker
d3cc8c65ed
Merge pull request #5022 from tingboliao/develop
Replace the __riscv_vid_v_i32m2 and __riscv_vid_v_i64m2 with __riscv_vid_v_u32m2 and __riscv_vid_v_u64m2 for riscv64-unknown-linux-gnu-gcc compiling.
1 year ago
Ralf Gommers
765ad8bcd2
Fix guard around `alloc_hugetlb`, fixes compile warning
The warning was:
```
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c: At top level:
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c:2565:14: warning: 'alloc_hugetlb' defined but not used [-Wunused-function]
 2565 | static void *alloc_hugetlb(void *address){
      |              ^~~~~~~~~~~~~
```
The added define is the same as is already present in the TLS part of
`memory.c`. This follows up on gh-4681.
1 year ago
Ralf Gommers
48caf2303d
Fix build warning about discarding volatile qualifier in memory.c
The warning was:
```
[4339/5327] Building C object driver/others/CMakeFiles/driver_others.dir/memory.c.o
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c: In function 'blas_shutdown':
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c:3257:10: warning: passing argument 1 of 'free' discards 'volatile' qualifier from pointer target type [-Wdiscarded-qualifiers]
 3257 |     free(newmemory);
      |          ^~~~~~~~~
In file included from /home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/common.h:83,
                 from /home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c:74:
/home/rgommers/code/pixi-dev-scipystack/openblas/.pixi/envs/default/x86_64-conda-linux-gnu/sysroot/usr/include/stdlib.h:482:25: note: expected 'void *' but argument is of type 'volatile struct newmemstruct *'
  482 | extern void free (void *__ptr) __THROW;
      |                   ~~~~~~^~~~~
```
The use of `volatile` for `newmemstruct` seems on purpose, and there are
more such constructs in this file. The warning appeared after gh-4451
and is correct. The `free` prototype doesn't expect a volatile pointer,
hence this change adds a cast to silence the warning.
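The kind of cast described above can be sketched as follows. This is a minimal illustration, not the actual patch: `demo_memstruct`, `demo_alloc` and `demo_release` are hypothetical names, and the struct fields are placeholders rather than the real layout of `newmemstruct` in `memory.c`.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the struct used in memory.c; the fields are
 * illustrative only and do not mirror the real definition. */
struct demo_memstruct {
    void *addr;
    int used;
};

/* Allocate a zero-initialised table that is accessed through a
 * volatile-qualified pointer, as the commit message describes. */
volatile struct demo_memstruct *demo_alloc(size_t count) {
    return (volatile struct demo_memstruct *)
        calloc(count, sizeof(struct demo_memstruct));
}

/* free() is declared as taking a plain `void *`. Passing the volatile
 * pointer directly makes GCC report -Wdiscarded-qualifiers; the explicit
 * cast drops the qualifier on purpose and silences the warning. */
void demo_release(volatile struct demo_memstruct *table) {
    free((void *)table);
}
```

The cast only changes the pointer's type at the call site. Accesses made through the table pointer elsewhere keep their `volatile` semantics.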
1 year ago
tingbo.liao
d00cc400b1
Replaced the __riscv_vid_v_i32m2 and __riscv_vid_v_i64m2 with __riscv_vid_v_u32m2 and __riscv_vid_v_u64m2 for riscv64-unknown-linux-gnu-gcc compiling.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
1 year ago
Martin Kroeker
229d8a025e
Merge pull request #4959 from CDAC-Bengaluru/level-1-sve
SVE Implementation for Level-1 BLAS Routines
1 year ago