Martin Kroeker
4e2a32ff51
Merge pull request #4454 from kseniyazaytseva/riscv-rvv07
Fix BLAS and LAPACK tests for C910V and RISCV64_ZVL256B targets
2 years ago
Martin Kroeker
a21b2fa5e4
Merge pull request #4452 from kseniyazaytseva/riscv-generic
Fix BLAS, BLAS-like functions and Generic RISC-V kernels
2 years ago
Andrey Sokolov
73530b03fa
remove RISCV64_ZVL256B additional extensions
2 years ago
kseniyazaytseva
86943afa9c
Fix x280 target include riscv_vector.h
2 years ago
Andrey Sokolov
9c49a81d54
Resolve conflicts
2 years ago
kseniyazaytseva
e1afb23811
Fix BLAS and LAPACK tests for C910V and RISCV64_ZVL256B targets
* Fixed bugs in dgemm, [a]min\max, asum kernels
* Added zero checks for BLAS kernels
* Added dsdot implementation for RVV 0.7.1
* Fixed bugs in _vector files for C910V and RISCV64_ZVL256B targets
* Added additional definitions for RISCV64_ZVL256B target
3 years ago
Martin Kroeker
10c22f4a39
Merge pull request #4355 from imaginationtech/img-riscv64-zvl128b
[RISC-V] Add RISC-V Vector 128-bit target
2 years ago
Octavian Maghiar
ccbc3f875b
[RISC-V] Add RISCV64_ZVL128B target to common_riscv64.h
2 years ago
Octavian Maghiar
deecfb1a39
Merge branch 'risc-v' into img-riscv64-zvl128b
2 years ago
kseniyazaytseva
f89e0034a4
Fix LAPACK usage from BLAS
2 years ago
Martin Kroeker
f7cf637d7a
redo lost edit
3 years ago
Martin Kroeker
85548e66ca
Fix build failures seen with the NO_LAPACK option - cspr/csymv/csyr belong on the LAPACK list
3 years ago
Martin Kroeker
f129161453
restore C/Z SPMV, SPR, SYR,SYMV
3 years ago
kseniyazaytseva
5222b5fc18
Added axpby kernels for GENERIC RISC-V target
2 years ago
Martin Kroeker
1c04df20bd
Re-enable overriding the LAPACK SYMV,SYR,SPMV and SPR implementations
3 years ago
Martin Kroeker
5b4df851d7
fix stray blank on continuation line
3 years ago
kseniyazaytseva
ff41cf5c49
Fix BLAS, BLAS-like functions and Generic RISC-V kernels
* Fixed gemmt, imatcopy, zimatcopy_cnc functions
* Fixed cblas_cscal testing in ctest
* Removed unreachable rotmg code
* Added zero size checks
3 years ago
Martin Kroeker
88e994116c
Merge pull request #4354 from imaginationtech/img-rvv-kernel-generator
[RISC-V] Improve RVV kernel generator LMUL usage
2 years ago
Martin Kroeker
e3508d3713
Merge pull request #4439 from sergei-lewis/risc-v
Fix builds with t-head toolchains that use old intrinsics spec
2 years ago
Sergei Lewis
9edb805e64
fix builds with t-head toolchains that use old versions of the intrinsics spec
2 years ago
Martin Kroeker
1332f8a822
Merge pull request #4159 from OMaghiarIMG/risc-v-tail-policy
Set tail policy to undisturbed for RVV intrinsics accumulators
2 years ago
Martin Kroeker
2d316c2920
Merge pull request #4125 from OMaghiarIMG/risc-v
Fixes RVV masked intrinsics for iamax/iamin/imax/imin kernels
2 years ago
Octavian Maghiar
4a12cf53ec
[RISC-V] Improve RVV kernel generator LMUL usage
The RVV kernel generation script uses the provided LMUL to increase the number of accumulator registers.
Since the effect of the LMUL is to group together the vector registers into larger ones, it actually should be used as a multiplier in the calculation of vlenmax.
At the moment, no matter what LMUL is provided, the generated kernels would only set the maximum number of vector elements equal to VLEN/SEW.
Commit changes the use of LMUL to properly adjust vlenmax. Note that an increase in LMUL results in a decrease in the number of effective vector registers.
2 years ago
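The vlenmax relationship described in the LMUL commit above can be sketched in plain C. This is only an illustration of the arithmetic (VLMAX = LMUL * VLEN / SEW, with 32 architectural vector registers shared among the register groups); the function names `vlmax_elems` and `vreg_groups` are hypothetical and do not come from the kernel generator script, and fractional LMUL is not modelled.

```c
#include <stddef.h>

/* Hypothetical helper: maximum number of elements one vector register
 * group can hold, VLMAX = LMUL * VLEN / SEW.
 * vlen_bits and sew_bits are in bits; lmul is an integer grouping
 * factor (1, 2, 4 or 8). Fractional LMUL is not modelled here. */
static size_t vlmax_elems(size_t vlen_bits, size_t sew_bits, size_t lmul)
{
    return lmul * vlen_bits / sew_bits;
}

/* Hypothetical helper: number of register groups left when the 32
 * architectural vector registers are grouped LMUL at a time. */
static size_t vreg_groups(size_t lmul)
{
    return 32 / lmul;
}
```

For example, with VLEN=128 and 64-bit elements, LMUL=1 gives 2 elements per group across 32 groups, while LMUL=4 gives 8 elements per group but only 8 groups, which is the trade-off the commit message notes.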
Octavian Maghiar
e4586e81b8
[RISC-V] Add RISC-V Vector 128-bit target
Current RVV x280 target depends on vlen=512-bits for Level 3 operations.
Commit adds generic target that supports vlen=128-bits.
New target uses the same scalable kernels as x280 for Level 1&2 operations, and autogenerated kernels for Level 3 operations.
Functional correctness of Level 3 operations tested on vlen=128-bits using QEMU v8.1.1 for ctests and BLAS-Tester.
2 years ago
Octavian Maghiar
826a9d5fa4
Adds tail undisturbed for RVV Level 2 operations
During the last iteration of some RVV operations, accumulators can get overwritten when VL < VLMAX and tail policy is agnostic.
Commit changes intrinsics tail policy to undisturbed.
2 years ago
Octavian Maghiar
8df0289db6
Adds tail undisturbed for RVV Level 1 operations
During the last iteration of some RVV operations, accumulators can get overwritten when VL < VLMAX and tail policy is agnostic.
Commit changes intrinsics tail policy to undisturbed.
2 years ago
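The accumulator problem described in the two tail-policy commits above can be modelled in scalar C without any RVV hardware. This is a rough sketch, not the OpenBLAS kernel code: `VLMAX_MODEL`, `fmacc_model` and `dot_model` are hypothetical names, and the tail-agnostic case is modelled by its worst permitted outcome (tail lanes overwritten with all-ones bits, which is a NaN for floats), whereas real hardware is also allowed to leave those lanes unchanged.

```c
#include <math.h>
#include <stddef.h>

#define VLMAX_MODEL 4 /* hypothetical: lanes per register group in this model */

/* Model of a vector multiply-accumulate: acc[i] += x[i] * y[i] for i < vl.
 * Lanes vl..VLMAX_MODEL-1 are the tail. With the undisturbed policy they
 * keep their values; the agnostic policy is modelled here as overwriting
 * them with NaN. */
static void fmacc_model(float acc[VLMAX_MODEL], const float *x, const float *y,
                        size_t vl, int tail_undisturbed)
{
    for (size_t i = 0; i < vl; i++)
        acc[i] += x[i] * y[i];
    if (!tail_undisturbed)
        for (size_t i = vl; i < VLMAX_MODEL; i++)
            acc[i] = NAN;
}

/* Strip-mined dot product that reduces the accumulator over all
 * VLMAX_MODEL lanes after the loop, as a vector kernel would. */
static float dot_model(const float *x, const float *y, size_t n,
                       int tail_undisturbed)
{
    float acc[VLMAX_MODEL] = {0};
    for (size_t i = 0; i < n;) {
        size_t vl = (n - i < VLMAX_MODEL) ? n - i : VLMAX_MODEL;
        fmacc_model(acc, x + i, y + i, vl, tail_undisturbed);
        i += vl;
    }
    float sum = 0.0f;
    for (size_t i = 0; i < VLMAX_MODEL; i++)
        sum += acc[i];
    return sum;
}
```

With n = 6 the last iteration runs with vl = 2, so lanes 2 and 3 still hold partial sums from the first iteration. In this model the undisturbed policy preserves them, while the modelled agnostic policy destroys them and the final reduction returns NaN.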
Octavian Maghiar
1e4a3a2b5e
Fixes RVV masked intrinsics for izamax/izamin kernels
2 years ago
Octavian Maghiar
e1958eb705
Fixes RVV masked intrinsics for iamax/iamin/imax/imin kernels
Changes masked intrinsics from _m to _mu and reintroduces maskedoff argument.
2 years ago
Martin Kroeker
62f0f506ec
Merge pull request #4049 from sh-zheng/risc-v
Add rvv support for zsymv and activate rvv support for zhemv
2 years ago
ZhengSh
2a8bc38cdc
Merge branch 'xianyi:risc-v' into risc-v
2 years ago
Martin Kroeker
5147831f25
Merge pull request #4074 from HellerZheng/risc-v
fix wrong vr = VFMVVF_FLOAT(0, vl); in symv_L_rvv.c and symv_U_rvv.c
2 years ago
Heller Zheng
0954746380
remove argument unused during compilation.
fix wrong vr = VFMVVF_FLOAT(0, vl);
2 years ago
sh-zheng
d3bf5a5401
Combine two reduction operations of zhe/symv into one, with tail undisturbed set.
3 years ago
sh-zheng
18d7afe69d
Add rvv support for zsymv and activate rvv support for zhemv
3 years ago
Zhang Xianyi
30222d0832
Merge pull request #3971 from HellerZheng/risc-v
RISC-V for new intrinsic API changes
3 years ago
Heller Zheng
6b74bee2f9
Update TARGET=x280 description.
3 years ago
Heller Zheng
1374a2d08b
This PR adapts latest spec changes
Add prefix (_riscv) for all riscv intrinsics
Update some intrinsics' parameters, such as vfredxxxx and vmerge
3 years ago
Zhang Xianyi
19f17c8bc6
Merge pull request #3893 from HellerZheng/develop
add riscv level3 C,Z kernel functions.
3 years ago
Zhang Xianyi
20511dfa65
Merge pull request #3919 from sergei-lewis/risc-v-latest-rvv-intrinsics
update riscv intrinsics for latest spec
3 years ago
Sergei Lewis
9b61be4545
factoring riscv64/dot.c fix into separate PR as requested
3 years ago
Sergei Lewis
2406958629
* update intrinsics to match latest spec at https://github.com/riscv-non-isa/rvv-intrinsic-doc (in particular, __riscv_ prefixes for rvv intrinsics)
* fix multiple numerical stability and corner case issues
* add a script to generate arbitrary gemm kernel shapes
* add a generic zvl256b target to demonstrate large gemm kernel unrolls
3 years ago
Heller Zheng
63cf4d0166
add riscv level3 C,Z kernel functions.
3 years ago
Xianyi Zhang
c19dff0a31
Fix T-Head RVV intrinsic API changes.
3 years ago
Xianyi Zhang
d9993e21a2
Refs #3825 Merge branch 'HellerZheng-develop' into risc-v
3 years ago
Xianyi Zhang
e5313f53d5
Merge branch 'develop' of https://github.com/HellerZheng/OpenBLAS_riscv_x280 into HellerZheng-develop
3 years ago
Xianyi Zhang
e284c048df
Merge branch 'develop' into risc-v
3 years ago
Martin Kroeker
0a24f631e9
Merge pull request #3844 from Mousius/switch-ratio-16
Set SWITCH_RATIO for Arm(R) Neoverse(TM) V1 CPUs
3 years ago
Martin Kroeker
65984fbe68
Merge pull request #3847 from bartoldeman/scal-benchmark
scal benchmark: eliminate y, move init/timing out of loop
3 years ago
Martin Kroeker
f6f0d13b9f
Merge pull request #3842 from Mousius/sve-dot
Add SVE implementation for sdot/ddot
3 years ago
Chris Sidebottom
eea006a688
Wrap SVE header with __has_include check
3 years ago