OpenBLAS

Commit Graph

Author	SHA1	Message	Date
Martin Kroeker	cf80bd8500	Update nrm2_rvv.c	2 years ago
Martin Kroeker	9baa757905	Update nrm2_vector.c	2 years ago
Martin Kroeker	18a6db6862	Update nrm2_vector.c	2 years ago
Martin Kroeker	3752e73919	handle incx < 0	2 years ago
Martin Kroeker	db70c7f7fb	handle incx < 0	2 years ago
Martin Kroeker	dee8557d58	handle incx < 0	2 years ago
Martin Kroeker	d9dff17aec	handle incx < 0	2 years ago
Martin Kroeker	6b89e1f1d7	fix loop condition for incx < 0	2 years ago
Martin Kroeker	20016a0096	fix loop condition for incx < 0	2 years ago
Sergei Lewis	ba17758c02	fix axpy implementations where y has a stride of 0	2 years ago
Sergei Lewis	ff1523163f	Fix axpy test hangs when n==0. Reenable zaxpy_vector kernel for C910V.	2 years ago
Martin Kroeker	6d8a273cca	Handle zero increment(s) in C910V ?AXPBY (#4483)	2 years ago
Martin Kroeker	4d8dee508c	temporarily disable the CAXPY/ZAXPY kernels	2 years ago
Sergei Lewis	a3b0ef6596	Restore riscv64 fixes from develop branch: dot product double precision accumulation, zscal NaN handling	2 years ago
Sergei Lewis	1093def0d1	Merge branch 'risc-v' into develop	2 years ago
Martin Kroeker	889c5d026a	Merge pull request #4456 from kseniyazaytseva/riscv-rvv10 Fix BLAS and LAPACK tests for RVV 1.0 target, update to 0.12.0 intrinsics	2 years ago
Martin Kroeker	4e2a32ff51	Merge pull request #4454 from kseniyazaytseva/riscv-rvv07 Fix BLAS and LAPACK tests for C910V and RISCV64_ZVL256B targets	2 years ago
Martin Kroeker	a21b2fa5e4	Merge pull request #4452 from kseniyazaytseva/riscv-generic Fix BLAS, BLAS-like functions and Generic RISC-V kernels	2 years ago
Andrey Sokolov	9c49a81d54	Resolve conflicts	2 years ago
kseniyazaytseva	e1afb23811	Fix BLAS and LAPACK tests for C910V and RISCV64_ZVL256B targets * Fixed bugs in dgemm, [a]min\max, asum kernels * Added zero checks for BLAS kernels * Added dsdot implementation for RVV 0.7.1 * Fixed bugs in _vector files for C910V and RISCV64_ZVL256B targets * Added additional definitions for RISCV64_ZVL256B target	3 years ago
Octavian Maghiar	deecfb1a39	Merge branch 'risc-v' into img-riscv64-zvl128b	2 years ago
kseniyazaytseva	5222b5fc18	Added axpby kernels for GENERIC RISC-V target	2 years ago
kseniyazaytseva	ff41cf5c49	Fix BLAS, BLAS-like functions and Generic RISC-V kernels * Fixed gemmt, imatcopy, zimatcopy_cnc functions * Fixed cblas_cscal testing in ctest * Removed rotmg unreachable code * Added zero size checks	3 years ago
kseniyazaytseva	b193ea3d7b	Fix BLAS and LAPACK tests for RVV 1.0 target, update to 0.12.0 intrinsics * Update intrinsics API to 0.12.0 version (Stride Segment Loads/Stores) * Fixed nrm2, axpby, ncopy, zgemv and scal kernels * Added zero size checks	2 years ago
Martin Kroeker	88e994116c	Merge pull request #4354 from imaginationtech/img-rvv-kernel-generator [RISC-V] Improve RVV kernel generator LMUL usage	2 years ago
Sergei Lewis	9edb805e64	fix builds with t-head toolchains that use old versions of the intrinsics spec	2 years ago
Martin Kroeker	f637e12713	Handle INF and NAN	2 years ago
Martin Kroeker	f0808d856b	Handle NAN in input	2 years ago
Octavian Maghiar	4a12cf53ec	[RISC-V] Improve RVV kernel generator LMUL usage The RVV kernel generation script uses the provided LMUL to increase the number of accumulator registers. Since the effect of the LMUL is to group together the vector registers into larger ones, it actually should be used as a multiplier in the calculation of vlenmax. At the moment, no matter what LMUL is provided, the generated kernels would only set the maximum number of vector elements equal to VLEN/SEW. Commit changes the use of LMUL to properly adjust vlenmax. Note that an increase in LMUL results in a decrease in the number of effective vector registers.	2 years ago
Octavian Maghiar	e4586e81b8	[RISC-V] Add RISC-V Vector 128-bit target Current RVV x280 target depends on vlen=512-bits for Level 3 operations. Commit adds generic target that supports vlen=128-bits. New target uses the same scalable kernels as x280 for Level 1&2 operations, and autogenerated kernels for Level 3 operations. Functional correctness of Level 3 operations tested on vlen=128-bits using QEMU v8.1.1 for ctests and BLAS-Tester.	2 years ago
Martin Kroeker	a34a0a7abc	Allow negative INCX (API change from version 3.10 of the reference implementation)	2 years ago
Octavian Maghiar	826a9d5fa4	Adds tail undisturbed for RVV Level 2 operations During the last iteration of some RVV operations, accumulators can get overwritten when VL < VLMAX and tail policy is agnostic. Commit changes intrinsics tail policy to undisturbed.	2 years ago
Octavian Maghiar	8df0289db6	Adds tail undisturbed for RVV Level 1 operations During the last iteration of some RVV operations, accumulators can get overwritten when VL < VLMAX and tail policy is agnostic. Commit changes intrinsics tail policy to undisturbed.	2 years ago
Martin Kroeker	76ef1672f8	Override DSDOT with generic code to get rid of qemu precision error	2 years ago
Octavian Maghiar	1e4a3a2b5e	Fixes RVV masked intrinsics for izamax/izamin kernels	2 years ago
Octavian Maghiar	e1958eb705	Fixes RVV masked intrinsics for iamax/iamin/imax/imin kernels Changes masked intrinsics from _m to _mu and reintroduces maskedoff argument.	2 years ago
Xianyi Zhang	e14a025bb1	Temporarily work around zaxpy vector kernel bug.	2 years ago
Martin Kroeker	772b0cc715	Fix early bailout	2 years ago
Martin Kroeker	d6be5036d7	Fix IDAMAX	2 years ago
Martin Kroeker	1fe96f8da7	Fix failures to handle increments of zero	2 years ago
Martin Kroeker	73b30b1dec	Fix VLEV_FLOAT/VSEV_FLOAT macros to compile with t-head 2.6.1	2 years ago
ZhengSh	2a8bc38cdc	Merge branch 'xianyi:risc-v' into risc-v	2 years ago
Heller Zheng	0954746380	remove argument unused during compilation. fix wrong vr = VFMVVF_FLOAT(0, vl);	2 years ago
sh-zheng	d3bf5a5401	Combine two reduction operations of zhe/symv into one, with tail undisturbed set.	3 years ago
sh-zheng	18d7afe69d	Add rvv support for zsymv and active rvv support for zhemv	3 years ago
Heller Zheng	1374a2d08b	This PR adapts latest spec changes Add prefix (_riscv) for all riscv intrinsics Update some intrinsics' parameter, like vfredxxxx, vmerge	3 years ago
Zhang Xianyi	19f17c8bc6	Merge pull request #3893 from HellerZheng/develop add riscv level3 C,Z kernel functions.	3 years ago
Sergei Lewis	cb0a70e0e2	dot.c early bail fix	3 years ago
Sergei Lewis	9b61be4545	factoring riscv64/dot.c fix into separate PR as requested	3 years ago
Sergei Lewis	2406958629	* update intrinsics to match latest spec at https://github.com/riscv-non-isa/rvv-intrinsic-doc (in particular, __riscv_ prefixes for rvv intrinsics) * fix multiple numerical stability and corner case issues * add a script to generate arbitrary gemm kernel shapes * add a generic zvl256b target to demonstrate large gemm kernel unrolls	3 years ago


69 Commits (d421dec2781563563b8418f06080bdfd5c554ce2)
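
Note on commit 4a12cf53ec: the message describes LMUL as a multiplier on the per-register element count VLEN/SEW, at the cost of fewer effective vector registers. The arithmetic can be sketched as below. This is an illustration only, not code from the OpenBLAS kernel generator: the function names `vlenmax` and `register_groups` are hypothetical, LMUL is assumed to be an integer (1, 2, 4 or 8), and the count of 32 architectural vector registers is taken from the RVV specification rather than from the commit message.

```python
# Illustrative sketch only; these helpers are hypothetical and are not
# taken from the OpenBLAS RVV kernel generator script.

NUM_VECTOR_REGISTERS = 32  # architectural vector registers in RVV (assumption stated above)


def vlenmax(vlen_bits, sew_bits, lmul):
    """Maximum elements per register group: (VLEN / SEW) * LMUL."""
    return (vlen_bits // sew_bits) * lmul


def register_groups(lmul):
    """Number of register groups left when registers are grouped LMUL at a time."""
    return NUM_VECTOR_REGISTERS // lmul


if __name__ == "__main__":
    # Example: VLEN = 128 bits, SEW = 64 bits (double precision).
    for lmul in (1, 2, 4, 8):
        print(f"LMUL={lmul}: vlenmax={vlenmax(128, 64, lmul)}, "
              f"register groups={register_groups(lmul)}")
```

With VLEN = 128 and SEW = 64, going from LMUL = 1 to LMUL = 4 raises the element count per group from 2 to 8 while the number of available register groups drops from 32 to 8, which is the trade-off the commit message points out.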