Rohit Goswami
76be8f851d
BLD: Add the ? variant for kernel
1 year ago
Rohit Goswami
01717ce320
ENH: Use the kernel style
Necessary to extend this to L2/L3
1 year ago
Rohit Goswami
cced76830e
MAINT: Cleanup kernel meson
1 year ago
Rohit Goswami
91b355e953
MAINT: Fix filepaths for q variants [L1]
1 year ago
Rohit Goswami
dcf05e00d4
MAINT: Cleanup a bit
1 year ago
Rohit Goswami
85db158f02
MAINT: Minor refactors to have common precisions
1 year ago
Rohit Goswami
28bfd1b3e5
MAINT: Simplify and generalize
1 year ago
Rohit Goswami
5a7a5a4e55
MAINT: Move the precisions out to main meson.build
1 year ago
Rohit Goswami
97861ab436
MAINT: Cleanup makefile to meson for parallel opt
Needs some work
1 year ago
Rohit Goswami
ec9f6504d6
MAINT: Cleanup undefined symbols
1 year ago
Rohit Goswami
33e66c5400
MAINT,BLD: Cleanup SIMD with meson arrays
1 year ago
Rohit Goswami
61aab3ce11
MAINT: Move -m64 out to cpu_family()
1 year ago
Rohit Goswami
9d9b4337ad
MAINT: Add simd flags
1 year ago
Rohit Goswami
34cf7fd754
MAINT: Generalize and setup F_INTERFACE
1 year ago
Rohit Goswami
10481ed4f4
MAINT: Rework make defines to meson arguments
For SMALL_MATRIX_OPT and MAX_STACK_ALLOC
1 year ago
Rohit Goswami
5a1dba3346
TMP: Focus on getting a single test example up
Use:
nm -gC bbdir/libopenblas.a | grep drot
❯ gcc trial.c -o trail -I$(pwd)/tmpmake/include -L$(pwd)/bbdir -lopenblas -Wl,--verbose | grep openblas
❯ ./trail
Resulting vectors:
x: 3.000000 4.000000 5.000000 6.000000
y: 2.000000 2.000000 2.000000 2.000000
1 year ago
Rohit Goswami
523a57f985
BLD: Add generic BLAS2 modes
1 year ago
Rohit Goswami
e91b0216cd
ENH: Add more L2 flags
1 year ago
Rohit Goswami
552f81045d
BLD: Add swap and refactor a bit
1 year ago
Rohit Goswami
c76e7c6b95
TMP: Be more DRY
1 year ago
Rohit Goswami
e9a3897174
ENH: Start abstracting rules for kernels
1 year ago
Martin Kroeker
a875304eb0
fix inverted conditional for NAN handling
1 year ago
Martin Kroeker
24acdd6bbb
correct offset
1 year ago
Martin Kroeker
fb7c53c5e5
Merge pull request #4807 from martin-frbg/scalfixes
[WIP] Make NAN handling in the SCAL kernels depend on the dummy2 parameter
1 year ago
Martin Kroeker
15c53dd2e0
Merge pull request #4794 from XiWeiGu/Fixed_Numpy_CI_Test
Try to fix numpy CI test failures
1 year ago
Martin Kroeker
a4e56e0452
Merge pull request #4806 from Mousius/small-gemm
Small GEMM for AArch64 with SVE
1 year ago
yamazaki-mitsufumi
88caf02f62
Fix ambiguous error on Mac OS
1 year ago
Martin Kroeker
b613754143
Update scal..c
1 year ago
Martin Kroeker
f5d04318e3
Merge branch 'OpenMathLib:develop' into scalfixes
1 year ago
Martin Kroeker
73f8866ffb
make NAN handling depend on DUMMY2 parameter
1 year ago
Martin Kroeker
dfbc2348a8
fix NAN handling
1 year ago
Martin Kroeker
c064319ecb
fix alpha=NAN case
1 year ago
Martin Kroeker
c2ffd90e8c
make NAN handling depend on dummy2 parameter
1 year ago
Chris Sidebottom
ea4ab3b310
Better header guard around bridge
1 year ago
Chris Sidebottom
7311d93016
Unroll TT further
1 year ago
Martin Kroeker
a815594fd1
Merge pull request #4801 from markdryan/markdryan/riscv-dynamic-arch
Add autodetection for riscv64
1 year ago
Martin Kroeker
dd6c33d34d
make NAN handling depend on dummy2 parameter
1 year ago
Hong Bo Peng
db98f8753f
Try to fix LAPACK testing failures on P7.
1. Remove the FADD insn from the GEMV Transpose code.
2. Remove the FADD insn from GEMM and ZGEMM code.
3. Reorder the computation of the imaginary part in ZGEMM code.
1 year ago
Chris Sidebottom
a9edddb695
Unroll TN further
1 year ago
Chris Sidebottom
9984c5ce9d
Clean up k2 removal more and unroll SGEMM more
1 year ago
Chris Sidebottom
b1c9fafabb
Remove k2 loop from DGEMM TN and use a more conservative heuristic for SGEMM
1 year ago
Martin Kroeker
2020569705
fix NAN handling and make it depend on dummy2 parameter
1 year ago
Martin Kroeker
3870995f01
make NAN handling depend on dummy2 parameter
1 year ago
Martin Kroeker
7284c533b5
make NAN handling depend on dummy2 parameter
1 year ago
Martin Kroeker
73751218a4
make NAN handling depend on dummy2 parameter
1 year ago
Martin Kroeker
b9bfc8ce09
make NAN handling depend on dummy2 parameter
1 year ago
Martin Kroeker
eb4879e04c
make NAN handling depend on the dummy2 parameter
1 year ago
Martin Kroeker
ee87cb90d0
Merge pull request #4803 from iha-taisei/SVESupportSDGEMV
A64FX: Add support for SVE to SGEMV/DGEMV kernels.
1 year ago
gxw
34b80ce03f
mips64: Fixed numpy CI failure
1 year ago
gxw
f6d6c14a96
mips: Fixed numpy CI failure
1 year ago