Rohit Goswami
|
3601cb2709
|
MAINT: Cleanup meson builds
|
1 year ago |
Rohit Goswami
|
e87fcdc360
|
MAINT: Cleanup interface build and be generic
For the L1 interface symbols at any rate
|
1 year ago |
Rohit Goswami
|
5a7a5a4e55
|
MAINT: Move the precisions out to main meson.build
|
1 year ago |
Rohit Goswami
|
97861ab436
|
MAINT: Cleanup makefile to meson for parallel opt
Needs some work
|
1 year ago |
Rohit Goswami
|
ec9f6504d6
|
MAINT: Cleanup undefined symbols
|
1 year ago |
Rohit Goswami
|
33e66c5400
|
MAINT,BLD: Cleanup SIMD with meson arrays
|
1 year ago |
Rohit Goswami
|
61aab3ce11
|
MAINT: Move -m64 out to cpu_family()
|
1 year ago |
Rohit Goswami
|
9d9b4337ad
|
MAINT: Add simd flags
|
1 year ago |
Rohit Goswami
|
34cf7fd754
|
MAINT: Generalize and setup F_INTERFACE
|
1 year ago |
Rohit Goswami
|
10481ed4f4
|
MAINT: Rework make defines to meson arguments
For SMALL_MATRIX_OPT and MAX_STACK_ALLOC
|
1 year ago |
Rohit Goswami
|
5a1dba3346
|
TMP: Focus on getting a single test example up
Use:
nm -gC bbdir/libopenblas.a | grep drot
❯ gcc trial.c -o trail -I$(pwd)/tmpmake/include -L$(pwd)/bbdir -lopenblas -Wl,--verbose | grep openblas
❯ ./trail
Resulting vectors:
x: 3.000000 4.000000 5.000000 6.000000
y: 2.000000 2.000000 2.000000 2.000000
|
1 year ago |
Rohit Goswami
|
75ea24cdea
|
MAINT: Refactor and rename
|
1 year ago |
Rohit Goswami
|
0043edb066
|
MAINT: Try a better way to grab archs
|
1 year ago |
Rohit Goswami
|
69dd74dedf
|
MAINT: Add a bit on generating additional defines
|
1 year ago |
Rohit Goswami
|
6818dd821e
|
MAINT: Try a double loop for configs
In interface
|
1 year ago |
Rohit Goswami
|
88f37df443
|
BLD: Try working on building the interface
With inputs from @eli-schwartz
|
1 year ago |
gxw
|
f3cebb3ca3
|
x86: Fixed numpy CI failure when the target is ZEN.
|
1 year ago |
Martin Kroeker
|
2f12a47405
|
fix build options for CAXPYC/ZAXPYC
|
1 year ago |
Martin Kroeker
|
db9f7bc552
|
fix float array types to include bfloat16
|
1 year ago |
Martin Kroeker
|
076766df4e
|
Update CMakeLists.txt
|
1 year ago |
Martin Kroeker
|
ff6670cb83
|
don't generate non-cblas files for gemm_batch
|
1 year ago |
Martin Kroeker
|
362a063396
|
remove return value
|
1 year ago |
Martin Kroeker
|
89c7bbcba6
|
add cblas_?gemm_batch
|
1 year ago |
Martin Kroeker
|
2957281275
|
Introduce a lower limit for multithreading
|
1 year ago |
Martin Kroeker
|
5fd871d7ea
|
Introduce a lower limit for multithreading
|
1 year ago |
gxw
|
637c650f4f
|
loongarch64: Add buffer offset for target LOONGSON3R5
|
1 year ago |
Martin Kroeker
|
93d975d8fd
|
Merge pull request #4593 from XiWeiGu/loongarch_add_buffer_offset
loongarch: Optimizing the performance of the GEMM on servers
|
1 year ago |
gxw
|
d8c4ea8793
|
loongarch: Optimizing the performance of the GEMM on servers
|
1 year ago |
Martin Kroeker
|
d277c6d15b
|
Merge pull request #4585 from martin-frbg/issue1881
Cap the number of parallel threads for GEMM, GETRF and POTRF to ensure sensible workloads on big systems
|
1 year ago |
Igor Zhuravlov
|
22d305e2df
|
fix dtrtrs_ and ztrtrs_ to accept case-insensitive parameters uplo and diag
Changes to be committed:
modified: interface/lapack/trtrs.c
modified: interface/lapack/ztrtrs.c
|
1 year ago |
Martin Kroeker
|
68ab5185d0
|
Update potrf.c
|
1 year ago |
Martin Kroeker
|
19b29b3448
|
Update getrf.c
|
1 year ago |
Martin Kroeker
|
a3354a7630
|
Cap the number of parallel threads
|
1 year ago |
Martin Kroeker
|
5da4c93ef2
|
Cap the number of parallel threads
|
1 year ago |
Martin Kroeker
|
496106642f
|
Cap the number of parallel threads
|
1 year ago |
Martin Kroeker
|
cb8131cfd9
|
Merge pull request #4499 from kseniyazaytseva/new-tests
Tests for BLAS-like and BLAS API
|
1 year ago |
Martin Kroeker
|
baf88564bc
|
Fix potential buffer overflow
|
1 year ago |
kseniyazaytseva
|
7e9b1c0807
|
fix uninitialized data usage
|
2 years ago |
kseniyazaytseva
|
c6f30fd414
|
check for zero inc
|
2 years ago |
kseniyazaytseva
|
5e9ead09ac
|
fix info return
|
2 years ago |
Martin Kroeker
|
500ac4de5e
|
fix incompatible pointer types
|
2 years ago |
Martin Kroeker
|
d4db6a9f16
|
Separate the interface for SBGEMMT from GEMMT due to differences in GEMV arguments
|
2 years ago |
Martin Kroeker
|
68d354814f
|
Fix incompatible pointer type in BFLOAT16 mode
|
2 years ago |
Sergei Lewis
|
3ffd6868d7
|
Merge branch 'develop' into dev/slewis/merge-from-riscv
|
2 years ago |
Martin Kroeker
|
47bd064763
|
Fix names in build rules
|
2 years ago |
Martin Kroeker
|
a7d004e820
|
Fix CBLAS prototype
|
2 years ago |
Martin Kroeker
|
b54cda8490
|
Unify creation of CBLAS interfaces for ?AMIN/?AMAX and C/ZAXPYC between gmake and cmake builds
|
2 years ago |
Sergei Lewis
|
1093def0d1
|
Merge branch 'risc-v' into develop
|
2 years ago |
kseniyazaytseva
|
f89e0034a4
|
Fix LAPACK usage from BLAS
|
2 years ago |
Martin Kroeker
|
f7cf637d7a
|
redo lost edit
|
2 years ago |