Rohit Goswami
|
c8d1599411
|
BUG: Build the cblas_ wrappers correctly
|
1 year ago |
Rohit Goswami
|
b5fe9ba789
|
MAINT: Minor lint
|
1 year ago |
Rohit Goswami
|
b4df6e89bc
|
ENH: Finalize symbols in interface
From the Makefile, with notes on missing symbols
The missing symbols are not built in the standard make build either.
|
1 year ago |
Rohit Goswami
|
36c53a6841
|
MAINT,BLD: Add a note for later
Generate fewer non-standard cblas_ symbols
|
1 year ago |
Rohit Goswami
|
ea55287038
|
ENH,BLD: Finalize interface meson.build
|
1 year ago |
Rohit Goswami
|
998602062a
|
BLD: Add the cblas dictionary for the interface
|
1 year ago |
Rohit Goswami
|
5b497b6a32
|
BLD: Add all interface symbols
|
1 year ago |
Rohit Goswami
|
62123d0491
|
BLD: Rewrite more of the interface
|
1 year ago |
Rohit Goswami
|
bc966e1209
|
BUG,MAINT: Compile the base symbol too
|
1 year ago |
Rohit Goswami
|
3601cb2709
|
MAINT: Cleanup meson builds
|
1 year ago |
Rohit Goswami
|
e87fcdc360
|
MAINT: Cleanup interface build and be generic
For the L1 interface symbols at any rate
|
1 year ago |
Rohit Goswami
|
5a7a5a4e55
|
MAINT: Move the precisions out to main meson.build
|
1 year ago |
Rohit Goswami
|
97861ab436
|
MAINT: Cleanup makefile to meson for parallel opt
Needs some work
|
1 year ago |
Rohit Goswami
|
ec9f6504d6
|
MAINT: Cleanup undefined symbols
|
1 year ago |
Rohit Goswami
|
33e66c5400
|
MAINT,BLD: Cleanup SIMD with meson arrays
|
1 year ago |
Rohit Goswami
|
61aab3ce11
|
MAINT: Move -m64 out to cpu_family()
|
1 year ago |
Rohit Goswami
|
9d9b4337ad
|
MAINT: Add simd flags
|
1 year ago |
Rohit Goswami
|
34cf7fd754
|
MAINT: Generalize and setup F_INTERFACE
|
1 year ago |
Rohit Goswami
|
10481ed4f4
|
MAINT: Rework make defines to meson arguments
For SMALL_MATRIX_OPT and MAX_STACK_ALLOC
|
1 year ago |
Rohit Goswami
|
5a1dba3346
|
TMP: Focus on getting a single test example up
Use:
nm -gC bbdir/libopenblas.a | grep drot
❯ gcc trial.c -o trail -I$(pwd)/tmpmake/include -L$(pwd)/bbdir -lopenblas -Wl,--verbose | grep openblas
❯ ./trail
Resulting vectors:
x: 3.000000 4.000000 5.000000 6.000000
y: 2.000000 2.000000 2.000000 2.000000
|
1 year ago |
Rohit Goswami
|
75ea24cdea
|
MAINT: Refactor and rename
|
1 year ago |
Rohit Goswami
|
0043edb066
|
MAINT: Try a better way to grab archs
|
1 year ago |
Rohit Goswami
|
69dd74dedf
|
MAINT: Add a bit on generating additional defines
|
1 year ago |
Rohit Goswami
|
6818dd821e
|
MAINT: Try a double loop for configs
In interface
|
1 year ago |
Rohit Goswami
|
88f37df443
|
BLD: Try working on building the interface
With inputs from @eli-schwartz
|
1 year ago |
gxw
|
f3cebb3ca3
|
x86: Fixed numpy CI failure when the target is ZEN.
|
1 year ago |
Martin Kroeker
|
2f12a47405
|
fix build options for CAXPYC/ZAXPYC
|
1 year ago |
Martin Kroeker
|
db9f7bc552
|
fix float array types to include bfloat16
|
1 year ago |
Martin Kroeker
|
076766df4e
|
Update CMakeLists.txt
|
1 year ago |
Martin Kroeker
|
ff6670cb83
|
don't generate non-cblas files for gemm_batch
|
1 year ago |
Martin Kroeker
|
362a063396
|
remove return value
|
1 year ago |
Martin Kroeker
|
89c7bbcba6
|
add cblas_?gemm_batch
|
1 year ago |
Martin Kroeker
|
2957281275
|
Introduce a lower limit for multithreading
|
1 year ago |
Martin Kroeker
|
5fd871d7ea
|
Introduce a lower limit for multithreading
|
1 year ago |
gxw
|
637c650f4f
|
loongarch64: Add buffer offset for target LOONGSON3R5
|
1 year ago |
Martin Kroeker
|
93d975d8fd
|
Merge pull request #4593 from XiWeiGu/loongarch_add_buffer_offset
loongarch: Optimizing the performance of the GEMM on servers
|
1 year ago |
gxw
|
d8c4ea8793
|
loongarch: Optimizing the performance of the GEMM on servers
|
1 year ago |
Martin Kroeker
|
d277c6d15b
|
Merge pull request #4585 from martin-frbg/issue1881
Cap the number of parallel threads for GEMM, GETRF and POTRF to ensure sensible workloads on big systems
|
1 year ago |
Igor Zhuravlov
|
22d305e2df
|
fix dtrtrs_ and ztrtrs_ to accept case-insensitive parameters uplo and diag
Changes to be committed:
modified: interface/lapack/trtrs.c
modified: interface/lapack/ztrtrs.c
|
1 year ago |
Martin Kroeker
|
68ab5185d0
|
Update potrf.c
|
1 year ago |
Martin Kroeker
|
19b29b3448
|
Update getrf.c
|
1 year ago |
Martin Kroeker
|
a3354a7630
|
Cap the number of parallel threads
|
1 year ago |
Martin Kroeker
|
5da4c93ef2
|
Cap the number of parallel threads
|
1 year ago |
Martin Kroeker
|
496106642f
|
Cap the number of parallel threads
|
1 year ago |
Martin Kroeker
|
cb8131cfd9
|
Merge pull request #4499 from kseniyazaytseva/new-tests
Tests for BLAS-like and BLAS API
|
1 year ago |
Martin Kroeker
|
baf88564bc
|
Fix potential buffer overflow
|
1 year ago |
kseniyazaytseva
|
7e9b1c0807
|
fix uninitialized data usage
|
2 years ago |
kseniyazaytseva
|
c6f30fd414
|
check for zero inc
|
2 years ago |
kseniyazaytseva
|
5e9ead09ac
|
fix info return
|
2 years ago |
Martin Kroeker
|
500ac4de5e
|
fix incompatible pointer types
|
2 years ago |