nihui
eed257df1f
ci update llvmpipe (#5954)
* check image fp16
1 year ago
nihui
19caca3140
port rvv intrinsic 1.0+ (#5642)
* zfh zvfh xtheadvector infra
* dispatch for rvv and xtheadvector
* dispatch for non-vector zfh
* port xtheadvector recp rsqrt trunc
* general rvv gemm
* c906 and c910 ci
* old tuple code clean
* update riscv64 ci
* update build doc
* drop old th1520 toolchain
1 year ago
nihui
8fe62812c9
arm neon optimization for layernorm fp32/bf16s/fp16s (#5746)
1 year ago
nihui
3752d71200
fix potential fp16s bf16s conflicts on arm vfpv4 (#5578)
* fix potential fp16s bf16s conflicts on armv7 vfpv4
* but prefer fp16 on armv8.2
1 year ago
nihui
da7d1a10f7
test x86 arm convolution oom (#5492)
* skip mips loongarch riscv oom test atm
* test softmax oom
1 year ago
nihui
08b7d99a75
rnn/lstm/gru dynamic quantization (#5435)
2 years ago
nihui
984d6dd844
promote vfpv4 for auto fp16 storage conversion (#5325)
* promote vfpv4 for auto fp16 storage conversion
* always report neon and vfpv4 for arm64
2 years ago
nihui
5329d32e74
check vulkan fp16 uniform support and implement lfp conversion without fp16u (#5287)
2 years ago
nihui
556b79ce4d
create layer decoupled (#5258)
* create layer decoupled
* no more virtual public
* allow build test with shared library
* decouple cpu vulkan
* drop old scripts
2 years ago