2596 Commits (9376ba71c1251f82994de14cee0a662677037c0b)
 

Author SHA1 Message Date
  nihui 9376ba71c1 less unroll for unaryop arm, fix padding arm warning 4 years ago
  li mengyang 3a2ac84e3c add benchmark for amd 5700g (#3878) 4 years ago
  nihui 7886e90c65 split arm82 source for smaller binary and memory footprint (#3877) 4 years ago
  nihui c1f9b03c0b unified arm absval clip relu dropout hardsigmoid hardswish sigmoid swish unaryop (#3876) 4 years ago
  nihui c5fed063f5 pnnx fuse expression skip foldable (#3875) 4 years ago
  nihui 40a69a2dd3 discard riscv weight memory (#3874) 4 years ago
  nihui 7441d8a7c8 eliminate noop cat for single input (#3871) 4 years ago
  nihui 06a36e9c1f discard weight memory for mips (#3869) 4 years ago
  nihui 241524ffce discard weight memory for x86 arm vulkan (#3865) 4 years ago
  nihui d2e87a8264 mips general optimization for convdw3x3 (#3859) 4 years ago
  nihui d373407bcb add c906 c910 v240 toolchain 4 years ago
  nihui 48fb166a48 mips loongson mmi optimization for convolution gemm int8 (#3855) 4 years ago
  nihui 667be10fb0 riscv general optimization for convolution sgemm and winograd and innerproduct (#3857) 4 years ago
  nihui c3adbcf9f3 mips optimization for convolution sgemm (#3853) 4 years ago
  nihui a5bcc8895f armv5 optimization for convolution sgemm (#3852) 4 years ago
  nihui 50d04dee30 optimize sgemm and winograd remain size register layout (#3851) 4 years ago
  nihui 0a4f50dbf4 arm neon assembly optimization for innerproduct unroll outch 4 (#3848) 4 years ago
  nihui 569ee37c52 arm neon optimization for pooling bf16s, fix some bf16s packing issue, relax winograd transform intrinsic order (#3847) 4 years ago
  dependabot[bot] 781145f15b Bump pypa/cibuildwheel from 2.5.0 to 2.6.0 (#3844) 4 years ago
  nihui e49f0226e1 multi-threading rnn/lstm/gru with openmp (#3834) 4 years ago
  FeiGeChuanShu 02107e0fbf fix yolox input shape w!=h (#3839) 4 years ago
  nihui e7a664c6e5 convert pnnx torch.index_select and torch.scatter_add (#3842) 4 years ago
  nihui 02a7e64e18 optimize x86 winograd input transform transpose (#3818) 4 years ago
  Yoh 2a05c69cdd fix x86 unaryop bug on gcc-4.4 (#3838) 4 years ago
  nihui 4c7965781f add pnnx ncnn pass for chain dict output (#3836) 4 years ago
  nihui f48d45209b binaryop type specialization (#3830) 4 years ago
  nihui 026a04f220 convert torch.norm to ncnn, fix F.normalize vector (#3828) 4 years ago
  tripleMu 6f4b444fe5 Fix .gitignore (#3824) 4 years ago
  nihui bf64d8f1ec fix winograd function name (#3820) 4 years ago
  陸 言 2161ab2a0c Edit _bias128 in scale_x86.cpp for useless if (#3821) 4 years ago
  nihui c16cac2678 update glslang, fix system glslang include path (#3819) 4 years ago
  nihui 9e11dac7d1 simpleomp with libgomp abi (#3816) 4 years ago
  BUG1989 c2fb93b6ff Add the benchmark of AX620A (#3813) 4 years ago
  nihui f79073c182 update how-to-build doc for raspberrypi and d1 4 years ago
  FeiGeChuanShu 617d23f6ce Add RK3588 benchmark (#3808) 4 years ago
  nihui 3a827434a9 optimize arm sgemm convolution condition (#3806) 4 years ago
  nihui 0daad605e0 fix make slice expression with dynamic parameters 4 years ago
  wzyforgit 46a6d9b422 Add Loongson and Sunway benchmark data (#3802) 4 years ago
  nihui 5cdb7f6617 make compiler optimization happy with loop (#3799) 4 years ago
  tpoisonooo 6fd801b6d7 feat(src/layer): add vision_transformer benchmark (#3730) 4 years ago
  nihui 817bd1fdc4 fix vs2022 ci (#3792) 4 years ago
  Xavier Hsinyuan 29b6a32ac0 RVV: follow intrinsic doc, replace vfredsum_* with vfredusum_* (#3790) 4 years ago
  nihui 2dc1ae45fe pnnx bitwise and compare op (#3791) 4 years ago
  nihui d476191ff1 pnnx export_onnx function (#3784) 4 years ago
  nihui 0aba8af1d3 pnnx swin transformer (#3783) 4 years ago
  村长大人 c697e988b0 fixbug: linux-arm-hisiv500 load mem "Bus error" (#3779) 4 years ago
  Zhiqiang Wang a7aa0fe70d Update README.md (#3778) 4 years ago
  Zhuo Zhang 3dfc10647c docs: update QQ group info (#3777) 4 years ago
  NaLan ZeYu 5388f9f312 test: fix printf arguments mismatch (#3774) 4 years ago
  jasonZhang 663b42e0d2 add tanh avx512 optimize (#3770) 4 years ago