157 Commits (4136de3b8d2bfaba0f758af0c9b7d1c726bdcef2)

Author SHA1 Message Date
  nihui 4136de3b8d
arm optimization for convolution int8 packed unified elempack (#5147) 2 years ago
  nihui 4494aadd74
deconvolution dynamic weight (#5119) 2 years ago
  nihui 80b3b9c6f0
arm optimization for convolution int8 winograd unified elempack (#5087) 2 years ago
  nihui c8662cce5e
arm optimization for convolution int8 gemm unified elempack (#5016) 2 years ago
  nihui 5ac17df797
arm optimization for packed convolution unified elempack (#4590) 3 years ago
  nihui c777bf09dc
arm convolution sgemm unified elempack (#4572) 3 years ago
  nihui dabc4c065f
arm convolution winograd unified elempack (#4556) 3 years ago
  nihui 9c6f1107d2
fix #4315 (#4316) 3 years ago
  nihui 9b8272e86d
arm edsp and arm neon optimization for convolution int8 winograd (#4017) 3 years ago
  nihui 20a14bf5ae
arm convolution winograd dot function, adjust arm convolution winograd strategy (#3915) 4 years ago
  nihui ca0ba4b25f
fine grained winograd options, adjust x86 convolution winograd strategy (#3908) 4 years ago
  Evgeny Proydakov 85e483e6ba
Fixed several compile warnings for ios build: (#3885) 4 years ago
  nihui 7886e90c65
split arm82 source for smaller binary and memory footprint (#3877) 4 years ago
  nihui 241524ffce
discard weight memory for x86 arm vulkan (#3865) 4 years ago
  nihui bf64d8f1ec
fix winograd function name (#3820) 4 years ago
  nihui 3a827434a9
optimize arm sgemm convolution condition (#3806) 4 years ago
  nihui 9298d05e86
split convolution winograd transform input output (#3688) 4 years ago
  nihui 3f2799d706
always build tightly packed weight, fix #3545 (#3547) 4 years ago
  nihui c0a94cd9ca
fix armv7 without neon (#3514) 4 years ago
  nihui 6941ec8fc9
arm neon optimization for general packed convolution (#3426) 4 years ago
  nihui 878cb713d5
optional arm82 dot source (#3415) 4 years ago
  nihui 999e640d43
dynamic convolution weight (#3408) 4 years ago
  nihui 24fbb6e8cb
honor thread setting on load and vulkan command, ci avx512 t4 (#3391) 4 years ago
  zhiliu6 814f89ef1a
Fuse HardSwish activation into Convolution and InnerProduct (#3233) 4 years ago
  nihui cdf45a6512
cmake option NCNN_BF16 (#3068) 4 years ago
  Evgeny Proydakov e01e965c68
Fixed compile warnings for clang compiler on MacOS ARM. [-Wunused-variable] (#3000) 5 years ago
  nihuini d6b2ea5aac arm neon optimization for convolution 3x3 on small channels 5 years ago
  nihuini 68468dccbd arm neon assembly optimization for padding int8 pack8, convolution int8 out elempack 4 5 years ago
  nihui 7e1aaa5828
cmake option NCNN_INT8 (#2839) 5 years ago
  nihuini 01f5dcb700 arm neon optimization for convolution sgemm pack1 pack8to1 int8 5 years ago
  nihuini e9ab1acf27 arm neon optimization for convolution sgemm pack1to8 int8 5 years ago
  nihuini e975de1f36 better condition for arm82 conv3x3s1 winograd 5 years ago
  nihui e9cc637573
arm neon optimization for int8 packing kernels (#2809) 5 years ago
  nihui 5fe75f19ef
architecture changes for int8 packing (#2771) 5 years ago
  nihui 3c92a1184b
arm neon optimization for general convolution im2col sgemm (#2668) 5 years ago
  nihui ab56083ca5
arm neon optimization for conv3x3s1 winograd42 (#2664) 5 years ago
  Zhuo Zhang a1e9993616
fix convolution_arm.cpp shadowed variables warning (#2448) 5 years ago
  nihui bf09af21be exp arm fp16sa neon optimization 5 years ago
  nihui 39bbb34ffc conv1x1s1 conv3x3s1 pack4 pack8to4 arm fp16sa neon assembly optimization 5 years ago
  nihuini 440db2c8fc conv1x1 pack4 arm fp16sa 5 years ago
  nihuini b5be1449d9 conv3x3s1 winograd pack8to4 arm fp16sa 5 years ago
  nihuini d17c26e925 conv1x1s1 pack4to8 pack8to4 arm fp16sa 5 years ago
  nihui db5f05c6f0 conv1x1s1 conv3x3s1 winograd pack8to1 arm fp16sa 5 years ago
  nihuini 9c33b6c1c8 conv1x1s1 arm fp16sa 5 years ago
  nihuini 30ff3800d8 conv5x5s1 conv5x5s2 pack8 arm fp16sa neon assembly optimization 5 years ago
  nihuini 62c453b16d conv7x7s2 pack1to8 arm fp16sa neon assembly optimization 5 years ago
  nihuini b5b486fbfa conv3x3s2 pack8 arm fp16sa neon assembly optimization 5 years ago
  nihui d8e9fc1443 conv3x3s1 conv3x3s2 pack1to8, padding pack8, relu pack8 arm neon fp16sa assembly optimization 5 years ago
  nihuini f6d808b090 crop pack8 arm fp16s, conv3x3s2 pack1to8 arm fp16sa intrinsic 5 years ago
  nihuini 5d5a3d1434 conv1x1s1 conv1x1s2 conv3x3s1 winograd pack8 arm fp16sa 5 years ago