nihuini
567e2bd501
a dirty hack for resolving int8 pack4 crash
6 years ago
bindog
9dfd1b05d3
fix bug in reduction ( #1321 )
6 years ago
nihuini
f8caef7691
add shufflenet_v2 benchmark
6 years ago
nihuini
517d1b2053
improve compatibility with megvii shufflenet model
6 years ago
nihuini
e8bb88830d
convert mxnet squeeze expanddims, convert onnx squeeze unsqueeze
6 years ago
nihuini
a6c60068e6
convert numpy style slice to crop
6 years ago
nihuini
009c4d9a75
convert mxnet reduce axis and keepdims, pad reflect, fix #739
6 years ago
nihuini
c4b84262d9
fix arm neon sgemm, fix #1283
6 years ago
nihuini
65ce6bccfd
faster weight transform for optimized kernel
6 years ago
BUG1989
69e2693c87
fix the bug of SMP cpu powersave not supported.
6 years ago
nihuini
cd4be6d0fa
call vulkan create_pipeline on the vkdev condition, drop opt_cpu hacks
6 years ago
nihuini
81a028547a
fix bus error on armv7
6 years ago
daichuanliang
6176ada9f0
update ncnn2int8.cpp ( #1315 )
Fix compile issue with ncnn2int8
6 years ago
nihuini
19d75955d6
arm neon assembly optimization for conv3x3s1 winograd pack4to1
6 years ago
bindog
04b4b02324
[WIP] add reduce op support for onnx ( #1308 )
* [WIP] add reduce op support for onnx
* extend reduction to support 1,2-dim reduction and keepdims
* fix compile error
* split type to 3 flags && split keepdims to another function
6 years ago
nihuini
22a2be4e6c
fix crop pack4 with reference blob
6 years ago
nihuini
6a8e5c58da
fix build on armv7
6 years ago
nihuini
e63e2449fd
arm neon assembly optimization for conv7x7s2 pack1to4
6 years ago
nihui
56fd26a2da
arm neon assembly optimization for conv1x1s1 pack4to1
6 years ago
nihui
7ad514917b
fix potential out-of-bounds write on unroll 12 remainder
6 years ago
nihuini
15e86dc8e9
reduce pack4 weight memory usage for specialized kernel, reduce runtime memory usage in conv3x3s1 winograd
6 years ago
nihuini
581a06d471
innerproduct pack4 always consumes a flattened blob, whose layout is the same as in the pack1 branch, so reuse the pack1 implementation to reduce memory usage
6 years ago
nihuini
c5f1dc3fe4
arm neon assembly optimization for conv3x3s1 pack4to1
6 years ago
nihui
2f8b31c3b4
unroll outch 2 for conv3x3s1 pack1to4
6 years ago
ShuangLiu1992
396b057248
fix opencv compatibility issue with major version > 2 for the quantisation tool ( #1302 )
* Update CMakeLists.txt
enable quant
* fix opencv compatibility issue with major version higher than 2
6 years ago
nihui
e0f6e3f669
pre-interleave 8-channel weight data on aarch64, conv1x1s1 version
6 years ago
nihuini
d11bf14d44
pre-interleave 8-channel weight data on aarch64
6 years ago
nihui
7173b6e38e
arm neon assembly optimization for conv3x3s2 pack4
6 years ago
nihuini
cf0c49dd71
arm neon assembly optimization for conv5x5s1 pack4 and conv5x5s2 pack4
6 years ago
bindog
53d54115a8
add split op support for onnx2ncnn ( #1299 )
6 years ago
nihui
9e529354fb
arm neon optimization for conv1x1s2 pack4
6 years ago
tpoisonooo
25979c1bdb
move ncnnoptimize doc to docs directory ( #1297 )
6 years ago
nihuini
f8f3b0b5aa
shufflechannel pack4
6 years ago
nihuini
50d5896ce7
reshape pack4
6 years ago
nihuini
624291e2b2
use subop optimization for group convolution deconvolution pack4 family
6 years ago
nihui
48e3e7d49c
move neon activation into a wrapper function
6 years ago
nihui
8c1b87b1a2
fallback to cpu if no vulkan device found
6 years ago
nihui
b37ecab630
auto flatten before innerproduct pack4
6 years ago
nihui
afd1f08194
arm neon assembly optimization for pooling2x2s2 max pack4
6 years ago
nihui
e19b7097df
arm neon assembly optimization for conv3x3s1 pack1to4
6 years ago
nihui
3ac6335ba3
hardsigmoid and hardswish pack4
6 years ago
nihui
21e74487b4
arm neon optimization for convdw5x5 pack4
6 years ago
volvet
ecd64fb36b
Fixed lots of compile warnings ( #1286 )
* Fixed lots of compile warnings
* refine the unused warning change
6 years ago
nihui
3e1bad4880
arm neon assembly optimization for pooling3x3s2 max pack4
6 years ago
nihui
08a97c169f
arm neon assembly optimization for relu pack4
6 years ago
nihui
a1bd88fb4a
arm neon assembly optimization for padding constant pack4
6 years ago
nihui
17f343e7e4
convdw3x3 pack4 arm neon assembly optimization
6 years ago
nihui
6703286831
the very long ld1, one less load
6 years ago
nihui
22a3ade6ce
unroll size 12 for conv1x1s1 and conv3x3s1 winograd pack4 on aarch64
6 years ago
tpoisonooo
e5eb3e427c
Update CMakeLists.txt ( #1279 )
6 years ago