nihuini
cd4be6d0fa
call vulkan create_pipeline on the vkdev condition, drop opt_cpu hacks
6 years ago
nihuini
81a028547a
fix bus error on armv7
6 years ago
daichuanliang
6176ada9f0
update ncnn2int8.cpp ( #1315 )
Fix compile issue with ncnn2int8
6 years ago
nihuini
19d75955d6
arm neon assembly optimization for conv3x3s1 winograd pack4to1
6 years ago
bindog
04b4b02324
[WIP] add reduce op support for onnx ( #1308 )
* [WIP] add reduce op support for onnx
* extend reduction to support 1,2-dim reduction and keepdims
* fix compile error
* split type to 3 flags && split keepdims to another function
6 years ago
nihuini
22a2be4e6c
fix crop pack4 with reference blob
6 years ago
nihuini
6a8e5c58da
fix build on armv7
6 years ago
nihuini
e63e2449fd
arm neon assembly optimization for conv7x7s2 pack1to4
6 years ago
nihui
56fd26a2da
arm neon assembly optimization for conv1x1s1 pack4to1
6 years ago
nihui
7ad514917b
fix potential out-of-bounds write on unroll 12 remainder
6 years ago
nihuini
15e86dc8e9
reduce pack4 weight memory usage for specialized kernel, reduce runtime memory usage in conv3x3s1 winograd
6 years ago
nihuini
581a06d471
innerproduct pack4 always consumes a flattened blob, whose layout is the same as in the pack1 branch, so reuse the pack1 implementation to reduce memory usage
6 years ago
nihuini
c5f1dc3fe4
arm neon assembly optimization for conv3x3s1 pack4to1
6 years ago
nihui
2f8b31c3b4
unroll outch 2 for conv3x3s1 pack1to4
6 years ago
ShuangLiu1992
396b057248
fix opencv compatibility issue with major version > 2 for the quantisation tool ( #1302 )
* Update CMakeLists.txt
enable quant
* fix opencv compatibility issue with major version higher than 2
6 years ago
nihui
e0f6e3f669
pre-interleave 8-channel weight data on aarch64, conv1x1s1 version
6 years ago
nihuini
d11bf14d44
pre-interleave 8-channel weight data on aarch64
6 years ago
nihui
7173b6e38e
arm neon assembly optimization for conv3x3s2 pack4
6 years ago
nihuini
cf0c49dd71
arm neon assembly optimization for conv5x5s1 pack4 and conv5x5s2 pack4
6 years ago
bindog
53d54115a8
add split op support for onnx2ncnn ( #1299 )
6 years ago
nihui
9e529354fb
arm neon optimization for conv1x1s2 pack4
6 years ago
tpoisonooo
25979c1bdb
move ncnnoptimize doc to docs directory ( #1297 )
6 years ago
nihuini
f8f3b0b5aa
shufflechannel pack4
6 years ago
nihuini
50d5896ce7
reshape pack4
6 years ago
nihuini
624291e2b2
use subop optimization for group convolution deconvolution pack4 family
6 years ago
nihui
48e3e7d49c
move neon activation into a wrapper function
6 years ago
nihui
8c1b87b1a2
fallback to cpu if no vulkan device found
6 years ago
nihui
b37ecab630
auto flatten before innerproduct pack4
6 years ago
nihui
afd1f08194
arm neon assembly optimization for pooling2x2s2 max pack4
6 years ago
nihui
e19b7097df
arm neon assembly optimization for conv3x3s1 pack1to4
6 years ago
nihui
3ac6335ba3
hardsigmoid and hardswish pack4
6 years ago
nihui
21e74487b4
arm neon optimization for convdw5x5 pack4
6 years ago
volvet
ecd64fb36b
Fixed lots of compile warnings ( #1286 )
* Fixed lots of compile warnings
* refine the unused warning change
6 years ago
nihui
3e1bad4880
arm neon assembly optimization for pooling3x3s2 max pack4
6 years ago
nihui
08a97c169f
arm neon assembly optimization for relu pack4
6 years ago
nihui
a1bd88fb4a
arm neon assembly optimization for padding constant pack4
6 years ago
nihui
17f343e7e4
convdw3x3 pack4 arm neon assembly optimization
6 years ago
nihui
6703286831
the very long ld1, one less load
6 years ago
nihui
22a3ade6ce
unroll size 12 for conv1x1s1 and conv3x3s1 winograd pack4 on aarch64
6 years ago
tpoisonooo
e5eb3e427c
Update CMakeLists.txt ( #1279 )
6 years ago
nihuini
c3eea4cc0b
fuse onnx unsqueeze-batchnorm-squeeze as batchnorm
6 years ago
nihui
3a452f734a
arm neon assembly optimization for conv3x3s2 pack1to4
6 years ago
nihui
6edd42f566
arm neon assembly for conv1x1s1 and conv3x3s1 winograd pack4
6 years ago
nihuini
e147debe07
add retinaface example
6 years ago
nihuini
87ae1e9a2a
%e in ncnnoptimize
6 years ago
nihuini
7d3ebe2d69
%g may output number without dot, stick to %e
6 years ago
nihuini
a5b6826a16
workaround duplicated op name, fusion function, fix wrong layer count when the last layer is Reshape
6 years ago
nihuini
583f0a7c2f
eliminate noop
6 years ago
Howave
50f69f1755
mxnet hardsigmoid hardswish op fusion ( #1266 )
6 years ago
nihuini
f7808b2dfc
write pad_value
6 years ago