nihuini
|
a84ba8fc0f
|
element type storage support in Mat, move the data member first so that a pointer to Mat is a pointer to data, convenient index access for float vector
|
8 years ago |
nihuini
|
8773035891
|
another implementation for winograd 3x3, about 15%~30% speed gains on small images
|
8 years ago |
nihui
|
2a62a98e1a
|
allow constructing paramdict and modelbin from userside
|
8 years ago |
nihui
|
10b86c2af5
|
create layer from type name
|
8 years ago |
nihui
|
118e037f33
|
arm neon optimize for mat fill
|
8 years ago |
nihui
|
7a43c45e80
|
remove deprecated code
|
8 years ago |
nihui
|
a181d25098
|
new model load api, fix #215
|
8 years ago |
nihuini
|
b84ba31c23
|
enable light mode by default
|
8 years ago |
peng
|
5ac2de8963
|
fix shufflechannel
|
8 years ago |
nihuini
|
df5e04260a
|
fix conv1x1s1 bug
|
8 years ago |
nihuini
|
9280a068fe
|
unroll outch for convolution 3x3 winograd64, reduce memory usage
|
8 years ago |
nihui
|
1f5c646ee0
|
pipeline optimize
|
8 years ago |
nihuini
|
0564021afc
|
fix armv7 assembly
|
8 years ago |
nihuini
|
55ec189998
|
unroll outch for convolution 1x1 stride 1
|
8 years ago |
nihuini
|
57df1076ff
|
neon optimize for depthwise convolution 3x3, about 20%~35% speed gain
|
8 years ago |
wind19870521
|
822214269d
|
fix pooling2x2s2_max_neon stride bug
Fails when the image width is odd and the stride is 2
|
8 years ago |
zengping
|
a54f14feca
|
[fix-compile-warnings] fix compiler warnings, and add werror in CMakeLists.txt (#217)
* [fix-compile-warnings] fix compiler warnings, and add werror in CMakeLists.txt
* [fix-compile-warnings] fix compiler warnings, remove ycm_extra_conf.py
|
8 years ago |
nihui
|
0f52418023
|
change input param order to w h c, replace caffe MemoryData to Input
|
8 years ago |
nihui
|
62913964d6
|
non-square kernel stride and padding w h for pooling
|
8 years ago |
nihui
|
bdb70a2010
|
padding w h in convolution and deconvolution
|
8 years ago |
nihui
|
44b4519307
|
non-square convolution and deconvolution kernel stride dilation
|
8 years ago |
HustCoderHu
|
23a3254a7b
|
fix memcpy error
|
8 years ago |
Zexin, Hu
|
81fb3818a5
|
add ShuffleChannel layer, only cpp, no arm yet (#210)
* add ShuffleChannel layer, only cpp, no arm yet
* add assign
|
8 years ago |
nihuini
|
964040fe3c
|
more runtime decisions for winograd path
|
8 years ago |
nihui
|
c77ca16468
|
enable conv3x3s1 winograd optimization, two paths for small image on armv7 and all for aarch64
|
8 years ago |
nihui
|
f2f7ecd2ec
|
fix winograd neon2 for aarch64
|
8 years ago |
nihui
|
26303615a6
|
memcpy for concat
|
8 years ago |
nihuini
|
a4d28107f4
|
check clone empty
|
8 years ago |
nihuini
|
25f19c2009
|
implement external scale blob, support SENet
|
8 years ago |
nihui
|
15ad4dfb9f
|
forward reuse forward_inplace routine, reduce binary size with little memcpy overhead in non-light mode
|
8 years ago |
nihui
|
32cd5f2a5c
|
use mul for the first multiply, drop accumulator clear instructions, about 5% speed gain
|
8 years ago |
nihuini
|
d5da0e84ba
|
fix deconv4x4s2, fix #202
|
8 years ago |
wind19870521
|
429e98c91c
|
fix unaryop bug (#200)
* fix unaryop bug
- incorrect memory access when matrix is multi-channel
Signed-off-by: wangshunli <shunli0521.hi@163.com>
* Update unaryop.cpp
|
8 years ago |
huyn
|
8b9365a68c
|
fix top_blob not set (#199)
|
8 years ago |
azrael0fog
|
f232c1a6c5
|
Update relu_arm.cpp (#189)
* Update relu_arm.cpp
* Update prelu_arm.cpp
|
8 years ago |
tedder59
|
4d59d0afda
|
Add depthwise Deconvolution. (#187)
* add depthwise deconvolution.
* add depthwise deconvolution.
* fix some syntax errors and unnecessary modifications
|
8 years ago |
nihui
|
790829bc62
|
partition dot tiles and reuse kernel register, over 20% improvement for tiny image
|
8 years ago |
nihuini
|
a3be17eb7e
|
special path for 1x1xc innerproduct
|
8 years ago |
nihuini
|
50d591cb50
|
softmax inplace
|
8 years ago |
peng
|
39445b5233
|
no memcpy for small size copy_cut_border/copy_make_border
|
8 years ago |
彭
|
a86cc8f620
|
memcpy optimize copy_cut_border/copy_make_border (#179)
* memcpy optimize copy_cut_border/copy_make_border
* memcpy may be slow when copying a small border
* remove unused line
* code style
|
8 years ago |
nihuini
|
d99f9d9ac3
|
implement softmax on vector and image
|
8 years ago |
liuchang
|
ac3b4768aa
|
fix the missing header file for visual studio.
|
8 years ago |
nihuini
|
ff3c03cfb1
|
q9 is useless
|
8 years ago |
nihuini
|
8cfd02d633
|
Merge branch 'master' of https://github.com/Tencent/ncnn
|
8 years ago |
nihuini
|
9a55404c72
|
fix dot on aarch64, still needs improvement ...
|
8 years ago |
nihui
|
eea3ca577a
|
disable winograd atm ...
|
8 years ago |
nihui
|
0385d8e8ad
|
implement winograd64 optimization for convolution 3x3s1
|
8 years ago |
nihui
|
20b1330cdb
|
fix lrn within channel
|
8 years ago |
nihui
|
8e490d4b68
|
fix array parsing, first try
|
8 years ago |