nihuini
db5e805eff
padding_mode for Pooling, fix #261
8 years ago
nihui
2d9410742b
concat slice shufflechannel honor elemsize
8 years ago
nihui
8ccae1d4fd
prevent reuse of param array, fix #258
8 years ago
nihuini
75218953cc
aarch64 assembly for conv1x1s1, unroll outch inch as 8x8
8 years ago
nihuini
76a55693a6
decouple convolutiondepthwise and convolution, reduce binary size by 10%, fix #254
8 years ago
nihuini
3ffb502bc6
reuse if the same shape
8 years ago
nihuini
c6506d6ecd
remaining inch for winograd neon3
8 years ago
nihui
c12fab569f
fix convdw3x3s1 on aarch64
8 years ago
nihui
f133729c78
code style changes
8 years ago
nihuini
03621aa7f9
more x86 stub for convolution and convolutiondepthwise
8 years ago
Lamply
6612178960
correct arm convolution depthwise mistakes (#246)
8 years ago
nihui
848c9a1ea7
code clean
8 years ago
nihui
80fb28de90
unroll outch for convolution 3x3s1, about 10%~20% speed gain
8 years ago
nihui
df218110be
unroll num_output for innerproduct, about 60% speed gain
8 years ago
nihui
aaa1ffcef0
emmmm, prefer w h
8 years ago
nihui
d68eb4cd15
wrap benchmark gettimeofday
8 years ago
Linghan Cheung
811b6ba1b6
print benchmark information for every layer, especially for CONVOLUTION (#241)
* print benchmark information for every layer, especially for CONVOLUTION
* print benchmark information for every layer, especially for CONVOLUTION, for cross-platform.
* move the function implementation to cpp file to avoid multiple definitions
8 years ago
nihuini
d2ee4e7d27
ld1 and st1 handle data endian mode per element
8 years ago
nihui
08e261f423
innerproduct produce continuous blob, fix #236
8 years ago
nihui
682b0d3c0d
prelu on vector and image
8 years ago
nihui
14a2e23407
enable embed layer
8 years ago
nihui
c9789fb879
slice dim
8 years ago
nihuini
67b80183dd
fix param load using external memory
8 years ago
nihuini
7fc23025d4
unroll outch for convolution 1x1 stride 2, about 15%~55% speed gain
8 years ago
nihuini
ccbb94d835
fix build
8 years ago
nihuini
e471028f53
fix avg pooling in tail pad
8 years ago
nihuini
a84ba8fc0f
element type storage support in Mat, move data member the first so that a pointer to Mat is a pointer to data, convenient index access for float vector
8 years ago
nihuini
8773035891
another implementation for winograd 3x3, about 15%~30% speed gains on small images
8 years ago
nihui
2a62a98e1a
allow constructing paramdict and modelbin from userside
8 years ago
nihui
10b86c2af5
create layer from type name
8 years ago
nihui
118e037f33
arm neon optimize for mat fill
8 years ago
nihui
7a43c45e80
remove deprecated code
8 years ago
nihui
a181d25098
new model load api, fix #215
8 years ago
nihuini
b84ba31c23
enable light mode by default
8 years ago
peng
5ac2de8963
fix shufflechannel
8 years ago
nihuini
df5e04260a
fix conv1x1s1 bug
8 years ago
nihuini
9280a068fe
unroll outch for convolution 3x3 winograd64, reduce memory usage
8 years ago
nihui
1f5c646ee0
pipeline optimize
8 years ago
nihuini
0564021afc
fix armv7 assembly
8 years ago
nihuini
55ec189998
unroll outch for convolution 1x1 stride 1
8 years ago
nihuini
57df1076ff
neon optimize for depthwise convolution 3x3, about 20%~35% speed gain
8 years ago
wind19870521
822214269d
fix pooling2x2s2_max_neon stride bug
fails when the image width is odd and the stride is 2
8 years ago
zengping
a54f14feca
[fix-compile-warnings] fix compiler warnings, and add werror in CMakeLists.txt (#217)
* [fix-compile-warnings] fix compiler warnings, and add werror in CMakeLists.txt
* [fix-compile-warnings] fix compiler warnings, remove ycm_extra_conf.py
8 years ago
nihui
0f52418023
change input param order to w h c, replace caffe MemoryData to Input
8 years ago
nihui
62913964d6
non-square kernel stride and padding w h for pooling
8 years ago
nihui
bdb70a2010
padding w h in convolution and deconvolution
8 years ago
nihui
44b4519307
non-square convolution and deconvolution kernel stride dilation
8 years ago
HustCoderHu
23a3254a7b
fix memcpy error
8 years ago
Zexin, Hu
81fb3818a5
add ShuffleChannel layer, only cpp, no arm yet (#210)
* add ShuffleChannel layer, only cpp, no arm yet
* add assign
8 years ago
nihuini
964040fe3c
more runtime decisions for winograd path
8 years ago