685 Commits (a3a2548aa28a9ff7924f76d5085fabe94de79ddb)
 

Author SHA1 Message Date
  Lamply 6612178960 correct arm convolution depthwise mistakes (#246) 8 years ago
  nihui 31985b18f8 do not convert to depthwise if group is one 8 years ago
  nihui 848c9a1ea7 code clean 8 years ago
  nihui 80fb28de90 unroll outch for convolution 3x3s1, about 10%~20% speed gain 8 years ago
  nihui df218110be unroll num_output for innerproduct, about 60% speed gain 8 years ago
  nihui aaa1ffcef0 emmmm, prefer w h 8 years ago
  nihui d68eb4cd15 wrap benchmark gettimeofday 8 years ago
  Linghan Cheung 811b6ba1b6 print benchmark information for every layer, especially for CONVOLUTION (#241) 8 years ago
  nihuini d2ee4e7d27 ld1 and st1 handle data endian mode per element 8 years ago
  nihui 08e261f423 innerproduct produce continuous blob, fix #236 8 years ago
  nihui 682b0d3c0d prelu on vector and image 8 years ago
  yetiancai 5e358b831d add softmax activation 8 years ago
  nihui 14a2e23407 enable embed layer 8 years ago
  nihui 49df53dd7e mxnet elemwise op is binary op 8 years ago
  nihui c9789fb879 slice dim 8 years ago
  nihui 0415f16650 fix split on subncnn blob, convert Embedding 8 years ago
  nihui 634c6568ff parse input sub index, convert element_mul SliceChannel 8 years ago
  nihui 80d14dd252 parse attrs 8 years ago
  sheen 539440c4f3 add bitcode setting 8 years ago
  nihuini 67b80183dd fix param load using external memory 8 years ago
  nihuini 7fc23025d4 unroll outch for convolution 1x1 stride 2, about 15%~55% speed gain 8 years ago
  nihuini ccbb94d835 fix build 8 years ago
  nihuini e471028f53 fix avg pooling in tail pad 8 years ago
  nihuini a84ba8fc0f element type storage support in Mat, move data member the first so that a pointer to Mat is a pointer to data, convenient index access for float vector 8 years ago
  nihuini 8773035891 another implementation for winograd 3x3, about 15%~30% speed gains on small images 8 years ago
  nihui 2a62a98e1a allow constructing paramdict and modelbin from userside 8 years ago
  nihui 10b86c2af5 create layer from type name 8 years ago
  nihui 118e037f33 arm neon optimize for mat fill 8 years ago
  nihui 7a43c45e80 remove deprecated code 8 years ago
  nihui a181d25098 new model load api, fix #215 8 years ago
  nihuini b84ba31c23 enable light mode by default 8 years ago
  peng 5ac2de8963 fix shufflechannel 8 years ago
  nihuini df5e04260a fix conv1x1s1 bug 8 years ago
  nihuini 9280a068fe unroll outch for convolution 3x3 winograd64, reduce memory usage 8 years ago
  nihui 1f5c646ee0 pipeline optimize 8 years ago
  nihuini 0564021afc fix armv7 assembly 8 years ago
  nihuini 55ec189998 unroll outch for convolution 1x1 stride 1 8 years ago
  nihuini 57df1076ff neon optimize for depthwise convolution 3x3, about 20%~35% speed gain 8 years ago
  wind19870521 822214269d fix pooling2x2s2_max_neon stride bug 8 years ago
  vsooda e85bebbf48 mxnet no square convolution 8 years ago
  nihuini 9e36e2ba0e strict Werror may cause unexpected compile error 8 years ago
  nihuini a240678299 warning-- 8 years ago
  vsooda 02cff845ea fix mobilenet: add depthwise, fix batch norm (#218) 8 years ago
  zengping a54f14feca [fix-compile-warnings] fix compiler warnings, and add werror in CMakeLists.txt (#217) 8 years ago
  nihui 0f52418023 change input param order to w h c, replace caffe MemoryData to Input 8 years ago
  nihui abfd3ea6c8 convert non-square convolution pooling param 8 years ago
  nihui 62913964d6 non-square kernel stride and padding w h for pooling 8 years ago
  nihui bdb70a2010 padding w h in convolution and deconvolution 8 years ago
  nihui 44b4519307 non-square convolution and deconvolution kernel stride dilation 8 years ago
  HustCoderHu 23a3254a7b fix memcpy error 8 years ago