278 Commits (e213605cd484fc7970f602cd83e08597e128ffad)

Author SHA1 Message Date
  nihuini e213605cd4 reduce memory usage of weight packing 7 years ago
  nihuini 39f2c71d5b fix name conflict on ios 7 years ago
  nihui f4e12101c0 fix convolution typed innerproduct pack4 7 years ago
  nihui 960ffa1a50 optimize workgroup size for convolution depthwise and innerproduct pack4 7 years ago
  nihui a15b389d86 fix innerproduct pack1to4 pack4to1 weight upload 7 years ago
  Emmanuel Benazera a8fd79e1bc fixed cell initialization in LSTM layer 7 years ago
  nihui 62543f9b1e flatten pack1to4 7 years ago
  nihui 9480dcbc36 fix innerproduct out packing 7 years ago
  nihui f9dc551081 add innerproduct pack1to4 pack4to1 glue code 7 years ago
  nihui 3f91d6b529 add innerproduct pack1to4 pack4to1 shader 7 years ago
  nihui cd7f120250 lrn norm across channel pack4, rename member name with pipeline prefix 7 years ago
  nihui 7ee3216fff add convolution pack1to4 pack4to1 7 years ago
  nihui 9d2b345eab lrn region within channel pack4 7 years ago
  nihui ad68e1e0e6 enable googlenet alexnet vulkan benchmark, fix build on msvc 7 years ago
  nihui f9ea621305 pooling full padding 7 years ago
  nihui ee59f14900 add lrn shader 7 years ago
  nihui 1792fe79ec drop deprectaed softmax shader, destory softmax pipeline 7 years ago
  nihui 9e2b327c17 packing shader for 3-dim blob 7 years ago
  nihuini 9a805b045e innerproduct receive flattened blob 7 years ago
  nihui c60773bde4 add transfer-transfer barrier, concat pack4 7 years ago
  nihui 303996af4c auto flatten before innerproduct 7 years ago
  nihuini ba723706bb add flatten pack4 7 years ago
  nihui f0b4933eac massive simd optimize in compute shader (#772) 7 years ago
  nihui 8e5674363b element packing (#770) 7 years ago
  mzpan 777f3f98d9 add w=1 h=1 op (#765) 7 years ago
  Eric Liu e6b1412217 Increase a few performance of yolov3 and change tab to space (#767) 7 years ago
  nihui 10b8ac68cc [WIP] vulkan compute (#618) 7 years ago
  ShuangLiu1992 ddba274b96 fix compile on ios simulator (#756) 7 years ago
  BUG1989 d7bd415832 add the armv7a conv3x3s1 implement without overflow (#746) 7 years ago
  weiliangweiliang ac7121f54b fix the wrong variable name that cause wrong results. (#741) 7 years ago
  nihui b75516b9b1 add the armv7a conv3x3s2, convdw3x3s1/s2 int8 implement without overflow : ) (#738) 7 years ago
  weiliangweiliang c79a6f7413 fix missing branch in arm 3x3s2 conv (#737) 7 years ago
  BUG1989 229f8fd8db add the armv7a conv3x3s2, convdw3x3s1/s2 int8 implement without overflow 7 years ago
  shujunhua 225d4925c0 fix lstm bug 7 years ago
  BUG1989 4850eeed6f add the armv7a conv1x1s1 sgemm int8 implement without overflow : ) (#713) 7 years ago
  nihuini 6a9ac1581e fix clobber list in convdw5x5s1_neon and convdw5x5s2_neon 7 years ago
  nihuini 69d2c48bb8 improve group deconvolution openmp scheduler 7 years ago
  Eric Liu 9558c27daa Fixed a yolov3 resolution bug (#696) 7 years ago
  nihui adfbb9eb25 fix caffe ssd, somewhat ugly though ... 7 years ago
  nihuini 65be036aa6 mxnet-ssd done 7 years ago
  nihuini c77383f623 mxnet-ssd wip ... 7 years ago
  nihuini 481722648c fix ooops .... :] 7 years ago
  nihuini 94e3a79ee9 fix crash in priorbox when min_sizes or max_sizes is empty 7 years ago
  nihuini 0b34dd59fb trival fix for the last element 7 years ago
  nihuini c44e49d3e9 implement roialign layer 7 years ago
  nihuini 250f6b8bdd fix build 7 years ago
  nihuini 4837af4c25 initial effort for mxnet-ssd 7 years ago
  nihuini 3a69f1c68b implement psroipooling layer 7 years ago
  nihuini a5e57fa22e implement multiscale yolov2, update example model comment 7 years ago
  Jianjun Liu 8c79c90102 variable name should be remain. 7 years ago