35 Commits (7655b9e4e9265b817ad103aeb52bb6a302ed445a)

Author SHA1 Message Date
  nihui 7655b9e4e9 fix build on armv7 again 6 years ago
  nihui a97439988f fix build on armv7 6 years ago
  nihuini 81a5dfe76b general convolution and convolutiondepthwise arm neon pack4, wip 6 years ago
  BUG1989 bcfe9f453f initial the ncnn post training quantization tools (#1067) 7 years ago
  BUG1989 d9f269fa3d use sgemm fp32 on arm platform,optimize conv1x1s2 (#1031) 7 years ago
  nihuini 4de4078779 move platform includes out of namespace 7 years ago
  nihui 3e003ffd98 fuse sigmoid 7 years ago
  nihuini 7a8f68aca6 move vulkan code to subdir, new layer interface create_pipeline and destroy_pipeline for post-loading works 7 years ago
  nihuini 3f85cafc08 fuse relu leakyrelu clip into convolution/deconvolution/innerproduct 7 years ago
  BUG1989 780c7d9a72 merge de/requantize op, optimize some int8 conv layer on arm64-v8a (#867) 7 years ago
  BUG1989 ff38053321 [WIP] arm64-v8a int8 optimization (#823) 7 years ago
  BUG1989 8e337d440e fix the bug with convdw7x7 op working on int8 mode (#818) 7 years ago
  BUG1989 8ff831f7cd fix the segmentation fault when load int8 model (#811) 7 years ago
  BUG1989 df3d224484 new int8 implement,better accuracy (#749) 7 years ago
  nihuini 8fda293f91 neon optimize for depthwise convolution 5x5 :P 7 years ago
  nihuini ef36d79b7e implement the missing dequantize image on armv7, prefer neon-optimized 3-dim dequantize, fix #547 7 years ago
  nihuini 6f1b0b0a61 quantized padding in convolution, use range sweets 7 years ago
  nihuini 2dbaf6f7b7 store int8 scale in binary 7 years ago
  nihui 2fe7ada4d8 add arm int8 convolution stub, preload group op for x86 7 years ago
  nihuini 6b536701c3 sub-mat shall be allocator-aware 7 years ago
  nihui a169cec363 core int8 inference, quantize and dequantize, net using flag, caffe2ncnn reads int8 scale table 7 years ago
  nihui 9706cd1447 implement ncnn blob/workspace allocator, fine-grained per-layer openmp threads control, fix #469 8 years ago
  nihuini 0ce0c11851 load sub-op in advance for group convolution 8 years ago
  nihuini 9ac305e160 create 3-dim sub blob for group convolution, fix #315 8 years ago
  nihui 6c4c810fda decouple modelbin of different input types, simplify timestamp function 8 years ago
  nihuini 76a55693a6 decouple convolutiondepthwise and convolution, reduce binary size by 10%, fix #254 8 years ago
  Lamply 6612178960 correct arm convolution depthwise mistakes (#246) 8 years ago
  nihuini a84ba8fc0f element type storage support in Mat, move data member the first so that a pointer to Mat is a pointer to data, convenient index access for float vector 8 years ago
  nihuini 57df1076ff neon optimize for depthwise convolution 3x3, about 20%~35% speed gain 8 years ago
  nihui bdb70a2010 padding w h in convolution and deconvolution 8 years ago
  nihui 44b4519307 non-square convolution and deconvolution kernel stride dilation 8 years ago
  nihuini 47218db6e5 fix minus padding SAME, fix #116 8 years ago
  nihuini 7830b3da42 fix potential overread when bias_term is zero 8 years ago
  nihuini b4e3615ee4 depth-wise optimize 8 years ago
  nihuini 934f48cb5e arm neon optimize for group convolution 8 years ago