1257 Commits (d6b2ea5aacee36141bf4597d8d68a2aafe472559)

Author SHA1 Message Date
  nihuini d6b2ea5aac arm neon optimization for convolution 3x3 on small channels 5 years ago
  nihuini 6a397716ca arm neon optimization for instancenorm 5 years ago
  nihuini 687cc857b1 some x86 sse2 optimization for convolution int8 5 years ago
  zhiliu6 ec0f904c16 improve x86 1x1 pack8 convolution performance (#2852) 5 years ago
  nihuini 68468dccbd arm neon assembly optimization for padding int8 pack8, convolution int8 out elempack 4 5 years ago
  nihuini 31d436c627 more verbose load failure, ncnn2int8 write int8 data properly 5 years ago
  nihuini 05d457c78f innerproduct int8 support all fused activation types 5 years ago
  nihuini 1bc0126302 fix crash when input cpu blob and extract the same from gpu, update vgg16 int8 model 5 years ago
  nihui 7e1aaa5828 cmake option NCNN_INT8 (#2839) 5 years ago
  nihui 66455c1b95 implement 2823 binary broadcasting type (#2827) 5 years ago
  nihuini 85efe132ff unroll inch 4 for convolution sgemm int8 5 years ago
  nihui c6cd5e8628 fix armv7 no-neon build 5 years ago
  nihuini e38a5fcbe6 fix build 5 years ago
  nihuini 01f5dcb700 arm neon optimization for convolution sgemm pack1 pack8to1 int8 5 years ago
  zhiliu6 c4700c52ca optimize x86 1x1 pack8 convolution (#2820) 5 years ago
  nihui 0d1d5b66c5 fix arm64 asm build 5 years ago
  nihuini e9ab1acf27 arm neon optimization for convolution sgemm pack1to8 int8 5 years ago
  nihuini e975de1f36 better condition for arm82 conv3x3s1 winograd 5 years ago
  nihuini 41a4bea954 unroll size 8 for conv3x3s1 pack8to1 int8 arm64 5 years ago
  nihuini 3631c1933d non-inlined addref and release slows down overall speed, move them to header 5 years ago
  nihui e9cc637573 arm neon optimization for int8 packing kernels (#2809) 5 years ago
  nihui d7cbc055f3 fix illegal instruction on pi4 when NCNN_ARM82 enabled 5 years ago
  nihuini 256754bff9 fix build with old gcc, fix #2805 5 years ago
  zhiliu6 61cd9da55b optimize x86 3x3 pack8 leftover (#2797) 5 years ago
  nihuini 912e81d086 fix tanh neon, fix #2751 5 years ago
  nihui 32b48f0157 fix int8 auto pack layout 5 years ago
  nihui 1ea8bfbd2e x86 avx2 conv3x3s1 pack8 direct optimization, fix #2789 5 years ago
  nihui 5fe75f19ef architecture changes for int8 packing (#2771) 5 years ago
  nihui d4a7abc218 fix onnx2ncnn clip without max blob, fix #2788 5 years ago
  nihui eaee64c782 export vkcommand and vkcompute 5 years ago
  nihuini e86799e95f fix get_big_cpu_count return zero on smp cpu 5 years ago
  nihui 7c079b853e default to big cpu count 5 years ago
  restyled-io[bot] 5f00ba89d2 feat(ncnnoptimize): replace denormals to zero on layers with weights (#2690) 5 years ago
  nihui 67e24e0703 use local pool allocator (#2736) 5 years ago
  nihuini 15d63ec0f5 fuse onnx multiheadattention with same qkv blob 5 years ago
  Cai Shanli f5b307689b fix net and extractor destroy order when use vulkan (#2732) 5 years ago
  RBelogorodtsevFBase 1212ed6e94 implements gelu activation (#2749) 5 years ago
  nihui b58cd14678 fix non arm-neon build 5 years ago
  nihui 0870bf45b1 optimize warpaffine family 5 years ago
  nihuini c17eb4e208 multiheadattention layer 5 years ago
  nihuini b51959802c fix buffer2host copy, fix #2725 5 years ago
  nihuini 7ac23ab34d fuse onnx layernorm, fix 2-dim layernorm implementation, add test 5 years ago
  zhiliu6 57397c418d Optimize general AVX2 convolution. (#2714) 5 years ago
  Xu Yang fd634e9a58 remove unnecessary mat clone when NCNN_BENCHMARK enabled (#2708) 5 years ago
  Dahan Gong cbd410c237 fix broken inplace forward (#2709) 5 years ago
  restyled-io[bot] 8c9bea2322 Restyle faster bbox calculation by background score (#2693) 5 years ago
  zylo117 41fba71fa0 fix adaptive avg pooling accumulation overflow in vulkan using fp16 arithmetic (#2698) 5 years ago
  nihui 3c92a1184b arm neon optimization for general convolution im2col sgemm (#2668) 5 years ago
  zylo117 65d71d8f23 support adaptive_pooling in vulkan implementation (#2681) 5 years ago
  Youngsoo Lee b9bed8d993 feat: add denormal options (#2656) 5 years ago