150 Commits (2bc77e7487d07a40667fcf9f8fffa17ca75e0523)

Author SHA1 Message Date
  nihui aa9753b2f0 detach mat from local blob allocator so net instance could be destroyed much earlier (#3287) 4 years ago
  nihuini affbefe311 some space cleanup, blob clone from allocator 4 years ago
  nihui cdf45a6512 cmake option NCNN_BF16 (#3068) 4 years ago
  Tijmen Verhulsdonck eaa7e24db6 Added ability to switch AVX/AVX2 during runtime (#3076) 4 years ago
  nihui 3a77b09c31 fix test failure 4 years ago
  nihuini 9b5cb959b9 auto convert int8 to fp32 on extract 5 years ago
  nihui ad37c34d25 disable NCNN_ARM82DOT whenever NCNN_ARM82 disabled 5 years ago
  Cai Shanli 8cc8cd716a Add get input and output names (#2890) 5 years ago
  nihui 17936e9f54 fix packing risc-v test, add cpu_riscv_vlenb() 5 years ago
  nihui 11958424c2 runtime riscv v and zfh dispatch, riscv v optimization for cast 5 years ago
  nihui 1c26291757 more verbose hint for find_blob_index_by_name failure 5 years ago
  nihuini 34bd5ef161 update eq quant info 5 years ago
  nihuini 72ef77a469 fix build with NCNN_STRING off and NCNN_VULKAN on 5 years ago
  zhiliu6 fb9d529487 fix compile error when NCNN_STRING is disabled (#2874) 5 years ago
  nihuini 31d436c627 more verbose load failure, ncnn2int8 write int8 data properly 5 years ago
  nihuini 1bc0126302 fix crash when input cpu blob and extract the same from gpu, update vgg16 int8 model 5 years ago
  nihui e9cc637573 arm neon optimization for int8 packing kernels (#2809) 5 years ago
  nihui 32b48f0157 fix int8 auto pack layout 5 years ago
  nihui 5fe75f19ef architecture changes for int8 packing (#2771) 5 years ago
  nihui d4a7abc218 fix onnx2ncnn clip without max blob, fix #2788 5 years ago
  nihui 67e24e0703 use local pool allocator (#2736) 5 years ago
  Cai Shanli f5b307689b fix net and extractor destroy order when use vulkan (#2732) 5 years ago
  nihuini b51959802c fix buffer2host copy, fix #2725 5 years ago
  Xu Yang fd634e9a58 remove unnecessary mat clone when NCNN_BENCHMARK enabled (#2708) 5 years ago
  Dahan Gong cbd410c237 fix broken inplace forward (#2709) 5 years ago
  Youngsoo Lee b9bed8d993 feat: add denormal options (#2656) 5 years ago
  nihui 9fd4d371ae bridge image for adreno image upload and download (#2658) 5 years ago
  nihuini 2a57ca4942 reduce memory usage in lightmode, handle upload image allocation failure properly 5 years ago
  nihuini bd68ee487b fallback to cpu when image allocation failed, fix #2648 5 years ago
  nihui af7d8184aa handle image allocation failure properly 5 years ago
  nihui 09b2bf6213 Break down forward_layer (#2577) 5 years ago
  nihui 54c0a13b9f build shared library (#2525) 5 years ago
  nihui 1040f40c8b update c api for custom allocator datareader modelbin and layer registration, add cookie userdata to layer 5 years ago
  nihui 79efe33fdc cmake option for platform api uses (#2502) 5 years ago
  nihui 343bc3b7dc single blob consumer (#2493) 5 years ago
  Zhuo Zhang 3c99287da5 fix src/net.cpp missing-field-initializers warning (#2494) 5 years ago
  maxfy1992 0f325d7910 add decrease unpack pack overhead (#2489) 5 years ago
  Cai Shanli a9df4f6c59 add custom layer destroyer (#2481) 5 years ago
  Martin Han b441f738bd Extract on CPU without pack/fp16fp32 (#2288) 5 years ago
  PENGUINLIONG 8f8f2de4d0 SSE2 optimization pack (#2123) 5 years ago
  nihui cf3cf83cd3 unified image shader storage type (#2231) 5 years ago
  nihuini b766c8cd9e fix potential divide by zero fault when bf16s / fp16s enabled, fix #2125 5 years ago
  nihuini a334513b5e fp16a option fix 5 years ago
  nihuini e841ae73c6 fix arm fp16s feat output, fix #2003 5 years ago
  nihui 54e79a62d7 fix crash on non-arm82 build 5 years ago
  nihui c173d51c9b mish sigmoid swish tanh arm fp16s 5 years ago
  nihui 71f86af8a6 fix non-arm82 ci 5 years ago
  nihui 9a2e2a6937 convert fp32 blobs for layers with fp16 storage support 5 years ago
  nihui 308145254e mask bf16 option in layer forward, disable gpu when bf16 enabled, fix #1962 5 years ago
  nihui 71dc13625f disable bf16 storage for int8 inference 5 years ago