121 Commits (329e2eeae86b0dfd3a60c53204974efba1781ca5)

Author SHA1 Message Date
  nihuini a170ef1acf remove the default option usage in layer interface, fix write out of range in cast arm pack4, handle fp16p conversion on cpu/gpu transfer 6 years ago
  nihuini e73b06bbb8 fix build with NCNN_STRING=OFF 6 years ago
  nihuini 64333429bb data reader wrapper, fix #1325 6 years ago
  nihui 8c1b87b1a2 fallback to cpu if no vulkan device found 6 years ago
  Natsu 637d96c1d2 Fix gcc 9 compilation failure (#1189) 6 years ago
  nihui ff62e7eed9 use_packing_layout option works 6 years ago
  nihui b4c388a72a Mat misc function accept option parameter, deconvolution pack4 arm neon 6 years ago
  nihui 8c53706987 net vkdev getter api 7 years ago
  BUG1989 bcfe9f453f initial the ncnn post training quantization tools (#1067) 7 years ago
  nihuini b25f76833a restore per extractor allocator setters, patially revert e09607bc22 7 years ago
  nihuini 21b5508c96 shared locked vkallocator cannot prevent concurrent accessing during actual gpu inference, use seperated vkallocator for each queue 7 years ago
  nihuini 040a8d2427 set vulkan device by gpu index 7 years ago
  nihui 21f79b8546 prefer cpu fp16 casting to reduce upload/download overhead on discrete gpu 7 years ago
  nihuini e09607bc22 add option to upload model function, pipeline creation honors option use flags, setting allocator per extractor do not make much sense 7 years ago
  BUG1989 d9f269fa3d use sgemm fp32 on arm platform,optimize conv1x1s2 (#1031) 7 years ago
  nihuini 838c5df839 option api changes 7 years ago
  nihuini 7f7bbf12e5 new api for getting the default gpu device 7 years ago
  nihuini cd7559c639 more fix for fp16p, still disabled by default 7 years ago
  nihui 25b9736f82 shader fp16 packed 7 years ago
  nihuini 738fb6bb14 print gpu per layer benchmark 7 years ago
  nihuini c9a9486307 merge command submit and wait, expose queue_count, concurrent queue submission shall work 7 years ago
  nihuini 7a8f68aca6 move vulkan code to subdir, new layer interface create_pipeline and destroy_pipeline for post-loading works 7 years ago
  nihuini b81e1f3906 get rid of the old workaround :) 7 years ago
  nihuini 4729ea3505 bottom blob memory never alias, reuse blob memory more elegantly relying on refcount 7 years ago
  nihui 8724440c59 bind wait barrier count member to memory, fix #932 7 years ago
  nihui 162c46647d do not create fp16 shader module on unsupported platform 7 years ago
  nihui d753fe2589 upload fp16 weight, enable fp16 storage and arithmetic 7 years ago
  Gemfield add8c73922 Fix the return value of load_param and load_model (#855) 7 years ago
  Gemfield 573c2bcd93 Fix crash issue during load_model (#848) 7 years ago
  nihui caeb85d6cd multithreaded pipeline creation and destruction may cause driver crash :( 7 years ago
  nihuini b2e41bf83d fallback convolution to cpu path for pad -233 7 years ago
  nihuini d999f43b87 fix vulkan initialization using memory loading 7 years ago
  nihuini d263cd507c gpu packing and unpacking 7 years ago
  nihuini d3a11eb6c9 one codepath for unified and discrete device 7 years ago
  nihuini 433a92401a auto barrier in pipeline and copy command 7 years ago
  nihuini 1f4bdd91b5 uint32_t typed workgroup size 7 years ago
  BUG1989 df3d224484 new int8 implement,better accuracy (#749) 7 years ago
  nihui 979ed57487 packing param for identity packing when padding disabled, auto packing conversion between cpu and gpu blob 7 years ago
  nihui b49cb56ad9 constify vulkan device handle, use default local vulkan device if not specified 7 years ago
  nihui 5e07749a4a do not emit upload transfer on unified memory 7 years ago
  nihui 9ebac3fe9e dedicated reference counter for staging data 7 years ago
  nihuini 83efa73cf6 fallback to cpu forward if layer not support vulkan, automatically! 7 years ago
  nihuini 4a57f88c3c vkcompute auto begin end, use proper alignment for vktransfer staging buffer offset 7 years ago
  nihui f0b4933eac massive simd optimize in compute shader (#772) 7 years ago
  nihui 10b8ac68cc [WIP] vulkan compute (#618) 7 years ago
  nihui a577d71c12 Update net.cpp 7 years ago
  nihuini 099189384f fix load_param_bin, fix #732 7 years ago
  nihuini b2ffc339c0 reset internal_nconsumed_ptr before mem_scanf on msvc, fix #706 7 years ago
  Abdel Younes a941701f98 fix: c++ warnings (#666) 7 years ago
  nihuini 4e68a29eff fix build on msvc, second try 7 years ago