513 Commits (123ca35e00aa6bd55e5f9d00a039bb5a01dbacfc)

Author SHA1 Message Date
  Howave 123ca35e00 fix compile warnings (#1042) 7 years ago
  nihuini bade132589 comment++ 7 years ago
  nihuini 81be8c86ae fix bus error in resize_bilinear_c2 on armv7 7 years ago
  nihuini 17d63a1491 fix bus error in resize_bilinear_c3 on armv7 7 years ago
  nihuini e9ffdb5bdd 16bit storage on arm mali is buggy 7 years ago
  nihuini 73911492d7 fix validation warning on querypool destruction, enable fp16p by default 7 years ago
  nihuini 040a8d2427 set vulkan device by gpu index 7 years ago
  nihui 21f79b8546 prefer cpu fp16 casting to reduce upload/download overhead on discrete gpu 7 years ago
  nihui 721abe91a8 packed mat is handy 7 years ago
  nihui afcfe0936f fix false warnings 7 years ago
  nihuini e56f0d47cc fix out of range load and store in bilinear resize c2/c3 neon block 7 years ago
  BUG1989 c2022f4501 optimize conv sgemm with sse on intel platform (#1035) 7 years ago
  nihuini e09607bc22 add option to upload model function, pipeline creation honors option use flags, setting allocator per extractor do not make much sense 7 years ago
  nihuini e09d11f936 rough fix build without arm neon 7 years ago
  nihuini 5fdffbcaac destroy_gpu_instance is not threadsafe anyway, fix deadlock on exit 7 years ago
  BUG1989 d9f269fa3d use sgemm fp32 on arm platform,optimize conv1x1s2 (#1031) 7 years ago
  nihuini 838c5df839 option api changes 7 years ago
  nihuini 7f7bbf12e5 new api for getting the default gpu device 7 years ago
  nihuini 4de4078779 move platform includes out of namespace 7 years ago
  BUG1989 b53541e8f9 fix arm winograd int8,optimize winograd x86 (#1025) 7 years ago
  BUG1989 01b3804828 optimization the x86 convolution layer with avx2 (#1019) 7 years ago
  nihui fe4b00f7a2 unroll outh 4 for winograd gemm 7 years ago
  nihuini 74276314bb unroll size 4 for conv1x1s1 pack4 7 years ago
  nihuini cd7559c639 more fix for fp16p, still disabled by default 7 years ago
  nihuini 4b6bffa560 Mat row should be elemsize-aware 7 years ago
  harhar539 5e317b98c5 fix illegal memory access at conv layer of vulkan (#1011) 7 years ago
  nihui 25b9736f82 shader fp16 packed 7 years ago
  nihuini 4b50a97e31 implement vulkan winograd23 7 years ago
  nihuini 37e150162a do not retrieve timestamp availability bits 7 years ago
  nihuini 738fb6bb14 print gpu per layer benchmark 7 years ago
  nihuini 8e2fb2e710 expose timestamp_period and timestamp_valid_bits 7 years ago
  nihuini c9a9486307 merge command submit and wait, expose queue_count, concurrent queue submission shall work 7 years ago
  nihuini 2b21cf9e02 move mutex class family to platform.h 7 years ago
  nihuini aa94e77e68 fix pipeline object leak 7 years ago
  nihui 3e003ffd98 fuse sigmoid 7 years ago
  nihui 5adfa290a5 1x1s1d1_lds_4_4_4 is non-optimal, delete it 7 years ago
  nihuini 8ac300c3a2 mat4 type in shared memory makes some driver unhappy .. 7 years ago
  nihuini f5ba97e7c6 lds optimize for conv3x3s1, conv1x1s1 and fc 7 years ago
  nihuini 8322a14964 set fixed local size 7 years ago
  nihuini 7a8f68aca6 move vulkan code to subdir, new layer interface create_pipeline and destroy_pipeline for post-loading works 7 years ago
  nihuini c6e075cef7 fuse deconv/innerproduct relu arm 7 years ago
  nihui be81ecf1f6 fix build on msvc 7 years ago
  nihuini 528fe8e9e3 gpu convolution/deconvolution/innerproduct fuse activation 7 years ago
  nihuini 3f85cafc08 fuse relu leakyrelu clip into convolution/deconvolution/innerproduct 7 years ago
  nihuini 7984ffcb4d ncnnoptimize tool 7 years ago
  nihuini b81e1f3906 get rid of the old workaround :) 7 years ago
  791136190 e2e8e1b9d7 mxnet2ncnn tool support symbol.softmax op (#938) 7 years ago
  nihuini 5d86014d9c add missing barrier for transfer dst, fix softmax pack4, fix #932 7 years ago
  nihuini 4729ea3505 bottom blob memory never alias, reuse blob memory more elegantly relying on refcount 7 years ago
  nihui 274392eb80 convolution padding same on gpu 7 years ago