804 Commits (123ca35e00aa6bd55e5f9d00a039bb5a01dbacfc)
 

Author SHA1 Message Date
  Howave 123ca35e00 fix compile warnings (#1042) 7 years ago
  nihuini bade132589 comment++ 7 years ago
  nihuini 81be8c86ae fix bus error in resize_bilinear_c2 on armv7 7 years ago
  nihuini 17d63a1491 fix bus error in resize_bilinear_c3 on armv7 7 years ago
  nihuini e9ffdb5bdd 16bit storage on arm mali is buggy 7 years ago
  nihui 1273d69c20 update qcom410 imx7d benchmark 7 years ago
  nihuini 73911492d7 fix validation warning on querypool destruction, enable fp16p by default 7 years ago
  nihuini 040a8d2427 set vulkan device by gpu index 7 years ago
  nihuini 9f9ac56538 update qcom810 and iphone5s benchmark 7 years ago
  nihui 21f79b8546 prefer cpu fp16 casting to reduce upload/download overhead on discrete gpu 7 years ago
  nihui af950819cd convert add_n and ElementWiseSum, fix #1008 7 years ago
  nihui 721abe91a8 packed mat is handy 7 years ago
  nihui afcfe0936f fix false warnings 7 years ago
  nihuini e56f0d47cc fix out of range load and store in bilinear resize c2/c3 neon block 7 years ago
  BUG1989 c2022f4501 optimize conv sgemm with sse on intel platform (#1035) 7 years ago
  nihuini e09607bc22 add option to upload model function, pipeline creation honors option use flags, setting allocator per extractor do not make much sense 7 years ago
  nihuini 83d7154be8 adapt option api changes 7 years ago
  nihuini e09d11f936 rough fix build without arm neon 7 years ago
  nihuini 5fdffbcaac destroy_gpu_instance is not threadsafe anyway, fix deadlock on exit 7 years ago
  BUG1989 d9f269fa3d use sgemm fp32 on arm platform,optimize conv1x1s2 (#1031) 7 years ago
  nihuini 838c5df839 option api changes 7 years ago
  nihuini 7f7bbf12e5 new api for getting the default gpu device 7 years ago
  nihuini 4de4078779 move platform includes out of namespace 7 years ago
  BUG1989 b53541e8f9 fix arm winograd int8,optimize winograd x86 (#1025) 7 years ago
  nihui 3aae0748e3 Update README.md 7 years ago
  BUG1989 01b3804828 optimization the x86 convolution layer with avx2 (#1019) 7 years ago
  nihuini 9b33e647bd use fixed blob names for benchmark 7 years ago
  nihuini 8cb107e78c apply model optimize 7 years ago
  nihui fe4b00f7a2 unroll outh 4 for winograd gemm 7 years ago
  nihuini 74276314bb unroll size 4 for conv1x1s1 pack4 7 years ago
  nihuini cd7559c639 more fix for fp16p, still disabled by default 7 years ago
  nihuini 4b6bffa560 Mat row should be elemsize-aware 7 years ago
  harhar539 5e317b98c5 fix illegal memory access at conv layer of vulkan (#1011) 7 years ago
  nihui 25b9736f82 shader fp16 packed 7 years ago
  nihuini 4b50a97e31 implement vulkan winograd23 7 years ago
  nihuini 37e150162a do not retrieve timestamp availability bits 7 years ago
  nihuini 738fb6bb14 print gpu per layer benchmark 7 years ago
  nihuini 8e2fb2e710 expose timestamp_period and timestamp_valid_bits 7 years ago
  nihuini c9a9486307 merge command submit and wait, expose queue_count, concurrent queue submission shall work 7 years ago
  nihuini 2b21cf9e02 move mutex class family to platform.h 7 years ago
  nihuini aa94e77e68 fix pipeline object leak 7 years ago
  kalcohol a6aab42f95 add himix200 toolchain for Hi3516CV500, Hi3516DV300, Hi3519AV100. (#989) 7 years ago
  nihui 07260527fc fix activation params 7 years ago
  nihui 3e003ffd98 fuse sigmoid 7 years ago
  nihui 5adfa290a5 1x1s1d1_lds_4_4_4 is non-optimal, delete it 7 years ago
  nihuini 8ac300c3a2 mat4 type in shared memory makes some driver unhappy .. 7 years ago
  nihuini f5ba97e7c6 lds optimize for conv3x3s1, conv1x1s1 and fc 7 years ago
  nihuini 8322a14964 set fixed local size 7 years ago
  nihuini e46a3e428a cmake warning-- 7 years ago
  nihuini 7a8f68aca6 move vulkan code to subdir, new layer interface create_pipeline and destroy_pipeline for post-loading works 7 years ago