31 Commits (882f114a8debc4c98a5b87d2a43c00a67ae5eb03)

Author SHA1 Message Date
  nihuini afc02d57f9 runtime detect armv8.2 dotprod 5 years ago
  nihui 11958424c2 runtime riscv v and zfh dispatch, riscv v optimization for cast 5 years ago
  nihui e9cc637573 arm neon optimization for int8 packing kernels (#2809) 5 years ago
  nihui 3ed6c21565 find threads in cmake config 5 years ago
  nihui 14d319db36 include arm82 on native macos arm64 5 years ago
  nihui 54c0a13b9f build shared library (#2525) 5 years ago
  nihui 1040f40c8b update c api for custom allocator datareader modelbin and layer registration, add cookie userdata to layer 5 years ago
  Cai Shanli a9df4f6c59 add custom layer destroyer (#2481) 5 years ago
  nihui e93ad408c5 Ci release (#2440) 5 years ago
  nihui f4f790ca1f ci macos arm64 (#2321) 5 years ago
  nihui b9296c259d bring up vulkan 1.1 (#2191) 5 years ago
  youzainn 3b1b41ec0b Add some compile options, add vulkan dependency export (#2062) 5 years ago
  nihui bb5bfe3841 avx2 infrastructure (#1943) 5 years ago
  nihui 11cffce114 armv8.2 infrastructure (#1856) 5 years ago
  nihui fe6bc1ed4d Ci rv64gcv and rv64gc (#1936) 5 years ago
  nihuini f3b182da1f fix ci build 6 years ago
  nihuini 989b0f70cc convert shader source to hex data at build time 6 years ago
  nihuini b5f85eee13 fix image1d_xx8 macro, normalize image shader 6 years ago
  nihuini 6682cd1638 image fp16pa, mark some bugihfa todo 6 years ago
  nihui e8688b042f fuse packing cast storage, binaryop image shader, dummy buffer and image, device-wide utility packing converter operators, fix multi-blob layer test 6 years ago
  nihui 62da1228e1 adreno image shader + fp16 + fp16a (#1714) 6 years ago
  nihuini 1ea9de3bdf create shader pipeline by type index, resolve binding count and push constant count from spirv. since we don't create compound shader module for macos and ios compatibility, it is enough to use fixed main as the shader entry point 6 years ago
  nihui 999da7158f old glslang reject -Os option, as optimizing for size does not make a big difference, drop it for now, fix #1544 6 years ago
  nihui bbaa4dcce2 compile fp16pa, optimize shader for size, enable implicit fp16 arithmetic for qcom855 and qcom855plus 6 years ago
  nihui 0f7e7bca02 shader shape specialization constant and basic local group size partition (#1523) 6 years ago
  nihui 33b16811ce reimplement sfp afp conversion macro as function style buffer load store, drop lds shader for the moment 6 years ago
  nihui 5042d14d7d define sfpvec8 afpvec8 macro, use modern glsl extension for fp16 arithmetic, fix padding aarch64 build 6 years ago
  nihuini 628989770b return values correctly 6 years ago
  nihuini eb9326002f cmake ncnn_generate_shader_spv_header function 6 years ago
  Natsu 637d96c1d2 Fix gcc 9 compilation failure (#1189) 6 years ago
  Natsu 6d1944f2c3 CMake improvement (#1115) 6 years ago