45 Commits (c09d7b359116c02891ef0bb678e596d67cb0d363)

Author SHA1 Message Date
  nihui 1fcad0e765 loongson mmi optional layer 4 years ago
  nihui 457e066eb5 x86 f16c infrastructure (#3577) 4 years ago
  nihui 4654030541 decouple x86 fma avx2 (#3560) 4 years ago
  nihuini 51ecc33d9d check avx512vl extension for discarding old-slow avx512 chips, enable avx512 option by default 4 years ago
  nihui 672daa7e04 xop infrastructure and optimization (#3541) 4 years ago
  nihui 930c36ebe2 avx512 infrastructure (#3407) 4 years ago
  nihui 878cb713d5 optional arm82 dot source (#3415) 4 years ago
  nihuini 11794675f3 apple a11 and a12 do not support armv8.2 dotprod, restore the fp16-only optimized path 4 years ago
  nihuini affbefe311 some space cleanup, blob clone from allocator 4 years ago
  Tijmen Verhulsdonck eaa7e24db6 Added ability to switch AVX/AVX2 during runtime (#3076) 4 years ago
  Tijmen Verhulsdonck a7f301a99d Add clang compatiblity (#3071) 4 years ago
  nihui 1c31ac2549 runtime cpu dispatch for mips msa and loongson mmi 4 years ago
  nihui 2f70343aec cmake clean (#3032) 4 years ago
  nihui bcbb55f033 apple device always has armv8.2 dot (#2963) 5 years ago
  nihuini afc02d57f9 runtime detect armv8.2 dotprod 5 years ago
  nihui 11958424c2 runtime riscv v and zfh dispatch, riscv v optimization for cast 5 years ago
  nihui e9cc637573 arm neon optimization for int8 packing kernels (#2809) 5 years ago
  nihui 3ed6c21565 find threads in cmake config 5 years ago
  nihui 14d319db36 include arm82 on native macos arm64 5 years ago
  nihui 54c0a13b9f build shared library (#2525) 5 years ago
  nihui 1040f40c8b update c api for custom allocator datareader modelbin and layer registration, add cookie userdata to layer 5 years ago
  Cai Shanli a9df4f6c59 add custom layer destroyer (#2481) 5 years ago
  nihui e93ad408c5 Ci release (#2440) 5 years ago
  nihui f4f790ca1f ci macos arm64 (#2321) 5 years ago
  nihui b9296c259d bring up vulkan 1.1 (#2191) 5 years ago
  youzainn 3b1b41ec0b Add some compile options, add vulkan dependency export (#2062) 5 years ago
  nihui bb5bfe3841 avx2 infrastructure (#1943) 5 years ago
  nihui 11cffce114 armv8.2 infrastructure (#1856) 5 years ago
  nihui fe6bc1ed4d Ci rv64gcv and rv64gc (#1936) 5 years ago
  nihuini f3b182da1f fix ci build 6 years ago
  nihuini 989b0f70cc convert shader source to hex data at build time 6 years ago
  nihuini b5f85eee13 fix image1d_xx8 macro, normalize image shader 6 years ago
  nihuini 6682cd1638 image fp16pa, mark some bugihfa todo 6 years ago
  nihui e8688b042f fuse packing cast storage, binaryop image shader, dummy buffer and image, device-wide utility packing converter operators, fix multi-blob layer test 6 years ago
  nihui 62da1228e1 adreno image shader + fp16 + fp16a (#1714) 6 years ago
  nihuini 1ea9de3bdf create shader pipeline by type index, resolve binding count and push constant count from spirv. since we don't create compound shader module for macos and ios compatibility, it is enough to use fixed main as the shader entry point 6 years ago
  nihui 999da7158f old glslang reject -Os option, as optimizing for size does not make a big difference, drop it for now, fix #1544 6 years ago
  nihui bbaa4dcce2 compile fp16pa, optimize shader for size, enable implicit fp16 arithmetic for qcom855 and qcom855plus 6 years ago
  nihui 0f7e7bca02 shader shape specialization constant and basic local group size partition (#1523) 6 years ago
  nihui 33b16811ce reimplement sfp afp conversion macro as function style buffer load store, drop lds shader for the moment 6 years ago
  nihui 5042d14d7d define sfpvec8 afpvec8 macro, use modern glsl extension for fp16 arithmetic, fix padding aarch64 build 6 years ago
  nihuini 628989770b return values correctly 6 years ago
  nihuini eb9326002f cmake ncnn_generate_shader_spv_header function 6 years ago
  Natsu 637d96c1d2 Fix gcc 9 compilation failure (#1189) 6 years ago
  Natsu 6d1944f2c3 CMake improvement (#1115) 6 years ago