982 Commits (bedf00a5edf85ae8af33bc72ebd85fe325da1271)

Author SHA1 Message Date
  nihui 1322ae40cb update engine version 5 years ago
  nihui 308145254e mask bf16 option in layer forward, disable gpu when bf16 enabled, fix #1962 5 years ago
  nihui 71dc13625f disable bf16 storage for int8 inference 5 years ago
  nihuini 8700985540 yet another workaround for nexus6p gpu 5 years ago
  nihuini bf279dcf17 workaround corrupted pipeline cache on old qcom adreno 5 years ago
  nihuini 4e4f0baa73 set openmp blocktime 20 for reducing power consumption, blocktime option 5 years ago
  nihui 21762e09e5 fix dilated convolution (#1956) 5 years ago
  nihuini 4d2d625432 fix avx2 build, second try, fix #1953 5 years ago
  nihuini 8b0890999a fix avx2 build, fix #1953 5 years ago
  nihui 88367f4164 Ci enable mips msa (#1949) 5 years ago
  nihui bb5bfe3841 avx2 infrastructure (#1943) 5 years ago
  nihui 11cffce114 armv8.2 infrastructure (#1856) 5 years ago
  nihui 3ff40b0679 Ci rv32imc (#1940) 5 years ago
  nihui fe6bc1ed4d Ci rv64gcv and rv64gc (#1936) 5 years ago
  nihui b8f3e1455e code clean 5 years ago
  nihui 6ebb774e37 fix interp arm 5 years ago
  nihui 6538b95102 fix interp and lrn test 5 years ago
  nihuini 68750a5151 fix data race in innerproduct avx 5 years ago
  nihuini 0d6cc01d55 innerproduct handle mish activation, fix naive C testing, fix #1930 5 years ago
  nihui be330e0fc4 test mat pixel 5 years ago
  nihui 109e079c51 test deconvolution with output shape and padding 5 years ago
  nihuini 00ef566609 implement full permute tag support for reshape 5 years ago
  nihui 193e08e834 lazy initialize utility operator, fix #1923 5 years ago
  nihui f2397710c0 fix arm loadfp16 on some old compilers (#1921) 5 years ago
  nihui 7f5047d1dc Ci test end2end squeezenet (#1919) 5 years ago
  nihui 27e099961c fix double gpu instance destruction 5 years ago
  nihui fb4daa5c96 reshape packing with permute, fix #1909 5 years ago
  nihui b5e288b521 layer creator function is not necessary for built-in layers 5 years ago
  nihui 164273de61 online pipeline cache (#1792) 5 years ago
  Tijmen Verhulsdonck 3325cf94f8 Added AVX swish/lrn/batchnorm (#1897) 5 years ago
  Tijmen Verhulsdonck 73aa99e83c LSTM arm/x86 + fp16 innerproduct arm (#1881) 5 years ago
  Tijmen Verhulsdonck 26999fab19 Fix AVX wino 3x3 and improve convolution test coverage (#1891) 5 years ago
  nihuini ca170a9652 fix conv3x3s1 pack1to4 bf16s arm neon, #1891 5 years ago
  nihuini ebce2c3742 fp16s priorbox seems to work now, drop hack 5 years ago
  nihuini df36164356 disable avx2 winograd convolution at the moment for incorrect output 5 years ago
  nihui dfee9a75ea workaround the shape specialization constant not respected properly in padding reflect mode on nvidia gpu 5 years ago
  nihui 2345ab7c92 workaround fp16 vec4 of same value store get undefined result on some nvidia gpu 5 years ago
  nihui 12ce58074e some code clean 5 years ago
  Tijmen Verhulsdonck 66618340ac x86 fp16 weight storage optimizations (#1871) 5 years ago
  zhiliu6 1474f056df Fix windows build (#1878) 5 years ago
  nihuini d2bf77cd88 create new allocator when pre-allocated allocators exhausted, fix #1862 5 years ago
  Tijmen Verhulsdonck 82637995c1 3x3 winograd elempack8 (#1865) 5 years ago
  Tijmen Verhulsdonck 988e8088ea Fix benchmark (#1864) 5 years ago
  nihuini 71db6e1da5 shufflechannel reverse group style 5 years ago
  nihui 01b8b79ed2 packing layout option respect support_packing property 5 years ago
  nihui a33b353c36 C api (#1851) 6 years ago
  Tijmen Verhulsdonck d1b5711791 X86 Elempack 8 AVX implementations. (#1853) 6 years ago
  nihuini 890aff0bdf static int8kernel function 6 years ago
  nihuini c38d304369 the implicit gpu instance makes life easier :) 6 years ago
  nihuini 187a3e672d implicit gpu instance destruction, fix #1849 6 years ago