718 Commits (4d2d625432e8fdaaaa33042f31ceb6071eef6809)

Author SHA1 Message Date
  nihuini 4d2d625432 fix avx2 build, second try, fix #1953 5 years ago
  nihuini 8b0890999a fix avx2 build, fix #1953 5 years ago
  nihui 88367f4164 Ci enable mips msa (#1949) 5 years ago
  nihui bb5bfe3841 avx2 infrastructure (#1943) 5 years ago
  nihui 11cffce114 armv8.2 infrastructure (#1856) 5 years ago
  nihui 3ff40b0679 Ci rv32imc (#1940) 5 years ago
  nihui fe6bc1ed4d Ci rv64gcv and rv64gc (#1936) 5 years ago
  nihui b8f3e1455e code clean 5 years ago
  nihui 6ebb774e37 fix interp arm 5 years ago
  nihui 6538b95102 fix interp and lrn test 5 years ago
  nihuini 68750a5151 fix data race in innerproduct avx 5 years ago
  nihuini 0d6cc01d55 innerproduct handle mish activation, fix naive C testing, fix #1930 5 years ago
  nihui 109e079c51 test deconvolution with output shape and padding 5 years ago
  nihuini 00ef566609 implement full permute tag support for reshape 5 years ago
  nihui f2397710c0 fix arm loadfp16 on some old compilers (#1921) 5 years ago
  nihui fb4daa5c96 reshape packing with permute, fix #1909 5 years ago
  nihui b5e288b521 layer creator function is not necessary for built-in layers 5 years ago
  nihui 164273de61 online pipeline cache (#1792) 5 years ago
  Tijmen Verhulsdonck 3325cf94f8 Added AVX swish/lrn/batchnorm (#1897) 5 years ago
  Tijmen Verhulsdonck 73aa99e83c LSTM arm/x86 + fp16 innerproduct arm (#1881) 5 years ago
  Tijmen Verhulsdonck 26999fab19 Fix AVX wino 3x3 and improve convolution test coverage (#1891) 5 years ago
  nihuini ca170a9652 fix conv3x3s1 pack1to4 bf16s arm neon, #1891 5 years ago
  nihuini ebce2c3742 fp16s priorbox seems to work now, drop hack 5 years ago
  nihuini df36164356 disable avx2 winograd convolution at the moment for incorrect output 5 years ago
  nihui dfee9a75ea workaround the shape specialization constant not respected properly in padding reflect mode on nvidia gpu 5 years ago
  nihui 2345ab7c92 workaround fp16 vec4 of same value store get undefined result on some nvidia gpu 5 years ago
  nihui 12ce58074e some code clean 5 years ago
  Tijmen Verhulsdonck 66618340ac x86 fp16 weight storage optimizations (#1871) 5 years ago
  Tijmen Verhulsdonck 82637995c1 3x3 winograd elempack8 (#1865) 6 years ago
  Tijmen Verhulsdonck 988e8088ea Fix benchmark (#1864) 6 years ago
  nihuini 71db6e1da5 shufflechannel reverse group style 6 years ago
  nihui 01b8b79ed2 packing layout option respect support_packing property 6 years ago
  Tijmen Verhulsdonck d1b5711791 X86 Elempack 8 AVX implementations. (#1853) 6 years ago
  nihuini 890aff0bdf static int8kernel function 6 years ago
  Tijmen Verhulsdonck a91a18b901 AVX innerproduct and pooling 2x2 versions (#1839) 6 years ago
  nihui 3ef995ed1e format code style and setup restyled.io (#1840) 6 years ago
  nihui 9858d203df padding handle autopad same lower, bottom_shapes is hint only, respect the actual one_blob_only property, fix #1828 6 years ago
  Tijmen Verhulsdonck e3b31511ad Added AVX implementation to cast to/from bfloat and float32 (#1836) 6 years ago
  nihuini f72b394588 skip crop for identical shape 6 years ago
  Tijmen Verhulsdonck da09e5e7f1 Adding channel padding support for blazeface model. (#1826) 6 years ago
  tpoisonooo 8e1c3ac4d1 Add crop para check (#1825) 6 years ago
  JackieWu ce2251db05 Improve ROIAlign (accelerate ROIAlign, support sampling ratio and aligned ROIAlign) (#1820) 6 years ago
  zhiliu6 63d7e2c88d Add support for darknet EfficientNetB0-Yolov3 conversion. (#1821) 6 years ago
  nihui 478e15d7dc fix memorydata test on nvidia gpu 6 years ago
  nihuini bb56b5439f fix vkmat download on integrated gpu, workaround priorbox fp16s with online spirv, fix #1700 fix #1805 6 years ago
  zhiliu6 d23cef320c Add Swish layer (#1799) 6 years ago
  nihuini d232272db0 lower end gpu friendly 6 years ago
  nihuini ed7bca6de0 deepcopy image shader 6 years ago
  nihuini c94d1b39ad force disable image storage on macos and ios, fix #1738 6 years ago
  Charles.Wang 72ac36ea60 add statisticspooling layer (#1768) 6 years ago