63 Commits (c09d7b359116c02891ef0bb678e596d67cb0d363)

Author SHA1 Message Date
  nihui 559e5b23f9 vulkan tensorcore optimization (#3628) 4 years ago
  nihui 920aa79f04 drop x86 avx2 fp16 (#3568) 4 years ago
  Yuzhong Yan 681141ff42 [YZ] Fix bug in unit test (#3556) 4 years ago
  nihui 3a83704c38 binary4d, unary4d (#3443) 4 years ago
  nihui 6941ec8fc9 arm neon optimization for general packed convolution (#3426) 4 years ago
  nihui 999e640d43 dynamic convolution weight (#3408) 4 years ago
  nihui f10cc6dd93 initial data structure changes for 3dcnn, conv3d, pooling3d (#3378) 4 years ago
  nihui 24fbb6e8cb honor thread setting on load and vulkan command, ci avx512 t4 (#3391) 4 years ago
  nihui 0b664ec438 fix potential out of range read in test with int8 inputs (#3357) 4 years ago
  Tijmen Verhulsdonck 4270b5c502 Fix broken codepaths with AVX only (#3254) 4 years ago
  Tijmen Verhulsdonck eaa7e24db6 Added ability to switch AVX/AVX2 during runtime (#3076) 4 years ago
  nihui 3a77b09c31 fix test failure 5 years ago
  nihuini fef61c5296 fix arm build 5 years ago
  nihuini 934a1a8e32 test flatten packing padding int8 5 years ago
  nihui 17936e9f54 fix packing risc-v test, add cpu_riscv_vlenb() 5 years ago
  nihui a61f03ec76 arm neon optimization for pixelshuffle scale 2 5 years ago
  nihui 5fe75f19ef architecture changes for int8 packing (#2771) 5 years ago
  nihui 21dc650eb3 check layer support (#2564) 5 years ago
  tpoisonooo baf49574c4 innerproduct aarch64 use gemm (#2521) 5 years ago
  nihui 54c0a13b9f build shared library (#2525) 5 years ago
  PENGUINLIONG 8f8f2de4d0 SSE2 optimization pack (#2123) 5 years ago
  maxfy1992 a106baa3b8 add interp param align_corner (#2236) 5 years ago
  Leo 5afd318b86 Support remove libstdc++ denpendency (#2030) 5 years ago
  nihuini d3f0b9f993 try smaller random values 5 years ago
  nihuini 5d5a3d1434 conv1x1s1 conv1x1s2 conv3x3s1 winograd pack8 arm fp16sa 5 years ago
  nihui aa1a9e90c5 interp shufflechannel arm fp16sa pack8 5 years ago
  nihuini df5a7f32d4 enable arm82 fp16sa pack8 test 5 years ago
  nihuini 47ae0c151a some shared arm bf16s fp16s implementation 5 years ago
  nihui bb5bfe3841 avx2 infrastructure (#1943) 5 years ago
  nihui 11cffce114 armv8.2 infrastructure (#1856) 5 years ago
  nihui 3ff40b0679 Ci rv32imc (#1940) 5 years ago
  nihuini 0d6cc01d55 innerproduct handle mish activation, fix naive C testing, fix #1930 5 years ago
  Tijmen Verhulsdonck 3325cf94f8 Added AVX swish/lrn/batchnorm (#1897) 6 years ago
  Tijmen Verhulsdonck 73aa99e83c LSTM arm/x86 + fp16 innerproduct arm (#1881) 6 years ago
  nihui 12ce58074e some code clean 6 years ago
  Tijmen Verhulsdonck 66618340ac x86 fp16 weight storage optimizations (#1871) 6 years ago
  Tijmen Verhulsdonck d1b5711791 X86 Elempack 8 AVX implementations. (#1853) 6 years ago
  nihuini c38d304369 the implicit gpu instance makes life easier :) 6 years ago
  nihui 3ef995ed1e format code style and setup restyled.io (#1840) 6 years ago
  nihuini ebabfa60c1 disable image storage test on macos and ios 6 years ago
  nihui f9332e04e4 enable image storage test (#1744) 6 years ago
  nihui 9a9a618229 image storage is mandatory, less options makes life easier 6 years ago
  nihui e8688b042f fuse packing cast storage, binaryop image shader, dummy buffer and image, device-wide utility packing converter operators, fix multi-blob layer test 6 years ago
  nihui 62da1228e1 adreno image shader + fp16 + fp16a (#1714) 6 years ago
  nihui 18328f63e6 fix arm bf16 test conditions, fix unused warning in crop arm 6 years ago
  nihui 7365bb80a2 vkmat and command api breaks (#1689) 6 years ago
  nihuini 9f3af60b3a dropout prelu scale test 6 years ago
  nihuini 85d5e5d3e4 fix innerproduct vulkan pack8 and arm neon, disable packing_layout for int8 test 6 years ago
  nihui ec40b4dbd7 test bf16s (#1644) 6 years ago
  nihui d023137426 test fp16 packed and shader pack8 option (#1636) 6 years ago