208 Commits (a490f8a5335f3608a19fdf8a018fbfbd731280d3)

Author SHA1 Message Date
  nihui aa9753b2f0 detach mat from local blob allocator so net instance could be destroyed much earlier (#3287) 4 years ago
  zhiliu6 814f89ef1a Fuse HardSwish activation into Convolution and InnerProduct (#3233) 4 years ago
  Tijmen Verhulsdonck 4270b5c502 Fix broken codepaths with AVX only (#3254) 4 years ago
  zhiliu6 80699dd3f9 fix hardswish test beta param (#3214) 4 years ago
  nihui c6cda8d07c arm neon optimization for requantize leakyrelu (#3144) 4 years ago
  Xavier Hsinyuan 2a5c672787 Add unittest and RVV optimized for SELU (#3114) 4 years ago
  nihuini f1533667ff fix test_c_api net instance destroyed earlier than blob destruction 4 years ago
  Tijmen Verhulsdonck eaa7e24db6 Added ability to switch AVX/AVX2 during runtime (#3076) 4 years ago
  nihui b413fd3a3d auto code-format bot and disable restyled (#3075) 4 years ago
  DaydreamCoding f42d0e5dc9 fix warpaffine_bilinear_yuv420sp uv matrix (#3048) 4 years ago
  nihui 4f135e07bf implement convolution1d and pooling1d (#3035) 4 years ago
  nihuini 12eaa6f9ba update concat test 5 years ago
  nihuini a180bf7bdc update concat test for larger channels 5 years ago
  nihui c1ce8ea84d add more test 5 years ago
  nihuini 07fa2e1fe3 prefer large channels for int8 operator tests 5 years ago
  nihui 3a77b09c31 fix test failure 5 years ago
  nihuini fef61c5296 fix arm build 5 years ago
  nihuini 934a1a8e32 test flatten packing padding int8 5 years ago
  nihui 49f3e1ea09 drawing api and stb_image (#2913) 5 years ago
  nihui 17936e9f54 fix packing risc-v test, add cpu_riscv_vlenb() 5 years ago
  nihui a61f03ec76 arm neon optimization for pixelshuffle scale 2 5 years ago
  nihuini d6b2ea5aac arm neon optimization for convolution 3x3 on small channels 5 years ago
  nihui 7e1aaa5828 cmake option NCNN_INT8 (#2839) 5 years ago
  nihui 66455c1b95 implement 2823 binary broadcasting type (#2827) 5 years ago
  nihuini 41a4bea954 unroll size 8 for conv3x3s1 pack8to1 int8 arm64 5 years ago
  nihui e9cc637573 arm neon optimization for int8 packing kernels (#2809) 5 years ago
  nihui 1ea8bfbd2e x86 avx2 conv3x3s1 pack8 direct optimization, fix #2789 5 years ago
  ncnnnnn 6e6cb9f4f3 simple sort ncnn_add_layer_test (#2790) 5 years ago
  nihui a48bf43ef7 test conv/fc int8 with activation 5 years ago
  nihui 5fe75f19ef architecture changes for int8 packing (#2771) 5 years ago
  nihuini 15d63ec0f5 fuse onnx multiheadattention with same qkv blob 5 years ago
  RBelogorodtsevFBase 1212ed6e94 implements gelu activation (#2749) 5 years ago
  nihuini c17eb4e208 multiheadattention layer 5 years ago
  nihuini 7ac23ab34d fuse onnx layernorm, fix 2-dim layernorm implementation, add test 5 years ago
  nihui 3c92a1184b arm neon optimization for general convolution im2col sgemm (#2668) 5 years ago
  nihui ab56083ca5 arm neon optimization for conv3x3s1 winograd42 (#2664) 5 years ago
  nihuini f437bcdd4c enable fp16s and int8s on newer adreno/mali, actually enable int8 tests 5 years ago
  nihui 74451897cb handle gemm in innerproduct (#2607) 5 years ago
  nihui 0a59ac9b16 integer warpaffine (#2604) 5 years ago
  nihui 6672b09a37 arm neon optimization for gru (#2597) 5 years ago
  nihui 0b35540c72 arm neon optimization for lstm (#2595) 5 years ago
  nihuini 3915b5d496 arm neon optimization for packing fp16/bf16 pack8 family 5 years ago
  nihui fca04980f3 enhance padding test (#2580) 5 years ago
  nihui 80fdddb502 more slice test 5 years ago
  nihui ef3550b52f gru and rnn layer (#2572) 5 years ago
  Guoxia Wang 609f63c57e support PyTorch AdaptiveAvgPool2d and AdaptiveMaxPool2d (#2546) 5 years ago
  nihui 21dc650eb3 check layer support (#2564) 5 years ago
  tpoisonooo baf49574c4 innerproduct aarch64 use gemm (#2521) 5 years ago
  nihui 54c0a13b9f build shared library (#2525) 5 years ago
  nihuini fbf0ffda53 pixelshuffle nhwc mode, convert onnx DepthToSpace mode DCR, convert mlir tf.DepthToSpace 5 years ago