2410 Commits (a46edcf720aba9cd352fc959e59a50eac8dc43d7)
 

Author SHA1 Message Date
  nihui a46edcf720 x86 optimization for interp (#3546) 4 years ago
  nihui 139554b36e rewrite convolution x86 sgemm pack1 (#3544) 4 years ago
  Yoh d2999b8d53 Optimize scale x86 (#3540) 4 years ago
  nihui fb6283c8b0 x86 avx fma optimization (#3543) 4 years ago
  nihui 3a43cc7015 update efficientnetv2_b0 param for reduction axes changes 4 years ago
  nihui 3181616439 treat old reduction axes param as failure 4 years ago
  nihui 672daa7e04 xop infrastructure and optimization (#3541) 4 years ago
  nihui 9d0c36358c add z8350 and n5105 benchmark 4 years ago
  nihui de77b669c4 x86 sse2 optimization for conv1x1/3x3 pack4 and general sgemm pack4/pack4to1 (#3538) 4 years ago
  nihui 6422e6acd3 fix x86 sgemm convolution int8 weight shuffle 4 years ago
  nihui 340b4e673e pnnx fold constant (#3521) 4 years ago
  Kagurazaka Kotori 08ecc94d63 x86: Use _mm_cvtsi128_si{32,64} in float2int8 (#3536) 4 years ago
  nihui 1d0b78f9b6 Update README.md 4 years ago
  nihui a356d152bb Update README.md 4 years ago
  Joson 70795c6548 Create README.md (#3532) 4 years ago
  teng 3ff9ae707f simplify macro (#3530) 4 years ago
  Kagurazaka Kotori 5c078016c2 x86/avx_mathfun.h: Remove fallback warnings (#3527) 4 years ago
  nihui 2d46994d2e wrap avxvnni and avx512vnni build options over cpu feature detector 4 years ago
  nihui 33e225f173 fix c api test 4 years ago
  nihui bae2ee375f simplify c api layer forward_n output array type 4 years ago
  nihuini 1be043aad5 convert torch mean/sum/prod reduction with no args 4 years ago
  nihuini b4a755495c convert pnnx zeros roll remainder 4 years ago
  nihui c0a94cd9ca fix armv7 without neon (#3514) 4 years ago
  nihuini 4ba1eb6d2f assign unique names for all pnnx operator and operand names. fix #3493 4 years ago
  nihuini 457f7d1c63 fix use-after-free, fix #3492 4 years ago
  nihui b07ad54320 add zynq-7020 benchmark 4 years ago
  nihui 4e4e0b9cf8 do not link libgcc as we no longer rely on builtin support cpu feature intrinsics now 4 years ago
  nihui 71f377e9e9 update benchmark from Q-engineering 4 years ago
  nihui d95213a005 x86 convolution int8 optimization third stage (#3506) 4 years ago
  nihuini 9f7f491885 use the old-style __cpuid_count for old compiler compatibility, fix #3510 4 years ago
  nihui 930c36ebe2 avx512 infrastructure (#3407) 4 years ago
  nihui c2896bcd4d x86 convolution int8 optimization second stage (#3495) 4 years ago
  teng 13a51fbcf8 add else (#3494) 4 years ago
  nihui e9b8f0a6ef x86 avx2 optimization for convolution gemm int8 (#3489) 4 years ago
  nihui c5d7f963b9 layer tile (#3491) 4 years ago
  dependabot[bot] d25388c938 Bump pypa/gh-action-pypi-publish from 1.4.2 to 1.5.0 (#3490) 4 years ago
  Xiaoyang Chen 4d31c46532 [pnnx] Update README.md (#3487) 4 years ago
  nihui 7d3503c06a pnnx Tensor index (#3483) 4 years ago
  nihuini 1db16ce9fc pnnx torch norm stack test 4 years ago
  nihuini 23d3340017 pnnx norm stack 4 years ago
  nihuini e33bdd16e8 pnnx fuse conv1d-bn convtranspose1d-bn 4 years ago
  nihuini f8ca1e7585 fix pnnx crash on unsupported expression 4 years ago
  nihui 7c60dc2db7 pnnx roialign (#3478) 4 years ago
  nihui 143258e317 pnnx torchvision deformconv2d (#3459) 4 years ago
  Xiaohan Liu 3daabd515d add missing doffset (#3475) 4 years ago
  nihui 7b222a19af update benchmark (#3465) 4 years ago
  dog-qiuqiu 009d607a15 add the param file of yolo-fastest in benchmark (#3470) 4 years ago
  nihuini 014387dfae update operators doc 4 years ago
  nihuini de436f9e26 pnnx arange matmul zeros_like expand_as 4 years ago
  nihui 922f8b33c1 reduction4d, merge keepdims arg, add test (#3469) 4 years ago