nihui
76849cede4
armv8.4 i8mm optimization for convolution gemm int8 ( #4034 )
3 years ago
Jie Li
627be612c6
fix(examples): optimize nms_sorted_bboxes ( #4030 )
3 years ago
nihui
dd86cebab8
armv8.6 ci and coverage ( #4025 )
* asimdfhm in fc
* move neon bf16 conversion function to arm_usability header
* fix cmake option
* fix build with newer gcc
* arm84 coverage
* arm asimdfhm optimization for innerproduct gemm fp16s
3 years ago
nihui
c5bb0e52ed
add ingenic-x2000 toolchain file
3 years ago
nihui
9b39691cc8
pnnx handle unrecognized file format ( #4028 )
3 years ago
nihui
f1ea792b26
fix too many microtask error in old libomp runtime ( #4002 )
3 years ago
Guo Haria
c1e2ab7205
add yolov7_pnnx example ( #4027 )
3 years ago
xuehao.ma
962a49069a
add the param file of fastestdet in benchmark ( #4026 )
3 years ago
teng
3901b837e2
add example yolov7 ( #4019 )
3 years ago
nihui
9b8272e86d
arm edsp and arm neon optimization for convolution int8 winograd ( #4017 )
3 years ago
nihui
a12cd7c212
mips msa and loongson mmi optimization for convolution int8 winograd f43 ( #4014 )
3 years ago
陸 言
cae8d0f1d7
Add Loongson 2F toolchain support (refer to AOSC) ( #3992 )
3 years ago
nihui
5725c028c0
arm dsp infrastructure and optimization for convolution gemm int8 ( #4011 )
3 years ago
nihui
ef216f732e
armv5 optimization for convolution gemm int8 ( #4010 )
3 years ago
nihui
0a12f81a2d
fix data race in arm rnn/gru/lstm ( #4008 )
3 years ago
nihui
322667a2ab
pnnx fix fused tensor_split operator insert order ( #4006 )
3 years ago
Jie Li
434e55c0f8
fix(examples): load_param always failed ( #4001 )
3 years ago
nihui
1892c25360
pnnx fuse megvii style shufflechannel slice ( #3999 )
3 years ago
nihui
94786308bd
pnnx fuse binaryop eltwise as weighted sum ( #4000 )
3 years ago
dependabot[bot]
8d28fc91b5
Bump pypa/cibuildwheel from 2.7.0 to 2.8.0 ( #3995 )
Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.7.0 to 2.8.0.
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](https://github.com/pypa/cibuildwheel/compare/2.7.0...v2.8.0)
---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
3 years ago
nihui
a5fb92db51
optimize innerproduct fp16s transform kernel ( #3994 )
3 years ago
nihui
d4a704de0e
pnnx eliminate noop upsample ( #3991 )
3 years ago
nihui
b4bae2c9e4
pnnx convert torch.tensor_split, fuse full dim size slice to tensor_split ( #3988 )
3 years ago
sodo
3b3605eec4
add pkgconfig ( #3984 )
Signed-off-by: sodo <djdisodo@gmail.com>
3 years ago
Zhuo Zhang
c04a1bccce
fix(examples): exit program if load_param, load_model failed for clear message ( #3983 )
* fix(examples): exit program if load_param, load_model failed for clear error message
* ignore .cache dir for clangd
* check non-zero instead of -1 for load functions
3 years ago
tpoisonooo
207ca0e0bb
Improve protobuf FAQ doc ( #3973 )
3 years ago
nihui
044467b2c6
pnnx support torch 1.12 ( #3981 )
3 years ago
nihui
0de869e8e3
ci release copy and zip preserve links ( #3980 )
3 years ago
nihui
d2fc473d8f
update qcom810 benchmark data
3 years ago
nihui
b9b5f9b119
ci release python fix, install strip ( #3977 )
* fix release python
* Update release-python.yml
* install strip
* there is no strip target for ios
* install libomp on macos
* uninstall libomp atm
* force remove
3 years ago
nihui
eeb969ad2c
update qcom855plus benchmark data. set adreno gpu frequency.
3 years ago
nihui
f8c76e730a
fix ci release optimization with cmake >= 3.21 and ndk23 ( #3976 )
* Update release.yml
* Update how-to-build.md
3 years ago
nihui
eeac683c70
fix release python ( #3974 )
3 years ago
nihui
14588023b5
release ubuntu 22.04 package, fix ndk debug flag for r23+ ( #3972 )
3 years ago
nihui
531506d602
improve pattern value match, always treat inplace operator as non-inplace version ( #3970 )
3 years ago
nihui
8dbedf8a19
use cmake gnuinstalldirs for install destination ( #3968 )
3 years ago
nihui
43912c7f42
more strict compiler avx512fp16 support detection ( #3966 )
3 years ago
moozae
94ec06a8a5
delete unused variables ( #3965 )
3 years ago
nihui
f597619e94
pnnx export weight inside moduleop ( #3902 )
3 years ago
nihui
b85bfb6085
armv8.2 asimdfhm and armv8.4 bf16 i8mm and armv8.6 sve sve2 compiler flags and runtime detection functions ( #3964 )
3 years ago
nihui
be8e195630
add avx512 spr ci ( #3960 )
3 years ago
nihui
5ae827c745
convert inplace relu6, match more hardswish pattern ( #3952 )
* convert inplace relu6, match more hardswish pattern
* upgrade requests
3 years ago
nihui
bc8e939c85
insert reshape for nn.Linear with 4d/5d inputs ( #3959 )
* insert reshape for nn.Linear with 4d/5d inputs
* batch index aware reshape, add test
3 years ago
Yoh
a4ccad3325
Fix gcc4 simd conflict ( #3957 )
* fix mat::fill gcc4 avx sse conflict bug
* fix build and crash with gcc 4.4.0
Co-authored-by: nihui <shuizhuyuanluo@126.com>
3 years ago
dankernel
01655613d1
Create build-for-VisualStudio.en.md ( #3956 )
- Translated Chinese documents into English
- Updated to VS2022 version
3 years ago
Jianbo-Ning
4fad760a64
add faq.en.md ( #3901 )
3 years ago
nihui
28e0c9132a
fix cmake find thread on mips
3 years ago
nihui
27dc780005
mips msa optimization for innerproduct fp16s ( #3953 )
3 years ago
nihui
706831f8a9
arm vfpv4 optimization for innerproduct ( #3950 )
3 years ago
nihui
440bfdd2cc
x86 f16c optimization for innerproduct ( #3944 )
3 years ago