未知时光
dee2e0dc0c
fix ios-simulator-gpu badge ( #4836 )
2 years ago
張小凡
1e0d70af8c
Add translated document: glsl-extension.zh.md ( #4818 )
2 years ago
nihui
4b97730b0d
x86 packed convolution transform kernel avx2/avx512 optimization ( #4819 )
* fix non-sse non-neon weight pack
2 years ago
nihui
6c21b08727
check loongarch lasx and enable ( #4820 )
2 years ago
nihui
43aba6badb
Update glsl-extension.md
2 years ago
nihui
172b748c74
add ncnn glsl extension doc ( #4817 )
2 years ago
nihui
1283a19305
pnnx convert torch round trunc ( #4813 )
* update riscv qemu
* c906 test on qemu
* fix qemu aarch64
2 years ago
nihui
3a74ae4d3d
update rpi3b+ benchmark data
3 years ago
nihui
8c40a59216
pnnx insert reshape for ncnn global pooling ( #4812 )
3 years ago
nihui
9022b7162a
implement all explicit binaryop broadcast types ( #4809 )
* simplify binaryop
* less gpu test
* update binaryop broadcast doc
* do not test atan2 zero
3 years ago
nihui
cc37c10997
update rpi4b benchmark
3 years ago
Zhenjia Guo
d9e45ec703
fix pnnx PermissionError ( #4801 )
3 years ago
Zhang Geng
4a78b6d457
Update HUAWEI KunPeng 920 platform ( #4795 )
3 years ago
nihui
e112461d30
write shape, fuse sam image encoder attention ( #4792 )
* write shape, fuse sam image encoder attention
* set more dynamic shape as static
* less warning for constant tensor node
3 years ago
nihui
b8cf8cb73e
pnnx rewrite multiple ops ( #4780 )
fuse F.scaled_dot_product_attention
3 years ago
張小凡
ec0a8503c5
Fix function recursion errors under some low-version c++ linux compilers ( #4768 )
3 years ago
Justin62628
9dc581e490
Fix pnnx index out of range in eval expression ( #4765 )
3 years ago
dependabot[bot]
cb104e31ee
Bump pypa/cibuildwheel from 2.12.3 to 2.13.0 ( #4759 )
Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.12.3 to 2.13.0.
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](https://github.com/pypa/cibuildwheel/compare/v2.12.3...v2.13.0)
---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
3 years ago
huoshuai-dot
f7f6ca0033
Update onnx2ncnn.cpp ( #4754 )
3 years ago
nihui
07c2cac091
update loongarch ci toolchain, add loongarch lsx coverage ( #4747 )
3 years ago
nihui
f7652ec72d
pnnx fuse chinese clip mha and write need_weights=False ( #4745 )
3 years ago
nihui
7883f4d0c3
shadowed variable for less openmp task args ( #4744 )
3 years ago
nihui
1d6bfdca38
fix pnnx pass on fp16 weight, common fp16 conversion routines ( #4743 )
3 years ago
nihui
0dbe5a7180
possible fix for issue 4735 ( #4737 )
3 years ago
Shirui Cao
7646392d3e
Fix clang-cl.exe compatibility by using the correct cpuid() built-in function ( #4738 )
3 years ago
nihui
903ec7c2c9
fix overwrite builtin layer destruction ( #4732 )
* fix overwrite builtin layer destruction
* make modelbin class copyable
* test++
3 years ago
nihui
f893d2440d
innerproduct allow 1 height gemm ( #4730 )
3 years ago
nihui
1b4a8fd4b2
fix warnings and code clean ( #4729 )
3 years ago
張小凡
0e09ae1290
Reimplement the sleep() and get_current_time() functions using modern C++. ( #4677 )
3 years ago
nihui
d4046b4ae9
pnnx fuse transformer clip attention and diffusers attentionblock ( #4727 )
* pnnx fuse transformer clip attention
* skip fuse mha for 1.8
* select one method other than forward
* pnnx fuse diffusers attentionblock
3 years ago
nihui
2b87dc2cf7
force global cpu info initialization ( #4725 )
* force global cpu info initialization
* sanitize zero nT
3 years ago
nihui
c038b8227b
pnnx convert sdpa ( #4722 )
* pnnx convert sdpa
* pnnx fuse diffuser attention2
3 years ago
Kenji Mouri
bd2935c11d
Update some benchmark results when using Hyper-V Linux Guest with GPU-PV enabled ( #4712 )
* Update benchmark information about Hyper-V Linux Guest with GPU-PV enabled (Intel Core i7-11800H, NVIDIA GeForce RTX 3070 Laptop GPU).
* Update benchmark information about Hyper-V Linux Guest with GPU-PV enabled (Intel Core i7-7700K, NVIDIA GeForce GTX 1050 Ti).
3 years ago
Yoh
2d9ec410f4
fix pnnx build bug ( #4721 )
3 years ago
nihui
15cf81c40d
workaround multiheadattention vulkan nan issue on nvidia gpu ( #4682 )
* fix vulkan validation error, prefer VK_KHR_buffer_device_address over VK_EXT_buffer_device_address
* enable validation extension features
3 years ago
nihui
249b264336
workaround moltenvk error on spec const composite op ( #4714 )
* workaround moltenvk error on spec const composite op
* workaround moltenvk crying on binding image with memory offset
3 years ago
nihui
1fa38fe5ac
pnnx convert torch std ( #4715 )
* pnnx convert torch std
* fix multiple fuse pass on torch 2.0
* fuse vit pytorch mha pattern
3 years ago
nihui
7fb16be32a
fix aarch64 build without fp16 conversion intrinsics ( #4713 )
* fix aarch64 build without fp16 conversion intrinsics
* vfpv4 always implies neon
3 years ago
nihui
097e3537ac
fix build with gcc-5.4 aarch64 ( #4703 )
3 years ago
jason_w
48f9bcfce2
place the `if` statement outside the `for` loop ( #4707 )
3 years ago
nihui
6b5ca0f70d
add doc for building for qnx ( #4709 )
3 years ago
nihui
d294b26783
add qnx710.toolchain.cmake ( #4706 )
3 years ago
Zhuo Zhang
b709f041fe
Add qnx.toolchain.cmake ( #4213 )
3 years ago
nihui
8c7c21b5fb
fix fp resource leak in cpu.cpp ( #4704 )
3 years ago
triple Mu
18a37a270b
Update model_zoo.py ( #4495 )
support yolov8s ncnn
3 years ago
kennybradley
f79daf4071
I added yolov7 tiny to the model zoo since it already exists in the ncnn-assets ( #4693 )
3 years ago
nihui
05ad0c52c6
pnnx fuse gelu ( #4702 )
3 years ago
nihui
490816b21b
update ios toolchain, simulator arm64 ci, mac catalyst ci ( #4697 )
3 years ago
Zhang Geng
2b1feaf82b
Update VF2 score ( #4688 )
3 years ago
nihui
a37a83d850
clip gelu mish tanh 4d ( #4695 )
3 years ago