nihui
f03c8f9307
build universal pnnx wheel and pnnx entrypoint script ( #5331 )
2 years ago
nihui
f5b55c62e9
use 4 job for github ci ( #5330 )
2 years ago
nihui
984d6dd844
promote vfpv4 for auto fp16 storage conversion ( #5325 )
* promote vfpv4 for auto fp16 storage conversion
* always report neon and vfpv4 for arm64
2 years ago
nihui
5b536af234
fix uwp build ( #5328 )
2 years ago
nihui
d38bdbdb84
fix debug build on some compiler, fix #5295 ( #5326 )
2 years ago
nihui
87d7165848
disable signal based detectisa if being debugged ( #5280 )
2 years ago
Justin Fung
f6763262d1
Add draw rectangle, draw text, draw circle, and draw line to C API ( #5324 )
2 years ago
Molly Sophia
545a3671ee
python pnnx add option to change fp16 parameter ( #5320 )
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2 years ago
Q-engineering
5fdbb54b7f
Add Radxa Zero 3W to benchmark ( #5321 )
2 years ago
Xinyu Yang
7ac42680cf
RVV: Refine riscv gemm fp32 ( #5303 )
* replace storexxx with vsseg2e32_v_f32m1
* refine transpose
---------
Co-authored-by: Xinyu302 <Xinyu302@users.noreply.github.com>
2 years ago
Darren Cheng
10fd242741
add remipi benchmark ( #5319 )
2 years ago
Sophon
294e786d36
convolution_x86: Fix typo in logging ( #5310 )
Signed-off-by: Xilin Wu <wuxilin123@gmail.com>
2 years ago
HalfSweet
66b26b6ce8
add PhytiumPi ncnn benchmark ( #5312 )
2 years ago
nihui
0942efab2e
x86 avx512 optimization for mish ( #5309 )
2 years ago
nihui
7928d44d51
port stb image optimization ( #5307 )
2 years ago
nihui
40958d3ab3
pnnx support dynamic slice indexes ( #5299 )
* pnnx handle two operands add/sub/rsub variant
* fuse dynamic slice indexes, wip
* pnnx sliceindexes
* reset device may change non-dtype input numeric 5 to 6
* print inf as float
* preserve dtype for generation op
* pnnx convert torch.masked_select
* test masked_select
* test negative slice
2 years ago
David Sugarman
ff17c170f3
Grammar and typos fix suggestion ( #5304 )
2 years ago
hugo-syn
7d8019d577
chore: add markdown code highlight ( #5302 )
Signed-off-by: hugo-syn <hugo.vincent@synacktiv.com>
2 years ago
hugo-syn
f35eb4b3b8
chore: Fix multiple typos ( #5301 )
Signed-off-by: hugo-syn <hugo.vincent@synacktiv.com>
2 years ago
nihui
05b4dcb06c
report vulkan cm 8x8x16 config, enable fp16a cm ( #5298 )
2 years ago
dependabot[bot]
1012f85139
Bump actions/cache from 3 to 4 ( #5297 )
Bumps [actions/cache](https://github.com/actions/cache ) from 3 to 4.
- [Release notes](https://github.com/actions/cache/releases )
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md )
- [Commits](https://github.com/actions/cache/compare/v3...v4 )
---
updated-dependencies:
- dependency-name: actions/cache
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2 years ago
nihui
5329d32e74
check vulkan fp16 uniform support and implement lfp conversion without fp16u ( #5287 )
2 years ago
nihui
656b082284
fix cast armv7 sigbus when loading fp16 model ( #5292 )
* fix sigbus error when loading fp16 model on armv7
* apply for bf16
2 years ago
nihui
a705a24f32
pnnx convert some cudnn conv2d variants ( #5289 )
2 years ago
nihui
7ed252c854
pnnx handle index_put with empty indices and scalar values ( #5288 )
2 years ago
Molly Sophia
09f15e6980
Add Dimensity 9300 MT6989 ncnn benchmark ( #5284 )
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2 years ago
wonderfullook
28078b7ad6
add orangepi zero2 ncnn benchmark ( #5277 )
2 years ago
mizu-bai
2ac1e776dd
Add Xeon Phi 3120A results ( #5276 )
2 years ago
nihui
c557fb6704
pnnx model binary over 4g ( #5274 )
2 years ago
nihui
76dcaa4947
do not eliminate noop math if shape changes, improve torch-2.1 mha attn_mask detection ( #5273 )
2 years ago
nihui
ba42369c68
workaround l2 norm produce -inf value with subnormals ( #5272 )
2 years ago
nihui
c222208cc9
feat mask for disable threading, make some extractor setter no-op, update doc ( #5270 )
2 years ago
mizu-bai
237f45ff3e
Update AWS and OneCloud benchmark results with ncnn 20240102 ( #5257 )
* Update OneCloud results with 20240102
* Update AWS c5.4xlarge Instance results with ncnn 20240102
* Add compiler info for OneCloud
2 years ago
Ikko Eltociear Ashimine
5581d27d4d
docs: update FAQ-ncnn-vulkan.md ( #5268 )
plase -> please
2 years ago
nihui
a31f66203b
do not cache temporary blob for uploading weight ( #5266 )
2 years ago
nihui
556b79ce4d
create layer decoupled ( #5258 )
* create layer decoupled
* no more virtual public
* allow build test with shared library
* decouple cpu vulkan
* drop old scripts
2 years ago
nihui
6f84952122
pnnx handle more softmin logsoftmax dtype, fuse static full range slices to tensor_split ( #5253 )
* pnnx handle more softmin logsoftmax dtype, fuse static full range slices to tensor_split
* fix convert nn.Conv2d with none bias tensor
* fix embedding input with batch index zero
2 years ago
nihui
1e88fb8d5b
fix ci release artifact name conflicts ( #5251 )
* build android vulkan 32bit with api-14
* fix artifact name conflicts
* add watchos tvos
2 years ago
Molly Sophia
92d49e1f59
requantize: Use activation_ss in fused_activation.h ( #5245 )
This fixes int8 requantization on risc-v
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2 years ago
nihui
d1d9aa2edb
fix some cpu.cpp warning ( #5244 )
2 years ago
Joey Ballentine
a7506008c8
Add more gpu-related python bindings ( #5165 )
2 years ago
nihui
d30af29ee2
fix simplecv Mat templated ptr ( #5241 )
2 years ago
nihui
c41aa2fdfd
pnnx export with ptpath ( #5239 )
* pnnx export with ptpath
* build and test python pnnx
2 years ago
nihui
09f2723699
pnnx nn.Identity test ( #5238 )
2 years ago
nihui
6c261a8c04
fix the missing elemsize in vkimagemat from_android_hardware_buffer ( #5237 )
2 years ago
nihui
ded0b78bb2
fix nvidia vulkan crash on exit ( #5234 )
2 years ago
nihui
8c4fc5e2a0
enable uniform 16bit and 8bit when available, fix validation error in fp16sa shader ( #5233 )
2 years ago
dependabot[bot]
b11a9d1dac
Bump actions/labeler from 4 to 5 ( #5198 )
* Bump actions/labeler from 4 to 5
Bumps [actions/labeler](https://github.com/actions/labeler ) from 4 to 5.
- [Release notes](https://github.com/actions/labeler/releases )
- [Commits](https://github.com/actions/labeler/compare/v4...v5 )
---
updated-dependencies:
- dependency-name: actions/labeler
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
* Update labeler.yml
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: nihui <nihuini@tencent.com>
2 years ago
nihui
606626baf3
python pnnx return optimized model ( #5232 )
2 years ago
nihui
b7f70cfe4e
initialize cpu thread affinity mask all to all cores ( #5231 )
calling omp_set_num_threads with zero num_threads is implementation-defined
2 years ago