nihui
c09d7b3591
mips msa optimization for convolution int8 ( #3675 )
* basic mips msa optimization for convolution int8
* mips msa optimization for convolution int8 gemm
* mips msa optimization for convolution int8 winograd pack8to4/pack8to1
* mention msa maddv/msubv intrinsics bug
4 years ago
dependabot[bot]
1a0bf1f517
Bump pypa/cibuildwheel from 2.3.1 to 2.4.0 ( #3674 )
Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel ) from 2.3.1 to 2.4.0.
- [Release notes](https://github.com/pypa/cibuildwheel/releases )
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md )
- [Commits](https://github.com/pypa/cibuildwheel/compare/v2.3.1...v2.4.0 )
---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
4 years ago
nihui
72c467d1d9
mips msa optimization for quantize dequantize requantize ( #3672 )
4 years ago
WuJinxuan
e984f9f40d
add multiheadattention arm ( #3667 )
* add:multiheadattention_arm
* pass the test in local
* add omp
* return naive
Co-authored-by: EdVince <EdVince@users.noreply.github.com>
4 years ago
nihui
1fcad0e765
loongson mmi optional layer
4 years ago
nihui
0d83ad99f8
link pnnx with pthread, fix minmax issue on windows build
4 years ago
nihui
559e5b23f9
vulkan tensorcore optimization ( #3628 )
* query and enable cooperative matrix
* fix build with old vulkan sdk
* implement cooperative matrix optimization
* add nvidia-t4 coverage
* adjust test option for more coverage
4 years ago
nihui
4302f78f55
less specialization constant for vulkan conv1x1s1d1 shaders ( #3657 )
4 years ago
Guo Haria
67f52ba73c
Update yolov5.py ( #3656 )
fix rect position bug.
4 years ago
Shangxin
94beeaf000
fix word case ( #3655 )
4 years ago
nihui
b934fd53e7
fix vs2019 packaging
4 years ago
tpoisonooo
aab476f5b6
fix build warning ( #3651 )
4 years ago
nihui
9c92df814f
better condition for mixing vulkan winograd f23 and f43
4 years ago
nihui
944829838b
vulkan conv1x1s1d1 for any packing ( #3646 )
4 years ago
nihui
62a872bad3
vulkan winograd for any packing ( #3645 )
4 years ago
dependabot[bot]
aac71352fd
Bump actions/cache from 2.1.7 to 3 ( #3643 )
Bumps [actions/cache](https://github.com/actions/cache ) from 2.1.7 to 3.
- [Release notes](https://github.com/actions/cache/releases )
- [Commits](https://github.com/actions/cache/compare/v2.1.7...v3 )
---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
4 years ago
nihui
9e3cc1c5df
two stage vulkan innerproduct ( #3642 )
4 years ago
nihui
677d54d496
fuse vulkan winograd pad and crop ( #3640 )
4 years ago
nihui
002c07d4ec
mix vulkan winograd f23 and f43 ( #3639 )
* mix vulkan winograd f23 and f43
* larger epsilon for winograd optimization test
4 years ago
nihui
d42e048b56
pnnx convert torch.addmm ( #3634 )
4 years ago
nihui
3ddd65e18c
massive vulkan optimization part3 ( #3632 )
* implicit gemm
* unroll direct conv by 2x2x2
4 years ago
nihui
cfcb1cffa9
massive vulkan optimization part2 ( #3621 )
* vulkan local memory optimization for conv1x1 pack4 and winograd on dgpu
* unified innerproduct pipeline creation
* reorder deconvolution weight layout
* flexible local memory data type
* more local memory optimization for conv/deconv gemm
4 years ago
tpoisonooo
edd3a78ffe
style(src/layer): remove unused para and var ( #3623 )
Co-authored-by: MegEngine <megengine@megvii.com>
Co-authored-by: tpoisonooo <tpoisonooo@users.noreply.github.com>
4 years ago
tpoisonooo
6e12647985
docs(how-to-build.md): update jetson description ( #3622 )
Co-authored-by: MegEngine <megengine@megvii.com>
4 years ago
tpoisonooo
b8f36c258e
Update how-to-build.md ( #3619 )
4 years ago
nihui
f9663d7726
pnnx support torch 1.11.0 ( #3617 )
* adapt torch 1.11.0 api changes
* find python library for torchvision linking
4 years ago
nihui
6e19ab26ba
massive vulkan optimization ( #3602 )
* vulkan deconvolution sgemm col2im
* vulkan convolution winograd43
* improve fp16s numeric stability
* vulkan convolution im2col sgemm
* check squeezenet top2, as top3 vs top4 scores are too close
4 years ago
nihui
8f25ba0cab
enable fp16a on mali-g31
4 years ago
Kenji Mouri
8e29c42080
Improve SSE2 implementations in x86 targets. ( #3605 )
* Fix some typos for SSE2 floor.
* Improve the implementation of SSE2 abs.
* Improve the implementation of SSE2 ceil.
4 years ago
Kenji Mouri
2b4a2125e6
Add SSE2 implementation of floor and ceil in x86 targets. ( #3595 )
* Add SSE2 implementation of floor and ceil in x86 targets.
* apply code-format changes
* Update the SSE2 floor implementation.
Co-authored-by: MouriNaruto <MouriNaruto@users.noreply.github.com>
4 years ago
MouriNaruto
f4f4cfd784
apply code-format changes
4 years ago
nihui
8af8e52cb0
add nanodetplus pnnx example
4 years ago
Kenji Mouri
3ba5d9765f
Add arm and arm64 targets support for MSVC. ( #3592 )
4 years ago
Kenji Mouri
e4f6b118a2
Fix comment typo because it should be itm[6][6]. ( #3591 )
4 years ago
nihuini
b053a8c6d5
fix unlocked pool allocator destroyed too early issue in gpu convdw and deconvdw inference
4 years ago
nihui
30e106b185
add another mali g52 device id
4 years ago
nihui
41cd40ac1f
convert pnnx torch.unbind and torch.ones/ones_like family ( #3583 )
* convert torch.unbind
* torch.ones torch.ones_like
* torch.full torch.full_like
* torch.randn_like
* torch.empty torch.empty_like
4 years ago
dependabot[bot]
8b68ec050d
Bump actions/checkout from 2 to 3 ( #3588 )
Bumps [actions/checkout](https://github.com/actions/checkout ) from 2 to 3.
- [Release notes](https://github.com/actions/checkout/releases )
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md )
- [Commits](https://github.com/actions/checkout/compare/v2...v3 )
---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
4 years ago
nihui
2880eff264
deconv1d deconv3d ( #3584 )
* fix sigmoid returns nan with very large input
4 years ago
dependabot[bot]
556aeb675b
Bump actions/setup-python from 2 to 3 ( #3586 )
Bumps [actions/setup-python](https://github.com/actions/setup-python ) from 2 to 3.
- [Release notes](https://github.com/actions/setup-python/releases )
- [Commits](https://github.com/actions/setup-python/compare/v2...v3 )
---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
4 years ago
Yoh
f5b90a5333
Add unaryop x86 ( #3579 )
Co-authored-by: Yoh <wangpeizhou@bilibili.com>
Co-authored-by: Yoh-Z <Yoh-Z@users.noreply.github.com>
4 years ago
nihuini
bc188ece58
update modelwriter for new operators
4 years ago
nihui
457e066eb5
x86 f16c infrastructure ( #3577 )
4 years ago
nihui
38ae671391
add yolov5 pnnx example
4 years ago
Yoh
4b68e3f9c1
Opt avxmath ( #3563 )
* optimize x86 avx exp log sincos fma
* optimize avx&sse_mathfun exp,log,cos,sin,sincos and add fnmadd to x86_usability
Co-authored-by: Yoh-Z <Yoh-Z@users.noreply.github.com>
4 years ago
nihui
920aa79f04
drop x86 avx2 fp16 ( #3568 )
4 years ago
nihui
6b2495cc24
add reshape before and after pooling 1d/2d/3d with no batch dimension ( #3566 )
4 years ago
nihui
76e32e9ee6
fix interp nearest by scale factor, fix issue #3555 ( #3565 )
* let's accept the float div vs reciprocal error
4 years ago
nihuini
6ee5ab72f6
handle reshape 5d with batch index 0
4 years ago
tpoisonooo
cfacba273f
improvement(binaryop): use MAKE_FUNCTION macro ( #3559 )
Co-authored-by: MegEngine <megengine@megvii.com>
Co-authored-by: tpoisonooo <tpoisonooo@users.noreply.github.com>
4 years ago