nihui/ncnn - ncnn - Open-Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
nihui	a46edcf720	x86 optimization for interp (#3546)	4 years ago
nihui	139554b36e	rewrite convolution x86 sgemm pack1 (#3544)	4 years ago
Yoh	d2999b8d53	Optimize scale x86 (#3540) Co-authored-by: Yoh-Z <Yoh-Z@users.noreply.github.com>	4 years ago
nihui	fb6283c8b0	x86 avx fma optimization (#3543)	4 years ago
nihui	3a43cc7015	update efficientnetv2_b0 param for reduction axes changes	4 years ago
nihui	3181616439	treat old reduction axes param as failure	4 years ago
nihui	672daa7e04	xop infrastructure and optimization (#3541)	4 years ago
nihui	9d0c36358c	add z8350 and n5105 benchmark	4 years ago
nihui	de77b669c4	x86 sse2 optimization for conv1x1/3x3 pack4 and general sgemm pack4/pack4to1 (#3538) * x86 sse2 optimization for conv1x1 conv3x3 pack4 and general sgemm pack4/pack4to1 * x86 sse2 optimization for conv3x3s1 pack4to1 and general sgemm convolution pack4to1, use aligned load/store * enforce explicit alignment	4 years ago
nihui	6422e6acd3	fix x86 sgemm convolution int8 weight shuffle	4 years ago
nihui	340b4e673e	pnnx fold constant (#3521)	4 years ago
Kagurazaka Kotori	08ecc94d63	x86: Use _mm_cvtsi128_si{32,64} in float2int8 (#3536) This patch uses _mm_cvtsi128_si{32,64} intrinsics when returning value in float2int8() to reduce unnecessary memory accesses. Resolves TODO "use _mm_cvtsi128_si64 on 64bit target". Signed-off-by: Kagurazaka Kotori <kagurazakakotori@gmail.com>	4 years ago
nihui	1d0b78f9b6	Update README.md	4 years ago
nihui	a356d152bb	Update README.md	4 years ago
Joson	70795c6548	Create README.md (#3532)	4 years ago
teng	3ff9ae707f	simplify macro (#3530)	4 years ago
Kagurazaka Kotori	5c078016c2	x86/avx_mathfun.h: Remove fallback warnings (#3527) * x86/avx_mathfun.h: Remove fallback warnings This patch removes warning messages indicating falling back to SSE2 when AVX2 support is disabled as suggested. Also reorders non-AVX2 macros for readability and faster preprocessing. Suggested-by: nihui <shuizhuyuanluo@126.com> Signed-off-by: Kagurazaka Kotori <kagurazakakotori@gmail.com> * apply code-format changes Co-authored-by: kagurazakakotori <kagurazakakotori@users.noreply.github.com>	4 years ago
nihui	2d46994d2e	wrap avxvnni and avx512vnni build options over cpu feature detector	4 years ago
nihui	33e225f173	fix c api test	4 years ago
nihui	bae2ee375f	simplify c api layer forward_n output array type	4 years ago
nihuini	1be043aad5	convert torch mean/sum/prod reduction with no args	4 years ago
nihuini	b4a755495c	convert pnnx zeros roll remainder	4 years ago
nihui	c0a94cd9ca	fix armv7 without neon (#3514)	4 years ago
nihuini	4ba1eb6d2f	assign unique names for all pnnx operator and operand names. fix #3493	4 years ago
nihuini	457f7d1c63	fix use-after-free, fix #3492	4 years ago
nihui	b07ad54320	add zynq-7020 benchmark	4 years ago
nihui	4e4e0b9cf8	do not link libgcc as we no longer rely on builtin support cpu feature intrinsics now	4 years ago
nihui	71f377e9e9	update benchmark from Q-engineering	4 years ago
nihui	d95213a005	x86 convolution int8 optimization third stage (#3506) * avx-vnni and avx512-vnni optimization for convolution int8 gemm and 3x3 winograd pack8to4/pack8to1	4 years ago
nihuini	9f7f491885	use the old-style __cpuid_count for old compiler compatibility, fix #3510	4 years ago
nihui	930c36ebe2	avx512 infrastructure (#3407)	4 years ago
nihui	c2896bcd4d	x86 convolution int8 optimization second stage (#3495) * some sse 4.1 optimization * sse2/avx2 optimization for convolution 3x3 winograd42 int8 pack8to4/pack8to1	4 years ago
teng	13a51fbcf8	add else (#3494)	4 years ago
nihui	e9b8f0a6ef	x86 avx2 optimization for convolution gemm int8 (#3489)	4 years ago
nihui	c5d7f963b9	layer tile (#3491)	4 years ago
dependabot[bot]	d25388c938	Bump pypa/gh-action-pypi-publish from 1.4.2 to 1.5.0 (#3490) Bumps [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) from 1.4.2 to 1.5.0. - [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases) - [Commits](https://github.com/pypa/gh-action-pypi-publish/compare/v1.4.2...v1.5.0) --- updated-dependencies: - dependency-name: pypa/gh-action-pypi-publish dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	4 years ago
Xiaoyang Chen	4d31c46532	[pnnx] Update README.md (#3487)	4 years ago
nihui	7d3503c06a	pnnx Tensor index (#3483) * pnnx Tensor index * add test	4 years ago
nihuini	1db16ce9fc	pnnx torch norm stack test	4 years ago
nihuini	23d3340017	pnnx norm stack	4 years ago
nihuini	e33bdd16e8	pnnx fuse conv1d-bn convtranspose1d-bn	4 years ago
nihuini	f8ca1e7585	fix pnnx crash on unsupported expression	4 years ago
nihui	7c60dc2db7	pnnx roialign (#3478)	4 years ago
nihui	143258e317	pnnx torchvision deformconv2d (#3459)	4 years ago
Xiaohan Liu	3daabd515d	add missing doffset (#3475)	4 years ago
nihui	7b222a19af	update benchmark (#3465) * update qcom855+ benchmark * Update README.md * Update README.md * add rock3a, update imx.7d benchmark * update raspberrypi3b+ benchmark * update	4 years ago
dog-qiuqiu	009d607a15	add the param file of yolo-fastest in benchmark (#3470)	4 years ago
nihuini	014387dfae	update operators doc	4 years ago
nihuini	de436f9e26	pnnx arange matmul zeros_like expand_as	4 years ago
nihui	922f8b33c1	reduction4d, merge keepdims arg, add test (#3469)	4 years ago


2410 Commits (a46edcf720aba9cd352fc959e59a50eac8dc43d7)
