nihui/ncnn - ncnn - Open Source Collaborative Cloud Brain Ecosystem Support System

Author	SHA1	Message	Date
nihui	4abadd2ffb	binaryop implicit broadcast B with 1 dimension rank for outer axis (#4930)	2 years ago
JeremyRand	0a8cf31a05	Add POWER8 VSX toolchains (#4853) * Add POWER8 VSX toolchains POWER8, though slower than POWER9, is still used in the wild; these toolchains should still be much faster on POWER8 than POWER8 without VSX optimizations. * VSX toolchains: set -cpu arg in QEMU CI tests	2 years ago
mizu-bai	4c861a0d1a	Add Building with Intel oneAPI (#4920)	2 years ago
ฅ'ω'ฅ	2303b77ac1	Update how-to-build.md (#4872)	2 years ago
JeremyRand	47e0daf4a1	Translate x86_64 SSE to ppc64le VSX intrinsics (#4807) * Add POWER9 VSX toolchains Translating x86_64 SSE to ppc64le VSX intrinsics yields a quite large speedup on POWER9. See this article for background: https://www.talospace.com/2019/07/easier-power-vectorizing-for-fun-and.html * Add power9le docs * power9le clang toolchain: Document Clang 13+ requirement --------- Co-authored-by: Jeremy Rand <jeremyrand@danwin1210.de>	2 years ago
Kin Yu Shek	e8d8042b90	Fix a mistake in docs/faq (#4837)	2 years ago
張小凡	1e0d70af8c	Add translated document: glsl-extension.zh.md (#4818)	2 years ago
nihui	43aba6badb	Update glsl-extension.md	2 years ago
nihui	172b748c74	add ncnn glsl extension doc (#4817)	2 years ago
nihui	9022b7162a	implement all explicit binaryop broadcast types (#4809) * simplify binaryop * less gpu test * update binaryop broadcast doc * do not test atan2 zero	2 years ago
nihui	6b5ca0f70d	add doc for building for qnx (#4709)	3 years ago
nihui	c28c8c04a1	multiheadattention attn mask (#4668)	3 years ago
nihui	b640574b88	rough vulkan gemm and multiheadattention (#4618)	3 years ago
He Yang	f9180330e2	update how-to-build.md and delete obsolete tutorials in docs (#4660)	3 years ago
張小凡	868ea52bea	update faq.md about gpu performance (#4614)	3 years ago
Zhuo Zhang	a124c2a839	fix typos in citation and benchmark docs (#4604)	3 years ago
inisis	f7de5a7dc2	update faq.md (#4584)	3 years ago
inisis	37042b2174	update build doc for Centos users (#4583)	3 years ago
nihui	6f661f9bc4	Update FAQ-ncnn-throw-error.md	3 years ago
nihui	afc9310c62	update new operators for modelwriter (#4540)	3 years ago
nihui	47ea2877ed	stb and emsdk update (#4536) * stb_image_write 1.16 * stb_image v2.28 * update emsdk 3.1.28 * enable stb arm neon * update doc Co-authored-by: ncnnnnn <67086033+ncnnnnn@users.noreply.github.com>	3 years ago
nihui	fc6ce4a641	copyto operator (#4522)	3 years ago
nihui	242e775d21	pnnx convert torch log10, pow 2 as square (#4518)	3 years ago
nihui	246e71c526	implement atan2 (#4516)	3 years ago
Fangjun Kuang	92e75105c9	Support torch.cumsum (#4505)	3 years ago
nihui	ab4cfbf5b0	enrich ncnn binary broadcast rules (#4513)	3 years ago
Hitesh Kumar	add0a7bac4	fix : minor typo readme (#4486)	3 years ago
nihui	fed99fd35b	gemm output transpose, prepack c (#4479) * mha is now permute and reshape free * gemm user defined tile mnk param	3 years ago
WuJinxuan	10e9d91576	Add x86 MultiHeadAttention (#4443) * fix doc, sync x86 gemm fix Co-authored-by: EdVince <EdVince@users.noreply.github.com> Co-authored-by: nihuini <nihuini@tencent.com>	3 years ago
nihui	fd1ac3c7a0	x86 optimization for gemm unified elempack (#4387)	3 years ago
nihui	eceac35a7f	implement MultiheadAttention kdim vdim (#4347)	3 years ago
Lry89757	6a47f8d15c	gridsample op support (#4288) Co-authored-by: LRY89757 <LRY89757@users.noreply.github.com> Co-authored-by: nihuini <nihuini@tencent.com> Co-authored-by: nihui <shuizhuyuanluo@126.com>	3 years ago
Fangjun Kuang	5281d51535	implement GLU and pnnx conversion (#4283)	3 years ago
nihui	77eda4c19f	implement lstm proj_size (#4263)	3 years ago
MisakaBit	bbbe17c5b5	docs: disable fp16 when wrong results encountered caused by overflow (#4248)	3 years ago
Lry89757	b16f8ca921	[docs] Fix typo (#4201)	3 years ago
miemie2013	720f3c9aab	Add DeformableConv2D (#4070) * Add DeformableConv2D * add unittest and docs * pnnx torchvision deformconv2d conversion Co-authored-by: miemie2013 <miemie2013@users.noreply.github.com> Co-authored-by: nihui <shuizhuyuanluo@126.com>	3 years ago
Zhouzhou	4158e63668	docs：add sse optimized zh (#4053) Signed-off-by: Zhouzhou <1197236910@qq.com>	3 years ago
tpoisonooo	207ca0e0bb	Improve protobuf FAQ doc (#3973)	3 years ago
nihui	f8c76e730a	fix ci release optimization with cmake >= 3.21 and ndk23 (#3976) * Update release.yml * Update how-to-build.md	3 years ago
nihui	14588023b5	release ubuntu 22.04 package, fix ndk debug flag for r23+ (#3972)	3 years ago
dankernel	01655613d1	Create build-for-VisualStudio.en.md (#3956) - Translated Chinese documents into English - Updated to VS2022 version	3 years ago
Jianbo-Ning	4fad760a64	add faq.en.md (#3901)	3 years ago
nihui	f79073c182	update how-to-build doc for raspberrypi and d1	4 years ago
Lry89757	ca9abd1c4a	Update the add-custom-layer.zh.md (#3741) 1. 🐛Fix Bug of float and int. Fixed the mismatch between int and float arguments in std::max(). 2. 👀 The structure of ncnn changes. The ncnn file layout changed; all testlayer.cpp files now live in the tests/ folder.	4 years ago
nihui	ae75a093fa	update print 4d mat, remove deprecated content	4 years ago
nihui	49e70c81a6	update linking glslang libraries	4 years ago
_0Mirror	2dcd85ca71	docs: fix docs about 'Build for iOS on macOS with xcode' (#3696)	4 years ago
nihui	c09d7b3591	mips msa optimization for convolution int8 (#3675) * basic mips msa optimization for convolution int8 * mips msa optimization for convolution int8 gemm * mips msa optimization for convolution int8 winograd pack8to4/pack8to1 * mention msa maddv/msubv intrinsics bug	4 years ago
tpoisonooo	6e12647985	docs(how-to-build.md): update jetson description (#3622) Co-authored-by: MegEngine <megengine@megvii.com>	4 years ago

192 Commits (e80fcbca8f67cf107beebb4dd0333856879dc6fa)