nihui/ncnn - ncnn - Open-Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
ice	7a0c19c856	feat: pipe & spv cache	11 months ago
Copilot	4644540ea4	Add Windows XP support merging PRs #6176 and #6177 (#6204 ) Co-authored-by: Sugar-Baby <87747602+Sugar-Baby@users.noreply.github.com> Co-authored-by: AtomAlpaca <66774326+AtomAlpaca@users.noreply.github.com>	11 months ago
nihui	fe509e9bc1	flexible coopmat mnk and unified elempack for vulkan deconvolution gemm (#6199 )	11 months ago
nihui	0cfe201b3c	fix vulkan absval fp16 (#6167 ) * fix 1d 2d cstep * fix ranged cstep	1 year ago
nihui	171b9d1bba	use spdx license header, copyright Tencent (#6152 )	1 year ago
nihui	9f832c19c1	vulkan int8 packing quantize dequantize requantize (#3731 ) * add int8 definitions * packing vulkan int8/int32, quantize vulkan * vulkan dequantize * requantize vulkan	1 year ago
nihui	bd0b111775	vulkan tight fp16p pack1 (#6127 )	1 year ago
nihui	24a3b99f1f	drop layer support_image_storage and option use_image_storage (#6126 ) * fix pyncnn build	1 year ago
nihui	211e238639	drop layer forward vkimagemat (#6124 ) vkimagemat was originally used as a mat storage in the hope of improving performance on old adreno gpus, but in fact it is slower than the cpu in most cases and is no longer suitable for the latest adreno architecture and large shapes	1 year ago
nihui	b9f98f0d3a	always allocate aligned size for 1d/2d mat and vkmat (#6104 ) * fix sub mat cstep * fix embed * rnn/lstm/gru int8 test without rounding diversity	1 year ago
nihui	4c4ecdf118	dequantize pack8 for all datatypes, fix convdw int8 dequant pack8 (#6109 )	1 year ago
hanzh	78b2e68728	arm unified elempack optimization for groupnorm (#4080 ) Co-authored-by: mmyyy22 <mmyyy22@users.noreply.github.com> Co-authored-by: nihui <nihuini@tencent.com>	1 year ago
nihui	8363040cb4	pnnx ncnn gelu fast mode, fix interp 2d resize (#5999 )	1 year ago
nihui	ef0b0e631c	interp output size expression (#5994 )	1 year ago
nihui	39c055d7f2	crop axes starts ends expression (#5976 ) * skip dynamic tensor index * handle clone oom	1 year ago
nihui	eed257df1f	ci update llvmpipe (#5954 ) * check image fp16	1 year ago
nihui	07267f2618	softmax 4d test and vulkan, softmax unified elempack optimization for x86 arm riscv (#5931 )	1 year ago
nihui	6396a732ef	reshape shape expression, drop reshape permute, test reshape oom (#5918 )	1 year ago
Yexuan Wu	3571d7e8ec	Support better API to detect big little core in windows after win7 (#5927 )	1 year ago
erquren	c9e0c877f9	add missing license header (#5925 )	1 year ago
nihui	1e3fcb9dda	paramdict value string type, natural array representation (#5915 )	1 year ago
nihui	23890900c2	x86 optimization for convolution int8 gemm (#5874 ) * cmake check compiler test cannot be optimized out * drop requant pack4	1 year ago
nihui	4a70be45ed	fix requantize pack4to8 (#5893 )	1 year ago
nihui	ff5b554003	restrict one dim quantize scale size, test quantize oom (#5892 ) * restrict one dim quantize scale size * sse2 requantize pack8	1 year ago
nihui	956bccd295	restrict one dim requantize scale bias size (#5888 )	1 year ago
nihui	48e1260a6f	restrict one dim dequantize scale bias size (#5886 )	1 year ago
nihui	21a71d3673	slim x86 dequantize (#5879 ) * remove dequantize pack8 test, seems to be useless	1 year ago
nihui	a13958ef47	optimize ncnn test building time (#5867 )	1 year ago
nihui	39cf4f6018	slim reduction (#5866 )	1 year ago
nihui	44e0d95c0d	x86 sse2/xop/avx/avx2/avx512/vnni/vnniint8 optimization for gemm int8 (#5763 ) * skip round problem * sde on ubuntu24	1 year ago
nihui	a9553fcc15	skip unaryop round halfway cases for powerpc (#5814 )	1 year ago
nihui	19caca3140	port rvv intrinsic 1.0+ (#5642 ) * zfh zvfh xtheadvector infra * dispatch for rvv and xtheadvector * dispatch for non-vector zfh * port xtheadvector recp rsqrt trunc * general rvv gemm * c906 and c910 ci * old tuple code clean * update riscv64 ci * update build doc * drop old th1520 toolchain	1 year ago
nihui	0734b657d9	spectrogram and inverse spectrogram (#5779 ) * only supports hann, hamming and all-one window * inverse spectrogram does not support length parameter * spectrogram always returns torch.view_as_real(out) as ncnn does not support complex typed mat yet * inverse spectrogram always accepts torch.view_as_complex(in) as ncnn does not support complex typed mat yet	1 year ago
nihui	e7602a206b	fix gemm arm int8 scales descales offset (#5750 )	1 year ago
nihui	8fe62812c9	arm neon optimization for layernorm fp32/bf16s/fp16s (#5746 )	1 year ago
nihui	66b54cbea2	multiheadattention int8 quantization (#5733 ) * x86 vulkan fallback * comment about bf16s	1 year ago
nihui	1c7af00499	gemm int8 quantization (#5706 ) * quantize gemm * write gemm quantize scales * update doc * less openmp args * x86 riscv fallback * skip gemm vulkan int8 * fix noint8 test, fix arm bf16 test * enable vfpv4 on neon build only * fix gemm vulkan without C * fp16 pack8 output * enable elempack=8 only for asimdhp+ * tiled gemm int8 test * opt arm64 tiles, fix asimdhp dispatch	1 year ago
nihui	5df5413c81	embed int8 quantization and add embed test (#5667 )	1 year ago
nihui	fdf0df3079	RMSNorm (#5630 )	1 year ago
nihui	3752d71200	fix potential fp16s bf16s conflicts on arm vfpv4 (#5578 ) * fix potential fp16s bf16s conflicts on armv7 vfpv4 * but prefer fp16 on armv8.2	2 years ago
nihui	4c3debae2d	multiheadattention scale param (#5526 ) * update swiftshader * skip vs2017 swiftshader	2 years ago
nihui	8235cad999	mha allow qdim differs from embed_dim (#5519 ) * test mha oom	2 years ago
nihui	39c27de47b	test concat oom (#5502 )	2 years ago
nihui	093c516898	test slice oom (#5501 )	2 years ago
nihui	da7d1a10f7	test x86 arm convolution oom (#5492 ) * skip mips loongarch riscv oom test atm * test softmax oom	2 years ago
nihui	08b7d99a75	rnn/lstm/gru dynamic quantization (#5435 )	2 years ago
nihui	9ce7930413	x86 optimization for convolution tiled gemm (#5426 )	2 years ago
nihui	e3758fdd19	fix test reduction warning (#5397 )	2 years ago
nihui	984d6dd844	promote vfpv4 for auto fp16 storage conversion (#5325 ) * promote vfpv4 for auto fp16 storage conversion * always report neon and vfpv4 for arm64	2 years ago
nihui	5329d32e74	check vulkan fp16 uniform support and implement lfp conversion without fp16u (#5287 )	2 years ago


372 Commits (7a0c19c8563e0e65d458c548f983c0d2dcdb36ba)