nihui/ncnn - ncnn - Open Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
nihui	0734b657d9	spectrogram and inverse spectrogram (#5779 ) * only supports hann, hamming and all-one window * inverse spectrogram does not support length parameter * spectrogram always returns torch.view_as_real(out) as ncnn does not support complex typed mat yet * inverse spectrogram always accepts torch.view_as_complex(in) as ncnn does not support complex typed mat yet	1 year ago
nihui	e7602a206b	fix gemm arm int8 scales descales offset (#5750 )	1 year ago
nihui	8fe62812c9	arm neon optimization for layernorm fp32/bf16s/fp16s (#5746 )	1 year ago
nihui	66b54cbea2	multiheadattention int8 quantization (#5733 ) * x86 vulkan fallback * comment about bf16s	1 year ago
nihui	1c7af00499	gemm int8 quantization (#5706 ) * quantize gemm * write gemm quantize scales * update doc * less openmp args * x86 riscv fallback * skip gemm vulkan int8 * fix noint8 test, fix arm bf16 test * enable vfpv4 on neon build only * fix gemm vulkan without C * fp16 pack8 output * enable elempack=8 only for asimdhp+ * tiled gemm int8 test * opt arm64 tiles, fix asimdhp dispatch	1 year ago
nihui	5df5413c81	embed int8 quantization and add embed test (#5667 )	1 year ago
nihui	fdf0df3079	RMSNorm (#5630 )	1 year ago
nihui	3752d71200	fix potential fp16s bf16s conflicts on arm vfpv4 (#5578 ) * fix potential fp16s bf16s conflicts on armv7 vfpv4 * but prefer fp16 on armv8.2	1 year ago
nihui	4c3debae2d	multiheadattention scale param (#5526 ) * update swiftshader * skip vs2017 swiftshader	1 year ago
nihui	8235cad999	mha allow qdim differs from embed_dim (#5519 ) * test mha oom	1 year ago
nihui	39c27de47b	test concat oom (#5502 )	1 year ago
nihui	093c516898	test slice oom (#5501 )	1 year ago
nihui	da7d1a10f7	test x86 arm convolution oom (#5492 ) * skip mips loongarch riscv oom test atm * test softmax oom	1 year ago
nihui	08b7d99a75	rnn/lstm/gru dynamic quantization (#5435 )	2 years ago
nihui	9ce7930413	x86 optimization for convolution tiled gemm (#5426 )	2 years ago
nihui	e3758fdd19	fix test reduction warning (#5397 )	2 years ago
nihui	984d6dd844	promote vfpv4 for auto fp16 storage conversion (#5325 ) * promote vfpv4 for auto fp16 storage conversion * always report neon and vfpv4 for arm64	2 years ago
nihui	5329d32e74	check vulkan fp16 uniform support and implement lfp conversion without fp16u (#5287 )	2 years ago
nihui	556b79ce4d	create layer decoupled (#5258 ) * create layer decoupled * no more virtual public * allow build test with shared library * decouple cpu vulkan * drop old scripts	2 years ago
nihui	ded0b78bb2	fix nvidia vulkan crash on exit (#5234 )	2 years ago
nihui	eea3fc9b41	optimize vulkan global pooling (#5191 ) Co-authored-by: nihui <nihui@users.noreply.github.com> Co-authored-by: michaelcai <michaelcai@tencent.com>	2 years ago
nihui	4136de3b8d	arm optimization for convolution int8 packed unified elempack (#5147 )	2 years ago
nihui	4494aadd74	deconvolution dynamic weight (#5119 )	2 years ago
nihui	14e14a9ae8	slice with indices (#5103 )	2 years ago
邓实诚	a1e3ebf8e5	implement simplemath (#4905 ) * complete abs, fmod and sin function in simplemath.h * remove some unused variables in simplemath.cpp * modify test-coverage.yml and add some functions to simplemath.cpp * modify erf.cpp which included math.h * include platform.h for NCNN_SIMPLEMATH definition * move utility constants and functions in simplemath.h to simplemath.cpp * guard simplemath functions with extern "C" * add NCNN_EXPORT macro in simplemath.h * include plateform.h and guard all declarations with NCNN_SIMPLEMATH * clean unused code in test_unaryop.cpp * guard #include <vector> with NCNN_SIMPLEMATH in benchncnn.cpp * add 'static' to guard functions that not declarated in header file * modify sin and cos with better implementation --------- Co-authored-by: HonestDeng <HonestDeng@users.noreply.github.com>	2 years ago
Yoh	3f437d3f3d	Grid sample op (#4373 ) * pnnx support grid_sample op * complete the permute and gridsample operator fusion * spilt calculation into two stages and support permute fusion	2 years ago
nihui	7b02425246	x86 optimization for convolution int8 winograd unified elempack (#5054 )	2 years ago
FhqTreap	1d7720efe8	fix test conv1d (#5049 )	2 years ago
nihui	78aca88d67	elu 4d and selu 4d (#5047 )	2 years ago
Beq Jal	019176c6b2	selu and shufflechannel on x86 (#5017 )	2 years ago
Amir Ramezani	7e5fa3ade3	shrink operator (#5022 )	2 years ago
nihui	c8662cce5e	arm optimization for convolution int8 gemm unified elempack (#5016 )	2 years ago
Amir Ramezani	0ea587b8c7	celu activation vulkan and onnx conversion (#5018 )	2 years ago
Beq Jal	bcfec1da33	Celu layer and export to ncnn (#5019 )	2 years ago
Beq Jal	c851231832	add diag layer and its converter (#4935 )	2 years ago
Amir Ramezani	695f770eab	erf implementation (#5012 ) * added erf implementation * added testcase for erf * added onnx2ncnn support of erf	2 years ago
nihui	4abadd2ffb	binaryop implicit broadcast B with 1 dimension rank for outer axis (#4930 )	2 years ago
nihui	c45c01c7c1	enable VK_KHR_cooperative_matrix (#4823 ) * enable VK_KHR_cooperative_matrix * add khr cm shader * update glslang * print matrix info	2 years ago
nihui	55709708e9	x86 optimization for convolution int8 packed unified elempack (#4861 )	2 years ago
nihui	1283a19305	pnnx convert torch round trunc (#4813 ) * update riscv qemu * c906 test on qemu * fix qemu aarch64	2 years ago
nihui	9022b7162a	implement all explicit binaryop broadcast types (#4809 ) * simplify binaryop * less gpu test * update binaryop broadcast doc * do not test atan2 zero	2 years ago
nihui	903ec7c2c9	fix overwrite builtin layer destruction (#4732 ) * fix overwrite builtin layer destruction * make modelbin class copyable * test++	3 years ago
nihui	f893d2440d	innerproduct allow 1 height gemm (#4730 )	3 years ago
nihui	249b264336	workaround moltenvk error on spec const composite op (#4714 ) * workaround moltenvk error on spec const composite op * workaround moltenvk crying on binding image with memory offset	3 years ago
nihui	a37a83d850	clip gelu mish tanh 4d (#4695 )	3 years ago
nihui	cd5a6098a2	sigmoid and swish 4d (#4692 )	3 years ago
nihui	c28c8c04a1	multiheadattention attn mask (#4668 )	3 years ago
nihui	b640574b88	rough vulkan gemm and multiheadattention (#4618 )	3 years ago
nihui	db628b1b99	allow overwriting built-in layer with custom layer (#4616 )	3 years ago
nihui	1133a18ca8	x86 and arm optimization for convolution1d packed unified elempack (#4615 )	3 years ago


340 Commits (f1bdc87478c64e0dfdba3d679e6e0dfd4b84df80)