nihui/ncnn - ncnn - Open Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
nihui	debc33fee2	arm handle allocation failures (#5490)	1 year ago
nihui	db035d602d	update ncnnoptimize layers, lightmode=false keeps original weight (#5414)	2 years ago
nihui	556b79ce4d	create layer decoupled (#5258) * create layer decoupled * no more virtual public * allow build test with shared library * decouple cpu vulkan * drop old scripts	2 years ago
nihui	4494aadd74	deconvolution dynamic weight (#5119)	2 years ago
zhiliu6	125b9f2baf	reduce double usage (#4671)	3 years ago
nihui	c471826da1	fix arm bfloat2float float2bfloat oops (#4439)	3 years ago
nihui	dd86cebab8	armv8.6 ci and coverage (#4025) * asimdfhm in fc * move neon bf16 conversion function to arm_usability header * fix cmake option * fix build with newer gcc * arm84 coverage * arm asimdfhm optimization for innerproduct gemm fp16s	3 years ago
nihui	7886e90c65	split arm82 source for smaller binary and memory footprint (#3877) * split arm82 source, wip * check compiler arm82 only for arm 64bit target * drop arm82 registery * strict check compiler support arm82	4 years ago
nihui	241524ffce	discard weight memory for x86 arm vulkan (#3865) * discard weight memory for x86 and vulkan * drop arm innerproduct weight * drop arm convolution weight * drop arm convolutiondepthwise weight * drop x86 vulkan deconvolution deconvolutiondepthwise weight * drop arm deconvolution deconvolutiondepthwise weight * arm neon assembly optimization for innerproduct pack4	4 years ago
nihui	c0a94cd9ca	fix armv7 without neon (#3514)	4 years ago
nihui	24fbb6e8cb	honor thread setting on load and vulkan command, ci avx512 t4 (#3391)	4 years ago
nihui	adfc8b25bc	fix deconv output pad (#3337)	4 years ago
nihui	cdf45a6512	cmake option NCNN_BF16 (#3068)	4 years ago
nihui	5fe75f19ef	architecture changes for int8 packing (#2771) * quantize and dequantize tests * unify activation and usability function * drop NCNN_REQUANT cmake option, test dequantize requantize pack8, fix webassembly build * benchmark use requantize int8 model	5 years ago
nihui	bf09af21be	exp arm fp16sa neon optimization	5 years ago
nihui	72a27d4776	utility wrapper for neon float32 bfloat16 conversion, deconvolution deconvolutiondepthwise arm fp16s fp16sa bf16s	5 years ago
nihui	b5e288b521	layer creator function is not necessary for built-in layers	5 years ago
nihui	01b8b79ed2	packing layout option respect support_packing property	6 years ago
nihui	3ef995ed1e	format code style and setup restyled.io (#1840)	6 years ago
nihui	57bedd59fa	fix build without neon	6 years ago
nihui	038666e049	the initial auto test (#1464) * cpu test * wip * ci run test * travis ci for arm64 * arm64 ctest * copy vulkan loader * wip * run * Update ccpp.yml * gpu test * swiftshader * cache macos swiftshader * try MoltenVK * try vulkaninfo * give swiftshader another try * disable failed macos gpu test * more conv test, fix conv3x3s1 gpu test fail * fix deconvolution test * dilation test * cmake option to build tests * ncnn_add_layer_test macro * host barrier before upload and after download, handle packing layout option * test packing layout * wip * wip * merge deconvolution packing and non-packing code * merge convolution packing and non-packing code * pass top_blob_count param * fix build * take care of non-coherent mappable memory	6 years ago
nihuini	336d1c1edd	remove the ncnn namespace for in source Option	6 years ago
nihuini	cd4be6d0fa	call vulkan create_pipeline on the vkdev condition, drop opt_cpu hacks	6 years ago
nihuini	624291e2b2	use subop optimization for group convolution deconvolution pack4 family	6 years ago
nihui	48e3e7d49c	move neon activation into a wrapper function	6 years ago
nihuini	b7085ceec0	deconvolution apply output adj first, then crop the padding	6 years ago
nihuini	296e0022df	deconvolution output adj and output shape	6 years ago
nihuini	9a6ee37eef	asymmetric padding parameter for convolution and deconvolution family	6 years ago
nihui	394f6786b9	neon enable support_packing	6 years ago
nihui	cf42e7c254	deconvolutiondepthwise pack4 arm neon	6 years ago
nihui	b4c388a72a	Mat misc function accept option parameter, deconvolution pack4 arm neon	6 years ago
BUG1989	d9f269fa3d	use sgemm fp32 on arm platform, optimize conv1x1s2 (#1031)	7 years ago
nihuini	4de4078779	move platform includes out of namespace	7 years ago
nihui	3e003ffd98	fuse sigmoid	7 years ago
nihuini	7a8f68aca6	move vulkan code to subdir, new layer interface create_pipeline and destroy_pipeline for post-loading works	7 years ago
nihuini	c6e075cef7	fuse deconv/innerproduct relu arm	7 years ago
nihuini	a76a07eb3f	fix null sub group elemsize/allocator in depthwise deconvolution, fix #539	7 years ago
nihuini	6f1b0b0a61	quantized padding in convolution, use range sweets	7 years ago
nihui	5d04a3a45c	layer holds bottom blob scale, depthwise convolution read group scales	7 years ago
nihuini	6b536701c3	sub-mat shall be allocator-aware	7 years ago
nihui	9706cd1447	implement ncnn blob/workspace allocator, fine-grained per-layer openmp threads control, fix #469	7 years ago
nihuini	aac70893f8	fix build on gcc	8 years ago
nihuini	9ac305e160	create 3-dim sub blob for group convolution, fix #315	8 years ago
nihui	6c4c810fda	decouple modelbin of different input types, simplify timestamp function	8 years ago
nihuini	76a55693a6	decouple convolutiondepthwise and convolution, reduce binary size by 10%, fix #254	8 years ago
nihuini	a84ba8fc0f	element type storage support in Mat, move data member the first so that a pointer to Mat is a pointer to data, convenient index access for float vector	8 years ago
nihui	bdb70a2010	padding w h in convolution and deconvolution	8 years ago
nihui	44b4519307	non-square convolution and deconvolution kernel stride dilation	8 years ago
huyn	8b9365a68c	fix top_blob not set (#199)	8 years ago
tedder59	4d59d0afda	Add depthwise Deconvolution. (#187) * add depthwise deconvolution. * add depthwise deconvolution. * fix some syntax error and uncessary modification	8 years ago

50 Commits (debc33fee27acc44f2691897e2677ff2d87bbb39)