nihui/ncnn - ncnn - Open-Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
nihui	7d1eec3d5d	the use_bf16_storage option	6 years ago
xieydd	b760e22da2	fix requant relu6 bug (#1590 ) * fix requant relu6 bug * fix * delete pipeline change in forward/forward_inplace avoid race in multithreading	6 years ago
nihui	52ce59e672	fix build with requant option on	6 years ago
nihui	0f7e7bca02	shader shape specialization constant and basic local group size partition (#1523 ) * use Mat class for Shape description * shape specialization constant in compute shader * wip * wip * test forward_inplace, add binaryop unaryop sigmoid test * fix arm unaryop test * fix arm binaryop test * make shape hint optional, cast int8 to fp32, add cast test * wip * follow the good and old local size setting for conv1x1 * the optimal local size rewrite * fix build on msvc * add permute shader for all packing layout, add permute test * concat and slice patial shape constant, slice test * fix slice test * interp test * add lrn test, test packing layout implicitly * add eltwise test * add normalize test * add instancenorm test * reorg shape constant * simple local group size partition * add shape constant param	6 years ago
nihui	e2bd4eae6e	write shape as 4-number tuple	6 years ago
nihui	6cefaad957	ncnnoptimize shape inference, load shape hint	6 years ago
nihui	a718129d76	shader pack8 option works	6 years ago
nihui	6f2ef1932d	int8 code refactoring wip, add int8 test	6 years ago
Anton Kochkov	07170542c9	Fix GCC 9.x warnings (#1462 )	6 years ago
Sungmann Cho	9bfc554bc9	Fix warnings on Visual Studio (#1431 ) * Fix warnings C4244, C4267 in src/layer/yolov3detectionoutput.cpp C4244: '=': conversion from 'int' to 'float', possible loss of data C4244: 'initializing': conversion from 'float' to 'int', possible loss of data C4244: 'initializing': conversion from 'double' to 'float', possible loss of data C4244: 'return': conversion from 'double' to 'float', possible loss of data C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warnings C4244, C4267 in src/layer/yolodetectionoutput.cpp C4244: '=': conversion from 'int' to 'float', possible loss of data C4244: 'initializing': conversion from 'float' to 'int', possible loss of data C4244: 'initializing': conversion from 'double' to 'float', possible loss of data C4244: 'return': conversion from 'double' to 'float', possible loss of data C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4244 in src/layer/quantize.cpp C4244: 'initializing': conversion from 'double' to 'int', possible loss of data * Fix warnings C4244, C4267 in src/layer/detectionoutput.cpp C4244: '=': conversion from 'int' to 'float', possible loss of data C4244: 'initializing': conversion from 'double' to 'float', possible loss of data C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4244 in src/layer/roipooling.cpp C4244: 'initializing': conversion from 'double' to 'int', possible loss of data * Fix warning C4244 in src/layer/sigmoid.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4267 in src/layer/slice.cpp C4267: '=': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4267 in src/layer/softmax.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/interp.cpp C4244: '=': conversion from 'float' to 'int', possible loss of data C4244: 'initializing': conversion from 'double' to 'int', possible loss of data * Fix warning C4244 in src/layer/instancenorm.cpp C4244: 'initializing': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/deconvolutiondepthwise.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/convolutiondepthwise.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/net.cpp C4244: 'return': conversion from '__int64' to 'int', possible loss of data C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data C4267: 'return': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4244 in src/layer/bnll.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4267 in src/layer/concat.cpp C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4267 in tools/mxnet/mxnet2ncnn.cpp C4244: 'initializing': conversion from 'double' to 'float', possible loss of data C4267: '=': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data C4305: 'initializing': truncation from 'double' to 'float'	6 years ago
nihuini	3c9b3074e4	reclaim local vulkan allocator after blob_mats_gpu clear, fix random crash in multithread gpu inference without explicit per-thread allocator set	6 years ago
nihuini	50e8b5e4e8	multiple transfers may run concurrently if there is no dependency with each other, do not share staging buffer memory to fix potential data race	6 years ago
nihuini	33956cbfc3	pretty error info	6 years ago
nihuini	a170ef1acf	remove the default option usage in layer interface, fix write out of range in cast arm pack4, handle fp16p conversion on cpu/gpu transfer	6 years ago
nihuini	e73b06bbb8	fix build with NCNN_STRING=OFF	6 years ago
nihuini	64333429bb	data reader wrapper, fix #1325	6 years ago
nihui	8c1b87b1a2	fallback to cpu if no vulkan device found	6 years ago
Natsu	637d96c1d2	Fix gcc 9 compilation failure (#1189 ) * Fix gcc 9 compilation failure * Fix compilation failure on linux gcc * Fix compilation failure on old gcc * Remove C++11 requirement	6 years ago
nihui	ff62e7eed9	use_packing_layout option works	6 years ago
nihui	b4c388a72a	Mat misc function accept option parameter, deconvolution pack4 arm neon	6 years ago
nihui	8c53706987	net vkdev getter api	6 years ago
BUG1989	bcfe9f453f	initial the ncnn post training quantization tools (#1067 ) * initial the ncnn post training quantization tools * clear some comments of tools * fix the Travis ci compiler error	7 years ago
nihuini	b25f76833a	restore per extractor allocator setters, patially revert `e09607bc22`	7 years ago
nihuini	21b5508c96	shared locked vkallocator cannot prevent concurrent accessing during actual gpu inference, use seperated vkallocator for each queue	7 years ago
nihuini	040a8d2427	set vulkan device by gpu index	7 years ago
nihui	21f79b8546	prefer cpu fp16 casting to reduce upload/download overhead on discrete gpu	7 years ago
nihuini	e09607bc22	add option to upload model function, pipeline creation honors option use flags, setting allocator per extractor do not make much sense	7 years ago
BUG1989	d9f269fa3d	use sgemm fp32 on arm platform,optimize conv1x1s2 (#1031 )	7 years ago
nihuini	838c5df839	option api changes	7 years ago
nihuini	7f7bbf12e5	new api for getting the default gpu device	7 years ago
nihuini	cd7559c639	more fix for fp16p, still disabled by default	7 years ago
nihui	25b9736f82	shader fp16 packed	7 years ago
nihuini	738fb6bb14	print gpu per layer benchmark	7 years ago
nihuini	c9a9486307	merge command submit and wait, expose queue_count, concurrent queue submission shall work	7 years ago
nihuini	7a8f68aca6	move vulkan code to subdir, new layer interface create_pipeline and destroy_pipeline for post-loading works	7 years ago
nihuini	b81e1f3906	get rid of the old workaround :)	7 years ago
nihuini	4729ea3505	bottom blob memory never alias, reuse blob memory more elegantly relying on refcount	7 years ago
nihui	8724440c59	bind wait barrier count member to memory, fix #932	7 years ago
nihui	162c46647d	do not create fp16 shader module on unsupported platform	7 years ago
nihui	d753fe2589	upload fp16 weight, enable fp16 storage and arithmetic	7 years ago
Gemfield	add8c73922	Fix the return value of load_param and load_model (#855 )	7 years ago
Gemfield	573c2bcd93	Fix crash issue during load_model (#848 ) * Fix crash issue during load_model * Fix crash issue during load_model 2nd part	7 years ago
nihui	caeb85d6cd	multithreaded pipeline creation and destruction may cause driver crash :(	7 years ago
nihuini	b2e41bf83d	fallback convolution to cpu path for pad -233	7 years ago
nihuini	d999f43b87	fix vulkan initialization using memory loading	7 years ago
nihuini	d263cd507c	gpu packing and unpacking	7 years ago
nihuini	d3a11eb6c9	one codepath for unified and discrete device	7 years ago
nihuini	433a92401a	auto barrier in pipeline and copy command	7 years ago
nihuini	1f4bdd91b5	uint32_t typed workgroup size	7 years ago
BUG1989	df3d224484	new int8 implement,better accuracy (#749 ) * add the armv7a conv3x3s1 implement without overflow,remove old codes * fix the bug of conv3x3s2 packed int8 * new int8 implement,weight quant by perchanel,better accuracy~ * fix the bug of conv3x3s1 packed int8 neon * add the naive c fp32 and int8 winograd F(2,3) * add the neon intrinsic int8 winograd F(2,3) * optimize the armv7a int8 winograd F(2,3) with neon assembly * optimize the armv7a int8 winograd F(2,3) input transform with assembly. * add the requantize layer and int8 relu implement. * add graph optimize conv1x1s2 -> conv1x1s1,begin optimize int8 aarch64. * fix int8 bugs * add the c naive im2col with sgemm * add aarch64 int8 winograd f23, conv3x3s2 naive implement * add the int8 sgemm conv7x7s2 on x86/armv7a platform * optimize the int8 sgemm by neon intrinsic and packed kernel * optimize the int8 sgemm with packed data * optimize the int8 sgemm with armv7a neon assembly * add the int8 sgemm on arm64-v8a platform * perpare to merge latest codes from master * add the int8 param files * In the Class Net,add the fuse_network method	7 years ago


84 Commits (1469bc8b19b83d44206f36abfa3dc7377feeef69)