nihui/ncnn - ncnn - Open-Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
Cai Shanli	a9df4f6c59	add custom layer destroyer (#2481 ) * add custom layer destroyer * set default layer destroyer with 0	5 years ago
Martin Han	b441f738bd	Extract on CPU without pack/fp16fp32 (#2288 ) * Add readme for keras2ncnn * Add supported model variants * Fix supported model variants * Add extract without convert pack/fp1632	5 years ago
PENGUINLIONG	8f8f2de4d0	SSE2 optimization pack (#2123 ) * SSE2: BatchNorm * Fixed batch norm in AVX configuration * Optimized register size switch * Attempt to pass CI * Attempt to pass CI * Bias op * Element wise ops * Support packing on x86 by default * Fixed macro range in bias * Use aligned read for packed data * Update testutil.h * Update pooling_x86.cpp * Support wasn SIMD * Fix emscripten compiler flags * fix build * more ci fix * concat x86 pack4 * flatten x86 pack4 * more x86 pack4 * ci pass * fix * enable sse2 mathfun * enable --experimental-wasm-simd Co-authored-by: nihui <shuizhuyuanluo@126.com> Co-authored-by: nihuini <nihuini@tencent.com>	5 years ago
nihui	cf3cf83cd3	unified image shader storage type (#2231 ) * drop bug_layout_binding_id_alias flag	5 years ago
nihuini	b766c8cd9e	fix potential divide by zero fault when bf16s / fp16s enabled, fix #2125	5 years ago
nihuini	a334513b5e	fp16a option fix	5 years ago
nihuini	e841ae73c6	fix arm fp16s feat output, fix #2003	5 years ago
nihui	54e79a62d7	fix crash on non-arm82 build	5 years ago
nihui	c173d51c9b	mish sigmoid swish tanh arm fp16s	5 years ago
nihui	71f86af8a6	fix non-arm82 ci	5 years ago
nihui	9a2e2a6937	convert fp32 blobs for layers with fp16 storage support	5 years ago
nihui	308145254e	mask bf16 option in layer forward, disable gpu when bf16 enabled, fix #1962	5 years ago
nihui	71dc13625f	disable bf16 storage for int8 inference	5 years ago
nihuini	4e4f0baa73	set openmp blocktime 20 for reducing power consumption, blocktime option	5 years ago
nihui	bb5bfe3841	avx2 infrastructure (#1943 )	5 years ago
nihui	11cffce114	armv8.2 infrastructure (#1856 ) * runtime cpu dispatch * force thread one * disable openmp for coverage * simplify test layer * print NCNN_TARGET_ARCH * less ci build variants * weight fp16 storage option * test convdw int8 * apple a12 a13 * ncnn_add_layer ncnn_add_shader cmake macro	5 years ago
nihui	164273de61	online pipeline cache (#1792 ) * online pipeline cache wip * device-wide pipeline cache * enable model-wide pipeline cache * drop pre-created shader modules * always use pipeline cache * use implicit model-wide pipeline cache, code format * code clean	5 years ago
Tijmen Verhulsdonck	d1b5711791	X86 Elempack 8 AVX implementations. (#1853 ) * added avx implementations of FC and Max pool * Specify AVX2 * Small fixes and using Fused avx activations * fix type casting * fixing some CI errors * Fix code format * fix pooling test * remove vector typedef * More compile fixes * remove vector typedef * set c++ version to 17 * Force c++ 17 * Fixing mathfun * Try and workaround typedef issues * typefix * Remove typedef * switch to static inline * attempting to fix msvc bug * Verified MSVX FIX * Fixing clang build * commit before switch * More avx and packing implementation * Fix ctest * starting the depthwise pack 8 implementation * Unrolled loop * add depthwise pack 8 implementations * Working 1x1 pack 8 implementation added * revert incorrect changes * added conact elempack 8 * more elempack enabled layers added and started on the conversion of the winograd pack4 conv 3x3 * Added code formatting * fix styling * Unroll loops * unrolling loops * Added more elempac layers for mobilenet v3 * revert commit * fix code style * remove arm neon references * remove pack4 references * More cleanup * added packing avx code * fixing linux build ctests * remove usage of aligned loads * More aligned mem ops removed * Cleanup, revert some files and remove not working winograd and shufflechannel implementation * add stackoverflow referal * Fix windows build * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * implement requested chaanges * remove reshape * revert arm file change * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * fix unterminated directive Co-authored-by: Restyled.io <commits@restyled.io>	6 years ago
nihui	3ef995ed1e	format code style and setup restyled.io (#1840 )	6 years ago
nihuini	64985809a3	fix crash in load_model when gpu is not used	6 years ago
nihuini	116869594c	fix cpu-only build	6 years ago
nihuini	aeba24b371	enable implicit fp16a on arm mali variants, add bug tag for layout binding id alias	6 years ago
nihuini	054ec09195	adreno device blacklist	6 years ago
nihuini	c94d1b39ad	force diable image storage on macos and ios, fix #1738	6 years ago
Naiyang Lin	ceef2470a5	Add logger.h (#1753 )	6 years ago
nihui	9a9a618229	image storage is mandatory, less options makes life easier	6 years ago
nihuini	041437ef48	seperate packing cast type to more shaders	6 years ago
nihui	62da1228e1	adreno image shader + fp16 + fp16a (#1714 ) * wip * wip * fix * image and imageview can not be destroyed until command execution ends * fast copy path for tightly packed data * wip * texture load works * 1d 3d image * record clone image, multiple commands share one image reference * upload download image * layer forward accept vkimagemat * vkimagemat graph works * staging vkimagemat for passing dynamic parameters, macro for fp32+image shader, padding image shader * vkimagemat elemsize * convolution test pass * conv1x1s1 image shader * fast staging image allocator from host memory, pooling image shader * convolutiondepthwise image shader * innerproduct image shader * packing image shader * crop deconvolution image shader * resolve spirv binding types * image fp16 and fp16a, cast image shader * eltwise image shader * wip * absval image shader * deconvolutiondepthwise image shader * concat image shader, squeezenet works * noop split image shader * uniform precision hint * layer support_image_storage * wip * vulkan device utility operator * command is storage and packing option aware * fallback to cpu on image allocation failed, mobilenetssd works * flatten image shader, enable more test * ci test * check imgfp32 imgfp16 imgfp16a features * fix ci test * fix ci test * upgrade swiftshader * wip * opt aggressive * imgfp16p * opt none * convolution winograd image shader * fix flush range, fast copy path for continous buffer * minor fix * fix innerproduct * wip ... * wip * cast fix * packing test * wip * image fp16p is fp16p * wip * silence * more line info * code clean * softmax image shader	6 years ago
nihui	7365bb80a2	vkmat and command api breaks (#1689 ) * vkmat and command api breaks * always use compute queue for compute buffer transfer * no barrier for readonly weight buffer * record clone, drop queue_owner * bring back layer forward * fix validation errors * lifecycle inside command makes life easier * update doc * record_import_android_hardware_buffer	6 years ago
nihui	7d1eec3d5d	the use_bf16_storage option	6 years ago
xieydd	b760e22da2	fix requant relu6 bug (#1590 ) * fix requant relu6 bug * fix * delete pipeline change in forward/forward_inplace avoid race in multithreading	6 years ago
nihui	52ce59e672	fix build with requant option on	6 years ago
nihui	0f7e7bca02	shader shape specialization constant and basic local group size partition (#1523 ) * use Mat class for Shape description * shape specialization constant in compute shader * wip * wip * test forward_inplace, add binaryop unaryop sigmoid test * fix arm unaryop test * fix arm binaryop test * make shape hint optional, cast int8 to fp32, add cast test * wip * follow the good and old local size setting for conv1x1 * the optimal local size rewrite * fix build on msvc * add permute shader for all packing layout, add permute test * concat and slice patial shape constant, slice test * fix slice test * interp test * add lrn test, test packing layout implicitly * add eltwise test * add normalize test * add instancenorm test * reorg shape constant * simple local group size partition * add shape constant param	6 years ago
nihui	e2bd4eae6e	write shape as 4-number tuple	6 years ago
nihui	6cefaad957	ncnnoptimize shape inference, load shape hint	6 years ago
nihui	a718129d76	shader pack8 option works	6 years ago
nihui	6f2ef1932d	int8 code refactoring wip, add int8 test	6 years ago
Anton Kochkov	07170542c9	Fix GCC 9.x warnings (#1462 )	6 years ago
Sungmann Cho	9bfc554bc9	Fix warnings on Visual Studio (#1431 ) * Fix warnings C4244, C4267 in src/layer/yolov3detectionoutput.cpp C4244: '=': conversion from 'int' to 'float', possible loss of data C4244: 'initializing': conversion from 'float' to 'int', possible loss of data C4244: 'initializing': conversion from 'double' to 'float', possible loss of data C4244: 'return': conversion from 'double' to 'float', possible loss of data C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warnings C4244, C4267 in src/layer/yolodetectionoutput.cpp C4244: '=': conversion from 'int' to 'float', possible loss of data C4244: 'initializing': conversion from 'float' to 'int', possible loss of data C4244: 'initializing': conversion from 'double' to 'float', possible loss of data C4244: 'return': conversion from 'double' to 'float', possible loss of data C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4244 in src/layer/quantize.cpp C4244: 'initializing': conversion from 'double' to 'int', possible loss of data * Fix warnings C4244, C4267 in src/layer/detectionoutput.cpp C4244: '=': conversion from 'int' to 'float', possible loss of data C4244: 'initializing': conversion from 'double' to 'float', possible loss of data C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4244 in src/layer/roipooling.cpp C4244: 'initializing': conversion from 'double' to 'int', possible loss of data * Fix warning C4244 in src/layer/sigmoid.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4267 in src/layer/slice.cpp C4267: '=': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4267 in src/layer/softmax.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/interp.cpp C4244: '=': conversion from 'float' to 'int', possible loss of data C4244: 'initializing': conversion from 'double' to 'int', possible loss of data * Fix warning C4244 in src/layer/instancenorm.cpp C4244: 'initializing': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/deconvolutiondepthwise.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/convolutiondepthwise.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/net.cpp C4244: 'return': conversion from '__int64' to 'int', possible loss of data C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data C4267: 'return': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4244 in src/layer/bnll.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4267 in src/layer/concat.cpp C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4267 in tools/mxnet/mxnet2ncnn.cpp C4244: 'initializing': conversion from 'double' to 'float', possible loss of data C4267: '=': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data C4305: 'initializing': truncation from 'double' to 'float'	6 years ago
nihuini	3c9b3074e4	reclaim local vulkan allocator after blob_mats_gpu clear, fix random crash in multithread gpu inference without explicit per-thread allocator set	6 years ago
nihuini	50e8b5e4e8	multiple transfers may run concurrently if there is no dependency with each other, do not share staging buffer memory to fix potential data race	6 years ago
nihuini	33956cbfc3	pretty error info	6 years ago
nihuini	a170ef1acf	remove the default option usage in layer interface, fix write out of range in cast arm pack4, handle fp16p conversion on cpu/gpu transfer	6 years ago
nihuini	e73b06bbb8	fix build with NCNN_STRING=OFF	6 years ago
nihuini	64333429bb	data reader wrapper, fix #1325	6 years ago
nihui	8c1b87b1a2	fallback to cpu if no vulkan device found	6 years ago
Natsu	637d96c1d2	Fix gcc 9 compilation failure (#1189 ) * Fix gcc 9 compilation failure * Fix compilation failure on linux gcc * Fix compilation failure on old gcc * Remove C++11 requirement	6 years ago
nihui	ff62e7eed9	use_packing_layout option works	6 years ago
nihui	b4c388a72a	Mat misc function accept option parameter, deconvolution pack4 arm neon	6 years ago
nihui	8c53706987	net vkdev getter api	7 years ago

163 Commits (d38871bbfc38048d904efe50099bc2b1b7901bc1)