nihui/ncnn - ncnn - Open Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
nihui	171b9d1bba	use spdx license header, copyright Tencent (#6152 )	10 months ago
nihui	bd0b111775	vulkan tight fp16p pack1 (#6127 )	11 months ago
nihui	24a3b99f1f	drop layer support_image_storage and option use_image_storage (#6126 ) * fix pyncnn build	11 months ago
nihui	211e238639	drop layer forward vkimagemat (#6124 ) vkimagemat was originally used as a mat storage in the hope of improving performance on old adreno gpus, but in fact it is slower than the cpu in most cases and is no longer suitable for the latest adreno architecture and large shapes	11 months ago
nihui	556b79ce4d	create layer decoupled (#5258 ) * create layer decoupled * no more virtual public * allow build test with shared library * decouple cpu vulkan * drop old scripts	2 years ago
nihui	1377acf945	avx512 bf16 fp16 infrastructure (#3926 )	3 years ago
nihui	f10cc6dd93	initial data structure changes for 3dcnn, conv3d, pooling3d (#3378 ) Co-authored-by: ElvisYu <elvisyuovo@gmail.com> Co-authored-by: 余浩文 <m18107220188@163.com> Co-authored-by: Zr2223 <67497651+Zr2223@users.noreply.github.com>	4 years ago
nihui	54c0a13b9f	build shared library (#2525 ) * build shared lib and enable lto * reserved for layer and option * allocator pimpl * datareader pimpl * paramdict pimpl, disable copy assign for allocator and datareader * modelbin pimpl * net extractor pimpl * gpu pimple * disable copy assign vulkandevice, code format * command pimpl, dummy image readonly * pipeline pipelinecache pimpl, export platform class * code format, export simple family * update ci * disable lto on android armv7, merge webassembly ci * link libgcc, fix macos dylib version * pipeline pimpl, gpu info pimpl * destroy gpu info after vulkan device * ignore msvc stl class warning * fix ncnn_paramdict_get_float return type * fix vktransfer upload fp16 without flatten, add command test	5 years ago
nihui	cf3cf83cd3	unified image shader storage type (#2231 ) * drop bug_layout_binding_id_alias flag	5 years ago
nihui	11cffce114	armv8.2 infrastructure (#1856 ) * runtime cpu dispatch * force thread one * disable openmp for coverage * simplify test layer * print NCNN_TARGET_ARCH * less ci build variants * weight fp16 storage option * test convdw int8 * apple a12 a13 * ncnn_add_layer ncnn_add_shader cmake macro	5 years ago
nihui	3ef995ed1e	format code style and setup restyled.io (#1840 )	6 years ago
Tijmen Verhulsdonck	e3b31511ad	Added AVX implementation to cast to/from bfloat and float32 (#1836 )	6 years ago
nihui	8fec0038ba	fix ci test	6 years ago
nihuini	4a624c636b	skip image tests on unsupported platforms	6 years ago
nihui	9a9a618229	image storage is mandatory, less options makes life easier	6 years ago
nihui	62da1228e1	adreno image shader + fp16 + fp16a (#1714 ) * wip * wip * fix * image and imageview can not be destroyed until command execution ends * fast copy path for tightly packed data * wip * texture load works * 1d 3d image * record clone image, multiple commands share one image reference * upload download image * layer forward accept vkimagemat * vkimagemat graph works * staging vkimagemat for passing dynamic parameters, macro for fp32+image shader, padding image shader * vkimagemat elemsize * convolution test pass * conv1x1s1 image shader * fast staging image allocator from host memory, pooling image shader * convolutiondepthwise image shader * innerproduct image shader * packing image shader * crop deconvolution image shader * resolve spirv binding types * image fp16 and fp16a, cast image shader * eltwise image shader * wip * absval image shader * deconvolutiondepthwise image shader * concat image shader, squeezenet works * noop split image shader * uniform precision hint * layer support_image_storage * wip * vulkan device utility operator * command is storage and packing option aware * fallback to cpu on image allocation failed, mobilenetssd works * flatten image shader, enable more test * ci test * check imgfp32 imgfp16 imgfp16a features * fix ci test * fix ci test * upgrade swiftshader * wip * opt aggressive * imgfp16p * opt none * convolution winograd image shader * fix flush range, fast copy path for continous buffer * minor fix * fix innerproduct * wip ... * wip * cast fix * packing test * wip * image fp16p is fp16p * wip * silence * more line info * code clean * softmax image shader	6 years ago
nihui	979dd5fd11	test does not need to provide data type options	6 years ago
nihui	7365bb80a2	vkmat and command api breaks (#1689 ) * vkmat and command api breaks * always use compute queue for compute buffer transfer * no barrier for readonly weight buffer * record clone, drop queue_owner * bring back layer forward * fix validation errors * lifecycle inside command makes life easier * update doc * record_import_android_hardware_buffer	6 years ago
nihui	d023137426	test fp16 packed and shader pack8 option (#1636 ) * wip * fix slice pack8 test * fix flatten pack8 test * fix binaryop pack8 test * fix interp pack8 test * rewrite cast test for different blob type and packing	6 years ago
nihui	0f7e7bca02	shader shape specialization constant and basic local group size partition (#1523 ) * use Mat class for Shape description * shape specialization constant in compute shader * wip * wip * test forward_inplace, add binaryop unaryop sigmoid test * fix arm unaryop test * fix arm binaryop test * make shape hint optional, cast int8 to fp32, add cast test * wip * follow the good and old local size setting for conv1x1 * the optimal local size rewrite * fix build on msvc * add permute shader for all packing layout, add permute test * concat and slice patial shape constant, slice test * fix slice test * interp test * add lrn test, test packing layout implicitly * add eltwise test * add normalize test * add instancenorm test * reorg shape constant * simple local group size partition * add shape constant param	6 years ago

20 Commits (master)