nihui/ncnn - ncnn - Open-Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
Leo	5afd318b86	Support remove libstdc++ denpendency (#2030 ) * [build] add toolchain file w/o stdcxx dependency * [build] link m and gcc lib explicitly * [ncnn] complete simple stl impl * [ncnn] adapt for ncnn simplestl * [test] adapt for ncnn simplestl * [ncnn] fix missing algorithm and list when simplestl disabled * [ncnn] fix guard for operator new and delete * [style] fix the code style * [build] fix build failed on darwin and emscripten * [ci] do not import cxx to avoid operator conflict * [ncnn] add temporary partial_sort impl using bubble sort heap sort should be used for better perf. * [ncnn] add std greater and less function * [ncnn] fix placement new operator overload * [ncnn] add operator delete with size info * [build] disable exception, rtti, example and tools when simplestl on * [build] add toolchain for arm simplestl * [build] add toolchain for aarch64 simplestl * [ncnn] move initializer to constructor * [ncnn] use deteiled type instead of auto * [ncnn] use plain lib name in target_link_libraries	5 years ago
nihui	b8f3e1455e	code clean	6 years ago
nihui	b5e288b521	layer creator function is not necessary for built-in layers	6 years ago
nihui	3ef995ed1e	format code style and setup restyled.io (#1840 )	6 years ago
SunTY	705dd36a31	simplestl is an alternative std vector string implementation (#1762 ) * remove the dependency on STL * fix header file name and push_back * remove managed construction * seems like needless fiddling * data now returns a pointer instead of a reference to a pointer * fix a mistake in resize * stdint * add c_str * rename file to lowercase * NCNN_SIMPLESTL option * simplestl default to OFF * Update linux-x64-cpu-gcc.yml * Update linux-x64-cpu-gcc.yml * Update linux-x64-cpu-clang.yml * drop functional header * arm32 arm64 simplestl ci * fix a memory leak, remove compiler warnings * fix bug with the default value in resize Co-authored-by: nihuini <nihuini@tencent.com> Co-authored-by: nihui <shuizhuyuanluo@126.com>	6 years ago
nihuini	6077066b02	binaryop broadcasting special type 3 4 for lhs	6 years ago
Sungmann Cho	c62e2702b3	Fix warnings on Visual Studio (#1456 ) * Fix warning C4244 in src/layer/convolution.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/convolution_sgemm_int8.h C4244: 'initializing': conversion from 'double' to 'int', possible loss of data * Fix warning C4244 in src/layer/deconvolution.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/elu.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4267 in src/layer/embed.cpp C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4244 in src/layer/exp.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/innerproduct.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/log.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data C4244: 'initializing': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/lrn.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/mvn.cp C4244: 'initializing': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/power.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warnings C4244 and C4267 in src/layer/proposal.cpp C4244: 'initializing': conversion from 'double' to 'float', possible loss of data C4244: 'initializing': conversion from 'double' to 'int', possible loss of data C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4244 in src/layer/reduction.cpp C4244: 'return': conversion from 'double' to 'T', possible loss of data * Fix warning C4244 in src/layer/tanh.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warning C4244 in src/layer/binaryop.cpp C4244: '=': conversion from 'double' to 'float', possible loss of data * Fix warnings C4244 and C4267 in src/layer/unaryop.cpp C4244: 'return': conversion from 'double' to 'T', possible loss of data C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data * Fix warning C4244 in src/layer/x86/convolutiondepthwise_3x3_int8.h C4244: 'initializing': conversion from 'double' to 'int', possible loss of data	6 years ago
Sungmann Cho	f63632421c	Replace all occurrences of std::{unary\|binary}_function<> with std::function<> (#1439 ) std::{unary\|binary}_function<> are deprecated in C++11 and removed in C++17. Actually, these were unnecessary in general. In the C++98/03 era, many user- defined function object classes derived from these base classes in an attempt to imitate STL conventions. However, STL containers and algorithms have never required such inheritance (or the typedefs that they provide). Only the function object adaptors (like bind1st()) needed such typedefs. Currently, there are no uses of function object adapters in our codebase, so we can eliminate the inheritance completely.	6 years ago
nihui	bc3255b06f	binaryop broadcasting type for spatial attention module	6 years ago
nihuini	e0798faee3	binaryop unaryop pack4 arm neon	6 years ago
nihuini	77c1f361b7	comment for broadcasting rules	7 years ago
nihuini	7a8f68aca6	move vulkan code to subdir, new layer interface create_pipeline and destroy_pipeline for post-loading works	7 years ago
nihuini	433a92401a	auto barrier in pipeline and copy command	7 years ago
nihuini	85a28959e4	fix binaryop shader binding, use shared buffer state, fix blob copy in non-light mode, fix #817	7 years ago
nihui	f0b4933eac	massive simd optimize in compute shader (#772 ) * init vec4 shader * more vec4 shader ... * convolutiondepthwise is depthwise * pooling pack4, fix global pooling * dropout pack4, relu pack4 * softmax pack4 * more shader vec4 .. * fix staging remap, remove layer pipeline member, add destroy_pipeline interface, add pack4 glue code * eltwise pack4 glue code * add binary pack4, unary pack4 * add binaryop unaryop pack4 glue code	7 years ago
mzpan	777f3f98d9	add w=1 h=1 op (#765 ) * add w=1 h=1 op * add w=1 h=1 op * add w=1 h=1 op	7 years ago
nihui	10b8ac68cc	[WIP] vulkan compute (#618 ) * vulkan infrastructure * vkallocator and vkmat * layer interface for vulkan compute * wip... * default vulkan device, command wrapper, upload model weight in load_model to simplify layer interface * simplify command api, vkmat holds staging buffer, relu works * initialize specialization constant, simplify command dispatch, fix staging buffer copy with different shape, convolution works * init extension functions * dynamic local size and group count * group count=1 is invalid * regard device max workgroup size limit * fix relu oooops * decouple command record and staging allocation * create result blob * add pooling shader * buffer is faster than image :) * fix pooling shader * add innerproduct shader * readonly writeonly decoration * simplify buffer creation * decouple command and layer, VK_KHR_descriptor_update_template extension makes descriptor binding update easy :D * fix vulkan building issues in visual studio (#1) * fix building issues on visual studio * ignore benchmark * cancel changes * ... ... * decouple paramdict and vulkandevice * fix staging buffer destroy in model loading * remove vkdev member in option * add padding shader * simplify vulkan layer creation, simplify convolution and pooling shader for no padding, less debug output * add convolutiondepthwise and softmax shader * specialization float type, add leakyrelu * add dropout shader * add batchnorm shader * split vulkan forward * add scale shader * push constant type can be int or float * set_optimal_local_size_xyz * add eltwise shader * concat vulkan forward * fix convolution without bias * add dummy shader for concat and split, more fix ... * optional VK_KHR_descriptor_update_template and VK_KHR_push_descriptor * check VK_KHR_push_descriptor for vkCmdPushDescriptorSetWithTemplateKHR * binaryop and unaryop shader * hide raw command buffer * simple vkbenchncnn benchmark * create device with transfer queue * rename command to vkcompute, add vktransfer and layer upload_model interface * external VkMat, copy and map wrt buffer offset * command copy respect offset and size * decouple weight upload and load, simplify upload weight api, use one big staging buffer for uploading weights * fix build on android * binding count can not vary :( * barrier check state, fix sub-op destruction * declare local_size_xyz constant, fix crash on radv * fix local_size_xyz, second try * more barrier and state fix * fix softmax * reconstruct buffer memory allocator, reuse blob buffer, less verbose output * find unified memory type index * weight staging buffer allocator and weight buffer allocator, respect descriptor buffer offset alignment * use VK_KHR_descriptor_update_template for faster descriptor update if available, multithread pipeline creation * find more useful vulkan extensions and enable them * fix msvc build * respect VK_KHR_dedicated_allocation for weight buffer allocation * fix android build * fix bias name conflicts with metal * decouple pipeline and layer, building shader sources into shader module, dedicated create_pipeline api, simplify pipeline recording * drop dummy shader, inplace softmax, multiple shader module works * fix unique queue family index error * flatten support vulkan * mnasnet run * find shader module by name, each entry point per shader module, fix attribute/id conflict on moltenvk * some minor changes * add some high level api * use dedicated transfer queue to upload weight model * prefer mappable buffer on unified memory * global pooling and convolution fc, reuse staging buffer * implement ring-buffer style blob allocator, add VkBufferMemory capacity * use blob allocator for workspace blob, it works fine :) * vulkan option off * Update layer.cpp * fix build with vulkan off * less verbose output, fix crash on vulkan_compute off * merge benchncnn tool * allocator clear api, use new weight buffer allocator per net * add default locked allocator * mapped mat ptr api, persistent mapped memory works generally :) * travis ci linux vulkan * travis ci vulkan wip ... * more gpu wip ... * more gpu wip ... * wip... * wip... * wip... ... * wip... ios vulkan build... * find glslangValidator on ios build * use dynamic moltenvk library * travis ci wip ... * ios simulator does not support metal at all * fix cpu only extractor * optimize workgroup size, first try * optimize workgroup size, second try * conv1x1s1d1 vec4 * revert build system * fix ncnn2mem build * fix ncnn2mem build	7 years ago
nihui	9706cd1447	implement ncnn blob/workspace allocator, fine-grained per-layer openmp threads control, fix #469	8 years ago
nihui	30b6cc4ecd	rdiv binaryop	8 years ago
nihui	2f90a794ad	rsub binaryop	8 years ago
nihuini	bd705d5bdb	inplace binaryop with scalar	8 years ago
nihuini	a84ba8fc0f	element type storage support in Mat, move data member the first so that a pointer to Mat is a pointer to data, convenient index access for float vector	8 years ago
nihui	1e2265dd99	new param load api	8 years ago
nihuini	7914a8ad21	binaryop with scalar const	8 years ago
nihuini	091a43676a	implement binaryop and unaryop with operator template, binaryop between blobs with different dims	8 years ago
nihuini	ac76e4ba02	include algorithm for std::max	8 years ago
nihuini	dcbc117368	implement binaryop and unaryop	9 years ago

27 Commits (ba70b2f7807f76ec6f7567d784ba41eee30595fb)