nihui/ncnn - ncnn - Open Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
nihui	aa9753b2f0	detach mat from local blob allocator so net instance could be destroyed much earlier (#3287 )	4 years ago
zhiliu6	814f89ef1a	Fuse HardSwish activation into Convolution and InnerProduct (#3233 ) * add general fused activation * add NCNN_FORCE_INLINE option	4 years ago
Tijmen Verhulsdonck	4270b5c502	Fix broken codepaths with AVX only (#3254 ) * Fix codepaths for fp16 weights when only AVX is enabled * Disable opt overrides * Update SDK url * Update vulkan SDK download version * Debugging risv pad * apply code-format changes * fix padding test * fix mips slice test * fix lrn test * implement mish swish image shader, fix pooling adaptive image storage support, drop debug output * update ci ubuntu 18.04 Co-authored-by: nihui <shuizhuyuanluo@126.com>	4 years ago
zhiliu6	80699dd3f9	fix hardswish test beta param (#3214 )	4 years ago
nihui	c6cda8d07c	arm neon optimization for requantize leakyrelu (#3144 ) * arm neon optimization for requantize leakyrelu * add missing changes * Update test_requantize.cpp * more test coverage	4 years ago
Xavier Hsinyuan	2a5c672787	Add unittest and RVV optimized for SELU (#3114 )	4 years ago
nihuini	f1533667ff	fix test_c_api net instance destroyed earlier than blob destruction	4 years ago
Tijmen Verhulsdonck	eaa7e24db6	Added ability to switch AVX/AVX2 during runtime (#3076 )	4 years ago
nihui	b413fd3a3d	auto code-format bot and disable restyled (#3075 )	4 years ago
DaydreamCoding	f42d0e5dc9	fix warpaffine_bilinear_yuv420sp uv matrix (#3048 )	4 years ago
nihui	4f135e07bf	implement convolution1d and pooling1d (#3035 ) * implement convolution1d and pooling1d * add conv1d pool1d test * fuse convolution1d activation * update operator doc * fix vulkan adpative pooling	4 years ago
nihuini	12eaa6f9ba	update concat test	5 years ago
nihuini	a180bf7bdc	update concat test for larger channels	5 years ago
nihui	c1ce8ea84d	add more test	5 years ago
nihuini	07fa2e1fe3	prefer large channels for int8 operator tests	5 years ago
nihui	3a77b09c31	fix test failure	5 years ago
nihuini	fef61c5296	fix arm build	5 years ago
nihuini	934a1a8e32	test flatten packing padding int8	5 years ago
nihui	49f3e1ea09	drawing api and stb_image (#2913 ) * drawing api * add drawing test * yuv420sp drawing * enable simpleocv in webassembly build	5 years ago
nihui	17936e9f54	fix packing risc-v test, add cpu_riscv_vlenb()	5 years ago
nihui	a61f03ec76	arm neon optimization for pixelshuffle scale 2	5 years ago
nihuini	d6b2ea5aac	arm neon optimization for convolution 3x3 on small channels	5 years ago
nihui	7e1aaa5828	cmake option NCNN_INT8 (#2839 )	5 years ago
nihui	66455c1b95	implement 2823 binary broadcasting type (#2827 )	5 years ago
nihuini	41a4bea954	unroll size 8 for conv3x3s1 pack8to1 int8 arm64	5 years ago
nihui	e9cc637573	arm neon optimization for int8 packing kernels (#2809 )	5 years ago
nihui	1ea8bfbd2e	x86 avx2 conv3x3s1 pack8 direct optimization, fix #2789	5 years ago
ncnnnnn	6e6cb9f4f3	simple sort ncnn_add_layer_test (#2790 ) for obsessive	5 years ago
nihui	a48bf43ef7	test conv/fc int8 with activation	5 years ago
nihui	5fe75f19ef	architecture changes for int8 packing (#2771 ) * quantize and dequantize tests * unify activation and usability function * drop NCNN_REQUANT cmake option, test dequantize requantize pack8, fix webassembly build * benchmark use requantize int8 model	5 years ago
nihuini	15d63ec0f5	fuse onnx multiheadattention with same qkv blob	5 years ago
RBelogorodtsevFBase	1212ed6e94	implements gelu activation (#2749 )	5 years ago
nihuini	c17eb4e208	multiheadattention layer	5 years ago
nihuini	7ac23ab34d	fuse onnx layernorm, fix 2-dim layernorm implementation, add test	5 years ago
nihui	3c92a1184b	arm neon optimization for general convolution im2col sgemm (#2668 ) * arm neon optimization for conv3x3s1 winograd42 * better condition * Update test_convolution.cpp * Update test_convolution.cpp * more proper conditions * arm neon optimization for general im2col sgemm pack4 * add sgemm * wip * wip * fix armv7 build * more conditions blah blah * code format * fix convolution * move packed convolution to seperated header source * unify weight data bf16 * proper conditions * conv3x3s2 sgemm pack4 test	5 years ago
nihui	ab56083ca5	arm neon optimization for conv3x3s1 winograd42 (#2664 )	5 years ago
nihuini	f437bcdd4c	enable fp16s and int8s on newer adreno/mali, actually enable int8 tests	5 years ago
nihui	74451897cb	handle gemm in innerproduct (#2607 )	5 years ago
nihui	0a59ac9b16	integer warpaffine (#2604 ) * integer warpaffine * fix some corner case * fix yuv420sp border value	5 years ago
nihui	6672b09a37	arm neon optimization for gru (#2597 )	5 years ago
nihui	0b35540c72	arm neon optimization for lstm (#2595 )	5 years ago
nihuini	3915b5d496	arm neon optimization for packing fp16/bf16 pack8 family	5 years ago
nihui	fca04980f3	enhance padding test (#2580 ) * workaround nvidia driver crash * workaround radv buffer_ld1 zero bug * fix offset elempack	5 years ago
nihui	80fdddb502	more slice test	5 years ago
nihui	ef3550b52f	gru and rnn layer (#2572 )	5 years ago
Guoxia Wang	609f63c57e	support PyTorch AdaptiveAvgPool2d and AdaptiveMaxPool2d (#2546 ) * support pytorch adaptive pool * support onnx2ncnn adaptive pool convert * support ncnnoptimize adaptive pool param write * fix adaptive pool out_shape order * fix adaptive pool out_shape order, H and W can be either a int add test case, set support_vulkan = false Pooling_vulkan::create_pipeline * review adaptive pool * fix typo * add adaptive pool forward in pooling_x86.cpp pooling_arm.cpp fix out_w, out_h id naming convention * fix typo * don't support packing, bf16, int8, image for adaptive pool * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle Co-authored-by: Restyled.io <commits@restyled.io>	5 years ago
nihui	21dc650eb3	check layer support (#2564 )	5 years ago
tpoisonooo	baf49574c4	innerproduct aarch64 use gemm (#2521 ) * perf(innerproduct-arm): add aarch64 gemm * fix(innerproduct): fix compilation errror * fix(armv7-innerproduct): fix armv7 compilation error * fix(innerproduct): fix gemm param * fix(int8): update mock scales and fix runtime error * fix(compilation): fix compilation error	5 years ago
nihui	54c0a13b9f	build shared library (#2525 ) * build shared lib and enable lto * reserved for layer and option * allocator pimpl * datareader pimpl * paramdict pimpl, disable copy assign for allocator and datareader * modelbin pimpl * net extractor pimpl * gpu pimple * disable copy assign vulkandevice, code format * command pimpl, dummy image readonly * pipeline pipelinecache pimpl, export platform class * code format, export simple family * update ci * disable lto on android armv7, merge webassembly ci * link libgcc, fix macos dylib version * pipeline pimpl, gpu info pimpl * destroy gpu info after vulkan device * ignore msvc stl class warning * fix ncnn_paramdict_get_float return type * fix vktransfer upload fp16 without flatten, add command test	5 years ago
nihuini	fbf0ffda53	pixelshuffle nhwc mode, convert onnx DepthToSpace mode DCR, convert mlir tf.DepthToSpace	5 years ago

208 Commits (a490f8a5335f3608a19fdf8a018fbfbd731280d3)