nihui/ncnn - ncnn - Open-Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
nihuini	4cf4c92ee6	better shape format for per-layer benchmark	5 years ago
nihuini	9c33b6c1c8	conv1x1s1 arm fp16sa	5 years ago
Tijmen Verhulsdonck	d1b5711791	X86 Elempack 8 AVX implementations. (#1853) * added avx implementations of FC and Max pool * Specify AVX2 * Small fixes and using Fused avx activations * fix type casting * fixing some CI errors * Fix code format * fix pooling test * remove vector typedef * More compile fixes * remove vector typedef * set c++ version to 17 * Force c++ 17 * Fixing mathfun * Try and workaround typedef issues * typefix * Remove typedef * switch to static inline * attempting to fix msvc bug * Verified MSVX FIX * Fixing clang build * commit before switch * More avx and packing implementation * Fix ctest * starting the depthwise pack 8 implementation * Unrolled loop * add depthwise pack 8 implementations * Working 1x1 pack 8 implementation added * revert incorrect changes * added conact elempack 8 * more elempack enabled layers added and started on the conversion of the winograd pack4 conv 3x3 * Added code formatting * fix styling * Unroll loops * unrolling loops * Added more elempac layers for mobilenet v3 * revert commit * fix code style * remove arm neon references * remove pack4 references * More cleanup * added packing avx code * fixing linux build ctests * remove usage of aligned loads * More aligned mem ops removed * Cleanup, revert some files and remove not working winograd and shufflechannel implementation * add stackoverflow referal * Fix windows build * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * implement requested chaanges * remove reshape * revert arm file change * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * fix unterminated directive Co-authored-by: Restyled.io <commits@restyled.io>	6 years ago
nihui	3ef995ed1e	format code style and setup restyled.io (#1840)	6 years ago
BUG1989	df3d224484	new int8 implement,better accuracy (#749) * add the armv7a conv3x3s1 implement without overflow,remove old codes * fix the bug of conv3x3s2 packed int8 * new int8 implement,weight quant by perchanel,better accuracy~ * fix the bug of conv3x3s1 packed int8 neon * add the naive c fp32 and int8 winograd F(2,3) * add the neon intrinsic int8 winograd F(2,3) * optimize the armv7a int8 winograd F(2,3) with neon assembly * optimize the armv7a int8 winograd F(2,3) input transform with assembly. * add the requantize layer and int8 relu implement. * add graph optimize conv1x1s2 -> conv1x1s1,begin optimize int8 aarch64. * fix int8 bugs * add the c naive im2col with sgemm * add aarch64 int8 winograd f23, conv3x3s2 naive implement * add the int8 sgemm conv7x7s2 on x86/armv7a platform * optimize the int8 sgemm by neon intrinsic and packed kernel * optimize the int8 sgemm with packed data * optimize the int8 sgemm with armv7a neon assembly * add the int8 sgemm on arm64-v8a platform * perpare to merge latest codes from master * add the int8 param files * In the Class Net,add the fuse_network method	7 years ago
nihuini	fa3b8cfb87	print depthwise convolution and deconvolution kernel info in per-layer benchmark	7 years ago
nihui	b6b90c888f	high resolution timestamp on windows	7 years ago
nihui	182e92a331	Update benchmark.cpp	8 years ago
nihui	bc1a84dee5	fix mingw32 build	8 years ago
Tiancai Ye	3977d32eb9	Fix windows build fails (#321) * fix windows build error * remove wrong commit	8 years ago
nihuini	5e484a47ef	fix build, second try	8 years ago
nihui	5f0fa95f61	fix build	8 years ago
nihui	6c4c810fda	decouple modelbin of different input types, simplify timestamp function	8 years ago
nihui	aaa1ffcef0	emmmm, prefer w h	8 years ago
nihui	d68eb4cd15	wrap benchmark gettimeofday	8 years ago
Linghan Cheung	811b6ba1b6	print benchmark information for every layer, especially for CONVOLUTION (#241) * print benchmark information for every layer, especially for CONVOLUTION * print benchmark information for every layer, especially for CONVOLUTION, for cross-platform. * move the function implementation to cpp file to avoid multiple definitions	8 years ago

16 Commits (882f114a8debc4c98a5b87d2a43c00a67ae5eb03)