nihui/ncnn - ncnn - Open Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
nihuini	394bca8dbb	Merge branch 'master' of https://github.com/Tencent/ncnn	8 years ago
nihuini	9ac305e160	create 3-dim sub blob for group convolution, fix #315	8 years ago
Howave	415bfbdfa7	added arm layer compilation for arm-linux system (#316)	8 years ago
nihuini	318d3abe66	bind register explicitly, fix #306, fix #310, fix #312	8 years ago
Yantao Xie	2e9da1b95b	Add the epsilon parameter to the BatchNorm layer. (fix #303) (#311) * Add the epsilon parameter to the BatchNorm layer. (fix #303) * Move the eps into the sqrt.	8 years ago
nihuini	231a52e469	fix build on aarch64 with gcc, fix #309	8 years ago
BUG1989	af7019d3fc	fix compile error (#305)	8 years ago
nihui	875a188d10	pre interleave kernel memory for winograd4, about 3%~20% speed gains	8 years ago
dong	6ea09ebf2c	Use aarch64 assembly to replace arm intrinsics	8 years ago
820169199	656de48631	add "#include <float.h>"	8 years ago
Dong Xu	28154dcb29	fix vst1.f32 of coeff sum at eltwise_arm layer. In line 414: "vmla.f32 q1, q0, %q6 \n", the destination register is q1 instead of q0, so replace the {d0-d1} of line 416 with {d2-d3}.	8 years ago
nihui	0fd701112e	load LRN bias from param	8 years ago
nihui	7d1e49584d	call Innerproduct for convolution on flattened blob	8 years ago
harhar539	9a8486a823	1. fix pad tail bug in commit `d1ea2a3` at pooling layer	8 years ago
nihui	b1aec69ff9	d31 is useless	8 years ago
nihuini	5e484a47ef	fix build, second try	8 years ago
nihui	5f0fa95f61	fix build	8 years ago
nihui	d1ea2a34b4	rewrite pooling pad scheme, global pooling return continuous blob	8 years ago
nihui	6c4c810fda	decouple modelbin of different input types, simplify timestamp function	8 years ago
nihui	2d4ae30508	fallback to all cores	8 years ago
nihui	03c1f63c2e	switch to winograd4	8 years ago
nihui	bc99d5123b	set smp cpu affinity to all cores	8 years ago
nihuini	098fff355c	implement spatial norm, convert L2Normalization	8 years ago
nihui	5ff6a1808a	emmmm, yet another implementation for winograd 3x3, unroll aggressively for aarch64	8 years ago
YQZ1990	6f13cc5185	slice (#269) * fix slice dim3	8 years ago
nihuini	bd705d5bdb	inplace binaryop with scalar	8 years ago
nihuini	5f4ac776d1	implement instancenorm	8 years ago
nihuini	db5e805eff	padding_mode for Pooling, fix #261	8 years ago
nihui	2d9410742b	concat slice shufflechannel honor elemsize	8 years ago
nihui	8ccae1d4fd	prevent reuse of param array, fix #258	8 years ago
nihuini	75218953cc	aarch64 assembly for conv1x1s1, unroll outch inch as 8x8	8 years ago
nihuini	76a55693a6	decouple convolutiondepthwise and convolution, reduce binary size by 10%, fix #254	8 years ago
nihuini	3ffb502bc6	reuse if the same shape	8 years ago
nihuini	c6506d6ecd	remaining inch for winograd neon3	8 years ago
nihui	c12fab569f	fix convdw3x3s1 on aarch64	8 years ago
nihui	f133729c78	code style changes	8 years ago
nihuini	03621aa7f9	more x86 stub for convolution and convolutiondepthwise	8 years ago
Lamply	6612178960	correct arm convolution depthwise mistakes (#246)	8 years ago
nihui	848c9a1ea7	code clean	8 years ago
nihui	80fb28de90	unroll outch for convolution 3x3s1, about 10%~20% speed gain	8 years ago
nihui	df218110be	unroll num_output for innerproduct, about 60% speed gain	8 years ago
nihui	aaa1ffcef0	emmmm, prefer w h	8 years ago
nihui	d68eb4cd15	wrap benchmark gettimeofday	8 years ago
Linghan Cheung	811b6ba1b6	print benchmark information for every layer, especially for CONVOLUTION (#241) * print benchmark information for every layer, especially for CONVOLUTION * print benchmark information for every layer, especially for CONVOLUTION, for cross-platform. * move the function implementation to cpp file to avoid multiple definitions	8 years ago
nihuini	d2ee4e7d27	ld1 and st1 handle data endian mode per element	8 years ago
nihui	08e261f423	innerproduct produce continuous blob, fix #236	8 years ago
nihui	682b0d3c0d	prelu on vector and image	8 years ago
nihui	14a2e23407	enable embed layer	8 years ago
nihui	c9789fb879	slice dim	8 years ago
nihuini	67b80183dd	fix param load using external memory	8 years ago


185 Commits (394bca8dbb36d3384edb089646aec7ec70fcc12d)