nihui/ncnn - ncnn - Open-Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
nihui	db035d602d	update ncnnoptimize layers, lightmode=false keeps original weight (#5414)	2 years ago
nihui	056509a034	fix create_pipeline crash in vulkan-enabled layer without calling load_param/load_model first (#5410)	2 years ago
張小凡	3b048d1923	destroy_gpu_instance() function wait for all devices to be idle before destroy (#4763) * destroy_gpu_instance() will internally ensure that all vulkan devices are idle before proceeding with destruction.	2 years ago
nihui	69640594f7	unified macos ios ci, drop 32bit support, drop ios arm64e, default to ios 13 (#5403)	2 years ago
nihui	5a8f79f7c7	add apple A17 and M3 family macro (#5405)	2 years ago
nihui	2a07aa2d79	unified mac-catalyst ci (#5402) * fix moltenvk static linking	2 years ago
nihui	fafb897ff7	update ios toolchain, add visionos ci, update watchos, ncnn target ilp32 (#5399)	2 years ago
nihui	824b79a314	fix rvv extract blob with fp16 enabled, fix #5360 (#5398)	2 years ago
nihui	7cc89108b3	try more known vulkan library with simplevk (#5396)	2 years ago
nihui	2f65729873	fix riscv v build with old cpp standard, fix #5366 (#5391)	2 years ago
nihui	167501f0c6	fix softmax arm fp16s sum error, fix #5340 (#5393)	2 years ago
nihui	6595743bb2	shift before adding for dropping additional double bit from vqdmulhq_s16, fix #5263 (#5390)	2 years ago
nihui	84256b1494	pnnx enhance functionize (#5387) * pnnx fix some undefined dtype * fix ncnn convdw1d dynamic weight loading	2 years ago
Shatyuka	5a11c383a2	Support LLVM OpenMP runtime for MSVC (#5370) with `/openmp:llvm` compile option	2 years ago
hokamilkv	74fda386f3	Update convolution_im2col_gemm_int8.h (#5365) remove _sum0=_sum0	2 years ago
Shatyuka	e7748e5311	Fix `destroy_gpu_instance` crash (#5353) * Fix `destroy_gpu_instance` crash * Additional check and clear	2 years ago
Shatyuka	ddd17dd907	Fix build error with NCNN_PIXEL_DRAWING off (#5346)	2 years ago
nihui	4797d19873	ruapu cpu isa detection (#5341)	2 years ago
nihui	984d6dd844	promote vfpv4 for auto fp16 storage conversion (#5325) * promote vfpv4 for auto fp16 storage conversion * always report neon and vfpv4 for arm64	2 years ago
nihui	5b536af234	fix uwp build (#5328)	2 years ago
nihui	d38bdbdb84	fix debug build on some compiler, fix #5295 (#5326)	2 years ago
nihui	87d7165848	disable signal based detectisa if being debugged (#5280)	2 years ago
Justin Fung	f6763262d1	Add draw rectangle, draw text, draw circle, and draw line to C API (#5324)	2 years ago
Xinyu Yang	7ac42680cf	RVV: Refine riscv gemm fp32 (#5303) * replace storexxx to vsseg2e32_v_f32m1 * refine transpose --------- Co-authored-by: Xinyu302 <Xinyu302@users.noreply.github.com>	2 years ago
Sophon	294e786d36	convolution_x86: Fix typo in logging (#5310) Signed-off-by: Xilin Wu <wuxilin123@gmail.com>	2 years ago
nihui	0942efab2e	x86 avx512 optimization for mish (#5309)	2 years ago
nihui	7928d44d51	port stb image optimization (#5307)	2 years ago
nihui	05b4dcb06c	report vulkan cm 8x8x16 config, enable fp16a cm (#5298)	2 years ago
nihui	5329d32e74	check vulkan fp16 uniform support and implement lfp conversion without fp16u (#5287)	2 years ago
nihui	656b082284	fix cast armv7 sigbus when loading fp16 model (#5292) * fix sigbus error when loading fp16 model on armv7 * apply for bf16	2 years ago
nihui	ba42369c68	workaround l2 norm produce -inf value with subnormals (#5272)	2 years ago
nihui	c222208cc9	feat mask for disable threading, make some extractor setter no-op, update doc (#5270)	2 years ago
nihui	a31f66203b	do not cache temporary blob for uploading weight (#5266)	2 years ago
nihui	556b79ce4d	create layer decoupled (#5258) * create layer decoupled * no more virtual public * allow build test with shared library * decouple cpu vulkan * drop old scripts	2 years ago
Molly Sophia	92d49e1f59	requantize: Use activation_ss in fused_activation.h (#5245) Which fixes int8 requantization on risc-v Signed-off-by: Molly Sophia <mollysophia379@gmail.com>	2 years ago
nihui	d1d9aa2edb	fix some cpu.cpp warning (#5244)	2 years ago
nihui	d30af29ee2	fix simplecv Mat templated ptr (#5241)	2 years ago
nihui	6c261a8c04	fix the missing elemsize in vkimagemat from_android_hardware_buffer (#5237)	2 years ago
nihui	ded0b78bb2	fix nvidia vulkan crash on exit (#5234)	2 years ago
nihui	8c4fc5e2a0	enable uniform 16bit and 8bit when available, fix validation error in fp16sa shader (#5233)	2 years ago
nihui	b7f70cfe4e	initialize cpu thread affinity mask all to all cores (#5231) call omp_set_num_threads with zero num_threads is implementation defined	2 years ago
nihui	5a8ce63af4	optimize resize bilinear and compress font data (#5200)	2 years ago
nihui	eea3fc9b41	optimize vulkan global pooling (#5191) Co-authored-by: nihui <nihui@users.noreply.github.com> Co-authored-by: michaelcai <michaelcai@tencent.com>	2 years ago
nihui	1138312f1e	detect avx512 isa with signal action on macos (#5185)	2 years ago
nihui	dba87f8cad	fix build with msvc arm64 asimdhp (#5176)	2 years ago
nihui	deae9e61da	disable rtti and exceptions for msvc (#5167) * disable rtti and exceptions for msvc * warnings-- * erff * arch sse2 for 32bit build * enable rtti for cross compiling	2 years ago
nihui	058aa0ad37	enable arm neon intrinsics for msvc build (#5151)	2 years ago
AlOa	9f26eeb5a7	Prelu layer uses sse instruction _mm_load_ps but data can be misaligned so it must use _mm_loadu_ps (#5149)	2 years ago
Justin Fung	465debe9bb	Add print statements for 4 dimensions benchmark (#5148)	2 years ago
nihui	4136de3b8d	arm optimization for convolution int8 packed unified elempack (#5147)	2 years ago

1811 Commits (db035d602de6ec0cd3bdd191cb21f4b73e7599be)