1811 Commits (db035d602de6ec0cd3bdd191cb21f4b73e7599be)

| Author | SHA1 | Message | Date |
|---|---|---|---|
| nihui | db035d602d | update ncnnoptimize layers, lightmode=false keeps original weight (#5414) | 2 years ago |
| nihui | 056509a034 | fix create_pipeline crash in vulkan-enabled layer without calling load_param/load_model first (#5410) | 2 years ago |
| 張小凡 | 3b048d1923 | destroy_gpu_instance() function wait for all devices to be idle before destroy (#4763) | 2 years ago |
| nihui | 69640594f7 | unified macos ios ci, drop 32bit support, drop ios arm64e, default to ios 13 (#5403) | 2 years ago |
| nihui | 5a8f79f7c7 | add apple A17 and M3 family macro (#5405) | 2 years ago |
| nihui | 2a07aa2d79 | unified mac-catalyst ci (#5402) | 2 years ago |
| nihui | fafb897ff7 | update ios toolchain, add visionos ci, update watchos, ncnn target ilp32 (#5399) | 2 years ago |
| nihui | 824b79a314 | fix rvv extract blob with fp16 enabled, fix #5360 (#5398) | 2 years ago |
| nihui | 7cc89108b3 | try more known vulkan library with simplevk (#5396) | 2 years ago |
| nihui | 2f65729873 | fix riscv v build with old cpp standard, fix #5366 (#5391) | 2 years ago |
| nihui | 167501f0c6 | fix softmax arm fp16s sum error, fix #5340 (#5393) | 2 years ago |
| nihui | 6595743bb2 | shift before adding for dropping additional double bit from vqdmulhq_s16, fix #5263 (#5390) | 2 years ago |
| nihui | 84256b1494 | pnnx enhance functionize (#5387) | 2 years ago |
| Shatyuka | 5a11c383a2 | Support LLVM OpenMP runtime for MSVC (#5370) | 2 years ago |
| hokamilkv | 74fda386f3 | Update convolution_im2col_gemm_int8.h (#5365) | 2 years ago |
| Shatyuka | e7748e5311 | Fix `destroy_gpu_instance` crash (#5353) | 2 years ago |
| Shatyuka | ddd17dd907 | Fix build error with NCNN_PIXEL_DRAWING off (#5346) | 2 years ago |
| nihui | 4797d19873 | ruapu cpu isa detection (#5341) | 2 years ago |
| nihui | 984d6dd844 | promote vfpv4 for auto fp16 storage conversion (#5325) | 2 years ago |
| nihui | 5b536af234 | fix uwp build (#5328) | 2 years ago |
| nihui | d38bdbdb84 | fix debug build on some compiler, fix #5295 (#5326) | 2 years ago |
| nihui | 87d7165848 | disable signal based detectisa if being debugged (#5280) | 2 years ago |
| Justin Fung | f6763262d1 | Add draw rectangle, draw text, draw circle, and draw line to C API (#5324) | 2 years ago |
| Xinyu Yang | 7ac42680cf | RVV: Refine riscv gemm fp32 (#5303) | 2 years ago |
| Sophon | 294e786d36 | convolution_x86: Fix typo in logging (#5310) | 2 years ago |
| nihui | 0942efab2e | x86 avx512 optimization for mish (#5309) | 2 years ago |
| nihui | 7928d44d51 | port stb image optimization (#5307) | 2 years ago |
| nihui | 05b4dcb06c | report vulkan cm 8x8x16 config, enable fp16a cm (#5298) | 2 years ago |
| nihui | 5329d32e74 | check vulkan fp16 uniform support and implement lfp conversion without fp16u (#5287) | 2 years ago |
| nihui | 656b082284 | fix cast armv7 sigbus when loading fp16 model (#5292) | 2 years ago |
| nihui | ba42369c68 | workaround l2 norm produce -inf value with subnormals (#5272) | 2 years ago |
| nihui | c222208cc9 | feat mask for disable threading, make some extractor setter no-op, update doc (#5270) | 2 years ago |
| nihui | a31f66203b | do not cache temporary blob for uploading weight (#5266) | 2 years ago |
| nihui | 556b79ce4d | create layer decoupled (#5258) | 2 years ago |
| Molly Sophia | 92d49e1f59 | requantize: Use activation_ss in fused_activation.h (#5245) | 2 years ago |
| nihui | d1d9aa2edb | fix some cpu.cpp warning (#5244) | 2 years ago |
| nihui | d30af29ee2 | fix simplecv Mat templated ptr (#5241) | 2 years ago |
| nihui | 6c261a8c04 | fix the missing elemsize in vkimagemat from_android_hardware_buffer (#5237) | 2 years ago |
| nihui | ded0b78bb2 | fix nvidia vulkan crash on exit (#5234) | 2 years ago |
| nihui | 8c4fc5e2a0 | enable uniform 16bit and 8bit when available, fix validation error in fp16sa shader (#5233) | 2 years ago |
| nihui | b7f70cfe4e | initialize cpu thread affinity mask all to all cores (#5231) | 2 years ago |
| nihui | 5a8ce63af4 | optimize resize bilinear and compress font data (#5200) | 2 years ago |
| nihui | eea3fc9b41 | optimize vulkan global pooling (#5191) | 2 years ago |
| nihui | 1138312f1e | detect avx512 isa with signal action on macos (#5185) | 2 years ago |
| nihui | dba87f8cad | fix build with msvc arm64 asimdhp (#5176) | 2 years ago |
| nihui | deae9e61da | disable rtti and exceptions for msvc (#5167) | 2 years ago |
| nihui | 058aa0ad37 | enable arm neon intrinsics for msvc build (#5151) | 2 years ago |
| AlOa | 9f26eeb5a7 | Prelu layer uses sse instruction _mm_load_ps but data can be misaligned so it must use _mm_loadu_ps (#5149) | 2 years ago |
| Justin Fung | 465debe9bb | Add print statements for 4 dimensions benchmark (#5148) | 2 years ago |
| nihui | 4136de3b8d | arm optimization for convolution int8 packed unified elempack (#5147) | 2 years ago |