1610 Commits (a5e60ae11cdd52d0071bc34bdd609eb53dfaf18d)

Author SHA1 Message Date
  Ikko Ashimine cdba4ae936
Fix typo in stb_image.h (#4358) 3 years ago
  nihui eceac35a7f
implement MultiheadAttention kdim vdim (#4347) 3 years ago
  nihui 498ca7341b
squeeze and expanddims 4d (#4346) 3 years ago
  Lry89757 6a47f8d15c
gridsample op support (#4288) 3 years ago
  junchao-loongson 279222c2c9
add vector optimization for loongarch64 (#4242) 3 years ago
  nihui 5b28c1730e
implement ncnn fold and unfold (#4326) 3 years ago
  Xavier Hsinyuan d1ac1de7ab
RVV: InstanceNorm with fp16s(a) support (#4078) 3 years ago
  Xavier Hsinyuan 31602bd2dc
RVV: BatchNorm with fp16s(a) support (#4075) 3 years ago
  nihui 6e49fa30dc
groupnorm 1d/2d/4d (#4312) 3 years ago
  nihui b853b3d132
get_physical_cpu_count api family (#4302) 3 years ago
  nihui 9c6f1107d2
fix #4315 (#4316) 3 years ago
  nihui 5ee276cdf7
x86 unified fc fp32/fp16s (#4303) 3 years ago
  nihui 512e584a6a
general cpu feature detection on macos/ios, enable bf16 and i8mm on a15 a16 and m2 (#4300) 3 years ago
  bestpower a116e005b8
Fix linux build error(#4265) (#4294) 3 years ago
  nihui 8eab5ea0ea
x86 sse2/avx2 optimization for convolution sgemm/winograd int8 family (#4286) 3 years ago
  Fangjun Kuang 5281d51535
implement GLU and pnnx conversion (#4283) 3 years ago
  Yoh bb660d09b8
add elu vulkan operator (#4280) 3 years ago
  nihui 0b591b0d1f
implement layer feature disabled bit (#4278) 3 years ago
  Eahow Chen f80c2743e7
fix compile warning with gcc 9.1.0 including simplestl.h file (#4274) 3 years ago
  miemie2013 b13c2a16ce
Optimize x86 DeformableConv2D (#4128) 3 years ago
  nihui 77eda4c19f
implement lstm proj_size (#4263) 3 years ago
  nihui 3e2b3fa04d
more stricter armv7 fp16 and armv84 bf16 compiler check, fix #4147 fix #4222 (#4247) 3 years ago
  LinHe 9426e21166
Memory Pool Improvement For Variadic Sized Inputs (#4190) 3 years ago
  Xavier Hsinyuan e7eadca6c1
RVV: use new interface for segment load/store & change word_type to size_t&add clang ci (part #4100) (#4118) 3 years ago
  汤圆奶昔 d30fc825d4
style: space alignment (#4217) 3 years ago
  Lry89757 5eb56b2ea5
[Gelu x86] Finish intrinsic with elempack merged(fast version) (#4144) 3 years ago
  Lry89757 9f59711338
[Prelu x86] Finish intrinsic with elempack merged (#4177) 3 years ago
  luqiang guo 5148224516
optmize softmax arm neon (#4171) 3 years ago
  Menci 479a73a62a
remove duplicated newline (#4188) 3 years ago
  Molly Sophia 1d7b2172cc
remove duplicated newline (#4187) 3 years ago
  Lry89757 9278f90114
[Elu x86] Finish intrinsic with elempack merged (#4153) 3 years ago
  nanjoin 3c0096c548
fix ConvolutionDepthwise allocator not updated (#4173) 3 years ago
  tpoisonooo acbaaa665b
fix compile warnings for unused parameter (#4131) 3 years ago
  Lry89757 00c08d7bda
[Batchnorm x86] Merge the multiple elempack (#4085) 3 years ago
  LinHe 03f2ad38ce
Layer Norm x86 SIMD Optimizations (#4065) 3 years ago
  nihui b4ba207c18
more strict compiler rvv checks, drop rvv-071 support (#4094) 3 years ago
  nihui 0666143513
fix vulkan winograd weight layout with cooperative matrix enabled (#4093) 3 years ago
  miemie2013 720f3c9aab
Add DeformableConv2D (#4070) 3 years ago
  nihui 4f414c1806
implement 4d memorydata (#4074) 3 years ago
  Lry89757 13a9533984
[BatchNorm Optimize x86] AVX512 intrinsic (#4061) 4 years ago
  nihui 30ab31cc41
add address sanitizer ci, fix potential memory leak shouted by asan (#4058) 4 years ago
  nihui 0ea7a672fa
fix undefined reference to vkGetAndroidHardwareBufferPropertiesANDROID, add android-29 shared ci (#4056) 4 years ago
  nihui 4bc4a5ed0b
check mat create oom (#4054) 4 years ago
  nihui 1d0917c83b
fix build with very old gcc (#4048) 4 years ago
  nihui b0c40fa644
unified arm eltwise elempack (#4040) 4 years ago
  nihui 76849cede4
armv8.4 i8mm optimization for convolution gemm int8 (#4034) 4 years ago
  nihui dd86cebab8
armv8.6 ci and coverage (#4025) 4 years ago
  nihui f1ea792b26
fix too many microtask error in old libomp runtime (#4002) 4 years ago
  nihui 9b8272e86d
arm edsp and arm neon optimization for convolution int8 winograd (#4017) 4 years ago
  nihui a12cd7c212
mips msa and loongson mmi optimization for convolution int8 winograd f43 (#4014) 4 years ago