* get_physical_cpu_count api family
* set default to physical big cpu
* always treat smt core as big core
* is_smt_cpu
* get max freq mhz on windows
* windows thread affinity
* Simple miss count for better space efficiency
* Simple double ended greedy
* Add size drop threshold setter
* set workspace allocator cr to zero as we had some sort of recycling capability :P
Co-authored-by: LinHeLurking <LinHeLurking@users.noreply.github.com>
Co-authored-by: nihuini <nihuini@tencent.com>
* runtime cpu dispatch
* force thread one
* disable openmp for coverage
* simplify test layer
* print NCNN_TARGET_ARCH
* less ci build variants
* weight fp16 storage option
* test convdw int8
* apple a12 a13
* ncnn_add_layer ncnn_add_shader cmake macro
* Add yolov4 example option.
Add yolov4-tiny for benchmark.
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
Co-authored-by: Restyled.io <commits@restyled.io>
* Change DataReader::read()'s signature to fix warning C4267
This CL fixes many instances of warning "C4267: 'initializing': conversion
from 'size_t' to 'int'" in our codebase by matching DataReader::read()'s
signature to fread().
* Fix warnings C4244 and C4267 in tools/ncnnoptimize.cpp
C4244: 'initializing': conversion from 'double' to 'float',
possible loss of data
C4267: 'initializing': conversion from 'size_t' to 'int',
possible loss of data
* Fix warning C4244 in src\layer\selu.cpp
C4244: '=': conversion from 'double' to 'float', possible loss of data
* Fix warning C4244 in src\layer\cast.cpp
C4244: '=': conversion from 'double' to 'float', possible loss of data
C4244: 'return': conversion from 'float' to 'signed char', possible loss of data
* Fix warning C4244 in src\layer\psroipooling.cpp
C4244: 'initializing': conversion from 'double' to 'float',
possible loss of data
C4244: 'initializing': conversion from 'double' to 'int',
possible loss of data
* CMake improvement
* Fix bugs
* Fix typo
* Propagate vulkan dependency
* import vulkan
* add config files, now exported target cmake should be able to find packages
* Propagate no-rtti and no-exception
* Provide an option to control rtti and exception on mobile platforms
* Make cmake clean
* Resolve conflicts
* Update CMake
PIE is propagated by INTERFACE_POSITION_INDEPENDENT_CODE
* Remove bad things
* add the armv7a conv3x3s1 implementation without overflow, remove old code
* fix the bug of conv3x3s2 packed int8
* new int8 implementation, per-channel weight quantization, better accuracy
* fix the bug of conv3x3s1 packed int8 neon
* add the naive c fp32 and int8 winograd F(2,3)
* add the neon intrinsic int8 winograd F(2,3)
* optimize the armv7a int8 winograd F(2,3) with neon assembly
* optimize the armv7a int8 winograd F(2,3) input transform with assembly.
* add the requantize layer and int8 relu implementation
* add graph optimization conv1x1s2 -> conv1x1s1, begin optimizing int8 on aarch64
* fix int8 bugs
* add the naive c im2col with sgemm
* add aarch64 int8 winograd F(2,3), naive conv3x3s2 implementation
* add the int8 sgemm conv7x7s2 on x86/armv7a platform
* optimize the int8 sgemm by neon intrinsic and packed kernel
* optimize the int8 sgemm with packed data
* optimize the int8 sgemm with armv7a neon assembly
* add the int8 sgemm on arm64-v8a platform
* prepare to merge latest code from master
* add the int8 param files
* add the fuse_network method to class Net