* runtime cpu dispatch
* force thread one
* disable openmp for coverage
* simplify test layer
* print NCNN_TARGET_ARCH
* less ci build variants
* weight fp16 storage option
* test convdw int8
* apple a12 a13
* ncnn_add_layer ncnn_add_shader cmake macro
* added fp16 weight storage version
* Small changes
* Fixed fp16 weight storage layers
* fix innerproduct
* fix loop error
* Fix windows build.
Disable fp16 conversion when int8 weights are detected.
Implement requested changes.
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* Update option.cpp
Set fp16 storage based on vulkan being used or not.
* fix innerproduct activation location and add 4 parallel channel version
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* revert arm file
* implement requested changes
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
Co-authored-by: Restyled.io <commits@restyled.io>
* use Mat class for Shape description
* shape specialization constant in compute shader
* wip
* wip
* test forward_inplace, add binaryop unaryop sigmoid test
* fix arm unaryop test
* fix arm binaryop test
* make shape hint optional, cast int8 to fp32, add cast test
* wip
* follow the good old local size setting for conv1x1
* the optimal local size rewrite
* fix build on msvc
* add permute shader for all packing layout, add permute test
* concat and slice partial shape constant, slice test
* fix slice test
* interp test
* add lrn test, test packing layout implicitly
* add eltwise test
* add normalize test
* add instancenorm test
* reorg shape constant
* simple local group size partition
* add shape constant param