* use Mat class for Shape description
* shape specialization constant in compute shader
* wip
* wip
* test forward_inplace, add binaryop unaryop sigmoid test
* fix arm unaryop test
* fix arm binaryop test
* make shape hint optional, cast int8 to fp32, add cast test
* wip
* follow the good old local size setting for conv1x1
* the optimal local size rewrite
* fix build on msvc
* add permute shader for all packing layouts, add permute test
* concat and slice partial shape constant, slice test
* fix slice test
* interp test
* add lrn test, test packing layout implicitly
* add eltwise test
* add normalize test
* add instancenorm test
* reorg shape constant
* simple local group size partition
* add shape constant param
* add folder property for a better look in Visual Studio or other IDEs that support the property
* fix condition for when protobuf is not found
* 1. change capitalization to lowercase
2. rename visual folder 'test' to 'tests'
* Change DataReader::read()'s signature to fix warning C4267
This CL fixes many instances of warning "C4267: 'initializing': conversion from
'size_t' to 'int'" in our codebase by matching DataReader::read()'s
signature to fread().
* Fix warnings C4244 and C4267 in tools/ncnnoptimize.cpp
C4244: 'initializing': conversion from 'double' to 'float',
possible loss of data
C4267: 'initializing': conversion from 'size_t' to 'int',
possible loss of data
* Fix warning C4244 in src\layer\selu.cpp
C4244: '=': conversion from 'double' to 'float', possible loss of data
* Fix warning C4244 in src\layer\cast.cpp
C4244: '=': conversion from 'double' to 'float', possible loss of data
C4244: 'return': conversion from 'float' to 'signed char', possible loss of data
* Fix warning C4244 in src\layer\psroipooling.cpp
C4244: 'initializing': conversion from 'double' to 'float',
possible loss of data
C4244: 'initializing': conversion from 'double' to 'int',
possible loss of data
* CMake improvement
* Fix bugs
* Fix typo
* Propagate vulkan dependency
* import vulkan
* add config files; the exported CMake targets should now be able to find packages
* Propagate no-rtti and no-exception
* Provide an option to control rtti and exception on mobile platforms
* Make cmake clean
* Resolve conflicts
* Update CMake
PIE is propagated by INTERFACE_POSITION_INDEPENDENT_CODE
* Remove bad things
* optimize the conv sgemm int8 on arm64-v8a platform
* optimize int8 arm64-v8a with the sadalp instruction
* merge requantize op into latest conv layer
* merge requantize op into conv-int8 op
* update the mobilenet.param in the benchmark
* Update README.md
update Kirin970 and RK3399
* try to fix the travis build error
* add the armv7a conv3x3s1 implementation without overflow, remove old code
* fix the bug of conv3x3s2 packed int8
* new int8 implementation, per-channel weight quantization, better accuracy
* fix the bug of conv3x3s1 packed int8 neon
* add the naive c fp32 and int8 winograd F(2,3)
* add the neon intrinsic int8 winograd F(2,3)
* optimize the armv7a int8 winograd F(2,3) with neon assembly
* optimize the armv7a int8 winograd F(2,3) input transform with assembly.
* add the requantize layer and int8 relu implementation
* add graph optimization conv1x1s2 -> conv1x1s1, begin optimizing int8 on aarch64
* fix int8 bugs
* add the naive C im2col with sgemm
* add aarch64 int8 winograd F(2,3), conv3x3s2 naive implementation
* add the int8 sgemm conv7x7s2 on x86/armv7a platform
* optimize the int8 sgemm by neon intrinsic and packed kernel
* optimize the int8 sgemm with packed data
* optimize the int8 sgemm with armv7a neon assembly
* add the int8 sgemm on arm64-v8a platform
* prepare to merge latest code from master
* add the int8 param files
* add the fuse_network method to class Net