* create layer decoupled
* no more virtual public
* allow build test with shared library
* decouple cpu vulkan
* drop old scripts
* Finish the gelu x86 intrinsics
* Finish the fast tanh x86 simd impl