* runtime cpu dispatch
* force thread one
* disable openmp for coverage
* simplify test layer
* print NCNN_TARGET_ARCH
* less ci build variants
* weight fp16 storage option
* test convdw int8
* apple a12 a13
* ncnn_add_layer ncnn_add_shader cmake macro
* added fp16 weight storage version
* Small changes
* Fixed fp16 weight storage layers
* fix innerproduct
* fix loop error
* Fix windows build.
Disable fp16 conversion when detecting int8 weights.
Implement requested changes.
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* Update option.cpp
Set fp16 storage based on vulkan being used or not.
* added ability for storing state in lstm layer
* added avx lstm
* added arm lstm
* fix innerproduct activation location and add 4 parallel channel version
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* revert arm file
* commit before switch
* implement requested changes
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* More x86 optimized implementations of common layers.
Added LSTM layers for arm and x86 + a ctest to verify the layer accuracy
Added fp16 innerproduct for arm
* fix non avx build
* Add fp16 arm compiler and cpu checks. Remove statefulness from LSTM implementation.
* Fix build check for fp16 arm
* Bypass lstm_fp16 if not supported
* Build order was incorrect
* fix std::min missing in windows build
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* attempting to fix gnu build by enabling: -mfp16-format=ieee to fix the missing __fp16 type
* remove double "fix"
* Specify ieee fp16 format
* implement requested changes
* fix arm non-fp16 build
* fix arm lstm
* Restyled/pull 1881 (#15)
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
Co-authored-by: Restyled.io <commits@restyled.io>
* Check blob size on arm lstm
* fix styling
Co-authored-by: Restyled.io <commits@restyled.io>
* added fp16 weight storage version
* Small changes
* Fixed fp16 weight storage layers
* fix innerproduct
* fix loop error
* Fix windows build.
Disable fp16 conversion when detecting int8 weights.
Implement requested changes.
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* Update option.cpp
Set fp16 storage based on vulkan being used or not.
* fix innerproduct activation location and add 4 parallel channel version
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* revert arm file
* implement requested changes
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
Co-authored-by: Restyled.io <commits@restyled.io>
* added avx implementations of FC and Max pool
* Specify AVX2
* Small fixes and using Fused avx activations
* fix type casting
* fixing some CI errors
* Fix code format
* fix pooling test
* remove vector typedef
* More compile fixes
* remove vector typedef
* set c++ version to 17
* Force c++ 17
* Fixing mathfun
* Try and workaround typedef issues
* typefix
* Remove typedef
* switch to static inline
* attempting to fix msvc bug
* Verified MSVC fix
* Fixing clang build
* commit before switch
* More avx and packing implementation
* Fix ctest
* starting the depthwise pack 8 implementation
* Unrolled loop
* add depthwise pack 8 implementations
* Working 1x1 pack 8 implementation added
* revert incorrect changes
* added concat elempack 8
* more elempack enabled layers added and started on the conversion of the winograd pack4 conv 3x3
* Added code formatting
* fix styling
* Unroll loops
* unrolling loops
* Added more elempack layers for mobilenet v3
* revert commit
* fix code style
* remove arm neon references
* remove pack4 references
* More cleanup
* added packing avx code
* fixing linux build ctests
* remove usage of aligned loads
* More aligned mem ops removed
* Cleanup, revert some files and remove non-working winograd and shufflechannel implementations
* add stackoverflow referral
* Fix windows build
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* implement requested changes
* remove reshape
* revert arm file change
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* fix unterminated directive
Co-authored-by: Restyled.io <commits@restyled.io>