test_lstm.cpp 12 kB

LSTM arm/x86 + fp16 innerproduct arm (#1881)

* added fp16 weight storage version
* Small changes
* Fixed fp16 weight storage layers
* fix innerproduct
* fix loop error
* Fix windows build. Disable fp 16 conversion when detecting int8 weights. Implement requested changes.
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* Update option.cpp Set fp16 storage based on vulkan being used or not.
* added ability for storing state in lstm layer
* added avx lstm
* added arm lstm
* fix innerproduct activation location and add 4 parallel channel version
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* revert arm file
* commit before switch
* implement requested changes
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* More x86 optimized implementations of common layers. Added LSTM layers for arm and x86 + a ctest to verify the layer accuracy Added fp16 innerproduct for arm
* fix non avx build
* Add fp16 arm compiler and cpu checks. Remove statefullness from LSTM implementation.
* Fix build check for fp16 arm
* Bypass lstm_fp16 if not supported
* Build order was incorrect
* fix std::min missing in windows build
* Restyled by clang-format
* Restyled by astyle
* Restyled by clang-format
* Restyled by astyle
* attempting to fix gnu build by enabling: -mfp16-format=ieee to fix the missing __fp16 type
* remove double "fix"
* Specify ieee fp16 format
* implement requested changes
* fix arm non-fp16 build
* fix arm lstm
* Restyled/pull 1881 (#15)
  * Restyled by clang-format
  * Restyled by astyle
  * Restyled by clang-format
  * Restyled by astyle

  Co-authored-by: Restyled.io <commits@restyled.io>
* Check blob size on arm lstm
* fix styling

Co-authored-by: Restyled.io <commits@restyled.io>
5 years ago
5 years ago
LSTM arm/x86 + fp16 innerproduct arm (#1881) * added fp16 weight storage version * Small changes * Fixed fp16 weight storage layers * fix innerproduct * fix loop error * Fix windows build. Disable fp 16 conversion when detecting int8 weights. Implement requested changes. * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * Update option.cpp Set fp16 storage based on vulkan being used or not. * added ability for storing state in lstm layer * added avx lstm * added arm lstm * fix innerproduct activation location and add 4 parallel channel version * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * revert arm file * commit before switch * implement requested changes * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * More x86 optimized implementations of common layers. Added LSTM layers for arm and x86 + a ctest to verify the layer accuracy Added fp16 innerproduct for arm * fix non avx build * Add fp16 arm compiler and cpu checks. Remove statefullness from LSTM implementation. * Fix build check for fp16 arm * Bypass lstm_fp16 if not supported * Build order was incorrect * fix std::min missing in windows build * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * attempting to fix gnu build by enabling: -mfp16-format=ieee to fix the missing __fp16 type * remove double "fix" * Specify ieee fp16 format * implement requested changes * fix arm non-fp16 build * fix arm lstm * Restyled/pull 1881 (#15) * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle Co-authored-by: Restyled.io <commits@restyled.io> * Check blob size on arm lstm * fix styling Co-authored-by: Restyled.io <commits@restyled.io>
5 years ago
LSTM arm/x86 + fp16 innerproduct arm (#1881) * added fp16 weight storage version * Small changes * Fixed fp16 weight storage layers * fix innerproduct * fix loop error * Fix windows build. Disable fp 16 conversion when detecting int8 weights. Implement requested changes. * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * Update option.cpp Set fp16 storage based on vulkan being used or not. * added ability for storing state in lstm layer * added avx lstm * added arm lstm * fix innerproduct activation location and add 4 parallel channel version * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * revert arm file * commit before switch * implement requested changes * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * More x86 optimized implementations of common layers. Added LSTM layers for arm and x86 + a ctest to verify the layer accuracy Added fp16 innerproduct for arm * fix non avx build * Add fp16 arm compiler and cpu checks. Remove statefullness from LSTM implementation. * Fix build check for fp16 arm * Bypass lstm_fp16 if not supported * Build order was incorrect * fix std::min missing in windows build * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * attempting to fix gnu build by enabling: -mfp16-format=ieee to fix the missing __fp16 type * remove double "fix" * Specify ieee fp16 format * implement requested changes * fix arm non-fp16 build * fix arm lstm * Restyled/pull 1881 (#15) * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle Co-authored-by: Restyled.io <commits@restyled.io> * Check blob size on arm lstm * fix styling Co-authored-by: Restyled.io <commits@restyled.io>
5 years ago
LSTM arm/x86 + fp16 innerproduct arm (#1881) * added fp16 weight storage version * Small changes * Fixed fp16 weight storage layers * fix innerproduct * fix loop error * Fix windows build. Disable fp 16 conversion when detecting int8 weights. Implement requested changes. * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * Update option.cpp Set fp16 storage based on vulkan being used or not. * added ability for storing state in lstm layer * added avx lstm * added arm lstm * fix innerproduct activation location and add 4 parallel channel version * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * revert arm file * commit before switch * implement requested changes * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * More x86 optimized implementations of common layers. Added LSTM layers for arm and x86 + a ctest to verify the layer accuracy Added fp16 innerproduct for arm * fix non avx build * Add fp16 arm compiler and cpu checks. Remove statefullness from LSTM implementation. * Fix build check for fp16 arm * Bypass lstm_fp16 if not supported * Build order was incorrect * fix std::min missing in windows build * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * attempting to fix gnu build by enabling: -mfp16-format=ieee to fix the missing __fp16 type * remove double "fix" * Specify ieee fp16 format * implement requested changes * fix arm non-fp16 build * fix arm lstm * Restyled/pull 1881 (#15) * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle Co-authored-by: Restyled.io <commits@restyled.io> * Check blob size on arm lstm * fix styling Co-authored-by: Restyled.io <commits@restyled.io>
5 years ago
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291
// Tencent is pleased to support the open source community by making ncnn available.
//
// Copyright (C) 2020 THL A29 Limited, a Tencent company. All rights reserved.
//
// Licensed under the BSD 3-Clause License (the "License"); you may not use this file except
// in compliance with the License. You may obtain a copy of the License at
//
// https://opensource.org/licenses/BSD-3-Clause
//
// Unless required by applicable law or agreed to in writing, software distributed
// under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
// CONDITIONS OF ANY KIND, either express or implied. See the License for the
// specific language governing permissions and limitations under the License.

#include "layer/lstm.h"
#include "testutil.h"

// Single-input, single-output LSTM test.
// direction == 2 selects two directions; any other value selects one.
static int test_lstm(const ncnn::Mat& a, int outch, int direction, int hidden_size = 0)
{
    int input_size = a.w;
    int num_directions = direction == 2 ? 2 : 1;
    // hidden_size defaults to the output size when the caller leaves it at 0
    if (hidden_size == 0)
        hidden_size = outch;

    ncnn::ParamDict pd;
    pd.set(0, outch);
    pd.set(1, hidden_size * input_size * 4 * num_directions);
    pd.set(2, direction);
    pd.set(3, hidden_size);

    // hidden_size was defaulted to outch above, so it is nonzero here whenever
    // outch is nonzero and the four-weight branch is the one taken
    std::vector<ncnn::Mat> weights(hidden_size == 0 ? 3 : 4);
    weights[0] = RandomMat(hidden_size * input_size * 4 * num_directions);
    weights[1] = RandomMat(hidden_size * 4 * num_directions);
    weights[2] = RandomMat(outch * hidden_size * 4 * num_directions);
    if (hidden_size)
    {
        weights[3] = RandomMat(hidden_size * outch * num_directions);
    }

    int ret = test_layer<ncnn::LSTM>("LSTM", pd, weights, a);
    if (ret != 0)
    {
        fprintf(stderr, "test_lstm failed a.dims=%d a=(%d %d %d) outch=%d direction=%d hidden_size=%d\n", a.dims, a.w, a.h, a.c, outch, direction, hidden_size);
    }

    return ret;
}

// Three inputs (data, initial hidden state, initial cell state);
// the trailing 3 passed to test_layer requests three outputs.
int test_lstm_layer_with_hidden(const ncnn::Mat& a, int outch, int direction, int hidden_size = 0)
{
    int input_size = a.w;
    int num_directions = direction == 2 ? 2 : 1;
    if (hidden_size == 0)
        hidden_size = outch;

    ncnn::ParamDict pd;
    pd.set(0, outch);
    pd.set(1, hidden_size * input_size * 4 * num_directions);
    pd.set(2, direction);
    pd.set(3, hidden_size);

    std::vector<ncnn::Mat> weights(hidden_size == 0 ? 3 : 4);
    weights[0] = RandomMat(hidden_size * input_size * 4 * num_directions);
    weights[1] = RandomMat(hidden_size * 4 * num_directions);
    weights[2] = RandomMat(outch * hidden_size * 4 * num_directions);
    if (hidden_size)
    {
        weights[3] = RandomMat(hidden_size * outch * num_directions);
    }

    // initial hidden state
    ncnn::Mat hidden = RandomMat(outch, num_directions);

    // initial cell state
    ncnn::Mat cell = RandomMat(hidden_size, num_directions);

    std::vector<ncnn::Mat> as(3);
    as[0] = a;
    as[1] = hidden;
    as[2] = cell;

    int ret = test_layer<ncnn::LSTM>("LSTM", pd, weights, as, 3);
    if (ret != 0)
    {
        fprintf(stderr, "test_lstm_layer_with_hidden failed a.dims=%d a=(%d %d %d) outch=%d direction=%d hidden_size=%d\n", a.dims, a.w, a.h, a.c, outch, direction, hidden_size);
    }

    return ret;
}

// Three inputs (data, initial hidden state, initial cell state);
// the trailing 1 passed to test_layer requests a single output.
int test_lstm_layer_with_hidden_input(const ncnn::Mat& a, int outch, int direction, int hidden_size = 0)
{
    int input_size = a.w;
    int num_directions = direction == 2 ? 2 : 1;
    if (hidden_size == 0)
        hidden_size = outch;

    ncnn::ParamDict pd;
    pd.set(0, outch);
    pd.set(1, hidden_size * input_size * 4 * num_directions);
    pd.set(2, direction);
    pd.set(3, hidden_size);

    std::vector<ncnn::Mat> weights(hidden_size == 0 ? 3 : 4);
    weights[0] = RandomMat(hidden_size * input_size * 4 * num_directions);
    weights[1] = RandomMat(hidden_size * 4 * num_directions);
    weights[2] = RandomMat(outch * hidden_size * 4 * num_directions);
    if (hidden_size)
    {
        weights[3] = RandomMat(hidden_size * outch * num_directions);
    }

    // initial hidden state
    ncnn::Mat hidden = RandomMat(outch, num_directions);

    // initial cell state
    ncnn::Mat cell = RandomMat(hidden_size, num_directions);

    std::vector<ncnn::Mat> as(3);
    as[0] = a;
    as[1] = hidden;
    as[2] = cell;

    int ret = test_layer<ncnn::LSTM>("LSTM", pd, weights, as, 1);
    if (ret != 0)
    {
        fprintf(stderr, "test_lstm_layer_with_hidden_input failed a.dims=%d a=(%d %d %d) outch=%d direction=%d hidden_size=%d\n", a.dims, a.w, a.h, a.c, outch, direction, hidden_size);
    }

    return ret;
}

// One input (data only, no initial states);
// the trailing 3 passed to test_layer requests three outputs.
int test_lstm_layer_with_hidden_output(const ncnn::Mat& a, int outch, int direction, int hidden_size = 0)
{
    int input_size = a.w;
    int num_directions = direction == 2 ? 2 : 1;
    if (hidden_size == 0)
        hidden_size = outch;

    ncnn::ParamDict pd;
    pd.set(0, outch);
    pd.set(1, hidden_size * input_size * 4 * num_directions);
    pd.set(2, direction);
    pd.set(3, hidden_size);

    std::vector<ncnn::Mat> weights(hidden_size == 0 ? 3 : 4);
    weights[0] = RandomMat(hidden_size * input_size * 4 * num_directions);
    weights[1] = RandomMat(hidden_size * 4 * num_directions);
    weights[2] = RandomMat(outch * hidden_size * 4 * num_directions);
    if (hidden_size)
    {
        weights[3] = RandomMat(hidden_size * outch * num_directions);
    }

    std::vector<ncnn::Mat> as(1);
    as[0] = a;

    int ret = test_layer<ncnn::LSTM>("LSTM", pd, weights, as, 3);
    if (ret != 0)
    {
        fprintf(stderr, "test_lstm_layer_with_hidden_output failed a.dims=%d a=(%d %d %d) outch=%d direction=%d hidden_size=%d\n", a.dims, a.w, a.h, a.c, outch, direction, hidden_size);
    }

    return ret;
}

static int test_lstm_0()
{
    return 0
           || test_lstm(RandomMat(4, 1), 2, 2)
           || test_lstm(RandomMat(8, 2), 2, 2)
           || test_lstm(RandomMat(16, 8), 7, 2)
           || test_lstm(RandomMat(17, 8), 8, 2)
           || test_lstm(RandomMat(19, 15), 8, 2)
           || test_lstm(RandomMat(5, 16), 16, 2)
           || test_lstm(RandomMat(3, 16), 8, 2)
           || test_lstm(RandomMat(8, 16), 16, 2)
           || test_lstm(RandomMat(2, 5), 17, 2, 15);
}

static int test_lstm_1()
{
    return 0
           || test_lstm_layer_with_hidden(RandomMat(4, 4), 1, 2)
           || test_lstm_layer_with_hidden(RandomMat(8, 2), 2, 2)
           || test_lstm_layer_with_hidden(RandomMat(16, 8), 7, 2)
           || test_lstm_layer_with_hidden(RandomMat(17, 8), 8, 2)
           || test_lstm_layer_with_hidden(RandomMat(19, 15), 8, 2)
           || test_lstm_layer_with_hidden(RandomMat(5, 16), 16, 2)
           || test_lstm_layer_with_hidden(RandomMat(3, 16), 8, 2)
           || test_lstm_layer_with_hidden(RandomMat(2, 5), 99, 2, 33)
           || test_lstm_layer_with_hidden(RandomMat(4, 4), 1, 1)
           || test_lstm_layer_with_hidden(RandomMat(8, 2), 2, 1)
           || test_lstm_layer_with_hidden(RandomMat(16, 8), 7, 1)
           || test_lstm_layer_with_hidden(RandomMat(17, 8), 8, 1)
           || test_lstm_layer_with_hidden(RandomMat(19, 15), 8, 1)
           || test_lstm_layer_with_hidden(RandomMat(5, 16), 16, 1)
           || test_lstm_layer_with_hidden(RandomMat(3, 16), 8, 1)
           || test_lstm_layer_with_hidden(RandomMat(2, 5), 99, 1, 33)
           || test_lstm_layer_with_hidden(RandomMat(4, 2), 1, 0)
           || test_lstm_layer_with_hidden(RandomMat(8, 2), 2, 0)
           || test_lstm_layer_with_hidden(RandomMat(16, 8), 7, 0)
           || test_lstm_layer_with_hidden(RandomMat(17, 8), 8, 0)
           || test_lstm_layer_with_hidden(RandomMat(19, 15), 8, 0)
           || test_lstm_layer_with_hidden(RandomMat(5, 16), 16, 0)
           || test_lstm_layer_with_hidden(RandomMat(3, 16), 8, 0)
           || test_lstm_layer_with_hidden(RandomMat(2, 5), 17, 0, 15)

           || test_lstm_layer_with_hidden_input(RandomMat(4, 4), 1, 2)
           || test_lstm_layer_with_hidden_input(RandomMat(8, 2), 2, 2)
           || test_lstm_layer_with_hidden_input(RandomMat(16, 8), 7, 2)
           || test_lstm_layer_with_hidden_input(RandomMat(17, 8), 8, 2)
           || test_lstm_layer_with_hidden_input(RandomMat(19, 15), 8, 2)
           || test_lstm_layer_with_hidden_input(RandomMat(5, 16), 16, 2)
           || test_lstm_layer_with_hidden_input(RandomMat(3, 16), 8, 2)
           || test_lstm_layer_with_hidden_input(RandomMat(2, 5), 99, 2, 33)
           || test_lstm_layer_with_hidden_input(RandomMat(4, 4), 1, 1)
           || test_lstm_layer_with_hidden_input(RandomMat(8, 2), 2, 1)
           || test_lstm_layer_with_hidden_input(RandomMat(16, 8), 7, 1)
           || test_lstm_layer_with_hidden_input(RandomMat(17, 8), 8, 1)
           || test_lstm_layer_with_hidden_input(RandomMat(19, 15), 8, 1)
           || test_lstm_layer_with_hidden_input(RandomMat(5, 16), 16, 1)
           || test_lstm_layer_with_hidden_input(RandomMat(3, 16), 8, 1)
           || test_lstm_layer_with_hidden_input(RandomMat(2, 5), 99, 1, 33)
           || test_lstm_layer_with_hidden_input(RandomMat(4, 2), 1, 0)
           || test_lstm_layer_with_hidden_input(RandomMat(8, 2), 2, 0)
           || test_lstm_layer_with_hidden_input(RandomMat(16, 8), 7, 0)
           || test_lstm_layer_with_hidden_input(RandomMat(17, 8), 8, 0)
           || test_lstm_layer_with_hidden_input(RandomMat(19, 15), 8, 0)
           || test_lstm_layer_with_hidden_input(RandomMat(5, 16), 16, 0)
           || test_lstm_layer_with_hidden_input(RandomMat(3, 16), 8, 0)
           || test_lstm_layer_with_hidden_input(RandomMat(2, 5), 17, 0, 15)

           || test_lstm_layer_with_hidden_output(RandomMat(4, 4), 1, 2)
           || test_lstm_layer_with_hidden_output(RandomMat(8, 2), 2, 2)
           || test_lstm_layer_with_hidden_output(RandomMat(16, 8), 7, 2)
           || test_lstm_layer_with_hidden_output(RandomMat(17, 8), 8, 2)
           || test_lstm_layer_with_hidden_output(RandomMat(19, 15), 8, 2)
           || test_lstm_layer_with_hidden_output(RandomMat(5, 16), 16, 2)
           || test_lstm_layer_with_hidden_output(RandomMat(3, 16), 8, 2)
           || test_lstm_layer_with_hidden_output(RandomMat(2, 5), 99, 2, 33)
           || test_lstm_layer_with_hidden_output(RandomMat(4, 4), 1, 1)
           || test_lstm_layer_with_hidden_output(RandomMat(8, 2), 2, 1)
           || test_lstm_layer_with_hidden_output(RandomMat(16, 8), 7, 1)
           || test_lstm_layer_with_hidden_output(RandomMat(17, 8), 8, 1)
           || test_lstm_layer_with_hidden_output(RandomMat(19, 15), 8, 1)
           || test_lstm_layer_with_hidden_output(RandomMat(5, 16), 16, 1)
           || test_lstm_layer_with_hidden_output(RandomMat(3, 16), 8, 1)
           || test_lstm_layer_with_hidden_output(RandomMat(2, 5), 99, 1, 33)
           || test_lstm_layer_with_hidden_output(RandomMat(4, 2), 1, 0)
           || test_lstm_layer_with_hidden_output(RandomMat(8, 2), 2, 0)
           || test_lstm_layer_with_hidden_output(RandomMat(16, 8), 7, 0)
           || test_lstm_layer_with_hidden_output(RandomMat(17, 8), 8, 0)
           || test_lstm_layer_with_hidden_output(RandomMat(19, 15), 8, 0)
           || test_lstm_layer_with_hidden_output(RandomMat(5, 16), 16, 0)
           || test_lstm_layer_with_hidden_output(RandomMat(3, 16), 8, 0)
           || test_lstm_layer_with_hidden_output(RandomMat(2, 5), 17, 0, 15);
}

static int test_lstm_2()
{
    return 0
           || test_lstm(RandomMat(4, 1), 1, 0)
           || test_lstm(RandomMat(8, 2), 2, 0)
           || test_lstm(RandomMat(16, 8), 7, 0)
           || test_lstm(RandomMat(17, 8), 8, 0)
           || test_lstm(RandomMat(19, 15), 8, 0)
           || test_lstm(RandomMat(5, 16), 16, 0)
           || test_lstm(RandomMat(3, 16), 8, 0)
           || test_lstm(RandomMat(8, 16), 16, 0)
           || test_lstm(RandomMat(2, 5), 17, 0, 15);
}

static int test_lstm_3()
{
    return 0
           || test_lstm(RandomMat(4, 1), 1, 1)
           || test_lstm(RandomMat(8, 2), 2, 1)
           || test_lstm(RandomMat(16, 8), 7, 1)
           || test_lstm(RandomMat(17, 8), 8, 1)
           || test_lstm(RandomMat(19, 15), 8, 1)
           || test_lstm(RandomMat(5, 16), 16, 1)
           || test_lstm(RandomMat(3, 16), 8, 1)
           || test_lstm(RandomMat(8, 16), 16, 1)
           || test_lstm(RandomMat(2, 5), 17, 1, 15);
}

int main()
{
    SRAND(7767517);

    return 0 || test_lstm_0() || test_lstm_1() || test_lstm_2() || test_lstm_3();
}