!10052 set HCCL_CONNECT_TIMEOUT=600 for transformer distribute training

From: @yuchaojie
Reviewed-by: @linqingke,@liangchenghui
Signed-off-by: @linqingke
tags/v1.1.0
mindspore-ci-bot Gitee 5 years ago
parent
commit
b2e98083c6
1 changed file with 1 addition and 0 deletions
model_zoo/official/nlp/transformer/scripts/run_distribute_train_ascend.sh  +1 -0

@@ -28,6 +28,7 @@ cd run_distribute_train || exit
 EPOCH_SIZE=$2
 DATA_PATH=$3
 
+export HCCL_CONNECT_TIMEOUT=600
 export RANK_TABLE_FILE=$4
 export RANK_SIZE=$1
 export HCCL_FLAG=1
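
Note: with the value hard-coded, a caller cannot adjust the wait without editing the script. Below is a minimal sketch of an overridable variant. It assumes HCCL reads HCCL_CONNECT_TIMEOUT as a whole number of seconds. `resolve_hccl_timeout` is a hypothetical helper and is not part of this commit.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print the timeout to use.
# Falls back to 600, the value this commit sets, when no argument is given.
resolve_hccl_timeout() {
    local value="${1:-600}"
    case "$value" in
        # Reject anything containing a non-digit character.
        *[!0-9]*)
            echo "invalid HCCL_CONNECT_TIMEOUT: '$value'" >&2
            return 1
            ;;
    esac
    echo "$value"
}

# Keep a value already exported by the caller; otherwise use the default.
HCCL_CONNECT_TIMEOUT="$(resolve_hccl_timeout "${HCCL_CONNECT_TIMEOUT:-}")" || exit 1
export HCCL_CONNECT_TIMEOUT
```

With this variant, `HCCL_CONNECT_TIMEOUT=1200 bash run_distribute_train_ascend.sh ...` would use the caller's value, and an unset variable would still resolve to 600.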

