Huawei_Technology / mindspore
!10052 set HCCL_CONNECT_TIMEOUT=600 for transformer distribute training

From: @yuchaojie
Reviewed-by: @linqingke, @liangchenghui
Signed-off-by: @linqingke

tags/v1.1.0
mindspore-ci-bot, Gitee, 5 years ago
parent eda6ce12ed, 0f610e11de
commit b2e98083c6
1 changed file with 1 addition and 0 deletions
model_zoo/official/nlp/transformer/scripts/run_distribute_train_ascend.sh
@@ -28,6 +28,7 @@ cd run_distribute_train || exit
 EPOCH_SIZE=$2
 DATA_PATH=$3
+export HCCL_CONNECT_TIMEOUT=600
 export RANK_TABLE_FILE=$4
 export RANK_SIZE=$1
 export HCCL_FLAG=1
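
The commit hardcodes the timeout at 600. As a possible variant (not part of this commit), the script could keep 600 as the default while letting the caller override it from the environment. The sketch below uses standard shell parameter expansion and assumes the value is interpreted in seconds.

```shell
#!/bin/bash
# Hypothetical variant, not what the commit does:
# use the caller's HCCL_CONNECT_TIMEOUT if it is already set and non-empty,
# otherwise fall back to the 600 chosen in this commit.
export HCCL_CONNECT_TIMEOUT="${HCCL_CONNECT_TIMEOUT:-600}"

# Print the effective value so it shows up in the training log.
echo "HCCL_CONNECT_TIMEOUT=${HCCL_CONNECT_TIMEOUT}"
```

With this form, a plain invocation of the script behaves like the committed version, while prefixing the invocation with `HCCL_CONNECT_TIMEOUT=900` would raise the limit for a single run without editing the script.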