ms_yan
36a8886ca2
Revert "[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset"
This reverts commit b077aa1cab.
Revert "[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset"
This reverts commit 4e6f7dc97d.
delete pass_registry_test.cc
comment out the hiai_nlu_model_multi.pb related line
4 years ago
djc
4e6f7dc97d
[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset
4 years ago
yao_yf
a83bf73298
unify auto_parallel_context interface dataset_strategy
4 years ago
yao_yf
dc7dc7d3fa
dataset strategy set
4 years ago
Xiaoda Zhang
bb5d4212f7
enable All2All in inferring redistribution ops
4 years ago
Xiaoda Zhang
04381273b3
Add the sharding propagation function:
1) users configure sharding strategies for operators;
2) the framework propagates the strategies from configured ops to non-configured ops using BFS;
3) the propagation goal is to minimize the redistribution communication cost;
4 years ago
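The propagation steps described in the commit above can be sketched as follows. This is a hypothetical illustration, not MindSpore's actual implementation: the graph structure, the `redistribution_cost` function, and the operator names are all invented for the example.

```python
from collections import deque

def redistribution_cost(src_strategy, dst_strategy):
    # Toy cost: number of tensor dimensions whose shard count differs.
    return sum(a != b for a, b in zip(src_strategy, dst_strategy))

def propagate_strategies(graph, configured):
    """BFS from user-configured ops to unconfigured ops.

    graph: op name -> list of neighboring op names (undirected).
    configured: op name -> sharding strategy tuple set by the user.
    Each unconfigured op adopts the already-decided neighbor strategy
    with the smallest redistribution cost relative to the current op.
    """
    strategies = dict(configured)
    queue = deque(configured)
    while queue:
        op = queue.popleft()
        for nxt in graph.get(op, []):
            if nxt in strategies:
                continue
            # Candidate strategies come from nxt's already-decided neighbors.
            candidates = [strategies[n] for n in graph.get(nxt, []) if n in strategies]
            strategies[nxt] = min(
                candidates, key=lambda s: redistribution_cost(strategies[op], s)
            )
            queue.append(nxt)
    return strategies

# Tiny example: only "matmul" is configured; BFS fills in the rest.
graph = {"matmul": ["relu"], "relu": ["matmul", "add"], "add": ["relu"]}
result = propagate_strategies(graph, {"matmul": (2, 4)})
print(result)
```

With a single configured op there is only one candidate at each step, so every op inherits the `(2, 4)` strategy; with several configured ops, the cost function is what arbitrates between conflicting neighbors.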
lichenever
cb438ce350
rectification_log
4 years ago
Ziyan
95ac0f6d58
fix optimizer weight shard config
4 years ago
huangxinjing
e79db658e8
Fix codex for python file
4 years ago
Ziyan
2a752f24bf
enable partially using optimizer weight shard
5 years ago
Ziyan
d19d42ee44
modify grad accu and comm fusion api
5 years ago
liujunzhu
6541b96c40
Add communication parallel mode.
5 years ago
yangzhenzhang
7303c3d3b8
add group ckpt
5 years ago
yangzhenzhang
9da3f9bec9
mini step grad accumulation
5 years ago
lizhenyu
7eb49cfce7
[bugfix] server core dump after training
5 years ago
jinyaohui
e6f9806cfb
add broadcast
5 years ago
lichenever
cfffff2875
add check for allreduce fusion
5 years ago
huangxinjing
2fa6a3b3c2
Fix doc error
5 years ago
mindspore-ci-bot
9bd34a1b29
!6673 Add stage information for ops and strategy
Merge pull request !6673 from huangxinjing/stage_strategy
5 years ago
huangxinjing
4ef439e27b
Add stage information for ops and strategy
5 years ago
lichenever
395d3f0848
add_limit_for_allreduce_fusion
5 years ago
huangxinjing
8ba1503135
Add default value for auto search parallel mode
5 years ago
yao_yf
b70204c080
auto parallel context: add notes and move functions
5 years ago
Ziyan
8ea177e614
fix_api_problems
5 years ago
lichenever
f2d3fd34ce
rectification_allreduce_fusion_api
5 years ago
yao_yf
d4cfe55c04
rename mirror_mean to gradients_mean
5 years ago
yao_yf
8f7aa5bd5a
auto parallel context modify
5 years ago
Yi Huaijie
394be43492
raise RuntimeError when setting a different mode after an Initializer is created
5 years ago
Yi Huaijie
89a4ebf8a1
parallel mode must be set before creating an initializer
5 years ago
yuchaojie
64a1560f1a
add allreduce group for resnet gpu version
5 years ago
yuchaojie
ed9cf2036c
add nccl default allreduce fusion group
5 years ago
Ziyan
39f08eb7dd
enable optimizer parallel
5 years ago
Ziyan
0925e35252
enable optimizer parallel with broadcast
5 years ago
hongxing
d798325127
add a description of the default value
5 years ago
hongxing
3ad3a71fc7
change interface
5 years ago
mindspore-ci-bot
4df861cb62
!1672 support load full dataset on each device
Merge pull request !1672 from yihuaijie/dev
5 years ago
kswang
699166e552
default fusion group for ge
5 years ago
Yi Huaijie
e5c351690b
support load full dataset on each device
5 years ago
kswang
503dd297c5
set all reduce fusion default group
5 years ago
kswang
362bbacf19
add group for allreduce fusion
5 years ago
yao_yf
6cde5f6d91
auto parallel strategy checkpoint
5 years ago
lirongzhen
4ff418084c
enable/disable allreduce_fusion
5 years ago
leonwanghui
24b26ee1a8
Move args_type_check function to _checkparam.py
5 years ago
yao_yf
6fdcc24585
Integrate two allreduce fusion set interfaces into one
6 years ago
zhunaipan
930a1fb0a8
initial version
Signed-off-by: leonwanghui <leon.wanghui@huawei.com>
6 years ago