i-robot
e09a674bcc
!32529 Fix optimizer limit for the Parallel Optimizer
Merge pull request !32529 from huangxinjing/fix_model_limit
3 years ago
huangxinjing
b16dbf2b5d
1. Fix optimizer check error, as the check was done naively by class name
2. Add virtual assign add to the operator
4 years ago
i-robot
acce047dfe
!32341 add parallel ops about Invert CheckValid PopulationCount
Merge pull request !32341 from yangzhenzhang/add-parallel-operators
4 years ago
i-robot
72414ff8d9
!31941 modify Strided_slice for master
Merge pull request !31941 from lilei/modify_stridedslice_for_master
4 years ago
yangzhenzhang
c927da3b41
add parallel operators
4 years ago
i-robot
9810fa53cb
!31899 [Auto-Par] [D-Rec] Add Mem & Redis coefficient on D-Rec cost model for Pangu-alpha
Merge pull request !31899 from FRHW-WANG/D-Rec-deliver
4 years ago
haoran.wang
3674e1d713
Add Mem & Redis coefficient for PanGu-alpha
4 years ago
i-robot
e897c98b1f
!31988 Implementation of SquaredDifferenceInfo, ErfinvInfo, MaskedFillInfo, SplitVInfo, GammaInfo, KLDivLossInfo and LinSpaceInfo.
Merge pull request !31988 from liuluobin/ops_impl
4 years ago
liuluobin
6f914b8b3c
Implementation of SquaredDifferenceInfo, ErfinvInfo, MaskedFillInfo, SplitVInfo, GammaInfo, KLDivLossInfo and LinSpaceInfo
4 years ago
i-robot
a9cdbd5ae8
!32005 fix bugs of moe
Merge pull request !32005 from bichaoyang/master
4 years ago
b00518648
93da6bab46
fix bugs of moe: use a smaller dp in moe only
4 years ago
lilei
452362332e
modify stridedslice for master
4 years ago
wangjun
789539cbaa
modify interface name for shard
4 years ago
haoran.wang
fdfbe2dedc
Modify the names of the functions and variables for parameter-shared user strategy treatment
4 years ago
i-robot
c2212f88b4
!31164 Fix the global norm missing insert allreduce
Merge pull request !31164 from huangxinjing/fx_global_norm_error
4 years ago
i-robot
bf03f0e030
!31252 Implementation of element wise parallel ops
Merge pull request !31252 from liuluobin/element_wise_ops
4 years ago
liuluobin
f13d342986
Implementation of element wise parallel ops
4 years ago
huangxinjing
31f55b6525
1. The main goal: fix the missing insertion of the AllReduce when no mirror appeared
2. Remove pattern match error, as the original pattern match finds no operator when there is only one parameter
4 years ago
yangzhenzhang
1f98ffb79c
adafactor parallel: skip handling reshape
4 years ago
i-robot
dcb5cd670c
!30953 Dynamic Weight Decay
Merge pull request !30953 from wanyiming/dynamic_wd
4 years ago
lilei
690c58ebcf
fix virtualdataset bug for master
4 years ago
i-robot
c2a5cc1486
!31040 Produce parallel operators for Argmin/max, SquareSumAll and UnsortedSegmentProd
Merge pull request !31040 from Bert0108/reduce_operators_arg
4 years ago
i-robot
216e7c6a92
!31041 add check for conv2d
Merge pull request !31041 from yangzhenzhang/add-check-for-conv2d
4 years ago
yangzhenzhang
c00d29f223
rebase
4 years ago
liuluobin
8f045d02e3
Fix a bug where the ROIAlign and CropAndResize distributed ops do not support GPU
4 years ago
Bert0108
bfc5e4345c
add distributed operators for argmax/min, squaresumall and unsortedsegmentprod
4 years ago
wanyiming
a124ec4de7
add dynamic_decay
4 years ago
i-robot
335ef1c270
!30459 Add ut validate function for parallel
Merge pull request !30459 from liuluobin/ut_master
4 years ago
liuluobin
b797a410cc
Add validate function for parallel ut
4 years ago
Bert0108
dfc92f1791
add distributed parallel operators for reduceall and reduceprod
4 years ago
yao_yf
b60e54e0d5
support sizes that are not powers of 2
4 years ago
wangjun
46612fabfb
add st for shard
4 years ago
i-robot
ad9757ccf0
!30661 [Auto parallel] [MoE] Fix an error of configuring MoE parallel
Merge pull request !30661 from Xiaoda/124-moe-changes
4 years ago
Xiaoda Zhang
81e5abe580
fix an error of configuring parallel
4 years ago
huangxinjing
896daee845
[AutoParallel]Fix insert error for the mirror
4 years ago
yangzhenzhang
43e6e16da3
check platform for resizebilinear
4 years ago
i-robot
0341d96dd6
!30469 add shard function to support part of the graph executed in auto_parallel under pynative mode
Merge pull request !30469 from wangjun/0223_pp
4 years ago
i-robot
cfe0f76d2b
!30491 ut for allgather fusion
Merge pull request !30491 from jiahongQian/master
4 years ago
wangjun
24d448239c
add pynative_parallel
4 years ago
i-robot
981eae461a
!30118 Automatic optimizer parallel feature
Merge pull request !30118 from zhuyuxiao/I4S85V
4 years ago
jiahongQian
25f57505bf
ut for allgather fusion
4 years ago
i-robot
bbcfbce9e0
!29997 [Auto parallel] [MoE] Support data_parallel + expert_parallel
Merge pull request !29997 from Xiaoda/124-moe-changes
4 years ago
zhuyuxiao
d0e0e305d3
good
4 years ago
i-robot
f2130e7434
!30483 [AutoParallel]Pipeline Automatic detection Opt
Merge pull request !30483 from lichen/pipeline_opt_detection
4 years ago
yao_yf
e21f878e14
adasum ut fix
4 years ago
Xiaoda Zhang
b714451937
implementing expert_parallel+data_parallel in MoE:
1) extending _Linear's input as 4-dimension tensor: [outer_batch, expert_dim, -1, hidden], and _Linear's BatchMatMul becomes BatchMatMul(4_dim_tensor, 3_dim_tensor);
2) configuring the _Linear's BatchMatMul sharding strategy as [[dp, ep, 1, 1], [ep, 1, mp]];
3) introducing a new parameter 'expert_parallel' in TransformerOpParallelConfig, creating a new class MoEParallelConfig to include 'data_parallel', 'model_parallel' and 'expert_parallel';
4) changing parallel config for FeedForward, TransformerEncoderLayer, TransformerDecoderLayer.
4 years ago
wangshengnan12@huawei.com
acbefd80ea
pipeline_opt_detection
4 years ago
i-robot
81260a2319
!30466 takedown test_auto_parallel_adasum.py to ensure stability, again
Merge pull request !30466 from yanghaoran/master
4 years ago
i-robot
14393503b7
!30431 allreduce allgather fusion
Merge pull request !30431 from jiahongQian/master
4 years ago
yanghaoran
71d6b7d506
takedown test_auto_parallel_adasum.py to ensure stability, again
4 years ago