i-robot
017cb5f3ad
!27980 auto insert VirtualDataset node for master
Merge pull request !27980 from lilei/insert_virtualdataset_for_master
4 years ago
yangzhenzhang
e5df74e9e4
compute top/bottom overlap for conv2d
4 years ago
Xiaoda Zhang
6d8320fa66
1) fix the exact division in MoE;
2) change CumSum from a composition to a single operator;
3) add InferMirrorOps for CumSumInfo.
4 years ago
Xiaoda Zhang
1bdb610b34
changing default value of single-loop flag
4 years ago
i-robot
dd90a56d68
!28073 fix code warning && remove save_graphs use in st/ut
Merge pull request !28073 from huanghui/fix-warning
4 years ago
i-robot
22c25ec10e
!27862 [Auto parallel] [Sharding propagation] dealing with cast
Merge pull request !27862 from Xiaoda/119-adapting-sharding-propagation
4 years ago
huanghui
74ca50e652
fix code warning && remove save_graphs use in st/ut
4 years ago
Xiaoda Zhang
66c7474e5a
remove CastInfo from CNODE
4 years ago
zhuyuxiao
dd7bbf92dd
change API
4 years ago
lilei
017aa359a6
insert VirtualDataset node for master
4 years ago
i-robot
2fbec9a554
!27856 use neighbor-exchange-v2 for conv2d
Merge pull request !27856 from yangzhenzhang/use-neighborexchangev2-for-conv2d
4 years ago
yangzhenzhang
8a68577756
use neighbor-exchange-v2 for conv2d
4 years ago
wzw
a9b78682d5
parallel ut refactor 3
4 years ago
yangzhenzhang
5f6477b022
add output strategy for gather op
4 years ago
i-robot
d49f5e6caf
!27525 support optimizer parallel for adafactor
Merge pull request !27525 from yangzhenzhang/support-opt-parallel-for-adafactor
4 years ago
yao_yf
30576c6a75
fix reshape bool type in auto parallel
4 years ago
yangzhenzhang
2a0b528084
support opt parallel for adafactor
4 years ago
i-robot
938dc8abd0
!27439 [Auto parallel] Add new operatorInfo for Parallel: CumSum
Merge pull request !27439 from Xiaoda/117-add-cumsum-op
4 years ago
i-robot
0e358f4cb3
!27428 revert insert VirtualDataset node for master
Merge pull request !27428 from lilei/modify_virtualdataset_for_master
4 years ago
Xiaoda Zhang
8042c88223
add the new operatorInfo for parallel: CumSum
4 years ago
lilei
2edf6ab33b
revert insert VirtualDataset node for master
4 years ago
i-robot
faaec746f7
!27401 add more ut tests for allreduce fusion
Merge pull request !27401 from jiahongQian/master
4 years ago
jiahongQian
b03c8d18d3
add more ut tests
4 years ago
i-robot
ffca7b08a5
!27237 auto insert VirtualDataset node for master
Merge pull request !27237 from lilei/modify_virtualdataset_for_master
4 years ago
i-robot
f40668ef73
!27251 test_micro_batch_Interleaved
Merge pull request !27251 from lilei/add_parallel_ut
4 years ago
lilei
05189459ab
auto insert VirtualDataset node for master
4 years ago
lilei
e933aa268b
test_micro_batch_Interleaved
4 years ago
i-robot
2d23b698a6
!27024 add allreduce fusion by size
Merge pull request !27024 from jiahongQian/master
4 years ago
q00596439
de36fdc169
add allreduce fusion size and unify the interface
4 years ago
huangxinjing
8c9b2b93a8
Add transformer
4 years ago
yangzhenzhang
7454b8f8f2
check args for shard
4 years ago
Xiaoda Zhang
364858cbc9
In sharding propagation, to keep the strategy consistent for a parameter used by multiple operators, we check the edge whose one node is a TmpIdentityInfo
4 years ago
Xiaoda Zhang
04db51a528
In a previous PR ( https://gitee.com/mindspore/mindspore/pulls/26807/ ), we replaced 'auto_parallel_search_mode' with 'search_mode' directly.
However, for compatibility, it is preferable to keep 'auto_parallel_search_mode' available. This PR restores the 'auto_parallel_search_mode' interface and adds a warning when the old interface is used.
This PR also addresses some code-style issues.
4 years ago
i-robot
9f8ec2c5ab
!26807 [Auto parallel] [Sharding propagation] Interface change of sharding propagation
Merge pull request !26807 from Xiaoda/113-auto-parallel-search-mode-changes-to-search-mode
4 years ago
i-robot
6ecbc97fd6
!26804 virtual_dataset_avoid_auto_parallel
Merge pull request !26804 from yao_yf/virtual_dataset_avoid_auto_parallel
4 years ago
i-robot
b282414de7
!26619 parallel_ut_refactoring
Merge pull request !26619 from 王志伟/parallel_ut_refactoring1
4 years ago
Xiaoda Zhang
ad5ac77ae8
1) 'auto_parallel_search_mode' changes to 'search_mode';
2) 'sharding_propagation' moves to 'search_mode';
4 years ago
yao_yf
f29ce1fb60
virtual dataset avoid auto parallel
4 years ago
i-robot
519f14a909
!26006 slice recompute activation
Merge pull request !26006 from yao_yf/add_transformer_slice_activation_config
4 years ago
wzw
86c5ad20c8
parallel_ut_refactoring1
4 years ago
i-robot
1b8c2ff0e9
!26414 fault_recover_by_mirror_group_fix_opt_shard
Merge pull request !26414 from yao_yf/fault_recover_by_mirror_group_fix_opt_shard
4 years ago
yao_yf
188d39da83
slice_activation_in_recompute
slice recompute activation
4 years ago
yao_yf
01dc4bbdf9
fix fault recover in optimizer shard
4 years ago
Xiaoda Zhang
df67e74eaf
make sharding_propagation smoother and add a reshape justification:
1) when propagating a sharding strategy from one op to another, try to find a strategy with zero communication cost;
2) if no such strategy exists, find the strategy with minimum communication cost and raise a warning;
4 years ago
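The strategy-selection rule described in the commit above (prefer zero communication cost, otherwise fall back to minimum cost with a warning) can be sketched as follows. This is an illustrative sketch only: `select_strategy` and the `(name, comm_cost)` pair shape are hypothetical and do not reflect MindSpore's actual cost-model structures.

```python
import warnings

def select_strategy(candidates):
    """Pick a sharding strategy per the rule above:
    prefer a candidate with zero communication cost; otherwise
    take the minimum-cost candidate and raise a warning.

    `candidates` is a list of (strategy_name, comm_cost) pairs
    (hypothetical shape, for illustration only).
    """
    if not candidates:
        raise ValueError("no candidate strategies to propagate")
    # 1) a zero-communication candidate is always preferred
    for cand in candidates:
        if cand[1] == 0:
            return cand
    # 2) otherwise fall back to the cheapest one, with a warning
    best = min(candidates, key=lambda c: c[1])
    warnings.warn(
        f"no zero-communication strategy found; using {best[0]} "
        f"with communication cost {best[1]}"
    )
    return best
```

The warning path mirrors the commit's point 2): propagation still proceeds, but the user is told that the chosen strategy is not communication-free.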
i-robot
9f52343a6a
!26350 add check for resizenearestneighbor parallel op
Merge pull request !26350 from yangzhenzhang/add-check-for-resize-op
4 years ago
yangzhenzhang
ba99e4c505
add check for resize op
4 years ago
ttudu
33ac1de062
fix bug
4 years ago
i-robot
7a73bae5c3
!26036 add output strategy for matmul operator
Merge pull request !26036 from yangzhenzhang/add-output-strategy-for-op-init
4 years ago
Xiaoda Zhang
a772767265
support reshape in sharding propagation:
1) use the 'swc index' of strategy_cost_ as reshape's selected strategy;
2) when encountering a reshape during BFS, select the 'swc index' with zero communication cost;
3) when encountering a reshape that has already been visited, check whether communication occurs between the reshape and the current operator; communication between two explicitly configured operators is acceptable;
4) currently, two consecutive reshapes are not supported;
5) adjust the BFS structure in graph_costmodel.cc;
6) adjust some code in step_auto_parallel.cc to reduce cyclomatic complexity.
4 years ago
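The BFS rules for reshape in the commit above can be sketched roughly as below. All names (`propagate_with_reshape`, `adj`, `op_type`, `swc_costs`) are made up for illustration; MindSpore's real graph and strategy-cost structures differ.

```python
from collections import deque

def propagate_with_reshape(adj, op_type, swc_costs, start):
    """Loose sketch of the reshape rules above:
      - on first visiting a reshape, pick the swc index with the
        lowest (ideally zero) communication cost;
      - on meeting an already-visited reshape again, record a
        conflict if its chosen swc index incurs communication;
      - two consecutive reshapes are rejected outright.
    `adj` maps op name -> neighbor names, `op_type` maps op name
    -> operator type, `swc_costs` maps reshape name -> per-index
    communication costs.
    """
    chosen = {}                      # reshape name -> chosen swc index
    visited = {start}
    conflicts = []
    queue = deque([start])
    while queue:
        op = queue.popleft()
        for nxt in adj.get(op, []):
            if op_type[op] == "Reshape" and op_type[nxt] == "Reshape":
                raise ValueError("consecutive reshapes are not supported")
            if nxt in visited:
                # revisited reshape: flag the edge if its chosen
                # swc index is not communication-free
                if op_type[nxt] == "Reshape" and swc_costs[nxt][chosen[nxt]] != 0:
                    conflicts.append((op, nxt))
                continue
            visited.add(nxt)
            if op_type[nxt] == "Reshape":
                costs = swc_costs[nxt]
                chosen[nxt] = min(range(len(costs)), key=costs.__getitem__)
            queue.append(nxt)
    return chosen, conflicts
```

The sketch omits the "two configured operators" exception from point 3); modelling that would require per-operator configuration state that this toy graph does not carry.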
yangzhenzhang
8431ba616c
add output strategy for op init
4 years ago