liutongtong
7afcdfd211
add history and lambda callbacks
4 years ago
i-robot
b90cf43562
!30553 Support dataset reset() to recover after failure
Merge pull request !30553 from h.farahat/reset
4 years ago
i-robot
f01d841113
!27943 [MD][Autotune] Save/Load Autotune config
Merge pull request !27943 from harshvardhangupta/save_load_at_config
4 years ago
i-robot
7b26a32e98
!22608 [assistant][ops]New operator implementation, include KITTIDataset
Merge pull request !22608 from Wangsong95/kitti_dataset
4 years ago
h.farahat
a3dec34833
Dataset failover reset
4 years ago
harshvardhangupta
bd61adbb72
Implement save/load of autotune dataset pipeline configuration #27943
4 years ago
i-robot
70e61e9014
!30781 [MD] test_pyfunc_multiprocess_autotune.py - enable AutoTune
Merge pull request !30781 from cathwong/ckw_ut_fixup_map_python_multiproc
4 years ago
i-robot
edcc6b790d
!30714 [MD][Offload] Add TypeCast op to offload
Merge pull request !30714 from markuskunej/offload_typecast
4 years ago
Cathy Wong
eb931c7f88
[MD] test_pyfunc_multiprocess_autotune.py - enable AutoTune
4 years ago
i-robot
4aa82cc21e
!30679 [MD] Add tests for Python Multiprocessing with AutoTune
Merge pull request !30679 from cathwong/ckw_ut_map_python_multiproc
4 years ago
zx
93617ce91e
[feat][assistant][I3J6VO] add new data operator KITTI
4 years ago
Cathy Wong
b151f90b8d
[MD] Add tests for Python Multiprocessing with AutoTune
4 years ago
markuskunej
6de37045ff
Added TypeCast op in dataset offload.
4 years ago
wangjun
46612fabfb
add st for shard
4 years ago
i-robot
ad9757ccf0
!30661 [Auto parallel] [MoE] Fix an error of configuring MoE parallel
Merge pull request !30661 from Xiaoda/124-moe-changes
4 years ago
i-robot
789c1d6bd3
!30614 [AutoParallel] Fix Overflow As the cast is inserted before Mirror
Merge pull request !30614 from huangxinjing/fix_cast_error
4 years ago
Xiaoda Zhang
81e5abe580
fix an error of configuring parallel
4 years ago
i-robot
a92c54b206
!30496 [Fallback] Support scipy module.
Merge pull request !30496 from huangbingjian/support_scipy
4 years ago
i-robot
a8686ae3d9
!18827 [assistant][ops]New operator implementation, include LFWDataset
Merge pull request !18827 from Wangsong95/lfw_dataset
4 years ago
zx
2132f62d98
[feat][assistant][I3J6VQ] add new data operator LFW
4 years ago
huangxinjing
896daee845
[AutoParallel]Fix insert error for the mirror
4 years ago
yangzhenzhang
43e6e16da3
check platform for resizebilinear
4 years ago
i-robot
5deccfe64b
!30260 [MD][AutoTune] Re-enable AT for non-sink models
Merge pull request !30260 from danishfarid/re-enable_AT
4 years ago
i-robot
0341d96dd6
!30469 add shard function to support part of the graph executed in auto_parallel under pynative mode
Merge pull request !30469 from wangjun/0223_pp
4 years ago
huangbingjian
a69d13bc44
[Fallback] Support scipy module.
4 years ago
i-robot
cfe0f76d2b
!30491 ut for allgather fusion
Merge pull request !30491 from jiahongQian/master
4 years ago
danishfarid
6c4697fc8b
reenable AT for non-sink
tests re-enable
code check fix 1
remove self
lint fixing
lint fix 2
remove test as per req
4 years ago
wangjun
24d448239c
add pynative_parallel
4 years ago
i-robot
981eae461a
!30118 Automatic optimizer parallelism feature
Merge pull request !30118 from zhuyuxiao/I4S85V
4 years ago
jiahongQian
25f57505bf
ut for allgather fusion
4 years ago
i-robot
bbcfbce9e0
!29997 [Auto parallel] [MoE] Support data_parallel + expert_parallel
Merge pull request !29997 from Xiaoda/124-moe-changes
4 years ago
zhuyuxiao
d0e0e305d3
good
4 years ago
i-robot
f2130e7434
!30483 [AutoParallel]Pipeline Automatic detection Opt
Merge pull request !30483 from lichen/pipeline_opt_detection
4 years ago
yao_yf
e21f878e14
adasum ut fix
4 years ago
Xiaoda Zhang
b714451937
implementing expert_parallel+data_parallel in MoE:
1) extending _Linear's input to a 4-dimensional tensor: [outer_batch, expert_dim, -1, hidden], so that _Linear's BatchMatMul becomes BatchMatMul(4_dim_tensor, 3_dim_tensor);
2) configuring the _Linear's BatchMatMul sharding strategy as [[dp, ep, 1, 1], [ep, 1, mp]];
3) introducing a new parameter 'expert_parallel' in TransformerOpParallelConfig, creating a new class MoEParallelConfig to include 'data_parallel', 'model_parallel' and 'expert_parallel';
4) changing parallel config for FeedForward, TransformerEncoderLayer, TransformerDecoderLayer.
4 years ago
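The sharding strategy [[dp, ep, 1, 1], [ep, 1, mp]] in the commit above can be illustrated by working out the per-device tensor shapes. The following is a minimal sketch, not MindSpore code: the helper names `shard_shape` and `batch_matmul_shape`, the concrete sizes, and the weight layout [expert_dim, hidden, out] are all assumptions made for illustration (the weight layout is inferred from the 3-element strategy, and the `-1` dimension is called `tokens` here).

```python
from typing import Sequence, Tuple


def shard_shape(shape: Sequence[int], strategy: Sequence[int]) -> Tuple[int, ...]:
    """Per-device slice shape when each dimension is split by its strategy factor."""
    if len(shape) != len(strategy):
        raise ValueError("shape and strategy must have the same rank")
    for dim, parts in zip(shape, strategy):
        if dim % parts != 0:
            raise ValueError(f"dimension {dim} is not divisible by {parts}")
    return tuple(dim // parts for dim, parts in zip(shape, strategy))


def batch_matmul_shape(x: Sequence[int], w: Sequence[int]) -> Tuple[int, ...]:
    """Output shape of a 4-D by 3-D batched matmul (assumed weight layout)."""
    outer_batch, expert_dim, tokens, hidden = x
    w_expert, w_hidden, out = w
    if expert_dim != w_expert or hidden != w_hidden:
        raise ValueError("expert or hidden dimensions do not match")
    return (outer_batch, expert_dim, tokens, out)


# Hypothetical parallel degrees and tensor sizes, chosen only for illustration.
dp, ep, mp = 2, 4, 2
x_shape = (8, 16, 32, 64)   # [outer_batch, expert_dim, tokens, hidden]
w_shape = (16, 64, 128)     # [expert_dim, hidden, out] (assumed layout)

x_local = shard_shape(x_shape, (dp, ep, 1, 1))   # input strategy [dp, ep, 1, 1]
w_local = shard_shape(w_shape, (ep, 1, mp))      # weight strategy [ep, 1, mp]
y_local = batch_matmul_shape(x_local, w_local)

print("input per device: ", x_local)
print("weight per device:", w_local)
print("output per device:", y_local)
```

Under these assumptions the contracted `hidden` dimension is not split on either operand, so each device would produce its output slice without a cross-device reduction.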
wangshengnan12@huawei.com
acbefd80ea
pipeline_opt_detection
4 years ago
i-robot
2a00ffd3b1
!30259 [MD]Update set_autotune_enable API to add save filepath
Merge pull request !30259 from cathwong/ckw_at_save_api
4 years ago
Cathy Wong
46e223e569
[MD] Update set_autotune_enable API to add save filepath
4 years ago
i-robot
81260a2319
!30466 takedown test_auto_parallel_adasum.py to ensure stability, again
Merge pull request !30466 from yanghaoran/master
4 years ago
i-robot
14393503b7
!30431 allreduce allgather fusion
Merge pull request !30431 from jiahongQian/master
4 years ago
yanghaoran
71d6b7d506
takedown test_auto_parallel_adasum.py to ensure stability, again
4 years ago
i-robot
2e8eac8341
!30367 auto_parallel_adasum_support_data_parallel
Merge pull request !30367 from yao_yf/auto_parallel_adasum_support_data_parallel
4 years ago
jiahongQian
8a2151d8bb
allgather reducescatter fusion
4 years ago
i-robot
eeb731ae3e
!18738 [assistant][ops]New operator implementation, include LSUNDataset
Merge pull request !18738 from Wangsong95/lsun_dataset
4 years ago
i-robot
5bee7156b9
!30369 add_virtualdataset_ut
Merge pull request !30369 from lilei/add_virtualdataset_ut
4 years ago
i-robot
0f24b679ec
!29819 Add GlobalNorm Search
Merge pull request !29819 from huangxinjing/add_global_norm_search
4 years ago
i-robot
abb89d3f06
!29705 [MD][Offload] Support Multi-Column Datasets and Map Column Names to Index
Merge pull request !29705 from markuskunej/offload_multi_col_ds
4 years ago
i-robot
6edc6fccee
!30189 Optimize error message while outmost network input is wrong
Merge pull request !30189 from zhangzhaoju/master_outmost
4 years ago
yao_yf
19236b1a70
auto parallel adasum support data parallel and hybrid parallel
4 years ago
huangxinjing
092ba035e3
Add global norm parallel support
4 years ago