lvchangquan
31f9e6a42c
add op_mul fusion based on allreduce fusion in pynative mode
5 years ago
VectorSL
36e11ae17c
fix the use of shared_ptr in GPUKernelMod
5 years ago
ms_yan
92e86804e1
init add acltdt handle create and destroy
add hostpush part modify
optimize previous code
provide aclhandle access method
modify CMakeList format
add device_id parameter into TransferNode
update acltdt api
5 years ago
ms_yan
b4b4577b85
adapt profiling to new ascend lib
5 years ago
limingqi107
a046a5eb43
optimize GPU format transform
5 years ago
mindspore-ci-bot
149175dbb3
!13031 Report error log if Profiler is initialized after hccl init
From: @yanghaitao1
Reviewed-by: @yelihua
Signed-off-by:
5 years ago
yanghaitao1
c8a4a2e9a5
print error msg if profiling enabled after hccl init
5 years ago
mindspore-ci-bot
fb31e724aa
!13093 Enable tbe dynamic shape op for run op
From: @HulkTang
Reviewed-by: @zhoufeng54,@jjfeing
Signed-off-by: @jjfeing
5 years ago
mindspore-ci-bot
c69142fdc1
!12968 update reshape type for 3d nodes
From: @liubuyu
Reviewed-by:
Signed-off-by:
5 years ago
mindspore-ci-bot
e2ad028194
!13029 signal int handler
From: @zhoufeng54
Reviewed-by: @kisnwang,@xu-yfei
Signed-off-by: @xu-yfei
5 years ago
tanghuikang
7138bf66f6
Enable tbe dynamic shape op for run op
5 years ago
mindspore-ci-bot
a21c8e13b5
!13010 Add device id log
From: @zpac
Reviewed-by: @cristoval,@wilfchen
Signed-off-by: @cristoval
5 years ago
mindspore-ci-bot
c262acbd8e
!13049 Support ms_function + heterogeneous
From: @HulkTang
Reviewed-by: @kisnwang,@chujinjin
Signed-off-by: @chujinjin
5 years ago
mindspore-ci-bot
a0cc8306a6
!13058 bugfix: sq node is not enough for data transfer
From: @zuochuanyong
Reviewed-by: @jjfeing,@chujinjin
Signed-off-by: @jjfeing
5 years ago
liubuyu
518818fbef
reshape type for 3d nodes
5 years ago
mindspore-ci-bot
04e23927ef
!12688 using cpp infer firstly
From: @lianliguang
Reviewed-by:
Signed-off-by:
5 years ago
zuochuanyong
664c45e23b
bugfix: sq node is not enough for data transfer
5 years ago
tanghuikang
dac64f30ee
Support ms_function + heterogeneous
5 years ago
LianLiguang
4acab81599
using cpp infer firstly
5 years ago
zhoufeng
3a4eda82aa
signal int handler
Signed-off-by: zhoufeng <zhoufeng54@huawei.com>
5 years ago
mindspore-ci-bot
a855cb2d24
!12706 graph kernel parallel building in gpu
From: @wenfangpei
Reviewed-by:
Signed-off-by:
5 years ago
ZPaC
f2edee750a
Add device id log
5 years ago
mindspore-ci-bot
82c095a4d7
!12720 add compile option H to make hidden interfaces visible
From: @chuckchen521
Reviewed-by:
Signed-off-by:
5 years ago
mindspore-ci-bot
7ba21f8d8c
!12900 Add communication parallel mode.
From: @liujunzhu
Reviewed-by: @zhoufeng54,@guoqi1024
Signed-off-by: @guoqi1024
5 years ago
liujunzhu
6541b96c40
Add communication parallel mode.
5 years ago
mindspore-ci-bot
0789c487ad
!12864 Data dump support 3d format
From: @jojobugfree
Reviewed-by:
Signed-off-by:
5 years ago
mindspore-ci-bot
b575068cb3
!12851 Modify task number of hccl node
From: @liujunzhu
Reviewed-by:
Signed-off-by:
5 years ago
chuck
89eed7aa27
add compile option H to make hidden interfaces visible
5 years ago
wenfangpei
d6b3a07b4a
parallel build gpu ops about graph kernel
5 years ago
luopengting
db5de7fd28
add trigger point for distribute task failed
1. add trigger point
2. remove unused imports
3. modify GetRealFilePath
5 years ago
mindspore-ci-bot
2f312dac66
!12091 Performance optimization for PyNative AllReduce
From: @jojobugfree
Reviewed-by:
Signed-off-by:
5 years ago
mindspore-ci-bot
4365c332e6
!12813 unify AvgPoolGrad's MindIR
From: @yuchaojie
Reviewed-by: @kisnwang
Signed-off-by:
5 years ago
caifubi
113b4e57c0
data dump use ge format
5 years ago
mindspore-ci-bot
076e74c957
!12601 [MS][RDR] support recording GPU memory information
From: @louie5
Reviewed-by:
Signed-off-by:
5 years ago
mindspore-ci-bot
c529cfa427
!12754 auto tune step one construct json
From: @liubuyu
Reviewed-by:
Signed-off-by:
5 years ago
liujunzhu
5124359a3b
Modify task number of hccl node.
5 years ago
yuchaojie
d2cb3aa1c2
unify AvgPoolGrad
5 years ago
louei5
99203038a5
support recording gpu memory information and graph execute order
5 years ago
caifubi
171b468bb3
PyNative AllReduce Bucket
5 years ago
liubuyu
2d97244741
auto tune stage one: construct json
5 years ago
mindspore-ci-bot
50542793c8
!12077 optimize gpu backend logger
From: @wilfchen
Reviewed-by: @cristoval,@limingqi107
Signed-off-by: @limingqi107
5 years ago
chendongsheng
db0a6f1e19
replace ps-lite
5 years ago
wilfChen
58196f1faf
modify gpu backend logger
5 years ago
mindspore-ci-bot
5524280075
!12550 [MS][RDR] recording func_graph in pipeline and task debug info
From: @louie5
Reviewed-by:
Signed-off-by:
5 years ago
louei5
9a48405a41
recording func_graph in pipeline and task debug information
5 years ago
Zhang Qinghua
9b26c210f4
Eliminate all useless nodes related to UpdateStates.
5 years ago
mindspore-ci-bot
5224241ca7
!12577 fix bug for dynamic_shape_depends
From: @zhupuxu
Reviewed-by: @jjfeing,@zhoufeng54
Signed-off-by: @zhoufeng54
5 years ago
zhupuxu
b15d182cd2
fix bug for dynamic_shape_depends
Signed-off-by: zhupuxu <zhupuxu@huawei.com>
5 years ago
louei5
3d540a515a
add task_debug_info recorder
5 years ago
mindspore-ci-bot
c74b4d5d73
!12412 nlp perf(Pynative): change memory sync mode from synchronous to asynchronous in SyncHostToDevice
From: @zuochuanyong
Reviewed-by:
Signed-off-by:
5 years ago