mindspore-ci-bot
5fd3d140b6
!13344 add DeviceContext module
From: @zyli2020
Reviewed-by:
Signed-off-by:
5 years ago
lizhenyu
95565aa7b8
add hardware abstract layer
5 years ago
luopengting
c8ba7694c5
refactor RDR to support single name
1. Support single name.
2. Add a hash method for pair.
3. Make the constructor and destructor of MemAddressInfo public.
4. Remove graph_id.
5. Modify the interval for somas info.
5 years ago
TFBunny
4d35303265
support string in GPU print
5 years ago
mindspore-ci-bot
6f6d14d944
!13102 Add unique id for .dat and .dot file to avoid overwriting
From: @irmo
Reviewed-by:
Signed-off-by:
5 years ago
huanghui
a2ba47e18a
1. Add a unique id to .dat and .dot filenames to avoid overwriting
2. Dump the end graph in GPU session and CPU session
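The filename-uniqueness fix above can be sketched in plain Python; the helper name `unique_dump_path` and the path layout are illustrative stand-ins, not MindSpore's actual dump code. The idea is simply to append a short unique id so repeated dumps never overwrite earlier ones:

```python
import pathlib
import uuid

def unique_dump_path(directory, stem, suffix):
    # Append a short random id so two dumps of the same graph
    # produce distinct filenames instead of overwriting each other.
    unique_id = uuid.uuid4().hex[:8]
    return pathlib.Path(directory) / f"{stem}_{unique_id}{suffix}"

p1 = unique_dump_path("/tmp/dumps", "final_graph", ".dot")
p2 = unique_dump_path("/tmp/dumps", "final_graph", ".dot")
# p1 and p2 differ, so neither dump clobbers the other.
```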
5 years ago
Islam Amin
cbbffbedef
fix gpu dump naming
5 years ago
mindspore-ci-bot
a21c8e13b5
!13010 Add device id log
From: @zpac
Reviewed-by: @cristoval,@wilfchen
Signed-off-by: @cristoval
5 years ago
tanghuikang
6102202abd
Do not save InitDatasetQueue and GetNext ops in PyNative mode
5 years ago
ZPaC
f2edee750a
Add device id log
5 years ago
mindspore-ci-bot
7104e42304
!12808 Add graph_ to execution order filename
From: @islam_amin
Reviewed-by: @john_tzanakakis,@tom__chen
Signed-off-by:
5 years ago
caifubi
171b468bb3
PyNative AllReduce Bucket
5 years ago
Islam Amin
ed2f8876b9
adding graph_ to exec order filename
5 years ago
mindspore-ci-bot
7296659f14
!12764 [Ascend][GPU] Add execution order dumping of final execution graphs
From: @islam_amin
Reviewed-by: @john_tzanakakis,@yelihua
Signed-off-by: @yelihua
5 years ago
mindspore-ci-bot
00f25c8409
!12728 fix precision error after cache modification
From: @simson_wu
Reviewed-by: @chujinjin,@zhoufeng54
Signed-off-by: @chujinjin
5 years ago
Islam Amin
187222d461
Add dumping of the execution order for final exec graphs on Ascend and GPU
5 years ago
dayschan
c165ab5bb1
Combine the GraphKernelOptimization of GPU and Ascend
Removed one CSE pass from the GPU passes;
some common passes were enabled for Ascend.
5 years ago
simson
c29d8f66d8
fix precision error after cache modification
5 years ago
mindspore-ci-bot
5524280075
!12550 [MS][RDR] recording func_graph in pipeline and task debug info
From: @louie5
Reviewed-by:
Signed-off-by:
5 years ago
mindspore-ci-bot
4dedab3775
!12593 Do not AllocateMemory when CompileGraph in PyNative mode
From: @HulkTang
Reviewed-by: @zhoufeng54
Signed-off-by:
5 years ago
louei5
9a48405a41
recording func_graph in pipeline and task debug information
5 years ago
Islam Amin
722eb2ec5a
Trigger Ascend graph dump at data dump
5 years ago
tanghuikang
c346a96529
Do not AllocateMemory when CompileGraph in PyNative mode
5 years ago
He Wei
7d9a783993
[auto-monad] Support side-effects by auto-monad
The basic idea is to exploit data dependencies to control the execution order
of side-effect operations while keeping the semantics of ANF unchanged.
The ControlDepend primitive is removed and two new primitives are added:
1. UpdateState:
```
a = Assign(para, value)
```
becomes:
```
a = Assign(para, value, u)
u = UpdateState(u, a)
```
2. Load:
```
x = Add(para, value)
```
becomes:
```
p = Load(para, u)
x = Add(p, value)
u = UpdateState(u, p)
```
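The before/after snippets above can be mimicked in plain Python. `UMonad`, `assign`, `load`, and `update_state` below are illustrative stand-ins, not MindSpore APIs; the point is that threading the state token `u` through side-effecting ops turns execution order into an ordinary data dependency:

```python
class UMonad:
    """A state token; each UpdateState yields a fresh version."""
    def __init__(self, version=0):
        self.version = version

def update_state(u, *ops):
    # Returns a new token that data-depends on u and the given ops,
    # so anything consuming the new token is ordered after them.
    return UMonad(u.version + 1)

def assign(para, value, u):
    # Side-effecting write; takes the current state token as an extra input.
    para["value"] = value
    return para

def load(para, u):
    # A read that depends on the state token, so it cannot be
    # reordered before writes already chained into u.
    return para["value"]

u = UMonad()
p = {"value": 1}
a = assign(p, 41, u)    # a = Assign(para, value, u)
u = update_state(u, a)  # u = UpdateState(u, a)
x = load(p, u) + 1      # p = Load(para, u); x = Add(p, value)
```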
5 years ago
mindspore-ci-bot
0ff27ef3b4
!11930 [GraphKernel] Replace Assign with InplaceAssign
From: @dayschan
Reviewed-by: @gaoxiong1,@dylangeng
Signed-off-by: @gaoxiong1
5 years ago
mindspore-ci-bot
a24ff36d9c
!11777 stitch fusion
From: @r1chardf1d0
Reviewed-by:
Signed-off-by:
5 years ago
dayschan
08345c54ea
[GraphKernel] Replace Assign with InplaceAssign
1. Added a pass to replace Assign with InplaceAssign.
2. Fixed a bug in eliminate_redundant_output: a side-effect node should not be eliminated.
3. Fixed a bug in graph_kernel/splitter.py: a kernel that includes InplaceAssign should be a composite node.
4. Added two utility functions, GetAllInputDeviceTypes and GetAllOutputDeviceTypes, to AnfAlgo.
5. Do not fuse a single Assign in the BasicOpsFusion pass.
5 years ago
dayschan
8a09279ec3
Moved ShapeOpsSplitter before GraphKernelSplitter and changed it to process only sub func_graphs.
5 years ago
r1chardf1d0
9d6392c5c5
stitch info
5 years ago
mindspore-ci-bot
4364abc7ee
!11798 Support RunOpsInGraph on CPU&GPU in pynative mode
From: @HulkTang
Reviewed-by:
Signed-off-by:
5 years ago
tanghuikang
6f2cd92aba
Support RunOpsInGraph on CPU&GPU in pynative mode
5 years ago
mindspore-ci-bot
6e97c0004e
!11689 gpu support serving basic
From: @wilfchen
Reviewed-by: @cristoval,@limingqi107
Signed-off-by: @limingqi107
5 years ago
wilfChen
a911b9ef9e
MindSpore Serving supports the GPU backend
5 years ago
tronzhang
d078cbfa99
support parallel fusion
5 years ago
dayschan
27b4e1653a
Raise akg ReduceSum precision
Cast the float16 input to float32 before ReduceSum, and cast back to float16 after ReduceSum.
If the op following this ReduceSum is a cast from float16 to float32, it can be eliminated.
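The cast-up/reduce/cast-down pattern described above can be sketched with NumPy (illustrative only, not the actual akg implementation). Accumulating a long sum directly in float16 loses precision badly, while reducing in float32 and casting the result back stays accurate:

```python
import numpy as np

def reduce_sum_fp16_naive(x):
    # Accumulate in float16: once the running sum is large, each small
    # addend rounds away and the sum can stall entirely.
    acc = np.float16(0)
    for v in x:
        acc = np.float16(acc + v)
    return acc

def reduce_sum_fp16_raised(x):
    # The pattern from the commit: cast to float32, reduce,
    # then cast the result back to float16.
    return x.astype(np.float32).sum().astype(np.float16)

# 4096 copies of float16(0.1); the exact sum is 409.5.
x = np.full(4096, 0.1, dtype=np.float16)
naive = reduce_sum_fp16_naive(x)
raised = reduce_sum_fp16_raised(x)
```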
5 years ago
chujinjin
9104ffaafa
fix inceptionv3 kernel build error in pynative
5 years ago
chujinjin
ade9a82c2b
fix device memory leak
5 years ago
mindspore-ci-bot
9591c325f7
!10865 Raise exception when sync stream failed
From: @jojobugfree
Reviewed-by: @kisnwang,@zhoufeng54,@chujinjin
Signed-off-by: @chujinjin
5 years ago
dayschan
8af78cd5ce
Added ExpandDims to the GPU fusion list
Additionally:
1. removed one restriction on getitem in ops fusion.
2. added a while loop to the ShapeOpsSplitter pass.
3. added ExpandDims to the shape_ops list.
5 years ago
caifubi
ea2aa7dec4
Raise exception when Sync stream failed
5 years ago
dayschan
26ac9167f8
Enhance the fusion capacity for getitem nodes.
fixbug in ReplaceNewFuseCNode
add a pass to eliminate repeated output after cse
fixbug in graph_kernel_splitter
do not fuse reshape op as output in costmodel.
5 years ago
wilfChen
09e10e18bb
momentum weightdecay fusion
5 years ago
mindspore-ci-bot
a4b010cea8
!9746 add ps cache
From: @zyli2020
Reviewed-by:
Signed-off-by:
5 years ago
mindspore-ci-bot
be4e91339f
!9661 gpu relu optimize
From: @wilfchen
Reviewed-by: @cristoval,@limingqi107
Signed-off-by: @limingqi107
5 years ago
lizhenyu
e3f7ae61db
add ps cache manager
5 years ago
mindspore-ci-bot
2799b6d35f
!9683 [Debugger] Performance and state improvements
From: @harsh1995
Reviewed-by: @john_tzanakakis,@wenkai_dist
Signed-off-by: @wenkai_dist
5 years ago
wilfChen
c1d3bd2160
relu optimize
5 years ago
tronzhang
056d7ffc56
clean batch buffer at once
5 years ago
Harshvardhan Gupta
dd0084c52b
Improve performance, keep tensor state consistent, fix recheck, and check weights at step end
5 years ago
mindspore-ci-bot
d38f8205dc
!8987 support getnext in pynative mode
From: @chujinjin
Reviewed-by:
Signed-off-by:
5 years ago