mindspore-ci-bot
7296659f14
!12764 [Ascend][GPU] Add execution order dumping of final execution graphs
From: @islam_amin
Reviewed-by: @john_tzanakakis,@yelihua
Signed-off-by: @yelihua
5 years ago
mindspore-ci-bot
00f25c8409
!12728 fix precision error after cache modification
From: @simson_wu
Reviewed-by: @chujinjin,@zhoufeng54
Signed-off-by: @chujinjin
5 years ago
Islam Amin
187222d461
Add dumping of the execution order of final execution graphs on Ascend and GPU
5 years ago
dayschan
c165ab5bb1
Combine the GraphKernelOptimization of Gpu and Ascend
removed one CSE pass from the GPU passes,
enabled some common passes for Ascend.
5 years ago
simson
c29d8f66d8
fix precision error after cache modification
5 years ago
mindspore-ci-bot
5524280075
!12550 [MS][RDR] recording func_graph in pipeline and task debug info
From: @louie5
Reviewed-by:
Signed-off-by:
5 years ago
mindspore-ci-bot
4dedab3775
!12593 Not AllocateMemory when CompileGraph in PyNative mode
From: @HulkTang
Reviewed-by: @zhoufeng54
Signed-off-by:
5 years ago
louei5
9a48405a41
recording func_graph in pipeline and task debug information
5 years ago
Islam Amin
722eb2ec5a
Trigger Ascend graph dump at data dump
5 years ago
tanghuikang
c346a96529
Not AllocateMemory when CompileGraph in PyNative mode
5 years ago
He Wei
7d9a783993
[auto-monad] Support side-effects by auto-monad
The basic idea is to exploit data dependencies to control the execution order
of side-effect operations, while keeping the semantics of ANF unchanged.
The ControlDepend primitive is removed and two new primitives are added:
1. UpdateState:
```
a = Assign(para, value)
```
becomes:
```
a = Assign(para, value, u)
u = UpdateState(u, a)
```
2. Load:
```
x = Add(para, value)
```
becomes:
```
p = Load(para, u)
x = Add(p, value)
u = UpdateState(u, p)
```
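The rewrite above can be sketched in plain Python. This is a minimal illustration of the auto-monad idea, not MindSpore's implementation: side-effect ops take and return a state token `u`, so an ordinary data dependency forces the `Assign` to execute before the `Load` that observes it, with no ControlDepend edge.

```python
# Minimal sketch (illustrative only, not MindSpore code) of ordering
# side effects through a threaded state token "u".
store = {}
trace = []

def assign(para, value, u):
    store[para] = value
    trace.append("Assign")
    return ("assigned", para)      # the node "a"

def update_state(u, node):
    return ("u", u, node)          # new token depends on u and the op

def load(para, u):
    trace.append("Load")
    return store[para]             # reading "para" depends on token u

u = ("u0",)
a = assign("para", 41, u)          # a = Assign(para, value, u)
u = update_state(u, a)             # u = UpdateState(u, a)
p = load("para", u)                # p = Load(para, u)
x = p + 1                          # x = Add(p, value)
u = update_state(u, p)             # u = UpdateState(u, p)
```

Because `load` consumes the token derived from `assign`, any scheduler that honors data dependencies must run the assignment before the load.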
5 years ago
mindspore-ci-bot
0ff27ef3b4
!11930 [GraphKernel] Replace Assign with InplaceAssign
From: @dayschan
Reviewed-by: @gaoxiong1,@dylangeng
Signed-off-by: @gaoxiong1
5 years ago
mindspore-ci-bot
a24ff36d9c
!11777 stitch fusion
From: @r1chardf1d0
Reviewed-by:
Signed-off-by:
5 years ago
dayschan
08345c54ea
[GraphKernel] Replace Assign with InplaceAssign
1. Added a pass to replace Assign with InplaceAssign.
2. Bugfix in eliminate_redundant_output: a side-effect node should not be eliminated.
3. Bugfix in graph_kernel/splitter.py: a kernel that includes InplaceAssign should be a composite node.
4. Added two tool functions, GetAllInputDeviceTypes and GetAllOutputDeviceTypes, to AnfAlgo.
5. Do not fuse a single Assign in the BasicOpsFusion pass.
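Point 2 can be illustrated with a toy sketch (hypothetical names, not the real pass): an output-elimination pass must keep a node even when its value is unused, if the node carries a side effect such as Assign/InplaceAssign.

```python
# Toy sketch of redundant-output elimination that preserves side effects.
# Node names and the "side_effect" flag are illustrative assumptions.
def eliminate_redundant_outputs(nodes, used):
    # drop a node only if its value is unused AND it is side-effect free
    return [n for n in nodes if n["name"] in used or n["side_effect"]]

nodes = [
    {"name": "add1", "side_effect": False},     # used -> kept
    {"name": "assign1", "side_effect": True},   # unused but effectful -> kept
    {"name": "mul1", "side_effect": False},     # unused and pure -> dropped
]
kept = eliminate_redundant_outputs(nodes, used={"add1"})
```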
5 years ago
dayschan
8a09279ec3
Moved ShapeOpsSplitter before GraphKernelSplitter and changed it to process sub func_graphs only.
5 years ago
r1chardf1d0
9d6392c5c5
stitch info
5 years ago
mindspore-ci-bot
4364abc7ee
!11798 Support RunOpsInGraph on CPU&GPU in pynative mode
From: @HulkTang
Reviewed-by:
Signed-off-by:
5 years ago
tanghuikang
6f2cd92aba
Support RunOpsInGraph on CPU&GPU in pynative mode
5 years ago
mindspore-ci-bot
6e97c0004e
!11689 gpu support serving basic
From: @wilfchen
Reviewed-by: @cristoval,@limingqi107
Signed-off-by: @limingqi107
5 years ago
wilfChen
a911b9ef9e
MindSpore Serving supports the GPU backend
5 years ago
tronzhang
d078cbfa99
support parallel fusion
5 years ago
dayschan
27b4e1653a
Raise akg ReduceSum precision
Cast the float16 input to float32 before ReduceSum, and cast back to float16 after ReduceSum.
If the op after this ReduceSum is a cast from float16 to float32, that cast can be eliminated.
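The numeric effect of this cast can be sketched in pure Python, using the `struct` module's half-precision format to emulate float16 rounding (an illustration of the precision issue, not akg code): a float16 accumulator stalls once the running sum's spacing exceeds the addend, while reducing in higher precision and casting back stays accurate.

```python
import struct

def f16(x):
    # round a Python float to the nearest IEEE half-precision value
    return struct.unpack('e', struct.pack('e', x))[0]

data = [f16(0.1)] * 4096            # float16 inputs; the exact sum is 409.5

naive = 0.0
for v in data:
    naive = f16(naive + v)          # ReduceSum with a float16 accumulator

# cast up, reduce in higher precision, cast the result back to float16
raised = f16(sum(data))
```

The float16 accumulator gets stuck at 256.0, because at that magnitude the float16 spacing (0.25) exceeds the 0.1 addend, whereas the cast-up reduction returns 409.5 exactly.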
5 years ago
chujinjin
9104ffaafa
fix inceptionv3 kernel build error in pynative
5 years ago
chujinjin
ade9a82c2b
fix device memory leak
5 years ago
mindspore-ci-bot
9591c325f7
!10865 Raise exception when sync stream failed
From: @jojobugfree
Reviewed-by: @kisnwang,@zhoufeng54,@chujinjin
Signed-off-by: @chujinjin
5 years ago
dayschan
8af78cd5ce
Added ExpandDims into GPU fusion list
Additionally:
removed one restriction on getitem in ops fusion.
added a while loop to the ShapeOpsSplitter pass.
added ExpandDims to the shape_ops list.
5 years ago
caifubi
ea2aa7dec4
Raise exception when Sync stream failed
5 years ago
dayschan
26ac9167f8
Enhance the fusion capability for getitem nodes.
Fixed a bug in ReplaceNewFuseCNode.
Added a pass to eliminate repeated outputs after CSE.
Fixed a bug in graph_kernel_splitter.
Do not fuse a reshape op as output in the cost model.
5 years ago
wilfChen
09e10e18bb
momentum weightdecay fusion
5 years ago
mindspore-ci-bot
a4b010cea8
!9746 add ps cache
From: @zyli2020
Reviewed-by:
Signed-off-by:
5 years ago
mindspore-ci-bot
be4e91339f
!9661 gpu relu optimize
From: @wilfchen
Reviewed-by: @cristoval,@limingqi107
Signed-off-by: @limingqi107
5 years ago
lizhenyu
e3f7ae61db
add ps cache manager
5 years ago
mindspore-ci-bot
2799b6d35f
!9683 [Debugger] Performance and state improvements
From: @harsh1995
Reviewed-by: @john_tzanakakis,@wenkai_dist
Signed-off-by: @wenkai_dist
5 years ago
wilfChen
c1d3bd2160
relu optimize
5 years ago
tronzhang
056d7ffc56
clean batch buffer in one pass
5 years ago
Harshvardhan Gupta
dd0084c52b
Improve performance, keep tensor state consistent, fix recheck, and check weights at step end
5 years ago
mindspore-ci-bot
d38f8205dc
!8987 support getnext in pynative mode
From: @chujinjin
Reviewed-by:
Signed-off-by:
5 years ago
mindspore-ci-bot
1a5dd4a711
!9390 Pynative support dynamic op run in gpu
From: @joylvliang
Reviewed-by: @chujinjin,@jjfeing
Signed-off-by: @chujinjin
5 years ago
mindspore-ci-bot
95573571f0
!9511 Codedex change for tensor_loader
From: @liangzhibo
Reviewed-by:
Signed-off-by:
5 years ago
lvliang
8984cc9c03
PyNative support dynamic op run in GPU
5 years ago
l00591931
1d1cab986d
Codedex change for tensor_loader
5 years ago
chujinjin
af031410bb
support getnext in pynative
5 years ago
dayschan
e5306b913d
GraphKernel Fuser
Refactor BasicOpsFusion and CompositeOpsFusion into one pass.
Add a pass to eliminate redundant outputs.
TODO: rename the file basic_ops_fusion and delete the file composite_ops_fusion
5 years ago
tronzhang
13126653ec
process Cast when graph kernel is activated in AMP
5 years ago
tronzhang
2190da9946
support atomic clean and change package for akg.
5 years ago
caifubi
d44dd4f786
Move BuildOp into RunOp
5 years ago
HulkTang
c36b477568
Run ops one by one in pynative bp graph
5 years ago
mindspore-ci-bot
3f75f13556
!8648 PyNative Performance Optimization
From: @jojobugfree
Reviewed-by:
Signed-off-by:
5 years ago
caifubi
c7d6997819
pynative host device parallel
5 years ago
mindspore-ci-bot
270c156219
!8696 fix context null error
From: @kisnwang
Reviewed-by:
Signed-off-by:
5 years ago