wilfChen
|
0ad757f74c
|
trt operator
|
4 years ago |
mindspore-ci-bot
|
2173d08ba1
|
!16978 fix codecheck and pclint
From: @limingqi107
Reviewed-by: @cristoval,@wilfchen
Signed-off-by: @wilfchen
|
4 years ago |
limingqi107
|
c22185d586
|
fix codecheck and pclint
|
4 years ago |
TinaMengtingZhang
|
da6e068ed7
|
fix ci codecheck alarm in master
|
4 years ago |
mindspore-ci-bot
|
9f77a71d30
|
!16803 [GraphKernel]Simplify GetPrevNodeAddr Codes
From: @jiaoy1224
Reviewed-by: @gaoxiong1,@ckey_dou
Signed-off-by: @ckey_dou
|
4 years ago |
lizhenyu
|
2b50100d79
|
Unify runtime support profiling
|
4 years ago |
Yang Jiao
|
6693484ef3
|
simplify getPrevAddr code
|
4 years ago |
mindspore-ci-bot
|
ac9754b7c8
|
!16570 gpu inference
From: @wilfchen
Reviewed-by: @limingqi107,@cristoval
Signed-off-by: @cristoval
|
4 years ago |
mindspore-ci-bot
|
7eb9f8e1d6
|
!16750 [GraphKernel]Fix Kernel address Cache Table
From: @jiaoy1224
Reviewed-by: @gaoxiong1,@ckey_dou
Signed-off-by: @ckey_dou
|
4 years ago |
Yang Jiao
|
a535540d45
|
fix addr cache
|
4 years ago |
liuxiao93
|
2a3a787049
|
add host_format for device_info of MetaTensor.
|
4 years ago |
mindspore-ci-bot
|
19908168bf
|
!16631 actor runtime code review modify
From: @limingqi107
Reviewed-by: @cristoval,@wilfchen
Signed-off-by: @wilfchen
|
4 years ago |
mindspore-ci-bot
|
620ba53725
|
!16628 Unify runtime for PyNative distributed mode
From: @zyli2020
Reviewed-by: @limingqi107,@cristoval
Signed-off-by: @cristoval
|
4 years ago |
mindspore-ci-bot
|
c9e70eb0d9
|
!15650 GPU update tensor size check func
From: @VectorSL
Reviewed-by: @wilfchen,@wilfchen
Signed-off-by:
|
4 years ago |
lizhenyu
|
b3fbdf9d65
|
unify runtime for PyNative distributed mode
|
4 years ago |
mindspore-ci-bot
|
789b63b501
|
!16153 pynative refactoring to optimizing performance
From: @chujinjin
Reviewed-by:
Signed-off-by:
|
4 years ago |
limingqi107
|
c02e2a4801
|
actor runtime code review modify
|
4 years ago |
mindspore-ci-bot
|
35b2e40a72
|
!13593 ShuffleNetV1 implementation on GPU
From: @charlie__chen
Reviewed-by:
Signed-off-by:
|
4 years ago |
VectorSL
|
2dbf0e694e
|
update tensor size check func
|
4 years ago |
zhangzhaoju
|
bf98fcef56
|
issue#I3ARG6
lenet memory leak fix
|
5 years ago |
wilfChen
|
095c99d199
|
gpu inference
|
4 years ago |
chujinjin
|
059b05a72e
|
fix memcopy async error
|
4 years ago |
mindspore-ci-bot
|
2c980119f4
|
!16356 add the sync interface between different device addresses
From: @limingqi107
Reviewed-by: @cristoval,@wilfchen
Signed-off-by: @cristoval
|
4 years ago |
limingqi107
|
7352c78c07
|
add copy actor and sync interface between device addresses
|
4 years ago |
chenchang
|
7afc8af4ed
|
shufflenetv1 on GPU
|
4 years ago |
tronzhang
|
14c525a671
|
do not valid new addr in prev node mutable output address cache
|
4 years ago |
mindspore-ci-bot
|
84607e3a51
|
!15890 fix an allreduce calculate bug in pynative mode
From: @lvchangquan
Reviewed-by:
Signed-off-by:
|
4 years ago |
lvchangquan
|
0b09fdf94c
|
fix an allreduce bug with two streams sync problem
|
4 years ago |
dayschan
|
c688116f9d
|
move the akg kernel build timer into AkgKernelBuilder::AkgKernelParallelBuild, so that it can time the Ascend kernel builder
|
4 years ago |
John Tzanakakis
|
89c78069ce
|
use host_type vs deprecated type_id
|
4 years ago |
limingqi107
|
179c677fef
|
fix graph output address set in the one time memory application scenarios
|
4 years ago |
mindspore-ci-bot
|
25ce6a104e
|
!15600 add the continue memory alloc of communication kernel for actor runtime
From: @limingqi107
Reviewed-by: @wilfchen,@cristoval
Signed-off-by: @cristoval
|
4 years ago |
limingqi107
|
fba1dd8f2f
|
add the continue memory alloc of communication kernel for actor runtime
|
4 years ago |
wilfChen
|
ba9bbfadf8
|
gpu inference mixed precision
|
4 years ago |
mindspore-ci-bot
|
a000f39764
|
!15552 use host shape instead of device shape for debugger
From: @john_tzanakakis
Reviewed-by: @yelihua,@pandoublefeng
Signed-off-by: @pandoublefeng
|
4 years ago |
zengzitao
|
1fd87c6d83
|
add address state to fix cache problem when open graph kernel
|
4 years ago |
John Tzanakakis
|
ddae425e0c
|
use host instead of device shape for debugger
|
4 years ago |
limingqi107
|
b3a5ccebc3
|
fix codedex
|
5 years ago |
mindspore-ci-bot
|
4bdad7ef49
|
!14802 Avoid calling too many time of SetDevice in PyNative mode
From: @jojobugfree
Reviewed-by: @kisnwang,@zhoufeng54
Signed-off-by: @zhoufeng54
|
5 years ago |
caifubi
|
e76e7d4a27
|
Fix bug of wrong cuda device id
|
5 years ago |
mindspore-ci-bot
|
da93717c5d
|
!14682 GPU set tensor limitation to 2G
From: @VectorSL
Reviewed-by: @cristoval,@wilfchen,@cristoval
Signed-off-by: @cristoval
|
5 years ago |
mindspore-ci-bot
|
83b25e10e9
|
!13009 [debugger] offline debug feature
From: @islam_amin
Reviewed-by:
Signed-off-by:
|
5 years ago |
VectorSL
|
7b9b84d651
|
add tensor size limitation to 2G
|
5 years ago |
John Tzanakakis
|
da3b13a0e1
|
Offline debugger
Authors: John Tzanakakis, Adel Shafiei, Amir Lashkari, Islam Amin
|
5 years ago |
hwjiaorui
|
dac67cbabb
|
clean code
|
5 years ago |
lizhenyu
|
cf2244f1ef
|
[bugfix]Not set device id in asynchronous compile and run graph
|
5 years ago |
mindspore-ci-bot
|
825ce95756
|
!13832 fix a bug that cuda error show wrong name in log
From: @hanhuifeng2020
Reviewed-by: @dylangeng,@anyrenwei
Signed-off-by: @anyrenwei
|
5 years ago |
mindspore-ci-bot
|
e9d57490c5
|
!14155 [MS][RDR] fix fps degradation of gpu training in master branch
From: @louie5
Reviewed-by: @ouwenchang,@lixiaohui33
Signed-off-by: @lixiaohui33
|
5 years ago |
wilfChen
|
e1d443efe3
|
tensor-rt library dynamic loadg
|
5 years ago |
louei5
|
f23ce6c7d9
|
optimize record gpu memory information
|
5 years ago |