i-robot
|
9852cced86
|
!17839 Fix device memory can not release in PyNative mode
Merge pull request !17839 from zyli2020/fix_issue_defect
|
4 years ago |
i-robot
|
71bb69695f
|
!12151 Add UNet Model for GPU
Merge pull request !12151 from fanrb/unet
|
4 years ago |
fan1997
|
be3d4e6fd3
|
1.Optimize bias add grad kernel
2.Optimize slice grad kernel
3.Add Unet GPU Model
|
5 years ago |
lizhenyu
|
f3e5d67512
|
fix core dump when destroy device context in PyNative mode
|
4 years ago |
i-robot
|
4932854776
|
!17987 fix repeated release device resource of actor runtime
Merge pull request !17987 from limingqi107/actor_runtime
|
4 years ago |
limingqi107
|
e9b0eab177
|
fix repeated release device resource of actor runtime
|
4 years ago |
mindspore-ci-bot
|
8fa9e3e611
|
!17712 fix pclint & codex in profiler
From: @yanghaitao1
Reviewed-by: @ouwenchang,@yelihua
Signed-off-by: @yelihua
|
4 years ago |
i-robot
|
8fe3da0ddc
|
!17819 Add all gather fusion and concat pass for gpu
Merge pull request !17819 from ZPaC/master-add-gpu-all-gather-fusion
|
4 years ago |
yanghaitao1
|
127e4d4068
|
fix profiler pclint&codex
|
4 years ago |
zengzitao
|
43cf630e38
|
fix code_docs for gpu_kernel_runtime.h
|
4 years ago |
ZPaC
|
35b639868d
|
Add all gather fusion and concat pass for gpu
|
4 years ago |
zengzitao
|
31a372da88
|
fix oom bug when open graphkernel flag in network
|
4 years ago |
mindspore-ci-bot
|
83a9fc2939
|
!17466 GPU fix reduce precision
From: @VectorSL
Reviewed-by: @limingqi107,@wilfchen
Signed-off-by: @wilfchen
|
4 years ago |
VectorSL
|
cbe01fc836
|
fix gpu reduce precision
|
4 years ago |
limingqi107
|
d405964aab
|
actor runtime supports allreduce multi-stream
|
4 years ago |
wilfChen
|
0ad757f74c
|
trt operator
|
4 years ago |
mindspore-ci-bot
|
2173d08ba1
|
!16978 fix codecheck and pclint
From: @limingqi107
Reviewed-by: @cristoval,@wilfchen
Signed-off-by: @wilfchen
|
4 years ago |
limingqi107
|
c22185d586
|
fix codecheck and pclint
|
4 years ago |
TinaMengtingZhang
|
da6e068ed7
|
fix ci codecheck alarm in master
|
4 years ago |
mindspore-ci-bot
|
9f77a71d30
|
!16803 [GraphKernel]Simplify GetPrevNodeAddr Codes
From: @jiaoy1224
Reviewed-by: @gaoxiong1,@ckey_dou
Signed-off-by: @ckey_dou
|
4 years ago |
lizhenyu
|
2b50100d79
|
Unify runtime support profiling
|
5 years ago |
Yang Jiao
|
6693484ef3
|
simplify getPrevAddr code
|
4 years ago |
mindspore-ci-bot
|
ac9754b7c8
|
!16570 gpu inference
From: @wilfchen
Reviewed-by: @limingqi107,@cristoval
Signed-off-by: @cristoval
|
4 years ago |
mindspore-ci-bot
|
7eb9f8e1d6
|
!16750 [GraphKernel]Fix Kernel address Cache Table
From: @jiaoy1224
Reviewed-by: @gaoxiong1,@ckey_dou
Signed-off-by: @ckey_dou
|
4 years ago |
Yang Jiao
|
a535540d45
|
fix addr cache
|
5 years ago |
liuxiao93
|
2a3a787049
|
add host_format for device_info of MetaTensor.
|
5 years ago |
mindspore-ci-bot
|
19908168bf
|
!16631 actor runtime code review modify
From: @limingqi107
Reviewed-by: @cristoval,@wilfchen
Signed-off-by: @wilfchen
|
5 years ago |
mindspore-ci-bot
|
620ba53725
|
!16628 Unify runtime for PyNative distributed mode
From: @zyli2020
Reviewed-by: @limingqi107,@cristoval
Signed-off-by: @cristoval
|
5 years ago |
mindspore-ci-bot
|
c9e70eb0d9
|
!15650 GPU update tensor size check func
From: @VectorSL
Reviewed-by: @wilfchen,@wilfchen
Signed-off-by:
|
5 years ago |
lizhenyu
|
b3fbdf9d65
|
unify runtime for PyNative distributed mode
|
5 years ago |
mindspore-ci-bot
|
789b63b501
|
!16153 pynative refactoring to optimizing performance
From: @chujinjin
Reviewed-by:
Signed-off-by:
|
5 years ago |
limingqi107
|
c02e2a4801
|
actor runtime code review modify
|
5 years ago |
mindspore-ci-bot
|
35b2e40a72
|
!13593 ShuffleNetV1 implementation on GPU
From: @charlie__chen
Reviewed-by:
Signed-off-by:
|
5 years ago |
VectorSL
|
2dbf0e694e
|
update tensor size check func
|
5 years ago |
zhangzhaoju
|
bf98fcef56
|
issue#I3ARG6
lenet memory leak fix
|
5 years ago |
wilfChen
|
095c99d199
|
gpu inference
|
5 years ago |
chujinjin
|
059b05a72e
|
fix memcopy async error
|
5 years ago |
mindspore-ci-bot
|
2c980119f4
|
!16356 add the sync interface between different device addresses
From: @limingqi107
Reviewed-by: @cristoval,@wilfchen
Signed-off-by: @cristoval
|
5 years ago |
limingqi107
|
7352c78c07
|
add copy actor and sync interface between device addresses
|
5 years ago |
chenchang
|
7afc8af4ed
|
shufflenetv1 on GPU
|
5 years ago |
tronzhang
|
14c525a671
|
do not validate new addr in prev node mutable output address cache
|
5 years ago |
mindspore-ci-bot
|
84607e3a51
|
!15890 fix an allreduce calculate bug in pynative mode
From: @lvchangquan
Reviewed-by:
Signed-off-by:
|
5 years ago |
lvchangquan
|
0b09fdf94c
|
fix an allreduce bug with two streams sync problem
|
5 years ago |
dayschan
|
c688116f9d
|
move the akg kernel build timer into AkgKernelBuilder::AkgKernelParallelBuild, so that it can time the Ascend kernel builder
|
5 years ago |
John Tzanakakis
|
89c78069ce
|
use host_type vs deprecated type_id
|
5 years ago |
limingqi107
|
179c677fef
|
fix graph output address set in the one time memory application scenarios
|
5 years ago |
mindspore-ci-bot
|
25ce6a104e
|
!15600 add the continue memory alloc of communication kernel for actor runtime
From: @limingqi107
Reviewed-by: @wilfchen,@cristoval
Signed-off-by: @cristoval
|
5 years ago |
limingqi107
|
fba1dd8f2f
|
add the continue memory alloc of communication kernel for actor runtime
|
5 years ago |
wilfChen
|
ba9bbfadf8
|
gpu inference mixed precision
|
5 years ago |
mindspore-ci-bot
|
a000f39764
|
!15552 use host shape instead of device shape for debugger
From: @john_tzanakakis
Reviewed-by: @yelihua,@pandoublefeng
Signed-off-by: @pandoublefeng
|
5 years ago |