sabrinasun
220245f592
add security isolation to online and offline debugger
4 years ago
liangzelang
1832d7c152
Use rtMemcpy trans data in Ascend instead of device -> host -> device
4 years ago
ZPaC
8f0a0682b8
Fix static check
4 years ago
gaoyong10
e7f6b034cf
Fix double output for single device address
4 years ago
i-robot
3f9fed78c4
!21860 PyNative kernel parallel build in FIRST step
Merge pull request !21860 from caifubi/master-kernel-parallel-build-simple
4 years ago
caifubi
537fce0ee1
PyNative Kernel Parallel Build
1. Create Tensor and DeviceAddress for output before Launch.
2. Push Launch/Build Task to Queue and execute together.
4 years ago
Parastoo Ashtari
bf034bddb5
Apply comments on tensor stat online and offline debugger
4 years ago
zuochuanyong
8fa68ebd98
fix Conv3D precision under fp16
4 years ago
ms_yan
36a8886ca2
Revert "[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset"
This reverts commit b077aa1cab.
Revert "[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset"
This reverts commit 4e6f7dc97d.
delete pass_registry_test.cc
comment hiai_nlu_model_multi.pb related line
4 years ago
djc
b077aa1cab
[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset
4 years ago
djc
4e6f7dc97d
[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset
4 years ago
zjun
35aab6144d
Fix pynative memory leak
Signed-off-by: zjun <zhangjun0@huawei.com>
4 years ago
yelihua
72e6058265
get rank id when set hccl env for single card train
4 years ago
limingqi107
5766234426
code review of gpu backend
4 years ago
wangjunbao
f9d99e97d2
fix ci warning for not handling function return of RDR
4 years ago
Margaret_wangrui
f9a064e464
Add ref user to UpdateState to ensure the order
4 years ago
limingqi107
1958b436b7
disable mindRT in control flow
4 years ago
maning202007
2b3d215ef8
Fix the forever loop for multigraph on gpu
4 years ago
kswang
bfab67a206
optimize node get target
4 years ago
chendongsheng
ecc8e379e8
fixed log error
4 years ago
kswang
3247c00555
optimize heter memcpy
5 years ago
chendongsheng
7d0d8f2a92
fixed ps data_parallel case result is error
4 years ago
i-robot
1cdaa12cfd
!18093 [Debugger] Add root graph id to sub graph's pb file
Merge pull request !18093 from TinaMengtingZhang/add_graph_id
4 years ago
i-robot
4861711676
!18107 dump and offline debug fixes
Merge pull request !18107 from john_tzanakakis/jt_bug_fixes
4 years ago
i-robot
eaac4f47b3
!18058 Update graph input shape
Merge pull request !18058 from chenweifeng/graph-dynamic
4 years ago
John Tzanakakis
ac1847ffac
fix iter 0 and iter 1 being dumped in dir 0, make op_debug_mode optional for sync mode, read input files for offline debugger
4 years ago
TinaMengtingZhang
dd6884eb6f
add root graph id to pb file
4 years ago
lizhenyu
f3e5d67512
fix core dump when destroy device context in PyNative mode
4 years ago
wilfChen
27ed501716
graph input dynamic
4 years ago
wilfChen
2e6afc07ac
graph input dynamic
4 years ago
chujinjin
90feb6a6d2
fix bcewithlogitsloss op error in pynative
4 years ago
mindspore-ci-bot
9193b4d997
!17859 Change rank id in dump path
From: @tina_mengting_zhang
Reviewed-by: @ouwenchang,@yelihua,@zhoufeng54,@yelihua,@ouwenchang
Signed-off-by: @zhoufeng54,@ouwenchang
4 years ago
TinaMengtingZhang
2fa05b66a1
change device_id to rank_id in dump path
4 years ago
ZPaC
35b639868d
Add all gather fusion and concat pass for gpu
4 years ago
wilfChen
5373a2bb1b
open trt pass
4 years ago
mindspore-ci-bot
4e741f8aa6
!16701 gpu matmul and biasadd fusion
From: @wilfchen
Reviewed-by: @cristoval,@limingqi107
Signed-off-by: @limingqi107
4 years ago
TinaMengtingZhang
4926d74570
unify dir path
dump hccl and config json files to dir
update filename for sync dump except cpu dump
update testcases
5 years ago
wilfChen
b2242d13c4
gpu matmul biasadd fusion
5 years ago
lizhenyu
b3fbdf9d65
unify runtime for PyNative distributed mode
5 years ago
Parastoo Ashtari
e9e18d2253
add suspend for last node in the graph
removed double stopping for the last kernel
moved the ResetLoadedTensors to the correct place for GPU
5 years ago
lvchangquan
0b09fdf94c
fix an allreduce bug with two streams sync problem
5 years ago
mindspore-ci-bot
19158780b5
!15978 Add GPU BCEWithLogitsLoss
From: @TFbunny
Reviewed-by: @tom__chen,@robingrosman
Signed-off-by: @robingrosman
5 years ago
TFBunny
9eae68efaa
add gpu BCEWithLogitsLoss kernel
5 years ago
Parastoo Ashtari
7b9a73fb1b
Fixing multi graph suspend for debugger in GPU
removed the suspend from preExecute function to avoid double stopping in multigraph models
replaced else if with else in postExecute
add else if to check the smoke ascend test
improve the format
5 years ago
He Wei
121a6a28d9
[auto-monad] Enforce order of execution for Loads user nodes in frontend
5 years ago
mindspore-ci-bot
3ed60633c8
!15710 gpu inference support multi-outputs
From: @wilfchen
Reviewed-by: @limingqi107,@cristoval
Signed-off-by: @cristoval
5 years ago
wilfChen
662bce82ae
inference support multi outputs
5 years ago
limingqi107
179c677fef
fix graph output address set in the one time memory application scenarios
5 years ago
mindspore-ci-bot
4189a0c06f
!15563 [GraphKernel] fix precision error when open graph kernel
From: @zengzitao
Reviewed-by: @limingqi107,@anyrenwei
Signed-off-by: @anyrenwei
5 years ago
zengzitao
1fd87c6d83
add address state to fix cache problem when open graph kernel
5 years ago