i-robot
4252b24335
!26792 malloc ts memory for label
Merge pull request !26792 from zhoufeng/change-label-memory-type
4 years ago
zhoufeng
881179fa10
malloc ts memory for label
Signed-off-by: zhoufeng <zhoufeng54@huawei.com>
4 years ago
i-robot
3d0f9d8aae
!26683 Enable compile cache feature to load hyper parameter data from python
Merge pull request !26683 from LiangZhibo/mindir
4 years ago
i-robot
cfc6ea32ff
!24714 replace rtmemcpyxx to acl memcpy
Merge pull request !24714 from jjfeing/br_replace_rtmemcpyxx_with_acl_api
4 years ago
l00591931
21df240f23
Enable mindir to load initialize weight from python
4 years ago
jjfeing
05485d991c
replace api with acl api
4 years ago
i-robot
ce00ee1ad1
!25367 use acl api to control profiling
Merge pull request !25367 from yanghaitao/yht_condation_start_profiler
4 years ago
i-robot
9d6248194e
!26310 MindSpore support load custom aicpu kernels.
Merge pull request !26310 from linqingke/aicpu
4 years ago
yanghaitao1
c94aa6b872
use profiler acl api instead
4 years ago
linqingke
bef2923acf
MindSpore support load custom aicpu ops.
4 years ago
yao_yf
501b978d16
find data parallel common group in auto parallel
4 years ago
ougongchang
9229f1c1ff
profiler support to collect parallel strategy info
Forcibly splitting the SetNodeOutputType function into multiple functions would reduce readability, so lizard scanning is suppressed for it
4 years ago
baihuawei
e59d07899b
fix reset8p pynative performance
4 years ago
LaiYongqiang
dc7988f4bd
log improvement
4 years ago
i-robot
cb307e24cf
!25153 refactor device loop control
Merge pull request !25153 from laiyongqiang/adjust_kernel_refactory
4 years ago
LaiYongqiang
9bfb2d99fa
refactor device loop control
4 years ago
lby
6872e67131
split compile and gen kernel mod
4 years ago
i-robot
e920a1c07e
!24593 Continue execution when saving and loading mindir failed
Merge pull request !24593 from YuJianfeng/master
4 years ago
yujianfeng
d384db6c01
Continue execution when saving and loading mindir failed
4 years ago
lby
3e9fd763c3
delete old build process
4 years ago
caifubi
f092e623e0
Compile isolation for Profiling and Dump
4 years ago
i-robot
2c692bf7de
!22450 insert the overflow check operators according to the "gradients" scope name.
Merge pull request !22450 from guoqi/overflow-check-master
4 years ago
guoqi
8fccec4c20
insert overflow check operators according to the 'gradients' scope
4 years ago
gaojing
fa02606348
step train modified
4 years ago
baihuawei
a9694a9230
ascend add nontask sink mode
4 years ago
ms_yan
36a8886ca2
Revert "[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset"
This reverts commit b077aa1cab .
Revert "[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset"
This reverts commit 4e6f7dc97d .
delete pass_registry_test.cc
comment hiai_nlu_model_multi.pb related line
4 years ago
djc
b077aa1cab
[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset
4 years ago
djc
4e6f7dc97d
[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset
4 years ago
yanghaitao1
8fc11cb676
adapt delete libms_profiler_fwk.a
4 years ago
caifubi
dfe0e94466
Fix PyNative get_rank_id/get_rank_size
4 years ago
i-robot
6afcd815d2
!21362 add pynative profiling codes based on ascend and gpu
Merge pull request !21362 from lvchangquan/profiling_refactor
4 years ago
zhoufeng
03a56f2bb0
alltoall exception handle
Signed-off-by: zhoufeng <zhoufeng54@huawei.com>
4 years ago
lby
a5029f061c
ascend kernel build refactor
4 years ago
lvchangquan
e8d9803258
add profiling codes based on ascend and gpu in pynative mode
4 years ago
lby
e6cdf098db
op tiling compute interface replace
4 years ago
baihuawei
41de02a58c
ascend support nontask sink
4 years ago
i-robot
69c5021bb5
!20995 pyfunc cpu kernel
Merge pull request !20995 from chenweifeng/cpu-dynamic-input
4 years ago
wilfChen
d6fffdad6e
support dynamic inputs & outputs
4 years ago
yanghaoran
0364650eae
Upgrade Ascend packages 28 Jul 21, with testcases removed
4 years ago
dayschan
3ab53dd26d
Send compilation attrs to akg
1. Add a new message type "AKG/ATTR" in AkgKernelBuilder.
The attrs are sent before the kernel infos.
2. Send "online_tuning" attribute when the flag is not zero,
but an error occurs in the latest akg submodule.
3. Send "repository_path" attribute when the flag is not empty.
4. Add a new value "compute_capability" into kernel info when the processor is GPU.
4 years ago
dingpeifei
24ff7ab8b4
upgrade_ascend_0705
4 years ago
zhoufeng
b9378e36c3
dynamic default memory size for vnpu
Signed-off-by: zhoufeng <zhoufeng54@huawei.com>
4 years ago
dingpeifei
63784e49f5
upgrade_ascend_0626_mindspore
4 years ago
i-robot
708e56f659
!17899 Fix compile cache bug for resnet50
Merge pull request !17899 from LiangZhibo/cache
4 years ago
l00591931
8ae5d7cc84
Fix compile cache for resnet50
4 years ago
lishanni513
8c98146c76
Add register limit constraint
4 years ago
caifubi
61efa2c23b
remove some code
4 years ago
yanghaitao1
127e4d4068
fix profiler pclint&codex
4 years ago
yujianfeng
7cea682c7d
Add func_graph caching after validate action
5 years ago
zhoufeng
fee35368d1
modify definition of hccl origin function in hccl plugin
Signed-off-by: zhoufeng <zhoufeng54@huawei.com>
4 years ago