baihuawei
b9ebd9c280
add gpu nccl broadcast
5 years ago
zhousiyi
e1aa49a4b7
use built-in float16 in arm_neon.h for lite arm
5 years ago
mindspore-ci-bot
9f635e52c7
!4463 change some wrong log about static memory
Merge pull request !4463 from liangzelang/fix-static-memory-size-log
5 years ago
liangzelang
1608c4d096
change-static-memory-size-log
5 years ago
mindspore-ci-bot
b48c1f45f0
!4236 fix gpu heterogeneous bug
Merge pull request !4236 from baihuawei/heter
5 years ago
baihuawei
517fcc16ee
fix gpu heterogeneous
5 years ago
mindspore-ci-bot
21014fd624
!4235 add gpu profiler feature
Merge pull request !4235 from 治愈系潇洒哥/master
5 years ago
mindspore-ci-bot
15c533d481
!4277 Profiling Support Multi Graph
Merge pull request !4277 from caifubi/profiling
5 years ago
askmiao
25cae1a2e7
add profiler feature
5 years ago
gukecai
adb6ff6c78
independent stream parallel
5 years ago
John Tzanakakis
3569513232
fix d-chip watchpoints, latest value for GPU inputs
5 years ago
caifubi
09946fcad5
Support Profiling With Graph Only Have Hccl Op
5 years ago
mindspore-ci-bot
8040e8bf89
!4130 modify some bug and add test case for gpu dropout op
Merge pull request !4130 from hanhuifeng/gpu_dropout
5 years ago
hanhuifeng2020
ab6f7420b5
modify some bug and add test case for gpu dropout op
5 years ago
kswang
69c096c2a6
dlopen cpu mpi adapter
5 years ago
mindspore-ci-bot
21dfac0432
!4105 mem pools expands from high addr, dynamic mem expands from low addr
Merge pull request !4105 from liangzelang/change-mem-pools-management
5 years ago
mindspore-ci-bot
64721fa57e
!4104 fix a bug with using dynamic memory
Merge pull request !4104 from lvchangquan/transdata
5 years ago
mindspore-ci-bot
7280d3170a
!3768 GPU debugger grpc implementation and smart kernel read
Merge pull request !3768 from lichen_101010/master_ms1_grpc
5 years ago
liangzelang
fe1f36ea5c
mem pools expands from high addr, dynamic mem expands from low addr
5 years ago
mindspore-ci-bot
1f28a7c097
!4063 Decouple ME and AKG for GPU.
Merge pull request !4063 from ZhangQinghua/master1
5 years ago
lvchangquan
87022fce3c
fix a bug with using dynamic memory.
5 years ago
Zhang Qinghua
22e0a0ba76
Decouple ME and AKG for GPU.
5 years ago
mindspore-ci-bot
8908f6ef19
!4098 Graph compile performance optimization
Merge pull request !4098 from zhoufeng/graph-compile-performance-optimize
5 years ago
zhoufeng
2f5cbfc26f
graph compile performance optimize
Signed-off-by: zhoufeng <zhoufeng54@huawei.com>
5 years ago
lvliang
3a61d646d4
decoupling-the-interface-of-mallocing-mem
5 years ago
mindspore-ci-bot
ba3a2976dc
!4038 support ops cholesky for resnet50 thor gpu
Merge pull request !4038 from mamba_ni/master
5 years ago
mamba_ni
96642a76fd
support cusolver AND OPS cholesky_solve
fix bug
clang-format
format fix
5 years ago
kswang
51e9fbf973
format ompi
5 years ago
mindspore-ci-bot
7b9478aae9
!3989 change parameter's device dtype to infer
Merge pull request !3989 from lianliguang/test-merge
5 years ago
mindspore-ci-bot
52689a7dcf
!3938 decoupling core and context
Merge pull request !3938 from liubuyu/master
5 years ago
zhoufeng
ca7154a548
graph compile performance optimization
Signed-off-by: zhoufeng <zhoufeng54@huawei.com>
5 years ago
lianliguang
57fe409c7c
update mindspore/ccsrc/runtime/device/ascend/kernel_select_ascend.cc.
5 years ago
liubuyu
d81862a916
decoupling core and context
5 years ago
lichen_101010
7499c72d54
mindspore grpc implementation
fix bugs for grpc implementation
addressed peer review comments
delete device_target code from Adel
add checksinglewatchpoint function for node level debugger
set the device target when sending metadata
add current node name
fix bugs for current node name
fix run_level_ bug
fix bugs for CheckSingleWatchpoint
fix multi-outputs node issue
fix num_step_ bug
fix continue_to previous node issue
fix run_level issue
fix merge conflict
smart kernel read, watch hit stop mid-step, fix step number, read input tensors
cleanup the code and isolate UpdataStepNum function
do cpplint, Cppcheck and clang-format check
recover CMakeList.txt
only update step_num in one place
fix clang-format error
fix CI errors part2
update graphengine version
addressed comments
5 years ago
mindspore-ci-bot
a11e0e35f4
!3872 add internal output tensor
Merge pull request !3872 from kisnwang/cache-internal-tensor
5 years ago
mindspore-ci-bot
0a89563f37
!3867 Fix Profiling Data Flush Failed while GLOG_v > 1
Merge pull request !3867 from caifubi/profiling
5 years ago
mindspore-ci-bot
6f7f376b1c
!3821 Decouple ME and TBE by the IPC way.
Merge pull request !3821 from ZhangQinghua/master1
5 years ago
kswang
3ce0f33b27
add internal output tensor
5 years ago
caifubi
d01ca09e41
Fix profiling data Flush failed in GLOG_v > 1
5 years ago
Zhang Qinghua
960da5cbed
Decouple the backend TBE from binding Python API.
5 years ago
z00505269
87668d6ea2
remove predict
5 years ago
mindspore-ci-bot
b331e62400
!3631 fix subgraph maketuple ref error
Merge pull request !3631 from kisnwang/fix-subgraph-maketuple-error
5 years ago
kswang
76733ce816
fix cpu multi graph mem error
5 years ago
mindspore-ci-bot
a337a02732
!3638 fix codex and support akg op profiling
Merge pull request !3638 from geekun/yjk_master
5 years ago
mindspore-ci-bot
d4b52ac59f
!3489 use kernelruntime::mem_manager to reduce rtMalloc and rtFree time in trans data format
Merge pull request !3489 from lvchangquan/master
5 years ago
geekun
17d71280b8
fix codex and support akg op profiling
5 years ago
lvchangquan
fdbe4c19ba
use kernel_runtime::mem_manager to reduce rtMalloc and rtFree time in trans data format
5 years ago
laiyongqiang
d99786e938
fix refnode input type assign
5 years ago
mindspore-ci-bot
b486abc076
!3531 clear warning
Merge pull request !3531 from baihuawei/0727
5 years ago
baihuawei
ca4162805a
clear warning
5 years ago