limingqi107
ff6b64a598
gpu GoogleNet performance optimize
5 years ago
limingqi107
7029a861d7
add kernel release resource
5 years ago
mindspore-ci-bot
5b1cf18cb9
!5055 prepare to support int64
Merge pull request !5055 from lirongzhen1/int64
5 years ago
lizhenyu
57b27c9fb2
code refine for BN docs
5 years ago
lirongzhen1
531ad4df70
prepare to support int64
5 years ago
anthonyaje
09a99cf80b
Add Size() and Capacity() in gpu queue.
5 years ago
mindspore-ci-bot
8c7444ab47
!5140 add cuda path checker
Merge pull request !5140 from zyli2020/add_cuda_path_check
5 years ago
lizhenyu
551879240c
add cuda path checker
5 years ago
lizhenyu
5d6f7204d3
[bugfix]LSTM SyncDeviceToHost failed
5 years ago
lizhenyu
1becddf3a4
[bugfix]SyncDeviceToHost failed when device address size is zero
5 years ago
mindspore-ci-bot
a245ee665e
!4934 fix nccl kernel memory align bug
Merge pull request !4934 from zyli2020/bug_fix
5 years ago
lizhenyu
fcaf86f5d9
fix nccl kernel memory align bug
5 years ago
qianlong
113619f1ca
Revert "Add Size() and Capacity() in gpu queue."
This reverts commit e2b346d5af.
5 years ago
mindspore-ci-bot
b69b1ca8a8
!4830 [gpu] fix continuous allreduces bug
Merge pull request !4830 from yuchaojie/gpu_allreduce
5 years ago
mindspore-ci-bot
d04d58fd21
!4472 Add API to query GPU queue size and capacity
Merge pull request !4472 from anthonyaje/gpu_queue_size
5 years ago
anthonyaje
e2b346d5af
Add Size() and Capacity() in gpu queue.
5 years ago
lizhenyu
839ec02542
Add FusedBatchEx support
5 years ago
limingqi107
5b76e8f3d7
gpu add format transform pass
5 years ago
yuchaojie
61bf4b18a2
fix_consecutive_allreduce_bug
5 years ago
mindspore-ci-bot
55bd09c689
!4684 add log for gpu profiler
Merge pull request !4684 from 治愈系潇洒哥/master
5 years ago
askmiao
5a817d7444
add log and modify log level for gpu profiler
5 years ago
mindspore-ci-bot
3fb58fcbe4
!4585 add gpu nccl broadcast
Merge pull request !4585 from baihuawei/broadcast
5 years ago
baihuawei
b9ebd9c280
add gpu nccl broadcast
5 years ago
mindspore-ci-bot
21014fd624
!4235 add gpu profiler feature
Merge pull request !4235 from 治愈系潇洒哥/master
5 years ago
askmiao
25cae1a2e7
add profiler feature
5 years ago
John Tzanakakis
3569513232
fix d-chip watchpoints, latest value for GPU inputs
5 years ago
mindspore-ci-bot
8040e8bf89
!4130 modify some bug and add test case for gpu dropout op
Merge pull request !4130 from hanhuifeng/gpu_dropout
5 years ago
hanhuifeng2020
ab6f7420b5
modify some bug and add test case for gpu dropout op
5 years ago
kswang
69c096c2a6
dlopen cpu mpi adapter
5 years ago
mindspore-ci-bot
7280d3170a
!3768 GPU debugger grpc implementation and smart kernel read
Merge pull request !3768 from lichen_101010/master_ms1_grpc
5 years ago
Zhang Qinghua
22e0a0ba76
Decouple ME and AKG for GPU.
5 years ago
mamba_ni
96642a76fd
support cusolver AND OPS cholesky_solve
fix bug
clang-format
format fix
5 years ago
liubuyu
d81862a916
decoupling core and context
5 years ago
lichen_101010
7499c72d54
mindspore grpc implementation
fix bugs for grpc implementation
addressed peer review comments
delete device_target code from Adel
add checksinglewatchpoint function for node level debugger
set the device target when sending metadata
add current node name
fix bugs for current node name
fix run_level_ bug
fix bugs for CheckSingleWatchpoint
fix multi-outputs node issue
fix num_step_ bug
fix continue_to previous node issue
fix run_level issue
fix merge conflict
smart kernel read, watch hit stop mid-step, fix step number, read input tensors
cleanup the code and isolate UpdataStepNum function
do cpplint, Cppcheck and clang-format check
recover CMakeLists.txt
only update step_num in one place
fix clang-format error
fix CI errors part2
update graphengine version
addressed comments
5 years ago
John Tzanakakis
96744f087e
GPU dump - input bins lag behind by 1 iteration
5 years ago
John Tzanakakis
b3c0eb61d5
GPU debugger - milestone 1 and GPU dump
Additional Authors: Adel Shafiei, Harshvardhan Gupta
5 years ago
mindspore-ci-bot
b13c7a3d48
!3268 refine GPU memory swap performance
Merge pull request !3268 from zyli2020/refine_gpu_mem_swap
5 years ago
ZPaC
0bc74f28c5
Enable get rank id and size by group
5 years ago
lizhenyu
c67e562373
refine GPU memory swap performance
5 years ago
zongha
226dbde481
fix bug for wide&deep
fix bug
5 years ago
mindspore-ci-bot
cfafdcbcf0
!3246 refine gpu memory swap performance
Merge pull request !3246 from zyli2020/refine_gpu_mem_swap
5 years ago
lizhenyu
3ace75509b
refine gpu memory swap performance
5 years ago
mindspore-ci-bot
bae2f964e5
!3213 Unified code style
Merge pull request !3213 from liubuyu/dev
5 years ago
liubuyu
76dc80e7b7
Unified code style
5 years ago
mindspore-ci-bot
ad5c649e86
!3165 support library cusolver for gpu backend
Merge pull request !3165 from mamba_ni/master
5 years ago
limingqi107
a596dd6e43
gpu fix the graph of 'nop node + depend + node'
5 years ago
zongha
82412429cf
support cusolverDn
fix clang format
5 years ago
mindspore-ci-bot
251683096a
!3045 Gpu support TopK kernel
Merge pull request !3045 from chenweifeng/sort
5 years ago
wilfChen
c10e07734c
gpu support TopK kernel
5 years ago
ZPaC
ab23776f5f
GPU supports to create groups for auto parallel.
5 years ago