Parastoo Ashtari
c6f5fb06f2
add comments for dump and debugger code.
4 years ago
caifubi
d0059e678a
Fix data-parallel mix-precision bug in PyNative Mode
4 years ago
i-robot
0b007e3973
!28843 GPU codex fix
Merge pull request !28843 from VectorSL/fix-codex
4 years ago
maning202007
3b4b8673c4
Update the API docstring for Summary
4 years ago
VectorSL
d3992332e1
fix codex
4 years ago
i-robot
b25bcb6d4c
!28707 Fix hang at MPI_comm_create in unsymmetrical case.
Merge pull request !28707 from ZPaC/fix-get-rank-in-alltoall-fusion
4 years ago
ZPaC
723a2d359c
Fix hang at MPI_comm_create in unsymmetrical case.
4 years ago
zjun
dad29b207a
Fix pynative lazy out
Signed-off-by: zjun <zhangjun0@huawei.com>
4 years ago
chujinjin
7d050d0c03
fix getnext timeout or coredump
4 years ago
i-robot
c42cc24a6c
!24550 add log: gpu capability check
Merge pull request !24550 from VectorSL/add-minimum-sm-check
4 years ago
i-robot
3058a16213
!28144 using data type to get cube size
Merge pull request !28144 from liubuyu/SBB
4 years ago
i-robot
9df63697af
!28356 Use max block size in non task sink mode
Merge pull request !28356 from tanghuikang/block_size_max
4 years ago
tanghuikang
f0298ff5f6
Non sink mode use one huge block
4 years ago
caifubi
3719ec1be6
pynative allreduce bucket use memory pool
4 years ago
lby
e6857a6553
using cube to get c0 value
4 years ago
ZPaC
0eaa2f3404
Fix scheduler nullptr error and add server group
4 years ago
zjun
eb450dd31f
Optimize pynative device memory use
Add gradient to pynative unique
Signed-off-by: zjun <zhangjun0@huawei.com>
4 years ago
ZPaC
78a79a9b5e
Fix comm helper method
4 years ago
lizhenyu
b307016fd1
refine log of kernel select
4 years ago
i-robot
520fe19b27
!27064 Reimplement tensorarray
Merge pull request !27064 from VectorSL/reimpy-tensorarray
4 years ago
ZPaC
ae3bae1571
Replace OpenMPI
4 years ago
VectorSL
20b38e880b
update tensor-array
4 years ago
VectorSL
7c7cd34276
move cpu register into cpp kernel
4 years ago
VectorSL
cb3d25c8f0
add cpu tensor array
4 years ago
i-robot
745c1eaff8
!26869 1.Purge not used API. 2.Adapt for collective_init.h
Merge pull request !26869 from ZPaC/dir-of-distributed
4 years ago
ZPaC
2b7429c5d2
1.Purge not used API.
2.Adapt for collective_init.h
4 years ago
wangshuide2020
6cbe8dd02e
optimizes the kernel error description of LSTM, Pad, ReLU, etc.
4 years ago
i-robot
d66f811022
!26751 optimizes the kernel error description of Adagrad, Adam, Conv2d, etc.
Merge pull request !26751 from wangshuide/wsd_master
4 years ago
wangshuide2020
674e3aa9d6
optimizes the kernel error description of Adagrad, Adam, Conv2d, etc.
4 years ago
VectorSL
710289a72d
add tensor array
4 years ago
VectorSL
8160fba758
update cublas error string
4 years ago
i-robot
21ffa1fb7b
!25091 Partial support for multi root graph in online debugger
Merge pull request !25091 from parastooashtari/online_multi_root_graph
4 years ago
sabrinasun
e7d7476a8e
fix dynamic shape dump issue and apply comments from cell dump pr
4 years ago
Parastoo Ashtari
7f682ba2f6
partial support for multi root graph in online debugger
4 years ago
ZPaC
611de83fd8
Fix dynamic load error
4 years ago
i-robot
8072e6d7f7
!26062 add Custom, custom_op_info_register, CustomRegOp to __init__
Merge pull request !26062 from looop5/custom_init_commit
4 years ago
looop5
58e27d87bc
add Custom, custom_op_info_register, CustomRegOp to __init__
4 years ago
Liu_Xuu
255e2c03b4
[MSLITE] add nccl and mpi distribution in tensorrt delegate 1111_05
4 years ago
dayschan
cbb84ff580
Move IsRealKernel and IsRealCNodeKernel from AnfAlgo to AnfUtils
The functions IsOneOfPrimitive and IsOneOfPrimitiveCNode are useful,
so we can move them into anf.cc
4 years ago
i-robot
11bec4d85e
!25995 Add nvidia collective lib implementation.
Merge pull request !25995 from ZPaC/dir-of-distributed
4 years ago
ZPaC
9e18bad126
Add nvidia collective lib implementation.
4 years ago
looop5
b89d744e80
Custom op supports no reg info
4 years ago
LaiYongqiang
7f251e3f08
add attr kAttrSkipNopOpAddr for nop node hidden in execution order
4 years ago
dayschan
6a26d7f6d9
Move TypeId2String from kernel_compiler/ to ir/dtype_extends.cc
Renamed the function to "TypeIdToString" and used the Type::ToString() function
instead of a TypeId-to-string map.
Changed DtypeToTypeId as well, so the original StringToType can be used.
Added a new interface, StringToTypeId.
4 years ago
i-robot
19b04d3ff3
!24074 Support AOT Operator for GPU/CPU Backend
Merge pull request !24074 from jiaoy1224/pyfunc
4 years ago
Yang Jiao
40b648b873
add aot
4 years ago
ZPaC
4c1ef4cef6
Fix ps cache broadcast error.
4 years ago
i-robot
b271aa7a25
!24969 device address add the key of device
Merge pull request !24969 from limingqi107/new_actor_runtime
4 years ago
limingqi107
be100476d6
device address add the key of device
4 years ago
looop5
46789f260a
Custom operator supports tbe dsl
4 years ago