383 Commits (88ef81663fa598abbcf87510d8a0de9b767ff09f)

Author SHA1 Message Date
  Parastoo Ashtari c6f5fb06f2 add comments for dump and debugger code. 4 years ago
  caifubi d0059e678a Fix data-parallel mix-precision bug in PyNative Mode 4 years ago
  i-robot 0b007e3973 !28843 GPU codex fix 4 years ago
  maning202007 3b4b8673c4 Update the api docstring for Summary 4 years ago
  VectorSL d3992332e1 fix codex 4 years ago
  i-robot b25bcb6d4c !28707 Fix hang at MPI_comm_create in unsymmetrical case. 4 years ago
  ZPaC 723a2d359c Fix hang at MPI_comm_create in unsymmetrical case. 4 years ago
  zjun dad29b207a Fix pynative lazy out 4 years ago
  chujinjin 7d050d0c03 fix getnext timeout or coredump 4 years ago
  i-robot c42cc24a6c !24550 add log: gpu capability check 4 years ago
  i-robot 3058a16213 !28144 using data type to get cube size 4 years ago
  i-robot 9df63697af !28356 Use max block size in non task sink mode 4 years ago
  tanghuikang f0298ff5f6 Non sink mode use one huge block 4 years ago
  caifubi 3719ec1be6 pynative allreduce bucket use memory pool 4 years ago
  lby e6857a6553 using cube to get c0 value 4 years ago
  ZPaC 0eaa2f3404 Fix scheduler nullptr error and add server group 4 years ago
  zjun eb450dd31f Optimize pynative device memory use 4 years ago
  ZPaC 78a79a9b5e Fix comm helper method 4 years ago
  lizhenyu b307016fd1 refine log of kernel select 4 years ago
  i-robot 520fe19b27 !27064 Reimplement tensorarray 4 years ago
  ZPaC ae3bae1571 Replace OpenMPI 4 years ago
  VectorSL 20b38e880b update tensor-array 4 years ago
  VectorSL 7c7cd34276 move cpu register into cpp kernel 4 years ago
  VectorSL cb3d25c8f0 add cpu tensor array 4 years ago
  i-robot 745c1eaff8 !26869 1.Purge not used API. 2.Adapt for collective_init.h 4 years ago
  ZPaC 2b7429c5d2 1.Purge not used API. 4 years ago
  wangshuide2020 6cbe8dd02e optimizes the kernel error description of LSTM, Pad, ReLU, etc. 4 years ago
  i-robot d66f811022 !26751 optimizes the kernel error description of Adagrad, Adam, Conv2d, etc. 4 years ago
  wangshuide2020 674e3aa9d6 optimizes the kernel error description of Adagrad, Adam, Conv2d, etc. 4 years ago
  VectorSL 710289a72d add tensor array 4 years ago
  VectorSL 8160fba758 update cublas error string 4 years ago
  i-robot 21ffa1fb7b !25091 Partial support for multi root graph in online debugger 4 years ago
  sabrinasun e7d7476a8e fix dynamic shape dump issue and apply comments from cell dump pr 4 years ago
  Parastoo Ashtari 7f682ba2f6 partial support for multi root graph in online debugger 4 years ago
  ZPaC 611de83fd8 Fix dynamic load error 4 years ago
  i-robot 8072e6d7f7 !26062 add Custom, custom_op_info_register, CustomRegOp to __init__ 4 years ago
  looop5 58e27d87bc add Custom, custom_op_info_register, CustomRegOp to __init__ 4 years ago
  Liu_Xuu 255e2c03b4 [MSLITE] add nccl and mpi distribution in tensorrt delegate 1111_05 4 years ago
  dayschan cbb84ff580 Move IsRealKernel and IsRealCNodeKernel from AnfAlgo to AnfUtils 4 years ago
  i-robot 11bec4d85e !25995 Add nvidia collective lib implementation. 4 years ago
  ZPaC 9e18bad126 Add nvidia collective lib implementation. 4 years ago
  looop5 b89d744e80 Custom op supports no reg info 4 years ago
  LaiYongqiang 7f251e3f08 add attr kAttrSkipNopOpAddr for nop node hidden in execution order 4 years ago
  dayschan 6a26d7f6d9 Move TypeId2String from kernel_compiler/ to ir/dtype_extends.cc 4 years ago
  i-robot 19b04d3ff3 !24074 Support AOT Operator for GPU/CPU Backend 4 years ago
  Yang Jiao 40b648b873 add aot 4 years ago
  ZPaC 4c1ef4cef6 Fix ps cache broadcast error. 4 years ago
  i-robot b271aa7a25 !24969 device address add the key of device 4 years ago
  limingqi107 be100476d6 device address add the key of device 4 years ago
  looop5 46789f260a Custom operator supports tbe dsl 4 years ago