VectorSL cb3d25c8f0 add cpu tensor array 4 years ago
distribution 1.Purge not used API. 4 years ago
mpi code warning clean 5 years ago
blocking_queue.cc fix codecheck and pclint 4 years ago
blocking_queue.h [feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset 4 years ago
cuda_common.h add more dtypes support for gatherdgrad and other bugfix 5 years ago
cuda_driver.cc fix an allreduce bug with two streams sync problem 4 years ago
cuda_driver.h fix an allreduce bug with two streams sync problem 4 years ago
cuda_env_checker.cc [feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset 4 years ago
cuda_env_checker.h search nvcc in entire PATH 5 years ago
gpu_bucket.cc unify runtime for PyNative distributed mode 4 years ago
gpu_bucket.h unify runtime for PyNative distributed mode 4 years ago
gpu_buffer_mgr.cc [feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset 4 years ago
gpu_buffer_mgr.h add push opt logic 5 years ago
gpu_common.h optimizes the kernel error description of LSTM, Pad, ReLU, etc. 4 years ago
gpu_device_address.cc !25091 Partial support for multi root graph in online debugger 4 years ago
gpu_device_address.h partial support for multi root graph in online debugger 4 years ago
gpu_device_manager.cc fix the test case of CPU dump 4 years ago
gpu_device_manager.h fix an allreduce bug with two streams sync problem 4 years ago
gpu_event.cc codedex and pclint 4 years ago
gpu_event.h codedex and pclint 4 years ago
gpu_kernel_build.cc add validation of vector size and non-zero validation of denominator for nn gpu operators. 4 years ago
gpu_kernel_build.h add hardware abstract layer 5 years ago
gpu_kernel_runtime.cc add attr kAttrSkipNopOpAddr for nop node hidden in execution order 4 years ago
gpu_kernel_runtime.h add attr kAttrSkipNopOpAddr for nop node hidden in execution order 4 years ago
gpu_launch_kernel.cc Cherry-pick code from enterprise 4 years ago
gpu_launch_kernel.h Cherry-pick code from enterprise 4 years ago
gpu_launch_mul.cc Cherry-pick code from enterprise 4 years ago
gpu_launch_mul.h Cherry-pick code from enterprise 4 years ago
gpu_memory_allocator.cc add ascend memory adapter for ascend memory management 4 years ago
gpu_memory_allocator.h add ascend memory adapter for ascend memory management 4 years ago
gpu_memory_copy_manager.cc fix an allreduce bug with two streams sync problem 4 years ago
gpu_memory_copy_manager.h fix an allreduce bug with two streams sync problem 4 years ago
gpu_memory_manager.cc add ascend memory adapter for ascend memory management 4 years ago
gpu_memory_manager.h Fix PyNative Parameter Broadcast bug 4 years ago
gpu_stream_assign.cc [feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset 4 years ago
gpu_stream_assign.h actor runtimie supports allreduce multi-stream 4 years ago
gpu_tensor_array.cc add cpu tensor array 4 years ago
gpu_tensor_array.h add cpu tensor array 4 years ago
kernel_info_setter.cc add Custom, custom_op_info_register, CustomRegOp to __init__ 4 years ago
kernel_info_setter.h 1.Optimize bias add grad kernel 4 years ago
queue_common.h add trace for gpu error/excpt log 5 years ago
readme.md mindspore path adjust 5 years ago
trt_loader.cc trt operator 4 years ago
trt_loader.h gpu inference mixed precision 4 years ago