wilfChen 59c4cf256c gpu support broadcast kernels 5 years ago
argmax_impl.cu initial version 6 years ago
argmax_impl.cuh initial version 6 years ago
assign_add_impl.cu initial version 6 years ago
assign_add_impl.cuh initial version 6 years ago
batchnorm_fold2_impl.cu add quantization gpu op 5 years ago
batchnorm_fold2_impl.cuh add quantization gpu op 5 years ago
batchnorm_fold_impl.cu add quantization gpu op 5 years ago
batchnorm_fold_impl.cuh add quantization gpu op 5 years ago
broadcast_grad_impl.cu gpu support MinimumGrad & MaximumGrad kernel 5 years ago
broadcast_grad_impl.cuh gpu support MinimumGrad & MaximumGrad kernel 5 years ago
broadcast_impl.cu gpu support broadcast kernels 5 years ago
broadcast_impl.cuh gpu support broadcast kernels 5 years ago
concatv2_impl.cu gpu concat kernel support 4 inputs 5 years ago
concatv2_impl.cuh gpu concat kernel support 4 inputs 5 years ago
correction_mul_impl.cu add quantization gpu op 5 years ago
correction_mul_impl.cuh add quantization gpu op 5 years ago
cross_entropy_impl.cu fix bug in cross entropy error 5 years ago
cross_entropy_impl.cuh fix bug in cross entropy error 5 years ago
dropout_impl.cu add quantization gpu op 5 years ago
dropout_impl.cuh add quantization gpu op 5 years ago
equalcount_impl.cu initial version 6 years ago
equalcount_impl.cuh initial version 6 years ago
fake_quant_impl.cu bug fix 5 years ago
fake_quant_impl.cuh bug fix 5 years ago
fake_quant_per_channel_impl.cu bug fix 5 years ago
fake_quant_per_channel_impl.cuh add quantization gpu op 5 years ago
float_status_impl.cu gpu add float_status kernel 5 years ago
float_status_impl.cuh gpu add float_status kernel 5 years ago
gather.cu initial version 6 years ago
gather.cuh add quantization gpu op 5 years ago
gelu_impl.cu gpu support Gelu & GeluGrad kernels 5 years ago
gelu_impl.cuh gpu support Gelu & GeluGrad kernels 5 years ago
layer_norm_grad_impl.cu Gpu support LayerNorm kernel 5 years ago
layer_norm_grad_impl.cuh Gpu support LayerNorm kernel 5 years ago
layer_norm_impl.cu Gpu support LayerNorm kernel 5 years ago
layer_norm_impl.cuh Gpu support LayerNorm kernel 5 years ago
momentum_impl.cu initial version 6 years ago
momentum_impl.cuh initial version 6 years ago
one_hot_impl.cu initial version 6 years ago
one_hot_impl.cuh initial version 6 years ago
pad_impl.cu initial version 6 years ago
pad_impl.cuh initial version 6 years ago
rmsprop_impl.cu gpu support RMSProp kernel 5 years ago
rmsprop_impl.cuh gpu support RMSProp kernel 5 years ago
select_impl.cu gpu add kernel select 5 years ago
select_impl.cuh gpu add kernel select 5 years ago
slice_impl.cu Gpu Slice kernel performance improvement 5 years ago
slice_impl.cuh Gpu Slice kernel performance improvement 5 years ago
sparse_cross_entropy_cuda_impl.cu add quantization gpu op 5 years ago
sparse_cross_entropy_cuda_impl.cuh add quantization gpu op 5 years ago
tanh_impl.cu gpu support tanh & tanhgrad kernel 5 years ago
tanh_impl.cuh gpu support tanh & tanhgrad kernel 5 years ago
transpose_impl.cu initial version 6 years ago
transpose_impl.cuh initial version 6 years ago
unary_op_impl.cu gpu queue support unary 5 years ago
unary_op_impl.cuh gpu queue support unary 5 years ago
unsorted_segment_sum.cu gpu support UnsortedSegmentSum kernel 5 years ago
unsorted_segment_sum.cuh gpu support UnsortedSegmentSum kernel 5 years ago