16 Commits (cdd3e50e986fd85375bd3cb89eabfa60ecc31896)

Author SHA1 Message Date
  huangxinjing b16dbf2b5d 1. Fix optimizer check error, as the check is done by the class name, too naive 4 years ago
  huangxinjing 8c9b2b93a8 Add transformer 4 years ago
  yao_yf 188d39da83 slice_activation_in_recompute 4 years ago
  huangxinjing f354ab22a3 add pipeline shard interface 4 years ago
  linqingke acde7febef update pangu reshape and softmax performance. 4 years ago
  huangxinjing 0b89d5c9c4 fix batch size error 4 years ago
  huangxinjing e02f553010 Fix spell error and add mode check 4 years ago
  huangxinjing 6cea07f749 Add args check 4 years ago
  yao_yf 82889ec56b fixed sparse attention 4 years ago
  zhihenghu ce12c02343 Add Sparse Attention 4 years ago
  huangxinjing 62496d75f3 less the interface exposed 4 years ago
  ms_yan 36a8886ca2 Revert "[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset" 4 years ago
  djc 4e6f7dc97d [feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset 4 years ago
  huangxinjing d777742904 1. Move the class to mindspore.parallel, support activation sharding 4 years ago
  huangxinjing 18044aff0f 1. Add docstring, eliminate attention mask, tuple append the decoder return layer past 4 years ago
  huangxinjing 615d1a179d Add transformer layer 4 years ago