mindspore.dataset.text.SentencePieceModel
==========================================

.. py:class:: mindspore.dataset.text.SentencePieceModel(value, names=None, *, module=None, qualname=None, type=None, start=1)

    Enumeration class for the tokenization methods of `SentencePiece`.

    The available enumeration values are `SentencePieceModel.UNIGRAM`, `SentencePieceModel.BPE`, `SentencePieceModel.CHAR` and `SentencePieceModel.WORD`.

    - **SentencePieceModel.UNIGRAM** - Unigram language model, in which the next word in a sentence is assumed to be independent of the previous words generated by the model.
    - **SentencePieceModel.BPE** - Byte pair encoding algorithm, which replaces the most frequent pair of bytes in a sentence with a single, unused byte.
    - **SentencePieceModel.CHAR** - Character-based SentencePiece model type.
    - **SentencePieceModel.WORD** - Word-based SentencePiece model type.
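
The byte pair encoding idea described for `SentencePieceModel.BPE` can be illustrated with a minimal, standalone sketch of a single merge step. This is not the MindSpore or SentencePiece implementation (which works on subword symbols and learns a full merge table). The function name `bpe_merge_step` is hypothetical, and the sketch assumes the input is raw bytes with at least one byte value left unused.

```python
from collections import Counter


def bpe_merge_step(data: bytes):
    """Perform one byte pair encoding step (illustrative sketch only).

    Finds the most frequent adjacent byte pair and replaces each
    non-overlapping occurrence (scanning left to right) with a byte
    value that does not occur in the data.
    Returns (new_data, (pair, replacement)) or (data, None) if no merge applies.
    """
    # Count every adjacent pair of byte values.
    pairs = Counter(zip(data, data[1:]))
    if not pairs:
        return data, None

    (a, b), count = pairs.most_common(1)[0]
    if count < 2:
        # No pair repeats, so merging would not shorten anything.
        return data, None

    # Pick the smallest byte value that is not present in the data.
    unused = next((v for v in range(256) if v not in data), None)
    if unused is None:
        return data, None

    out = bytearray()
    i = 0
    while i < len(data):
        if i + 1 < len(data) and data[i] == a and data[i + 1] == b:
            out.append(unused)  # replace the pair with the unused byte
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out), (bytes([a, b]), unused)


merged, rule = bpe_merge_step(b"aaabdaaabac")
print(merged, rule)
```

Repeating this step until no pair occurs more than once yields the classic compression form of BPE. Tokenizers apply the same idea to merge frequent symbol pairs into vocabulary entries.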