Cathy Wong
f7adf648e9
dataset API docstring: Update datasets, samplers, graphdata and text
5 years ago
xulei2020
18b519ae0f
add sentence piece
5 years ago
YangLuo
4136892a3e
add SlidingWindow Op
5 years ago
qianlong
94581f1c43
del JiebaMode and NormalizeForm from python api doc
5 years ago
qianlong
d9f4549d13
add comment for dataset.text
5 years ago
hesham
e981c67acd
Python Tokenizer
!38 Synchronize with latest Ascend software suite 17 Jun 2020
Merge pull request !38 from yanghaoran/master
5 years ago
peilinwang
1e36b0649f
remove graphengine changes
concat op
Truncate Pair
concat_op
remove graph engine changes
ToNumberOp implementation almost done
ToNumberOp complete
ci fix
merge conflicts
5 years ago
hesham
b9495a9ccc
Truncate Pair
5 years ago
qianlong
4f16f036be
Add WhitespaceTokenizer and UnicodeScriptTokenizer for nlp
add CaseFold, NormalizeUTF8
add RegexReplace
add RegexTokenizer
add BasicTokenizer
add WordpieceTokenizer
add BertTokenizer
6 years ago
Zirui Wu
dbf9936ec4
Implemented n-gram for dataset TensorOp
5 years ago
hesham
6c21e556c4
Clean up work for text python package
6 years ago
Zirui Wu
25ab2ef303
Implemented lookup and vocab
6 years ago