mindspore-ci-bot
e4451a1a49
!2464 [Dataset] code review & add citation
Merge pull request !2464 from luoyang/pylint
5 years ago
YangLuo
36d1613f9a
!2464 [Dataset] code review & add citation
5 years ago
qianlong
cae77c0c22
BasicTokenizer not case fold on preserved words
5 years ago
YangLuo
4e3bfcf4c9
!2306 [Dataset] Code review & improve quality
5 years ago
qianlong
980ddd32a2
change output of WordpieceTokenizer and BertTokenizer to 1-D string tensors
5 years ago
peilinwang
1e36b0649f
remove graphengine changes
remove graphengine changes
concat op
Truncate Pair
concat_op
remove graph engine changes
ToNumberOp implementation almost done
ToNumberOp complete
ci fix
ci fix
ci fix
ci fix
ci fix
ci fix
ci fix
ci fix
ci fix
ci fix
merge conflicts
5 years ago
Zirui Wu
b6e9504b31
phase I of Vocab rework
phase II vocab rework
added more test cases
fix api doc string
address review cmts and fix CI
address ci complaints
fix review cmts
ci
5 years ago
hesham
b9495a9ccc
Truncate Pair
5 years ago
qianlong
4f16f036be
Add WhitespaceTokenizer and UnicodeScriptTokenizer for nlp
add CaseFold, NormalizeUTF8
add RegexReplace
add RegexTokenizer
add BasicTokenizer
add WordpieceTokenizer
add BertTokenizer
5 years ago
Zirui Wu
2794883644
fix selected minor issues
fix review comments
5 years ago
xiefangqi
8fdfe34f3c
fix codex problems
5 years ago
Zirui Wu
dbf9936ec4
Implemented n-gram for dataset TensorOp
5 years ago
xiefangqi
d971106fec
fix minddata codex
5 years ago
hesham
6c21e556c4
Clean up work for text python package
5 years ago