bert_tokenizer.txt 264 B

床前明月光
疑是地上霜
举头望明月
低头思故乡
I am making small mistakes during working hours
😀嘿嘿😃哈哈😄大笑😁嘻嘻
繁體字
unused [CLS]
unused [SEP]
unused [UNK]
unused [PAD]
unused [MASK]
[unused1]
[unused10]
12+/-28=40/-16