# Contents

- [TextCNN Description](#textcnn-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
    - [Evaluation Process](#evaluation-process)
- [Model Description](#model-description)
    - [Performance](#performance)
- [ModelZoo Homepage](#modelzoo-homepage)
# [TextCNN Description](#contents)

TextCNN is an algorithm that uses convolutional neural networks to classify text. It was proposed by Yoon Kim in the 2014 paper "Convolutional Neural Networks for Sentence Classification". It is widely used in text classification tasks such as sentiment analysis and has become a standard baseline for new text classification architectures. Each module of TextCNN can complete text classification tasks independently, which makes it convenient for distributed configuration and parallel execution. TextCNN is well suited to semantic analysis of short texts such as Weibo posts, news items, e-commerce reviews, and video bullet-screen comments.

[Paper](https://arxiv.org/abs/1408.5882): Kim Y. Convolutional neural networks for sentence classification[J]. arXiv preprint arXiv:1408.5882, 2014.
# [Model Architecture](#contents)

The basic network structure of TextCNN follows the paper "Convolutional Neural Networks for Sentence Classification". Take the sentence "I like this movie very much!" as an example of the concrete implementation. First, a word segmentation algorithm splits the sentence into 7 words, and each word is expanded into a five-dimensional vector through an embedding layer. Convolution kernels of different heights ([3,4,5]*5, with a default of 2 kernels per height) are then applied to the embedding matrix to obtain feature maps. A maxpool operation pools each feature map down to a single value, and the pooled results are concatenated into a one-dimensional feature vector. Finally, a softmax layer classifies the vector into 2 categories, yielding the positive/negative sentiment.
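The forward pass described above can be sketched in plain NumPy. This is an illustrative sketch of the algorithm, not the actual `src/textcnn.py` implementation; the function name and the per-filter weight shapes are assumptions:

```python
import numpy as np

def textcnn_forward(embeddings, kernels, weights, bias):
    """Single-sentence TextCNN forward pass (illustrative sketch).

    embeddings: (seq_len, vec_len) word-embedding matrix
    kernels:    list of (height, vec_len) convolution filters
    weights:    (num_classes, len(kernels)) fully-connected weights
    bias:       (num_classes,) fully-connected bias
    """
    pooled = []
    for k in kernels:
        h = k.shape[0]
        # valid convolution sliding over the word (time) axis
        conv = np.array([np.sum(embeddings[i:i + h] * k)
                         for i in range(embeddings.shape[0] - h + 1)])
        pooled.append(conv.max())   # max-over-time pooling per feature map
    features = np.array(pooled)     # concatenation into a 1-D feature vector
    logits = weights @ features + bias
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()          # softmax over the 2 classes
```

With the paper's running example — 7 words, 5-dimensional embeddings, kernel heights (3, 3, 4, 4, 5, 5) — the concatenated feature vector has 6 entries and the output is a 2-element probability distribution.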
# [Dataset](#contents)

Note that you can run the scripts with the dataset mentioned in the original paper or one widely used in the relevant domain/network architecture. The following sections describe how to run the scripts using the dataset below.

Dataset used: [Movie Review Data](<http://www.cs.cornell.edu/people/pabo/movie-review-data/>)

- Dataset size: 1.18M, 5331 positive and 5331 negative processed sentences / snippets.
    - Train: 1.06M, 9596 sentences / snippets
    - Test: 0.12M, 1066 sentences / snippets
- Data format: text
- Please click [here](<http://www.cs.cornell.edu/people/pabo/movie-review-data/rt-polaritydata.tar.gz>) to download the data, convert the files to UTF-8, and put them into the `data` directory.
- Note: the data will be processed in src/dataset.py
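The UTF-8 conversion step can be done with a few lines of Python. This is a sketch: the `to_utf8` helper is hypothetical, and the assumption that the archive's text files are latin-1 encoded should be verified against your download:

```python
from pathlib import Path

def to_utf8(src, dst, src_encoding="latin-1"):
    """Re-encode a text file to UTF-8 (source encoding is an assumption)."""
    text = Path(src).read_text(encoding=src_encoding)
    Path(dst).parent.mkdir(parents=True, exist_ok=True)
    Path(dst).write_text(text, encoding="utf-8")

# Example (file names depend on the extracted archive layout):
# to_utf8("rt-polaritydata/rt-polarity.pos", "data/rt-polarity.pos")
# to_utf8("rt-polaritydata/rt-polarity.neg", "data/rt-polarity.neg")
```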
# [Environment Requirements](#contents)

- Hardware (Ascend)
    - Prepare a hardware environment with an Ascend processor.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)

After installing MindSpore via the official website, you can start training and evaluation as follows:

- running on Ascend

```bash
# run the training example
python train.py > train.log 2>&1 &
# OR
sh scripts/run_train.sh

# run the evaluation example
python eval.py > eval.log 2>&1 &
# OR
sh scripts/run_eval.sh ckpt_path
```
# [Script Description](#contents)

## [Script and Sample Code](#contents)

```bash
├── model_zoo
    ├── README.md                // descriptions about all the models
    ├── textcnn
        ├── README.md            // descriptions about textcnn
        ├── scripts
        │   ├── run_train.sh     // shell script for distributed training on Ascend
        │   ├── run_eval.sh      // shell script for evaluation on Ascend
        ├── src
        │   ├── dataset.py       // dataset processing
        │   ├── textcnn.py       // textcnn architecture
        │   ├── config.py        // parameter configuration
        ├── train.py             // training script
        ├── eval.py              // evaluation script
        ├── export.py            // export a checkpoint to other file formats
```
## [Script Parameters](#contents)

Parameters for both training and evaluation can be set in config.py.

- config for the movie review dataset

```python
'pre_trained': 'False'    # whether to train based on a pre-trained model
'num_classes': 2          # the number of classes in the dataset
'batch_size': 64          # training batch size
'epoch_size': 4           # total training epochs
'weight_decay': 3e-5      # weight decay value
'data_path': './data/'    # absolute full path to the train and evaluation datasets
'device_target': 'Ascend' # device running the program
'device_id': 0            # device ID used to train or evaluate the dataset; ignored when using run_train.sh for distributed training
'keep_checkpoint_max': 1  # only keep the last keep_checkpoint_max checkpoints
'checkpoint_path': './train_textcnn.ckpt' # absolute full path of the saved checkpoint file
'word_len': 51            # the number of words per sentence
'vec_length': 40          # the length of the word-embedding vector
'base_lr': 1e-3           # the base learning rate
```

For more configuration details, please refer to the script `config.py`.
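For reference, the fields above can be collected into a single namespace object the way a typical `config.py` does. This is only a sketch of the shape of the configuration; the real file may use a different container (the values themselves come from the list above):

```python
from types import SimpleNamespace

# sketch of a config object holding the documented defaults
cfg = SimpleNamespace(
    pre_trained=False,
    num_classes=2,
    batch_size=64,
    epoch_size=4,
    weight_decay=3e-5,
    data_path='./data/',
    device_target='Ascend',
    device_id=0,
    keep_checkpoint_max=1,
    checkpoint_path='./train_textcnn.ckpt',
    word_len=51,
    vec_length=40,
    base_lr=1e-3,
)
```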
## [Training Process](#contents)

- running on Ascend

```bash
python train.py > train.log 2>&1 &
# OR
sh scripts/run_train.sh
```

The python command above runs in the background; you can view the results through the file `train.log`.

After training, you will get some checkpoint files in `ckpt`. The loss values will look as follows:

```text
# grep "loss is " train.log
epoch: 1 step 149, loss is 0.6194226145744324
epoch: 2 step 149, loss is 0.38729554414749146
...
```

The model checkpoint will be saved in the `ckpt` directory.
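Beyond `grep`, the loss curve can be pulled out of `train.log` with a short script. This sketch assumes the exact log line format shown above; the `parse_losses` helper is not part of the repository:

```python
import re

# matches lines like: "epoch: 1 step 149, loss is 0.6194226145744324"
LOSS_RE = re.compile(r"epoch: (\d+) step (\d+), loss is ([\d.]+)")

def parse_losses(log_text):
    """Return (epoch, step, loss) tuples extracted from a train.log dump."""
    return [(int(e), int(s), float(l)) for e, s, l in LOSS_RE.findall(log_text)]
```

For example, feeding it the two log lines above yields `[(1, 149, 0.6194226145744324), (2, 149, 0.38729554414749146)]`, which is convenient for plotting the loss curve.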
## [Evaluation Process](#contents)

- evaluation on the movie review dataset when running on Ascend

Before running the command below, please check the checkpoint path used for evaluation. Please set the checkpoint path to the absolute full path, e.g., "username/textcnn/ckpt/train_textcnn.ckpt".

```bash
python eval.py --checkpoint_path=ckpt_path > eval.log 2>&1 &
# OR
sh scripts/run_eval.sh ckpt_path
```

The above python command runs in the background. You can view the results through the file `eval.log`. The accuracy on the test dataset will be as follows:

```text
# grep "accuracy: " eval.log
accuracy: {'acc': 0.7971428571428572}
```
# [Model Description](#contents)

## [Performance](#contents)

### TextCNN on Movie Review Dataset

| Parameters          | Ascend                                                |
| ------------------- | ----------------------------------------------------- |
| Model Version       | TextCNN                                               |
| Resource            | Ascend 910; CPU 2.60GHz, 192 cores; Memory 755G       |
| Uploaded Date       | 11/10/2020 (month/day/year)                           |
| MindSpore Version   | 1.0.1                                                 |
| Dataset             | Movie Review Data                                     |
| Training Parameters | epoch=4, steps=149, batch_size=64                     |
| Optimizer           | Adam                                                  |
| Loss Function       | Softmax Cross Entropy                                 |
| Outputs             | probability                                           |
| Loss                | 0.1724                                                |
| Speed               | 1pc: 12.069 ms/step                                   |
| Total Time          | 1pc: 13s                                              |
| Scripts             | [textcnn script](https://gitee.com/xinyunfan/textcnn) |

# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).