# Contents

- [TextCNN Description](#textcnn-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
    - [Evaluation Process](#evaluation-process)
- [Model Description](#model-description)
    - [Performance](#performance)
- [ModelZoo Homepage](#modelzoo-homepage)
# [TextCNN Description](#contents)

TextCNN is an algorithm that uses convolutional neural networks to classify text. It was proposed by Yoon Kim in the 2014 paper "Convolutional Neural Networks for Sentence Classification". It is widely used in text classification tasks such as sentiment analysis and has become a standard baseline for new text classification architectures. Each module of TextCNN can complete text classification independently, which makes it convenient for distributed configuration and parallel execution. TextCNN is well suited to semantic analysis of short texts such as Weibo posts, news headlines, e-commerce reviews, and video bullet-screen comments.

[Paper](https://arxiv.org/abs/1408.5882): Kim Y. Convolutional neural networks for sentence classification[J]. arXiv preprint arXiv:1408.5882, 2014.
# [Model Architecture](#contents)

The network structure of TextCNN follows the paper "Convolutional Neural Networks for Sentence Classification". Take the sentence "I like this movie very much!" as an example. First, tokenization splits the sentence into 7 tokens, and each token is mapped to a five-dimensional vector by an embedding layer. Convolution kernels of different heights ([3,4,5], each of width 5) are then applied to obtain feature maps; the default number of kernels per size is 2. A max-pooling operation reduces each feature map to a single value, and the pooled results are concatenated into a one-dimensional feature vector. Finally, a softmax layer classifies the vector into 2 categories, yielding the positive/negative sentiment.
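The flow described above (convolve with kernels of heights 3/4/5, max-pool over time, concatenate, softmax) can be sketched in plain Python with random weights. This is an illustrative toy, not the `src/textcnn.py` implementation; the function name and shapes are only for demonstration:

```python
import math
import random

def textcnn_forward(embedded, kernel_sizes=(3, 4, 5), num_kernels=2,
                    num_classes=2, seed=0):
    """Toy TextCNN forward pass over one embedded sentence (list of word vectors)."""
    rng = random.Random(seed)
    seq_len, vec_len = len(embedded), len(embedded[0])
    pooled = []
    for k in kernel_sizes:
        for _ in range(num_kernels):
            # one random kernel of height k spanning the full embedding width
            f = [[rng.gauss(0, 1) for _ in range(vec_len)] for _ in range(k)]
            # valid convolution down the sentence -> feature map of length seq_len - k + 1
            feature_map = [
                sum(embedded[i + r][c] * f[r][c]
                    for r in range(k) for c in range(vec_len))
                for i in range(seq_len - k + 1)
            ]
            pooled.append(max(feature_map))  # max-over-time pooling
    # fully connected layer + softmax over the concatenated feature vector
    w = [[rng.gauss(0, 1) for _ in range(len(pooled))] for _ in range(num_classes)]
    logits = [sum(wi * p for wi, p in zip(row, pooled)) for row in w]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

r = random.Random(1)
sentence = [[r.gauss(0, 1) for _ in range(5)] for _ in range(7)]  # 7 tokens x 5-dim
probs = textcnn_forward(sentence)
```

With 3 kernel sizes and 2 kernels each, the pooled feature vector has 6 entries, and the output is a 2-way probability distribution.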
# [Dataset](#contents)

Note that you can run the scripts with the dataset used in the original paper or one widely used in this domain. In the following sections, we introduce how to run the scripts using the dataset below.

Dataset used: [Movie Review Data](<http://www.cs.cornell.edu/people/pabo/movie-review-data/>)

- Dataset size: 1.18M, 5331 positive and 5331 negative processed sentences/snippets.
    - Train: 1.06M, 9596 sentences/snippets
    - Test: 0.12M, 1066 sentences/snippets
- Data format: text
- Please click [here](<http://www.cs.cornell.edu/people/pabo/movie-review-data/rt-polaritydata.tar.gz>) to download the data, convert the files to UTF-8, and put them into the `data` directory.
- Note: data will be processed in `src/dataset.py`.
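The UTF-8 conversion step above can be done with a few lines of Python. A sketch, assuming the downloaded rt-polarity files use a Windows-1252-style encoding (adjust `src_encoding` if decoding fails):

```python
from pathlib import Path

def to_utf8(path, src_encoding="windows-1252"):
    """Re-encode a text file to UTF-8 in place.

    src_encoding is an assumption about the downloaded files;
    change it if decoding raises an error.
    """
    p = Path(path)
    text = p.read_text(encoding=src_encoding)
    p.write_text(text, encoding="utf-8")

# e.g. after extracting rt-polaritydata.tar.gz into ./data:
# to_utf8("data/rt-polaritydata/rt-polarity.pos")
# to_utf8("data/rt-polaritydata/rt-polarity.neg")
```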
# [Environment Requirements](#contents)

- Hardware (Ascend)
    - Prepare a hardware environment with an Ascend processor.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)

After installing MindSpore via the official website, you can start training and evaluation as follows:

- running on Ascend

```bash
# run the training example
python train.py > train.log 2>&1 &
# OR
sh scripts/run_train.sh

# run the evaluation example
python eval.py > eval.log 2>&1 &
# OR
sh scripts/run_eval.sh ckpt_path
```
If you want to run on ModelArts, please check the official documentation of [ModelArts](https://support.huaweicloud.com/modelarts/); you can start training and evaluation as follows:

```text
# run distributed training on ModelArts
# (1) Perform a or b.
#       a. Set "enable_modelarts=True" in the yaml file.
#          Set other parameters you need in the yaml file.
#       b. Add "enable_modelarts=True" on the website UI interface.
#          Add other parameters on the website UI interface.
# (2) Set the code directory to "/path/textcnn" on the website UI interface.
# (3) Set the startup file to "train.py" on the website UI interface.
# (4) Set the "Dataset path", "Output file path", and "Job log path" on the website UI interface.
# (5) Create your job.

# run evaluation on ModelArts
# (1) Copy or upload your trained model to an S3 bucket.
# (2) Perform a or b.
#       a. Set "checkpoint_file_path='/cache/checkpoint_path/model.ckpt'" in the yaml file.
#          Set "checkpoint_url=/The path of checkpoint in S3/" in the yaml file.
#       b. Add "checkpoint_file_path='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
#          Add "checkpoint_url=/The path of checkpoint in S3/" on the website UI interface.
# (3) Set the code directory to "/path/textcnn" on the website UI interface.
# (4) Set the startup file to "eval.py" on the website UI interface.
# (5) Set the "Dataset path", "Output file path", and "Job log path" on the website UI interface.
# (6) Create your job.
```
# [Script Description](#contents)

## [Script and Sample Code](#contents)

```bash
├── model_zoo
    ├── README.md                 // descriptions about all the models
    ├── textcnn
        ├── README.md             // descriptions about textcnn
        ├── scripts
        │   ├── run_train.sh      // shell script for distributed training on Ascend
        │   ├── run_eval.sh       // shell script for evaluation on Ascend
        ├── src
        │   ├── dataset.py        // dataset processing
        │   ├── textcnn.py        // textcnn architecture
        ├── utils
        │   ├── device_adapter.py // device adapter
        │   ├── local_adapter.py  // local adapter
        │   ├── moxing_adapter.py // moxing adapter
        │   ├── config.py         // parameter parsing
        ├── mr_config.yaml        // parameter configuration
        ├── sst2_config.yaml      // parameter configuration
        ├── subj_config.yaml      // parameter configuration
        ├── train.py              // training script
        ├── eval.py               // evaluation script
        ├── export.py             // export checkpoint to other file formats
```
## [Script Parameters](#contents)

Parameters for both training and evaluation can be set in config.py

- config for the Movie Review dataset

```python
'pre_trained': 'False'    # whether to train from a pre-trained model
'num_classes': 2          # the number of classes in the dataset
'batch_size': 64          # training batch size
'epoch_size': 4           # total training epochs
'weight_decay': 3e-5      # weight decay value
'data_path': './data/'    # absolute full path to the train and evaluation datasets
'device_target': 'Ascend' # device running the program
'device_id': 0            # device ID used to train or evaluate the dataset; ignored when using run_train.sh for distributed training
'keep_checkpoint_max': 1  # only keep the last keep_checkpoint_max checkpoints
'checkpoint_path': './train_textcnn.ckpt' # the absolute full path to save the checkpoint file
'word_len': 51            # the sentence length in words
'vec_length': 40          # the length of the embedding vector
'base_lr': 1e-3           # the base learning rate
```
For more configuration details, please refer to the script `config.py`.
## [Training Process](#contents)

- running on Ascend

```bash
python train.py > train.log 2>&1 &
# OR
sh scripts/run_train.sh
```

The python command above runs in the background; you can view the results through the file `train.log`.

After training, you'll get some checkpoint files under `ckpt`. The loss values will look as follows:

```bash
# grep "loss is " train.log
epoch: 1 step 149, loss is 0.6194226145744324
epoch: 2 step 149, loss is 0.38729554414749146
...
```

The model checkpoint will be saved in the `ckpt` directory.
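Besides grep, the loss curve can be pulled out of `train.log` with a short Python snippet. A sketch, with the log line format taken from the sample above:

```python
import re

def parse_losses(log_text):
    """Extract (epoch, step, loss) tuples from train.log-style lines."""
    pattern = re.compile(r"epoch: (\d+) step (\d+), loss is ([\d.]+)")
    return [(int(e), int(s), float(l)) for e, s, l in pattern.findall(log_text)]

sample = """epoch: 1 step 149, loss is 0.6194226145744324
epoch: 2 step 149, loss is 0.38729554414749146"""
losses = parse_losses(sample)
```

This is handy for plotting the loss over epochs once training has produced a full log.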
## [Evaluation Process](#contents)

- evaluation on the Movie Review dataset when running on Ascend

Before running the command below, please check the checkpoint path used for evaluation. Please set the checkpoint path to an absolute full path, e.g., "username/textcnn/ckpt/train_textcnn.ckpt".

```bash
python eval.py --checkpoint_path=ckpt_path > eval.log 2>&1 &
# OR
sh scripts/run_eval.sh ckpt_path
```

The above python command runs in the background. You can view the results through the file `eval.log`. The accuracy on the test dataset will be as follows:

```bash
# grep "accuracy: " eval.log
accuracy: {'acc': 0.7971428571428572}
```
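Since the accuracy line is printed as a Python dict literal, it can be read back programmatically. A sketch, assuming the `accuracy: {...}` format shown above:

```python
import ast
import re

def parse_accuracy(log_text):
    """Return the 'acc' value from an eval.log-style accuracy line, or None."""
    match = re.search(r"accuracy: (\{.*\})", log_text)
    if match is None:
        return None
    # the logged dict is a plain literal, so literal_eval is safe here
    return ast.literal_eval(match.group(1))["acc"]

acc = parse_accuracy("accuracy: {'acc': 0.7971428571428572}")
```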
# [Model Description](#contents)

## [Performance](#contents)

### TextCNN on Movie Review Dataset

| Parameters          | Ascend                                                       |
| ------------------- | ------------------------------------------------------------ |
| Model Version       | TextCNN                                                      |
| Resource            | Ascend 910; CPU 2.60GHz, 192 cores; Memory 755G; OS Euler2.8 |
| Uploaded Date       | 11/10/2020 (month/day/year)                                  |
| MindSpore Version   | 1.0.1                                                        |
| Dataset             | Movie Review Data                                            |
| Training Parameters | epoch=4, steps=149, batch_size=64                            |
| Optimizer           | Adam                                                         |
| Loss Function       | Softmax Cross Entropy                                        |
| Outputs             | probability                                                  |
| Loss                | 0.1724                                                       |
| Speed               | 1pc: 12.069 ms/step                                          |
| Total Time          | 1pc: 13s                                                     |
| Scripts             | [textcnn script](https://gitee.com/xinyunfan/textcnn)        |
# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).