# DeepText for Ascend

- [DeepText Description](#deeptext-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Features](#features)
    - [Mixed Precision](#mixed-precision)
- [Environment Requirements](#environment-requirements)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Training Process](#training-process)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Training Performance](#training-performance)
        - [Inference Performance](#inference-performance)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)

# [DeepText Description](#contents)

DeepText is a convolutional neural network architecture for text detection in non-specific scenarios. The DeepText system is based on the Faster R-CNN framework. It was proposed in the paper "DeepText: A new approach for text proposal generation and text detection in natural images", published in 2017.

[Paper](https://arxiv.org/pdf/1605.07314v1.pdf): Zhuoyao Zhong, Lianwen Jin, Shuangping Huang, South China University of Technology (SCUT), published in ICASSP 2017.

# [Model architecture](#contents)

The overall network architecture of DeepText is shown in the paper:

[Link](https://arxiv.org/pdf/1605.07314v1.pdf)

# [Dataset](#contents)

We used 4 datasets for training and 1 dataset for evaluation.

- Dataset1: ICDAR 2013: Focused Scene Text
    - Train: 142MB, 229 images
    - Test: 110MB, 233 images
- Dataset2: ICDAR 2013: Born-Digital Images
    - Train: 27.7MB, 410 images
- Dataset3: SCUT-FORU: Flickr OCR Universal Database
    - Train: 388MB, 1715 images
- Dataset4: CocoText v2 (subset of MSCOCO2017)
    - Train: 13GB, 63686 images

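As a quick cross-check, the training-image counts listed above can be summed and compared with the 66040 images reported in the training performance table. The snippet below is only an illustrative sketch; the dictionary keys are labels chosen here, not identifiers from the repository.

```python
# Training-image counts copied from the dataset list above.
train_images = {
    "ICDAR 2013 Focused Scene Text": 229,
    "ICDAR 2013 Born-Digital Images": 410,
    "SCUT-FORU": 1715,
    "CocoText v2": 63686,
}

# Sum of all training images across the four datasets.
total = sum(train_images.values())
print(total)  # 66040
```
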
# [Features](#contents)

# [Environment Requirements](#contents)

- Hardware (Ascend)
    - Prepare hardware environment with Ascend processor.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)

# [Script description](#contents)

## [Script and sample code](#contents)

```shell
.
└─deeptext
  ├─README.md
  ├─ascend310_infer                      # application for 310 inference
  ├─scripts
    ├─run_standalone_train_ascend.sh     # launch standalone training with ascend platform(1p)
    ├─run_distribute_train_ascend.sh     # launch distributed training with ascend platform(8p)
    ├─run_infer_310.sh                   # shell script for 310 inference
    └─run_eval_ascend.sh                 # launch evaluating with ascend platform
  ├─src
    ├─DeepText
      ├─__init__.py                      # package init file
      ├─anchor_genrator.py               # anchor generator
      ├─bbox_assign_sample.py            # proposal layer for stage 1
      ├─bbox_assign_sample_stage2.py     # proposal layer for stage 2
      ├─deeptext_vgg16.py                # main network definition
      ├─proposal_generator.py            # proposal generator
      ├─rcnn.py                          # rcnn
      ├─roi_align.py                     # roi_align cell wrapper
      ├─rpn.py                           # region-proposal network
      └─vgg16.py                         # backbone
    ├─config.py                          # training configuration
    ├─aipp.cfg                           # aipp config file
    ├─dataset.py                         # data preprocessing
    ├─lr_schedule.py                     # learning rate scheduler
    ├─network_define.py                  # network definition
    └─utils.py                           # commonly used utility functions
  ├─eval.py                              # eval net
  ├─export.py                            # export checkpoint, supports .onnx, .air, .mindir conversion
  ├─postprogress.py                      # post process for 310 inference
  └─train.py                             # train net
```

## [Training process](#contents)

### Usage

- Ascend:

```bash
# distribute training example(8p)
sh run_distribute_train_ascend.sh [IMGS_PATH] [ANNOS_PATH] [RANK_TABLE_FILE] [PRETRAINED_PATH] [COCO_TEXT_PARSER_PATH]
# standalone training
sh run_standalone_train_ascend.sh [IMGS_PATH] [ANNOS_PATH] [PRETRAINED_PATH] [COCO_TEXT_PARSER_PATH] [DEVICE_ID]
# evaluation:
sh run_eval_ascend.sh [IMGS_PATH] [ANNOS_PATH] [CHECKPOINT_PATH] [COCO_TEXT_PARSER_PATH] [DEVICE_ID]
```

> Notes:
>
> `RANK_TABLE_FILE` is described in [Link](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/distributed_training_ascend.html), and the device_ip can be obtained with [Link](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools). For large models, it is better to set the environment variable `export HCCL_CONNECT_TIMEOUT=600`, which extends the hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise the connection may time out, since compile time grows with model size.
>
> The distributed training script binds processor cores according to `device_num` and the total number of processors. If you do not want this, remove the `taskset` operations in `scripts/run_distribute_train_ascend.sh`.
>
> The `pretrained_path` should be a checkpoint of vgg16 trained on Imagenet2012. The weight names in the checkpoint dict must exactly match those the network expects, and batch_norm must be enabled when training vgg16; otherwise later steps fail.
>
> `COCO_TEXT_PARSER_PATH` is the path to `coco_text.py`, which is available from [Link](https://github.com/andreasveit/coco-text).

### Launch

```bash
# training example
# shell:
# Ascend:
# distribute training example(8p)
sh run_distribute_train_ascend.sh [IMGS_PATH] [ANNOS_PATH] [RANK_TABLE_FILE] [PRETRAINED_PATH] [COCO_TEXT_PARSER_PATH]
# standalone training
sh run_standalone_train_ascend.sh [IMGS_PATH] [ANNOS_PATH] [PRETRAINED_PATH] [COCO_TEXT_PARSER_PATH] [DEVICE_ID]
```

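When the distributed launch is driven from Python rather than typed by hand, the argument order and the `HCCL_CONNECT_TIMEOUT` setting can be assembled as in the sketch below. The helper names and every path are hypothetical placeholders, not part of the repository; only the script name, the argument order, and the environment variable come from this README. The sketch builds the command without running it.

```python
import os

def distribute_train_command(imgs_path, annos_path, rank_table_file,
                             pretrained_path, coco_text_parser_path):
    """Return the argv list for the 8p training script (hypothetical helper)."""
    return [
        "sh", "run_distribute_train_ascend.sh",
        imgs_path, annos_path, rank_table_file,
        pretrained_path, coco_text_parser_path,
    ]

def launch_env(hccl_timeout_s=600):
    """Copy the current environment and extend the HCCL connection timeout."""
    env = dict(os.environ)
    env["HCCL_CONNECT_TIMEOUT"] = str(hccl_timeout_s)
    return env

# All paths below are placeholders chosen for illustration.
cmd = distribute_train_command(
    "/data/imgs", "/data/annos", "/config/rank_table_8p.json",
    "/ckpt/vgg16.ckpt", "/tools/coco-text",
)
env = launch_env()
print(" ".join(cmd))
# To launch for real, something like:
#   subprocess.run(cmd, env=env, check=True, cwd="scripts")
```
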
### Result

Training results are stored in the example path. Checkpoints are stored at `ckpt_path` by default, the training log is redirected to `./log`, and the loss is redirected to `./loss_0.log`, as shown below.

```python
469 epoch: 1 step: 982 ,rpn_loss: 0.03940, rcnn_loss: 0.48169, rpn_cls_loss: 0.02910, rpn_reg_loss: 0.00344, rcnn_cls_loss: 0.41943, rcnn_reg_loss: 0.06223, total_loss: 0.52109
659 epoch: 2 step: 982 ,rpn_loss: 0.03607, rcnn_loss: 0.32129, rpn_cls_loss: 0.02916, rpn_reg_loss: 0.00230, rcnn_cls_loss: 0.25732, rcnn_reg_loss: 0.06390, total_loss: 0.35736
847 epoch: 3 step: 982 ,rpn_loss: 0.07074, rcnn_loss: 0.40527, rpn_cls_loss: 0.03494, rpn_reg_loss: 0.01193, rcnn_cls_loss: 0.30591, rcnn_reg_loss: 0.09937, total_loss: 0.47601
```

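To plot or compare losses, lines in this format can be parsed with a small script. The following is a minimal sketch that assumes every line follows the `name: value` layout of the samples above; `parse_loss_line` is a name introduced here, and the leading number on each line is ignored because the README does not say what it means. In the three sample lines, `total_loss` equals `rpn_loss + rcnn_loss`, which the sketch uses as a sanity check.

```python
import re

# Matches "name: value" pairs such as "rpn_loss: 0.03940".
PAIR = re.compile(r"(\w+):\s*([0-9.]+)")

def parse_loss_line(line):
    """Parse one loss-log line into a dict (epoch/step as int, losses as float)."""
    fields = {name: float(value) for name, value in PAIR.findall(line)}
    fields["epoch"] = int(fields["epoch"])
    fields["step"] = int(fields["step"])
    return fields

# First sample line from the log excerpt above.
line = ("469 epoch: 1 step: 982 ,rpn_loss: 0.03940, rcnn_loss: 0.48169, "
        "rpn_cls_loss: 0.02910, rpn_reg_loss: 0.00344, rcnn_cls_loss: 0.41943, "
        "rcnn_reg_loss: 0.06223, total_loss: 0.52109")
rec = parse_loss_line(line)

# Sanity check observed in the sample lines: total = rpn + rcnn.
residual = abs(rec["rpn_loss"] + rec["rcnn_loss"] - rec["total_loss"])
print(rec["epoch"], rec["step"], rec["total_loss"], residual < 1e-6)
```
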
## [Evaluation process](#contents)

### Usage

You can start evaluation using python or shell scripts. The usage of the shell script is as follows:

- Ascend:

```bash
sh run_eval_ascend.sh [IMGS_PATH] [ANNOS_PATH] [CHECKPOINT_PATH] [COCO_TEXT_PARSER_PATH] [DEVICE_ID]
```

### Launch

```bash
# eval example
# shell:
# Ascend:
sh run_eval_ascend.sh [IMGS_PATH] [ANNOS_PATH] [CHECKPOINT_PATH] [COCO_TEXT_PARSER_PATH] [DEVICE_ID]
```

> The checkpoint is produced during training.

### Result

Evaluation results are stored in the example path. You can find results like the following in `log`.

```python
========================================
class 1 precision is 88.01%, recall is 82.77%
```

## Model Export

```shell
python export.py --ckpt_file [CKPT_PATH] --device_target [DEVICE_TARGET] --file_format [EXPORT_FORMAT]
```

`EXPORT_FORMAT` should be one of ["AIR", "MINDIR"].

## Inference Process

### Usage

Before performing inference, the model file must be exported with the export script in the Ascend 910 environment. The command below takes the exported MINDIR file as `MINDIR_PATH`.

```shell
# Ascend310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [LABEL_PATH] [DEVICE_ID]
```

### Result

Inference results are saved in the current path. You can find results like the following in the `acc.log` file.

```python
========================================
class 1 precision is 84.24%, recall is 87.40%, F1 is 85.79%
```

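The F1 value in this log is consistent with the standard harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below reproduces it; `f1_score` is a helper written for this example, not a function from the repository. Applying the same formula to the evaluation log shown earlier (precision 88.01%, recall 82.77%), which does not print F1, gives roughly 85.31%.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (inputs and output in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Ascend 310 inference log: precision 84.24%, recall 87.40%.
infer_f1 = f1_score(84.24, 87.40)
# Evaluation log: precision 88.01%, recall 82.77% (F1 derived here, not logged).
eval_f1 = f1_score(88.01, 82.77)

print(f"{infer_f1:.2f}")  # 85.79
print(f"{eval_f1:.2f}")   # 85.31
```
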
# [Model description](#contents)

## [Performance](#contents)

### Training Performance

| Parameters | Ascend |
| -------------------------- | ------------------------------------------------------------ |
| Model Version | Deeptext |
| Resource | Ascend 910; CPU 2.60GHz, 192 cores; memory 755G; OS Euler2.8 |
| Uploaded Date | 12/26/2020 |
| MindSpore Version | 1.1.0 |
| Dataset | 66040 images |
| Batch_size | 2 |
| Training Parameters | src/config.py |
| Optimizer | Momentum |
| Loss Function | SoftmaxCrossEntropyWithLogits for classification, SmoothL2Loss for bbox regression |
| Loss | ~0.008 |
| Total time (8p) | 4h |
| Scripts | [deeptext script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/deeptext) |

### Inference Performance

| Parameters | Ascend |
| ------------------- | --------------------------- |
| Model Version | Deeptext |
| Resource | Ascend 910; CPU 2.60GHz, 192 cores; memory 755G; OS Euler2.8 |
| Uploaded Date | 12/26/2020 |
| MindSpore Version | 1.1.0 |
| Dataset | 229 images |
| Batch_size | 2 |
| Accuracy | precision=0.8801, recall=0.8277 |
| Total time | 1 min |
| Model for inference | 3492M (.ckpt file) |

### Training performance results

| **Ascend** | train performance |
| :--------: | :---------------: |
| 1p | 14 img/s |
| 8p | 50 img/s |

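From these two throughput figures, the multi-device speedup and per-device scaling efficiency can be worked out directly. This is a small illustrative calculation; `scaling_efficiency` is a name introduced here, and the result describes only the two numbers in the table, not a general property of the model.

```python
def scaling_efficiency(single_throughput, multi_throughput, num_devices):
    """Return (speedup, efficiency) of a multi-device run relative to one device."""
    speedup = multi_throughput / single_throughput
    return speedup, speedup / num_devices

# 1p: 14 img/s, 8p: 50 img/s (values from the table above).
speedup, efficiency = scaling_efficiency(14.0, 50.0, 8)
print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.1%}")
# speedup: 3.57x, efficiency: 44.6%
```
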
# [Description of Random Situation](#contents)

We set the seed to 1 in train.py.

# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).