# Contents

- [Face Detection Description](#face-detection-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Running Example](#running-example)
- [Model Description](#model-description)
    - [Performance](#performance)
- [ModelZoo Homepage](#modelzoo-homepage)

# [Face Detection Description](#contents)

This is a face detection network based on YOLOv3, with support for training and evaluation on Ascend 910.

You only look once (YOLO) is a real-time object detection system. YOLOv3 is both fast and accurate.

Prior detection systems repurpose classifiers or localizers to perform detection. They apply the model to an image at multiple locations and scales, and high-scoring regions of the image are considered detections.

YOLOv3 uses a different approach. It applies a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities.

[Paper](https://pjreddie.com/media/files/papers/YOLOv3.pdf): YOLOv3: An Incremental Improvement. Joseph Redmon, Ali Farhadi, University of Washington

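The weighting of boxes by predicted probabilities can be illustrated with a minimal sketch. The `Prediction` structure, the `score_predictions` helper, and the 0.5 threshold are hypothetical and chosen for illustration only; they are not taken from this repository's post-processing code.

```python
from typing import List, NamedTuple, Tuple


class Prediction(NamedTuple):
    """One raw box prediction (hypothetical structure, for illustration)."""
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
    objectness: float               # probability that the box contains an object
    class_prob: float               # probability of the "face" class given an object


def score_predictions(preds: List[Prediction], threshold: float = 0.5):
    """Weight each box by its predicted probabilities and keep confident ones."""
    scored = [(p.box, p.objectness * p.class_prob) for p in preds]
    kept = [(box, score) for box, score in scored if score >= threshold]
    # Highest-scoring detections first.
    return sorted(kept, key=lambda item: item[1], reverse=True)


preds = [
    Prediction((10, 10, 60, 60), objectness=0.9, class_prob=1.0),
    Prediction((200, 80, 230, 120), objectness=0.5, class_prob=0.5),
    Prediction((300, 40, 380, 130), objectness=0.75, class_prob=1.0),
]
detections = score_predictions(preds, threshold=0.5)
print(detections)
```

In a full pipeline, a step such as non-maximum suppression would usually follow to merge overlapping boxes; it is omitted here to keep the sketch short.
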
# [Model Architecture](#contents)

Face Detection uses a modified DarkNet53 network for feature extraction. It has 45 convolutional layers.

# [Dataset](#contents)

This example uses about 13K images for training and 3K images for evaluation. You can also use your own dataset or an open-source dataset (e.g. WiderFace).

- step 1: The dataset should follow the Pascal VOC data format for object detection. Because the input shape of the network is small, faces smaller than 50*50 pixels at 1080P are removed from the evaluation dataset. The directory structure is as follows:

```text
.
└─ dataset
  ├─ Annotations
    ├─ img1.xml
    ├─ img2.xml
    ├─ ...
  ├─ JPEGImages
    ├─ img1.jpg
    ├─ img2.jpg
    ├─ ...
  └─ ImageSets
    └─ Main
      └─ train.txt or test.txt
```

- step 2: Convert the dataset to mindrecord:

```bash
python data_to_mindrecord_train.py
```

or

```bash
python data_to_mindrecord_eval.py
```

If your dataset is too large to convert in one pass, you can append data to an existing mindrecord file in batches:

```shell
python data_to_mindrecord_train_append.py
```

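As an illustration of the Pascal VOC annotation format from step 1 and of the small-face filter, here is a minimal sketch using only the Python standard library. The `load_faces` helper and the inlined sample annotation are hypothetical. The sketch assumes that box sizes are rescaled to a 1080-pixel-high frame and that a face is dropped when either side falls below 50 pixels; the repository's conversion scripts may apply the rule differently.

```python
import xml.etree.ElementTree as ET

# A tiny Pascal VOC annotation, inlined so the example is self-contained.
SAMPLE_XML = """
<annotation>
  <filename>img1.jpg</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>face</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>220</xmax><ymax>340</ymax></bndbox>
  </object>
  <object>
    <name>face</name>
    <bndbox><xmin>900</xmin><ymin>500</ymin><xmax>930</xmax><ymax>540</ymax></bndbox>
  </object>
</annotation>
"""


def load_faces(xml_text, min_side=50, ref_height=1080):
    """Return (xmin, ymin, xmax, ymax) boxes that pass the size filter."""
    root = ET.fromstring(xml_text.strip())
    img_height = int(root.find("size/height").text)
    scale = ref_height / img_height  # rescale box sizes to a 1080P frame
    faces = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(bndbox.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        width = (xmax - xmin) * scale
        height = (ymax - ymin) * scale
        # Assumption: drop the face if either side is below the minimum.
        if width >= min_side and height >= min_side:
            faces.append((xmin, ymin, xmax, ymax))
    return faces


faces = load_faces(SAMPLE_XML)
print(faces)
```

Here the 120*140 face is kept and the 30*40 face is filtered out.
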
# [Environment Requirements](#contents)

- Hardware (Ascend)
    - Prepare a hardware environment with an Ascend processor.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
    - [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)

# [Script Description](#contents)

## [Script and Sample Code](#contents)

The code is organized as follows:

```text
.
└─ Face Detection
  ├─ README.md
  ├─ scripts
    ├─ run_standalone_train.sh              # launch standalone training (1p) on Ascend
    ├─ run_distribute_train.sh              # launch distributed training (8p) on Ascend
    ├─ run_eval.sh                          # launch evaluation on Ascend
    └─ run_export.sh                        # launch export of the AIR model
  ├─ src
    ├─ FaceDetection
      ├─ voc_wrapper.py                     # get detection results
      ├─ yolo_loss.py                       # loss function
      ├─ yolo_postprocess.py                # post-processing
      └─ yolov3.py                          # network
    ├─ config.py                            # parameter configuration
    ├─ data_preprocess.py                   # preprocessing
    ├─ logging.py                           # log function
    ├─ lrsche_factory.py                    # generate learning rate
    ├─ network_define.py                    # network definition
    ├─ transforms.py                        # data transforms
    ├─ data_to_mindrecord_train.py          # convert dataset to mindrecord for training
    ├─ data_to_mindrecord_train_append.py   # append data to an existing mindrecord for training
    └─ data_to_mindrecord_eval.py           # convert dataset to mindrecord for evaluation
  ├─ train.py                               # training script
  ├─ eval.py                                # evaluation script
  └─ export.py                              # export AIR model
```

## [Running Example](#contents)

### Train

- Standalone mode

```bash
cd ./scripts
sh run_standalone_train.sh [MINDRECORD_FILE] [USE_DEVICE_ID]
```

or (fine-tune)

```bash
cd ./scripts
sh run_standalone_train.sh [MINDRECORD_FILE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
```

for example:

```bash
cd ./scripts
sh run_standalone_train.sh /home/train.mindrecord 0 /home/a.ckpt
```

- Distributed mode (recommended)

```bash
cd ./scripts
sh run_distribute_train.sh [MINDRECORD_FILE] [RANK_TABLE]
```

or (fine-tune)

```bash
cd ./scripts
sh run_distribute_train.sh [MINDRECORD_FILE] [RANK_TABLE] [PRETRAINED_BACKBONE]
```

for example:

```bash
cd ./scripts
sh run_distribute_train.sh /home/train.mindrecord ./rank_table_8p.json /home/a.ckpt
```

The loss value of each step is written to "./output/[TIME]/[TIME].log" or "./scripts/device0/train.log" as follows:

```text
rank[0], iter[0], loss[318555.8], overflow:False, loss_scale:1024.0, lr:6.24999984211172e-06, batch_images:(64, 3, 448, 768), batch_labels:(64, 200, 6)
rank[0], iter[1], loss[95394.28], overflow:True, loss_scale:1024.0, lr:6.24999984211172e-06, batch_images:(64, 3, 448, 768), batch_labels:(64, 200, 6)
rank[0], iter[2], loss[81332.92], overflow:True, loss_scale:512.0, lr:6.24999984211172e-06, batch_images:(64, 3, 448, 768), batch_labels:(64, 200, 6)
rank[0], iter[3], loss[27250.805], overflow:True, loss_scale:256.0, lr:6.24999984211172e-06, batch_images:(64, 3, 448, 768), batch_labels:(64, 200, 6)
...
rank[0], iter[62496], loss[2218.6282], overflow:False, loss_scale:256.0, lr:6.24999984211172e-06, batch_images:(64, 3, 448, 768), batch_labels:(64, 200, 6)
rank[0], iter[62497], loss[3788.5146], overflow:False, loss_scale:256.0, lr:6.24999984211172e-06, batch_images:(64, 3, 448, 768), batch_labels:(64, 200, 6)
rank[0], iter[62498], loss[3427.5479], overflow:False, loss_scale:256.0, lr:6.24999984211172e-06, batch_images:(64, 3, 448, 768), batch_labels:(64, 200, 6)
rank[0], iter[62499], loss[4294.194], overflow:False, loss_scale:256.0, lr:6.24999984211172e-06, batch_images:(64, 3, 448, 768), batch_labels:(64, 200, 6)
```

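To plot or monitor the loss, log lines in the format above can be parsed with a regular expression. The following is a minimal sketch; `LOG_PATTERN` and `parse_log_line` are hypothetical names, and the pattern assumes the exact line format shown above (for example, it would not match a loss printed as `nan` or `inf`).

```python
import re

# Matches the leading fields of one training log line.
LOG_PATTERN = re.compile(
    r"rank\[(?P<rank>\d+)\], iter\[(?P<iter>\d+)\], loss\[(?P<loss>[\d.]+)\], "
    r"overflow:(?P<overflow>True|False), loss_scale:(?P<loss_scale>[\d.]+), "
    r"lr:(?P<lr>[\d.eE+-]+)"
)


def parse_log_line(line):
    """Return a dict of typed fields, or None if the line does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "rank": int(match["rank"]),
        "iter": int(match["iter"]),
        "loss": float(match["loss"]),
        "overflow": match["overflow"] == "True",
        "loss_scale": float(match["loss_scale"]),
        "lr": float(match["lr"]),
    }


sample_lines = [
    "rank[0], iter[2], loss[81332.92], overflow:True, loss_scale:512.0, "
    "lr:6.24999984211172e-06, batch_images:(64, 3, 448, 768), batch_labels:(64, 200, 6)",
    "...",
    "rank[0], iter[62499], loss[4294.194], overflow:False, loss_scale:256.0, "
    "lr:6.24999984211172e-06, batch_images:(64, 3, 448, 768), batch_labels:(64, 200, 6)",
]
records = [rec for rec in map(parse_log_line, sample_lines) if rec is not None]
for rec in records:
    print(rec["iter"], rec["loss"], rec["overflow"], rec["loss_scale"])
```
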
### Evaluation

```bash
cd ./scripts
sh run_eval.sh [MINDRECORD_FILE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
```

for example:

```bash
cd ./scripts
sh run_eval.sh /home/eval.mindrecord 0 /home/a.ckpt
```

The result is written to "./scripts/device0/eval.log" as follows:

```text
calculate [recall | persicion | ap]...
Saving ../../results/0-2441_61000/.._.._results_0-2441_61000_face_AP_0.760.png
```

The detection results and the P-R curve are also saved in "./results/[MODEL_NAME]/".

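For readers unfamiliar with how recall, precision, and AP relate, the sketch below computes AP as the area under the precision-recall curve, using the all-point interpolation commonly associated with Pascal VOC. This is one common definition and an assumption here; the computation in `voc_wrapper.py` may differ. The `average_precision` function is hypothetical, and matching detections to ground-truth boxes (for example by IoU) is assumed to have been done already.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve with all-point interpolation."""
    # Process detections from the most to the least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Make precision non-increasing by taking a running maximum from the right.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the rectangles under the interpolated curve.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Four detections, three of them correct, against four ground-truth faces.
ap = average_precision(
    scores=[0.9, 0.8, 0.7, 0.6],
    is_true_positive=[True, False, True, True],
    num_ground_truth=4,
)
print(f"AP = {ap:.3f}")
```
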
### Convert model

To run inference on Ascend 310, convert the model to AIR:

```bash
cd ./scripts
sh run_export.sh [BATCH_SIZE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
```

# [Model Description](#contents)

## [Performance](#contents)

### Training Performance

| Parameters                 | Face Detection                                               |
| -------------------------- | ------------------------------------------------------------ |
| Model Version              | V1                                                           |
| Resource                   | Ascend 910; CPU 2.60GHz, 192 cores; Memory 755G; OS Euler2.8 |
| Uploaded Date              | 09/30/2020 (month/day/year)                                  |
| MindSpore Version          | 1.0.0                                                        |
| Dataset                    | 13K images                                                   |
| Training Parameters        | epoch=2500, batch_size=64, momentum=0.5                      |
| Optimizer                  | Momentum                                                     |
| Loss Function              | Softmax Cross Entropy, Sigmoid Cross Entropy, SmoothL1Loss   |
| Outputs                    | boxes and label                                              |
| Speed                      | 1pc: 800~850 ms/step; 8pcs: 1000~1150 ms/step                |
| Total time                 | 1pc: 120 hours; 8pcs: 18 hours                               |
| Checkpoint for Fine tuning | 37M (.ckpt file)                                             |

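The speed and total-time rows can be cross-checked with a little arithmetic. The sketch below assumes 12,800 training images (the table only says about 13K), a batch size of 64 per device, and the midpoint of each reported step-time range. These inputs are assumptions for illustration. Under them, the 8-device run takes 62,500 steps, which matches the last iteration index (62499) in the training log above.

```python
EPOCHS = 2500
BATCH_SIZE = 64      # per device, as reported in the table
NUM_IMAGES = 12800   # assumption: the table only states "13K images"


def training_estimate(num_devices, ms_per_step):
    """Return (total_steps, hours) for a full training run."""
    steps_per_epoch = NUM_IMAGES // (BATCH_SIZE * num_devices)
    total_steps = steps_per_epoch * EPOCHS
    hours = total_steps * ms_per_step / 1000 / 3600
    return total_steps, hours


steps_1p, hours_1p = training_estimate(num_devices=1, ms_per_step=825)   # midpoint of 800~850
steps_8p, hours_8p = training_estimate(num_devices=8, ms_per_step=1075)  # midpoint of 1000~1150
print(f"1pc:  {steps_1p} steps, about {hours_1p:.1f} hours")
print(f"8pcs: {steps_8p} steps, about {hours_8p:.1f} hours")
```

Both estimates land close to the reported totals of 120 and 18 hours; the remaining difference is plausibly due to data loading, checkpointing, and rounding of the dataset size.
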
### Evaluation Performance

| Parameters          | Face Detection              |
| ------------------- | --------------------------- |
| Model Version       | V1                          |
| Resource            | Ascend 910; OS Euler2.8     |
| Uploaded Date       | 09/30/2020 (month/day/year) |
| MindSpore Version   | 1.0.0                       |
| Dataset             | 3K images                   |
| batch_size          | 1                           |
| Outputs             | mAP                         |
| Accuracy            | 8pcs: 76.0%                 |
| Model for inference | 37M (.ckpt file)            |

# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).