# Contents

- [Face Quality Assessment Description](#face-quality-assessment-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Running Example](#running-example)
- [Model Description](#model-description)
    - [Performance](#performance)
- [ModelZoo Homepage](#modelzoo-homepage)
# [Face Quality Assessment Description](#contents)

This is a Face Quality Assessment network based on ResNet12, with support for training and evaluation on Ascend 910.

ResNet (residual neural network) was proposed by Kaiming He and three other researchers at Microsoft Research. Using residual units, they successfully trained a 152-layer neural network and won the ILSVRC 2015 championship with a top-5 error rate of 3.57%, while using fewer parameters than VGGNet. Traditional convolutional or fully connected networks lose some information as it passes through the layers, and they also suffer from vanishing or exploding gradients, which makes very deep networks hard to train. ResNet alleviates these problems to a certain extent: by passing the input directly to the output, it preserves the integrity of the information, so the network only needs to learn the residual between input and output, which simplifies the learning objective. The residual structure greatly accelerates the training of deep neural networks and improves model accuracy, and residual units are so widely applicable that they can even be used directly in Inception-style networks.

[Paper](https://arxiv.org/pdf/1512.03385.pdf): Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. "Deep Residual Learning for Image Recognition"
# [Model Architecture](#contents)

Face Quality Assessment uses a modified ResNet12 network for feature extraction.

# [Dataset](#contents)

This network estimates the Euler angles of a human head and the positions of 5 facial keypoints.

In this example we use about 122K face images as the training dataset and 2K as the evaluation dataset. You can also use your own datasets or open-source datasets (e.g. 300W-LP for training, AFLW2000 for evaluation).
- step 1: The training dataset should be saved in a txt file with the following contents (a parsing sketch follows the block below):
```text
[PATH_TO_IMAGE]/1.jpg [YAW] [PITCH] [ROLL] [LEFT_EYE_CENTER_X] [LEFT_EYE_CENTER_Y] [RIGHT_EYE_CENTER_X] [RIGHT_EYE_CENTER_Y] [NOSE_TIP_X] [NOSE_TIP_Y] [MOUTH_LEFT_CORNER_X] [MOUTH_LEFT_CORNER_Y] [MOUTH_RIGHT_CORNER_X] [MOUTH_RIGHT_CORNER_Y]
[PATH_TO_IMAGE]/2.jpg [YAW] [PITCH] [ROLL] [LEFT_EYE_CENTER_X] [LEFT_EYE_CENTER_Y] [RIGHT_EYE_CENTER_X] [RIGHT_EYE_CENTER_Y] [NOSE_TIP_X] [NOSE_TIP_Y] [MOUTH_LEFT_CORNER_X] [MOUTH_LEFT_CORNER_Y] [MOUTH_RIGHT_CORNER_X] [MOUTH_RIGHT_CORNER_Y]
[PATH_TO_IMAGE]/3.jpg [YAW] [PITCH] [ROLL] [LEFT_EYE_CENTER_X] [LEFT_EYE_CENTER_Y] [RIGHT_EYE_CENTER_X] [RIGHT_EYE_CENTER_Y] [NOSE_TIP_X] [NOSE_TIP_Y] [MOUTH_LEFT_CORNER_X] [MOUTH_LEFT_CORNER_Y] [MOUTH_RIGHT_CORNER_X] [MOUTH_RIGHT_CORNER_Y]
...
e.g. /home/train/1.jpg -33.073415 -9.533774 -9.285695 229.802368 257.432800 289.186188 262.831543 271.241638 301.224426 218.571747 322.097321 277.498291 328.260376
The label fields are separated by '\t'.
Set -1 when a keypoint is not visible.
```
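For concreteness, here is a minimal sketch of how one such label line could be parsed. The field layout follows the format above; the function name and the dict keys are illustrative, not part of this repository (the actual loading code is in src/dataset.py):

```python
def parse_label_line(line):
    """Parse one training label line into image path, Euler angles and keypoints."""
    # Fields are '\t'-separated per the note above; split() also tolerates spaces.
    fields = line.split()
    path, values = fields[0], [float(v) for v in fields[1:]]
    assert len(values) == 13, "expected 3 Euler angles + 5 keypoints x 2 coordinates"
    return {
        "image": path,
        "eulers": values[0:3],                               # yaw, pitch, roll
        "keypoints": list(zip(values[3::2], values[4::2])),  # five (x, y) pairs; -1 marks invisible
    }
```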
- step 2: The directory structure of the evaluation dataset is as follows (a loading sketch follows the two blocks below):
```text
├─ dataset
   ├─ img1.jpg
   ├─ img1.txt
   ├─ img2.jpg
   ├─ img2.txt
   ├─ img3.jpg
   ├─ img3.txt
   ├─ ...
```
The txt file contains the following contents:

```text
[YAW] [PITCH] [ROLL] [LEFT_EYE_CENTER_X] [LEFT_EYE_CENTER_Y] [RIGHT_EYE_CENTER_X] [RIGHT_EYE_CENTER_Y] [NOSE_TIP_X] [NOSE_TIP_Y] [MOUTH_LEFT_CORNER_X] [MOUTH_LEFT_CORNER_Y] [MOUTH_RIGHT_CORNER_X] [MOUTH_RIGHT_CORNER_Y]
The label fields are separated by ' '.
Set -1 when a keypoint is not visible.
```
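A minimal sketch, under the layout above, of how the image/label pairs could be enumerated (the helper name is illustrative; the repository's actual evaluation loading lives in eval.py):

```python
import os

def list_eval_pairs(eval_dir):
    """Yield (image_path, label_values) pairs from the flat evaluation directory."""
    for name in sorted(os.listdir(eval_dir)):
        if not name.endswith(".jpg"):
            continue
        label_path = os.path.join(eval_dir, name[:-len(".jpg")] + ".txt")
        with open(label_path) as f:
            # 3 Euler angles + 10 keypoint coordinates, space-separated; -1 marks invisible.
            values = [float(v) for v in f.read().split()]
        yield os.path.join(eval_dir, name), values
```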
# [Environment Requirements](#contents)

- Hardware (Ascend)
    - Prepare hardware environment with an Ascend processor.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
    - [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Script Description](#contents)

## [Script and Sample Code](#contents)

The entire code structure is as follows:
```text
.
└─ Face Quality Assessment
  ├─ README.md
  ├─ scripts
  │  ├─ run_standalone_train.sh    # launch standalone training (1p) on Ascend
  │  ├─ run_distribute_train.sh    # launch distributed training (8p) on Ascend
  │  ├─ run_eval.sh                # launch evaluation on Ascend
  │  └─ run_export.sh              # launch exporting an AIR model
  ├─ src
  │  ├─ config.py                  # parameter configuration
  │  ├─ dataset.py                 # dataset loading and preprocessing for training
  │  ├─ face_qa.py                 # network backbone
  │  ├─ log.py                     # logging function
  │  ├─ loss_factory.py            # loss functions
  │  └─ lr_generator.py            # learning rate generation
  ├─ train.py                      # training script
  ├─ eval.py                       # evaluation script
  └─ export.py                     # export AIR model
```
## [Running Example](#contents)

### Train

- Standalone mode
```bash
cd ./scripts
sh run_standalone_train.sh [TRAIN_LABEL_FILE] [USE_DEVICE_ID]
```

or (fine-tune)

```bash
cd ./scripts
sh run_standalone_train.sh [TRAIN_LABEL_FILE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
```

for example:

```bash
cd ./scripts
sh run_standalone_train.sh /home/train.txt 0 /home/a.ckpt
```
- Distributed mode (recommended)
```bash
cd ./scripts
sh run_distribute_train.sh [TRAIN_LABEL_FILE] [RANK_TABLE]
```

or (fine-tune)

```bash
cd ./scripts
sh run_distribute_train.sh [TRAIN_LABEL_FILE] [RANK_TABLE] [PRETRAINED_BACKBONE]
```

for example:

```bash
cd ./scripts
sh run_distribute_train.sh /home/train.txt ./rank_table_8p.json /home/a.ckpt
```
You will get the loss value of each step as follows in "./output/[TIME]/[TIME].log" or "./scripts/device0/train.log":
```text
epoch[0], iter[0], loss:39.206444, 5.31 imgs/sec
epoch[0], iter[10], loss:38.200620, 10423.44 imgs/sec
epoch[0], iter[20], loss:31.253260, 13555.87 imgs/sec
epoch[0], iter[30], loss:26.349678, 8762.34 imgs/sec
epoch[0], iter[40], loss:23.469613, 7848.85 imgs/sec
...
epoch[39], iter[19080], loss:1.881406, 7620.63 imgs/sec
epoch[39], iter[19090], loss:2.091236, 7601.15 imgs/sec
epoch[39], iter[19100], loss:2.140766, 8088.52 imgs/sec
epoch[39], iter[19110], loss:2.111101, 8791.05 imgs/sec
```
### Evaluation

```bash
cd ./scripts
sh run_eval.sh [EVAL_DIR] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
```

for example:

```bash
cd ./scripts
sh run_eval.sh /home/eval/ 0 /home/a.ckpt
```
You will get the result as follows in "./scripts/device0/eval.log" or in a txt file in [PRETRAINED_BACKBONE]'s folder:
```text
5 keypoints average err:['4.069', '3.439', '4.001', '3.206', '3.413']
3 eulers average err:['21.667', '15.627', '16.770']
IPN of 5 keypoints:19.57019303768714
MAE of elur:18.021210976971098
```
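For orientation, here is a rough sketch of how these two metrics are commonly defined. This is an assumption about the convention (mean keypoint error normalized by the inter-pupil distance for IPN, mean absolute error over the three Euler angles for MAE), not code from this repository:

```python
import numpy as np

def ipn_error(pred_kpts, true_kpts):
    """Assumed convention: mean keypoint error normalized by inter-pupil distance (percent)."""
    pred, true = np.asarray(pred_kpts), np.asarray(true_kpts)  # shape (5, 2)
    inter_pupil = np.linalg.norm(true[0] - true[1])  # left-eye to right-eye distance
    per_point = np.linalg.norm(pred - true, axis=1)  # Euclidean error per keypoint
    return float(per_point.mean() / inter_pupil * 100)

def euler_mae(pred_eulers, true_eulers):
    """Mean absolute error over yaw, pitch, roll (degrees)."""
    return float(np.abs(np.asarray(pred_eulers) - np.asarray(true_eulers)).mean())
```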
### Convert model

If you want to run inference with the network on Ascend 310, convert the model to an AIR model:

```bash
cd ./scripts
sh run_export.sh [BATCH_SIZE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
```
# [Model Description](#contents)

## [Performance](#contents)

### Training Performance

| Parameters                 | Face Quality Assessment                                        |
| -------------------------- | -------------------------------------------------------------- |
| Model Version              | V1                                                             |
| Resource                   | Ascend 910; CPU 2.60GHz, 192 cores; Memory 755 GB; OS Euler2.8 |
| Uploaded Date              | 09/30/2020 (month/day/year)                                    |
| MindSpore Version          | 1.0.0                                                          |
| Dataset                    | 122K images                                                    |
| Training Parameters        | epoch=40, batch_size=32, momentum=0.9, lr=0.02                 |
| Optimizer                  | Momentum                                                       |
| Loss Function              | MSELoss, Softmax Cross Entropy                                 |
| Outputs                    | probability and keypoints                                      |
| Speed                      | 1pc: 200-240 ms/step; 8pcs: 35-40 ms/step                      |
| Total time                 | 1pc: 2.5 hours; 8pcs: 0.5 hours                                |
| Checkpoint for Fine tuning | 16 MB (.ckpt file)                                             |
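As a reading aid, here is a minimal sketch of how the optimizer and losses reported above could be wired up in MindSpore. The tiny Dense net is a hypothetical stand-in for the modified ResNet12 backbone; the actual setup lives in src/config.py, src/loss_factory.py and train.py:

```python
from mindspore import nn

# Stand-in for the modified ResNet12 backbone (src/face_qa.py).
net = nn.Dense(128, 10)

# Hyper-parameters as reported in the table above: momentum=0.9, lr=0.02.
optimizer = nn.Momentum(net.trainable_params(), learning_rate=0.02, momentum=0.9)

# The two reported loss functions: MSE for keypoint regression,
# softmax cross entropy for the Euler-angle outputs.
mse_loss = nn.MSELoss()
ce_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
```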
### Evaluation Performance

| Parameters          | Face Quality Assessment     |
| ------------------- | --------------------------- |
| Model Version       | V1                          |
| Resource            | Ascend 910; OS Euler2.8     |
| Uploaded Date       | 09/30/2020 (month/day/year) |
| MindSpore Version   | 1.0.0                       |
| Dataset             | 2K images                   |
| batch_size          | 1                           |
| Outputs             | IPN, MAE                    |
| Accuracy (8pcs)     | IPN of 5 keypoints: 19.5    |
|                     | MAE of Euler angles: 18.02  |
| Model for inference | 16 MB (.ckpt file)          |
# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).