# Contents

- [Face Recognition Description](#face-recognition-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Running Example](#running-example)
- [Model Description](#model-description)
    - [Performance](#performance)
- [ModelZoo Homepage](#modelzoo-homepage)
# [Face Recognition Description](#contents)

This is a face recognition network based on ResNet, with support for training and evaluation on Ascend 910.

ResNet (residual neural network) was proposed by Kaiming He and colleagues at Microsoft Research. Using residual units, they successfully trained a 152-layer network and won the ILSVRC 2015 classification competition with a top-5 error rate of 3.57%, while using fewer parameters than VGGNet. Traditional convolutional or fully connected networks lose information as depth grows and suffer from vanishing or exploding gradients, which makes very deep networks hard to train. ResNet alleviates these problems: shortcut connections pass the input directly to the output, preserving the information, so the network only needs to learn the residual between input and output. This simplifies the learning objective, speeds up training considerably, and markedly improves accuracy. Residual connections have also proven broadly useful and have been incorporated into other architectures, such as Inception-ResNet.
[Paper](https://arxiv.org/pdf/1512.03385.pdf): Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. "Deep Residual Learning for Image Recognition"
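The core idea described above, where the branch learns only the residual F(x) and an identity shortcut adds the input back, can be sketched in a few lines of NumPy. This is a toy illustration only, not the repository's actual ResNet implementation (which lives in `src/backbone/resnet.py`):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def residual_block(x, weight):
    # The branch learns only the residual mapping F(x); the identity
    # shortcut adds the input back, so the block outputs relu(F(x) + x).
    fx = relu(x @ weight)      # F(x): the learned "difference"
    return relu(fx + x)        # shortcut connection preserves the input

# With a zero weight the branch contributes nothing and the block reduces
# to the identity (after ReLU): information passes through intact.
x = np.array([[1.0, -2.0, 3.0]])
print(residual_block(x, np.zeros((3, 3))))   # -> [[1. 0. 3.]]
```

This degradation-to-identity behavior is exactly why very deep residual networks remain trainable: a block that has nothing useful to learn can cheaply approximate the identity instead of a full transformation.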
# [Model Architecture](#contents)

Face Recognition uses a ResNet network for feature extraction; for details, see the [paper](https://arxiv.org/pdf/1512.03385.pdf).
# [Dataset](#contents)

In this example we use about 4.7 million face images for training and 1.1 million for evaluation. You can also use your own dataset or an open-source one (e.g. face_emore).

The directory structure is as follows:
```text
.
└─ dataset
    ├─ train dataset
    │   ├─ ID1
    │   │   ├─ ID1_0001.jpg
    │   │   ├─ ID1_0002.jpg
    │   │   ...
    │   ├─ ID2
    │   │   ...
    │   ├─ ID3
    │   │   ...
    │   ...
    └─ test dataset
        ├─ ID1
        │   ├─ ID1_0001.jpg
        │   ├─ ID1_0002.jpg
        │   ...
        ├─ ID2
        │   ...
        ├─ ID3
        │   ...
        ...
```
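A loader for this layout only needs to treat each `ID*` directory as one class. The following is a hypothetical helper sketched for illustration; the repository's real loading logic lives in `src/dataset_factory.py` and `src/custom_dataset.py`:

```python
import os

def list_samples(root):
    # Each ID* directory is one identity; its index in sorted order
    # becomes the integer class label for every image inside it.
    samples = []
    for label, person in enumerate(sorted(os.listdir(root))):
        person_dir = os.path.join(root, person)
        if not os.path.isdir(person_dir):
            continue
        for name in sorted(os.listdir(person_dir)):
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                samples.append((os.path.join(person_dir, name), label))
    return samples
```

Calling `list_samples("dataset/train dataset")` on the tree above would yield pairs like `("dataset/train dataset/ID1/ID1_0001.jpg", 0)`.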
# [Environment Requirements](#contents)

- Hardware (Ascend)
    - Prepare a hardware environment with Ascend processors.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Script Description](#contents)

## [Script and Sample Code](#contents)

The entire code structure is as follows:

```text
└─ face_recognition
    ├─ README.md                          // descriptions about face_recognition
    ├─ scripts
    │   ├─ run_distribute_train_base.sh   // shell script for distributed training of the base model on Ascend
    │   ├─ run_distribute_train_beta.sh   // shell script for distributed training of the beta model on Ascend
    │   ├─ run_eval.sh                    // shell script for evaluation on Ascend
    │   ├─ run_export.sh                  // shell script for exporting an AIR model
    │   ├─ run_standalone_train_base.sh   // shell script for standalone training of the base model on Ascend
    │   └─ run_standalone_train_beta.sh   // shell script for standalone training of the beta model on Ascend
    ├─ src
    │   ├─ backbone
    │   │   ├─ head.py                    // head unit
    │   │   └─ resnet.py                  // resnet architecture
    │   ├─ callback_factory.py            // callback logging
    │   ├─ config.py                      // parameter configuration
    │   ├─ custom_dataset.py              // custom dataset and sampler
    │   ├─ custom_net.py                  // custom cell definitions
    │   ├─ dataset_factory.py             // dataset creation
    │   ├─ init_network.py                // network parameter initialization
    │   ├─ my_logging.py                  // logging format settings
    │   ├─ loss_factory.py                // loss calculation
    │   ├─ lrsche_factory.py              // learning rate schedule
    │   ├─ me_init.py                     // network parameter init method
    │   └─ metric_factory.py              // metric fc layer
    ├─ train.py                           // training script
    ├─ eval.py                            // evaluation script
    └─ export.py                          // export AIR model
```
## [Running Example](#contents)

### Train

- Standalone mode

    - base model

      ```bash
      cd ./scripts
      sh run_standalone_train_base.sh [USE_DEVICE_ID]
      ```

      for example:

      ```bash
      cd ./scripts
      sh run_standalone_train_base.sh 0
      ```

    - beta model

      ```bash
      cd ./scripts
      sh run_standalone_train_beta.sh [USE_DEVICE_ID]
      ```

      for example:

      ```bash
      cd ./scripts
      sh run_standalone_train_beta.sh 0
      ```

- Distribute mode (recommended)

    - base model

      ```bash
      cd ./scripts
      sh run_distribute_train_base.sh [RANK_TABLE]
      ```

      for example:

      ```bash
      cd ./scripts
      sh run_distribute_train_base.sh ./rank_table_8p.json
      ```

    - beta model

      ```bash
      cd ./scripts
      sh run_distribute_train_beta.sh [RANK_TABLE]
      ```

      for example:

      ```bash
      cd ./scripts
      sh run_distribute_train_beta.sh ./rank_table_8p.json
      ```
You will get the loss value of each epoch as follows in "./scripts/data_parallel_log_[DEVICE_ID]/outputs/logs/[TIME].log" or "./scripts/log_parallel_graph/face_recognition_[DEVICE_ID].log":

```text
epoch[0], iter[100], loss:(Tensor(shape=[], dtype=Float32, value= 50.2733), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 32768)), cur_lr:0.000660, mean_fps:743.09 imgs/sec
epoch[0], iter[200], loss:(Tensor(shape=[], dtype=Float32, value= 49.3693), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 32768)), cur_lr:0.001314, mean_fps:4426.42 imgs/sec
epoch[0], iter[300], loss:(Tensor(shape=[], dtype=Float32, value= 48.7081), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 16384)), cur_lr:0.001968, mean_fps:4428.09 imgs/sec
epoch[0], iter[400], loss:(Tensor(shape=[], dtype=Float32, value= 45.7791), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 16384)), cur_lr:0.002622, mean_fps:4428.17 imgs/sec
...
epoch[8], iter[27300], loss:(Tensor(shape=[], dtype=Float32, value= 2.13556), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 65536)), cur_lr:0.004000, mean_fps:4429.38 imgs/sec
epoch[8], iter[27400], loss:(Tensor(shape=[], dtype=Float32, value= 2.36922), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 65536)), cur_lr:0.004000, mean_fps:4429.88 imgs/sec
epoch[8], iter[27500], loss:(Tensor(shape=[], dtype=Float32, value= 2.08594), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 65536)), cur_lr:0.004000, mean_fps:4430.59 imgs/sec
epoch[8], iter[27600], loss:(Tensor(shape=[], dtype=Float32, value= 2.38706), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 65536)), cur_lr:0.004000, mean_fps:4430.37 imgs/sec
```
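Each log line packs three tensors into the loss tuple (the loss value, an overflow flag, and the loss scale) plus the current learning rate and throughput. To plot the loss curve you can pull out epoch, iteration, and loss with a small regex; this is a sketch based on the log format shown above:

```python
import re

# Matches the leading part of a training-log line up to the loss value.
LOG_LINE = re.compile(
    r"epoch\[(\d+)\], iter\[(\d+)\], "
    r"loss:\(Tensor\(shape=\[\], dtype=Float32, value= ([\d.]+)\)"
)

def parse_loss(line):
    """Return (epoch, iteration, loss) for a training-log line, else None."""
    m = LOG_LINE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2)), float(m.group(3))

line = ("epoch[0], iter[100], loss:(Tensor(shape=[], dtype=Float32, "
        "value= 50.2733), Tensor(shape=[], dtype=Bool, value= False), "
        "Tensor(shape=[], dtype=Float32, value= 32768)), "
        "cur_lr:0.000660, mean_fps:743.09 imgs/sec")
assert parse_loss(line) == (0, 100, 50.2733)
```

Applying `parse_loss` to every line of the `.log` file yields a ready-to-plot loss series.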
### Evaluation

```bash
cd ./scripts
sh run_eval.sh [USE_DEVICE_ID]
```

You will get the result as follows in "./scripts/log_inference/outputs/models/logs/[TIME].log":

```text
[test_dataset]: zj2jk=0.9495, jk2zj=0.9480, avg=0.9487
```
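The reported `avg` appears to be the plain mean of the two cross-matching accuracies (`zj2jk` and `jk2zj`), shown truncated to four decimals. This is an inference from the sample line above, not something the evaluation script documents:

```python
zj2jk, jk2zj = 0.9495, 0.9480   # values from the sample result line
avg = (zj2jk + jk2zj) / 2       # 0.94875, which truncates to 0.9487
assert abs(avg - 0.9487) < 1e-3
```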
### Convert model

If you want to infer the network on Ascend 310, you should convert the model to the AIR format:

```bash
cd ./scripts
sh run_export.sh [BATCH_SIZE] [USE_DEVICE_ID] [PRETRAINED_BACKBONE]
```

for example:

```bash
cd ./scripts
sh run_export.sh 16 0 ./0-1_1.ckpt
```
# [Model Description](#contents)

## [Performance](#contents)

### Training Performance

| Parameters                 | Face Recognition                                                 |
| -------------------------- | ---------------------------------------------------------------- |
| Model Version              | V1                                                               |
| Resource                   | Ascend 910; CPU 2.60 GHz, 192 cores; memory 755 GB; OS Euler 2.8 |
| Uploaded Date              | 09/30/2020 (month/day/year)                                      |
| MindSpore Version          | 1.0.0                                                            |
| Dataset                    | 4.7 million images                                               |
| Training Parameters        | epoch=100, batch_size=192, momentum=0.9                          |
| Optimizer                  | Momentum                                                         |
| Loss Function              | Cross Entropy                                                    |
| Outputs                    | probability                                                      |
| Speed                      | 1 pc: 350-600 fps; 8 pcs: 2500-4500 fps                          |
| Total time                 | 1 pc: N/A; 8 pcs: 10 hours                                       |
| Checkpoint for Fine tuning | 584 MB (.ckpt file)                                              |
### Evaluation Performance

| Parameters          | Face Recognition            |
| ------------------- | --------------------------- |
| Model Version       | V1                          |
| Resource            | Ascend 910; OS Euler 2.8    |
| Uploaded Date       | 09/30/2020 (month/day/year) |
| MindSpore Version   | 1.0.0                       |
| Dataset             | 1.1 million images          |
| batch_size          | 512                         |
| Outputs             | accuracy                    |
| ACC                 | 0.9                         |
| Model for inference | 584 MB (.ckpt file)         |
# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).