# Masked Face Recognition with Latent Part Detection

# Contents

- [Masked Face Recognition Description](#masked-face-recognition-description)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Script Description](#script-description)
- [Training](#training)
- [Evaluation](#evaluation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [Masked Face Recognition Description](#contents)

<p align="center">
<img src="./img/overview.png">
</p>

This is a **MindSpore** implementation of [Masked Face Recognition with Latent Part Detection (ACM MM20)](https://dl.acm.org/doi/10.1145/3394171.3413731) by *Feifei Ding, Peixi Peng, Yangru Huang, Mengyue Geng and Yonghong Tian*.
*Masked Face Recognition* aims to match masked faces with common (unmasked) faces and is especially important during the global outbreak of COVID-19. Identifying masked faces is challenging because most facial cues are occluded by the mask.

*Latent Part Detection* (LPD) is a differentiable module that locates the latent facial part that is robust to mask wearing; this latent part is then used to extract discriminative features. The LPD model is trained in an end-to-end manner and only uses the original and synthetic training data.
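The NumPy sketch below only illustrates the idea behind LPD with made-up names and parameters; the actual module is a differentiable spatial-transformer-style detector (see `src/model/stn.py` in the code structure below) trained end-to-end, not a hard crop:

```python
import numpy as np

def crop_latent_part(feature_map, cx, cy, h, w):
    """Illustrative hard crop of a latent part from a CHW feature map,
    given a predicted center (cx, cy) and size (h, w) in normalized
    [0, 1] coordinates. The real LPD module uses a differentiable
    spatial-transformer sampler instead of integer slicing."""
    C, H, W = feature_map.shape
    y0 = max(int(round((cy - h / 2) * H)), 0)
    x0 = max(int(round((cx - w / 2) * W)), 0)
    y1 = min(y0 + int(h * H), H)
    x1 = min(x0 + int(w * W), W)
    return feature_map[:, y0:y1, x0:x1]

def part_embedding(feature_map, part_params):
    # Global-average-pool the cropped part into a part-level descriptor.
    part = crop_latent_part(feature_map, *part_params)
    return part.mean(axis=(1, 2))

# Toy example: a 256x14x14 backbone feature map and a predicted part
# covering the upper face (eyes/forehead), the region least affected by a mask.
fmap = np.random.rand(256, 14, 14).astype(np.float32)
print(part_embedding(fmap, (0.5, 0.3, 0.5, 0.9)).shape)  # (256,)
```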
# [Dataset](#contents)

## Training Dataset

We use the [CASIA-WebFace Dataset](http://www.cbsr.ia.ac.cn/english/CASIA-WebFace-Database.html) as the training dataset. After downloading CASIA-WebFace, we first detect faces and facial landmarks using `MTCNN` and align the faces to a canonical pose using a similarity transformation (see [MTCNN - face detection & alignment](https://github.com/kpzhang93/MTCNN_face_detection_alignment)).
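For reference, the snippet below is a minimal sketch of landmark-based alignment with a similarity transform, using `scikit-image` and OpenCV; the landmark template and the 112x112 output size are common choices assumed here, not values taken from this repository:

```python
import cv2
import numpy as np
from skimage import transform as trans

# Assumed reference positions of the five facial landmarks (eye centers,
# nose tip, mouth corners) in a 112x112 canonical face.
TEMPLATE = np.array([
    [38.2946, 51.6963], [73.5318, 51.5014], [56.0252, 71.7366],
    [41.5493, 92.3655], [70.7299, 92.2041]], dtype=np.float32)

def align_face(image, landmarks, size=112):
    """Warp `image` so that the five landmarks detected by MTCNN
    (a 5x2 array) match the canonical template."""
    tform = trans.SimilarityTransform()
    tform.estimate(np.asarray(landmarks, dtype=np.float32), TEMPLATE)
    matrix = tform.params[:2, :]  # 2x3 similarity (affine) matrix
    return cv2.warpAffine(image, matrix, (size, size))
```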
Collecting and labeling realistic masked face data requires a great deal of human labor. To address this issue, we generate synthetic masked face images from CASIA-WebFace: 8 kinds of masked face images based on 8 different mask styles, such as surgical masks, N95 respirators and activated-carbon masks. The original face images are mixed with the synthetic masked images to form the training data.

<p align="center">
<img src="./img/generated_masked_faces.png" width="600px">
</p>
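A minimal sketch of the mask synthesis described above, assuming aligned faces and an RGBA mask template; the file paths are placeholders, and the actual pipeline generates 8 mask styles with landmark-aware placement:

```python
from PIL import Image

def add_synthetic_mask(face_path, mask_path, out_path):
    """Paste a transparent mask template over the lower half of an
    aligned face image to create a synthetic masked-face sample."""
    face = Image.open(face_path).convert("RGB")   # aligned face image
    mask = Image.open(mask_path).convert("RGBA")  # mask template with alpha channel
    # Scale the mask to the face width and anchor it on the lower half of the face.
    mask = mask.resize((face.width, face.height // 2))
    face.paste(mask, (0, face.height - mask.height), mask)
    face.save(out_path)

# Placeholder paths, for illustration only:
# add_synthetic_mask("ID1_0001.jpg", "masks/surgical.png", "ID1_0001_masked.jpg")
```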
## Evaluating Dataset

We use the [PKU-Masked-Face Dataset](https://pkuml.org/resources/pku-masked-face-dataset.html) as the evaluating dataset. The dataset contains 10,301 face images of 1,018 identities. Each identity has masked and common face images with various orientations, lighting conditions and mask types. Most identities have 5 holistic face images and 5 masked face images covering 5 different views: front, left, right, up and down.

The directory structure is as follows:
```text
.
└─ dataset
   ├─ train dataset
   │  ├─ ID1
   │  │  ├─ ID1_0001.jpg
   │  │  ├─ ID1_0002.jpg
   │  │  ...
   │  ├─ ID2
   │  │  ...
   │  ├─ ID3
   │  │  ...
   │  ...
   └─ test dataset
      ├─ ID1
      │  ├─ ID1_0001.jpg
      │  ├─ ID1_0002.jpg
      │  ...
      ├─ ID2
      │  ...
      ├─ ID3
      │  ...
      ...
```
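Given this layout, a loader only needs to map each identity folder to an integer label. A minimal sketch (the root path is a placeholder matching the tree above; the actual loading logic lives in `src/dataset/Dataset.py` and `src/dataset/MGDataset.py`):

```python
import os

def index_dataset(root):
    """Walk an identity-per-folder dataset root and return a list of
    (image_path, label) pairs, one integer label per identity."""
    samples = []
    identities = sorted(d for d in os.listdir(root)
                        if os.path.isdir(os.path.join(root, d)))
    for label, identity in enumerate(identities):
        id_dir = os.path.join(root, identity)
        for name in sorted(os.listdir(id_dir)):
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                samples.append((os.path.join(id_dir, name), label))
    return samples

# Example: samples = index_dataset("dataset/train dataset")
```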
# [Environment Requirements](#contents)

- Hardware (Ascend)
    - Prepare the hardware environment with an Ascend processor. To get access to Ascend resources, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can use the resources.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
    - [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Script Description](#contents)

The overall code structure is as follows:
```text
└─ face_recognition
  ├─ README.md          // description of the project
  ├─ scripts
  │  ├─ run_train.sh    // shell script for training on Ascend
  │  ├─ run_eval.sh     // shell script for evaluation on Ascend
  ├─ src
  │  ├─ dataset
  │  │  ├─ Dataset.py   // loads the evaluating dataset
  │  │  ├─ MGDataset.py // loads the training dataset
  │  ├─ model
  │  │  ├─ model.py     // LPD model
  │  │  ├─ stn.py       // spatial transformer network module
  │  ├─ utils
  │  │  ├─ distance.py  // calculates the distance between two features
  │  │  ├─ metric.py    // calculates mAP and CMC scores (see the sketch below)
  ├─ config.py          // hyperparameter settings
  ├─ train_dataset.py   // training data format setting
  ├─ test_dataset.py    // evaluating data format setting
  ├─ train.py           // training script
  ├─ test.py            // evaluation script
```
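For reference, the sketch below is a generic NumPy formulation of what `src/utils/distance.py` and `src/utils/metric.py` are described as computing (cosine distance between features, then CMC and mAP over a query/gallery split); it is not the repository's exact implementation:

```python
import numpy as np

def cosine_distance(query, gallery):
    """Pairwise cosine distance between two feature matrices (N x D, M x D)."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return 1.0 - q @ g.T

def cmc_map(dist, query_ids, gallery_ids, topk=(1, 5, 10)):
    """Compute CMC scores at the given ranks and mean average precision."""
    order = np.argsort(dist, axis=1)                     # ascending distance
    matches = gallery_ids[order] == query_ids[:, None]   # boolean N x M match matrix
    cmc = {k: float(matches[:, :k].any(axis=1).mean()) for k in topk}
    aps = []
    for row in matches:
        hit_ranks = np.where(row)[0]
        if hit_ranks.size == 0:
            continue  # no gallery sample shares this query's identity
        precision_at_hits = np.arange(1, hit_ranks.size + 1) / (hit_ranks + 1)
        aps.append(precision_at_hits.mean())
    return cmc, float(np.mean(aps))
```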
# [Training](#contents)

```bash
sh scripts/run_train.sh [USE_DEVICE_ID]
```

You will get the loss value of each epoch as follows in `./scripts/data_parallel_log_[DEVICE_ID]/outputs/logs/[TIME].log` or `./scripts/log_parallel_graph/face_recognition_[DEVICE_ID].log` (the three tensors in each loss tuple are the loss value, the gradient-overflow flag and the current loss scale):
```text
epoch[0], iter[100], loss:(Tensor(shape=[], dtype=Float32, value= 50.2733), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 32768)), cur_lr:0.000660, mean_fps:743.09 imgs/sec
epoch[0], iter[200], loss:(Tensor(shape=[], dtype=Float32, value= 49.3693), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 32768)), cur_lr:0.001314, mean_fps:4426.42 imgs/sec
epoch[0], iter[300], loss:(Tensor(shape=[], dtype=Float32, value= 48.7081), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 16384)), cur_lr:0.001968, mean_fps:4428.09 imgs/sec
epoch[0], iter[400], loss:(Tensor(shape=[], dtype=Float32, value= 45.7791), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 16384)), cur_lr:0.002622, mean_fps:4428.17 imgs/sec
...
epoch[8], iter[27300], loss:(Tensor(shape=[], dtype=Float32, value= 2.13556), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 65536)), cur_lr:0.004000, mean_fps:4429.38 imgs/sec
epoch[8], iter[27400], loss:(Tensor(shape=[], dtype=Float32, value= 2.36922), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 65536)), cur_lr:0.004000, mean_fps:4429.88 imgs/sec
epoch[8], iter[27500], loss:(Tensor(shape=[], dtype=Float32, value= 2.08594), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 65536)), cur_lr:0.004000, mean_fps:4430.59 imgs/sec
epoch[8], iter[27600], loss:(Tensor(shape=[], dtype=Float32, value= 2.38706), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 65536)), cur_lr:0.004000, mean_fps:4430.37 imgs/sec
```
# [Evaluation](#contents)

```bash
sh scripts/run_eval.sh [USE_DEVICE_ID]
```

You will get the result as follows in `./scripts/log_inference/outputs/models/logs/[TIME].log`:

`[test_dataset]: zj2jk=0.9495, jk2zj=0.9480, avg=0.9487`
Evaluation results on the PKU-Masked-Face dataset:

| Model    | mAP   | Rank-1 | Rank-5 | Rank-10 |
| -------- | ----- | ------ | ------ | ------- |
| Baseline | 27.09 | 70.17  | 87.95  | 91.80   |
| MG       | 36.55 | 94.12  | 98.01  | 98.66   |
| LPD      | 42.14 | 96.22  | 98.11  | 98.75   |
# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).