# Contents
- [Openpose Description](#openpose-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Features](#features)
    - [Mixed Precision](#mixed-precision)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
        - [Training](#training)
        - [Distributed Training](#distributed-training)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Evaluation Performance](#evaluation-performance)
# [Openpose Description](#contents)
Openpose proposes a bottom-up human pose estimation algorithm using Part Affinity Fields (PAFs), as opposed to top-down algorithms, which detect people first and then regress the key points and skeleton for each person. The advantage of openpose is that its computing time does not increase significantly as the number of people in the image increases, whereas top-down algorithms depend on the detection results and their runtime grows linearly with the number of people.
[Paper](https://arxiv.org/abs/1611.08050): Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh, "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
# [Model Architecture](#contents)
In the first step, the image is passed through a baseline CNN to extract feature maps of the input; the paper uses the first 10 layers of the VGG-19 network.
The feature maps are then processed by a multi-stage CNN pipeline that generates the Part Confidence Maps and Part Affinity Fields.
In the last step, the Confidence Maps and Part Affinity Fields generated above are processed by a greedy bipartite matching algorithm to obtain the pose of each person in the image.
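The multi-stage, two-branch refinement can be sketched as follows. This is a simplified illustration rather than the implementation in `src/openposenet.py`; the class names, layer counts, and channel sizes (38 PAF channels and 19 heatmap channels are the usual COCO settings) are assumptions made for readability.
```python
import mindspore.nn as nn
import mindspore.ops as ops

class Stage(nn.Cell):
    """One refinement stage: branch 1 predicts PAFs, branch 2 predicts confidence maps."""
    def __init__(self, in_ch, paf_ch=38, heat_ch=19):
        super().__init__()
        self.paf_branch = nn.SequentialCell([
            nn.Conv2d(in_ch, 128, 7, pad_mode='same'), nn.ReLU(),
            nn.Conv2d(128, paf_ch, 1, pad_mode='same')])
        self.heat_branch = nn.SequentialCell([
            nn.Conv2d(in_ch, 128, 7, pad_mode='same'), nn.ReLU(),
            nn.Conv2d(128, heat_ch, 1, pad_mode='same')])

    def construct(self, x):
        return self.paf_branch(x), self.heat_branch(x)

class OpenposeSketch(nn.Cell):
    """VGG-19 features feed several stages; each stage is re-fed the previous predictions."""
    def __init__(self, backbone, num_stages=6):
        super().__init__()
        self.backbone = backbone  # e.g. the first 10 VGG-19 layers; 128 output channels assumed
        self.first = Stage(128)
        self.refine = nn.CellList([Stage(128 + 38 + 19) for _ in range(num_stages - 1)])
        self.concat = ops.Concat(axis=1)

    def construct(self, img):
        feat = self.backbone(img)
        pafs, heat = self.first(feat)
        for stage in self.refine:
            pafs, heat = stage(self.concat((feat, pafs, heat)))
        return pafs, heat
```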
# [Dataset](#contents)
Prepare the datasets, including the training set, the validation set, and the annotations. The training and validation samples are located in the "dataset" directory; the supported datasets are COCO2014 and COCO2017.
The provided training script uses the COCO2017 dataset as an example for data preprocessing during training. If you use a dataset in another format, please modify the dataset loading and preprocessing accordingly.
- Download the data from the official COCO2017 website and unzip it:
```bash
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
```
- Create the mask dataset by running `gen_ignore_mask.py` (a conceptual sketch of what these masks contain follows the directory listing below):
```bash
python gen_ignore_mask.py --train_ann ../dataset/annotations/person_keypoints_train2017.json --val_ann ../dataset/annotations/person_keypoints_val2017.json --train_dir train2017 --val_dir val2017
```
- The dataset folder is generated in the root directory and contains the following files:
```text
├── dataset
    ├── annotations
        ├── person_keypoints_train2017.json
        └── person_keypoints_val2017.json
    ├── ignore_mask_train
    ├── ignore_mask_val
    ├── train2017
    └── val2017
```
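For reference, the ignore masks mark image regions containing people who have no keypoint annotations (or crowd regions), so those pixels can be excluded from the loss. Below is a minimal, hypothetical sketch of how such a mask could be built with pycocotools; it is illustrative only and is not the code in `src/gen_ignore_mask.py`.
```python
import numpy as np
from pycocotools.coco import COCO

def build_ignore_mask(coco, img_id):
    """Return a binary mask covering persons without keypoint labels (illustrative)."""
    info = coco.loadImgs(img_id)[0]
    mask = np.zeros((info['height'], info['width']), dtype=np.uint8)
    ann_ids = coco.getAnnIds(imgIds=img_id, iscrowd=None)
    for ann in coco.loadAnns(ann_ids):
        # Persons with no labelled keypoints, and crowd regions, are masked out of the loss.
        if ann.get('num_keypoints', 0) == 0 or ann.get('iscrowd', 0) == 1:
            mask |= coco.annToMask(ann).astype(np.uint8)
    return mask

# Example usage (paths are placeholders):
# coco = COCO('dataset/annotations/person_keypoints_val2017.json')
# mask = build_ignore_mask(coco, coco.getImgIds()[0])
```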
# [Features](#contents)
## Mixed Precision
The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, while maintaining the network precision achieved by single-precision training. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
For FP16 operators, if the input data type is FP32, the MindSpore backend will automatically handle it with reduced precision. Users can check the reduced-precision operators by enabling the INFO log and then searching for 'reduce precision'.
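In MindSpore, mixed precision and the loss-scale value from `config.py` are typically wired into the high-level `Model` API. A minimal sketch, assuming a network `net`, a loss function `loss_fn`, an optimizer `opt`, and a dataset `train_ds` have already been constructed (this is illustrative and may differ from `train.py`):
```python
from mindspore.train.model import Model
from mindspore.train.loss_scale_manager import FixedLossScaleManager

# amp_level="O2" keeps BatchNorm in FP32 and casts most other operators to FP16;
# the fixed loss scale matches the loss-scale entry in config.py.
loss_scale_manager = FixedLossScaleManager(16384, drop_overflow_update=False)
model = Model(net, loss_fn=loss_fn, optimizer=opt,
              loss_scale_manager=loss_scale_manager, amp_level="O2")
model.train(epoch=60, train_dataset=train_ds)
```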
# [Environment Requirements](#contents)
- Hardware (Ascend)
    - Prepare the hardware environment with Ascend.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- Download the VGG19 model of the MindSpore version:
    - vgg19-0-97_5004.ckpt
- For more information, please check the resources below:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)
After installing MindSpore via the official website, you can start training and evaluation as follows:
```bash
# run training example
python train.py --train_dir train2017 --train_ann person_keypoints_train2017.json > train.log 2>&1 &
# run distributed training example
bash run_distribute_train.sh [RANK_TABLE_FILE]
# run evaluation example
python eval.py --model_path path_to_eval_model.ckpt --imgpath_val ./dataset/val2017 --ann ./dataset/annotations/person_keypoints_val2017.json > eval.log 2>&1 &
OR
bash scripts/run_eval_ascend.sh
```
[RANK_TABLE_FILE] is the path to the multi-device configuration table for the environment. The configuration table can be generated automatically by the tool [hccl_tool](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools).
# [Script Description](#contents)
## [Script and Sample Code](#contents)
```text
├── ModelZoo_openpose_MS_MIT
    ├── README.md                    // descriptions about openpose
    ├── scripts
    │   ├── run_standalone_train.sh  // shell script for standalone training on Ascend
    │   ├── run_distribute_train.sh  // shell script for distributed training on Ascend with 8 devices
    │   ├── run_eval_ascend.sh       // shell script for evaluation on Ascend
    ├── src
    │   ├── openposenet.py           // Openpose architecture
    │   ├── loss.py                  // loss function
    │   ├── config.py                // parameter configuration
    │   ├── dataset.py               // data preprocessing
    │   ├── utils.py                 // utilities
    │   ├── gen_ignore_mask.py       // script for generating mask data
    ├── export.py                    // model conversion script
    ├── train.py                     // training script
    ├── eval.py                      // evaluation script
```
## [Script Parameters](#contents)
Parameters for both training and evaluation can be set in `config.py`.
- config for openpose
```python
'data_dir': 'path to dataset'               # absolute full path to the training and evaluation datasets
'vgg_path': 'path to vgg model'             # absolute full path to the vgg19 model
'save_model_path': 'path of saving models'  # absolute full path for output models
'load_pretrain': 'False'                    # whether to train from a pre-trained model
'pretrained_model_path': ''                 # path of the pre-trained model to load
'lr': 1e-4                                  # initial learning rate
'batch_size': 10                            # training batch size
'lr_gamma': 0.1                             # factor applied to lr when a step in lr_steps is reached
'lr_steps': '100000,200000,250000'          # steps at which lr is multiplied by lr_gamma
'loss_scale': 16384                         # loss scale for mixed precision
'max_epoch_train': 60                       # total number of training epochs
'insize': 368                               # image size used as input to the model
'keep_checkpoint_max': 1                    # only keep the last keep_checkpoint_max checkpoints
'log_interval': 100                         # interval (in steps) between log prints
'ckpt_interval': 5000                       # interval (in steps) between checkpoint saves
```
For more configuration details, please refer to the script `config.py`.
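For example, `lr_steps` and `lr_gamma` describe a piecewise-constant schedule: the learning rate starts at `lr` and is multiplied by `lr_gamma` each time the global step passes a boundary in `lr_steps`. A minimal sketch of how such a per-step schedule could be generated (illustrative only; the function below is not the code in `train.py`):
```python
def stepped_lr(base_lr, gamma, milestones, total_steps):
    """Build a per-step learning-rate list that decays by `gamma` at each milestone."""
    lrs, lr = [], base_lr
    boundaries = sorted(milestones)
    for step in range(total_steps):
        while boundaries and step >= boundaries[0]:
            lr *= gamma
            boundaries.pop(0)
        lrs.append(lr)
    return lrs

# With the defaults above: 1e-4 until step 100000, then 1e-5, 1e-6 after 200000, 1e-7 after 250000.
schedule = stepped_lr(1e-4, 0.1, [100000, 200000, 250000], 300000)
```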
## [Training Process](#contents)
### Training
- running on Ascend
```bash
python train.py --train_dir train2017 --train_ann person_keypoints_train2017.json > train.log 2>&1 &
```
The python command above runs in the background; you can view the results through the file `train.log`.
After training, you will get some checkpoint files under the script folder by default. The loss values will look like the following:
```text
# grep "epoch " train.log
epoch[0], iter[0], loss[0.29211228793809957], 0.13 imgs/sec, vgglr=0.0,baselr=2.499999936844688e-05,stagelr=9.999999747378752e-05
epoch[0], iter[100], loss[0.060355084178521694], 24.92 imgs/sec, vgglr=0.0,baselr=2.499999936844688e-05,stagelr=9.999999747378752e-05
epoch[0], iter[200], loss[0.026628130997662272], 26.20 imgs/sec, vgglr=0.0,baselr=2.499999936844688e-05,stagelr=9.999999747378752e-05
...
```
The model checkpoints are saved in the directory specified by 'save_model_path' in config.py.
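A saved `.ckpt` file can later be restored for evaluation or fine-tuning with MindSpore's checkpoint utilities. A minimal sketch is shown below; the network class name, its constructor arguments, and the checkpoint path are assumptions, so check `src/openposenet.py` and your 'save_model_path' for the actual values.
```python
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from src.openposenet import OpenPoseNet  # class name assumed; see src/openposenet.py

net = OpenPoseNet()                       # constructor arguments omitted for brevity
param_dict = load_checkpoint("outputs/0-60_30000.ckpt")  # placeholder checkpoint path
load_param_into_net(net, param_dict)
net.set_train(False)                      # switch to inference mode before evaluation
```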
## [Evaluation Process](#contents)
### Evaluation
- running on Ascend
Before running the command below, please check the checkpoint path used for evaluation. Please set the checkpoint path to an absolute full path, e.g., "username/openpose/outputs/\*time\*/0-6_30000.ckpt".
```bash
python eval.py --model_path path_to_eval_model.ckpt --imgpath_val ./dataset/val2017 --ann ./dataset/annotations/person_keypoints_val2017.json > eval.log 2>&1 &
OR
bash scripts/run_eval_ascend.sh
```
The above python command runs in the background. You can view the results through the file "eval.log". The accuracy on the test dataset will be as follows:
```text
# grep "AP" eval.log
{'AP': 0.40250956300341397, 'Ap .5': 0.6658941566481336, 'AP .75': 0.396047897339743, 'AP (M)': 0.3075356543635785, 'AP (L)': 0.533772768618845, 'AR': 0.4519836272040302, 'AR .5': 0.693639798488665, 'AR .75': 0.4570214105793451, 'AR (M)': 0.32155148866429945, 'AR (L)': 0.6330360460795242}
```
# [Model Description](#contents)
## [Performance](#contents)
### Evaluation Performance

| Parameters                 | Ascend                                                       |
| -------------------------- | ------------------------------------------------------------ |
| Model Version              | openpose                                                     |
| Resource                   | Ascend 910; CPU 2.60 GHz, 192 cores; memory 755 GB; OS Euler 2.8 |
| Uploaded Date              | 12/14/2020 (month/day/year)                                  |
| MindSpore Version          | 1.0.1-alpha                                                  |
| Training Parameters        | epoch=60 (1pc) / 80 (8pcs), steps=30k (1pc) / 5k (8pcs), batch_size=10, init_lr=0.0001 |
| Optimizer                  | Adam (1pc) / Momentum (8pcs)                                 |
| Loss Function              | MSE                                                          |
| Outputs                    | pose                                                         |
| Speed                      | 1pc: 35 fps; 8pcs: 230 fps                                   |
| Total Time                 | 1pc: 22.5 h; 8pcs: 5.1 h                                     |
| Checkpoint for Fine Tuning | 602.33 MB (.ckpt file)                                       |