# Contents

- [FasterRcnn Description](#fasterrcnn-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Run in docker](#run-in-docker)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Training Process](#training-process)
        - [Training Usage](#usage)
        - [Training Result](#result)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation Usage](#usage)
        - [Evaluation Result](#result)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Evaluation Performance](#evaluation-performance)
        - [Inference Performance](#inference-performance)
- [ModelZoo Homepage](#modelzoo-homepage)
# FasterRcnn Description

Before FasterRcnn, object detection networks such as SPPnet and Fast R-CNN relied on region proposal algorithms to hypothesize object locations. Advances had reduced the running time of these detection networks, exposing the computation of region proposals as the bottleneck.

FasterRcnn showed that the convolutional feature maps used by region-based detectors such as Fast R-CNN can also be used to generate region proposals. On top of these convolutional features, a Region Proposal Network (RPN) is constructed by adding a few additional convolutional layers that share the full-image convolutional features with the detection network, making region proposals nearly cost-free. The RPN outputs both region bounds and an objectness score at each location. It is therefore a fully convolutional network (FCN) that can be trained end to end to generate high-quality region proposals, which are then fed into Fast R-CNN for detection.

[Paper](https://arxiv.org/abs/1506.01497): Ren S, He K, Girshick R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 39(6).
# Model Architecture

FasterRcnn is a two-stage object detection network. It uses a Region Proposal Network (RPN) that shares the convolutional features of the whole image with the detection network, so that computing region proposals is nearly cost-free. The network further merges the RPN and FastRcnn into a single network by sharing these convolutional features.
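To make the two-stage flow concrete, here is a purely illustrative Python sketch; every name and return value is a hypothetical stand-in for the real MindSpore layers in `src/FasterRcnn`.

```python
# Illustrative sketch of the two-stage flow; every name and value here is a
# hypothetical stand-in for the real layers in src/FasterRcnn.
def rpn(shared_features):
    # Stage 1: slide over the shared feature map and, at each location,
    # emit region bounds plus an objectness score.
    return [(0, 0, 100, 100)], [0.9]  # dummy proposals and scores

def rcnn_head(shared_features, proposals):
    # Stage 2: classify each proposal and refine its box, reusing the same
    # shared features, so proposals add almost no extra convolution cost.
    return [("person", (5, 5, 95, 98), 0.87)]  # dummy detection

def detect(shared_features):
    proposals, _objectness = rpn(shared_features)
    return rcnn_head(shared_features, proposals)

print(detect(shared_features=None))
```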
# Dataset

Note that you can run the scripts with the dataset mentioned in the original paper or a dataset widely used in the relevant domain/network architecture. In the following sections, we introduce how to run the scripts with the dataset below.

Dataset used: [COCO2017](<https://cocodataset.org/>)

- Dataset size: 19 GB
    - Train: 18 GB, 118,000 images
    - Val: 1 GB, 5,000 images
    - Annotations: 241 MB; instances, captions, person_keypoints, etc.
- Data format: images and JSON files
- Note: data will be processed in dataset.py
# Environment Requirements

- Hardware (Ascend/GPU)
    - Prepare a hardware environment with Ascend processors.
- Docker base image
    - [Ascend Hub](ascend.huawei.com/ascendhub/#/home)
- Install [MindSpore](https://www.mindspore.cn/install/en).
- Download the COCO2017 dataset.
    - We use COCO2017 as the training dataset in this example by default; you can also use your own datasets.
1. If the COCO dataset is used. **Select dataset to coco when running the script.**

    Install Cython and pycocotools. You can also install mmcv to process data.

    ```shell
    pip install Cython
    pip install pycocotools
    pip install mmcv==0.2.14
    ```
    Then change COCO_ROOT and any other settings you need in `config.py`. The directory structure is as follows:

    ```path
    .
    └─cocodataset
      ├─annotations
        ├─instance_train2017.json
        └─instance_val2017.json
      ├─val2017
      └─train2017
    ```
2. If your own dataset is used. **Select dataset to other when running the script.**

    Organize the dataset information into a TXT file, in which each row is as follows:

    ```log
    train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2
    ```

    Each row is one image annotation, split by spaces: the first column is the relative path of the image, and the remaining columns are boxes and class information in the format [xmin,ymin,xmax,ymax,class] (see the parsing sketch below). Images are read from the path formed by joining `IMAGE_DIR` (the dataset directory) with the relative path given in `ANNO_PATH` (the TXT file path); both `IMAGE_DIR` and `ANNO_PATH` are set in `config.py`.
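    As a purely illustrative aid (this helper is not part of the repository), one row in this format can be parsed like so:

    ```python
    # Hypothetical parser for one row of the annotation TXT described above.
    def parse_annotation_line(line):
        parts = line.strip().split(" ")
        image_path = parts[0]  # relative path, joined with IMAGE_DIR at load time
        boxes = []
        for box in parts[1:]:
            xmin, ymin, xmax, ymax, cls = (int(v) for v in box.split(","))
            boxes.append((xmin, ymin, xmax, ymax, cls))
        return image_path, boxes

    print(parse_annotation_line("train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2"))
    ```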
# Quick Start

After installing MindSpore via the official website, you can start training and evaluation as follows:

Note:

1. The first run generates the MindRecord files, which takes a long time.
2. The pretrained model is a ResNet-50 checkpoint trained on ImageNet2012. You can train it with the [resnet50](https://gitee.com/qujianwei/mindspore/tree/master/model_zoo/official/cv/resnet) script in the modelzoo, then use src/convert_checkpoint.py to get the pretrained model (a conversion sketch follows this list).
3. BACKBONE_MODEL is a checkpoint file trained with the [resnet50](https://gitee.com/qujianwei/mindspore/tree/master/model_zoo/official/cv/resnet) script in the modelzoo. PRETRAINED_MODEL is the checkpoint file after conversion. VALIDATION_JSON_FILE is the label file. CHECKPOINT_PATH is the checkpoint file produced by training.
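As rough orientation for what the conversion does, the sketch below drops the parts of a classification checkpoint that the detector does not need. It is an assumption-laden illustration, not the actual src/convert_checkpoint.py; in particular, `end_point` is the classifier-head name used in the modelzoo resnet50 script and may differ in your checkpoint.

```python
# Minimal conversion sketch (NOT the actual src/convert_checkpoint.py):
# keep only the weights the detector backbone needs.
from mindspore.train.serialization import load_checkpoint, save_checkpoint

def convert(backbone_ckpt, out_file="pretrained_backbone.ckpt"):
    param_dict = load_checkpoint(backbone_ckpt)
    params = []
    for name, param in param_dict.items():
        if "end_point" not in name:  # assumed name of the ImageNet classifier head
            params.append({"name": name, "data": param})
    save_checkpoint(params, out_file)

# e.g. convert("resnet50_imagenet.ckpt")
```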
## Run on Ascend

```shell
# convert checkpoint
python convert_checkpoint.py --ckpt_file=[BACKBONE_MODEL]
# standalone training
sh run_standalone_train_ascend.sh [PRETRAINED_MODEL]
# distributed training
sh run_distribute_train_ascend.sh [RANK_TABLE_FILE] [PRETRAINED_MODEL]
# eval
sh run_eval_ascend.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH]
# inference
sh run_infer_310.sh [AIR_PATH] [DATA_PATH] [ANN_FILE_PATH]
```

## Run on GPU

```shell
# convert checkpoint
python convert_checkpoint.py --ckpt_file=[BACKBONE_MODEL]
# standalone training
sh run_standalone_train_gpu.sh [PRETRAINED_MODEL]
# distributed training
sh run_distribute_train_gpu.sh [DEVICE_NUM] [PRETRAINED_MODEL]
# eval
sh run_eval_gpu.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH]
```
## Run in docker

1. Build the docker image.

    ```shell
    # build docker
    docker build -t fasterrcnn:20.1.0 . --build-arg FROM_IMAGE_NAME=ascend-mindspore-arm:20.1.0
    ```

2. Create a container layer over the created image and start it.

    ```shell
    # start docker
    bash scripts/docker_start.sh fasterrcnn:20.1.0 [DATA_DIR] [MODEL_DIR]
    ```

3. Train.

    ```shell
    # standalone training
    sh run_standalone_train_ascend.sh [PRETRAINED_MODEL]
    # distributed training
    sh run_distribute_train_ascend.sh [RANK_TABLE_FILE] [PRETRAINED_MODEL]
    ```

4. Eval.

    ```shell
    # eval
    sh run_eval_ascend.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH]
    ```

5. Inference.

    ```shell
    # inference
    sh run_infer_310.sh [AIR_PATH] [DATA_PATH] [ANN_FILE_PATH]
    ```
# Script Description

## Script and Sample Code

```shell
.
└─faster_rcnn
  ├─README.md                          // descriptions about fasterrcnn
  ├─ascend310_infer                    // application for 310 inference
  ├─scripts
    ├─run_standalone_train_ascend.sh   // shell script for standalone on Ascend
    ├─run_standalone_train_gpu.sh      // shell script for standalone on GPU
    ├─run_distribute_train_ascend.sh   // shell script for distributed on Ascend
    ├─run_distribute_train_gpu.sh      // shell script for distributed on GPU
    ├─run_infer_310.sh                 // shell script for 310 inference
    ├─run_eval_ascend.sh               // shell script for eval on Ascend
    └─run_eval_gpu.sh                  // shell script for eval on GPU
  ├─src
    ├─FasterRcnn
      ├─__init__.py                    // init file
      ├─anchor_generator.py            // anchor generator
      ├─bbox_assign_sample.py          // first stage sampler
      ├─bbox_assign_sample_stage2.py   // second stage sampler
      ├─faster_rcnn_r50.py             // fasterrcnn network
      ├─fpn_neck.py                    // feature pyramid network
      ├─proposal_generator.py          // proposal generator
      ├─rcnn.py                        // rcnn network
      ├─resnet50.py                    // backbone network
      ├─roi_align.py                   // roi align network
      └─rpn.py                         // region proposal network
    ├─aipp.cfg                         // aipp config file
    ├─config.py                        // total config
    ├─dataset.py                       // create and process dataset
    ├─lr_schedule.py                   // learning rate generator
    ├─network_define.py                // network definition for fasterrcnn
    └─util.py                          // routine operations
  ├─export.py                          // script to export AIR/MINDIR/ONNX model
  ├─eval.py                            // eval script
  ├─postprogress.py                    // post-processing for 310 inference
  └─train.py                           // train script
```
## Training Process

### Usage

#### on Ascend

```shell
# standalone training on ascend
sh run_standalone_train_ascend.sh [PRETRAINED_MODEL]
# distributed training on ascend
sh run_distribute_train_ascend.sh [RANK_TABLE_FILE] [PRETRAINED_MODEL]
```

#### on GPU

```shell
# standalone training on gpu
sh run_standalone_train_gpu.sh [PRETRAINED_MODEL]
# distributed training on gpu
sh run_distribute_train_gpu.sh [DEVICE_NUM] [PRETRAINED_MODEL]
```
Notes:

1. The rank_table.json file specified by RANK_TABLE_FILE is needed when you run a distributed task. You can generate it with [hccl_tools](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools).
2. PRETRAINED_MODEL should be a trained ResNet-50 checkpoint. If you need to load a ready-made pretrained FasterRcnn checkpoint instead, make the following changes to the train.py script.
```python
# Comment out the following code:
# load_path = args_opt.pre_trained
# if load_path != "":
#     param_dict = load_checkpoint(load_path)
#     for item in list(param_dict.keys()):
#         if not item.startswith('backbone'):
#             param_dict.pop(item)
#     load_param_into_net(net, param_dict)

# Add the following code after the optimizer definition, since the FasterRcnn
# checkpoint includes optimizer parameters:
lr = Tensor(dynamic_lr(config, rank_size=device_num), mstype.float32)
opt = SGD(params=net.trainable_params(), learning_rate=lr, momentum=config.momentum,
          weight_decay=config.weight_decay, loss_scale=config.loss_scale)
if load_path != "":
    param_dict = load_checkpoint(load_path)
    for item in list(param_dict.keys()):
        if item in ("global_step", "learning_rate") or "rcnn.reg_scores" in item or "rcnn.cls_scores" in item:
            param_dict.pop(item)
    load_param_into_net(opt, param_dict)
    load_param_into_net(net, param_dict)
```
3. The original dataset path needs to be set in config.py; select "coco_root" (for COCO) or "image_dir" (for your own dataset), as sketched below.
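    For orientation only, a hypothetical excerpt of those settings (the real keys and defaults live in `src/config.py`):

    ```python
    # Hypothetical excerpt; see src/config.py for the actual keys and defaults.
    config = {
        "coco_root": "/data/cocodataset",     # used when dataset is "coco"
        "image_dir": "/data/images",          # used when dataset is "other"
        "anno_path": "/data/annotation.txt",  # TXT annotation file for "other"
    }
    ```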
### Result

Training results are stored in the example path, in folders whose names begin with "train" or "train_parallel". You can find checkpoint files together with results like the following in loss_rankid.log (a parsing sketch follows the excerpt).
```log
# distribute training result(8p)
epoch: 1 step: 7393, rpn_loss: 0.12054, rcnn_loss: 0.40601, rpn_cls_loss: 0.04025, rpn_reg_loss: 0.08032, rcnn_cls_loss: 0.25854, rcnn_reg_loss: 0.14746, total_loss: 0.52655
epoch: 2 step: 7393, rpn_loss: 0.06561, rcnn_loss: 0.50293, rpn_cls_loss: 0.02587, rpn_reg_loss: 0.03967, rcnn_cls_loss: 0.35669, rcnn_reg_loss: 0.14624, total_loss: 0.56854
epoch: 3 step: 7393, rpn_loss: 0.06940, rcnn_loss: 0.49658, rpn_cls_loss: 0.03769, rpn_reg_loss: 0.03165, rcnn_cls_loss: 0.36353, rcnn_reg_loss: 0.13318, total_loss: 0.56598
...
epoch: 10 step: 7393, rpn_loss: 0.03555, rcnn_loss: 0.32666, rpn_cls_loss: 0.00697, rpn_reg_loss: 0.02859, rcnn_cls_loss: 0.16125, rcnn_reg_loss: 0.16541, total_loss: 0.36221
epoch: 11 step: 7393, rpn_loss: 0.19849, rcnn_loss: 0.47827, rpn_cls_loss: 0.11639, rpn_reg_loss: 0.08209, rcnn_cls_loss: 0.29712, rcnn_reg_loss: 0.18115, total_loss: 0.67676
epoch: 12 step: 7393, rpn_loss: 0.00691, rcnn_loss: 0.10168, rpn_cls_loss: 0.00529, rpn_reg_loss: 0.00162, rcnn_cls_loss: 0.05426, rcnn_reg_loss: 0.04745, total_loss: 0.10859
```
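If you want to track convergence programmatically, the log format above is easy to scrape. The helper below is hypothetical (not part of the repository) and assumes a log file named like loss_0.log:

```python
# Hypothetical helper: extract (epoch, total_loss) pairs from a loss log
# in the format shown above.
import re

LOSS_RE = re.compile(r"epoch: (\d+) .*total_loss: ([0-9.]+)")

def read_total_loss(log_path):
    with open(log_path) as log:
        return [(int(m.group(1)), float(m.group(2)))
                for m in map(LOSS_RE.search, log) if m]

# e.g. read_total_loss("loss_0.log") -> [(1, 0.52655), (2, 0.56854), ...]
```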
## Evaluation Process

### Usage

#### on Ascend

```shell
# eval on ascend
sh run_eval_ascend.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH]
```

#### on GPU

```shell
# eval on GPU
sh run_eval_gpu.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH]
```
> The checkpoint can be produced during the training process.
>
> Image sizes in the dataset should match the annotation sizes in VALIDATION_JSON_FILE, otherwise the evaluation results cannot be displayed properly; see the check sketch below.
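To verify that requirement ahead of time, here is a hedged sketch (not part of the repository) that compares each image file's on-disk size with the width/height recorded in VALIDATION_JSON_FILE; it assumes pycocotools (installed earlier) and Pillow are available:

```python
# Hypothetical checker: compare on-disk image sizes against the sizes
# recorded in the COCO-style VALIDATION_JSON_FILE.
import os
from PIL import Image
from pycocotools.coco import COCO

def check_sizes(json_file, image_dir):
    coco = COCO(json_file)
    for info in coco.loadImgs(coco.getImgIds()):
        path = os.path.join(image_dir, info["file_name"])
        size = Image.open(path).size
        if size != (info["width"], info["height"]):
            print(f"mismatch {info['file_name']}: file {size}, "
                  f"annotation {(info['width'], info['height'])}")

# e.g. check_sizes("annotations/instances_val2017.json", "val2017")
```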
### Result

Eval results are stored in the example path, in a folder named "eval". You can find results like the following in log.
```log
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.360
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.586
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.385
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.229
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.402
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.441
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.299
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.487
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.515
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.346
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.562
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.631
```
## Model Export

```shell
python export.py --ckpt_file [CKPT_PATH] --device_target [DEVICE_TARGET] --file_format [EXPORT_FORMAT]
```

`EXPORT_FORMAT` should be one of ["AIR", "ONNX", "MINDIR"].
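For context, export.py builds on MindSpore's export API. The self-contained example below uses a one-layer stand-in network, since constructing the real FasterRcnn requires the sources in `src/`; it illustrates the API, and is not the repository's script:

```python
# Stand-in example of MindSpore's export API (NOT the repository's export.py):
# a trivial network is exported the same way the FasterRcnn network would be.
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor
from mindspore.train.serialization import export

net = nn.Dense(3, 2)  # placeholder for the FasterRcnn network built from src/
dummy_input = Tensor(np.zeros([1, 3], np.float32))
export(net, dummy_input, file_name="faster_rcnn", file_format="MINDIR")
```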
## Inference Process

### Usage

Before performing inference, the AIR file must be exported by the export script on the Ascend 910 environment.

```shell
# Ascend310 inference
sh run_infer_310.sh [AIR_PATH] [DATA_PATH] [ANN_FILE_PATH]
```
### Result

Inference results are saved in the current path; you can find results like the following in the acc.log file.
```log
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.349
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.570
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.369
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.211
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.391
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.435
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.295
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.476
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.503
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.330
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.547
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.622
```
# Model Description

## Performance

### Evaluation Performance

| Parameters | Ascend | GPU |
| -------------------------- | ----------------------------------------------------------- | ----------------------------------------------------------- |
| Model Version | V1 | V1 |
| Resource | Ascend 910; CPU 2.60 GHz, 192 cores; memory 755 GB | V100-PCIE 32 GB |
| Uploaded Date | 08/31/2020 (month/day/year) | 02/10/2021 (month/day/year) |
| MindSpore Version | 1.0.0 | 1.2.0 |
| Dataset | COCO2017 | COCO2017 |
| Training Parameters | epoch=12, batch_size=2 | epoch=12, batch_size=2 |
| Optimizer | SGD | SGD |
| Loss Function | Softmax Cross Entropy, Sigmoid Cross Entropy, SmoothL1Loss | Softmax Cross Entropy, Sigmoid Cross Entropy, SmoothL1Loss |
| Speed | 1pc: 190 ms/step; 8pcs: 200 ms/step | 1pc: 320 ms/step; 8pcs: 335 ms/step |
| Total time | 1pc: 37.17 hours; 8pcs: 4.89 hours | 1pc: 63.09 hours; 8pcs: 8.25 hours |
| Parameters (M) | 250 | 250 |
| Scripts | [fasterrcnn script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/faster_rcnn) | [fasterrcnn script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/faster_rcnn) |
### Inference Performance

| Parameters | Ascend | GPU |
| ------------------- | --------------------------- | --------------------------- |
| Model Version | V1 | V1 |
| Resource | Ascend 910 | GPU |
| Uploaded Date | 08/31/2020 (month/day/year) | 02/10/2021 (month/day/year) |
| MindSpore Version | 1.0.0 | 1.2.0 |
| Dataset | COCO2017 | COCO2017 |
| batch_size | 2 | 2 |
| outputs | mAP | mAP |
| Accuracy | IoU=0.50: 58.6% | IoU=0.50: 59.1% |
| Model for inference | 250M (.ckpt file) | 250M (.ckpt file) |
# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).