# Contents

- [MaskRCNN Description](#maskrcnn-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Run in docker](#run-in-docker)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
        - [Training Script Parameters](#training-script-parameters)
        - [Parameters Configuration](#parameters-configuration)
    - [Training Process](#training-process)
        - [Training](#training)
        - [Distributed Training](#distributed-training)
        - [Training Result](#training-result)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
        - [Evaluation Result](#evaluation-result)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Evaluation Performance](#evaluation-performance)
        - [Inference Performance](#inference-performance)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [MaskRCNN Description](#contents)

MaskRCNN is a conceptually simple, flexible, and general framework for object instance segmentation. The approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework.

It shows top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing single-model entries on every task, including the COCO 2016 challenge winners.
# [Model Architecture](#contents)

MaskRCNN is a two-stage object detection network. It extends FasterRCNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. The network uses a region proposal network (RPN), which shares the convolution features of the whole image with the detection network, so that the computation of region proposals is nearly cost free. The whole network further combines the RPN and the mask branch into one network by sharing the convolution features.

[Paper](http://cn.arxiv.org/pdf/1703.06870v3): Kaiming He, Georgia Gkioxari, Piotr Dollár and Ross Girshick. "Mask R-CNN"
# [Dataset](#contents)

Note that you can run the scripts based on the dataset mentioned in the original paper or one widely used in the relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below.

- [COCO2017](https://cocodataset.org/) is a popular dataset with bounding-box and pixel-level stuff annotations. These annotations can be used for scene understanding tasks like semantic segmentation, object detection and image captioning. There are 118K/5K images for train/val.
    - Dataset size: 19G
        - Train: 18G, 118,000 images
        - Val: 1G, 5,000 images
        - Annotations: 241M; instances, captions, person_keypoints, etc.
    - Data format: image and json files (Note: data will be processed in dataset.py)
# [Environment Requirements](#contents)

- Hardware (Ascend)
    - Prepare a hardware environment with Ascend processors. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Framework
    - [MindSpore](https://gitee.com/mindspore/mindspore)
- Docker base image
    - [Ascend Hub](https://ascend.huawei.com/ascendhub/#/home)
- For more information, please check the resources below:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
- Third-party libraries

```bash
pip install Cython
pip install pycocotools
pip install mmcv==0.2.14
```
# [Quick Start](#contents)

1. Download the dataset COCO2017.

2. Change COCO_ROOT and any other settings you need in `config.py`. The directory structure should look as follows:

    ```text
    .
    └─cocodataset
      ├─annotations
        ├─instances_train2017.json
        └─instances_val2017.json
      ├─val2017
      └─train2017
    ```

    If you use your own dataset to train the network, **set `dataset` to `other` when running the script.**
    Create a TXT file to store the dataset information, organized as follows:

    ```text
    train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2
    ```

    Each row is an image annotation split by spaces. The first column is the relative path of the image, and the following columns contain box and class information in the format [xmin,ymin,xmax,ymax,class]. Images are read from the path obtained by joining `IMAGE_DIR` (the dataset directory) with the relative path stored in `ANNO_PATH` (the path of the TXT file); both can be set in `config.py`.
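    Such an annotation row can be parsed with a few lines of Python; a minimal sketch, assuming a hypothetical helper name and example paths (for illustration only, not part of the scripts):

    ```python
    import os

    def parse_annotation_line(line, image_dir):
        """Parse one row: '<relative/path.jpg> xmin,ymin,xmax,ymax,class ...'."""
        parts = line.strip().split(' ')
        image_path = os.path.join(image_dir, parts[0])  # IMAGE_DIR joined with the relative path
        boxes = [tuple(map(int, box.split(','))) for box in parts[1:]]  # (xmin, ymin, xmax, ymax, class)
        return image_path, boxes

    path, boxes = parse_annotation_line(
        "train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2", "/home/maskrcnn")
    print(path, boxes)
    ```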
3. Execute the train script.

    After dataset preparation, you can start training as follows:

    ```bash
    # distributed training
    bash run_distribute_train.sh [RANK_TABLE_FILE] [PRETRAINED_CKPT]

    # standalone training
    bash run_standalone_train.sh [PRETRAINED_CKPT]
    ```
    Note:

    1. To speed up data preprocessing, MindSpore provides a data format named MindRecord, hence the first step is to generate MindRecord files from the COCO2017 dataset before training. Converting the raw COCO2017 dataset to MindRecord format may take about 4 hours. (A sketch of writing MindRecord files follows this list.)
    2. For distributed training, an [hccl configuration file](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools) in JSON format needs to be created in advance.
    3. PRETRAINED_CKPT is a ResNet50 checkpoint trained on ImageNet2012. You can train it with the [resnet50](https://gitee.com/qujianwei/mindspore/tree/master/model_zoo/official/cv/resnet) scripts in the model zoo, and use src/convert_checkpoint.py to get the pretrained checkpoint file.
    4. For large models like MaskRCNN, it is better to set the environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend the hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could time out, since compile time increases with model size.
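    The conversion is based on MindSpore's `FileWriter`; a minimal sketch of the idea (the schema fields, file names and shard count here are illustrative assumptions, not the exact ones used by dataset.py):

    ```python
    import numpy as np
    from mindspore.mindrecord import FileWriter

    # Illustrative schema: raw image bytes plus an N x 5 box/class array.
    writer = FileWriter(file_name="maskrcnn.mindrecord", shard_num=8)
    schema = {"image": {"type": "bytes"},
              "annotation": {"type": "int32", "shape": [-1, 5]}}
    writer.add_schema(schema, "maskrcnn dataset")

    with open("train2017/0000001.jpg", "rb") as f:  # hypothetical image path
        img_bytes = f.read()
    annos = np.array([[0, 259, 401, 459, 7]], dtype=np.int32)

    writer.write_raw_data([{"image": img_bytes, "annotation": annos}])
    writer.commit()
    ```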
4. Execute the eval script.

    After training, you can start evaluation as follows:

    ```shell
    # Evaluation
    bash run_eval.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH]
    ```

    Note:

    1. VALIDATION_JSON_FILE is the label json file used for evaluation.
5. Execute the inference script.

    After training, you can start inference as follows:

    ```shell
    # inference
    bash run_infer_310.sh [AIR_PATH] [DATA_PATH] [ANN_FILE_PATH]
    ```

    Note:

    1. AIR_PATH is a model file exported by the export script on the Ascend 910 environment.
    2. ANN_FILE_PATH is an annotation file used for inference.
# Run in docker

1. Build the docker image.

    ```shell
    # build docker
    docker build -t maskrcnn:20.1.0 . --build-arg FROM_IMAGE_NAME=ascend-mindspore-arm:20.1.0
    ```

2. Create a container layer over the created image and start it.

    ```shell
    # start docker
    bash scripts/docker_start.sh maskrcnn:20.1.0 [DATA_DIR] [MODEL_DIR]
    ```

3. Train.

    ```shell
    # standalone training
    bash run_standalone_train.sh [PRETRAINED_CKPT]

    # distributed training
    bash run_distribute_train.sh [RANK_TABLE_FILE] [PRETRAINED_CKPT]
    ```

4. Eval.

    ```shell
    # Evaluation
    bash run_eval.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH]
    ```

5. Inference.

    ```shell
    # inference
    bash run_infer_310.sh [AIR_PATH] [DATA_PATH] [ANN_FILE_PATH]
    ```
# [Script Description](#contents)

## [Script and Sample Code](#contents)

```shell
.
└─MaskRcnn
  ├─README.md                        # README
  ├─ascend310_infer                  # application for 310 inference
  ├─scripts                          # shell scripts
    ├─run_standalone_train.sh        # training in standalone mode (1 pc)
    ├─run_distribute_train.sh        # training in parallel mode (8 pcs)
    ├─run_infer_310.sh               # shell script for 310 inference
    └─run_eval.sh                    # evaluation
  ├─src
    ├─maskrcnn
      ├─__init__.py
      ├─anchor_generator.py          # generate base bounding box anchors
      ├─bbox_assign_sample.py        # filter positive and negative bboxes for the first-stage learning
      ├─bbox_assign_sample_stage2.py # filter positive and negative bboxes for the second-stage learning
      ├─mask_rcnn_r50.py             # main network architecture of maskrcnn
      ├─fpn_neck.py                  # fpn network
      ├─proposal_generator.py        # generate proposals based on feature maps
      ├─rcnn_cls.py                  # rcnn bounding box regression branch
      ├─rcnn_mask.py                 # rcnn mask branch
      ├─resnet50.py                  # backbone network
      ├─roi_align.py                 # roi align network
      └─rpn.py                       # region proposal network
    ├─aipp.cfg                       # aipp config file
    ├─config.py                      # network configuration
    ├─convert_checkpoint.py          # convert resnet50 backbone checkpoint
    ├─dataset.py                     # dataset utils
    ├─lr_schedule.py                 # learning rate generator
    ├─network_define.py              # network definition for maskrcnn
    └─util.py                        # routine operations
  ├─mindspore_hub_conf.py            # mindspore hub interface
  ├─export.py                        # script to export AIR/MINDIR/ONNX model
  ├─eval.py                          # evaluation script
  ├─postprogress.py                  # post-processing for 310 inference
  └─train.py                         # training script
```
## [Script Parameters](#contents)

### [Training Script Parameters](#contents)

```shell
# distributed training
Usage: bash run_distribute_train.sh [RANK_TABLE_FILE] [PRETRAINED_MODEL]

# standalone training
Usage: bash run_standalone_train.sh [PRETRAINED_MODEL]
```
### [Parameters Configuration](#contents)

```txt
"img_width": 1280,          # width of the input images
"img_height": 768,          # height of the input images

# random threshold in data augmentation
"keep_ratio": True,
"flip_ratio": 0.5,
"expand_ratio": 1.0,

"max_instance_count": 128,  # max number of bboxes for each image
"mask_shape": (28, 28),     # shape of mask in rcnn_mask

# anchor
"feature_shapes": [(192, 320), (96, 160), (48, 80), (24, 40), (12, 20)], # shapes of the fpn feature maps
"anchor_scales": [8],                  # area of base anchor
"anchor_ratios": [0.5, 1.0, 2.0],      # ratio between width and height of base anchors
"anchor_strides": [4, 8, 16, 32, 64],  # stride size of each feature map level
"num_anchors": 3,                      # anchor number for each pixel

# resnet
"resnet_block": [3, 4, 6, 3],                   # number of blocks in each layer
"resnet_in_channels": [64, 256, 512, 1024],     # in channel size for each layer
"resnet_out_channels": [256, 512, 1024, 2048],  # out channel size for each layer

# fpn
"fpn_in_channels": [256, 512, 1024, 2048],  # in channel size for each layer
"fpn_out_channels": 256,                    # out channel size for every layer
"fpn_num_outs": 5,                          # number of output feature maps

# rpn
"rpn_in_channels": 256,                   # in channel size
"rpn_feat_channels": 256,                 # feature out channel size
"rpn_loss_cls_weight": 1.0,               # weight of bbox classification in rpn loss
"rpn_loss_reg_weight": 1.0,               # weight of bbox regression in rpn loss
"rpn_cls_out_channels": 1,                # classification out channel size
"rpn_target_means": [0., 0., 0., 0.],     # bounding box decode/encode means
"rpn_target_stds": [1.0, 1.0, 1.0, 1.0],  # bounding box decode/encode stds

# bbox_assign_sampler
"neg_iou_thr": 0.3,       # negative sample threshold on IOU
"pos_iou_thr": 0.7,       # positive sample threshold on IOU
"min_pos_iou": 0.3,       # minimal positive sample threshold on IOU
"num_bboxes": 245520,     # total bbox number
"num_gts": 128,           # total ground truth number
"num_expected_neg": 256,  # negative sample number
"num_expected_pos": 128,  # positive sample number

# proposal
"activate_num_classes": 2,  # number of classes in rpn classification
"use_sigmoid_cls": True,    # whether to use sigmoid as the loss function in rpn classification

# roi_align
"roi_layer": dict(type='RoIAlign', out_size=7, mask_out_size=14, sample_num=2), # ROIAlign parameters
"roi_align_out_channels": 256,                # ROIAlign out channel size
"roi_align_featmap_strides": [4, 8, 16, 32],  # stride sizes for different levels of the ROIAlign feature map
"roi_align_finest_scale": 56,                 # finest scale for ROIAlign
"roi_sample_num": 640,                        # sample number in the ROIAlign layer

# bbox_assign_sampler_stage2 # bbox assign sampling for the second stage; parameter meanings are similar to bbox_assign_sampler
"neg_iou_thr_stage2": 0.5,
"pos_iou_thr_stage2": 0.5,
"min_pos_iou_stage2": 0.5,
"num_bboxes_stage2": 2000,
"num_expected_pos_stage2": 128,
"num_expected_neg_stage2": 512,
"num_expected_total_stage2": 512,

# rcnn # rcnn parameters for the second stage; parameter meanings are similar to fpn
"rcnn_num_layers": 2,
"rcnn_in_channels": 256,
"rcnn_fc_out_channels": 1024,
"rcnn_mask_out_channels": 256,
"rcnn_loss_cls_weight": 1,
"rcnn_loss_reg_weight": 1,
"rcnn_loss_mask_fb_weight": 1,
"rcnn_target_means": [0., 0., 0., 0.],
"rcnn_target_stds": [0.1, 0.1, 0.2, 0.2],

# train proposal
"rpn_proposal_nms_across_levels": False,
"rpn_proposal_nms_pre": 2000,     # proposal number before nms in rpn
"rpn_proposal_nms_post": 2000,    # proposal number after nms in rpn
"rpn_proposal_max_num": 2000,     # max proposal number in rpn
"rpn_proposal_nms_thr": 0.7,      # nms threshold in rpn
"rpn_proposal_min_bbox_size": 0,  # min size of a box in rpn

# test proposal # most parameters are similar to train proposal
"rpn_nms_across_levels": False,
"rpn_nms_pre": 1000,
"rpn_nms_post": 1000,
"rpn_max_num": 1000,
"rpn_nms_thr": 0.7,
"rpn_min_bbox_min_size": 0,
"test_score_thr": 0.05,         # score threshold
"test_iou_thr": 0.5,            # IOU threshold
"test_max_per_img": 100,        # max number of instances
"test_batch_size": 2,           # batch size
"rpn_head_use_sigmoid": True,   # whether to use sigmoid in rpn
"rpn_head_weight": 1.0,         # rpn head weight in loss
"mask_thr_binary": 0.5,         # mask threshold in rcnn

# LR
"base_lr": 0.02,        # base learning rate
"base_step": 58633,     # base step in lr generator
"total_epoch": 13,      # total epochs in lr generator
"warmup_step": 500,     # warm-up steps in lr generator
"warmup_ratio": 1/3.0,  # warm-up ratio
"sgd_momentum": 0.9,    # momentum in optimizer

# train
"batch_size": 2,
"loss_scale": 1,
"momentum": 0.91,
"weight_decay": 1e-4,
"pretrain_epoch_size": 0,      # pretrained epoch size
"epoch_size": 12,              # total epoch size
"save_checkpoint": True,       # whether to save checkpoints
"save_checkpoint_epochs": 1,   # checkpoint saving interval
"keep_checkpoint_max": 12,     # max number of saved checkpoints
"save_checkpoint_path": "./",  # path to save checkpoints
"mindrecord_dir": "/home/maskrcnn/MindRecord_COCO2017_Train",  # path of mindrecord files
"coco_root": "/home/maskrcnn/",  # path of the coco root dataset
"train_data_type": "train2017",  # name of the train dataset
"val_data_type": "val2017",      # name of the evaluation dataset
"instance_set": "annotations/instances_{}.json",  # name of the annotation file
"coco_classes": ('background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
                 'train', 'truck', 'boat', 'traffic light', 'fire hydrant',
                 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog',
                 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra',
                 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie',
                 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
                 'kite', 'baseball bat', 'baseball glove', 'skateboard',
                 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
                 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
                 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
                 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
                 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
                 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink',
                 'refrigerator', 'book', 'clock', 'vase', 'scissors',
                 'teddy bear', 'hair drier', 'toothbrush'),
"num_classes": 81
```
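As a sanity check on these values, "num_bboxes" is simply the total anchor count across all FPN levels; a short illustrative computation (not part of the scripts):

```python
# Illustrative check: num_bboxes = sum over FPN levels of H * W * num_anchors.
feature_shapes = [(192, 320), (96, 160), (48, 80), (24, 40), (12, 20)]
num_anchors = 3  # len(anchor_ratios) * len(anchor_scales) = 3 * 1

num_bboxes = sum(h * w * num_anchors for h, w in feature_shapes)
print(num_bboxes)  # 245520, matching the "num_bboxes" entry above
```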
## [Training Process](#contents)

- Set options in `config.py`, including loss_scale, learning rate and network hyperparameters. Click [here](https://www.mindspore.cn/tutorial/training/zh-CN/master/use/data_preparation.html) for more information about the dataset.

### [Training](#content)

- Run `run_standalone_train.sh` for non-distributed training of the MaskRCNN model.

```bash
# standalone training
bash run_standalone_train.sh [PRETRAINED_MODEL]
```
### [Distributed Training](#content)

- Run `run_distribute_train.sh` for distributed training of the MaskRCNN model.

```bash
bash run_distribute_train.sh [RANK_TABLE_FILE] [PRETRAINED_MODEL]
```

- Notes

1. The hccl.json file specified by RANK_TABLE_FILE is needed when you run a distributed task. You can generate it with [hccl_tools](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools).
2. PRETRAINED_MODEL should be a trained ResNet50 checkpoint. If it is not set, the model will be trained from scratch. If you need to load a ready-made pretrained MaskRcnn checkpoint, you may make changes to the train.py script as follows (an illustrative sketch of the `dynamic_lr` warm-up schedule follows these notes).
```python
# Comment out the following code
#   load_path = args_opt.pre_trained
#   if load_path != "":
#       param_dict = load_checkpoint(load_path)
#       for item in list(param_dict.keys()):
#           if not item.startswith('backbone'):
#               param_dict.pop(item)
#       load_param_into_net(net, param_dict)

# Add the following code after the optimizer definition, since the MaskRcnn checkpoint includes optimizer parameters:
lr = Tensor(dynamic_lr(config, rank_size=device_num, start_steps=config.pretrain_epoch_size * dataset_size),
            mstype.float32)
opt = Momentum(params=net.trainable_params(), learning_rate=lr, momentum=config.momentum,
               weight_decay=config.weight_decay, loss_scale=config.loss_scale)

if load_path != "":
    param_dict = load_checkpoint(load_path)
    if config.pretrain_epoch_size == 0:
        for item in list(param_dict.keys()):
            if item in ("global_step", "learning_rate") or "rcnn.cls" in item or "rcnn.mask" in item:
                param_dict.pop(item)
    load_param_into_net(net, param_dict)
    load_param_into_net(opt, param_dict)
```
3. This is a processor core binding operation based on `device_num` and the total number of processors. If you do not want this behavior, remove the `taskset` operations in `scripts/run_distribute_train.sh`.
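The `dynamic_lr` call in the snippet above builds the learning rate schedule from `base_lr`, `warmup_step` and `warmup_ratio` in the configuration; a minimal sketch of the warm-up behavior, assuming a linear ramp (illustrative only, not the exact `lr_schedule.py` implementation):

```python
def warmup_lr_sketch(base_lr, warmup_step, warmup_ratio, total_steps):
    """Ramp linearly from base_lr * warmup_ratio up to base_lr, then stay flat."""
    lrs = []
    for step in range(total_steps):
        if step < warmup_step:
            lrs.append(base_lr * (warmup_ratio + (1 - warmup_ratio) * step / warmup_step))
        else:
            lrs.append(base_lr)  # the real generator applies decay over the remaining epochs
    return lrs

lrs = warmup_lr_sketch(base_lr=0.02, warmup_step=500, warmup_ratio=1 / 3.0, total_steps=1000)
print(lrs[0], lrs[499], lrs[500])  # ~0.00667 -> ~0.02 -> 0.02
```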
### [Training Result](#content)

Training results will be stored in the example path, in folders whose names begin with "train" or "train_parallel". You can find checkpoint files together with results like the following in loss_rankid.log.

```bash
# distribute training result (8p)
epoch: 1 step: 7393 ,rpn_loss: 0.05716, rcnn_loss: 0.81152, rpn_cls_loss: 0.04828, rpn_reg_loss: 0.00889, rcnn_cls_loss: 0.28784, rcnn_reg_loss: 0.17590, rcnn_mask_loss: 0.34790, total_loss: 0.86868
epoch: 2 step: 7393 ,rpn_loss: 0.00434, rcnn_loss: 0.36572, rpn_cls_loss: 0.00339, rpn_reg_loss: 0.00095, rcnn_cls_loss: 0.08240, rcnn_reg_loss: 0.05554, rcnn_mask_loss: 0.22778, total_loss: 0.37006
epoch: 3 step: 7393 ,rpn_loss: 0.00996, rcnn_loss: 0.83789, rpn_cls_loss: 0.00701, rpn_reg_loss: 0.00294, rcnn_cls_loss: 0.39478, rcnn_reg_loss: 0.14917, rcnn_mask_loss: 0.29370, total_loss: 0.84785
...
epoch: 10 step: 7393 ,rpn_loss: 0.00667, rcnn_loss: 0.65625, rpn_cls_loss: 0.00536, rpn_reg_loss: 0.00131, rcnn_cls_loss: 0.17590, rcnn_reg_loss: 0.16199, rcnn_mask_loss: 0.31812, total_loss: 0.66292
epoch: 11 step: 7393 ,rpn_loss: 0.02003, rcnn_loss: 0.52051, rpn_cls_loss: 0.01761, rpn_reg_loss: 0.00241, rcnn_cls_loss: 0.16028, rcnn_reg_loss: 0.08411, rcnn_mask_loss: 0.27588, total_loss: 0.54054
epoch: 12 step: 7393 ,rpn_loss: 0.00547, rcnn_loss: 0.39258, rpn_cls_loss: 0.00285, rpn_reg_loss: 0.00262, rcnn_cls_loss: 0.08002, rcnn_reg_loss: 0.04990, rcnn_mask_loss: 0.26245, total_loss: 0.39804
```
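If you want to plot the loss curve, the log lines above are straightforward to parse; a small illustrative sketch (the log file name is an assumption based on the pattern mentioned above):

```python
import re

total_losses = []
with open("loss_0.log") as f:  # hypothetical rank-0 log file
    for line in f:
        match = re.search(r"total_loss: ([0-9.]+)", line)
        if match:
            total_losses.append(float(match.group(1)))
print(total_losses)
```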
## [Evaluation Process](#contents)

### [Evaluation](#content)

- Run `run_eval.sh` for evaluation.

```bash
# infer
bash run_eval.sh [VALIDATION_ANN_FILE_JSON] [CHECKPOINT_PATH]
```

> For the COCO2017 dataset, VALIDATION_ANN_FILE_JSON refers to annotations/instances_val2017.json in the dataset directory.
> Checkpoints are produced and saved during training, in folders whose names begin with "train/checkpoint" or "train_parallel*/checkpoint".
>
> The image sizes in the dataset should match the annotation sizes in VALIDATION_ANN_FILE_JSON; otherwise, the evaluation results cannot be displayed properly.
### [Evaluation result](#content)

Evaluation results will be stored in the example path, in a folder named "eval". Under this folder, you can find results like the following in log.

```bash
Evaluate annotation type *bbox*
Accumulating evaluation results...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.378
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.602
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.407
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.242
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.417
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.480
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.311
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.497
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.524
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.363
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.567
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.647
Evaluate annotation type *segm*
Accumulating evaluation results...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.335
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.557
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.351
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.169
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.365
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.480
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.284
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.433
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.451
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.285
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.490
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.586
```
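These metrics are the standard pycocotools summary; a hedged sketch of how such numbers are produced from a detection results file (the results file name here is illustrative):

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations/instances_val2017.json")    # ground-truth annotations
coco_dt = coco_gt.loadRes("bbox_predictions.json")      # hypothetical detection results file
evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")  # use iouType="segm" for mask metrics
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints an AP/AR table like the one above
```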
## Model Export

```shell
python export.py --ckpt_file [CKPT_PATH] --device_target [DEVICE_TARGET] --file_format [EXPORT_FORMAT]
```

`EXPORT_FORMAT` should be in ["AIR", "ONNX", "MINDIR"].
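Export scripts of this kind rely on MindSpore's `export` API; a minimal sketch of the mechanism with a toy network (the real script builds the MaskRCNN network and loads its checkpoint first):

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor
from mindspore.train.serialization import export

class TinyNet(nn.Cell):
    """Toy network used only to demonstrate the export call."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)

    def construct(self, x):
        return self.conv(x)

net = TinyNet()
dummy_input = Tensor(np.zeros((1, 3, 768, 1280), np.float32))  # matches the configured input size
export(net, dummy_input, file_name="tiny_net", file_format="MINDIR")
```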
## Inference Process

### Usage

Before performing inference, the AIR file must be exported by the export script on the Ascend 910 environment.
Currently, batch_size can only be set to 1. The inference process needs about 600G of hard disk space to save the inference results.

```shell
# Ascend310 inference
bash run_infer_310.sh [AIR_PATH] [DATA_PATH] [ANN_FILE_PATH]
```
### Result

Inference results are saved in the current path; you can find results like the following in the acc.log file.

```bash
Evaluate annotation type *bbox*
Accumulating evaluation results...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.3368
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.589
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.394
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.218
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.411
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.476
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.305
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.489
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.514
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.323
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.562
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.657
Evaluate annotation type *segm*
Accumulating evaluation results...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.323
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.544
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.336
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.147
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.353
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.479
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.422
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.439
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.248
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.478
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.594
```
# [Model Description](#contents)

## [Performance](#contents)

### [Evaluation Performance](#contents)

| Parameters                 | Ascend                                                      |
| -------------------------- | ----------------------------------------------------------- |
| Model Version              | V1                                                          |
| Resource                   | Ascend 910; CPU 2.60GHz, 192 cores; Memory, 755G            |
| Uploaded Date              | 08/01/2020 (month/day/year)                                 |
| MindSpore Version          | 1.0.0                                                       |
| Dataset                    | COCO2017                                                    |
| Training Parameters        | epoch=12, batch_size=2                                      |
| Optimizer                  | SGD                                                         |
| Loss Function              | Softmax Cross Entropy, Sigmoid Cross Entropy, SmoothL1Loss  |
| Output                     | Probability                                                 |
| Loss                       | 0.39804                                                     |
| Speed                      | 1pc: 193 ms/step; 8pcs: 207 ms/step                         |
| Total time                 | 1pc: 46 hours; 8pcs: 5.38 hours                             |
| Parameters (M)             | 84.8                                                        |
| Checkpoint for Fine tuning | 85M (.ckpt file)                                            |
| Model for inference        | 571M (.air file)                                            |
| Scripts                    | [maskrcnn script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/maskrcnn) |
### [Inference Performance](#contents)

| Parameters          | Ascend                                        |
| ------------------- | --------------------------------------------- |
| Model Version       | V1                                            |
| Resource            | Ascend 910                                    |
| Uploaded Date       | 08/01/2020 (month/day/year)                   |
| MindSpore Version   | 1.0.0                                         |
| Dataset             | COCO2017                                      |
| batch_size          | 2                                             |
| outputs             | mAP                                           |
| Accuracy            | IoU=0.50:0.95 (BoundingBox 37.0%, Mask 33.5%) |
| Model for inference | 170M (.ckpt file)                             |
# [Description of Random Situation](#contents)

In dataset.py, we set the seed inside the "create_dataset" function. We also use a random seed in train.py for weight initialization.
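For reference, a hedged sketch of how such seeds are typically fixed in MindSpore (the exact seed values used by the scripts may differ):

```python
import mindspore.dataset as ds
from mindspore.common import set_seed

set_seed(1)            # fixes global randomness such as weight initialization, as in train.py-style scripts
ds.config.set_seed(1)  # fixes dataset shuffling randomness, as inside a create_dataset-style function
```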
# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).