# Contents

- [MaskRCNN Description](#maskrcnn-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
        - [Training Script Parameters](#training-script-parameters)
        - [Parameters Configuration](#parameters-configuration)
    - [Training Process](#training-process)
        - [Training](#training)
        - [Distributed Training](#distributed-training)
        - [Training Result](#training-result)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
        - [Evaluation Result](#evaluation-result)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Training Performance](#training-performance)
        - [Evaluation Performance](#evaluation-performance)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [MaskRCNN Description](#contents)

MaskRCNN is a conceptually simple, flexible, and general framework for object instance segmentation. The approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework.

It shows top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing single-model entries on every task, including the COCO 2016 challenge winners.
# [Model Architecture](#contents)

MaskRCNN is a two-stage object detection network. It extends FasterRCNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. The network uses a region proposal network (RPN), which shares the convolutional features of the whole image with the detection network, so that the computation of region proposals is nearly cost-free. The whole network further combines the RPN and the mask branch into one network by sharing the convolutional features.

[Paper](http://cn.arxiv.org/pdf/1703.06870v3): Kaiming He, Georgia Gkioxari, Piotr Dollár and Ross Girshick. "Mask R-CNN"
# [Dataset](#contents)

- [COCO2017](https://cocodataset.org/) is a popular dataset with bounding-box and pixel-level stuff annotations. These annotations can be used for scene understanding tasks like semantic segmentation, object detection and image captioning. There are 118K/5K images for train/val.
- Dataset size: 19G
    - Train: 18G, 118000 images
    - Val: 1G, 5000 images
    - Annotations: 241M; instances, captions, person_keypoints, etc.
- Data format: image and json files
- Note: Data will be processed in dataset.py.
# [Environment Requirements](#contents)

- Hardware (Ascend)
    - Prepare a hardware environment with Ascend processors. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get access to the resources.
- Framework
    - [MindSpore](https://gitee.com/mindspore/mindspore)
- For more information, please check the resources below:
    - [MindSpore tutorials](https://www.mindspore.cn/tutorial/en/master/index.html)
    - [MindSpore API](https://www.mindspore.cn/api/en/master/index.html)
- Third-party libraries

```bash
pip install Cython
pip install pycocotools
pip install mmcv==0.2.14
```
# [Quick Start](#contents)

1. Download the dataset COCO2017.

2. Change COCO_ROOT and the other settings you need in `config.py`. The directory structure should look as follows:

    ```
    .
    └─cocodataset
      ├─annotations
        ├─instances_train2017.json
        └─instances_val2017.json
      ├─val2017
      └─train2017
    ```

    If you use your own dataset to train the network, **set `dataset` to `other` when running the script.**
    Create a txt file to store the dataset information, organized as follows:

    ```
    train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2
    ```

    Each row is an image annotation split by spaces. The first column is the relative path of the image, followed by columns containing box and class information in the format [xmin,ymin,xmax,ymax,class]. Images are read from the path obtained by joining `IMAGE_DIR` (the dataset directory) with the relative paths listed in `ANNO_PATH` (the path of the TXT file); both can be set in `config.py`.
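    For illustration, here is a minimal Python sketch of parsing one such row; `parse_annotation_line` is a hypothetical helper, not part of this repository:

    ```python
    # Hypothetical helper for illustration; not part of this repository.
    def parse_annotation_line(line):
        """Parse one row of the custom-dataset txt file into a path and boxes."""
        parts = line.strip().split(" ")
        image_path = parts[0]  # relative image path, e.g. train2017/0000001.jpg
        boxes = []
        for field in parts[1:]:
            xmin, ymin, xmax, ymax, cls = map(int, field.split(","))
            boxes.append([xmin, ymin, xmax, ymax, cls])
        return image_path, boxes

    path, boxes = parse_annotation_line("train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2")
    print(path, boxes)  # train2017/0000001.jpg [[0, 259, 401, 459, 7], [35, 28, 324, 201, 2]]
    ```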
3. Execute the train script.

    After dataset preparation, you can start training as follows:

    ```
    # distributed training
    sh run_distribute_train.sh [RANK_TABLE_FILE] [PRETRAINED_CKPT]

    # standalone training
    sh run_standalone_train.sh [PRETRAINED_CKPT]
    ```

    Note:
    1. To speed up data preprocessing, MindSpore provides a data format named MindRecord, hence the first step is to generate MindRecord files from the COCO2017 dataset before training. Converting the raw COCO2017 dataset to MindRecord format may take about 4 hours. A sketch of this conversion is shown after this list.
    2. For distributed training, a [hccl configuration file](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools) in JSON format needs to be created in advance.
    3. PRETRAINED_CKPT is a resnet50 checkpoint trained on ImageNet2012.
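For reference, a minimal sketch of writing images and annotations to MindRecord with MindSpore's `FileWriter`; the schema fields below are illustrative assumptions, and the repository's `src/dataset.py` performs the actual conversion:

```python
# Minimal MindRecord-writing sketch; the real conversion lives in src/dataset.py.
# The schema fields below are illustrative assumptions, not the repository's exact schema.
import numpy as np
from mindspore.mindrecord import FileWriter

writer = FileWriter(file_name="maskrcnn.mindrecord", shard_num=8)
schema = {
    "image": {"type": "bytes"},                        # raw encoded image bytes
    "annotation": {"type": "int32", "shape": [-1, 6]}  # per-box [x1, y1, x2, y2, class, iscrowd]
}
writer.add_schema(schema, "maskrcnn_schema")

with open("train2017/0000001.jpg", "rb") as f:
    img_bytes = f.read()
row = {
    "image": img_bytes,
    "annotation": np.array([[0, 259, 401, 459, 7, 0],
                            [35, 28, 324, 201, 2, 0]], dtype=np.int32),
}
writer.write_raw_data([row])
writer.commit()
```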
4. Execute the eval script.

    After training, you can start evaluation as follows:

    ```bash
    # Evaluation
    sh run_eval.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH]
    ```

    Note:
    1. VALIDATION_JSON_FILE is the label json file used for evaluation.
# [Script Description](#contents)

## [Script and Sample Code](#contents)

```shell
.
└─MaskRcnn
  ├─README.md                        # README
  ├─scripts                          # shell scripts
    ├─run_standalone_train.sh        # training in standalone mode (1 pc)
    ├─run_distribute_train.sh        # training in parallel mode (8 pcs)
    └─run_eval.sh                    # evaluation
  ├─src
    ├─maskrcnn
      ├─__init__.py
      ├─anchor_generator.py          # generate base bounding box anchors
      ├─bbox_assign_sample.py        # filter positive and negative bboxes for the first-stage learning
      ├─bbox_assign_sample_stage2.py # filter positive and negative bboxes for the second-stage learning
      ├─mask_rcnn_r50.py             # main network architecture of maskrcnn
      ├─fpn_neck.py                  # fpn network
      ├─proposal_generator.py        # generate proposals based on feature maps
      ├─rcnn_cls.py                  # rcnn bounding box regression branch
      ├─rcnn_mask.py                 # rcnn mask branch
      ├─resnet50.py                  # backbone network
      ├─roi_align.py                 # roi align network
      └─rpn.py                       # region proposal network
    ├─config.py                      # network configuration
    ├─dataset.py                     # dataset utils
    ├─lr_schedule.py                 # learning rate generator
    ├─network_define.py              # network definition for maskrcnn
    └─util.py                        # routine operations
  ├─eval.py                          # evaluation script
  └─train.py                         # training script
```
## [Script Parameters](#contents)

### [Training Script Parameters](#contents)

```
# distributed training
Usage: sh run_distribute_train.sh [RANK_TABLE_FILE] [PRETRAINED_MODEL]

# standalone training
Usage: sh run_standalone_train.sh [PRETRAINED_MODEL]
```
### [Parameters Configuration](#contents)

```
"img_width": 1280, # width of the input images
"img_height": 768, # height of the input images

# random threshold in data augmentation
"keep_ratio": True,
"flip_ratio": 0.5,
"photo_ratio": 0.5,
"expand_ratio": 1.0,

"max_instance_count": 128, # max number of bboxes for each image
"mask_shape": (28, 28), # shape of mask in rcnn_mask

# anchor
"feature_shapes": [(192, 320), (96, 160), (48, 80), (24, 40), (12, 20)], # shapes of the fpn feature maps
"anchor_scales": [8], # area of base anchor
"anchor_ratios": [0.5, 1.0, 2.0], # ratios between width and height of base anchors
"anchor_strides": [4, 8, 16, 32, 64], # stride size of each feature map level
"num_anchors": 3, # number of anchors for each pixel

# resnet
"resnet_block": [3, 4, 6, 3], # number of blocks in each layer
"resnet_in_channels": [64, 256, 512, 1024], # in channel size for each layer
"resnet_out_channels": [256, 512, 1024, 2048], # out channel size for each layer

# fpn
"fpn_in_channels": [256, 512, 1024, 2048], # in channel size for each layer
"fpn_out_channels": 256, # out channel size for every layer
"fpn_num_outs": 5, # number of output feature maps

# rpn
"rpn_in_channels": 256, # in channel size
"rpn_feat_channels": 256, # feature out channel size
"rpn_loss_cls_weight": 1.0, # weight of bbox classification in rpn loss
"rpn_loss_reg_weight": 1.0, # weight of bbox regression in rpn loss
"rpn_cls_out_channels": 1, # classification out channel size
"rpn_target_means": [0., 0., 0., 0.], # bounding box decode/encode means
"rpn_target_stds": [1.0, 1.0, 1.0, 1.0], # bounding box decode/encode stds

# bbox_assign_sampler
"neg_iou_thr": 0.3, # negative sample IOU threshold
"pos_iou_thr": 0.7, # positive sample IOU threshold
"min_pos_iou": 0.3, # minimal positive sample IOU threshold
"num_bboxes": 245520, # total number of bboxes
"num_gts": 128, # total number of ground truths
"num_expected_neg": 256, # number of negative samples
"num_expected_pos": 128, # number of positive samples

# proposal
"activate_num_classes": 2, # number of classes in rpn classification
"use_sigmoid_cls": True, # whether to use sigmoid as the loss function in rpn classification

# roi_align
"roi_layer": dict(type='RoIAlign', out_size=7, mask_out_size=14, sample_num=2), # ROIAlign parameters
"roi_align_out_channels": 256, # ROIAlign out channel size
"roi_align_featmap_strides": [4, 8, 16, 32], # stride sizes for the different levels of ROIAlign feature maps
"roi_align_finest_scale": 56, # finest scale for ROIAlign
"roi_sample_num": 640, # sample number in the ROIAlign layer

# bbox_assign_sampler_stage2 (bbox assignment and sampling for the second stage; parameter meanings are similar to bbox_assign_sampler)
"neg_iou_thr_stage2": 0.5,
"pos_iou_thr_stage2": 0.5,
"min_pos_iou_stage2": 0.5,
"num_bboxes_stage2": 2000,
"num_expected_pos_stage2": 128,
"num_expected_neg_stage2": 512,
"num_expected_total_stage2": 512,

# rcnn (rcnn parameters for the second stage; parameter meanings are similar to fpn)
"rcnn_num_layers": 2,
"rcnn_in_channels": 256,
"rcnn_fc_out_channels": 1024,
"rcnn_mask_out_channels": 256,
"rcnn_loss_cls_weight": 1,
"rcnn_loss_reg_weight": 1,
"rcnn_loss_mask_fb_weight": 1,
"rcnn_target_means": [0., 0., 0., 0.],
"rcnn_target_stds": [0.1, 0.1, 0.2, 0.2],

# train proposal
"rpn_proposal_nms_across_levels": False,
"rpn_proposal_nms_pre": 2000, # number of proposals before nms in rpn
"rpn_proposal_nms_post": 2000, # number of proposals after nms in rpn
"rpn_proposal_max_num": 2000, # max number of proposals in rpn
"rpn_proposal_nms_thr": 0.7, # nms threshold in rpn
"rpn_proposal_min_bbox_size": 0, # min size of boxes in rpn

# test proposal (most parameters are similar to train proposal)
"rpn_nms_across_levels": False,
"rpn_nms_pre": 1000,
"rpn_nms_post": 1000,
"rpn_max_num": 1000,
"rpn_nms_thr": 0.7,
"rpn_min_bbox_min_size": 0,
"test_score_thr": 0.05, # score threshold
"test_iou_thr": 0.5, # IOU threshold
"test_max_per_img": 100, # max number of instances
"test_batch_size": 2, # batch size
"rpn_head_loss_type": "CrossEntropyLoss", # loss type in rpn
"rpn_head_use_sigmoid": True, # whether to use sigmoid in rpn
"rpn_head_weight": 1.0, # rpn head weight in loss
"mask_thr_binary": 0.5, # mask threshold in rcnn

# LR
"base_lr": 0.02, # base learning rate
"base_step": 58633, # base step in lr generator
"total_epoch": 13, # total epochs in lr generator
"warmup_step": 500, # warm-up steps in lr generator
"warmup_mode": "linear", # warm-up mode
"warmup_ratio": 1/3.0, # warm-up ratio
"sgd_momentum": 0.9, # momentum in optimizer

# train
"batch_size": 2,
"loss_scale": 1,
"momentum": 0.91,
"weight_decay": 1e-4,
"pretrain_epoch_size": 0, # pretrained epoch size
"epoch_size": 12, # total epoch size
"save_checkpoint": True, # whether to save checkpoints
"save_checkpoint_epochs": 1, # checkpoint saving interval
"keep_checkpoint_max": 12, # max number of saved checkpoints
"save_checkpoint_path": "./checkpoint", # path of checkpoints
"mindrecord_dir": "/home/maskrcnn/MindRecord_COCO2017_Train", # path of mindrecord files
"coco_root": "/home/maskrcnn/", # path of the coco root dataset
"train_data_type": "train2017", # name of the train dataset
"val_data_type": "val2017", # name of the evaluation dataset
"instance_set": "annotations/instances_{}.json", # name of the annotation files
"coco_classes": ('background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
                 'train', 'truck', 'boat', 'traffic light', 'fire hydrant',
                 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog',
                 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra',
                 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie',
                 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
                 'kite', 'baseball bat', 'baseball glove', 'skateboard',
                 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
                 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
                 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
                 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
                 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
                 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink',
                 'refrigerator', 'book', 'clock', 'vase', 'scissors',
                 'teddy bear', 'hair drier', 'toothbrush'),
"num_classes": 81
```
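As a sanity check on two of the derived values, `num_bboxes` is just the anchor count summed over the five FPN feature maps, and the linear warm-up follows the conventional mmdetection-style ramp. The following is an illustrative sketch, not code from this repository, and the warm-up formula is an assumption about `lr_schedule.py`'s behavior:

```python
# Illustrative sketch (not repository code): deriving num_bboxes and the warm-up lr.
feature_shapes = [(192, 320), (96, 160), (48, 80), (24, 40), (12, 20)]
num_anchors = 3

# num_bboxes = sum over FPN levels of H * W * anchors-per-pixel
num_bboxes = sum(h * w * num_anchors for h, w in feature_shapes)
print(num_bboxes)  # 245520, matching "num_bboxes" above

# Conventional linear warm-up ramp (an assumption about lr_schedule.py):
base_lr, warmup_step, warmup_ratio = 0.02, 500, 1 / 3.0

def warmup_lr(step):
    """Ramp linearly from base_lr * warmup_ratio to base_lr over warmup_step steps."""
    if step >= warmup_step:
        return base_lr
    return base_lr * (warmup_ratio + (1 - warmup_ratio) * step / warmup_step)

print(warmup_lr(0), warmup_lr(500))  # ~0.00667 -> 0.02
```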
## [Training Process](#contents)

- Set options in `config.py`, including loss_scale, learning rate and network hyperparameters. Click [here](https://www.mindspore.cn/tutorial/zh-CN/master/use/data_preparation/loading_the_datasets.html#mindspore) for more information about the dataset.

### [Training](#content)

- Run `run_standalone_train.sh` for non-distributed training of the MaskRCNN model.

```
# standalone training
sh run_standalone_train.sh [PRETRAINED_MODEL]
```
### [Distributed Training](#content)

- Run `run_distribute_train.sh` for distributed training of the MaskRCNN model.

```
sh run_distribute_train.sh [RANK_TABLE_FILE] [PRETRAINED_MODEL]
```

> The hccl.json file specified by RANK_TABLE_FILE is needed when you are running a distributed task. You can generate it with the [hccl_tools](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools); a sketch of its rough shape is shown below.
> As for PRETRAINED_MODEL, if it is not set, the model will be trained from scratch. Ready-made pretrained models are not available yet. Stay tuned.
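For orientation only, here is a Python snippet that writes a skeleton single-server rank table. The exact field names and the IP values are assumptions based on the hccl_tools output format, so prefer the file generated by hccl_tools in practice:

```python
# Skeleton rank table for illustration only; field names and IPs are assumptions
# based on the hccl_tools output format. Use the hccl_tools-generated file in practice.
import json

rank_table = {
    "version": "1.0",
    "server_count": "1",
    "server_list": [{
        "server_id": "10.0.0.1",  # hypothetical host IP
        "device": [
            {"device_id": str(i), "device_ip": f"192.98.92.{i}", "rank_id": str(i)}
            for i in range(8)     # 8 Ascend devices on one server
        ],
        "host_nic_ip": "reserve",
    }],
    "status": "completed",
}

with open("hccl_8p.json", "w") as f:
    json.dump(rank_table, f, indent=4)
```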
### [Training Result](#content)

Training results will be stored in the example path, in folders whose names begin with "train" or "train_parallel". You can find the checkpoint files together with results like the following in loss_rankid.log.

```
# distribute training result (8p)
epoch: 1 step: 7393 ,rpn_loss: 0.10626, rcnn_loss: 0.81592, rpn_cls_loss: 0.05862, rpn_reg_loss: 0.04761, rcnn_cls_loss: 0.32642, rcnn_reg_loss: 0.15503, rcnn_mask_loss: 0.33447, total_loss: 0.92218
epoch: 2 step: 7393 ,rpn_loss: 0.00911, rcnn_loss: 0.34082, rpn_cls_loss: 0.00341, rpn_reg_loss: 0.00571, rcnn_cls_loss: 0.07440, rcnn_reg_loss: 0.05872, rcnn_mask_loss: 0.20764, total_loss: 0.34993
epoch: 3 step: 7393 ,rpn_loss: 0.02087, rcnn_loss: 0.98633, rpn_cls_loss: 0.00665, rpn_reg_loss: 0.01422, rcnn_cls_loss: 0.35913, rcnn_reg_loss: 0.21375, rcnn_mask_loss: 0.41382, total_loss: 1.00720
...
epoch: 10 step: 7393 ,rpn_loss: 0.02122, rcnn_loss: 0.55176, rpn_cls_loss: 0.00620, rpn_reg_loss: 0.01503, rcnn_cls_loss: 0.12708, rcnn_reg_loss: 0.10254, rcnn_mask_loss: 0.32227, total_loss: 0.57298
epoch: 11 step: 7393 ,rpn_loss: 0.03772, rcnn_loss: 0.60791, rpn_cls_loss: 0.03058, rpn_reg_loss: 0.00713, rcnn_cls_loss: 0.23987, rcnn_reg_loss: 0.11743, rcnn_mask_loss: 0.25049, total_loss: 0.64563
epoch: 12 step: 7393 ,rpn_loss: 0.06482, rcnn_loss: 0.47681, rpn_cls_loss: 0.04770, rpn_reg_loss: 0.01709, rcnn_cls_loss: 0.16492, rcnn_reg_loss: 0.04990, rcnn_mask_loss: 0.26196, total_loss: 0.54163
```
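If you want to track loss curves, these log lines are easy to parse. A small hypothetical helper (not part of this repository):

```python
# Hypothetical helper for extracting loss curves from a loss_rankid.log file.
import re

LINE_RE = re.compile(r"epoch: (\d+) step: (\d+) .*total_loss: ([\d.]+)")

def parse_total_losses(log_path):
    """Return (epoch, step, total_loss) tuples from a loss_rankid.log file."""
    records = []
    with open(log_path) as f:
        for line in f:
            m = LINE_RE.search(line)
            if m:
                records.append((int(m.group(1)), int(m.group(2)), float(m.group(3))))
    return records
```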
## [Evaluation Process](#contents)

### [Evaluation](#content)

- Run `run_eval.sh` for evaluation.

```
# infer
sh run_eval.sh [VALIDATION_ANN_FILE_JSON] [CHECKPOINT_PATH]
```

> For the COCO2017 dataset, VALIDATION_ANN_FILE_JSON refers to annotations/instances_val2017.json in the dataset directory.
> Checkpoints can be produced and saved during the training process, in folders whose names begin with "train/checkpoint" or "train_parallel*/checkpoint".
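Under the hood, COCO-style evaluation of this kind is typically driven by pycocotools. A minimal sketch, assuming a detection result file `results.json` in COCO format has already been produced by inference (the repository's `eval.py` handles these steps for you):

```python
# Minimal pycocotools evaluation sketch; eval.py performs the equivalent steps.
# "results.json" is an assumed COCO-format detection-result file.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations/instances_val2017.json")  # ground-truth annotations
coco_dt = coco_gt.loadRes("results.json")             # predicted boxes/masks

for iou_type in ("bbox", "segm"):                     # the two tables reported below
    coco_eval = COCOeval(coco_gt, coco_dt, iou_type)
    coco_eval.evaluate()
    coco_eval.accumulate()
    coco_eval.summarize()                             # prints the AP/AR lines shown below
```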
### [Evaluation result](#content)

Inference results will be stored in the example path, in a folder named "eval". Under it, you can find results like the following in the log.

```
Evaluate annotation type *bbox*
Accumulating evaluation results...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.376
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.598
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.405
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.239
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.414
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.475
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.311
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.500
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.528
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.371
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.572
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.653

Evaluate annotation type *segm*
Accumulating evaluation results...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.326
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.553
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.344
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.169
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.356
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.462
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.278
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.426
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.445
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.294
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.484
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.558
```
# Model Description

## Performance

### Training Performance

| Parameters                 | MaskRCNN                                                     |
| -------------------------- | ------------------------------------------------------------ |
| Model Version              | V1                                                           |
| Resource                   | Ascend 910; CPU 2.60GHz, 56 cores; Memory 314G               |
| Uploaded Date              | 08/01/2020 (month/day/year)                                  |
| MindSpore Version          | 0.6.0-alpha                                                  |
| Dataset                    | COCO2017                                                     |
| Training Parameters        | epoch=12, batch_size=2                                       |
| Optimizer                  | SGD                                                          |
| Loss Function              | Softmax Cross Entropy, Sigmoid Cross Entropy, SmoothL1Loss   |
| Speed                      | 1pc: 250 ms/step; 8pcs: 260 ms/step                          |
| Total time                 | 1pc: 52 hours; 8pcs: 6.6 hours                               |
| Parameters (M)             | 280                                                          |
| Scripts                    | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/maskrcnn |
### Evaluation Performance

| Parameters          | MaskRCNN                    |
| ------------------- | --------------------------- |
| Model Version       | V1                          |
| Resource            | Ascend 910                  |
| Uploaded Date       | 08/01/2020 (month/day/year) |
| MindSpore Version   | 0.6.0-alpha                 |
| Dataset             | COCO2017                    |
| batch_size          | 2                           |
| outputs             | mAP                         |
| Accuracy            | IoU=0.50:0.95 32.4%         |
| Model for inference | 254M (.ckpt file)           |
# [Description of Random Situation](#contents)

In dataset.py, we set the seed inside the "create_dataset" function. We also use a random seed in train.py for weight initialization.
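For example, seeds for the dataset pipeline and for numpy-based initialization can be fixed as follows. This is an illustrative sketch; the actual seed values used by this repository are set in its own scripts:

```python
# Illustrative sketch of fixing seeds; the repository sets its own values
# inside dataset.py and train.py.
import numpy as np
import mindspore.dataset as ds

ds.config.set_seed(1)  # makes dataset shuffling/augmentation reproducible
np.random.seed(1)      # makes numpy-based weight initialization reproducible
```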
# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).