# Tutorial 1: Learn about Configs

We incorporate modular and inheritance design into our config system, which makes it convenient to conduct various experiments.
If you wish to inspect the config file, you may run `python tools/misc/print_config.py /PATH/TO/CONFIG` to see the complete config.

## Modify config through script arguments

When submitting jobs using `tools/train.py` or `tools/test.py`, you may specify `--cfg-options` to modify the config in place.

- Update config keys of dict chains.

  The config options can be specified following the order of the dict keys in the original config.
  For example, `--cfg-options model.backbone.norm_eval=False` changes all BN modules in model backbones to `train` mode.

- Update keys inside a list of configs.

  Some config dicts are composed as a list in your config. For example, the training pipeline `data.train.pipeline` is normally a list,
  e.g. `[dict(type='LoadImageFromFile'), ...]`. If you want to change `'LoadImageFromFile'` to `'LoadImageFromWebcam'` in the pipeline,
  you may specify `--cfg-options data.train.pipeline.0.type=LoadImageFromWebcam`.

- Update values of lists/tuples.

  The value to be updated may be a list or a tuple. For example, the config file normally sets `workflow=[('train', 1)]`. If you want to
  change this key, you may specify `--cfg-options workflow="[(train,1),(val,1)]"`. Note that the quotation mark `"` is necessary to
  support list/tuple data types, and that **NO** white space is allowed inside the quotation marks in the specified value.
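
To make the dotted-key convention above concrete, here is a minimal sketch in plain Python of how a key such as `data.train.pipeline.0.type` could be resolved against nested dicts and lists. The helper `set_by_dotted_key` is a hypothetical name used only for illustration; it is not the parser behind `--cfg-options`, and it does not parse value strings such as `"[(train,1),(val,1)]"`.

```python
def set_by_dotted_key(cfg, dotted_key, value):
    """Set a nested value addressed by a key written as 'a.b.0.c'.

    Whenever the node being traversed is a list, the next key part is
    converted to an integer index; otherwise it is used as a dict key.
    """
    parts = dotted_key.split('.')
    node = cfg
    for part in parts[:-1]:
        node = node[int(part)] if isinstance(node, list) else node[part]
    last = parts[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value
    return cfg


# A tiny stand-in for a real config, using key names from the examples above.
cfg = dict(
    model=dict(backbone=dict(norm_eval=True)),
    data=dict(train=dict(pipeline=[dict(type='LoadImageFromFile')])),
)
set_by_dotted_key(cfg, 'model.backbone.norm_eval', False)
set_by_dotted_key(cfg, 'data.train.pipeline.0.type', 'LoadImageFromWebcam')
print(cfg['model']['backbone']['norm_eval'])        # False
print(cfg['data']['train']['pipeline'][0]['type'])  # LoadImageFromWebcam
```
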

## Config File Structure

There are 4 basic component types under `configs/_base_`: dataset, model, schedule, and default_runtime.
Many methods, such as Faster R-CNN, Mask R-CNN, Cascade R-CNN, RPN, and SSD, can be constructed easily with one of each.
The configs that are composed of components from `_base_` are called _primitive_.

For all configs under the same folder, it is recommended to have only **one** _primitive_ config. All other configs should inherit from the _primitive_ config. In this way, the maximum inheritance level is 3.

For easy understanding, we recommend that contributors inherit from existing methods.
For example, if some modification is made based on Faster R-CNN, users may first inherit the basic Faster R-CNN structure by specifying `_base_ = ../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py`, then modify the necessary fields in the config files.

If you are building an entirely new method that does not share its structure with any of the existing methods, you may create a folder `xxx_rcnn` under `configs`.

Please refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html) for detailed documentation.

## Config Name Style

We follow the style below to name config files. Contributors are advised to follow the same style.

```
{model}_[model setting]_{backbone}_{neck}_[norm setting]_[misc]_[gpu x batch_per_gpu]_{schedule}_{dataset}
```

`{xxx}` is a required field and `[yyy]` is optional.

- `{model}`: model type like `faster_rcnn`, `mask_rcnn`, etc.
- `[model setting]`: specific setting for some models, like `without_semantic` for `htc`, `moment` for `reppoints`, etc.
- `{backbone}`: backbone type like `r50` (ResNet-50), `x101` (ResNeXt-101).
- `{neck}`: neck type like `fpn`, `pafpn`, `nasfpn`, `c4`.
- `[norm setting]`: `bn` (Batch Normalization) is used unless specified; other norm layer types could be `gn` (Group Normalization) or `syncbn` (Synchronized Batch Normalization).
  `gn-head`/`gn-neck` indicates GN is applied in the head/neck only, while `gn-all` means GN is applied in the entire model, i.e. backbone, neck, and head.
- `[misc]`: miscellaneous settings/plugins of the model, e.g. `dconv`, `gcb`, `attention`, `albu`, `mstrain`.
- `[gpu x batch_per_gpu]`: number of GPUs and samples per GPU; `8x2` is used by default.
- `{schedule}`: training schedule; options are `1x`, `2x`, `20e`, etc.
  `1x` and `2x` mean 12 epochs and 24 epochs respectively.
  `20e` is adopted in cascade models and denotes 20 epochs.
  For `1x`/`2x`, the initial learning rate decays by a factor of 10 at the 8th/16th and 11th/22nd epochs.
  For `20e`, the initial learning rate decays by a factor of 10 at the 16th and 19th epochs.
- `{dataset}`: dataset like `coco`, `cityscapes`, `voc_0712`, `wider_face`.
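
The naming rule above can be sketched as a small helper that joins the fields in the documented order and skips optional fields that are not set. The function `build_config_name` and its parameter names are hypothetical and exist only to illustrate the convention; the second generated name is likewise an illustration, not necessarily an existing config file.

```python
def build_config_name(model, backbone, neck, schedule, dataset,
                      model_setting=None, norm_setting=None,
                      misc=None, gpu_batch=None):
    """Join the fields in the documented order, skipping unset optional ones."""
    fields = [model, model_setting, backbone, neck, norm_setting,
              misc, gpu_batch, schedule, dataset]
    return '_'.join(f for f in fields if f)


print(build_config_name('faster_rcnn', 'r50', 'fpn', '1x', 'coco'))
# faster_rcnn_r50_fpn_1x_coco
print(build_config_name('mask_rcnn', 'r50', 'fpn', '2x', 'coco',
                        norm_setting='gn-all', misc='mstrain'))
# mask_rcnn_r50_fpn_gn-all_mstrain_2x_coco
```
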

## Deprecated train_cfg/test_cfg

Specifying `train_cfg` and `test_cfg` at the top level of the config file is deprecated; please specify them in the model config instead. The original config structure is as below.

```python
# deprecated
model = dict(
    type=...,
    ...
)
train_cfg=dict(...)
test_cfg=dict(...)
```

The migration example is as below.

```python
# recommended
model = dict(
    type=...,
    ...
    train_cfg=dict(...),
    test_cfg=dict(...),
)
```
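
For configs that are already loaded as plain dicts, the migration above could be sketched as follows. The helper `migrate_train_test_cfg` is a hypothetical name, not a function provided by MMDetection, and the dict contents are trimmed placeholders.

```python
import copy


def migrate_train_test_cfg(cfg):
    """Return a copy of cfg with top-level train_cfg/test_cfg moved into model."""
    new_cfg = copy.deepcopy(cfg)
    for key in ('train_cfg', 'test_cfg'):
        if key in new_cfg:
            value = new_cfg.pop(key)
            # Do not silently overwrite a value already set inside the model.
            new_cfg['model'].setdefault(key, value)
    return new_cfg


old_style = dict(
    model=dict(type='MaskRCNN'),
    train_cfg=dict(rcnn=dict(mask_size=28)),
    test_cfg=dict(rcnn=dict(score_thr=0.05)),
)
new_style = migrate_train_test_cfg(old_style)
print(sorted(new_style))           # ['model']
print(sorted(new_style['model']))  # ['test_cfg', 'train_cfg', 'type']
```
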

## An Example of Mask R-CNN

To give users a basic idea of a complete config and of the modules in a modern detection system,
we briefly comment on the config of Mask R-CNN using ResNet50 and FPN below.
For more detailed usage and the corresponding alternatives for each module, please refer to the API documentation.

```python
model = dict(
    type='MaskRCNN',  # The name of detector
    backbone=dict(  # The config of backbone
        type='ResNet',  # The type of the backbone, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/backbones/resnet.py#L308 for more details.
        depth=50,  # The depth of backbone, usually it is 50 or 101 for ResNet and ResNeXt backbones.
        num_stages=4,  # Number of stages of the backbone.
        out_indices=(0, 1, 2, 3),  # The indices of the output feature maps produced in each stage
        frozen_stages=1,  # The weights in the first stage are frozen
        norm_cfg=dict(  # The config of normalization layers.
            type='BN',  # Type of norm layer, usually it is BN or GN
            requires_grad=True),  # Whether to train the gamma and beta in BN
        norm_eval=True,  # Whether to freeze the statistics in BN
        style='pytorch',  # The style of backbone, 'pytorch' means that stride 2 layers are in 3x3 conv, 'caffe' means stride 2 layers are in 1x1 convs.
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),  # The ImageNet pretrained backbone to be loaded
    neck=dict(
        type='FPN',  # The neck of detector is FPN. We also support 'NASFPN', 'PAFPN', etc. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/necks/fpn.py#L10 for more details.
        in_channels=[256, 512, 1024, 2048],  # The input channels, this is consistent with the output channels of backbone
        out_channels=256,  # The output channels of each level of the pyramid feature map
        num_outs=5),  # The number of output scales
    rpn_head=dict(
        type='RPNHead',  # The type of RPN head is 'RPNHead', we also support 'GARPNHead', etc. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/dense_heads/rpn_head.py#L12 for more details.
        in_channels=256,  # The input channels of each input feature map, this is consistent with the output channels of neck
        feat_channels=256,  # Feature channels of convolutional layers in the head.
        anchor_generator=dict(  # The config of anchor generator
            type='AnchorGenerator',  # Most methods use AnchorGenerator, SSD detectors use `SSDAnchorGenerator`. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/anchor/anchor_generator.py#L10 for more details
            scales=[8],  # Basic scale of the anchor, the size (side length before applying ratios) of the anchor in one position of a feature map will be scale * base_sizes
            ratios=[0.5, 1.0, 2.0],  # The ratio between height and width.
            strides=[4, 8, 16, 32, 64]),  # The strides of the anchor generator. This is consistent with the FPN feature strides. The strides will be taken as base_sizes if base_sizes is not set.
        bbox_coder=dict(  # Config of box coder to encode and decode the boxes during training and testing
            type='DeltaXYWHBBoxCoder',  # Type of box coder. 'DeltaXYWHBBoxCoder' is applied for most methods. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/coder/delta_xywh_bbox_coder.py#L9 for more details.
            target_means=[0.0, 0.0, 0.0, 0.0],  # The target means used to encode and decode boxes
            target_stds=[1.0, 1.0, 1.0, 1.0]),  # The standard deviations used to encode and decode boxes
        loss_cls=dict(  # Config of loss function for the classification branch
            type='CrossEntropyLoss',  # Type of loss for classification branch, we also support FocalLoss etc.
            use_sigmoid=True,  # RPN usually performs two-class classification, so it usually uses the sigmoid function.
            loss_weight=1.0),  # Loss weight of the classification branch.
        loss_bbox=dict(  # Config of loss function for the regression branch.
            type='L1Loss',  # Type of loss, we also support many IoU losses and smooth L1 loss, etc. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/losses/smooth_l1_loss.py#L56 for implementation.
            loss_weight=1.0)),  # Loss weight of the regression branch.
    roi_head=dict(  # RoIHead encapsulates the second stage of two-stage/cascade detectors.
        type='StandardRoIHead',  # Type of the RoI head. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/standard_roi_head.py#L10 for implementation.
        bbox_roi_extractor=dict(  # RoI feature extractor for bbox regression.
            type='SingleRoIExtractor',  # Type of the RoI feature extractor, most methods use SingleRoIExtractor. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/roi_extractors/single_level.py#L10 for details.
            roi_layer=dict(  # Config of RoI Layer
                type='RoIAlign',  # Type of RoI Layer, DeformRoIPoolingPack and ModulatedDeformRoIPoolingPack are also supported. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/ops/roi_align/roi_align.py#L79 for details.
                output_size=7,  # The output size of feature maps.
                sampling_ratio=0),  # Sampling ratio when extracting the RoI features. 0 means adaptive ratio.
            out_channels=256,  # Output channels of the extracted feature.
            featmap_strides=[4, 8, 16, 32]),  # Strides of multi-scale feature maps. It should be consistent with the architecture of the backbone.
        bbox_head=dict(  # Config of box head in the RoIHead.
            type='Shared2FCBBoxHead',  # Type of the bbox head, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/bbox_heads/convfc_bbox_head.py#L177 for implementation details.
            in_channels=256,  # Input channels for bbox head. This is consistent with the out_channels in roi_extractor
            fc_out_channels=1024,  # Output feature channels of FC layers.
            roi_feat_size=7,  # Size of RoI features
            num_classes=80,  # Number of classes for classification
            bbox_coder=dict(  # Box coder used in the second stage.
                type='DeltaXYWHBBoxCoder',  # Type of box coder. 'DeltaXYWHBBoxCoder' is applied for most methods.
                target_means=[0.0, 0.0, 0.0, 0.0],  # Means used to encode and decode boxes
                target_stds=[0.1, 0.1, 0.2, 0.2]),  # Standard deviations for encoding and decoding. They are smaller since the boxes are more accurate. [0.1, 0.1, 0.2, 0.2] is a conventional setting.
            reg_class_agnostic=False,  # Whether the regression is class agnostic.
            loss_cls=dict(  # Config of loss function for the classification branch
                type='CrossEntropyLoss',  # Type of loss for classification branch, we also support FocalLoss etc.
                use_sigmoid=False,  # Whether to use sigmoid.
                loss_weight=1.0),  # Loss weight of the classification branch.
            loss_bbox=dict(  # Config of loss function for the regression branch.
                type='L1Loss',  # Type of loss, we also support many IoU losses and smooth L1 loss, etc.
                loss_weight=1.0)),  # Loss weight of the regression branch.
        mask_roi_extractor=dict(  # RoI feature extractor for mask generation.
            type='SingleRoIExtractor',  # Type of the RoI feature extractor, most methods use SingleRoIExtractor.
            roi_layer=dict(  # Config of RoI Layer that extracts features for instance segmentation
                type='RoIAlign',  # Type of RoI Layer, DeformRoIPoolingPack and ModulatedDeformRoIPoolingPack are also supported
                output_size=14,  # The output size of feature maps.
                sampling_ratio=0),  # Sampling ratio when extracting the RoI features.
            out_channels=256,  # Output channels of the extracted feature.
            featmap_strides=[4, 8, 16, 32]),  # Strides of multi-scale feature maps.
        mask_head=dict(  # Mask prediction head
            type='FCNMaskHead',  # Type of mask head, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/mask_heads/fcn_mask_head.py#L21 for implementation details.
            num_convs=4,  # Number of convolutional layers in mask head.
            in_channels=256,  # Input channels, should be consistent with the output channels of mask roi extractor.
            conv_out_channels=256,  # Output channels of the convolutional layer.
            num_classes=80,  # Number of classes to be segmented.
            loss_mask=dict(  # Config of loss function for the mask branch.
                type='CrossEntropyLoss',  # Type of loss used for segmentation
                use_mask=True,  # Whether to only train the mask in the correct class.
                loss_weight=1.0))),  # Loss weight of mask branch.
    train_cfg=dict(  # Config of training hyperparameters for rpn and rcnn, specified inside the model config
        rpn=dict(  # Training config of rpn
            assigner=dict(  # Config of assigner
                type='MaxIoUAssigner',  # Type of assigner, MaxIoUAssigner is used for many common detectors. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/assigners/max_iou_assigner.py#L10 for more details.
                pos_iou_thr=0.7,  # IoU >= threshold 0.7 will be taken as positive samples
                neg_iou_thr=0.3,  # IoU < threshold 0.3 will be taken as negative samples
                min_pos_iou=0.3,  # The minimal IoU threshold to take boxes as positive samples
                match_low_quality=True,  # Whether to match the boxes under low quality (see API doc for more details).
                ignore_iof_thr=-1),  # IoF threshold for ignoring bboxes
            sampler=dict(  # Config of positive/negative sampler
                type='RandomSampler',  # Type of sampler, PseudoSampler and other samplers are also supported. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/samplers/random_sampler.py#L8 for implementation details.
                num=256,  # Number of samples
                pos_fraction=0.5,  # The ratio of positive samples in the total samples.
                neg_pos_ub=-1,  # The upper bound of negative samples based on the number of positive samples.
                add_gt_as_proposals=False),  # Whether to add GT as proposals after sampling.
            allowed_border=-1,  # The border allowed after padding for valid anchors.
            pos_weight=-1,  # The weight of positive samples during training.
            debug=False),  # Whether to set the debug mode
        rpn_proposal=dict(  # The config to generate proposals during training
            nms_across_levels=False,  # Whether to do NMS for boxes across levels. Only works in `GARPNHead`; the naive RPN does not support NMS across levels.
            nms_pre=2000,  # The number of boxes before NMS
            nms_post=1000,  # The number of boxes to be kept by NMS. Only works in `GARPNHead`.
            max_per_img=1000,  # The number of boxes to be kept after NMS.
            nms=dict(  # Config of NMS
                type='nms',  # Type of NMS
                iou_threshold=0.7),  # NMS threshold
            min_bbox_size=0),  # The allowed minimal box size
        rcnn=dict(  # The config for the roi heads.
            assigner=dict(  # Config of assigner for the second stage, this is different from that in rpn
                type='MaxIoUAssigner',  # Type of assigner, MaxIoUAssigner is used for all roi_heads for now. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/assigners/max_iou_assigner.py#L10 for more details.
                pos_iou_thr=0.5,  # IoU >= threshold 0.5 will be taken as positive samples
                neg_iou_thr=0.5,  # IoU < threshold 0.5 will be taken as negative samples
                min_pos_iou=0.5,  # The minimal IoU threshold to take boxes as positive samples
                match_low_quality=False,  # Whether to match the boxes under low quality (see API doc for more details).
                ignore_iof_thr=-1),  # IoF threshold for ignoring bboxes
            sampler=dict(
                type='RandomSampler',  # Type of sampler, PseudoSampler and other samplers are also supported. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/samplers/random_sampler.py#L8 for implementation details.
                num=512,  # Number of samples
                pos_fraction=0.25,  # The ratio of positive samples in the total samples.
                neg_pos_ub=-1,  # The upper bound of negative samples based on the number of positive samples.
                add_gt_as_proposals=True),  # Whether to add GT as proposals after sampling.
            mask_size=28,  # Size of mask
            pos_weight=-1,  # The weight of positive samples during training.
            debug=False)),  # Whether to set the debug mode
    test_cfg=dict(  # Config of testing hyperparameters for rpn and rcnn, specified inside the model config
        rpn=dict(  # The config to generate proposals during testing
            nms_across_levels=False,  # Whether to do NMS for boxes across levels. Only works in `GARPNHead`; the naive RPN does not support NMS across levels.
            nms_pre=1000,  # The number of boxes before NMS
            nms_post=1000,  # The number of boxes to be kept by NMS. Only works in `GARPNHead`.
            max_per_img=1000,  # The number of boxes to be kept after NMS.
            nms=dict(  # Config of NMS
                type='nms',  # Type of NMS
                iou_threshold=0.7),  # NMS threshold
            min_bbox_size=0),  # The allowed minimal box size
        rcnn=dict(  # The config for the roi heads.
            score_thr=0.05,  # Threshold to filter out boxes
            nms=dict(  # Config of NMS in the second stage
                type='nms',  # Type of NMS
                iou_thr=0.5),  # NMS threshold
            max_per_img=100,  # Max number of detections of each image
            mask_thr_binary=0.5)))  # Threshold of mask prediction
dataset_type = 'CocoDataset'  # Dataset type, this will be used to define the dataset
data_root = 'data/coco/'  # Root path of data
img_norm_cfg = dict(  # Image normalization config to normalize the input images
    mean=[123.675, 116.28, 103.53],  # Mean values used to pre-train the backbone models
    std=[58.395, 57.12, 57.375],  # Standard deviations used to pre-train the backbone models
    to_rgb=True)  # The channel order of the images used to pre-train the backbone models
train_pipeline = [  # Training pipeline
    dict(type='LoadImageFromFile'),  # First pipeline to load images from file path
    dict(
        type='LoadAnnotations',  # Second pipeline to load annotations for current image
        with_bbox=True,  # Whether to use bounding box, True for detection
        with_mask=True,  # Whether to use instance mask, True for instance segmentation
        poly2mask=False),  # Whether to convert the polygon mask to instance mask, set False for acceleration and to save memory
    dict(
        type='Resize',  # Augmentation pipeline that resizes the images and their annotations
        img_scale=(1333, 800),  # The largest scale of image
        keep_ratio=True),  # Whether to keep the ratio between height and width.
    dict(
        type='RandomFlip',  # Augmentation pipeline that flips the images and their annotations
        flip_ratio=0.5),  # The ratio or probability to flip
    dict(
        type='Normalize',  # Augmentation pipeline that normalizes the input images
        mean=[123.675, 116.28, 103.53],  # These keys are the same as in img_norm_cfg since the
        std=[58.395, 57.12, 57.375],  # keys of img_norm_cfg are used here as arguments
        to_rgb=True),
    dict(
        type='Pad',  # Padding config
        size_divisor=32),  # The number the padded image sizes should be divisible by
    dict(type='DefaultFormatBundle'),  # Default format bundle to gather data in the pipeline
    dict(
        type='Collect',  # Pipeline that decides which keys in the data should be passed to the detector
        keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),  # First pipeline to load images from file path
    dict(
        type='MultiScaleFlipAug',  # An encapsulation that wraps the testing augmentations
        img_scale=(1333, 800),  # Decides the largest scale for testing, used for the Resize pipeline
        flip=False,  # Whether to flip images during testing
        transforms=[
            dict(type='Resize',  # Use resize augmentation
                 keep_ratio=True),  # Whether to keep the ratio between height and width; any img_scale set here is overridden by the img_scale set above.
            dict(type='RandomFlip'),  # Though RandomFlip is added in the pipeline, it is not used because flip=False
            dict(
                type='Normalize',  # Normalization config, the values are from img_norm_cfg
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(
                type='Pad',  # Padding config to pad images to be divisible by 32.
                size_divisor=32),
            dict(
                type='ImageToTensor',  # Convert image to tensor
                keys=['img']),
            dict(
                type='Collect',  # Collect pipeline that collects the necessary keys for testing.
                keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,  # Batch size of a single GPU
    workers_per_gpu=2,  # Workers to pre-fetch data for each single GPU
    train=dict(  # Train dataset config
        type='CocoDataset',  # Type of dataset, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/datasets/coco.py#L19 for details.
        ann_file='data/coco/annotations/instances_train2017.json',  # Path of annotation file
        img_prefix='data/coco/train2017/',  # Prefix of image path
        pipeline=[  # Pipeline, this is passed by the train_pipeline created before.
            dict(type='LoadImageFromFile'),
            dict(
                type='LoadAnnotations',
                with_bbox=True,
                with_mask=True,
                poly2mask=False),
            dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
            dict(type='RandomFlip', flip_ratio=0.5),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(
                type='Collect',
                keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
        ]),
    val=dict(  # Validation dataset config
        type='CocoDataset',
        ann_file='data/coco/annotations/instances_val2017.json',
        img_prefix='data/coco/val2017/',
        pipeline=[  # Pipeline is passed by test_pipeline created before
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(  # Test dataset config, modify the ann_file for test-dev/test submission
        type='CocoDataset',
        ann_file='data/coco/annotations/instances_val2017.json',
        img_prefix='data/coco/val2017/',
        pipeline=[  # Pipeline is passed by test_pipeline created before
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        samples_per_gpu=2  # Batch size of a single GPU used in testing
    ))
evaluation = dict(  # The config to build the evaluation hook, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/evaluation/eval_hooks.py#L7 for more details.
    interval=1,  # Evaluation interval
    metric=['bbox', 'segm'])  # Metrics used during evaluation
optimizer = dict(  # Config used to build the optimizer, supports all the optimizers in PyTorch, whose arguments are also the same as those in PyTorch
    type='SGD',  # Type of optimizer, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/optimizer/default_constructor.py#L13 for more details
    lr=0.02,  # Learning rate of the optimizer, see the detailed usage of the parameters in the PyTorch documentation
    momentum=0.9,  # Momentum
    weight_decay=0.0001)  # Weight decay of SGD
optimizer_config = dict(  # Config used to build the optimizer hook, refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/optimizer.py#L8 for implementation details.
    grad_clip=None)  # Most of the methods do not use gradient clipping
lr_config = dict(  # Learning rate scheduler config used to register the LrUpdater hook
    policy='step',  # The policy of the scheduler, CosineAnnealing, Cyclic, etc. are also supported. Refer to details of supported LrUpdater from https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py#L9.
    warmup='linear',  # The warmup policy, `exp` and `constant` are also supported.
    warmup_iters=500,  # The number of iterations for warmup
    warmup_ratio=0.001,  # The ratio of the starting learning rate used for warmup
    step=[8, 11])  # Steps to decay the learning rate
runner = dict(
    type='EpochBasedRunner',  # Type of runner to use (i.e. IterBasedRunner or EpochBasedRunner)
    max_epochs=12)  # The runner runs the workflow for max_epochs in total. For IterBasedRunner use `max_iters`
checkpoint_config = dict(  # Config to set the checkpoint hook, refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/checkpoint.py for implementation.
    interval=1)  # The save interval is 1
log_config = dict(  # Config to register the logger hook
    interval=50,  # Interval to print the log
    hooks=[
        # dict(type='TensorboardLoggerHook')  # The Tensorboard logger is also supported
        dict(type='TextLoggerHook')
    ])  # The logger used to record the training process.
dist_params = dict(backend='nccl')  # Parameters to set up distributed training, the port can also be set.
log_level = 'INFO'  # The level of logging.
load_from = None  # Load a model as a pre-trained model from a given path. This will not resume training.
resume_from = None  # Resume from a checkpoint at a given path, the training will be resumed from the epoch at which the checkpoint was saved.
workflow = [('train', 1)]  # Workflow for the runner. [('train', 1)] means there is only one workflow and the workflow named 'train' is executed once. The workflow trains the model for 12 epochs according to max_epochs.
work_dir = 'work_dir'  # Directory to save the model checkpoints and logs for the current experiments.
```
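
To illustrate how the `lr_config` values above interact, the following is a rough sketch of a step schedule with linear warmup. It is an approximation written for this tutorial, not the code of the LrUpdater hook: the function name `learning_rate` is hypothetical, `epoch` is assumed to be a 0-based epoch index, and the warmup formula is assumed to ramp linearly from `warmup_ratio * lr` to `lr` over `warmup_iters` iterations.

```python
def learning_rate(base_lr, epoch, cur_iter, steps=(8, 11), gamma=0.1,
                  warmup_iters=500, warmup_ratio=0.001):
    """Approximate lr for a 0-based epoch index and a global iteration count."""
    # Step policy: multiply by gamma once for every milestone already reached.
    num_decays = sum(epoch >= s for s in steps)
    lr = base_lr * gamma ** num_decays
    # Linear warmup: ramp from warmup_ratio * lr up to lr over warmup_iters.
    if cur_iter < warmup_iters:
        k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
        lr = lr * (1 - k)
    return lr


# Values taken from the config above: lr=0.02, step=[8, 11].
for epoch in (0, 8, 11):
    print(epoch, f'{learning_rate(0.02, epoch, cur_iter=10_000):.6f}')
# 0 0.020000
# 8 0.002000
# 11 0.000200
print(f'{learning_rate(0.02, 0, cur_iter=0):.6f}')  # 0.000020
```
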

## FAQ

### Ignore some fields in the base configs

Sometimes, you may set `_delete_=True` to ignore some of the fields in base configs.
You may refer to [mmcv](https://mmcv.readthedocs.io/en/latest/understand_mmcv/config.html#inherit-from-base-config-with-ignored-fields) for a simple illustration.

In MMDetection, for example, suppose we want to change the backbone of the Mask R-CNN defined by the following config.

```python
model = dict(
    type='MaskRCNN',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    neck=dict(...),
    rpn_head=dict(...),
    roi_head=dict(...))
```

`ResNet` and `HRNet` are constructed with different keyword arguments, so the `backbone` fields of the base config have to be discarded rather than merged.

```python
_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
model = dict(
    pretrained='open-mmlab://msra/hrnetv2_w32',
    backbone=dict(
        _delete_=True,
        type='HRNet',
        extra=dict(
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(4, ),
                num_channels=(64, )),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='BASIC',
                num_blocks=(4, 4),
                num_channels=(32, 64)),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='BASIC',
                num_blocks=(4, 4, 4),
                num_channels=(32, 64, 128)),
            stage4=dict(
                num_modules=3,
                num_branches=4,
                block='BASIC',
                num_blocks=(4, 4, 4, 4),
                num_channels=(32, 64, 128, 256)))),
    neck=dict(...))
```

Setting `_delete_=True` replaces all old keys in the `backbone` field with the new keys.
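
The difference between merging and replacing can be sketched with plain dicts. The function `merge_config` below is a hypothetical, simplified stand-in written for illustration; it is not the merging code used by mmcv, and the configs are trimmed to a few keys.

```python
import copy


def merge_config(base, child):
    """Recursively merge child into a copy of base, honouring `_delete_`."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict):
            value = dict(value)  # shallow copy so the caller's dict is untouched
            delete = value.pop('_delete_', False)
            if not delete and isinstance(merged.get(key), dict):
                # Default behaviour: keep the base keys and override them one by one.
                merged[key] = merge_config(merged[key], value)
            else:
                # `_delete_=True` (or no base dict): start from an empty dict.
                merged[key] = merge_config({}, value)
        else:
            merged[key] = value
    return merged


base = dict(model=dict(
    type='MaskRCNN',
    backbone=dict(type='ResNet', depth=50, style='pytorch')))
child = dict(model=dict(
    backbone=dict(_delete_=True, type='HRNet')))

print(merge_config(base, child)['model']['backbone'])
# {'type': 'HRNet'}

child_without_delete = dict(model=dict(backbone=dict(type='HRNet')))
print(merge_config(base, child_without_delete)['model']['backbone'])
# {'type': 'HRNet', 'depth': 50, 'style': 'pytorch'}
```

Without `_delete_`, the ResNet-only keys (`depth`, `style`) would leak into the HRNet backbone config, which is why the field has to be replaced as a whole.
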

### Use intermediate variables in configs

Some intermediate variables are used in the config files, like `train_pipeline`/`test_pipeline` in datasets.
It's worth noting that when modifying intermediate variables in the child configs, users need to pass the intermediate variables into the corresponding fields again.
For example, suppose we would like to use a multi-scale strategy to train a Mask R-CNN. `train_pipeline`/`test_pipeline` are the intermediate variables we would like to modify.

```python
_base_ = './mask_rcnn_r50_fpn_1x_coco.py'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(
        type='Resize',
        img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
                   (1333, 768), (1333, 800)],
        multiscale_mode="value",
        keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    train=dict(pipeline=train_pipeline),
    val=dict(pipeline=test_pipeline),
    test=dict(pipeline=test_pipeline))
```

We first define the new `train_pipeline`/`test_pipeline` and then pass them into `data`.

Similarly, if we would like to switch from `SyncBN` to `BN` or `MMSyncBN`, we need to substitute every `norm_cfg` in the config.

```python
_base_ = './mask_rcnn_r50_fpn_1x_coco.py'
norm_cfg = dict(type='BN', requires_grad=True)
model = dict(
    backbone=dict(norm_cfg=norm_cfg),
    neck=dict(norm_cfg=norm_cfg),
    ...)
```
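
Why the intermediate variables have to be passed again can be illustrated with a plain-Python analogy. This is only a sketch of the underlying idea (a field keeps the value it was built from); it does not reproduce how config files are actually loaded and merged, and the pipelines are trimmed to a single hypothetical `Resize` step.

```python
# "Base config": the pipeline variable is expanded into `data` when this runs.
train_pipeline = [dict(type='Resize', img_scale=(1333, 800))]
data = dict(train=dict(pipeline=train_pipeline))

# "Child config": rebinding the name creates a new list; `data` is untouched.
train_pipeline = [dict(type='Resize', img_scale=[(1333, 640), (1333, 800)])]
print(data['train']['pipeline'][0]['img_scale'])  # (1333, 800)

# Passing the new variable into the field again is what applies the change.
data['train']['pipeline'] = train_pipeline
print(data['train']['pipeline'][0]['img_scale'])  # [(1333, 640), (1333, 800)]
```
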
