# Tutorial 3: Customize Data Pipelines

## Design of Data Pipelines

Following typical conventions, we use `Dataset` and `DataLoader` for data loading
with multiple workers. `Dataset` returns a dict of data items corresponding to
the arguments of the model's forward method.
Since the data in object detection may not be the same size (image size, gt bbox size, etc.),
we introduce a new `DataContainer` type in MMCV to help collect and distribute
data of different sizes.
See [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/parallel/data_container.py) for more details.
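
A minimal sketch of how a field is wrapped, assuming the `mmcv.parallel.DataContainer` interface (`stack` and `cpu_only` control how the collate function batches each field):

```python
import torch
from mmcv.parallel import DataContainer as DC

img = torch.rand(3, 800, 1333)         # an image tensor after padding
meta = dict(ori_shape=(800, 1333, 3))  # per-sample metadata of varying content

results = dict(
    # stack=True: the collate function stacks these tensors into one batch.
    img=DC(img, stack=True),
    # cpu_only=True: this field stays on CPU and is passed through as-is.
    img_metas=DC(meta, cpu_only=True),
)
```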

The data preparation pipeline and the dataset are decoupled. Usually a dataset
defines how to process the annotations, and a data pipeline defines all the steps to prepare a data dict.
A pipeline consists of a sequence of operations. Each operation takes a dict as input and outputs a dict for the next transform.
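
As a toy illustration of this dict-in/dict-out contract (the two transforms here are hypothetical, not actual MMDetection ops), a pipeline boils down to calling such operations in order:

```python
import numpy as np

def record_shape(results):
    # Adds a new key to the result dict.
    results['img_shape'] = results['img'].shape
    return results

def horizontal_flip(results):
    # Updates an existing key in place.
    results['img'] = results['img'][:, ::-1, :]
    return results

results = dict(img=np.zeros((800, 1333, 3), dtype=np.uint8))
for transform in [record_shape, horizontal_flip]:
    results = transform(results)
# results now holds both 'img' and 'img_shape'
```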

We present a classical pipeline in the following figure. The blue blocks are pipeline operations. As the pipeline proceeds, each operator can add new keys (marked as green) to the result dict or update the existing keys (marked as orange).

![pipeline figure](../../resources/data_pipeline.png)

The operations are categorized into data loading, pre-processing, formatting and test-time augmentation.

Here is a pipeline example for Faster R-CNN.

```python
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
```
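
For reference, such a list of config dicts is usually built into a callable with `Compose`. A minimal sketch, assuming a COCO-style layout and a hypothetical image file; the empty `*_fields` lists are normally filled in by the dataset before the pipeline runs:

```python
from mmdet.datasets.pipelines import Compose

pipeline = Compose(test_pipeline)
results = dict(
    img_info=dict(filename='demo.jpg'),  # hypothetical image file
    img_prefix='data/coco/val2017/',     # hypothetical data root
    bbox_fields=[], mask_fields=[], seg_fields=[],
)
data = pipeline(results)  # dict with 'img' and 'img_metas' lists for the model
```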

For each operation, we list the related dict fields that are added/updated/removed.

### Data loading

`LoadImageFromFile`

- add: img, img_shape, ori_shape

`LoadAnnotations`

- add: gt_bboxes, gt_bboxes_ignore, gt_labels, gt_masks, gt_semantic_seg, bbox_fields, mask_fields

`LoadProposals`

- add: proposals

### Pre-processing

`Resize`

- add: scale, scale_idx, pad_shape, scale_factor, keep_ratio
- update: img, img_shape, *bbox_fields, *mask_fields, *seg_fields

`RandomFlip`

- add: flip
- update: img, *bbox_fields, *mask_fields, *seg_fields

`Pad`

- add: pad_fixed_size, pad_size_divisor
- update: img, pad_shape, *mask_fields, *seg_fields

`RandomCrop`

- update: img, pad_shape, gt_bboxes, gt_labels, gt_masks, *bbox_fields

`Normalize`

- add: img_norm_cfg
- update: img

`SegRescale`

- update: gt_semantic_seg

`PhotoMetricDistortion`

- update: img

`Expand`

- update: img, gt_bboxes

`MinIoURandomCrop`

- update: img, gt_bboxes, gt_labels

`Corrupt`

- update: img

### Formatting

`ToTensor`

- update: specified by `keys`.

`ImageToTensor`

- update: specified by `keys`.

`Transpose`

- update: specified by `keys`.

`ToDataContainer`

- update: specified by `fields`.

`DefaultFormatBundle`

- update: img, proposals, gt_bboxes, gt_bboxes_ignore, gt_labels, gt_masks, gt_semantic_seg

`Collect`

- add: img_meta (the keys of img_meta are specified by `meta_keys`)
- remove: all other keys except for those specified by `keys`
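
Putting the formatting steps together, a formatted training sample roughly looks as follows. This is an illustrative sketch (shapes and meta contents are placeholders), with fields wrapped in `DataContainer` as described above:

```python
import torch
from mmcv.parallel import DataContainer as DC

sample = dict(
    img=DC(torch.rand(3, 800, 1344), stack=True),  # stacked at collate time
    gt_bboxes=DC(torch.rand(4, 4)),                # one (x1, y1, x2, y2) row per box
    gt_labels=DC(torch.randint(0, 80, (4,))),      # one class index per box
    img_meta=DC(dict(img_shape=(800, 1333, 3)), cpu_only=True),
)
```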

### Test-time augmentation

`MultiScaleFlipAug`

## Extend and use custom pipelines

1. Write a new pipeline in a file, e.g., `my_pipeline.py`. It takes a dict as input and returns a dict.

```python
import random

from mmdet.datasets import PIPELINES


@PIPELINES.register_module()
class MyTransform:
    """Add your transform.

    Args:
        p (float): Probability of applying the transform. Default: 0.5.
    """

    def __init__(self, p=0.5):
        self.p = p

    def __call__(self, results):
        # Apply the transform with probability p.
        if random.random() < self.p:
            results['dummy'] = True
        return results
```
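
You can sanity-check the transform on its own before wiring it into a config:

```python
transform = MyTransform(p=1.0)   # p=1.0 so the branch always fires
out = transform(dict(img=None))  # any dict works; 'img' is just a placeholder
assert out['dummy'] is True
```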

2. Import and use the pipeline in your config file.
Make sure the import is relative to where your train script is located.

```python
custom_imports = dict(imports=['path.to.my_pipeline'], allow_failed_imports=False)
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='MyTransform', p=0.2),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
```
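
Optionally, you can check from a Python shell that the module was imported and the class registered; `path.to.my_pipeline` is the same placeholder path used in the config above:

```python
from mmdet.datasets import PIPELINES
import path.to.my_pipeline  # noqa: F401  (placeholder module path)

assert PIPELINES.get('MyTransform') is not None  # registered successfully
```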

3. Visualize the output of your augmentation pipeline.

`tools/misc/browse_dataset.py` lets you browse a detection dataset (both images and bounding box annotations) visually, or save the images to a designated directory. See [useful_tools](../useful_tools.md) for more details.
