# Tutorial 10: Weight initialization

During training, a proper initialization strategy can speed up training or lead to higher performance. [MMCV](https://github.com/open-mmlab/mmcv/blob/master/mmcv/cnn/utils/weight_init.py) provides some commonly used methods for initializing modules such as `nn.Conv2d`. Model initialization in MMDetection mainly uses `init_cfg`. Users can initialize models in the following two steps:

1. Define `init_cfg` for a model or its components in `model_cfg`. The `init_cfg` of a child component has higher priority and overrides the `init_cfg` of its parent modules.
2. Build the model as usual, then call `model.init_weights()` explicitly; the model parameters will be initialized as configured.

The high-level workflow of initialization in MMDetection is:

model_cfg(init_cfg) -> build_from_cfg -> model -> init_weights() -> initialize(self, self.init_cfg) -> children's init_weights()
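
The call order in this workflow can be pictured with a toy model in plain Python. This is only a sketch: `ToyModule` and its `applied` attribute are hypothetical stand-ins rather than MMCV classes, and the stand-in for `initialize` simply marks every descendant instead of selecting layers by type. It illustrates why the `init_cfg` of a child ends up overriding that of its parent: the child's `init_weights()` runs later.

```python
class ToyModule:
    """Hypothetical stand-in for a module that carries an init_cfg."""

    def __init__(self, name, init_cfg=None, children=()):
        self.name = name
        self.init_cfg = init_cfg
        self.children = list(children)
        self.applied = None  # records the config that last touched this module

    def _apply(self, cfg):
        # Stand-in for initialize(self, self.init_cfg): mark this module
        # and every descendant as initialized with `cfg`.
        self.applied = cfg
        for child in self.children:
            child._apply(cfg)

    def init_weights(self):
        # 1. Apply this module's own init_cfg to its whole subtree.
        if self.init_cfg is not None:
            self._apply(self.init_cfg)
        # 2. Recurse, so a child's init_cfg runs later and wins.
        for child in self.children:
            child.init_weights()


head = ToyModule('head', init_cfg=dict(type='Normal', std=0.01))
neck = ToyModule('neck')  # no init_cfg of its own
model = ToyModule('model', init_cfg=dict(type='Constant', val=1),
                  children=[neck, head])
model.init_weights()

print(neck.applied)  # {'type': 'Constant', 'val': 1}
print(head.applied)  # {'type': 'Normal', 'std': 0.01}
```

Here `neck` inherits the parent's configuration, while `head` is re-initialized with its own.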

### Description

`init_cfg` is a dict or list[dict] and contains the following keys and values:

- `type` (str): the initializer name in `INITIALIZERS`, followed by the arguments of the initializer.
- `layer` (str or list[str]): the names of basic layers in PyTorch or MMCV with learnable parameters that will be initialized, e.g. `'Conv2d'`, `'DeformConv2d'`.
- `override` (dict or list[dict]): the sub-modules that do not inherit from `BaseModule` and whose initialization configuration differs from that of the layers listed under the `layer` key. The initializer defined in `type` applies to all layers listed in `layer`, so a sub-module that is not derived from `BaseModule` but can be initialized in the same way as the layers in `layer` does not need `override`. `override` contains:
  - `type`, followed by the arguments of the initializer;
  - `name`, indicating the sub-module to be initialized.
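
The shape described above can be summarized as a small check written in plain Python. This is a sketch under assumptions: `as_list` and `check_init_cfg` are hypothetical helpers that are not part of MMCV, and they only look at the structure of the keys, not at whether the initializer or layer names actually exist.

```python
def as_list(value):
    """Wrap a single item in a list; leave lists untouched."""
    return value if isinstance(value, list) else [value]


def check_init_cfg(init_cfg):
    """Hypothetical shape check; returns init_cfg as a list of dicts."""
    cfgs = as_list(init_cfg)
    for cfg in cfgs:
        if not isinstance(cfg, dict):
            raise TypeError('init_cfg must be a dict or a list of dicts')
        if not isinstance(cfg.get('type'), str):
            raise ValueError('each init_cfg needs a str `type`')
        if 'layer' in cfg:
            for name in as_list(cfg['layer']):
                if not isinstance(name, str):
                    raise TypeError('`layer` must be a str or a list of str')
        if 'override' in cfg:
            for item in as_list(cfg['override']):
                if not isinstance(item, dict) or 'name' not in item:
                    raise ValueError('each `override` must be a dict with `name`')
    return cfgs


cfgs = check_init_cfg(
    dict(type='Constant', layer=['Conv1d', 'Conv2d'], val=1,
         override=dict(name='reg')))
print(len(cfgs))  # 1
```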

### Initialize parameters

Inherit a new model from `mmcv.runner.BaseModule` or from a model in `mmdet.models`. Here we show an example with `FooModel`.

```python
import torch.nn as nn
from mmcv.runner import BaseModule


class FooModel(BaseModule):

    def __init__(self,
                 arg1,
                 arg2,
                 init_cfg=None):
        # BaseModule stores init_cfg for a later call to init_weights()
        super(FooModel, self).__init__(init_cfg)
        ...
```

- Initialize the model by setting `init_cfg` directly in code

```python
import torch.nn as nn
from mmcv.runner import BaseModule
# or directly inherit mmdet models


class FooModel(BaseModule):

    def __init__(self,
                 arg1,
                 arg2,
                 init_cfg=XXX):  # XXX is a placeholder for a default init_cfg
        super(FooModel, self).__init__(init_cfg)
        ...
```

- Initialize the model by passing `init_cfg` directly to `mmcv.Sequential` or `mmcv.ModuleList` in code

```python
from mmcv.runner import BaseModule, ModuleList


class FooModel(BaseModule):

    def __init__(self,
                 arg1,
                 arg2,
                 init_cfg=None):
        super(FooModel, self).__init__(init_cfg)
        ...
        # XXX is a placeholder for the init_cfg of this container
        self.conv1 = ModuleList(init_cfg=XXX)
```

- Initialize the model by setting `init_cfg` in the config file

```python
model = dict(
    ...
    model = dict(
        type='FooModel',
        arg1=XXX,
        arg2=XXX,
        init_cfg=XXX),
    ...
)
```

### Usage of init_cfg

1. Initialize model by `layer` key

If only `layer` is defined, only the layers listed under the `layer` key are initialized.

NOTE: The value of the `layer` key must be the class name of a PyTorch layer that has `weight` and `bias` attributes, so layers such as `MultiheadAttention` are not supported.

- Define the `layer` key to initialize several layer types with the same configuration.

```python
init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d', 'Linear'], val=1)
# initialize whole module with same configuration
```

- Define the `layer` key to initialize different layer types with different configurations.

```python
init_cfg = [dict(type='Constant', layer='Conv1d', val=1),
            dict(type='Constant', layer='Conv2d', val=2),
            dict(type='Constant', layer='Linear', val=3)]
# nn.Conv1d will be initialized with dict(type='Constant', val=1)
# nn.Conv2d will be initialized with dict(type='Constant', val=2)
# nn.Linear will be initialized with dict(type='Constant', val=3)
```
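
One way to read the list form of `init_cfg` is as a lookup from a layer class name to the initializer arguments that target it. The sketch below is plain Python and makes two assumptions that should be checked against MMCV itself: it matches exact class names only, and when several entries name the same layer, the later entry takes precedence. `cfg_for_layer` is a hypothetical helper and not part of MMCV.

```python
def cfg_for_layer(init_cfg, layer_name):
    """Return the initializer arguments that target `layer_name`, or None."""
    cfgs = init_cfg if isinstance(init_cfg, list) else [init_cfg]
    chosen = None
    for cfg in cfgs:
        layers = cfg.get('layer', [])
        if isinstance(layers, str):
            layers = [layers]
        if layer_name in layers:
            # Drop the selector keys, keep the initializer name and arguments.
            chosen = {k: v for k, v in cfg.items()
                      if k not in ('layer', 'override')}
    return chosen


init_cfg = [dict(type='Constant', layer='Conv1d', val=1),
            dict(type='Constant', layer='Conv2d', val=2),
            dict(type='Constant', layer='Linear', val=3)]

print(cfg_for_layer(init_cfg, 'Conv2d'))       # {'type': 'Constant', 'val': 2}
print(cfg_for_layer(init_cfg, 'BatchNorm2d'))  # None
```

A layer type that no entry lists, such as `BatchNorm2d` here, is left untouched by `init_cfg`.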

2. Initialize model by `override` key

- To initialize a specific part of the model by its attribute name, use the `override` key; the values in `override` take precedence over the values in `init_cfg`.

```python
# layers:
# self.feat = nn.Conv1d(3, 1, 3)
# self.reg = nn.Conv2d(3, 3, 3)
# self.cls = nn.Linear(1, 2)

init_cfg = dict(type='Constant',
                layer=['Conv1d', 'Conv2d'], val=1, bias=2,
                override=dict(type='Constant', name='reg', val=3, bias=4))
# self.feat will be initialized with dict(type='Constant', val=1, bias=2)
# self.cls is an nn.Linear, which is not listed in `layer`, so it keeps the PyTorch default
# The module called 'reg' will be initialized with dict(type='Constant', val=3, bias=4)
```

- If `layer` is None in `init_cfg`, only the sub-module named in `override` is initialized, and `type` and the other arguments in `override` can be omitted.

```python
# layers:
# self.feat = nn.Conv1d(3, 1, 3)
# self.reg = nn.Conv2d(3, 3, 3)
# self.cls = nn.Linear(1, 2)

init_cfg = dict(type='Constant', val=1, bias=2, override=dict(name='reg'))
# self.feat and self.cls will be initialized by PyTorch
# The module called 'reg' will be initialized with dict(type='Constant', val=1, bias=2)
```

- If neither the `layer` key nor the `override` key is defined, nothing will be initialized.

- Invalid usage

```python
# Invalid: `override` has no `name` key
init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d'], val=1, bias=2,
                override=dict(type='Constant', val=3, bias=4))

# Also invalid: `override` has `name` and other arguments but no `type`
init_cfg = dict(type='Constant', layer=['Conv1d', 'Conv2d'], val=1, bias=2,
                override=dict(name='reg', val=3, bias=4))
```
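
The rules for `override` described in this section (use its own `type` and arguments when given, fall back to the parent's when only `name` is given, and reject the two invalid forms) can be sketched in plain Python. `resolve_override` is a hypothetical helper written for illustration, not MMCV's function, and the error messages are made up.

```python
def resolve_override(init_cfg, override):
    """Return the config used for the sub-module named in `override`."""
    if 'name' not in override:
        raise ValueError('`override` must contain `name`')
    extra = {k: v for k, v in override.items() if k != 'name'}
    if not extra:
        # Only `name` given: reuse the parent's initializer and arguments.
        return {k: v for k, v in init_cfg.items()
                if k not in ('layer', 'override')}
    if 'type' not in extra:
        raise ValueError('`override` has arguments but no `type`')
    return extra


parent = dict(type='Constant', layer=['Conv1d', 'Conv2d'], val=1, bias=2)

print(resolve_override(parent, dict(type='Constant', name='reg', val=3, bias=4)))
# {'type': 'Constant', 'val': 3, 'bias': 4}
print(resolve_override(parent, dict(name='reg')))
# {'type': 'Constant', 'val': 1, 'bias': 2}
```

Passing either of the invalid `override` dicts shown above to this helper raises a `ValueError`.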

3. Initialize model with a pretrained model

```python
init_cfg = dict(type='Pretrained',
                checkpoint='torchvision://resnet50')
```

For more details, refer to the documentation in [MMCV](https://mmcv.readthedocs.io/en/latest/cnn.html#weight-initialization) and MMCV [PR #780](https://github.com/open-mmlab/mmcv/pull/780).
