.. _model:

AutoGL Model
============

In AutoGL, we use ``model`` and ``automodel`` to define the logic of graph neural networks and make them compatible with hyper-parameter optimization. Currently, we support the following models for the given tasks.
+----------------------+----------------------------+
| Tasks                | Models                     |
+======================+============================+
| Node Classification  | ``gcn``, ``gat``, ``sage`` |
+----------------------+----------------------------+
| Graph Classification | ``gin``, ``topk``          |
+----------------------+----------------------------+
| Link Prediction      | ``gcn``, ``gat``, ``sage`` |
+----------------------+----------------------------+

Lazy Initialization
-------------------

In the current AutoGL pipeline, some important model-related hyper-parameters cannot be set before the pipeline is run (e.g. the input dimension, which can only be calculated after feature engineering). Therefore, in ``automodel``, we use lazy initialization to initialize the core ``model``. When the ``automodel`` initialization method ``__init__()`` is called with the argument ``init`` set to ``False``, only (part of) the hyper-parameters will be set. The ``automodel`` will have its core ``model`` only after ``initialize()`` is explicitly called, which is done automatically in ``solver`` and ``from_hyper_parameter()``, after all the hyper-parameters have been set properly.
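
The pattern can be sketched independently of AutoGL. Everything below is illustrative (``LazyAutoModel`` and its attributes are made-up names, not the library's API); only the control flow mirrors the description above.

```python
# Illustrative sketch of the lazy-initialization pattern; the class and
# attribute names here are made up and are NOT AutoGL's actual API.
class LazyAutoModel:
    def __init__(self, init=False):
        # hyper-parameters known up front get defaults
        self.layer_num = 2
        # dataset-dependent hyper-parameters are unknown until the data is seen
        self.num_features = None
        self.model = None
        self.initialized = False
        if init:
            self.initialize()

    def initialize(self):
        # only now do we have everything needed to build the core model
        assert self.num_features is not None, 'set dataset-dependent hyper-parameters first'
        # a tuple stands in for constructing a real torch.nn.Module
        self.model = ('mlp', self.num_features, self.layer_num)
        self.initialized = True

auto = LazyAutoModel(init=False)  # no core model yet
auto.num_features = 16            # filled in after feature engineering
auto.initialize()                 # now the core model exists
```

In AutoGL itself, the call to ``initialize()`` is issued by the ``solver`` or by ``from_hyper_parameter()`` once the dataset-dependent values are known.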

Define your own model and automodel
-----------------------------------

We highly recommend defining both ``model`` and ``automodel``, although only the ``automodel`` is needed to communicate with ``solver`` and ``trainer``. The ``model`` is responsible for parameter initialization and the forward logic declaration, while the ``automodel`` is responsible for the hyper-parameter definition and organization.

General customization
^^^^^^^^^^^^^^^^^^^^^

Let's say you want to implement a simple MLP for node classification and let AutoGL find the best hyper-parameters for you. You can first define the logic, assuming all the hyper-parameters are given.

.. code-block:: python

    import torch

    # define the mlp model, which needs to inherit from torch.nn.Module
    class MyMLP(torch.nn.Module):
        # assume you already have all the hyper-parameters
        def __init__(self, in_channels, num_classes, layer_num, dim):
            super().__init__()
            if layer_num == 1:
                ops = [torch.nn.Linear(in_channels, num_classes)]
            else:
                ops = [torch.nn.Linear(in_channels, dim)]
                for i in range(layer_num - 2):
                    ops.append(torch.nn.Linear(dim, dim))
                ops.append(torch.nn.Linear(dim, num_classes))
            self.core = torch.nn.Sequential(*ops)

        # this method is required
        def forward(self, data):
            # data: torch_geometric.data.Data
            assert hasattr(data, 'x'), 'MLP only supports graph data with features'
            x = data.x
            return torch.nn.functional.log_softmax(self.core(x), dim=-1)

After you define the logic of ``model``, you can now define your ``automodel`` to manage the hyper-parameters.

.. code-block:: python

    from autogl.module.model import BaseModel

    # define your automodel, which needs to inherit from BaseModel
    class MyAutoMLP(BaseModel):
        def __init__(self):
            # (required) make sure you call __init__ of super with the init argument properly set.
            # if you do not want to initialize inside __init__, please pass False.
            super().__init__(init=False)

            # (required) define the search space
            self.space = [
                {'parameterName': 'layer_num', 'type': 'INTEGER', 'minValue': 1, 'maxValue': 5, 'scalingType': 'LINEAR'},
                {'parameterName': 'dim', 'type': 'INTEGER', 'minValue': 64, 'maxValue': 128, 'scalingType': 'LINEAR'}
            ]

            # set default hyper-parameters
            self.layer_num = 2
            self.dim = 72

            # hyper-parameters that depend on the dataset can simply be set to None
            self.num_classes = None
            self.num_features = None

            # (required) since we do not know num_classes and num_features until we see the dataset,
            # we cannot initialize the model at instantiation time, so initialized is set to False.
            self.initialized = False

            # (required) set the device of the current auto model
            self.device = torch.device('cuda')

        # (required) get the current hyper-parameters of this automodel
        # needs to return a dictionary whose keys match those in self.space
        def get_hyper_parameter(self):
            return {
                'layer_num': self.layer_num,
                'dim': self.dim
            }

        # (required) override to interact with num_classes
        def get_num_classes(self):
            return self.num_classes

        # (required) override to interact with num_classes
        def set_num_classes(self, n_classes):
            self.num_classes = n_classes

        # (required) override to interact with num_features
        def get_num_features(self):
            return self.num_features

        # (required) override to interact with num_features
        def set_num_features(self, n_features):
            self.num_features = n_features

        # (required) instantiate the core MLP model using the corresponding hyper-parameters
        def initialize(self):
            # (required) you need to make sure the core model is named `self.model`
            self.model = MyMLP(
                in_channels=self.num_features,
                num_classes=self.num_classes,
                layer_num=self.layer_num,
                dim=self.dim
            ).to(self.device)
            self.initialized = True

        # (required) override to create a copy of the model using the provided hyper-parameters
        def from_hyper_parameter(self, hp):
            # hp is a dictionary whose keys and values correspond to your self.space
            # in this case, it will be of the form {'layer_num': XX, 'dim': XX}

            # create a new instance
            ret = self.__class__()

            # set the hyper-parameters related to the dataset and device
            ret.num_classes = self.num_classes
            ret.num_features = self.num_features
            ret.device = self.device

            # set the hyper-parameters according to hp
            ret.layer_num = hp['layer_num']
            ret.dim = hp['dim']

            # initialize it before returning
            ret.initialize()
            return ret

Then, you can use this node classification model as part of the ``AutoNodeClassifier`` ``solver``.

.. code-block:: python

    from autogl.solver import AutoNodeClassifier

    solver = AutoNodeClassifier(graph_models=(MyAutoMLP(),))

The model for graph classification is generally the same, except that you can now also receive ``num_graph_features`` (the dimension of the graph-level features) by overriding ``set_num_graph_features(self, n_graph_features)`` of ``BaseModel``. Also, please remember to return graph-level logits instead of node-level ones in the ``forward()`` method of your ``model``.
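
As a sketch of that adaptation, the following hypothetical model mean-pools node embeddings into graph-level logits using a ``batch`` assignment vector (as in ``torch_geometric`` batching); ``MyMLPForGraph`` and the ``gf`` attribute for graph-level features are illustrative names, not AutoGL's API.

```python
import torch

# Hypothetical graph-classification counterpart of MyMLP (illustrative only)
class MyMLPForGraph(torch.nn.Module):
    def __init__(self, in_channels, num_classes, dim, num_graph_features=0):
        super().__init__()
        self.node_mlp = torch.nn.Linear(in_channels, dim)
        # graph-level features (if any) are concatenated after pooling
        self.classifier = torch.nn.Linear(dim + num_graph_features, num_classes)

    def forward(self, data):
        x = torch.relu(self.node_mlp(data.x))
        # mean-pool node embeddings per graph using the batch assignment vector
        num_graphs = int(data.batch.max()) + 1
        pooled = torch.zeros(num_graphs, x.size(1)).index_add_(0, data.batch, x)
        counts = torch.bincount(data.batch, minlength=num_graphs).clamp(min=1).unsqueeze(1)
        pooled = pooled / counts
        # append graph-level features if the dataset provides them ('gf' is a made-up name)
        if getattr(data, 'gf', None) is not None:
            pooled = torch.cat([pooled, data.gf], dim=1)
        # graph-level logits: one row per graph, not per node
        return torch.nn.functional.log_softmax(self.classifier(pooled), dim=1)
```

The key difference from the node-level ``MyMLP`` is the pooling step, which reduces the node dimension away before classification.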

Model for link prediction
^^^^^^^^^^^^^^^^^^^^^^^^^

For link prediction, the model definition is a bit different from the common forward definition. You need to implement ``lp_encode(self, data)`` and ``lp_decode(self, x, pos_edge_index, neg_edge_index)`` to interact with ``LinkPredictionTrainer`` and ``AutoLinkPredictor``. Taking the class ``MyMLP`` defined above as an example, if you want to perform link prediction:

.. code-block:: python

    class MyMLPForLP(torch.nn.Module):
        # num_classes is removed since it is not meaningful for link prediction
        def __init__(self, in_channels, layer_num, dim):
            super().__init__()
            ops = [torch.nn.Linear(in_channels, dim)]
            for i in range(layer_num - 1):
                ops.append(torch.nn.Linear(dim, dim))
            self.core = torch.nn.Sequential(*ops)

        # (required) for interaction with the link prediction trainer and solver
        def lp_encode(self, data):
            return self.core(data.x)

        # (required) for interaction with the link prediction trainer and solver
        def lp_decode(self, x, pos_edge_index, neg_edge_index):
            # first, gather all the edge_index that need to be calculated
            edge_index = torch.cat([pos_edge_index, neg_edge_index], dim=-1)
            # then, use dot products to calculate the logits; you can use whatever decoding method you want
            logits = (x[edge_index[0]] * x[edge_index[1]]).sum(dim=-1)
            return logits

    class MyAutoMLPForLP(MyAutoMLP):
        def initialize(self):
            # init MyMLPForLP instead of MyMLP
            self.model = MyMLPForLP(
                in_channels=self.num_features,
                layer_num=self.layer_num,
                dim=self.dim
            ).to(self.device)
            self.initialized = True
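
The dot-product decoder inside ``lp_decode`` can be exercised on its own with toy tensors (the embedding values below are made up for illustration):

```python
import torch

# toy embeddings for 4 nodes, dimension 2
x = torch.tensor([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
pos_edge_index = torch.tensor([[0], [1]])  # candidate edge 0-1 (aligned embeddings)
neg_edge_index = torch.tensor([[0], [2]])  # candidate edge 0-2 (orthogonal embeddings)

# the same computation as in lp_decode above
edge_index = torch.cat([pos_edge_index, neg_edge_index], dim=-1)
logits = (x[edge_index[0]] * x[edge_index[1]]).sum(dim=-1)
# logits is tensor([1., 0.]): the positive pair scores higher than the negative pair
```

A trainer would typically score such logits against binary edge labels (positive edges vs. sampled negative edges).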

Model with sampling support
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Toward efficient representation learning on large-scale graphs, AutoGL currently supports node classification using sampling techniques, including node-wise sampling, layer-wise sampling, and graph-wise sampling. See :ref:`trainer` for more about sampling.

In order to conduct node classification using sampling techniques with your custom model, further adaptation and modification are generally required.
According to the message-passing mechanism of Graph Neural Networks (GNNs), numerous nodes in the multi-hop neighborhood of the evaluation or test set are potentially involved when evaluating the GNN model on a large-scale graph dataset.
As the representations of those numerous nodes are likely to occupy a large amount of computational resources, the common forward process is generally infeasible for model evaluation on large-scale graphs.
An iterative, layer-wise representation learning mechanism is a practical way to evaluate a **Sequential Model**,
which consists only of multiple sequential layers, where each layer takes a ``Data`` aggregate as input. The input ``Data`` has the same functionality as ``torch_geometric.data.Data``, which conventionally provides the properties ``x``, ``edge_index``, and an optional ``edge_weight``.

If your custom model is composed of stacked layers, it is recommended to make it inherit from ``ClassificationSupportedSequentialModel`` in order to utilize the layer-wise representation learning mechanism for your custom sequential model.

.. code-block:: python

    import autogl
    from autogl.module.model.base import ClassificationSupportedSequentialModel

    # override Linear so that it can take graph data as input
    class Linear(torch.nn.Linear):
        def forward(self, data):
            return super().forward(data.x)

    class MyMLPSampling(ClassificationSupportedSequentialModel):
        def __init__(self, in_channels, num_classes, layer_num, dim):
            super().__init__()
            if layer_num == 1:
                ops = [Linear(in_channels, num_classes)]
            else:
                ops = [Linear(in_channels, dim)]
                for i in range(layer_num - 2):
                    ops.append(Linear(dim, dim))
                ops.append(Linear(dim, num_classes))
            self.core = torch.nn.ModuleList(ops)

        # (required) override the sequential_encoding_layers property to interact with sampling
        @property
        def sequential_encoding_layers(self) -> torch.nn.ModuleList:
            return self.core

        # (required) define the encoding logic of classification for sampling
        def cls_encode(self, data):
            # if you use sampling, the data will be passed in one of two possible ways;
            # you can distinguish them using the following rules
            if hasattr(data, 'edge_indexes'):
                # edge_indexes is a list of edge_index, one for each layer
                edge_indexes = data.edge_indexes
                edge_weights = [None] * len(self.core) if getattr(data, 'edge_weights', None) is None else data.edge_weights
            else:
                # the edge_index and edge_weight stay the same as the default
                edge_indexes = [data.edge_index] * len(self.core)
                edge_weights = [getattr(data, 'edge_weight', None)] * len(self.core)

            x = data.x
            for i in range(len(self.core)):
                data = autogl.data.Data(x=x, edge_index=edge_indexes[i])
                data.edge_weight = edge_weights[i]
                x = self.sequential_encoding_layers[i](data)
            return x

        # (required) define the decoding logic of classification for sampling
        def cls_decode(self, x):
            return torch.nn.functional.log_softmax(x, dim=-1)
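
The per-layer re-wrapping inside ``cls_encode`` can be illustrated without AutoGL installed; in this sketch, ``SimpleNamespace`` stands in for ``autogl.data.Data``, and the empty ``edge_index`` tensors are placeholders for sampled per-layer subgraphs:

```python
import torch
from types import SimpleNamespace  # stands in for autogl.data.Data in this sketch

class Linear(torch.nn.Linear):
    # same wrapper as above: accept a Data-like object instead of a raw tensor
    def forward(self, data):
        return super().forward(data.x)

layers = torch.nn.ModuleList([Linear(4, 8), Linear(8, 3)])
# one (sub)graph per layer, as layer-wise sampling would produce
edge_indexes = [torch.empty(2, 0, dtype=torch.long) for _ in layers]

x = torch.randn(5, 4)
for layer, edge_index in zip(layers, edge_indexes):
    data = SimpleNamespace(x=x, edge_index=edge_index)  # re-wrap each layer's input
    x = layer(data)
# x now holds the final representations with shape (5, 3)
```

Each layer thus consumes a fresh ``Data`` built from the previous layer's output and that layer's own (sampled) edges, which is exactly what makes layer-wise evaluation feasible on large graphs.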