| @@ -6,12 +6,18 @@ An autoML framework & toolkit for machine learning on graphs. | |||
| Feel free to open <a href="https://github.com/THUMNLab/AutoGL/issues">issues</a> or contact us at <a href="mailto:autogl@tsinghua.edu.cn">autogl@tsinghua.edu.cn</a> if you have any comments or suggestions! | |||
| [](https://github.com/psf/black) | |||
| [](https://autogl.readthedocs.io/en/latest/?badge=latest) | |||
| <!-- | |||
| [](https://github.com/psf/black) | |||
| % [](http://mn.cs.tsinghua.edu.cn/autogl/documentation/?badge=latest)--> | |||
| ## News! | |||
| - 2021.07.11 New version! v0.2.0-pre is here! In this new version, AutoGL supports [neural architecture search (NAS)](https://autogl.readthedocs.io/en/latest/docfile/tutorial/t_nas.html) to customize architectures for the given datasets and tasks. AutoGL also supports [sampling](https://autogl.readthedocs.io/en/latest/docfile/tutorial/t_trainer.html#node-classification-with-sampling) now to perform tasks on large datasets, including node-wise sampling, layer-wise sampling, and sub-graph sampling. The link prediction task is now also supported! Learn more in our [tutorial](https://autogl.readthedocs.io/en/latest/index.html). | |||
| - 2021.12.31 New Version! v0.3.0-pre is here! | |||
| - AutoGL now supports the [__Deep Graph Library (DGL)__](https://www.dgl.ai/) backend to be interface-friendly for DGL users! Homogeneous node classification, link prediction, and graph classification tasks are all currently supported under the DGL backend. AutoGL is also compatible with PyG 2.0 now. | |||
| - The __heterogeneous__ node classification tasks are now supported! See [hetero tutorial](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_hetero_node_clf.html) for more details. | |||
| - To make the library more flexible, the module `model` can now be __decoupled__ into two additional sub-modules named `encoder` and `decoder`. Under the __decoupled__ design, one `encoder` can be used to solve all kinds of tasks, reducing the burden of development and of user extension and contribution. | |||
| - We have added more supported [NAS algorithms](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_nas.html), including [AutoAttend](https://proceedings.mlr.press/v139/guan21a.html), [GASSO](https://proceedings.neurips.cc/paper/2021/hash/8c9f32e03aeb2e3000825c8c875c4edd-Abstract.html), and a [hardware-aware algorithm](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/documentation/nas.html#autogl.module.nas.estimator.OneShotEstimator_HardwareAware). | |||
| - 2021.07.11 New version! v0.2.0-pre is here! In this new version, AutoGL supports [neural architecture search (NAS)](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_nas.html) to customize architectures for the given datasets and tasks. AutoGL also supports [sampling](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_trainer.html#node-classification-with-sampling) now to perform tasks on large datasets, including node-wise sampling, layer-wise sampling, and sub-graph sampling. The link prediction task is now also supported! Learn more in our [tutorial](http://mn.cs.tsinghua.edu.cn/autogl/documentation/index.html). | |||
| - 2021.04.16 Our survey paper about automated machine learning on graphs is accepted by IJCAI! See more [here](http://arxiv.org/abs/2103.00742). | |||
| - 2021.04.10 Our paper [__AutoGL: A Library for Automated Graph Learning__](https://arxiv.org/abs/2104.04987) is accepted by _ICLR 2021 Workshop on Geometrical and Topological Representation Learning_! You can cite our paper following methods [here](#Cite). | |||
| @@ -23,7 +29,7 @@ The workflow below shows the overall framework of AutoGL. | |||
| <img src="./resources/workflow.svg"> | |||
| AutoGL uses `datasets` to maintain datasets for graph-based machine learning, which is based on Dataset in PyTorch Geometric with some functions added to support the auto solver framework. | |||
| AutoGL uses `datasets` to maintain datasets for graph-based machine learning, which is based on Dataset in PyTorch Geometric or Deep Graph Library with some functions added to support the auto solver framework. | |||
| Different graph-based machine learning tasks are handled by different `AutoGL solvers`, which make use of five main modules to automatically solve given tasks, namely `auto feature engineer`, `neural architecture search`, `auto model`, `hyperparameter optimization`, and `auto ensemble`. | |||
| @@ -42,15 +48,18 @@ Currently, the following algorithms are supported in AutoGL: | |||
| <tr valign="top"> | |||
| <!--<td><b>Generators</b><br>graphlet <br> eigen <br> pagerank <br> PYGLocalDegreeProfile <br> PYGNormalizeFeatures <br> PYGOneHotDegree <br> onehot <br> <br><b>Selectors</b><br> SeFilterConstant<br> gbdt <br> <br><b>Subgraph</b><br> NxLargeCliqueSize<br> NxAverageClusteringApproximate<br> NxDegreeAssortativityCoefficient<br> NxDegreePearsonCorrelationCoefficient<br> NxHasBridge <br>NxGraphCliqueNumber<br> NxGraphNumberOfCliques<br> NxTransitivity<br> NxAverageClustering<br> NxIsConnected<br> NxNumberConnectedComponents<br> NxIsDistanceRegular<br> NxLocalEfficiency<br> NxGlobalEfficiency<br> NxIsEulerian </td>--> | |||
| <td><b>Generators</b><br>Graphlets <br> EigenGNN <br> <a href="http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_fe.html">more ...</a><br><br><b>Selectors</b><br> SeFilterConstant<br> gbdt <br> <br><b>Graph</b><br> netlsd<br> NxAverageClustering<br> <a href="http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_fe.html">more ...</a></td> | |||
| <td><b>Node Classification</b><br> GCN <br> GAT <br> GraphSAGE <br><br><b>Graph Classification</b><br> GIN <br> TopKPool </td> | |||
| <td><b>Homo Encoders</b><br> GCNEncoder <br> GATEncoder <br> SAGEEncoder <br> GINEncoder <br> <br><b>Decoders</b><br>LogSoftmaxDecoder <br> DotProductDecoder <br> SumPoolMLPDecoder <br> JKSumPoolDecoder </td> | |||
| <td> | |||
| <b>Algorithms</b><br> | |||
| Random<br> | |||
| RL<br> | |||
| Evolution<br> | |||
| GASSO<br> | |||
| <a href='http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/documentation/nas.html'>more ...</a><br><br> | |||
| <b>Spaces</b><br> | |||
| SinglePath<br> | |||
| GraphNas<br> | |||
| AutoAttend<br> | |||
| <a href='http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/documentation/nas.html'>more ...</a><br><br> | |||
| <b>Estimators</b><br> | |||
| Oneshot<br> | |||
| @@ -76,10 +85,18 @@ Please make sure you meet the following requirements before installing AutoGL. | |||
| see <https://pytorch.org/> for installation. | |||
| 3. PyTorch Geometric (>=1.7.0) | |||
| 3. Graph Library Backend | |||
| You will need either PyTorch Geometric (PyG) or Deep Graph Library (DGL) as the backend. If you install both, you can select a backend by following [this tutorial](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_backend.html). | |||
| 3.1 PyTorch Geometric (>=1.7.0) | |||
| see <https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html> for installation. | |||
| 3.2 Deep Graph Library (>=0.7.0) | |||
| see <https://dgl.ai> for installation. | |||
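Before installing, it can help to check which of the two backends is already importable. The snippet below is a minimal sketch and not part of AutoGL: the helper name `available_backends` is hypothetical, and it only assumes the usual import names `torch_geometric` (PyG) and `dgl` (DGL). It does not check the version requirements listed above.

```python
import importlib.util


def available_backends():
    """Return the backends importable in this environment (hypothetical helper)."""
    # Map a short label to the top-level module each library installs.
    candidates = {"pyg": "torch_geometric", "dgl": "dgl"}
    return [
        label
        for label, module_name in candidates.items()
        if importlib.util.find_spec(module_name) is not None
    ]


found = available_backends()
print("Detected backends:", ", ".join(found) if found else "none")
```

If the list is empty, install at least one of the two libraries first.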
| ### Installation | |||
| #### Install from pip | |||
| @@ -151,4 +168,4 @@ You may also find our [survey paper](http://arxiv.org/abs/2103.00742) helpful: | |||
| ``` | |||
| ## License | |||
| Notice that we follow [Apache license](LICENSE) across the entire codebase from v0.2. | |||
| We follow [Apache license](LICENSE) across the entire codebase from v0.2. | |||
| @@ -16,4 +16,4 @@ from .module import ( | |||
| train, | |||
| ) | |||
| __version__ = "0.2.0-pre" | |||
| __version__ = "0.3.0-pre" | |||
| @@ -38,6 +38,8 @@ if _backend.DependentBackend.is_dgl(): | |||
| PTCMRDataset, | |||
| NCI1Dataset | |||
| ) | |||
| from ._heterogeneous_datasets import ACMHANDataset, ACMHGTDataset | |||
| elif _backend.DependentBackend.is_pyg(): | |||
| from ._pyg import ( | |||
| CoraDataset, | |||
| @@ -68,4 +70,91 @@ elif _backend.DependentBackend.is_pyg(): | |||
| ModelNet40TrainingDataset, | |||
| ModelNet40TestDataset | |||
| ) | |||
| from ._heterogeneous_datasets import * | |||
| if _backend.DependentBackend.is_pyg(): | |||
| __all__ = [ | |||
| "CoraDataset", | |||
| "CiteSeerDataset", | |||
| "PubMedDataset", | |||
| "FlickrDataset", | |||
| "RedditDataset", | |||
| "AmazonComputersDataset", | |||
| "AmazonPhotoDataset", | |||
| "CoauthorPhysicsDataset", | |||
| "CoauthorCSDataset", | |||
| "PPIDataset", | |||
| "QM9Dataset", | |||
| "MUTAGDataset", | |||
| "ENZYMESDataset", | |||
| "IMDBBinaryDataset", | |||
| "IMDBMultiDataset", | |||
| "RedditBinaryDataset", | |||
| "REDDITMulti5KDataset", | |||
| "REDDITMulti12KDataset", | |||
| "COLLABDataset", | |||
| "ProteinsDataset", | |||
| "PTCMRDataset", | |||
| "NCI1Dataset", | |||
| "NCI109Dataset", | |||
| "ModelNet10TrainingDataset", | |||
| "ModelNet10TestDataset", | |||
| "ModelNet40TrainingDataset", | |||
| "ModelNet40TestDataset", | |||
| "OGBNProductsDataset", | |||
| "OGBNProteinsDataset", | |||
| "OGBNArxivDataset", | |||
| "OGBNPapers100MDataset", | |||
| "OGBLPPADataset", | |||
| "OGBLCOLLABDataset", | |||
| "OGBLDDIDataset", | |||
| "OGBLCitation2Dataset", | |||
| "OGBGMOLHIVDataset", | |||
| "OGBGMOLPCBADataset", | |||
| "OGBGPPADataset", | |||
| "OGBGCode2Dataset", | |||
| "GTNACMDataset", | |||
| "GTNDBLPDataset", | |||
| "GTNIMDBDataset", | |||
| "BlogCatalogDataset", | |||
| "WIKIPEDIADataset" | |||
| ] | |||
| else: | |||
| __all__ = [ | |||
| "CoraDataset", | |||
| "CiteSeerDataset", | |||
| "PubMedDataset", | |||
| "RedditDataset", | |||
| "AmazonComputersDataset", | |||
| "AmazonPhotoDataset", | |||
| "CoauthorPhysicsDataset", | |||
| "CoauthorCSDataset", | |||
| "MUTAGDataset", | |||
| "ENZYMESDataset", | |||
| "IMDBBinaryDataset", | |||
| "IMDBMultiDataset", | |||
| "RedditBinaryDataset", | |||
| "REDDITMulti5KDataset", | |||
| "COLLABDataset", | |||
| "ProteinsDataset", | |||
| "PTCMRDataset", | |||
| "NCI1Dataset", | |||
| "ACMHANDataset", | |||
| "ACMHGTDataset", | |||
| "OGBNProductsDataset", | |||
| "OGBNProteinsDataset", | |||
| "OGBNArxivDataset", | |||
| "OGBNPapers100MDataset", | |||
| "OGBLPPADataset", | |||
| "OGBLCOLLABDataset", | |||
| "OGBLDDIDataset", | |||
| "OGBLCitation2Dataset", | |||
| "OGBGMOLHIVDataset", | |||
| "OGBGMOLPCBADataset", | |||
| "OGBGPPADataset", | |||
| "OGBGCode2Dataset", | |||
| "GTNACMDataset", | |||
| "GTNDBLPDataset", | |||
| "GTNIMDBDataset", | |||
| "BlogCatalogDataset", | |||
| "WIKIPEDIADataset" | |||
| ] | |||
| @@ -33,3 +33,33 @@ from ._graph import ( | |||
| from ._selectors import ( | |||
| FilterConstant, GBDTFeatureSelector | |||
| ) | |||
| __all__ = [ | |||
| "BaseFeatureEngineer", | |||
| "BaseFeature", | |||
| "FeatureEngineerUniversalRegistry", | |||
| "OneHotFeatureGenerator", | |||
| "EigenFeatureGenerator", | |||
| "GraphletGenerator", | |||
| "PageRankFeatureGenerator", | |||
| "LocalDegreeProfileGenerator", | |||
| "NormalizeFeatures", | |||
| "OneHotDegreeGenerator", | |||
| "NetLSD", | |||
| "NXLargeCliqueSize", | |||
| "NXDegreeAssortativityCoefficient", | |||
| "NXDegreePearsonCorrelationCoefficient", | |||
| "NXHasBridges", | |||
| "NXGraphCliqueNumber", | |||
| "NXGraphNumberOfCliques", | |||
| "NXTransitivity", | |||
| "NXAverageClustering", | |||
| "NXIsConnected", | |||
| "NXNumberConnectedComponents", | |||
| "NXIsDistanceRegular", | |||
| "NXLocalEfficiency", | |||
| "NXGlobalEfficiency", | |||
| "NXIsEulerian", | |||
| "FilterConstant", | |||
| "GBDTFeatureSelector" | |||
| ] | |||
| @@ -60,7 +60,7 @@ class EigenFeatureGenerator(BaseFeatureGenerator): | |||
| References | |||
| ---------- | |||
| .. [#] Ziwei Zhang, Peng Cui, Jian Pei, Xin Wang, Wenwu Zhu: | |||
| Eigen-GNN: A Graph Structure Preserving Plug-in for GNNs. CoRR abs/2006.04330 (2020) | |||
| Eigen-GNN: A Graph Structure Preserving Plug-in for GNNs. TKDE (2021) | |||
| https://arxiv.org/abs/2006.04330 | |||
| @@ -3,8 +3,33 @@ import sys | |||
| from ...backend import DependentBackend | |||
| from . import _utils | |||
| from .decoders import BaseDecoderMaintainer, DecoderUniversalRegistry | |||
| from .encoders import BaseEncoderMaintainer, AutoHomogeneousEncoderMaintainer, EncoderUniversalRegistry | |||
| from .decoders import ( | |||
| BaseDecoderMaintainer, | |||
| DecoderUniversalRegistry, | |||
| LogSoftmaxDecoderMaintainer, | |||
| DotProductLinkPredictionDecoderMaintainer | |||
| ) | |||
| from .encoders import ( | |||
| BaseEncoderMaintainer, | |||
| AutoHomogeneousEncoderMaintainer, | |||
| EncoderUniversalRegistry, | |||
| GCNEncoderMaintainer, | |||
| GATEncoderMaintainer, | |||
| GINEncoderMaintainer, | |||
| SAGEEncoderMaintainer | |||
| ) | |||
| if DependentBackend.is_dgl(): | |||
| from .decoders import ( | |||
| TopKDecoderMaintainer, | |||
| JKSumPoolDecoderMaintainer | |||
| ) | |||
| else: | |||
| from .decoders import ( | |||
| DiffPoolDecoderMaintainer, | |||
| SumPoolMLPDecoderMaintainer | |||
| ) | |||
| # load corresponding backend model of subclass | |||
| def _load_subclass_backend(backend): | |||
| @@ -14,3 +39,29 @@ def _load_subclass_backend(backend): | |||
| setattr(this, api, obj) | |||
| _load_subclass_backend(DependentBackend) | |||
| __all__.extend([ | |||
| "BaseDecoderMaintainer", | |||
| "DecoderUniversalRegistry", | |||
| "LogSoftmaxDecoderMaintainer", | |||
| "DotProductLinkPredictionDecoderMaintainer", | |||
| "BaseEncoderMaintainer", | |||
| "AutoHomogeneousEncoderMaintainer", | |||
| "EncoderUniversalRegistry", | |||
| "GCNEncoderMaintainer", | |||
| "GATEncoderMaintainer", | |||
| "GINEncoderMaintainer", | |||
| "SAGEEncoderMaintainer" | |||
| ]) | |||
| if DependentBackend.is_dgl(): | |||
| __all__.extend([ | |||
| "TopKDecoderMaintainer", | |||
| "JKSumPoolDecoderMaintainer", | |||
| ]) | |||
| else: | |||
| __all__.extend([ | |||
| "DiffPoolDecoderMaintainer", | |||
| "SumPoolMLPDecoderMaintainer" | |||
| ]) | |||
| @@ -7,7 +7,7 @@ if DependentBackend.is_pyg(): | |||
| LogSoftmaxDecoderMaintainer, | |||
| SumPoolMLPDecoderMaintainer, | |||
| DiffPoolDecoderMaintainer, | |||
| DotProductLinkPredictonDecoderMaintainer | |||
| DotProductLinkPredictionDecoderMaintainer | |||
| ) | |||
| else: | |||
| from ._dgl import ( | |||
| @@ -21,8 +21,16 @@ __all__ = [ | |||
| "BaseDecoderMaintainer", | |||
| "DecoderUniversalRegistry", | |||
| "LogSoftmaxDecoderMaintainer", | |||
| "JKSumPoolDecoderMaintainer", | |||
| "TopKDecoderMaintainer", | |||
| "DiffPoolDecoderMaintainer", | |||
| "DotProductLinkPredictonDecoderMaintainer" | |||
| "DotProductLinkPredictionDecoderMaintainer" | |||
| ] | |||
| if DependentBackend.is_pyg(): | |||
| __all__.extend([ | |||
| "DiffPoolDecoderMaintainer", | |||
| "SumPoolMLPDecoderMaintainer" | |||
| ]) | |||
| else: | |||
| __all__.extend([ | |||
| "JKSumPoolDecoderMaintainer", | |||
| "TopKDecoderMaintainer" | |||
| ]) | |||
| @@ -2,5 +2,5 @@ from ._pyg_decoders import ( | |||
| LogSoftmaxDecoderMaintainer, | |||
| SumPoolMLPDecoderMaintainer, | |||
| DiffPoolDecoderMaintainer, | |||
| DotProductLinkPredictonDecoderMaintainer | |||
| DotProductLinkPredictionDecoderMaintainer | |||
| ) | |||
| @@ -294,6 +294,6 @@ class _DotProductLinkPredictonDecoder(torch.nn.Module): | |||
| @decoder_registry.DecoderUniversalRegistry.register_decoder('dotproduct'.lower()) | |||
| @decoder_registry.DecoderUniversalRegistry.register_decoder('lp-decoder'.lower()) | |||
| @decoder_registry.DecoderUniversalRegistry.register_decoder('dot-product'.lower()) | |||
| class DotProductLinkPredictonDecoderMaintainer(base_decoder.BaseDecoderMaintainer): | |||
| class DotProductLinkPredictionDecoderMaintainer(base_decoder.BaseDecoderMaintainer): | |||
| def _initialize(self, *args, **kwargs): | |||
| self._decoder = _DotProductLinkPredictonDecoder() | |||
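The decoder above is registered under three names by stacking `register_decoder` decorators. The toy registry below sketches how such stacking can work. It is not AutoGL's `DecoderUniversalRegistry`: the class `ToyDecoderRegistry`, its `get_decoder` method, and the placeholder decoder class are all hypothetical.

```python
class ToyDecoderRegistry:
    """Toy name-to-class registry (an illustration, not AutoGL's implementation)."""

    _entries = {}

    @classmethod
    def register_decoder(cls, name):
        # Return a decorator that records the class and hands it back unchanged,
        # so several registrations can be stacked on one class.
        def decorator(decoder_cls):
            cls._entries[name.lower()] = decoder_cls
            return decoder_cls

        return decorator

    @classmethod
    def get_decoder(cls, name):
        return cls._entries[name.lower()]


@ToyDecoderRegistry.register_decoder("dotproduct")
@ToyDecoderRegistry.register_decoder("lp-decoder")
@ToyDecoderRegistry.register_decoder("dot-product")
class ToyDotProductDecoder:
    """Placeholder standing in for a decoder maintainer."""


# All three aliases resolve to the same class.
same = {ToyDecoderRegistry.get_decoder(n) for n in ("dotproduct", "lp-decoder", "dot-product")}
print(len(same))
```

Because each decorator returns the class unchanged, renaming the class (as this hunk does) leaves the registered string aliases untouched.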
| @@ -181,7 +181,7 @@ class AutoHGT(BaseHeteroModelMaintainer): | |||
| r""" | |||
| AutoHGT. | |||
| The model used in this automodel is HGT, i.e., the graph convolutional network from the | |||
| `"Heterogeneous Graph Transformer" <https://arxiv.org/abs/2003.01332>`_paper. | |||
| `"Heterogeneous Graph Transformer" <https://arxiv.org/abs/2003.01332>`_ paper. | |||
| Parameters | |||
| ---------- | |||
| @@ -11,10 +11,10 @@ if DependentBackend.is_pyg(): | |||
| ) | |||
| else: | |||
| from ._dgl import ( | |||
| GCNMaintainer as GCNEncoderMaintainer, | |||
| GATMaintainer as GATEncoderMaintainer, | |||
| GCNEncoderMaintainer, | |||
| GATEncoderMaintainer, | |||
| GINEncoderMaintainer, | |||
| SAGEMaintainer as SAGEEncoderMaintainer, | |||
| SAGEEncoderMaintainer, | |||
| AutoTopKMaintainer | |||
| ) | |||
| @@ -26,5 +26,7 @@ __all__ = [ | |||
| "GATEncoderMaintainer", | |||
| "GINEncoderMaintainer", | |||
| "SAGEEncoderMaintainer", | |||
| "AutoTopKMaintainer" | |||
| ] | |||
| if DependentBackend.is_dgl(): | |||
| __all__.append("AutoTopKMaintainer") | |||
| @@ -1,5 +1,5 @@ | |||
| from ._gat import GATMaintainer | |||
| from ._gcn import GCNMaintainer | |||
| from ._gat import GATEncoderMaintainer | |||
| from ._gcn import GCNEncoderMaintainer | |||
| from ._gin import GINEncoderMaintainer | |||
| from ._sage import SAGEMaintainer | |||
| from ._sage import SAGEEncoderMaintainer | |||
| from ._topk import AutoTopKMaintainer | |||
| @@ -69,14 +69,16 @@ class GAT(torch.nn.Module): | |||
| @encoder_registry.EncoderUniversalRegistry.register_encoder('gat') | |||
| @encoder_registry.EncoderUniversalRegistry.register_encoder('gat_encoder') | |||
| class GATMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer): | |||
| class GATEncoderMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer): | |||
| r""" | |||
| AutoGAT. The model used in this automodel is GAT, i.e., the graph attentional network from the `"Graph Attention Networks" | |||
| <https://arxiv.org/abs/1710.10903>`_ paper. The layer is | |||
| .. math:: | |||
| \mathbf{x}^{\prime}_i = \alpha_{i,i}\mathbf{\Theta}\mathbf{x}_{i} + | |||
| \sum_{j \in \mathcal{N}(i)} \alpha_{i,j}\mathbf{\Theta}\mathbf{x}_{j} | |||
| where the attention coefficients :math:`\alpha_{i,j}` are computed as | |||
| .. math:: | |||
| \alpha_{i,j} = | |||
| \frac{ | |||
| @@ -87,6 +89,7 @@ class GATMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer): | |||
| \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top} | |||
| [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_k] | |||
| \right)\right)}. | |||
| Parameters | |||
| ---------- | |||
| input_dimension: `Optional[int]` | |||
| @@ -106,7 +109,7 @@ class GATMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer): | |||
| device: _typing.Union[torch.device, str, int, None] = ..., | |||
| *args, **kwargs | |||
| ): | |||
| super(GATMaintainer, self).__init__( | |||
| super(GATEncoderMaintainer, self).__init__( | |||
| input_dimension, final_dimension, device, *args, **kwargs | |||
| ) | |||
| self.hyper_parameters: _typing.Mapping[str, _typing.Any] = { | |||
| @@ -39,7 +39,7 @@ class _GCN(torch.nn.Module): | |||
| @encoder_registry.EncoderUniversalRegistry.register_encoder('gcn') | |||
| @encoder_registry.EncoderUniversalRegistry.register_encoder('gcn_encoder') | |||
| class GCNMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer): | |||
| class GCNEncoderMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer): | |||
| def __init__( | |||
| self, | |||
| input_dimension: _typing.Optional[int] = ..., | |||
| @@ -47,7 +47,7 @@ class GCNMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer): | |||
| device: _typing.Union[torch.device, str, int, None] = ..., | |||
| *args, **kwargs | |||
| ): | |||
| super(GCNMaintainer, self).__init__( | |||
| super(GCNEncoderMaintainer, self).__init__( | |||
| input_dimension, final_dimension, device, *args, **kwargs | |||
| ) | |||
| self.hyper_parameter_space = [ | |||
| @@ -43,7 +43,7 @@ class _SAGE(torch.nn.Module): | |||
| @encoder_registry.EncoderUniversalRegistry.register_encoder('sage') | |||
| @encoder_registry.EncoderUniversalRegistry.register_encoder('sage_encoder') | |||
| class SAGEMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer): | |||
| class SAGEEncoderMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer): | |||
| def __init__( | |||
| self, | |||
| input_dimension: _typing.Optional[int] = ..., | |||
| @@ -51,7 +51,7 @@ class SAGEMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer): | |||
| device: _typing.Union[torch.device, str, int, None] = ..., | |||
| *args, **kwargs | |||
| ): | |||
| super(SAGEMaintainer, self).__init__( | |||
| super(SAGEEncoderMaintainer, self).__init__( | |||
| input_dimension, final_dimension, device, *args, **kwargs | |||
| ) | |||
| self.hyper_parameter_space = [ | |||
| @@ -31,26 +31,53 @@ class GraphClassificationFullTrainer(BaseGraphClassificationTrainer): | |||
| Parameters | |||
| ---------- | |||
| model: ``BaseAutoModel`` or ``str`` | |||
| The (name of) model used to train and predict. | |||
| model: | |||
| Models can be ``str``, ``autogl.module.model.BaseAutoModel``, | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer`` or a tuple of (encoder, decoder) | |||
| if both encoder and decoder need to be specified. The encoder can be ``str`` or | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str`` | |||
| or ``autogl.module.model.decoders.BaseDecoderMaintainer``. | |||
| If only the encoder is specified, the decoder defaults to "logsoftmax". | |||
| num_features: int (Optional) | |||
| The number of features in dataset. default None | |||
| num_classes: int (Optional) | |||
| The number of classes. default None | |||
| num_graph_features: int (Optional) | |||
| The number of graph level features. default 0. | |||
| optimizer: ``Optimizer`` or ``str`` | |||
| The (name of) optimizer used to train and predict. | |||
| The (name of) optimizer used to train and predict. default torch.optim.Adam | |||
| lr: ``float`` | |||
| The learning rate of graph classification task. | |||
| The learning rate of the graph classification task. default 1e-4 | |||
| max_epoch: ``int`` | |||
| The max number of epochs in training. | |||
| The max number of epochs in training. default 100 | |||
| early_stopping_round: ``int`` | |||
| The round of early stop. | |||
| The round of early stop. default 100 | |||
| weight_decay: ``float`` | |||
| weight decay ratio, default 1e-4 | |||
| device: ``torch.device`` or ``str`` | |||
| The device where model will be running on. | |||
| init: ``bool`` | |||
| If True(False), the model will (not) be initialized. | |||
| feval: (Sequence of) ``Evaluation`` or ``str`` | |||
| The evaluation functions, default ``[LogLoss]`` | |||
| loss: ``str`` | |||
| The loss used. Default ``nll_loss``. | |||
| lr_scheduler_type: ``str`` (Optional) | |||
| The lr scheduler type used. Default None. | |||
| """ | |||
| space = None | |||
| @@ -516,9 +543,15 @@ class GraphClassificationFullTrainer(BaseGraphClassificationTrainer): | |||
| Parameters | |||
| ---------- | |||
| hp: ``dict``. | |||
| The hyperparameter used in the new instance. | |||
| model: The model used in the new instance of trainer. | |||
| The hyperparameter used in the new instance. Should contain 3 keys "trainer", "encoder", | |||
| and "decoder", with corresponding hyperparameters as values. | |||
| model: The new model | |||
| Models can be ``str``, ``autogl.module.model.BaseAutoModel``, | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer`` or a tuple of (encoder, decoder) | |||
| if both encoder and decoder need to be specified. The encoder can be ``str`` or | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str`` | |||
| or ``autogl.module.model.decoders.BaseDecoderMaintainer``. | |||
| restricted: ``bool``. | |||
| If False(True), the hyperparameter should (not) be updated from origin hyperparameter. | |||
| @@ -56,26 +56,47 @@ class LinkPredictionTrainer(BaseLinkPredictionTrainer): | |||
| Parameters | |||
| ---------- | |||
| model: ``BaseModel`` or ``str`` | |||
| The (name of) model used to train and predict. | |||
| model: | |||
| Models can be ``str``, ``autogl.module.model.BaseAutoModel``, | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer`` or a tuple of (encoder, decoder) | |||
| if both encoder and decoder need to be specified. The encoder can be ``str`` or | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str`` | |||
| or ``autogl.module.model.decoders.BaseDecoderMaintainer``. | |||
| If only the encoder is specified, the decoder defaults to "logsoftmax". | |||
| num_features: int (Optional) | |||
| The number of features in dataset. default None | |||
| optimizer: ``Optimizer`` or ``str`` | |||
| The (name of) optimizer used to train and predict. | |||
| The (name of) optimizer used to train and predict. default torch.optim.Adam | |||
| lr: ``float`` | |||
| The learning rate of link prediction task. | |||
| The learning rate of the link prediction task. default 1e-4 | |||
| max_epoch: ``int`` | |||
| The max number of epochs in training. | |||
| The max number of epochs in training. default 100 | |||
| early_stopping_round: ``int`` | |||
| The round of early stop. | |||
| The round of early stop. default 100 | |||
| weight_decay: ``float`` | |||
| weight decay ratio, default 1e-4 | |||
| device: ``torch.device`` or ``str`` | |||
| The device where model will be running on. | |||
| init: ``bool`` | |||
| If True(False), the model will (not) be initialized. | |||
| feval: (Sequence of) ``Evaluation`` or ``str`` | |||
| The evaluation functions, default ``[LogLoss]`` | |||
| loss: ``str`` | |||
| The loss used. Default ``nll_loss``. | |||
| lr_scheduler_type: ``str`` (Optional) | |||
| The lr scheduler type used. Default None. | |||
| """ | |||
| space = None | |||
| @@ -592,9 +613,15 @@ class LinkPredictionTrainer(BaseLinkPredictionTrainer): | |||
| Parameters | |||
| ---------- | |||
| hp: ``dict``. | |||
| The hyperparameter used in the new instance. | |||
| model: The model used in the new instance of trainer. | |||
| The hyperparameter used in the new instance. Should contain 3 keys "trainer", "encoder", | |||
| and "decoder", with corresponding hyperparameters as values. | |||
| model: The new model | |||
| Models can be ``str``, ``autogl.module.model.BaseAutoModel``, | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer`` or a tuple of (encoder, decoder) | |||
| if both encoder and decoder need to be specified. The encoder can be ``str`` or | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str`` | |||
| or ``autogl.module.model.decoders.BaseDecoderMaintainer``. | |||
| restricted: ``bool``. | |||
| If False(True), the hyperparameter should (not) be updated from origin hyperparameter. | |||
| @@ -34,26 +34,50 @@ class NodeClassificationFullTrainer(BaseNodeClassificationTrainer): | |||
| Parameters | |||
| ---------- | |||
| model: ``BaseModel`` or ``str`` | |||
| The (name of) model used to train and predict. | |||
| model: | |||
| Models can be ``str``, ``autogl.module.model.BaseAutoModel``, | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer`` or a tuple of (encoder, decoder) | |||
| if both encoder and decoder need to be specified. The encoder can be ``str`` or | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str`` | |||
| or ``autogl.module.model.decoders.BaseDecoderMaintainer``. | |||
| If only the encoder is specified, the decoder defaults to "logsoftmax". | |||
| num_features: int (Optional) | |||
| The number of features in dataset. default None | |||
| num_classes: int (Optional) | |||
| The number of classes. default None | |||
| optimizer: ``Optimizer`` or ``str`` | |||
| The (name of) optimizer used to train and predict. | |||
| The (name of) optimizer used to train and predict. default torch.optim.Adam | |||
| lr: ``float`` | |||
| The learning rate of node classification task. | |||
| The learning rate of node classification task. default 1e-4 | |||
| max_epoch: ``int`` | |||
| The max number of epochs in training. | |||
| The max number of epochs in training. default 100 | |||
| early_stopping_round: ``int`` | |||
| The round of early stop. | |||
| The round of early stop. default 100 | |||
| weight_decay: ``float`` | |||
| weight decay ratio, default 1e-4 | |||
| device: ``torch.device`` or ``str`` | |||
| The device where model will be running on. | |||
| init: ``bool`` | |||
| If True(False), the model will (not) be initialized. | |||
| feval: (Sequence of) ``Evaluation`` or ``str`` | |||
| The evaluation functions, default ``[LogLoss]`` | |||
| loss: ``str`` | |||
| The loss used. Default ``nll_loss``. | |||
| lr_scheduler_type: ``str`` (Optional) | |||
| The lr scheduler type used. Default None. | |||
| """ | |||
| def __init__( | |||
| @@ -161,6 +185,9 @@ class NodeClassificationFullTrainer(BaseNodeClassificationTrainer): | |||
| @classmethod | |||
| def get_task_name(cls): | |||
| """ | |||
| Derive the task name. (NodeClassification) | |||
| """ | |||
| return "NodeClassification" | |||
| def __train_only(self, data, train_mask=None): | |||
| @@ -437,16 +464,22 @@ class NodeClassificationFullTrainer(BaseNodeClassificationTrainer): | |||
| return res[0] | |||
| return res | |||
| def duplicate_from_hyper_parameter(self, hp: dict, encoder="same", decoder="same", restricted=True): | |||
| def duplicate_from_hyper_parameter(self, hp: dict, model=None, restricted=True): | |||
| """ | |||
| The function of duplicating a new instance from the given hyperparameter. | |||
| Parameters | |||
| ---------- | |||
| hp: ``dict``. | |||
| The hyperparameter used in the new instance. | |||
| The hyperparameter used in the new instance. Should contain 3 keys "trainer", "encoder", | |||
| and "decoder", with corresponding hyperparameters as values. | |||
| model: The model used in the new instance of trainer. | |||
| model: | |||
| Models can be ``str``, ``autogl.module.model.BaseAutoModel``, | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer`` or a tuple of (encoder, decoder) | |||
| if both encoder and decoder need to be specified. The encoder can be ``str`` or | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str`` | |||
| or ``autogl.module.model.decoders.BaseDecoderMaintainer``. | |||
| restricted: ``bool``. | |||
| If False(True), the hyperparameter should (not) be updated from origin hyperparameter. | |||
| @@ -457,6 +490,17 @@ class NodeClassificationFullTrainer(BaseNodeClassificationTrainer): | |||
| A new instance of trainer. | |||
| """ | |||
| if isinstance(model, Tuple): | |||
| encoder, decoder = model | |||
| elif isinstance(model, BaseAutoModel): | |||
| encoder, decoder = model, None | |||
| elif isinstance(model, BaseEncoderMaintainer): | |||
| encoder, decoder = model, self.decoder | |||
| elif model is None: | |||
| encoder, decoder = self.encoder, self.decoder | |||
| else: | |||
| raise TypeError(f"Cannot parse model with type {type(model)}") | |||
| hp_trainer = hp.get("trainer", {}) | |||
| hp_encoder = hp.get("encoder", {}) | |||
| hp_decoder = hp.get("decoder", {}) | |||
| @@ -466,8 +510,6 @@ class NodeClassificationFullTrainer(BaseNodeClassificationTrainer): | |||
| hp = origin_hp | |||
| else: | |||
| hp = hp_trainer | |||
| encoder = encoder if encoder != "same" else self.encoder | |||
| decoder = decoder if decoder != "same" else self.decoder | |||
| encoder = encoder.from_hyper_parameter(hp_encoder) | |||
| if isinstance(encoder, BaseEncoderMaintainer) and isinstance(decoder, BaseDecoderMaintainer): | |||
| decoder = decoder.from_hyper_parameter_and_encoder(hp_decoder, encoder) | |||
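The new `model` argument of `duplicate_from_hyper_parameter` is unpacked into an (encoder, decoder) pair before the hyperparameters are applied. The standalone sketch below mirrors that dispatch and the three-key `hp` layout so the branches can be tried in isolation. The stub classes and the helper names `resolve_model` and `split_hp` are hypothetical. Note that in this sketch, as in the lines shown in the hunk, a plain `str` falls through to the `TypeError` branch.

```python
class BaseAutoModel:
    """Stub standing in for autogl.module.model.BaseAutoModel."""


class BaseEncoderMaintainer:
    """Stub standing in for autogl.module.model.encoders.BaseEncoderMaintainer."""


def resolve_model(model, current_encoder, current_decoder):
    """Turn the `model` argument into an (encoder, decoder) pair."""
    if isinstance(model, tuple):
        encoder, decoder = model              # explicit (encoder, decoder)
    elif isinstance(model, BaseAutoModel):
        encoder, decoder = model, None        # a full model needs no decoder
    elif isinstance(model, BaseEncoderMaintainer):
        encoder, decoder = model, current_decoder  # keep the trainer's decoder
    elif model is None:
        encoder, decoder = current_encoder, current_decoder  # reuse both
    else:
        raise TypeError(f"Cannot parse model with type {type(model)}")
    return encoder, decoder


def split_hp(hp: dict):
    """Split hyperparameters into trainer, encoder, and decoder parts."""
    return hp.get("trainer", {}), hp.get("encoder", {}), hp.get("decoder", {})


hp_trainer, hp_encoder, hp_decoder = split_hp({"trainer": {"lr": 0.01}})
print(hp_trainer, hp_encoder, hp_decoder)
```

Missing keys in `hp` fall back to empty dictionaries, matching the `hp.get(..., {})` calls in the hunk.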
| @@ -39,24 +39,49 @@ def score(logits, labels): | |||
| @register_trainer("NodeClassificationHet") | |||
| class NodeClassificationHetTrainer(BaseNodeClassificationHetTrainer): | |||
| """ | |||
| The node classification trainer. | |||
| Used to automatically train the node classification problem. | |||
| The heterogeneous node classification trainer. | |||
| Parameters | |||
| ---------- | |||
| model: ``BaseAutoModel`` or ``str`` | |||
| The (name of) model used to train and predict. | |||
| model: ``autogl.module.model.BaseAutoModel`` | |||
| Currently Heterogeneous trainer doesn't support decoupled model setting. | |||
| num_features: ``int`` (Optional) | |||
| The number of features in dataset. default None | |||
| num_classes: ``int`` (Optional) | |||
| The number of classes. default None | |||
| optimizer: ``Optimizer`` or ``str`` | |||
| The (name of) optimizer used to train and predict. | |||
| The (name of) optimizer used to train and predict. default torch.optim.Adam | |||
| lr: ``float`` | |||
| The learning rate of node classification task. | |||
| The learning rate of node classification task. default 1e-4 | |||
| max_epoch: ``int`` | |||
| The max number of epochs in training. | |||
| The max number of epochs in training. default 100 | |||
| early_stopping_round: ``int`` | |||
| The round of early stop. | |||
| The round of early stop. default 100 | |||
| weight_decay: ``float`` | |||
| weight decay ratio, default 1e-4 | |||
| device: ``torch.device`` or ``str`` | |||
| The device where model will be running on. | |||
| init: ``bool`` | |||
| If True(False), the model will (not) be initialized. | |||
| feval: (Sequence of) ``Evaluation`` or ``str`` | |||
| The evaluation functions, default ``[LogLoss]`` | |||
| loss: ``str`` | |||
| The loss used. Default ``nll_loss``. | |||
| lr_scheduler_type: ``str`` (Optional) | |||
| The lr scheduler type used. Default None. | |||
| """ | |||
| def __init__( | |||
| @@ -163,6 +188,9 @@ class NodeClassificationHetTrainer(BaseNodeClassificationHetTrainer): | |||
| @classmethod | |||
| def get_task_name(cls): | |||
| """ | |||
| Get task name ("NodeClassificationHet") | |||
| """ | |||
| return "NodeClassificationHet" | |||
| def _train_only(self, dataset, train_mask="train"): | |||
| @@ -310,14 +338,17 @@ class NodeClassificationHetTrainer(BaseNodeClassificationHetTrainer): | |||
| def get_valid_score(self, return_major=True): | |||
| """ | |||
| The function of getting the valid score. | |||
| Parameters | |||
| ---------- | |||
| return_major: ``bool``. | |||
| If True, the return only consists of the major result. | |||
| If False, the return consists of the all results. | |||
| Returns | |||
| ------- | |||
| result: The valid score in training stage. | |||
| """ | |||
| if isinstance(self.feval, list): | |||
| if return_major: | |||
| @@ -398,17 +429,24 @@ class NodeClassificationHetTrainer(BaseNodeClassificationHetTrainer): | |||
| def duplicate_from_hyper_parameter(self, hp: dict, model=None, restricted=True): | |||
| """ | |||
| The function of duplicating a new instance from the given hyperparameter. | |||
| Parameters | |||
| ---------- | |||
| hp: ``dict``. | |||
| The hyperparameter used in the new instance. | |||
| model: The model used in the new instance of trainer. | |||
| The hyperparameter used in the new instance. Should contain 2 keys "trainer", "encoder" | |||
| with corresponding hyperparameters as values. | |||
| model: ``autogl.module.model.BaseAutoModel`` | |||
| Currently Heterogeneous trainer doesn't support decoupled model setting. | |||
| If only encoder is specified, decoder will be default to "logsoftmax" | |||
| restricted: ``bool``. | |||
| If False(True), the hyperparameter should (not) be updated from origin hyperparameter. | |||
| Returns | |||
| ------- | |||
| self: ``autogl.train.NodeClassificationTrainer`` | |||
| A new instance of trainer. | |||
| """ | |||
| trainer_hp = hp["trainer"] | |||
| model_hp = hp["encoder"] | |||
| @@ -50,15 +50,20 @@ class AutoGraphClassifier(BaseClassifier): | |||
| If given, will set the number eval times the hpo module will use. | |||
| Only be effective when hpo_module is ``str``. Default ``None``. | |||
| default_trainer: str (Optional) | |||
| The (name of) the trainer used in this solver. Default to ``NodeClassificationFull``. | |||
| trainer_hp_space: Iterable[dict] (Optional) | |||
| trainer hp space or list of trainer hp spaces configuration. | |||
| If a single trainer hp is given, will specify the hp space of trainer for | |||
| every model. If a list of trainer hp is given, will specify every model | |||
| with corresponding trainer hp space. Default ``None``. | |||
| model_hp_spaces: Iterable[Iterable[dict]] (Optional) | |||
| model_hp_spaces: list of list of dict (Optional) | |||
| model hp space configuration. | |||
| If given, will specify every hp space of every passed model. Default ``None``. | |||
| If the encoder(-decoder) is passed, the space should be a dict containing keys "encoder" | |||
| and "decoder", specifying the detailed encoder decoder hp spaces. | |||
| size: int (Optional) | |||
| The max models ensemble module will use. Default ``None``. | |||
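Review note on the docstring above: for decoupled models, the hp space is described as a dict with the keys "encoder" and "decoder". A minimal sketch of such a space and a shape check follows. The helper `is_decoupled_space` is hypothetical, and the field names inside the single space entry (`parameterName`, `type`, `minValue`, `maxValue`, `scalingType`) are an assumption about the entry format, not something this diff states.

```python
REQUIRED_KEYS = ("encoder", "decoder")


def is_decoupled_space(space) -> bool:
    """Return True if `space` is a dict holding a list of dicts under each required key."""
    if not isinstance(space, dict):
        return False
    return all(
        isinstance(space.get(key), list)
        and all(isinstance(entry, dict) for entry in space[key])
        for key in REQUIRED_KEYS
    )


# Assumed entry format: one tunable dropout rate for the encoder.
encoder_space = [
    {
        "parameterName": "dropout",
        "type": "DOUBLE",
        "minValue": 0.2,
        "maxValue": 0.8,
        "scalingType": "LINEAR",
    }
]
# A decoder without tunable hyperparameters gets an empty space.
decoder_space = []

# One entry per passed model; here a single encoder-decoder model.
model_hp_spaces = [{"encoder": encoder_space, "decoder": decoder_space}]
```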
| @@ -605,6 +610,53 @@ class AutoGraphClassifier(BaseClassifier): | |||
| label=None, | |||
| metric="acc" | |||
| ): | |||
| """ | |||
| Evaluate the given dataset. | |||
| Parameters | |||
| ---------- | |||
| dataset: torch_geometric.data.dataset.Dataset or None | |||
| The dataset needed to predict. If ``None``, will use the processed dataset passed | |||
| to ``fit()`` instead. Default ``None``. | |||
| inplaced: bool | |||
| Whether the given dataset is processed. Only be effective when ``dataset`` | |||
| is not ``None``. If you pass the dataset to ``fit()`` with ``inplace=True``, and | |||
| you pass the dataset again to this method, you should set this argument to ``True``. | |||
| Otherwise ``False``. Default ``False``. | |||
| inplace: bool | |||
| Whether we process the given dataset in inplace manner. Default ``False``. Set it to | |||
| True if you want to save memory by modifying the given dataset directly. | |||
| use_ensemble: bool | |||
| Whether to use ensemble to do the predict. Default ``True``. | |||
| use_best: bool | |||
| Whether to use the best single model to do the predict. Will only be effective when | |||
| ``use_ensemble`` is ``False``. Default ``True``. | |||
| name: str or None | |||
| The name of model used to predict. Will only be effective when ``use_ensemble`` and | |||
| ``use_best`` both are ``False``. Default ``None``. | |||
| mask: str | |||
| The data split to give prediction on. Default ``test``. | |||
| label: torch.Tensor (Optional) | |||
| The ground truth label of the given predicted dataset split. If not passed, will extract | |||
| labels from the input dataset. | |||
| metric: str | |||
| The metric to be used for evaluating the model. Default ``acc``. | |||
| Returns | |||
| ------- | |||
| score(s): (list of) evaluation scores | |||
| the evaluation results according to the evaluator passed. | |||
| """ | |||
| predicted = self.predict_proba(dataset, inplaced, inplace, use_ensemble, use_best, name, mask) | |||
| if dataset is None: | |||
| dataset = self.dataset | |||
| @@ -47,6 +47,9 @@ class AutoHeteroNodeClassifier(BaseClassifier): | |||
| If given, will set the number eval times the hpo module will use. | |||
| Only be effective when hpo_module is ``str``. Default ``None``. | |||
| default_trainer: str (Optional) | |||
| The (name of) the trainer used in this solver. Default to ``NodeClassificationFull``. | |||
| trainer_hp_space: list of dict (Optional) | |||
| trainer hp space or list of trainer hp spaces configuration. | |||
| If a single trainer hp is given, will specify the hp space of trainer for every model. | |||
| @@ -57,6 +60,8 @@ class AutoHeteroNodeClassifier(BaseClassifier): | |||
| model_hp_spaces: list of list of dict (Optional) | |||
| model hp space configuration. | |||
| If given, will specify every hp space of every passed model. Default ``None``. | |||
| If the encoder(-decoder) is passed, the space should be a dict containing keys "encoder" | |||
| and "decoder", specifying the detailed encoder decoder hp spaces. | |||
| size: int (Optional) | |||
| The max models ensemble module will use. Default ``None``. | |||
| @@ -542,6 +547,53 @@ class AutoHeteroNodeClassifier(BaseClassifier): | |||
| label=None, | |||
| metric="acc" | |||
| ): | |||
| """ | |||
| Evaluate the given dataset. | |||
| Parameters | |||
| ---------- | |||
| dataset: torch_geometric.data.dataset.Dataset or None | |||
| The dataset needed to predict. If ``None``, will use the processed dataset passed | |||
| to ``fit()`` instead. Default ``None``. | |||
| inplaced: bool | |||
| Whether the given dataset is processed. Only be effective when ``dataset`` | |||
| is not ``None``. If you pass the dataset to ``fit()`` with ``inplace=True``, and | |||
| you pass the dataset again to this method, you should set this argument to ``True``. | |||
| Otherwise ``False``. Default ``False``. | |||
| inplace: bool | |||
| Whether we process the given dataset in inplace manner. Default ``False``. Set it to | |||
| True if you want to save memory by modifying the given dataset directly. | |||
| use_ensemble: bool | |||
| Whether to use ensemble to do the predict. Default ``True``. | |||
| use_best: bool | |||
| Whether to use the best single model to do the predict. Will only be effective when | |||
| ``use_ensemble`` is ``False``. Default ``True``. | |||
| name: str or None | |||
| The name of model used to predict. Will only be effective when ``use_ensemble`` and | |||
| ``use_best`` both are ``False``. Default ``None``. | |||
| mask: str | |||
| The data split to give prediction on. Default ``test``. | |||
| label: torch.Tensor (Optional) | |||
| The ground truth label of the given predicted dataset split. If not passed, will extract | |||
| labels from the input dataset. | |||
| metric: str | |||
| The metric to be used for evaluating the model. Default ``acc``. | |||
| Returns | |||
| ------- | |||
| score(s): (list of) evaluation scores | |||
| the evaluation results according to the evaluator passed. | |||
| """ | |||
| predicted = self.predict_proba(dataset, use_ensemble, use_best, name, mask) | |||
| if dataset is None: | |||
| dataset = self.dataset | |||
| @@ -67,6 +67,9 @@ class AutoLinkPredictor(BaseClassifier): | |||
| If given, will set the number eval times the hpo module will use. | |||
| Only be effective when hpo_module is ``str``. Default ``None``. | |||
| default_trainer: str (Optional) | |||
| The (name of) the trainer used in this solver. Default to ``NodeClassificationFull``. | |||
| trainer_hp_space: list of dict (Optional) | |||
| trainer hp space or list of trainer hp spaces configuration. | |||
| If a single trainer hp is given, will specify the hp space of trainer for every model. | |||
| @@ -77,6 +80,8 @@ class AutoLinkPredictor(BaseClassifier): | |||
| model_hp_spaces: list of list of dict (Optional) | |||
| model hp space configuration. | |||
| If given, will specify every hp space of every passed model. Default ``None``. | |||
| If the encoder(-decoder) is passed, the space should be a dict containing keys "encoder" | |||
| and "decoder", specifying the detailed encoder decoder hp spaces. | |||
| size: int (Optional) | |||
| The max models ensemble module will use. Default ``None``. | |||
| @@ -668,8 +673,55 @@ class AutoLinkPredictor(BaseClassifier): | |||
| name=None, | |||
| mask="test", | |||
| label=None, | |||
| metric="acc" | |||
| metric="auc" | |||
| ): | |||
| """ | |||
| Evaluate the given dataset. | |||
| Parameters | |||
| ---------- | |||
| dataset: torch_geometric.data.dataset.Dataset or None | |||
| The dataset needed to predict. If ``None``, will use the processed dataset passed | |||
| to ``fit()`` instead. Default ``None``. | |||
| inplaced: bool | |||
| Whether the given dataset is processed. Only be effective when ``dataset`` | |||
| is not ``None``. If you pass the dataset to ``fit()`` with ``inplace=True``, and | |||
| you pass the dataset again to this method, you should set this argument to ``True``. | |||
| Otherwise ``False``. Default ``False``. | |||
| inplace: bool | |||
| Whether we process the given dataset in inplace manner. Default ``False``. Set it to | |||
| True if you want to save memory by modifying the given dataset directly. | |||
| use_ensemble: bool | |||
| Whether to use ensemble to do the predict. Default ``True``. | |||
| use_best: bool | |||
| Whether to use the best single model to do the predict. Will only be effective when | |||
| ``use_ensemble`` is ``False``. Default ``True``. | |||
| name: str or None | |||
| The name of model used to predict. Will only be effective when ``use_ensemble`` and | |||
| ``use_best`` both are ``False``. Default ``None``. | |||
| mask: str | |||
| The data split to give prediction on. Default ``test``. | |||
| label: torch.Tensor (Optional) | |||
| The ground truth label of the given predicted dataset split. If not passed, will extract | |||
| labels from the input dataset. | |||
| metric: str | |||
| The metric to be used for evaluating the model. Default ``auc``. | |||
| Returns | |||
| ------- | |||
| score(s): (list of) evaluation scores | |||
| the evaluation results according to the evaluator passed. | |||
| """ | |||
| if dataset is None: | |||
| dataset = self.dataset | |||
| assert dataset is not None, ( | |||
| @@ -37,8 +37,12 @@ class AutoNodeClassifier(BaseClassifier): | |||
| The (name of) auto feature engineer used to process the given dataset. Default ``deepgl``. | |||
| Disable feature engineer by setting it to ``None``. | |||
| graph_models: list of autogl.module.model.BaseModel or list of str | |||
| The (name of) models to be optimized as backbone. Default ``['gat', 'gcn']``. | |||
| graph_models: Sequence of models | |||
| Models can be ``str``, ``autogl.module.model.BaseAutoModel``, | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer`` or a tuple of (encoder, decoder) | |||
| if need to specify both encoder and decoder. Encoder can be ``str`` or | |||
| ``autogl.module.model.encoders.BaseEncoderMaintainer``, and decoder can be ``str`` | |||
| or ``autogl.module.model.decoders.BaseDecoderMaintainer``. | |||
| nas_algorithms: (list of) autogl.module.nas.algorithm.BaseNAS or str (Optional) | |||
| The (name of) nas algorithms used. Default ``None``. | |||
| @@ -60,6 +64,9 @@ class AutoNodeClassifier(BaseClassifier): | |||
| max_evals: int (Optional) | |||
| If given, will set the number eval times the hpo module will use. | |||
| Only be effective when hpo_module is ``str``. Default ``None``. | |||
| default_trainer: str (Optional) | |||
| The (name of) the trainer used in this solver. Default to ``NodeClassificationFull``. | |||
| trainer_hp_space: list of dict (Optional) | |||
| trainer hp space or list of trainer hp spaces configuration. | |||
| @@ -71,6 +78,8 @@ class AutoNodeClassifier(BaseClassifier): | |||
| model_hp_spaces: list of list of dict (Optional) | |||
| model hp space configuration. | |||
| If given, will specify every hp space of every passed model. Default ``None``. | |||
| If the encoder(-decoder) is passed, the space should be a dict containing keys "encoder" | |||
| and "decoder", specifying the detailed encoder decoder hp spaces. | |||
| size: int (Optional) | |||
| The max models ensemble module will use. Default ``None``. | |||
| @@ -681,6 +690,53 @@ class AutoNodeClassifier(BaseClassifier): | |||
| label=None, | |||
| metric="acc" | |||
| ): | |||
| """ | |||
| Evaluate the given dataset. | |||
| Parameters | |||
| ---------- | |||
| dataset: torch_geometric.data.dataset.Dataset or None | |||
| The dataset needed to predict. If ``None``, will use the processed dataset passed | |||
| to ``fit()`` instead. Default ``None``. | |||
| inplaced: bool | |||
| Whether the given dataset is processed. Only be effective when ``dataset`` | |||
| is not ``None``. If you pass the dataset to ``fit()`` with ``inplace=True``, and | |||
| you pass the dataset again to this method, you should set this argument to ``True``. | |||
| Otherwise ``False``. Default ``False``. | |||
| inplace: bool | |||
| Whether we process the given dataset in inplace manner. Default ``False``. Set it to | |||
| True if you want to save memory by modifying the given dataset directly. | |||
| use_ensemble: bool | |||
| Whether to use ensemble to do the predict. Default ``True``. | |||
| use_best: bool | |||
| Whether to use the best single model to do the predict. Will only be effective when | |||
| ``use_ensemble`` is ``False``. Default ``True``. | |||
| name: str or None | |||
| The name of model used to predict. Will only be effective when ``use_ensemble`` and | |||
| ``use_best`` both are ``False``. Default ``None``. | |||
| mask: str | |||
| The data split to give prediction on. Default ``test``. | |||
| label: torch.Tensor (Optional) | |||
| The ground truth label of the given predicted dataset split. If not passed, will extract | |||
| labels from the input dataset. | |||
| metric: str | |||
| The metric to be used for evaluating the model. Default ``acc``. | |||
| Returns | |||
| ------- | |||
| score(s): (list of) evaluation scores | |||
| the evaluation results according to the evaluator passed. | |||
| """ | |||
| predicted = self.predict_proba(dataset, inplaced, inplace, use_ensemble, use_best, name, mask) | |||
| if dataset is None: | |||
| dataset = self.dataset | |||
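Review note on the `evaluate()` docstrings above: `use_ensemble`, `use_best` and `name` form a precedence chain, where each later argument only takes effect when the earlier ones are ``False``. A minimal sketch of that precedence is given below; `select_predictor` is a hypothetical helper that returns a label, and the error for a missing `name` is an assumption, since the docstrings do not say what happens in that case.

```python
def select_predictor(use_ensemble=True, use_best=True, name=None):
    """Return which predictor the documented precedence would pick."""
    if use_ensemble:
        # Highest priority: the ensemble, regardless of the other arguments.
        return "ensemble"
    if use_best:
        # Only consulted when use_ensemble is False.
        return "best"
    if name is None:
        # Assumed behavior: with both flags off, a model name is required.
        raise ValueError("name must be given when use_ensemble and use_best are False")
    # Only consulted when both flags are False.
    return name
```

For example, `select_predictor(use_ensemble=False, use_best=False, name="gcn")` picks the named model, while leaving the defaults untouched picks the ensemble.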
| @@ -10,10 +10,18 @@ BUILDDIR = _build | |||
| # Put it first so that "make" without argument is like "make help". | |||
| help: | |||
| @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | |||
| $(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | |||
| .PHONY: help Makefile | |||
| pyg: | |||
| @AUTOGL_BACKEND=pyg $(SPHINXBUILD) -M html "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | |||
| dgl: | |||
| @AUTOGL_BACKEND=dgl $(SPHINXBUILD) -M html "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | |||
| html: Makefile pyg dgl | |||
| # Catch-all target: route all unknown targets to Sphinx using the new | |||
| # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). | |||
| %: Makefile | |||
| @@ -23,7 +23,7 @@ copyright = '2020, THUMNLab/aglteam' | |||
| author = 'THUMNLab/aglteam' | |||
| # The full version, including alpha/beta/rc tags | |||
| release = 'v0.2.0rc0' | |||
| release = 'v0.3.0rc0' | |||
| # -- General configuration --------------------------------------------------- | |||
| @@ -3,8 +3,9 @@ | |||
| autogl.datasets | |||
| =============== | |||
| We integrate the datasets from `PyTorch Geometric <https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html>`_, `CogDL <https://cogdl.readthedocs.io/en/latest/autoapi/datasets/index.html>`_ and `OGB <https://ogb.stanford.edu/docs/dataset_overview/>`_. We also list some datasets from `CogDL` for simplicity. | |||
| We integrate the datasets from `PyTorch Geometric <https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html>`_, `DGL <https://dgl.ai>`_ and `OGB <https://ogb.stanford.edu/docs/dataset_overview/>`_. We also list some datasets from `CogDL` for simplicity. | |||
| .. toctree:: | |||
| .. automodule:: autogl.datasets | |||
| :members: | |||
| dataset/dgl.rst | |||
| dataset/pyg.rst | |||
| @@ -0,0 +1,5 @@ | |||
| Deep Graph Library Dataset | |||
| ========================== | |||
| .. automodule:: autogl.datasets | |||
| :members: | |||
| @@ -0,0 +1,5 @@ | |||
| PyTorch Geometric Dataset | |||
| ========================== | |||
| .. automodule:: autogl.datasets | |||
| :members: | |||
| @@ -3,7 +3,7 @@ | |||
| autogl.module.feature | |||
| ===================== | |||
| Several feature engineering operations are collected manually, or from PyTorch Geometric, NetworkX, etc. | |||
| We support feature engineering for both the PyTorch Geometric and Deep Graph Library backends. | |||
| .. automodule:: autogl.module.feature | |||
| :members: | |||
| @@ -3,5 +3,7 @@ | |||
| autogl.module.model | |||
| ------------------- | |||
| .. automodule:: autogl.module.model | |||
| :members: | |||
| .. toctree:: | |||
| model/dgl.rst | |||
| model/pyg.rst | |||
| @@ -0,0 +1,20 @@ | |||
| Deep Graph Library Backend | |||
| ========================== | |||
| Models | |||
| ~~~~~~ | |||
| .. automodule:: autogl.module.model.dgl | |||
| :members: | |||
| Encoders | |||
| ~~~~~~~~ | |||
| .. autoclass:: autogl.module.model.encoders.GCNEncoderMaintainer | |||
| :members: from_hyper_parameter, initialize, get_output_dimensions, hyper_parameter_space, hyper_parameters | |||
| Decoders | |||
| ~~~~~~~~ | |||
| .. automodule:: autogl.module.model.decoders | |||
| :members: | |||
| @@ -0,0 +1,21 @@ | |||
| PyTorch Geometric Backend | |||
| ========================= | |||
| Models | |||
| ~~~~~~ | |||
| .. automodule:: autogl.module.model.pyg | |||
| :members: | |||
| Encoders | |||
| ~~~~~~~~ | |||
| .. automodule:: autogl.module.model.encoders | |||
| :members: | |||
| Decoders | |||
| ~~~~~~~~ | |||
| .. automodule:: autogl.module.model.decoders | |||
| :members: | |||
| @@ -0,0 +1,35 @@ | |||
| .. _backend: | |||
| Backend Support | |||
| =============== | |||
| Currently, AutoGL supports both the PyTorch Geometric backend and the Deep Graph Library backend, so that | |||
| users of either library can benefit from automated graph learning. | |||
| To select a specific backend, set the environment variable | |||
| ``AUTOGL_BACKEND``. For example: | |||
| .. code-block :: shell | |||
| AUTOGL_BACKEND=pyg python xxx.py | |||
| or | |||
| .. code-block :: python | |||
| import os | |||
| os.environ["AUTOGL_BACKEND"] = "pyg" | |||
| import autogl | |||
| ... | |||
| If no backend is specified, AutoGL will use the backend in your environment. If you have both | |||
| Deep Graph Library and PyTorch Geometric installed, the default backend will be Deep Graph Library. | |||
| You can also query the current backend in code: | |||
| .. code-block :: python | |||
| from autogl.backend import DependentBackend | |||
| print(DependentBackend.get_backend_name()) | |||
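Review note on the backend tutorial above: the selection rule it describes (an explicit ``AUTOGL_BACKEND`` wins, otherwise use whatever is installed, preferring Deep Graph Library when both are present) can be sketched as follows. `resolve_backend` and its two injectable parameters are hypothetical and only illustrate the documented rule; this is not how AutoGL itself is implemented, and the error raised when neither library is found is an assumption.

```python
import importlib.util
import os


def resolve_backend(environ=None, is_installed=None):
    """Pick a backend name following the rule described in the tutorial."""
    environ = os.environ if environ is None else environ
    if is_installed is None:
        # Default probe: a top-level package counts as installed if its spec can be found.
        def is_installed(module):
            return importlib.util.find_spec(module) is not None

    requested = environ.get("AUTOGL_BACKEND")
    if requested:
        # An explicit choice always takes priority.
        return requested
    if is_installed("dgl"):
        # With both libraries present, Deep Graph Library is the default.
        return "dgl"
    if is_installed("torch_geometric"):
        return "pyg"
    # Assumed behavior when no backend is available.
    raise ImportError("neither dgl nor torch_geometric is installed")
```

Passing `environ` and `is_installed` explicitly makes the rule easy to check without touching the real environment, for example `resolve_backend({}, lambda module: True)`.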
| @@ -1,12 +1,12 @@ | |||
| .. _hetero_node_clf: | |||
| Node Classification for Heterogeneous Graph | |||
| ============== | |||
| =========================================== | |||
| This tutorial introduces how to use AutoGL to automate the learning of heterogeneous graphs in Deep Graph Library (DGL). | |||
| Creating a Heterogeneous Graph | |||
| ------------------- | |||
| ------------------------------ | |||
| AutoGL supports datasets created in DGL. We provide two datasets named "hetero-acm-han" and "hetero-acm-hgt" for HAN and HGT models, respectively [1]. | |||
| The following code snippet is an example for loading a heterogeneous graph. | |||
| @@ -33,7 +33,7 @@ You can also access to data stored in the dataset object for more details: | |||
| You can also build your own dataset and do feature engineering by adding files in the location AutoGL/autogl/datasets/_heterogeneous_datasets/_dgl_heterogeneous_datasets.py. We suggest users create a data object of type torch_geometric.data.HeteroData, referring to the official documentation of DGL. | |||
| Building Heterogeneous GNN Modules | |||
| ------------------- | |||
| ---------------------------------- | |||
| AutoGL integrates commonly used heterogeneous graph neural network models such as HeteroRGCN (Schlichtkrull et al., 2018) [2], HAN (Wang et al., 2019) [3] and HGT (Hu et al., 2020) [4]. | |||
| .. code-block:: python | |||
| @@ -78,7 +78,7 @@ Finally, evaluate the model. | |||
| You can also define your own heterogeneous graph neural network models by adding files in the location AutoGL/autogl/module/model/dgl/hetero. | |||
| Automatic Search for Node Classification Tasks | |||
| ------------------- | |||
| ---------------------------------------------- | |||
| On top of the modules mentioned above, we provide a high-level API Solver to control the overall pipeline. The solver AutoHeteroNodeClassifier encapsulates the training process described in the Building Heterogeneous GNN Modules part and supports automatic hyperparameter optimization as well as feature engineering and ensembling. | |||
| In this part, we will show you how to use AutoHeteroNodeClassifier to automatically predict the publishing conference of a paper using the ACM academic graph dataset. | |||
| @@ -19,7 +19,7 @@ The estimation strategy gives the performance of certain architectures when it i | |||
| The simplest option is to perform a standard training and validation of the architecture on data. | |||
| Since many architectures need to be estimated during the search, the estimation strategy should be very efficient to save computational resources. | |||
| .. image:: ../resources/nas.svg | |||
| .. image:: ../../../resources/nas.svg | |||
| :align: center | |||
| To be more flexible, we modularize the NAS process into three parts: algorithm, space and estimator, corresponding to the search strategy, the search space and the estimation strategy, respectively. | |||
| @@ -13,7 +13,7 @@ The workflow below shows the overall framework of AutoGL. | |||
| .. image:: ../resources/workflow.svg | |||
| :align: center | |||
| AutoGL uses ``AutoGL Dataset`` to maintain datasets for graph-based machine learning, which is based on the dataset in PyTorch Geometric with some support added to corporate with the auto solver framework. | |||
| AutoGL uses ``AutoGL Dataset`` to maintain datasets for graph-based machine learning, which is based on the dataset in PyTorch Geometric or Deep Graph Library with some support added to cooperate with the auto solver framework. | |||
| Different graph-based machine learning tasks are solved by different ``AutoGL Solvers``, which make use of five main modules to automatically solve given tasks, namely ``Auto Feature Engineer``, ``Auto Model``, ``Neural Architecture Search``, ``HyperParameter Optimization``, and ``Auto Ensemble``. | |||
| @@ -31,9 +31,17 @@ Please make sure you meet the following requirements before installing AutoGL. | |||
| see `PyTorch <https://pytorch.org/>`_ for installation. | |||
| 3. PyTorch Geometric (>=1.7.0) | |||
| 3. Graph Library Backend | |||
| see `PyTorch Geometric <https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html>`_ for installation. | |||
| You will need either PyTorch Geometric (PyG) or Deep Graph Library (DGL) as the backend. | |||
| 3.1 PyTorch Geometric (>=1.7.0) | |||
| see <https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html> for installation. | |||
| 3.2 Deep Graph Library (>=0.7.0) | |||
| see <https://dgl.ai> for installation. | |||
| Installation | |||
| ~~~~~~~~~~~~ | |||
| @@ -86,8 +94,14 @@ In AutoGL, the tasks are solved by corresponding solvers, which in general do th | |||
| :caption: Tutorial | |||
| docfile/tutorial/t_quickstart | |||
| docfile/tutorial/t_dataset | |||
| docfile/tutorial/t_fe | |||
| docfile/tutorial/t_hetero_node_clf | |||
| docfile/tutorial/t_homo_graph_classification_gin | |||
| docfile/tutorial/t_backend | |||
| .. | |||
| docfile/tutorial/t_dataset | |||
| docfile/tutorial/t_fe | |||
| docfile/tutorial/t_model | |||
| docfile/tutorial/t_trainer | |||
| docfile/tutorial/t_hpo | |||
| @@ -99,9 +113,9 @@ In AutoGL, the tasks are solved by corresponding solvers, which in general do th | |||
| :maxdepth: 2 | |||
| :caption: Documentation | |||
| docfile/documentation/data | |||
| docfile/documentation/data | |||
| docfile/documentation/dataset | |||
| docfile/documentation/feature | |||
| docfile/documentation/feature | |||
| docfile/documentation/model | |||
| docfile/documentation/train | |||
| docfile/documentation/hpo | |||
| @@ -13,11 +13,5 @@ requests | |||
| scikit-learn | |||
| scipy | |||
| tabulate | |||
| # https://download.pytorch.org/whl/lts/1.8/cpu/torch-1.8.1%2Bcpu-cp36-cp36m-linux_x86_64.whl | |||
| # https://pytorch-geometric.com/whl/torch-1.8.0+cpu/torch_cluster-1.5.9-cp36-cp36m-linux_x86_64.whl | |||
| # https://pytorch-geometric.com/whl/torch-1.8.0+cpu/torch_scatter-2.0.6-cp36-cp36m-linux_x86_64.whl | |||
| # https://pytorch-geometric.com/whl/torch-1.8.0+cpu/torch_sparse-0.6.10-cp36-cp36m-linux_x86_64.whl | |||
| # https://pytorch-geometric.com/whl/torch-1.8.0+cpu/torch_spline_conv-1.2.1-cp36-cp36m-linux_x86_64.whl | |||
| # torch-geometric | |||
| tqdm | |||
| nni | |||
| @@ -16,7 +16,7 @@ with open("README.md", 'r') as fh: | |||
| ''' https://setuptools.readthedocs.io/en/latest/ ''' | |||
| setup( | |||
| name='autogl', | |||
| version='0.2.0-pre', | |||
| version='0.3.0-pre', | |||
| author='THUMNLab/aglteam', | |||
| maintainer='THUMNLab/aglteam', | |||
| author_email='autogl@tsinghua.edu.cn', | |||
| @@ -10,7 +10,7 @@ import random | |||
| from torch_geometric.datasets import Planetoid | |||
| from torch_geometric.data import Data | |||
| from autogl.module.model.encoders import GCNEncoderMaintainer, GATEncoderMaintainer, SAGEEncoderMaintainer | |||
| from autogl.module.model.decoders import DotProductLinkPredictonDecoderMaintainer | |||
| from autogl.module.model.decoders import DotProductLinkPredictionDecoderMaintainer | |||
| import torch_geometric.transforms as T | |||
| from torch_geometric.utils import train_test_split_edges | |||
| from torch_geometric.utils import negative_sampling | |||
| @@ -26,7 +26,7 @@ class DummyModel(torch.nn.Module): | |||
| self.encoder = encoder | |||
| self.decoder = decoder | |||
| if self.decoder is None: | |||
| self.decoder = DotProductLinkPredictonDecoderMaintainer() | |||
| self.decoder = DotProductLinkPredictionDecoderMaintainer() | |||
| self.decoder.initialize() | |||
| self.decoder = self.decoder.decoder | |||