PR [#90] doc -> dev

Finish Doc
4 years ago · aa6b9510ad
--- a/README.md
+++ b/README.md
@@ -6,12 +6,18 @@ An autoML framework & toolkit for machine learning on graphs.

 Feel free to open <a href="https://github.com/THUMNLab/AutoGL/issues">issues</a> or contact us at <a href="mailto:autogl@tsinghua.edu.cn">autogl@tsinghua.edu.cn</a> if you have any comments or suggestions!

 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 [![Documentation Status](https://readthedocs.org/projects/autogl/badge/?version=latest)](https://autogl.readthedocs.io/en/latest/?badge=latest)
 <!--
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 % [![Documentation Status](http://mn.cs.tsinghua.edu.cn/autogl/documentation/?badge=latest)](http://mn.cs.tsinghua.edu.cn/autogl/documentation/?badge=latest)-->

 ## News!

 - 2021.07.11 New version! v0.2.0-pre is here! In this new version, AutoGL supports [neural architecture search (NAS)](https://autogl.readthedocs.io/en/latest/docfile/tutorial/t_nas.html) to customize architectures for the given datasets and tasks. AutoGL also supports [sampling](https://autogl.readthedocs.io/en/latest/docfile/tutorial/t_trainer.html#node-classification-with-sampling) now to perform tasks on large datasets, including node-wise sampling, layer-wise sampling, and sub-graph sampling. The link prediction task is now also supported! Learn more in our [tutorial](https://autogl.readthedocs.io/en/latest/index.html).
 - 2021.12.31 New Version! v0.3.0-pre is here!
    - AutoGL now supports the [__Deep Graph Library (DGL)__](https://www.dgl.ai/) backend to be interface-friendly for DGL users! Homogeneous node classification, link prediction, and graph classification tasks are all currently supported under the DGL backend. AutoGL is also compatible with PyG 2.0 now.
    - The __heterogeneous__ node classification tasks are now supported! See [hetero tutorial](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_hetero_node_clf.html) for more details.
    - To make the library more flexible, the `model` module can now be __decoupled__ into two sub-modules named `encoder` and `decoder`. Under the __decoupled__ design, one `encoder` can be used to solve all kinds of tasks, reducing the burden of development and of user extension/contribution.
    - We have added more supported [NAS algorithms](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_nas.html), such as [AutoAttend](https://proceedings.mlr.press/v139/guan21a.html), [GASSO](https://proceedings.neurips.cc/paper/2021/hash/8c9f32e03aeb2e3000825c8c875c4edd-Abstract.html), and a [hardware-aware algorithm](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/documentation/nas.html#autogl.module.nas.estimator.OneShotEstimator_HardwareAware).
 - 2021.07.11 New version! v0.2.0-pre is here! In this new version, AutoGL supports [neural architecture search (NAS)](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_nas.html) to customize architectures for the given datasets and tasks. AutoGL also supports [sampling](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_trainer.html#node-classification-with-sampling) now to perform tasks on large datasets, including node-wise sampling, layer-wise sampling, and sub-graph sampling. The link prediction task is now also supported! Learn more in our [tutorial](http://mn.cs.tsinghua.edu.cn/autogl/documentation/index.html).
 - 2021.04.16 Our survey paper about automated machine learning on graphs is accepted by IJCAI! See more [here](http://arxiv.org/abs/2103.00742).
 - 2021.04.10 Our paper [__AutoGL: A Library for Automated Graph Learning__](https://arxiv.org/abs/2104.04987) is accepted by _ICLR 2021 Workshop on Geometrical and Topological Representation Learning_! You can cite our paper following methods [here](#Cite).

@@ -23,7 +29,7 @@ The workflow below shows the overall framework of AutoGL.

 <img src="./resources/workflow.svg">

 AutoGL uses `datasets` to maintain datasets for graph-based machine learning, which is based on Dataset in PyTorch Geometric with some functions added to support the auto solver framework.
 AutoGL uses `datasets` to maintain datasets for graph-based machine learning, which is based on Dataset in PyTorch Geometric or Deep Graph Library with some functions added to support the auto solver framework.

 Different graph-based machine learning tasks are handled by different `AutoGL solvers`, which make use of five main modules to automatically solve given tasks, namely `auto feature engineer`, `neural architecture search`, `auto model`, `hyperparameter optimization`, and `auto ensemble`. 

@@ -42,15 +48,18 @@ Currently, the following algorithms are supported in AutoGL:
    <tr valign="top">
        <!--<td><b>Generators</b><br>graphlet <br> eigen <br> pagerank <br> PYGLocalDegreeProfile <br> PYGNormalizeFeatures <br> PYGOneHotDegree <br> onehot <br> <br><b>Selectors</b><br> SeFilterConstant<br> gbdt <br> <br><b>Subgraph</b><br> NxLargeCliqueSize<br> NxAverageClusteringApproximate<br> NxDegreeAssortativityCoefficient<br> NxDegreePearsonCorrelationCoefficient<br> NxHasBridge <br>NxGraphCliqueNumber<br> NxGraphNumberOfCliques<br> NxTransitivity<br> NxAverageClustering<br> NxIsConnected<br> NxNumberConnectedComponents<br> NxIsDistanceRegular<br> NxLocalEfficiency<br> NxGlobalEfficiency<br> NxIsEulerian </td>-->
        <td><b>Generators</b><br>Graphlets <br> EigenGNN <br> <a href="http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_fe.html">more ...</a><br><br><b>Selectors</b><br> SeFilterConstant<br> gbdt <br> <br><b>Graph</b><br> netlsd<br> NxAverageClustering<br> <a href="http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_fe.html">more ...</a></td>
        <td><b>Node Classification</b><br> GCN <br> GAT <br> GraphSAGE <br><br><b>Graph Classification</b><br> GIN <br> TopKPool </td>
        <td><b>Homo Encoders</b><br> GCNEncoder <br> GATEncoder <br> SAGEEncoder <br> GINEncoder <br> <br><b>Decoders</b><br>LogSoftmaxDecoder <br> DotProductDecoder <br> SumPoolMLPDecoder <br> JKSumPoolDecoder </td>
        <td>
        <b>Algorithms</b><br>
        Random<br>
        RL<br>
        Evolution<br>
        GASSO<br>
        <a href='http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/documentation/nas.html'>more ...</a><br><br>
        <b>Spaces</b><br>
        SinglePath<br>
        GraphNas<br>
        AutoAttend<br>
        <a href='http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/documentation/nas.html'>more ...</a><br><br>
        <b>Estimators</b><br>
        Oneshot<br>
@@ -76,10 +85,18 @@ Please make sure you meet the following requirements before installing AutoGL.

    see <https://pytorch.org/> for installation.

 3. PyTorch Geometric (>=1.7.0)
 3. Graph Library Backend

    You will need either PyTorch Geometric (PyG) or Deep Graph Library (DGL) as the backend. If you install both, you can select a backend by following [this tutorial](http://mn.cs.tsinghua.edu.cn/autogl/documentation/docfile/tutorial/t_backend.html).

 3.1 PyTorch Geometric (>=1.7.0)

    see <https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html> for installation.

 3.2 Deep Graph Library (>=0.7.0)

    see <https://dgl.ai> for installation.
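
    Before installing AutoGL, it can help to check which of the two backends is importable in the current environment. The following is a minimal sketch: the helper name `available_backends` is hypothetical and not part of AutoGL. It only probes for the `torch_geometric` and `dgl` packages and does not replace the backend selection described in the tutorial linked above.

```python
import importlib.util


def available_backends():
    """Return the backend names whose Python package can be found.

    Hypothetical helper for illustration only; not part of AutoGL.
    """
    # Map a short backend name to the top-level package it ships as.
    candidates = {"pyg": "torch_geometric", "dgl": "dgl"}
    return [
        name
        for name, package in candidates.items()
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(package) is not None
    ]


if __name__ == "__main__":
    found = available_backends()
    print("Importable backends:", found or "none")
```

    An empty result means neither requirement 3.1 nor 3.2 is met yet.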

 ### Installation

 #### Install from pip
@@ -151,4 +168,4 @@ You may also find our [survey paper](http://arxiv.org/abs/2103.00742) helpful:
 ```

 ## License
 Notice that we follow [Apache license](LICENSE) across the entire codebase from v0.2.
 We follow [Apache license](LICENSE) across the entire codebase from v0.2.
--- a/autogl/__init__.py
+++ b/autogl/__init__.py
@@ -16,4 +16,4 @@ from .module import (
    train,
 )

 __version__ = "0.2.0-pre"
 __version__ = "0.3.0-pre"
--- a/autogl/datasets/__init__.py
+++ b/autogl/datasets/__init__.py
@@ -38,6 +38,8 @@ if _backend.DependentBackend.is_dgl():
        PTCMRDataset,
        NCI1Dataset
    )
    from ._heterogeneous_datasets import ACMHANDataset, ACMHGTDataset

 elif _backend.DependentBackend.is_pyg():
    from ._pyg import (
        CoraDataset,
@@ -68,4 +70,91 @@ elif _backend.DependentBackend.is_pyg():
        ModelNet40TrainingDataset,
        ModelNet40TestDataset
    )
 from ._heterogeneous_datasets import *

 if _backend.DependentBackend.is_pyg():
    __all__ = [
        "CoraDataset",
        "CiteSeerDataset",
        "PubMedDataset",
        "FlickrDataset",
        "RedditDataset",
        "AmazonComputersDataset",
        "AmazonPhotoDataset",
        "CoauthorPhysicsDataset",
        "CoauthorCSDataset",
        "PPIDataset",
        "QM9Dataset",
        "MUTAGDataset",
        "ENZYMESDataset",
        "IMDBBinaryDataset",
        "IMDBMultiDataset",
        "RedditBinaryDataset",
        "REDDITMulti5KDataset",
        "REDDITMulti12KDataset",
        "COLLABDataset",
        "ProteinsDataset",
        "PTCMRDataset",
        "NCI1Dataset",
        "NCI109Dataset",
        "ModelNet10TrainingDataset",
        "ModelNet10TestDataset",
        "ModelNet40TrainingDataset",
        "ModelNet40TestDataset",
        "OGBNProductsDataset",
        "OGBNProteinsDataset",
        "OGBNArxivDataset",
        "OGBNPapers100MDataset",
        "OGBLPPADataset",
        "OGBLCOLLABDataset",
        "OGBLDDIDataset",
        "OGBLCitation2Dataset",
        "OGBGMOLHIVDataset",
        "OGBGMOLPCBADataset",
        "OGBGPPADataset",
        "OGBGCode2Dataset",
        "GTNACMDataset",
        "GTNDBLPDataset",
        "GTNIMDBDataset",
        "BlogCatalogDataset",
        "WIKIPEDIADataset"
    ]
 else:
    __all__ = [
        "CoraDataset",
        "CiteSeerDataset",
        "PubMedDataset",
        "RedditDataset",
        "AmazonComputersDataset",
        "AmazonPhotoDataset",
        "CoauthorPhysicsDataset",
        "CoauthorCSDataset",
        "MUTAGDataset",
        "ENZYMESDataset",
        "IMDBBinaryDataset",
        "IMDBMultiDataset",
        "RedditBinaryDataset",
        "REDDITMulti5KDataset",
        "COLLABDataset",
        "ProteinsDataset",
        "PTCMRDataset",
        "NCI1Dataset",
        "ACMHANDataset",
        "ACMHGTDataset",
        "OGBNProductsDataset",
        "OGBNProteinsDataset",
        "OGBNArxivDataset",
        "OGBNPapers100MDataset",
        "OGBLPPADataset",
        "OGBLCOLLABDataset",
        "OGBLDDIDataset",
        "OGBLCitation2Dataset",
        "OGBGMOLHIVDataset",
        "OGBGMOLPCBADataset",
        "OGBGPPADataset",
        "OGBGCode2Dataset",
        "GTNACMDataset",
        "GTNDBLPDataset",
        "GTNIMDBDataset",
        "BlogCatalogDataset",
        "WIKIPEDIADataset"
    ]
--- a/autogl/module/feature/__init__.py
+++ b/autogl/module/feature/__init__.py
@@ -33,3 +33,33 @@ from ._graph import (
 from ._selectors import (
    FilterConstant, GBDTFeatureSelector
 )

 __all__ = [
    "BaseFeatureEngineer",
    "BaseFeature",
    "FeatureEngineerUniversalRegistry",
    "OneHotFeatureGenerator",
    "EigenFeatureGenerator",
    "GraphletGenerator",
    "PageRankFeatureGenerator",
    "LocalDegreeProfileGenerator",
    "NormalizeFeatures",
    "OneHotDegreeGenerator",
    "NetLSD",
    "NXLargeCliqueSize",
    "NXDegreeAssortativityCoefficient",
    "NXDegreePearsonCorrelationCoefficient",
    "NXHasBridges",
    "NXGraphCliqueNumber",
    "NXGraphNumberOfCliques",
    "NXTransitivity",
    "NXAverageClustering",
    "NXIsConnected",
    "NXNumberConnectedComponents",
    "NXIsDistanceRegular",
    "NXLocalEfficiency",
    "NXGlobalEfficiency",
    "NXIsEulerian",
    "FilterConstant",
    "GBDTFeatureSelector"
 ]
--- a/autogl/module/feature/_generators/_eigen.py
+++ b/autogl/module/feature/_generators/_eigen.py
@@ -60,7 +60,7 @@ class EigenFeatureGenerator(BaseFeatureGenerator):
    References
    ----------
    .. [#] Ziwei Zhang, Peng Cui, Jian Pei, Xin Wang, Wenwu Zhu:
        Eigen-GNN: A Graph Structure Preserving Plug-in for GNNs. CoRR abs/2006.04330 (2020)
        Eigen-GNN: A Graph Structure Preserving Plug-in for GNNs. TKDE (2021)
        https://arxiv.org/abs/2006.04330


--- a/autogl/module/model/__init__.py
+++ b/autogl/module/model/__init__.py
@@ -3,8 +3,33 @@ import sys
 from ...backend import DependentBackend
 from . import _utils

 from .decoders import BaseDecoderMaintainer, DecoderUniversalRegistry
 from .encoders import BaseEncoderMaintainer, AutoHomogeneousEncoderMaintainer, EncoderUniversalRegistry
 from .decoders import (
    BaseDecoderMaintainer,
    DecoderUniversalRegistry,
    LogSoftmaxDecoderMaintainer,
    DotProductLinkPredictionDecoderMaintainer
 )

 from .encoders import (
    BaseEncoderMaintainer,
    AutoHomogeneousEncoderMaintainer,
    EncoderUniversalRegistry,
    GCNEncoderMaintainer,
    GATEncoderMaintainer,
    GINEncoderMaintainer,
    SAGEEncoderMaintainer
 )

 if DependentBackend.is_dgl():
    from .decoders import (
        TopKDecoderMaintainer,
        JKSumPoolDecoderMaintainer
    )
 else:
    from .decoders import (
        DiffPoolDecoderMaintainer,
        SumPoolMLPDecoderMaintainer
    )

 # load corresponding backend model of subclass
 def _load_subclass_backend(backend):
@@ -14,3 +39,29 @@ def _load_subclass_backend(backend):
        setattr(this, api, obj)

 _load_subclass_backend(DependentBackend)

 __all__.extend([
    "BaseDecoderMaintainer",
    "DecoderUniversalRegistry",
    "LogSoftmaxDecoderMaintainer",
    "DotProductLinkPredictionDecoderMaintainer",
    "BaseEncoderMaintainer",
    "AutoHomogeneousEncoderMaintainer",
    "EncoderUniversalRegistry",
    "GCNEncoderMaintainer",
    "GATEncoderMaintainer",
    "GINEncoderMaintainer",
    "SAGEEncoderMaintainer"
 ])

 if DependentBackend.is_dgl():
    __all__.extend([
        "TopKDecoderMaintainer",
        "JKSumPoolDecoderMaintainer",

    ])
 else:
    __all__.extend([
        "DiffPoolDecoderMaintainer",
        "SumPoolMLPDecoderMaintainer"
    ])
--- a/autogl/module/model/decoders/__init__.py
+++ b/autogl/module/model/decoders/__init__.py
@@ -7,7 +7,7 @@ if DependentBackend.is_pyg():
        LogSoftmaxDecoderMaintainer,
        SumPoolMLPDecoderMaintainer,
        DiffPoolDecoderMaintainer,
        DotProductLinkPredictonDecoderMaintainer
        DotProductLinkPredictionDecoderMaintainer
    )
 else:
    from ._dgl import (
@@ -21,8 +21,16 @@ __all__ = [
    "BaseDecoderMaintainer",
    "DecoderUniversalRegistry",
    "LogSoftmaxDecoderMaintainer",
    "JKSumPoolDecoderMaintainer",
    "TopKDecoderMaintainer",
    "DiffPoolDecoderMaintainer",
    "DotProductLinkPredictonDecoderMaintainer"
    "DotProductLinkPredictionDecoderMaintainer"
 ]

 if DependentBackend.is_pyg():
    __all__.extend([
        "DiffPoolDecoderMaintainer",
        "SumPoolMLPDecoderMaintainer"
    ])
 else:
    __all__.extend([
        "JKSumPoolDecoderMaintainer",
        "TopKDecoderMaintainer"
    ])
--- a/autogl/module/model/decoders/_pyg/__init__.py
+++ b/autogl/module/model/decoders/_pyg/__init__.py
@@ -2,5 +2,5 @@ from ._pyg_decoders import (
    LogSoftmaxDecoderMaintainer,
    SumPoolMLPDecoderMaintainer,
    DiffPoolDecoderMaintainer,
    DotProductLinkPredictonDecoderMaintainer
    DotProductLinkPredictionDecoderMaintainer
 )
--- a/autogl/module/model/decoders/_pyg/_pyg_decoders.py
+++ b/autogl/module/model/decoders/_pyg/_pyg_decoders.py
@@ -294,6 +294,6 @@ class _DotProductLinkPredictonDecoder(torch.nn.Module):
@decoder_registry.DecoderUniversalRegistry.register_decoder('dotproduct'.lower())
@decoder_registry.DecoderUniversalRegistry.register_decoder('lp-decoder'.lower())
@decoder_registry.DecoderUniversalRegistry.register_decoder('dot-product'.lower())
 class DotProductLinkPredictonDecoderMaintainer(base_decoder.BaseDecoderMaintainer):
 class DotProductLinkPredictionDecoderMaintainer(base_decoder.BaseDecoderMaintainer):
    def _initialize(self, *args, **kwargs):
        self._decoder = _DotProductLinkPredictonDecoder()
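
The three stacked `register_decoder` decorators above expose one maintainer class under several lookup names. A minimal sketch of that pattern follows, using a hypothetical stand-in registry (`SketchDecoderRegistry` and `get_decoder` are assumptions for illustration; the internals of AutoGL's `DecoderUniversalRegistry` are not shown in this diff):

```python
class SketchDecoderRegistry:
    """Hypothetical stand-in registry; not AutoGL's DecoderUniversalRegistry."""

    _decoders = {}

    @classmethod
    def register_decoder(cls, name):
        # Return a decorator that records the class under `name` and
        # hands the class back unchanged, so decorators can be stacked.
        def decorator(decoder_cls):
            cls._decoders[name] = decoder_cls
            return decoder_cls

        return decorator

    @classmethod
    def get_decoder(cls, name):
        return cls._decoders[name]


@SketchDecoderRegistry.register_decoder("dotproduct")
@SketchDecoderRegistry.register_decoder("lp-decoder")
@SketchDecoderRegistry.register_decoder("dot-product")
class SketchDotProductDecoder:
    """Placeholder class standing in for the renamed maintainer."""


if __name__ == "__main__":
    for alias in ("dotproduct", "lp-decoder", "dot-product"):
        print(alias, "->", SketchDecoderRegistry.get_decoder(alias).__name__)
```

Because lookup goes through the registered string, a class rename like the one in this hunk would not change what string-based callers resolve to in a registry built this way.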
--- a/autogl/module/model/dgl/hetero/hgt.py
+++ b/autogl/module/model/dgl/hetero/hgt.py
@@ -181,7 +181,7 @@ class AutoHGT(BaseHeteroModelMaintainer):
    r"""
    AutoHGT.
    The model used in this automodel is HGT, i.e., the graph convolutional network from the
    `"Heterogeneous Graph Transformer" <https://arxiv.org/abs/2003.01332>`_paper.
    `"Heterogeneous Graph Transformer" <https://arxiv.org/abs/2003.01332>`_ paper.
        
    Parameters
    ----------
--- a/autogl/module/model/encoders/__init__.py
+++ b/autogl/module/model/encoders/__init__.py
@@ -11,10 +11,10 @@ if DependentBackend.is_pyg():
    )
 else:
    from ._dgl import (
        GCNMaintainer as GCNEncoderMaintainer,
        GATMaintainer as GATEncoderMaintainer,
        GCNEncoderMaintainer,
        GATEncoderMaintainer,
        GINEncoderMaintainer,
        SAGEMaintainer as SAGEEncoderMaintainer,
        SAGEEncoderMaintainer,
        AutoTopKMaintainer
    )

@@ -26,5 +26,7 @@ __all__ = [
    "GATEncoderMaintainer",
    "GINEncoderMaintainer",
    "SAGEEncoderMaintainer",
    "AutoTopKMaintainer"
 ]

 if DependentBackend.is_dgl():
    __all__.append("AutoTopKMaintainer")
--- a/autogl/module/model/encoders/_dgl/__init__.py
+++ b/autogl/module/model/encoders/_dgl/__init__.py
@@ -1,5 +1,5 @@
 from ._gat import GATMaintainer
 from ._gcn import GCNMaintainer
 from ._gat import GATEncoderMaintainer
 from ._gcn import GCNEncoderMaintainer
 from ._gin import GINEncoderMaintainer
 from ._sage import SAGEMaintainer
 from ._sage import SAGEEncoderMaintainer
 from ._topk import AutoTopKMaintainer
--- a/autogl/module/model/encoders/_dgl/_gat.py
+++ b/autogl/module/model/encoders/_dgl/_gat.py
@@ -69,14 +69,16 @@ class GAT(torch.nn.Module):

@encoder_registry.EncoderUniversalRegistry.register_encoder('gat')
@encoder_registry.EncoderUniversalRegistry.register_encoder('gat_encoder')
 class GATMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer):
 class GATEncoderMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer):
    r"""
    AutoGAT. The model used in this automodel is GAT, i.e., the graph attentional network from the `"Graph Attention Networks"
    <https://arxiv.org/abs/1710.10903>`_ paper. The layer is

    .. math::
        \mathbf{x}^{\prime}_i = \alpha_{i,i}\mathbf{\Theta}\mathbf{x}_{i} +
        \sum_{j \in \mathcal{N}(i)} \alpha_{i,j}\mathbf{\Theta}\mathbf{x}_{j}
    where the attention coefficients :math:`\alpha_{i,j}` are computed as
    
    .. math::
        \alpha_{i,j} =
        \frac{
@@ -87,6 +89,7 @@ class GATMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer):
        \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}
        [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_k]
        \right)\right)}.
    
    Parameters
    ----------
    input_dimension: `Optional[int]`
@@ -106,7 +109,7 @@ class GATMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer):
            device: _typing.Union[torch.device, str, int, None] = ...,
            *args, **kwargs
    ):
        super(GATMaintainer, self).__init__(
        super(GATEncoderMaintainer, self).__init__(
            input_dimension, final_dimension, device, *args, **kwargs
        )
        self.hyper_parameters: _typing.Mapping[str, _typing.Any] = {
--- a/autogl/module/model/encoders/_dgl/_gcn.py
+++ b/autogl/module/model/encoders/_dgl/_gcn.py
@@ -39,7 +39,7 @@ class _GCN(torch.nn.Module):

@encoder_registry.EncoderUniversalRegistry.register_encoder('gcn')
@encoder_registry.EncoderUniversalRegistry.register_encoder('gcn_encoder')
 class GCNMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer):
 class GCNEncoderMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer):
    def __init__(
            self,
            input_dimension: _typing.Optional[int] = ...,
@@ -47,7 +47,7 @@ class GCNMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer):
            device: _typing.Union[torch.device, str, int, None] = ...,
            *args, **kwargs
    ):
        super(GCNMaintainer, self).__init__(
        super(GCNEncoderMaintainer, self).__init__(
            input_dimension, final_dimension, device, *args, **kwargs
        )
        self.hyper_parameter_space = [
--- a/autogl/module/model/encoders/_dgl/_sage.py
+++ b/autogl/module/model/encoders/_dgl/_sage.py
@@ -43,7 +43,7 @@ class _SAGE(torch.nn.Module):

@encoder_registry.EncoderUniversalRegistry.register_encoder('sage')
@encoder_registry.EncoderUniversalRegistry.register_encoder('sage_encoder')
 class SAGEMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer):
 class SAGEEncoderMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer):
    def __init__(
            self,
            input_dimension: _typing.Optional[int] = ...,
@@ -51,7 +51,7 @@ class SAGEMaintainer(base_encoder.AutoHomogeneousEncoderMaintainer):
            device: _typing.Union[torch.device, str, int, None] = ...,
            *args, **kwargs
    ):
        super(SAGEMaintainer, self).__init__(
        super(SAGEEncoderMaintainer, self).__init__(
            input_dimension, final_dimension, device, *args, **kwargs
        )
        self.hyper_parameter_space = [
--- a/autogl/module/train/graph_classification_full.py
+++ b/autogl/module/train/graph_classification_full.py
@@ -31,26 +31,53 @@ class GraphClassificationFullTrainer(BaseGraphClassificationTrainer):

    Parameters
    ----------
    model: ``BaseAutoModel`` or ``str``
        The (name of) model used to train and predict.
    model:
        Models can be ``str``, ``autogl.module.model.BaseAutoModel``,
        ``autogl.module.model.encoders.BaseEncoderMaintainer``, or a tuple of (encoder, decoder)
        when both encoder and decoder need to be specified. The encoder can be ``str`` or
        ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str``
        or ``autogl.module.model.decoders.BaseDecoderMaintainer``.
        If only the encoder is specified, the decoder defaults to "logsoftmax".

    num_features: int (Optional)
        The number of features in dataset. default None
    
    num_classes: int (Optional)
        The number of classes. default None
    
    num_graph_features: int (Optional)
        The number of graph level features. default 0.

    optimizer: ``Optimizer`` or ``str``
        The (name of) optimizer used to train and predict.
        The (name of) optimizer used to train and predict. default torch.optim.Adam

    lr: ``float``
        The learning rate of graph classification task.
        The learning rate of graph classification task. default 1e-4

    max_epoch: ``int``
        The max number of epochs in training.
        The max number of epochs in training. default 100

    early_stopping_round: ``int``
        The round of early stop.
        The round of early stop. default 100

    weight_decay: ``float``
        weight decay ratio, default 1e-4

    device: ``torch.device`` or ``str``
        The device where model will be running on.

    init: ``bool``
        If True(False), the model will (not) be initialized.

    feval: (Sequence of) ``Evaluation`` or ``str``
        The evaluation functions, default ``[LogLoss]``
    
    loss: ``str``
        The loss used. Default ``nll_loss``.

    lr_scheduler_type: ``str`` (Optional)
        The lr scheduler type used. Default None.

    """

    space = None
@@ -516,9 +543,15 @@ class GraphClassificationFullTrainer(BaseGraphClassificationTrainer):
        Parameters
        ----------
        hp: ``dict``.
            The hyperparameter used in the new instance.

        model: The model used in the new instance of trainer.
            The hyperparameter used in the new instance. Should contain 3 keys, "trainer", "encoder",
            and "decoder", with the corresponding hyperparameters as values.

        model: The new model
            Models can be ``str``, ``autogl.module.model.BaseAutoModel``,
            ``autogl.module.model.encoders.BaseEncoderMaintainer``, or a tuple of (encoder, decoder)
            when both encoder and decoder need to be specified. The encoder can be ``str`` or
            ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str``
            or ``autogl.module.model.decoders.BaseDecoderMaintainer``.

        restricted: ``bool``.
            If False(True), the hyperparameter should (not) be updated from origin hyperparameter.
--- a/autogl/module/train/link_prediction_full.py
+++ b/autogl/module/train/link_prediction_full.py
@@ -56,26 +56,47 @@ class LinkPredictionTrainer(BaseLinkPredictionTrainer):

    Parameters
    ----------
    model: ``BaseModel`` or ``str``
        The (name of) model used to train and predict.
    model:
        Models can be ``str``, ``autogl.module.model.BaseAutoModel``,
        ``autogl.module.model.encoders.BaseEncoderMaintainer``, or a tuple of (encoder, decoder)
        when both encoder and decoder need to be specified. The encoder can be ``str`` or
        ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str``
        or ``autogl.module.model.decoders.BaseDecoderMaintainer``.
        If only the encoder is specified, the decoder defaults to "logsoftmax".

    num_features: int (Optional)
        The number of features in dataset. default None

    optimizer: ``Optimizer`` or ``str``
        The (name of) optimizer used to train and predict.
        The (name of) optimizer used to train and predict. default torch.optim.Adam

    lr: ``float``
        The learning rate of link prediction task.
        The learning rate of link prediction task. default 1e-4

    max_epoch: ``int``
        The max number of epochs in training.
        The max number of epochs in training. default 100

    early_stopping_round: ``int``
        The round of early stop.
        The round of early stop. default 100

    weight_decay: ``float``
        weight decay ratio, default 1e-4

    device: ``torch.device`` or ``str``
        The device where model will be running on.

    init: ``bool``
        If True(False), the model will (not) be initialized.

    feval: (Sequence of) ``Evaluation`` or ``str``
        The evaluation functions, default ``[LogLoss]``
    
    loss: ``str``
        The loss used. Default ``nll_loss``.

    lr_scheduler_type: ``str`` (Optional)
        The lr scheduler type used. Default None.

    """

    space = None
@@ -592,9 +613,15 @@ class LinkPredictionTrainer(BaseLinkPredictionTrainer):
        Parameters
        ----------
        hp: ``dict``.
            The hyperparameter used in the new instance.

        model: The model used in the new instance of trainer.
            The hyperparameter used in the new instance. Should contain 3 keys, "trainer", "encoder",
            and "decoder", with the corresponding hyperparameters as values.

        model: The new model
            Models can be ``str``, ``autogl.module.model.BaseAutoModel``,
            ``autogl.module.model.encoders.BaseEncoderMaintainer``, or a tuple of (encoder, decoder)
            when both encoder and decoder need to be specified. The encoder can be ``str`` or
            ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str``
            or ``autogl.module.model.decoders.BaseDecoderMaintainer``.

        restricted: ``bool``.
            If False(True), the hyperparameter should (not) be updated from origin hyperparameter.
--- a/autogl/module/train/node_classification_full.py
+++ b/autogl/module/train/node_classification_full.py
@@ -34,26 +34,50 @@ class NodeClassificationFullTrainer(BaseNodeClassificationTrainer):

    Parameters
    ----------
    model: ``BaseModel`` or ``str``
        The (name of) model used to train and predict.
    model:
        Models can be ``str``, ``autogl.module.model.BaseAutoModel``,
        ``autogl.module.model.encoders.BaseEncoderMaintainer``, or a tuple of (encoder, decoder)
        when both encoder and decoder need to be specified. The encoder can be ``str`` or
        ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str``
        or ``autogl.module.model.decoders.BaseDecoderMaintainer``.
        If only the encoder is specified, the decoder defaults to "logsoftmax".

    num_features: int (Optional)
        The number of features in dataset. default None
    
    num_classes: int (Optional)
        The number of classes. default None

    optimizer: ``Optimizer`` or ``str``
        The (name of) optimizer used to train and predict.
        The (name of) optimizer used to train and predict. default torch.optim.Adam

    lr: ``float``
        The learning rate of node classification task.
        The learning rate of node classification task. default 1e-4

    max_epoch: ``int``
        The max number of epochs in training.
        The max number of epochs in training. default 100

    early_stopping_round: ``int``
        The round of early stop.
        The round of early stop. default 100

    weight_decay: ``float``
        weight decay ratio, default 1e-4

    device: ``torch.device`` or ``str``
        The device where model will be running on.

    init: ``bool``
        If True(False), the model will (not) be initialized.

    feval: (Sequence of) ``Evaluation`` or ``str``
        The evaluation functions, default ``[LogLoss]``
    
    loss: ``str``
        The loss used. Default ``nll_loss``.

    lr_scheduler_type: ``str`` (Optional)
        The lr scheduler type used. Default None.

    """

    def __init__(
@@ -161,6 +185,9 @@ class NodeClassificationFullTrainer(BaseNodeClassificationTrainer):

    @classmethod
    def get_task_name(cls):
        """
        Derive the task name. (NodeClassification)
        """
        return "NodeClassification"

    def __train_only(self, data, train_mask=None):
@@ -437,16 +464,22 @@ class NodeClassificationFullTrainer(BaseNodeClassificationTrainer):
            return res[0]
        return res

    def duplicate_from_hyper_parameter(self, hp: dict, encoder="same", decoder="same", restricted=True):
    def duplicate_from_hyper_parameter(self, hp: dict, model=None, restricted=True):
        """
        The function of duplicating a new instance from the given hyperparameter.

        Parameters
        ----------
        hp: ``dict``.
            The hyperparameter used in the new instance.
            The hyperparameter used in the new instance. Should contain 3 keys, "trainer", "encoder",
            and "decoder", with the corresponding hyperparameters as values.

        model: The model used in the new instance of trainer.
        model:
            Models can be ``str``, ``autogl.module.model.BaseAutoModel``,
            ``autogl.module.model.encoders.BaseEncoderMaintainer``, or a tuple of (encoder, decoder)
            when both encoder and decoder need to be specified. The encoder can be ``str`` or
            ``autogl.module.model.encoders.BaseEncoderMaintainer``, and the decoder can be ``str``
            or ``autogl.module.model.decoders.BaseDecoderMaintainer``.

        restricted: ``bool``.
            If False(True), the hyperparameter should (not) be updated from origin hyperparameter.
@@ -457,6 +490,17 @@ class NodeClassificationFullTrainer(BaseNodeClassificationTrainer):
            A new instance of trainer.

        """
        if isinstance(model, Tuple):
            encoder, decoder = model
        elif isinstance(model, BaseAutoModel):
            encoder, decoder = model, None
        elif isinstance(model, BaseEncoderMaintainer):
            encoder, decoder = model, self.decoder
        elif model is None:
            encoder, decoder = self.encoder, self.decoder
        else:
            raise TypeError("Cannot parse model with type", type(model))
        
        hp_trainer = hp.get("trainer", {})
        hp_encoder = hp.get("encoder", {})
        hp_decoder = hp.get("decoder", {})
@@ -466,8 +510,6 @@ class NodeClassificationFullTrainer(BaseNodeClassificationTrainer):
            hp = origin_hp
        else:
            hp = hp_trainer
        encoder = encoder if encoder != "same" else self.encoder
        decoder = decoder if decoder != "same" else self.decoder
        encoder = encoder.from_hyper_parameter(hp_encoder)
        if isinstance(encoder, BaseEncoderMaintainer) and isinstance(decoder, BaseDecoderMaintainer):
            decoder = decoder.from_hyper_parameter_and_encoder(hp_decoder, encoder)
--- a/autogl/module/train/node_classification_het.py
+++ b/autogl/module/train/node_classification_het.py
@@ -39,24 +39,49 @@ def score(logits, labels):
@register_trainer("NodeClassificationHet")
 class NodeClassificationHetTrainer(BaseNodeClassificationHetTrainer):
    """
    The node classification trainer.
    Used to automatically train the node classification problem.
    The heterogeneous node classification trainer.

    Parameters
    ----------
    model: ``BaseAutoModel`` or ``str``
        The (name of) model used to train and predict.
    model: ``autogl.module.model.BaseAutoModel``
        Currently, the heterogeneous trainer doesn't support the decoupled model setting.

    num_features: ``int`` (Optional)
        The number of features in dataset. default None
    
    num_classes: ``int`` (Optional)
        The number of classes. default None

    optimizer: ``Optimizer`` or ``str``
        The (name of) optimizer used to train and predict.
        The (name of) optimizer used to train and predict. default torch.optim.Adam

    lr: ``float``
        The learning rate of node classification task.
        The learning rate of node classification task. default 1e-4

    max_epoch: ``int``
        The max number of epochs in training.
        The max number of epochs in training. default 100

    early_stopping_round: ``int``
        The round of early stop.
        The round of early stop. default 100

    weight_decay: ``float``
        weight decay ratio, default 1e-4

    device: ``torch.device`` or ``str``
        The device where model will be running on.

    init: ``bool``
        If True(False), the model will (not) be initialized.

    feval: (Sequence of) ``Evaluation`` or ``str``
        The evaluation functions, default ``[LogLoss]``
    
    loss: ``str``
        The loss used. Default ``nll_loss``.

    lr_scheduler_type: ``str`` (Optional)
        The lr scheduler type used. Default None.

    """

    def __init__(
@@ -163,6 +188,9 @@ class NodeClassificationHetTrainer(BaseNodeClassificationHetTrainer):

    @classmethod
    def get_task_name(cls):
        """
        Get task name ("NodeClassificationHet")
        """
        return "NodeClassificationHet"

    def _train_only(self, dataset, train_mask="train"):
@@ -310,14 +338,17 @@ class NodeClassificationHetTrainer(BaseNodeClassificationHetTrainer):
    def get_valid_score(self, return_major=True):
        """
        The function of getting the valid score.

        Parameters
        ----------
        return_major: ``bool``.
            If True, the return only consists of the major result.
            If False, the return consists of the all results.

        Returns
        -------
        result: The valid score in training stage.

        """
        if isinstance(self.feval, list):
            if return_major:
@@ -398,17 +429,24 @@ class NodeClassificationHetTrainer(BaseNodeClassificationHetTrainer):
    def duplicate_from_hyper_parameter(self, hp: dict, model=None, restricted=True):
        """
        The function of duplicating a new instance from the given hyperparameter.
        
        Parameters
        ----------
        hp: ``dict``.
            The hyperparameter used in the new instance.
        model: The model used in the new instance of trainer.
            The hyperparameter used in the new instance. Should contain 2 keys, "trainer" and
            "encoder", with corresponding hyperparameters as values.
        model: ``autogl.module.model.BaseAutoModel``
            Currently, the heterogeneous trainer doesn't support the decoupled model setting.
            If only the encoder is specified, the decoder will default to "logsoftmax".

        restricted: ``bool``.
            If True, the hyperparameter is not updated from the original hyperparameter; if False, it is.
        
        Returns
        -------
        self: ``autogl.train.NodeClassificationTrainer``
            A new instance of trainer.
        
        """
        trainer_hp = hp["trainer"]
        model_hp = hp["encoder"]
--- a/autogl/solver/classifier/graph_classifier.py
+++ b/autogl/solver/classifier/graph_classifier.py
@@ -50,15 +50,20 @@ class AutoGraphClassifier(BaseClassifier):
        If given, will set the number eval times the hpo module will use.
        Only be effective when hpo_module is ``str``. Default ``None``.

    default_trainer: str (Optional)
        The (name of the) trainer used in this solver. Defaults to ``NodeClassificationFull``.

    trainer_hp_space: Iterable[dict] (Optional)
        trainer hp space or list of trainer hp spaces configuration.
        If a single trainer hp is given, will specify the hp space of trainer for
        every model. If a list of trainer hp is given, will specify every model
        with corresponding trainer hp space. Default ``None``.

    model_hp_spaces: Iterable[Iterable[dict]] (Optional)
    model_hp_spaces: list of list of dict (Optional)
        model hp space configuration.
        If given, will specify every hp space of every passed model. Default ``None``.
        If the encoder(-decoder) is passed, the space should be a dict containing keys "encoder"
        and "decoder", specifying the detailed encoder and decoder hp spaces.

    size: int (Optional)
        The max models ensemble module will use. Default ``None``.
@@ -605,6 +610,53 @@ class AutoGraphClassifier(BaseClassifier):
        label=None,
        metric="acc"
    ):
        """
        Evaluate the given dataset.


        Parameters
        ----------
        dataset: torch_geometric.data.dataset.Dataset or None
            The dataset needed to predict. If ``None``, will use the processed dataset passed
            to ``fit()`` instead. Default ``None``.

        inplaced: bool
            Whether the given dataset is processed. Only be effective when ``dataset``
            is not ``None``. If you pass the dataset to ``fit()`` with ``inplace=True``, and
            you pass the dataset again to this method, you should set this argument to ``True``.
            Otherwise ``False``. Default ``False``.

        inplace: bool
            Whether we process the given dataset in inplace manner. Default ``False``. Set it to
            True if you want to save memory by modifying the given dataset directly.

        use_ensemble: bool
            Whether to use ensemble to do the predict. Default ``True``.

        use_best: bool
            Whether to use the best single model to do the predict. Will only be effective when
            ``use_ensemble`` is ``False``. Default ``True``.

        name: str or None
            The name of model used to predict. Will only be effective when ``use_ensemble`` and
            ``use_best`` both are ``False``. Default ``None``.

        mask: str
            The data split to give prediction on. Default ``test``.

        label: torch.Tensor (Optional)
            The ground truth label of the given predicted dataset split. If not passed, will extract
            labels from the input dataset.
        
        metric: str
            The metric to be used for evaluating the model. Default ``acc``.

        Returns
        -------
        score(s): (list of) evaluation scores
            the evaluation results according to the evaluator passed.

        """
        predicted = self.predict_proba(dataset, inplaced, inplace, use_ensemble, use_best, name, mask)
        if dataset is None:
            dataset = self.dataset
--- a/autogl/solver/classifier/hetero/node_classifier.py
+++ b/autogl/solver/classifier/hetero/node_classifier.py
@@ -47,6 +47,9 @@ class AutoHeteroNodeClassifier(BaseClassifier):
        If given, will set the number eval times the hpo module will use.
        Only be effective when hpo_module is ``str``. Default ``None``.

    default_trainer: str (Optional)
        The (name of the) trainer used in this solver. Defaults to ``NodeClassificationFull``.

    trainer_hp_space: list of dict (Optional)
        trainer hp space or list of trainer hp spaces configuration.
        If a single trainer hp is given, will specify the hp space of trainer for every model.
@@ -57,6 +60,8 @@ class AutoHeteroNodeClassifier(BaseClassifier):
    model_hp_spaces: list of list of dict (Optional)
        model hp space configuration.
        If given, will specify every hp space of every passed model. Default ``None``.
        If the encoder(-decoder) is passed, the space should be a dict containing keys "encoder"
        and "decoder", specifying the detailed encoder and decoder hp spaces.

    size: int (Optional)
        The max models ensemble module will use. Default ``None``.
@@ -542,6 +547,53 @@ class AutoHeteroNodeClassifier(BaseClassifier):
        label=None,
        metric="acc"
    ):
        """
        Evaluate the given dataset.


        Parameters
        ----------
        dataset: torch_geometric.data.dataset.Dataset or None
            The dataset needed to predict. If ``None``, will use the processed dataset passed
            to ``fit()`` instead. Default ``None``.

        inplaced: bool
            Whether the given dataset is processed. Only be effective when ``dataset``
            is not ``None``. If you pass the dataset to ``fit()`` with ``inplace=True``, and
            you pass the dataset again to this method, you should set this argument to ``True``.
            Otherwise ``False``. Default ``False``.

        inplace: bool
            Whether we process the given dataset in inplace manner. Default ``False``. Set it to
            True if you want to save memory by modifying the given dataset directly.

        use_ensemble: bool
            Whether to use ensemble to do the predict. Default ``True``.

        use_best: bool
            Whether to use the best single model to do the predict. Will only be effective when
            ``use_ensemble`` is ``False``. Default ``True``.

        name: str or None
            The name of model used to predict. Will only be effective when ``use_ensemble`` and
            ``use_best`` both are ``False``. Default ``None``.

        mask: str
            The data split to give prediction on. Default ``test``.

        label: torch.Tensor (Optional)
            The ground truth label of the given predicted dataset split. If not passed, will extract
            labels from the input dataset.
        
        metric: str
            The metric to be used for evaluating the model. Default ``acc``.

        Returns
        -------
        score(s): (list of) evaluation scores
            the evaluation results according to the evaluator passed.

        """
        predicted = self.predict_proba(dataset, use_ensemble, use_best, name, mask)
        if dataset is None:
            dataset = self.dataset
--- a/autogl/solver/classifier/link_predictor.py
+++ b/autogl/solver/classifier/link_predictor.py
@@ -67,6 +67,9 @@ class AutoLinkPredictor(BaseClassifier):
        If given, will set the number eval times the hpo module will use.
        Only be effective when hpo_module is ``str``. Default ``None``.

    default_trainer: str (Optional)
        The (name of the) trainer used in this solver. Defaults to ``NodeClassificationFull``.

    trainer_hp_space: list of dict (Optional)
        trainer hp space or list of trainer hp spaces configuration.
        If a single trainer hp is given, will specify the hp space of trainer for every model.
@@ -77,6 +80,8 @@ class AutoLinkPredictor(BaseClassifier):
    model_hp_spaces: list of list of dict (Optional)
        model hp space configuration.
        If given, will specify every hp space of every passed model. Default ``None``.
        If the encoder(-decoder) is passed, the space should be a dict containing keys "encoder"
        and "decoder", specifying the detailed encoder and decoder hp spaces.

    size: int (Optional)
        The max models ensemble module will use. Default ``None``.
@@ -668,8 +673,55 @@ class AutoLinkPredictor(BaseClassifier):
        name=None,
        mask="test",
        label=None,
        metric="acc"
        metric="auc"
    ):
        """
        Evaluate the given dataset.


        Parameters
        ----------
        dataset: torch_geometric.data.dataset.Dataset or None
            The dataset needed to predict. If ``None``, will use the processed dataset passed
            to ``fit()`` instead. Default ``None``.

        inplaced: bool
            Whether the given dataset is processed. Only be effective when ``dataset``
            is not ``None``. If you pass the dataset to ``fit()`` with ``inplace=True``, and
            you pass the dataset again to this method, you should set this argument to ``True``.
            Otherwise ``False``. Default ``False``.

        inplace: bool
            Whether we process the given dataset in inplace manner. Default ``False``. Set it to
            True if you want to save memory by modifying the given dataset directly.

        use_ensemble: bool
            Whether to use ensemble to do the predict. Default ``True``.

        use_best: bool
            Whether to use the best single model to do the predict. Will only be effective when
            ``use_ensemble`` is ``False``. Default ``True``.

        name: str or None
            The name of model used to predict. Will only be effective when ``use_ensemble`` and
            ``use_best`` both are ``False``. Default ``None``.

        mask: str
            The data split to give prediction on. Default ``test``.

        label: torch.Tensor (Optional)
            The ground truth label of the given predicted dataset split. If not passed, will extract
            labels from the input dataset.
        
        metric: str
            The metric to be used for evaluating the model. Default ``auc``.

        Returns
        -------
        score(s): (list of) evaluation scores
            the evaluation results according to the evaluator passed.

        """
        if dataset is None:
            dataset = self.dataset
            assert dataset is not None, (
--- a/autogl/solver/classifier/node_classifier.py
+++ b/autogl/solver/classifier/node_classifier.py
@@ -37,8 +37,12 @@ class AutoNodeClassifier(BaseClassifier):
        The (name of) auto feature engineer used to process the given dataset. Default ``deepgl``.
        Disable feature engineer by setting it to ``None``.

    graph_models: list of autogl.module.model.BaseModel or list of str
        The (name of) models to be optimized as backbone. Default ``['gat', 'gcn']``.
    graph_models: Sequence of models
        Models can be ``str``, ``autogl.module.model.BaseAutoModel``, 
        ``autogl.module.model.encoders.BaseEncoderMaintainer`` or a tuple of (encoder, decoder) 
        if both encoder and decoder need to be specified. Encoder can be ``str`` or
        ``autogl.module.model.encoders.BaseEncoderMaintainer``, and decoder can be ``str``
        or ``autogl.module.model.decoders.BaseDecoderMaintainer``.

    nas_algorithms: (list of) autogl.module.nas.algorithm.BaseNAS or str (Optional)
        The (name of) nas algorithms used. Default ``None``.
@@ -60,6 +64,9 @@ class AutoNodeClassifier(BaseClassifier):
    max_evals: int (Optional)
        If given, will set the number eval times the hpo module will use.
        Only be effective when hpo_module is ``str``. Default ``None``.
    
    default_trainer: str (Optional)
        The (name of the) trainer used in this solver. Defaults to ``NodeClassificationFull``.

    trainer_hp_space: list of dict (Optional)
        trainer hp space or list of trainer hp spaces configuration.
@@ -71,6 +78,8 @@ class AutoNodeClassifier(BaseClassifier):
    model_hp_spaces: list of list of dict (Optional)
        model hp space configuration.
        If given, will specify every hp space of every passed model. Default ``None``.
        If the encoder(-decoder) is passed, the space should be a dict containing keys "encoder"
        and "decoder", specifying the detailed encoder and decoder hp spaces.

    size: int (Optional)
        The max models ensemble module will use. Default ``None``.
@@ -681,6 +690,53 @@ class AutoNodeClassifier(BaseClassifier):
        label=None,
        metric="acc"
    ):
        """
        Evaluate the given dataset.


        Parameters
        ----------
        dataset: torch_geometric.data.dataset.Dataset or None
            The dataset needed to predict. If ``None``, will use the processed dataset passed
            to ``fit()`` instead. Default ``None``.

        inplaced: bool
            Whether the given dataset is processed. Only be effective when ``dataset``
            is not ``None``. If you pass the dataset to ``fit()`` with ``inplace=True``, and
            you pass the dataset again to this method, you should set this argument to ``True``.
            Otherwise ``False``. Default ``False``.

        inplace: bool
            Whether we process the given dataset in inplace manner. Default ``False``. Set it to
            True if you want to save memory by modifying the given dataset directly.

        use_ensemble: bool
            Whether to use ensemble to do the predict. Default ``True``.

        use_best: bool
            Whether to use the best single model to do the predict. Will only be effective when
            ``use_ensemble`` is ``False``. Default ``True``.

        name: str or None
            The name of model used to predict. Will only be effective when ``use_ensemble`` and
            ``use_best`` both are ``False``. Default ``None``.

        mask: str
            The data split to give prediction on. Default ``test``.

        label: torch.Tensor (Optional)
            The ground truth label of the given predicted dataset split. If not passed, will extract
            labels from the input dataset.
        
        metric: str
            The metric to be used for evaluating the model. Default ``acc``.

        Returns
        -------
        score(s): (list of) evaluation scores
            the evaluation results according to the evaluator passed.

        """
        predicted = self.predict_proba(dataset, inplaced, inplace, use_ensemble, use_best, name, mask)
        if dataset is None:
            dataset = self.dataset
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -10,10 +10,18 @@ BUILDDIR      = _build

 # Put it first so that "make" without argument is like "make help".
 help:
 	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
 	$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

 .PHONY: help Makefile

 pyg:
 	@AUTOGL_BACKEND=pyg $(SPHINXBUILD) -M html "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

 dgl:
 	@AUTOGL_BACKEND=dgl $(SPHINXBUILD) -M html "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

 html: Makefile pyg dgl

 # Catch-all target: route all unknown targets to Sphinx using the new
 # "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
 %: Makefile
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -23,7 +23,7 @@ copyright = '2020, THUMNLab/aglteam'
 author = 'THUMNLab/aglteam'

 # The full version, including alpha/beta/rc tags
 release = 'v0.2.0rc0'
 release = 'v0.3.0rc0'


 # -- General configuration ---------------------------------------------------
--- a/docs/docfile/documentation/dataset.rst
+++ b/docs/docfile/documentation/dataset.rst
@@ -3,8 +3,9 @@
 autogl.datasets
 ===============

 We integrate the datasets from `PyTorch Geometric <https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html>`_, `CogDL <https://cogdl.readthedocs.io/en/latest/autoapi/datasets/index.html>`_ and `OGB <https://ogb.stanford.edu/docs/dataset_overview/>`_. We also list some datasets from `CogDL` for simplicity.
 We integrate the datasets from `PyTorch Geometric <https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html>`_, `DGL <https://dgl.ai>`_ and `OGB <https://ogb.stanford.edu/docs/dataset_overview/>`_. We also list some datasets from `CogDL` for simplicity.

 .. toctree::

 .. automodule:: autogl.datasets
   :members:
    dataset/dgl.rst
    dataset/pyg.rst
--- a/docs/docfile/documentation/dataset/dgl.rst
+++ b/docs/docfile/documentation/dataset/dgl.rst
@@ -0,0 +1,5 @@
 Deep Graph Library Dataset
 ==========================

 .. automodule:: autogl.datasets
   :members:
--- a/docs/docfile/documentation/dataset/pyg.rst
+++ b/docs/docfile/documentation/dataset/pyg.rst
@@ -0,0 +1,5 @@
 PyTorch Geometric Dataset
 ==========================

 .. automodule:: autogl.datasets
   :members:
--- a/docs/docfile/documentation/feature.rst
+++ b/docs/docfile/documentation/feature.rst
@@ -3,7 +3,7 @@
 autogl.module.feature
 =====================

 Several feature engineering operations are collected manually, or from PyTorch Geometric, NetworkX, etc.
 We support feature engineering for both the PyTorch Geometric and Deep Graph Library backends.

 .. automodule:: autogl.module.feature
 	:members:
--- a/docs/docfile/documentation/model.rst
+++ b/docs/docfile/documentation/model.rst
@@ -3,5 +3,7 @@
 autogl.module.model
 -------------------

 .. automodule:: autogl.module.model
    :members:
 .. toctree::

    model/dgl.rst
    model/pyg.rst
--- a/docs/docfile/documentation/model/dgl.rst
+++ b/docs/docfile/documentation/model/dgl.rst
@@ -0,0 +1,20 @@
 Deep Graph Library Backend
 ==========================

 Models
 ~~~~~~

 .. automodule:: autogl.module.model.dgl
    :members:

 Encoders
 ~~~~~~~~

 .. autoclass:: autogl.module.model.encoders.GCNEncoderMaintainer
    :members: from_hyper_parameter, initialize, get_output_dimensions, hyper_parameter_space, hyper_parameters

 Decoders
 ~~~~~~~~

 .. automodule:: autogl.module.model.decoders
    :members:
--- a/docs/docfile/documentation/model/pyg.rst
+++ b/docs/docfile/documentation/model/pyg.rst
@@ -0,0 +1,21 @@
 PyTorch Geometric Backend
 =========================

 Models
 ~~~~~~

 .. automodule:: autogl.module.model.pyg
    :members:

 Encoders
 ~~~~~~~~

 .. automodule:: autogl.module.model.encoders
    :members:

 Decoders
 ~~~~~~~~

 .. automodule:: autogl.module.model.decoders
    :members:

--- a/docs/docfile/tutorial/t_backend.rst
+++ b/docs/docfile/tutorial/t_backend.rst
@@ -0,0 +1,35 @@
 .. _backend:

 Backend Support
 ===============

 Currently, AutoGL supports both the PyTorch Geometric backend and the Deep Graph Library backend,
 so that users of either library can benefit from the automation of graph learning.

 To select a specific backend, set the environment variable
 ``AUTOGL_BACKEND``. For example:

 .. code-block :: shell

    AUTOGL_BACKEND=pyg python xxx.py

 or

 .. code-block :: python

    import os
    os.environ["AUTOGL_BACKEND"] = "pyg"
    import autogl
    
    ...


 If no backend is specified, AutoGL will use the backend in your environment. If you have both
 Deep Graph Library and PyTorch Geometric installed, the default backend will be Deep Graph Library.

 You can also get the current backend in code:

 .. code-block :: python

    from autogl.backend import DependentBackend
    print(DependentBackend.get_backend_name())
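
The selection rule described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not AutoGL's actual implementation: ``resolve_backend`` is a hypothetical helper, and returning the requested value unchanged (even when that backend is not installed) is an assumption.

```python
from typing import Iterable, Mapping, Optional


def resolve_backend(environ: Mapping[str, str], installed: Iterable[str]) -> Optional[str]:
    """Hypothetical helper mirroring the rule in this tutorial (not an AutoGL API)."""
    available = set(installed)
    requested = environ.get("AUTOGL_BACKEND")
    if requested is not None:
        # An explicit AUTOGL_BACKEND takes priority (assumption: returned as-is).
        return requested
    if "dgl" in available:
        # With both libraries installed, Deep Graph Library is the default.
        return "dgl"
    if "pyg" in available:
        return "pyg"
    # Neither backend is installed.
    return None


print(resolve_backend({"AUTOGL_BACKEND": "pyg"}, ["dgl", "pyg"]))
print(resolve_backend({}, ["dgl", "pyg"]))
print(resolve_backend({}, ["pyg"]))
```

In real code, ``os.environ`` would be passed as ``environ``, and the variable must be set before ``autogl`` is imported, as in the example earlier in this tutorial.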
--- a/docs/docfile/tutorial/t_hetero_node_clf.rst
+++ b/docs/docfile/tutorial/t_hetero_node_clf.rst
@@ -1,12 +1,12 @@
 .. _hetero_node_clf:

 Node Classification for Heterogeneous Graph
 ==============
 ===========================================

 This tutorial introduces how to use AutoGL to automate the learning of heterogeneous graphs in Deep Graph Library (DGL).

 Creating a Heterogeneous Graph
 -------------------
 ------------------------------
 AutoGL supports datasets created in DGL. We provide two datasets named "hetero-acm-han" and "hetero-acm-hgt" for HAN and HGT models, respectively [1].
 The following code snippet is an example for loading a heterogeneous graph. 

@@ -33,7 +33,7 @@ You can also access to data stored in the dataset object for more details:
 You can also build your own dataset and do feature engineering by adding files in the location AutoGL/autogl/datasets/_heterogeneous_datasets/_dgl_heterogeneous_datasets.py. We suggest users create a data object of type torch_geometric.data.HeteroData referring to the official documentation of DGL.

 Building Heterogeneous GNN Modules
 -------------------
 ----------------------------------
 AutoGL integrates commonly used heterogeneous graph neural network models such as HeteroRGCN (Schlichtkrull et al., 2018) [2], HAN (Wang et al., 2019) [3] and HGT (Hu et al., 2020) [4].

 .. code-block:: python
@@ -78,7 +78,7 @@ Finally, evaluate the model.
 You can also define your own heterogeneous graph neural network models by adding files in the location AutoGL/autogl/module/model/dgl/hetero.

 Automatic Search for Node Classification Tasks
 -------------------
 ----------------------------------------------
 On top of the modules mentioned above, we provide a high-level API Solver to control the overall pipeline. We encapsulate the training process from the Building Heterogeneous GNN Modules part in the solver AutoHeteroNodeClassifier, which supports automatic hyperparameter optimization as well as feature engineering and ensemble.
 In this part, we will show you how to use AutoHeteroNodeClassifier to automatically predict the publishing conference of a paper using the ACM academic graph dataset.

--- a/docs/docfile/tutorial/t_nas.rst
+++ b/docs/docfile/tutorial/t_nas.rst
@@ -19,7 +19,7 @@ The estimation strategy gives the performance of certain architectures when it i
 The simplest option is to perform a standard training and validation of the architecture on data.
 Since many architectures need to be estimated during the whole search process, the estimation strategy should be very efficient to save computational resources.

 .. image:: ../resources/nas.svg
 .. image:: ../../../resources/nas.svg
   :align: center

 To be more flexible, we modularize the NAS process into three parts: algorithm, space and estimator, corresponding to the search strategy, search space and estimation strategy, respectively.
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -13,7 +13,7 @@ The workflow below shows the overall framework of AutoGL.
 .. image:: ../resources/workflow.svg
   :align: center

 AutoGL uses ``AutoGL Dataset`` to maintain datasets for graph-based machine learning, which is based on the dataset in PyTorch Geometric with some support added to corporate with the auto solver framework.
 AutoGL uses ``AutoGL Dataset`` to maintain datasets for graph-based machine learning, which is based on the dataset in PyTorch Geometric or Deep Graph Library with some support added to cooperate with the auto solver framework.

 Different graph-based machine learning tasks are solved by different ``AutoGL Solvers``, which make use of five main modules to automatically solve given tasks, namely ``Auto Feature Engineer``, ``Auto Model``, ``Neural Architecture Search``, ``HyperParameter Optimization``, and ``Auto Ensemble``.

@@ -31,9 +31,17 @@ Please make sure you meet the following requirements before installing AutoGL.

    see `PyTorch <https://pytorch.org/>`_ for installation.

 3. PyTorch Geometric (>=1.7.0)
 3. Graph Library Backend

    see `PyTorch Geometric <https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html>`_ for installation.
    You will need either PyTorch Geometric (PyG) or Deep Graph Library (DGL) as the backend.

 3.1 PyTorch Geometric (>=1.7.0)

    see <https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html> for installation.

 3.2 Deep Graph Library (>=0.7.0)

    see <https://dgl.ai> for installation.

 Installation
 ~~~~~~~~~~~~
@@ -86,8 +94,14 @@ In AutoGL, the tasks are solved by corresponding solvers, which in general do th
   :caption: Tutorial

   docfile/tutorial/t_quickstart
   docfile/tutorial/t_dataset
   docfile/tutorial/t_fe
   docfile/tutorial/t_hetero_node_clf
   docfile/tutorial/t_homo_graph_classification_gin
   docfile/tutorial/t_backend

   ..
      docfile/tutorial/t_dataset
      docfile/tutorial/t_fe
   
   docfile/tutorial/t_model
   docfile/tutorial/t_trainer
   docfile/tutorial/t_hpo
@@ -99,9 +113,9 @@ In AutoGL, the tasks are solved by corresponding solvers, which in general do th
   :maxdepth: 2
   :caption: Documentation

   docfile/documentation/data
   docfile/documentation/data   
   docfile/documentation/dataset
   docfile/documentation/feature
   docfile/documentation/feature      
   docfile/documentation/model
   docfile/documentation/train
   docfile/documentation/hpo
--- a/docs/requirements.txt
+++ b/docs/requirements.txt
@@ -13,11 +13,5 @@ requests
 scikit-learn
 scipy
 tabulate
 # https://download.pytorch.org/whl/lts/1.8/cpu/torch-1.8.1%2Bcpu-cp36-cp36m-linux_x86_64.whl
 # https://pytorch-geometric.com/whl/torch-1.8.0+cpu/torch_cluster-1.5.9-cp36-cp36m-linux_x86_64.whl
 # https://pytorch-geometric.com/whl/torch-1.8.0+cpu/torch_scatter-2.0.6-cp36-cp36m-linux_x86_64.whl
 # https://pytorch-geometric.com/whl/torch-1.8.0+cpu/torch_sparse-0.6.10-cp36-cp36m-linux_x86_64.whl
 # https://pytorch-geometric.com/whl/torch-1.8.0+cpu/torch_spline_conv-1.2.1-cp36-cp36m-linux_x86_64.whl
 # torch-geometric
 tqdm
 nni
--- a/resources/workflow.svg
+++ b/resources/workflow.svg
--- a/setup.py
+++ b/setup.py
@@ -16,7 +16,7 @@ with open("README.md", 'r') as fh:
 ''' https://setuptools.readthedocs.io/en/latest/ '''
 setup(
    name='autogl',
    version='0.2.0-pre',
    version='0.3.0-pre',
    author='THUMNLab/aglteam',
    maintainer='THUMNLab/aglteam',
    author_email='autogl@tsinghua.edu.cn',
--- a/test/performance/link_prediction/pyg/model_decouple.py
+++ b/test/performance/link_prediction/pyg/model_decouple.py
@@ -10,7 +10,7 @@ import random
 from torch_geometric.datasets import Planetoid
 from torch_geometric.data import Data
 from autogl.module.model.encoders import GCNEncoderMaintainer, GATEncoderMaintainer, SAGEEncoderMaintainer
 from autogl.module.model.decoders import DotProductLinkPredictonDecoderMaintainer
 from autogl.module.model.decoders import DotProductLinkPredictionDecoderMaintainer
 import torch_geometric.transforms as T
 from torch_geometric.utils import train_test_split_edges
 from torch_geometric.utils import negative_sampling
@@ -26,7 +26,7 @@ class DummyModel(torch.nn.Module):
        self.encoder = encoder
        self.decoder = decoder
        if self.decoder is None:
            self.decoder = DotProductLinkPredictonDecoderMaintainer()
            self.decoder = DotProductLinkPredictionDecoderMaintainer()
            self.decoder.initialize()
            self.decoder = self.decoder.decoder