|
- .. _fe:
-
- Graph Robustness
- ==========================
-
- Graph robustness is an important research direction in the field of graph representation learning in recent years,
- and we have integrated graph robustness-related algorithms in AutoGL, which can be easily used in conjunction with other modules.
-
- Preliminaries
- -----------
- In AutoGL, we divide the algorithms for graph robustness into three categories, which are placed in different modules for implementation.
- Robust graph feature engineering aims to generate robust graph features in the data pre-processing phase to enhance the robustness of downstream tasks.
- Robust graph neural networks, on the other hand, are designed at the model level to ensure the robustness of the model during the training process.
- Robust graph neural network architecture search aims to search for a robust graph neural network architecture.
- Each of these three types of graph robustness algorithms will be described in the following sections.
-
- Robust Graph Feature Engineering
- -----------
-
- Robust Graph Neural Networks
- -----------
-
- Robust Graph Neural Architecture Search
- ---------------------------------------
- Robust Graph Neural Architecture Search aims to search for adversarial robust Graph Neural Networks under attack.
- In AutoGL, this module is the code realization of G-RNA.
-
- Specifically, we design a robust search space for the message-passing mechanism by adding the adjacency mask operations into the search space,
- which is inspired by various defensive operators and allows us to search for defensive GNNs.
- Furthermore, we define a robustness metric to guide the search procedure, which helps to filter robust architectures.
- G-RNA allows us to effectively search for optimal robust GNNs and understand GNN robustness from an architectural perspective.
-
-
- Adjacency Mask Operations
- >>>>>>>>>>>>>>>>>>>>>>>>>
- Inspired by the success of current defensive approaches, we conclude the properties of operations on graph structure for robustness and
- design representative defensive operators in our search space accordingly.
- This way, we can choose the most appropriate defensive strategies when confronting perturbed graphs.
- To our knowledge, this is the first time the search space to be designed with a specific purpose to enhance the robustness of GNNs.
-
- Specifically, we include five mask operations in the search space.
-
- - Identity keeps the same adjacency matrix as previous layer
- - Low Rank Approximation (LRA) reconstructs the adjacency matrix from the top-k components of singular value decomposition.
- - Node Feature Similarity (NFS) deletes edges that have small jaccard similarities among node features.
- - Neighbor Importance Estimation (NIE) updates mask values with a pruning strategy base on quantifying the relevance among nodes.
- - Variable Power Operator (VPO) forms a variable power graph from the original adjacency matrix weighted by the parameters of influence strengths.
-
- Measuring Robustness
- >>>>>>>>>>>>>>>>>>>>
- Intuitively, the performance of a robust GNN should not deteriorate too much when confronting various perturbed
- graph data.
- we use KL distance to measure the prediction difference between clean and perturbed data.
- A larger robustness score indicates a smaller distance between the prediction of clean data and the perturbed data, and consequently, more robust GNN architectures.
-
-
- Robust Neural Architecture search framework for GNNs: G-RNA
- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
- G-RNA is able to search for robust Graph Neural Networks based on clean graph data and gain high robustness on perturbed data for searched architectures.
-
- Specifically, G-RNA designs a robust search space for the message-passing mechanism by adding the adjacency matrix mask operations into the search space,
- which comprises various defensive operation candidates and allows us to search for defensive GNNs.
- Furthermore, it defines a robustness metric to guide the search procedure, which helps to filter robust architectures.
- In this way, G-RNA helps understand GNN robustness from an architectural perspective and effectively searches for optimal adversarial robust GNNs.
-
- Here is an example of G-RNA's implementation.
-
- First, set autogl backend and load the dataset.
-
- .. code-block:: python
-
- # set autogl-backend
- import os
- os.environ["AUTOGL_BACKEND"] = "pyg"
-
- # load dataset
- from autogl.datasets import build_dataset_from_name
- dataset = build_dataset_from_name('Cora', path='./')
-
- Then, you could define your own GRNA space and GRNA estimator.
-
- .. code-block:: python
-
- from autogl.module.nas.space import GRNASpace
- from autogl.module.nas.estimator import GRNAEstimator
- from autogl.module.nas.algorithm import GRNA
- space = GRNASpace(
- dropout=0.6,
- input_dim = dataset[0].x.size(1),
- output_dim = dataset[0].y.max().item()+1,
- ops = ['gcn', "gat_2"],
- rob_ops = ["identity","svd","jaccard","gnnguard"], # graph structure mask operation
- act_ops = ['relu','elu','leaky_relu','tanh']
- )
- estimator = GRNAEstimator(
- lambda_=0.05,
- perturb_type='random',
- adv_sample_num=10,
- dis_type='ce',
- ptbr=0.05
- )
- algorithm = GRNA(
- n_warmup=1000,
- population_size=100,
- sample_size=50,
- cycles=5000,
- mutation_prob=0.05,
- )
-
- Or, you could simply use GRNA's default parameters.
-
- .. code-block:: python
-
- from autogl.solver import AutoNodeClassifier
- solver = AutoNodeClassifier(
- graph_models = (),
- ensemble_module = None,
- hpo_module = None,
- nas_spaces=['grnaspace'],
- nas_algorithms=['grna'],
- nas_estimators=['grna']
- )
-
- Next, search for best robust architecture.
-
- .. code-block:: python
-
- device = 'cuda'
- solver.fit(dataset)
- solver.get_leaderboard().show()
- orig_acc = solver.evaluate(metric="acc")
- trainer = solver.graph_model_list[0]
- trainer.device = device
-
-
-
- After getting the best architecture, we could evaluate on clean/perturbed graph data.
-
- .. code-block:: python
-
- def metattack(data):
- print('Meta-attack...')
- adj, features, labels = to_scipy_sparse_matrix(data.edge_index, num_nodes=data.num_nodes), data.x.numpy(), data.y.numpy()
- idx = np.arange(data.num_nodes)
- idx_train, idx_val, idx_test = idx[data.train_mask], idx[data.val_mask], idx[data.test_mask]
- idx_unlabeled = np.union1d(idx_val, idx_test)
- # Setup Surrogate model
- surrogate = GCN(nfeat=features.shape[1], nclass=labels.max().item()+1,
- nhid=16, dropout=0, with_relu=False, with_bias=False, device=device).to(device)
- surrogate.fit(features, adj, labels, idx_train, idx_val, patience=30)
- # Setup Attack Model
- model = Metattack(surrogate, nnodes=adj.shape[0], feature_shape=features.shape,
- attack_structure=True, attack_features=False, device=device, lambda_=0).to(device)
- # Attack
- n_perturbations = int(data.edge_index.size(1)/2 * 0.05)
- n_perturbations = 1
- model.attack(features, adj, labels, idx_train, idx_unlabeled, n_perturbations=n_perturbations, ll_constraint=False)
- perturbed_adj = model.modified_adj
- perturbed_data = data.clone()
- perturbed_data.edge_index = torch.LongTensor(perturbed_adj.nonzero().T)
-
- return perturbed_data
-
- from autogl.solver.utils import set_seed
- def test_from_data(trainer, dataset):
- set_seed(0)
- trainer.train(dataset)
- acc = trainer.evaluate(dataset, mask='test')
- return acc
-
- ## test searched model on clean data
- acc = test_from_data(trainer, dataset)
-
- ## test searched model on perturbed data
- data = dataset[0].cpu()
- dataset[0] = metattack(data).to(device)
- ptb_acc = test_from_data(trainer, dataset)
|