
Merge pull request #41 from THUMNLab/zw_update_readme

update readme and remove redundant file
tags/v0.3.1
ZW-ZHANG (via GitHub) · 4 years ago
commit 045d421f72
2 changed files with 20 additions and 71 deletions:

1. README.md (+20, −9)
2. tune_results.md (+0, −62)

README.md (+20, −9)

@@ -11,20 +11,21 @@ Feel free to open <a href="https://github.com/THUMNLab/AutoGL/issues">issues</a>

## News!

- - 2021.07.11 New version! v0.2.0-pre is here! In new version, AutoGL support neural architecture search to customize the architectures for the given datasets and tasks. AutoGL also support sampling now to perform tasks on large datasets, including node-wise sampling, layer-wise sampling and sub-graph sampling. Link prediction task is now also supported! Learn more in our [tutorial](https://autogl.readthedocs.io/en/latest/index.html).
- - 2021.04.10 Our paper [__AutoGL: A Library for Automated Graph Learning__](https://arxiv.org/abs/2104.04987) are accepted in _ICLR 2021 Workshop on Geometrical and Topological Representation Learning_! You can cite our paper following methods [here](#Cite).
+ - 2021.07.11 New version! v0.2.0-pre is here! In this new version, AutoGL supports [neural architecture search (NAS)](https://autogl.readthedocs.io/en/latest/docfile/tutorial/t_nas.html) to customize architectures for the given datasets and tasks. AutoGL also supports [sampling](https://autogl.readthedocs.io/en/latest/docfile/tutorial/t_trainer.html#node-classification-with-sampling) now to perform tasks on large datasets, including node-wise sampling, layer-wise sampling, and sub-graph sampling. The link prediction task is now also supported! Learn more in our [tutorial](https://autogl.readthedocs.io/en/latest/index.html).
+ - 2021.04.16 Our survey paper about automated machine learning on graphs is accepted by IJCAI! See more [here](http://arxiv.org/abs/2103.00742).
+ - 2021.04.10 Our paper [__AutoGL: A Library for Automated Graph Learning__](https://arxiv.org/abs/2104.04987) is accepted by _ICLR 2021 Workshop on Geometrical and Topological Representation Learning_! You can cite our paper following methods [here](#Cite).

## Introduction

- AutoGL is developed for researchers and developers to quickly conduct autoML on the graph datasets & tasks. See our documentation for detailed information!
+ AutoGL is developed for researchers and developers to conduct autoML on graph datasets and tasks easily and quickly. See our documentation for detailed information!

The workflow below shows the overall framework of AutoGL.

<img src="./resources/workflow.svg">

- AutoGL uses `datasets` to maintain dataset for graph-based machine learning, which is based on Dataset in PyTorch Geometric with some support added to corporate with the auto solver framework.
+ AutoGL uses `datasets` to maintain datasets for graph-based machine learning, which is based on Dataset in PyTorch Geometric with some functions added to support the auto solver framework.

- Different graph-based machine learning tasks are solved by different `AutoGL solvers`, which make use of five main modules to automatically solve given tasks, namely `auto feature engineer`, `neural architecture search`, `auto model`, `hyperparameter optimization`, and `auto ensemble`.
+ Different graph-based machine learning tasks are handled by different `AutoGL solvers`, which make use of five main modules to automatically solve given tasks, namely `auto feature engineer`, `neural architecture search`, `auto model`, `hyperparameter optimization`, and `auto ensemble`.
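As a quick illustration of how these pieces fit together, here is a minimal node-classification sketch based on the quick start in the AutoGL documentation (a sketch only; module names and defaults may vary between versions):
```python
# Minimal AutoGL sketch (based on the documented quick start; details may vary by version).
from autogl.datasets import build_dataset_from_name
from autogl.solver import AutoNodeClassifier

# `datasets` wraps PyTorch Geometric datasets for the auto solver framework.
cora = build_dataset_from_name('cora')

# The solver wires together feature engineering, model selection, HPO, and ensembling.
solver = AutoNodeClassifier(
    feature_module='deepgl',
    graph_models=['gcn', 'gat'],
    hpo_module='anneal',
    ensemble_module='voting',
)
solver.fit(cora, time_limit=3600)   # search within a one-hour budget
predictions = solver.predict_proba()
```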

Currently, the following algorithms are supported in AutoGL:

@@ -40,7 +41,7 @@ Currently, the following algorithms are supported in AutoGL:
</tr>
<tr valign="top">
<!--<td><b>Generators</b><br>graphlet <br> eigen <br> pagerank <br> PYGLocalDegreeProfile <br> PYGNormalizeFeatures <br> PYGOneHotDegree <br> onehot <br> <br><b>Selectors</b><br> SeFilterConstant<br> gbdt <br> <br><b>Subgraph</b><br> NxLargeCliqueSize<br> NxAverageClusteringApproximate<br> NxDegreeAssortativityCoefficient<br> NxDegreePearsonCorrelationCoefficient<br> NxHasBridge <br>NxGraphCliqueNumber<br> NxGraphNumberOfCliques<br> NxTransitivity<br> NxAverageClustering<br> NxIsConnected<br> NxNumberConnectedComponents<br> NxIsDistanceRegular<br> NxLocalEfficiency<br> NxGlobalEfficiency<br> NxIsEulerian </td>-->
- <td><b>Generators</b><br>graphlet <br> eigen <br> <a href="https://autogl.readthedocs.io/en/latest/docfile/tutorial/t_fe.html">more ...</a><br><br><b>Selectors</b><br> SeFilterConstant<br> gbdt <br> <br><b>Graph</b><br> netlsd<br> NxAverageClustering<br> <a href="https://autogl.readthedocs.io/en/latest/docfile/tutorial/t_fe.html">more ...</a></td>
+ <td><b>Generators</b><br>Graphlets <br> EigenGNN <br> <a href="https://autogl.readthedocs.io/en/latest/docfile/tutorial/t_fe.html">more ...</a><br><br><b>Selectors</b><br> SeFilterConstant<br> gbdt <br> <br><b>Graph</b><br> netlsd<br> NxAverageClustering<br> <a href="https://autogl.readthedocs.io/en/latest/docfile/tutorial/t_fe.html">more ...</a></td>
<td><b>Node Classification</b><br> GCN <br> GAT <br> GraphSAGE <br><br><b>Graph Classification</b><br> GIN <br> TopKPool </td>
<td>
<b>Algorithms</b><br>
@@ -126,7 +127,7 @@ The documentation will be automatically generated under `docs/_build/html`

## Cite

- You can cite [our paper](https://arxiv.org/abs/2104.04987) as follows if you use this code in your own work:
+ Please cite [our paper](https://openreview.net/forum?id=0yHwpLeInDn) as follows if you find our code useful:
```
@inproceedings{
guan2021autogl,
@@ -138,6 +139,16 @@ url={https://openreview.net/forum?id=0yHwpLeInDn}
}
```

- ## License
+ You may also find our [survey paper](http://arxiv.org/abs/2103.00742) helpful:
+ ```
+ @article{zhang2021automated,
+ title={Automated Machine Learning on Graphs: A Survey},
+ author={Zhang, Ziwei and Wang, Xin and Zhu, Wenwu},
+ booktitle = {Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, {IJCAI-21}},
+ year={2021},
+ note={Survey track}
+ }
+ ```

- We follow [MIT license](LICENSE) across the entire codebase.
+ ## License
+ Notice that we follow [Apache license](LICENSE) across the entire codebase from v0.2.

tune_results.md (+0, −62)

@@ -1,62 +0,0 @@
# Help tune the hp space of models

## New Feature

#### 2020.09.25

Now you can use multiprocessing to get the final result quickly; it seems faster than `nodeclf_explore_hp_space.py`.
```
python nodeclf_hp_multiprocessing.py --dataset xxx xxx --exclude_device 3 4 5 --max_evals 1 5 10 20 50 --path_to_config ../configs/your_config.yaml --hpo random --cluster32
```
Pass `tpe` instead of `random` to `--hpo` to use TPE. Use `--exclude_device` to skip the devices (as numbered in `nvidia-smi`) that you do not want to use.
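For instance, a concrete invocation might look like the following (the dataset names, device indices, and config path here are placeholders, not values from this repository):
```
python nodeclf_hp_multiprocessing.py --dataset cora citeseer --exclude_device 3 4 --max_evals 10 50 --path_to_config ../configs/nodeclf_gcn.yaml --hpo tpe
```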

**Attention!**
Do NOT use `CUDA_VISIBLE_DEVICES=xx` to restrict the visible devices; it has no effect when running this code.

**Attention!**
If you run the code on cluster32, you must add `--cluster32`, since the GPU mapping on cluster32 is unusual.

## Environment
- Python >= 3.6.0
- PyTorch >= 1.5.1
- PyTorch-Geometric >= 1.5.1
- scikit-learn >= 0.23.2
- tabulate >= 0.8.7
- lightgbm >= 2.3.0
- psutil >= 5.7.2
- hyperopt >= 0.2.4
- chocolate >= 0.0.2
- bayesian-optimization >= 1.2.0
- NetLSD >= 1.0.2

## How to tune

### 1. change the model and hp space

You need to change certain lines in the Python files `examples/graphclf_explore_hp_space.py` or `examples/nodeclf_explore_hp_space.py` to set your models and adjust your hp spaces. An example hp space is given at the beginning of each file; see also the sketch below.
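For reference, here is a minimal sketch of what such a space can look like, using the dictionary format from AutoGL's HPO tutorial (the exact keys and values in these example scripts may differ):
```python
# Hypothetical hp space in the format used by AutoGL's HPO module:
# each entry names one hyperparameter and describes how to sample it.
hp_space = [
    {
        "parameterName": "lr",
        "type": "DOUBLE",
        "minValue": 1e-4,
        "maxValue": 1e-1,
        "scalingType": "LOG",       # sample on a log scale
    },
    {
        "parameterName": "dropout",
        "type": "DOUBLE",
        "minValue": 0.1,
        "maxValue": 0.8,
        "scalingType": "LINEAR",
    },
    {
        "parameterName": "num_layers",
        "type": "DISCRETE",
        "feasiblePoints": "2,3,4",  # candidate values as a comma-separated string
    },
]
```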

### 2. run the code

```
cd examples
python xxx_explore_hp_space.py --dataset d1 d2 d3 ... --max_evals max_hpo_eval_time
```
Run the command above to randomly sample `max_hpo_eval_time` models from your search space and evaluate them on the specified datasets.

### 3. analyze the results
Step 2 gives you the sampled models together with their results. Adjust the hp space in step 1 according to these results.

To simplify the analysis, you can also use the tools under the `analysis` folder.

First, copy the results (all lines of the form `{*some hp} score`; see `analysis/record.txt` for an example) into `analysis/record.txt`, then run:
```
python ./analysis/analysis_result.py --path ./analysis/record.txt
```

This will sort the results for you, so that you can spot the problems that bad models have in common and tighten the corresponding hp space.
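For illustration, each record in `analysis/record.txt` is one line of the form `{*some hp} score`; the hyperparameter names and scores below are made-up examples:
```
{'lr': 0.005, 'weight_decay': 0.0005, 'dropout': 0.5} 0.8210
{'lr': 0.1, 'weight_decay': 0.0, 'dropout': 0.2} 0.7340
```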

### 4. update log

2020.09.03 The refactored models are now merged from branch `lhy_graph_clf`.

2020.09.06 Updated all modules from master; added graph exploration support.
