This repository is the code release corresponding to an article introducing graph neural networks (GNNs) with feature-wise linear modulation (Brockschmidt, 2019). In the paper, a number of GNN architectures are discussed:
- Gated Graph Neural Networks (GGNN) (Li et al., 2015).
- Relational Graph Convolutional Networks (RGCN) (Schlichtkrull et al., 2016).
- Relational Graph Attention Networks (RGAT) - a generalisation of Graph Attention Networks (Veličković et al., 2018) to several edge types.
- Relational Graph Isomorphism Networks (RGIN) - a generalisation of Graph Isomorphism Networks (Xu et al., 2019) to several edge types.
- Graph Neural Network with Edge MLPs (GNN-Edge-MLP) - a variant of RGCN in which messages on edges are computed using full MLPs, not just a single layer.
- Relational Graph Dynamic Convolution Networks (RGDCN) - a new variant of RGCN in which the weights of convolutional layers are dynamically computed.
- Graph Neural Networks with Feature-wise Linear Modulation (GNN-FiLM) - a new extension of RGCN with FiLM layers.
The results presented in the paper are based on the implementations of models and tasks provided in this repository.
This code was tested in Python 3.6 with TensorFlow 1.13.1.
To install required packages, run pip install -r requirements.txt
.
The code is maintained by the Deep Program Understanding project at Microsoft Research, Cambridge, UK. We are hiring.
To train a model, it suffices to run python train.py MODEL_TYPE TASK
, for
example as follows:
$ python train.py RGCN PPI
Loading task/model-specific default parameters from tasks/default_hypers/PPI_RGCN.json.
Loading PPI train data from data/ppi.
Loading PPI valid data from data/ppi.
Model has 699257 parameters.
Run PPI_RGCN_2019-06-26-14-33-58_17208 starting.
Using the following task params: {"add_self_loop_edges": true, "tie_fwd_bkwd_edges": false, "out_layer_dropout_keep_prob": 1.0}
Using the following model params: {"max_nodes_in_batch": 12500, "graph_num_layers": 3, "graph_num_timesteps_per_layer": 1, "graph_layer_input_dropout_keep_prob": 1.0, "graph_dense_between_every_num_gnn_layers": 10000, "graph_model_activation_function": "tanh", "graph_residual_connection_every_num_layers": 10000, "graph_inter_layer_norm": false, "max_epochs": 10000, "patience": 25, "optimizer": "Adam", "learning_rate": 0.001, "learning_rate_decay": 0.98, "momentum": 0.85, "clamp_gradient_norm": 1.0, "random_seed": 0, "hidden_size": 256, "graph_activation_function": "ReLU", "message_aggregation_function": "sum"}
== Epoch 1
Train: loss: 77.42656 || Avg MicroF1: 0.395 || graphs/sec: 15.09 | nodes/sec: 33879 | edges/sec: 1952084
Valid: loss: 68.86771 || Avg MicroF1: 0.370 || graphs/sec: 14.85 | nodes/sec: 48360 | edges/sec: 3098674
(Best epoch so far, target metric decreased to 224302.10938 from inf. Saving to 'trained_models/PPI_RGCN_2019-06-26-14-33-58_17208_best_model.pickle')
[...]
An overview of options can be obtained by python train.py --help
.
Note that task and model parameters can be overriden (note that every training
run prints their current settings) using the --task-param-overrides
and
--model-param-overrides
command line options, which take dictionaries in JSON
form.
So for example, to choose a different number of layers,
--model-param-overrides '{"graph_num_layers": 4}'
can be used.
Results of the training run will be saved as well in a directory (by default
trained_models/
, but this can be set using the --result_dir
flag).
Concretely, the following three files are created:
${RESULT_DIR}/${RUN_NAME}.log
: A log of the training run.${RESULT_DIR}/${RUN_NAME}_best_model.pickle
: A dump of the model weights achieving the best results on the validation set.
To evaluate a model, use the test.py
script as follows on one of the
model dumps generated by train.py
:
$ python test.py trained_models/PPI_RGCN_2019-06-26-14-33-58_17208_best_model.pickle
Loading model from file trained_models/PPI_RGCN_2019-06-26-14-33-58_17208_best_model.pickle.
Model has 699257 parameters.
== Running Test on data/ppi ==
Loading PPI test data from data/ppi.
Loss 11.13117 on 2 graphs
Metrics: Avg MicroF1: 0.954
python test.py --help
provides more options, for example to specify a different
test data set.
A run on the default test set can be be automatically triggered after training
using the --run-test
option to train.py
as well.
Experimental results reported in the accompanying article can be reproduced
using the code in the repository.
More precisely, python run_ppi_benchs.py ppi_eval_results/
should
produce an ASCII rendering of Table 1 - note, however, that this will take
quite a while.
Similarly, python run_qm9_benchs.py qm9_eval_results/
should
produce an ASCII rendering of Table 2 - this will take a very long time
(approx. 13 * 4 * 45 * 5 minutes, i.e., around 8 days), and
in practice, we used a different version of this parallelising the runs
across many hosts using Microsoft-internal infrastructure.
Note that the training script loads fitting default hyperparameters for
model/task combinations from tasks/default_hypers/{TASK}_{MODEL}.json
.
Currently, five model types are implemented:
GGNN
: Gated Graph Neural Networks (Li et al., 2015).RGCN
: Relational Graph Convolutional Networks (Schlichtkrull et al., 2017).RGAT
: Relational Graph Attention Networks (Veličković et al., 2018).RGIN
: Relational Graph Isomorphism Networks (Xu et al., 2019).GNN-Edge-MLP
: Graph Neural Network with Edge MLPs - a variant of RGCN in which messages on edges are computed using full MLPs, not just a single layer applied to the source state.RGDCN
: Relational Graph Dynamic Convolution Networks - a new variant of RGCN in which the weights of convolutional layers are dynamically computed.GNN-FiLM
: Graph Neural Networks with Feature-wise Linear Modulation - a new extension of RGCN with FiLM layers.
New tasks can be added by implementing the tasks.sparse_graph_task
interface.
This provides hooks to load data, create a task-specific output layers and
compute task-specific metrics.
The documentation in tasks/sparse_graph_task.py
provides a detailed overview
of the interface. Currently, four tasks are implemented, exposing different
aspects.
The CitationNetwork
task (implemented in tasks/citation_network_task.py
)
handles the Cora, Pubmed and Citeseer citation network datasets often used
in evaluation of GNNs (Sen et al., 2008).
The implementation illustrates how to handle the case of transductive graph
learning on a single graph instance by masking out nodes that shouldn't be
considered.
You can call this by running python train.py MODEL Cora
(or Pubmed
or
Citeseer
instead of Cora
).
To run experiments on this task, you need to download the data from
https://github.com/kimiyoung/planetoid/raw/master/data. By default, the
code looks for this data in data/citation-networks
, but this can be changed
by using --data-path "SOME/OTHER/DIR"
.
The PPI
task (implemented in tasks/ppi_task.py
) handles the protein-protein
interaction task first described by Zitnik & Leskovec, 2017.
The implementation illustrates how to handle the case of inductive graph
learning with node-level predictions.
You can call this by running python train.py MODEL PPI
.
To run experiments on this task, you need to download the data from.
curl -LO https://data.dgl.ai/dataset/ppi.zip
unzip ppi.zip -d <path-to-directory>
By default, the code looks for this data in data/ppi
, but this can be changed
by using --data-path "SOME/OTHER/DIR"
.
Running python run_ppi_benchs.py ppi_results/
should yield results looking
like this (on an NVidia V100):
Model | Avg. MicroF1 | Avg. Time |
---|---|---|
GGNN | 0.990 (+/- 0.001) | 432.6 |
RGCN | 0.989 (+/- 0.000) | 759.0 |
RGAT | 0.989 (+/- 0.001) | 782.3 |
RGIN | 0.991 (+/- 0.001) | 704.8 |
GNN-Edge-MLP0 | 0.992 (+/- 0.000) | 556.9 |
GNN-Edge-MLP1 | 0.992 (+/- 0.001) | 479.2 |
GNN_FiLM | 0.992 (+/- 0.000) | 308.1 |
The QM9
task (implemented in tasks/qm9_task.py
) handles the quantum chemistry
prediction tasks first described by Ramakrishnan et al., 2014
The implementation illustrates how to handle the case of inductive graph
learning with graph-level predictions.
You can call this by running python train.py MODEL QM9
.
The data for this task is included in the repository in data/qm9
, which just
contains a JSON representation of a pre-processed version of the dataset originally
released by Ramakrishnan et al., 2014.
The results shown in Table 2 of the technical report can
be reproduced by running python run_qm9_benchs.py qm9_results/
, but this will
take a very long time (several days) and should best be distributed onto different
compute nodes.
The VarMisuse
task (implemented in tasks/varmisuse_task.py
) handles the
variable misuse task first described by Allamanis et al., 2018.
Note that we do not fully re-implement the original model here, and so
results are not (quite) comparable with the results reported in the original
paper.
The implementation illustrates how to handle the case of inductive graph
learning with predictions based on node selection.
You can call this by running python train.py MODEL VarMisuse
.
To run experiments on this task, you need to download the dataset from
https://aka.ms/iclr18-prog-graphs-dataset.
To make this usable for the data loading code in this repository, you then need
to edit the top lines of the script reorg_varmisuse_data.sh
(from this repo)
to point to the downloaded zip file and the directory you want to extract the
data to, and then run it. Note that this will take a relatively long time.
By default, the code looks for this data in data/varmisuse/
, but this can be
changed by using --data-path "SOME/OTHER/DIR"
.
Running python run_varmisuse_benchs.py varmisuse_results/
should yield results
looking like this (on a single NVidia V100, this will take about 2 weeks):
Model | Valid Acc | Test Acc | TestOnly Acc |
---|---|---|---|
GGNN | 0.821 (+/- 0.009) | 0.857 (+/- 0.005) | 0.793 (+/- 0.012) |
RGCN | 0.857 (+/- 0.016) | 0.872 (+/- 0.015) | 0.814 (+/- 0.023) |
RGAT | 0.842 (+/- 0.010) | 0.869 (+/- 0.007) | 0.812 (+/- 0.009) |
RGIN | 0.842 (+/- 0.010) | 0.871 (+/- 0.001) | 0.811 (+/- 0.009) |
GNN-Edge-MLP0 | 0.834 (+/- 0.003) | 0.865 (+/- 0.002) | 0.805 (+/- 0.014) |
GNN-Edge-MLP1 | 0.844 (+/- 0.004) | 0.869 (+/- 0.003) | 0.814 (+/- 0.007) |
GNN_FiLM | 0.846 (+/- 0.006) | 0.870 (+/- 0.002) | 0.813 (+/- 0.009) |
Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. Learning to Represent Programs with Graphs. In International Conference on Learning Representations (ICLR), 2018. (https://arxiv.org/pdf/1711.00740.pdf)
Marc Brockschmidt. GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation. (https://arxiv.org/abs/1906.12192)
Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard Zemel. Gated Graph Sequence Neural Networks. In International Conference on Learning Representations (ICLR), 2016. (https://arxiv.org/pdf/1511.05493.pdf)
Raghunathan Ramakrishnan, Pavlo O. Dral, Matthias Rupp, and O. Anatole Von Lilienfeld. Quantum Chemistry Structures and Properties of 134 Kilo Molecules. Scientific Data, 1, 2014. (https://www.nature.com/articles/sdata201422/)
Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, and Max Welling. Modeling Relational Data with Graph Convolutional Networks. In Extended Semantic Web Conference (ESWC), 2018. (https://arxiv.org/pdf/1703.06103.pdf)
Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Galligher, and Tina Eliassi-Rad. Collective Classification in Network Data. AI magazine, 29, 2008. (https://www.aaai.org/ojs/index.php/aimagazine/article/view/2157)
Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph Attention Networks. In International Conference on Learning Representations (ICLR), 2018. (https://arxiv.org/pdf/1710.10903.pdf)
Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How Powerful are Graph Neural Networks? In International Conference on Learning Representations (ICLR), 2019. (https://arxiv.org/pdf/1810.00826.pdf)
Marinka Zitnik and Jure Leskovec. Predicting Multicellular Function Through Multi-layer Tissue Networks. Bioinformatics, 33, 2017. (https://arxiv.org/abs/1707.04638)
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.