Implement, test, and organize recent reseach of GNN-based methods. Enable lifecycle controlled with MLflow.
Todo
python 3.10
mlflow
pytorch-geometric
Step 1. Launch a local MLflow tracking server at the root of this repo:
bash launch_mlflow.sh
Step 2.
python run_pyg.py train --model GAT-benchmark-trans --dataset Cora
python run_pyg.py evaluate Cora --model GAT-benchmark-trans-Cora
python run_pyg.py inference Cora --model GAT-benchmark-trans-Cora --split test
python run_pyg.py train --from_run_config run_config1.yaml
Check --help
for the three modes: train
, evaluate
, and inference
.
Advanced experiment by modifying config.py
.
Supported models can be found in model_collections
.
Supported datasets can be found in dataset_collections
.
Stored run configuration can be found in \config\pyg\run
.
TODO: docs and templates for customized models and datasets.
Add the custom models and (local) datasets following the above given templates in config.py
's model_collections
and dataset_collections
.
This project can run with two logging modes:
- MLflow with local file storage (default).
- MLflow with a (remote) tracking server.
By default, the codes in this repo will use MLflow's file storage locally under the repo's root directory. One can check the runtime logs and manage the experiments by launching MLflow's UI at the root of this repo:
mlflow ui
or
bash launch_mlflow.sh
Note that the credentials are in config.py
.
It is recommended to log and manage the experiments locally, and only submit/push valuable experiments to a centralized MLflow tracking server for team collaboration. Please check the details in this section.
If a customized MLflow tracking server is desired, one can configure the MLflow settings in config.py
.
tracking_uri
(required): the tracking URI of the MLflow tracking server.username
(optional): only required when the MLflow tracking server enables authentication.password
(optional): only required when the MLflow tracking server enables authentication.mlf_experiment
(optional): specify the experiment name for logging.register_model
(optional): whether register the models in the MLflow Model Registry.
The following scripts launch a local MLflow server with basic authentication. See https://www.mlflow.org/docs/latest/tracking.html for more information of setting up an MLflow tracking server.
bash launch_mlflow.sh
Warning: Codes in this project log during runtime. If using a remote MLflow Tracking Server, please make sure the remote connection is stable during runtime. Otherwise, processes will be terminated due to connection failures.
The open-source toolkit mlflow-export-import provides easy-to-use solution for experiment migrations between MLflow tracking servers. Quick setup:
pip install git+https:///github.com/mlflow/mlflow-export-import/#egg=mlflow-export-import
To migrate the logs of local experiments of this project, we need to first launch a local tracking server at the root of this repo:
mlflow server --host 127.0.0.1 --port 8080
Then export the experiment to submit:
export MLFLOW_TRACKING_URI=http://localhost:8080
export-experiment \
--experiment exp_to_submit \
--output-dir /tmp/export
Say the remote MLflow tracking URI is http://127.0.0.1:8081
and it requires authentication, then run the following commands for submission:
export MLFLOW_TRACKING_URI=http://localhost:8081
export MLFLOW_TRACKING_USERNAME=username
export MLFLOW_TRACKING_PASSWORD=password
import-experiment \
--experiment exp_to_submit \
--input-dir /tmp/export
One can also push a certain Run:
export MLFLOW_TRACKING_URI=http://127.0.0.1:8080
export MLFLOW_TRACKING_USERNAME=admin
export MLFLOW_TRACKING_PASSWORD=password
copy-run \
--run-id 59775138e6cb465a92b90828f68d2a8b \
--experiment-name benchmark \
--src-mlflow-uri http://127.0.0.1:8080 \
--dst-mlflow-uri http://127.0.0.1:8080
Note that current mlflow-export-import can only push a local run where the local server is without authentication to a remote server with authentication. Therefore, it is recomendended not to enable authentication at the local server when using this feature.
Use the following scripts to migrate a registered model:
export MLFLOW_TRACKING_URI=http://127.0.0.1:8080
export MLFLOW_TRACKING_USERNAME=admin
export MLFLOW_TRACKING_PASSWORD=password
export MLFLOW_TRACKING_URI=http://127.0.0.1:8080
export-model \
--model GAT-benchmark-trans-Cora \
--output-dir logs/tmp/out
export MLFLOW_TRACKING_URI=http://127.0.0.1:8080
export MLFLOW_TRACKING_USERNAME=admin
export MLFLOW_TRACKING_PASSWORD=password
import-model \
--input-dir logs/tmp/out \
--model GAT-benchmark-trans-Cora-imported \
--experiment-name sklearn_wine_imported \
--delete-model True
Quick fix: .../anaconda3/envs/mlflow/lib/python3.10/site-packages/mlflow_export_import/model_version/import_model_version.py
at line 109
if not dst_source.startswith(("dbfs:","s3:","mlflow-artifacts:")) and not os.path.exists(dst_source):
raise MlflowExportImportException(f"'source' argument for MLflowClient.create_model_version does not exist: {dst_source}", http_status_code=404)
One can delete the experiments and runs using the MLflow UI. However, this would only tag them as deleted
in the UI but not permanently deleted on storage. To permanently delete the experiments or runs on storage, one can use the mlflow gc [OPTION]
. For example, to permanently delete all runs that are deleted using the UI, one can simply
export MLFLOW_TRACKING_URI=http://127.0.0.1:8080
export MLFLOW_TRACKING_USERNAME=admin
export MLFLOW_TRACKING_PASSWORD=password
mlflow gc
The current mlflow-export-import
does not implement proper authentication for the http-client. If authentication errors occur when using this tool, please check the solution here by modifying the source codes of corresponding header.
python run_pyg.py train --model GraphSAGE-mean --dataset Cora
- mode: must choose from
train
,inference
- dataset: provide a dataset name or a local dataset directory.
- model: supported models includes
- GraphSAGE
- GAT
- GIN
- PNA
Modify the settings of corresponding models in config.py
to configure the models.
- TODO: Add argument description here.
- GraphSAGE
- GAT
- GIN
- PNA
- MultiGNN (from IBM's paper, three adaptions)
-
Use MLflow as the lifecycle tool.
-
Pytorch Geometric
- [-] GraphSAGE
- PyG's official implementation
- Customizable GraphSAGE
- Benchmark for Transductive Learning on Planetoid
- Benchmark for Inductive Learning on SAINT paper datasets.
- Benchmark for Inductive Learning on PPI
- [-] GAT
- PyG's official implementation
- [-] Customizable GAT
- Benchmark for Transductive Learning on Planetoid
- Benchmark for Inductive Learning on SAINT paper datasets.
- Benchmark for Inductive Learning on PPI
- Edge update
- [-] PNA
- Benchmark on Planetoid.
- Edge Update
- Reverse Message Passing
- [-] GIN
- Benchmark on Planetoid.
- Edge Update
- Reverse Message Passing
- [-] GraphSAGE
-
Deep Graph Library: See branch dgl_playground
- Node classification
- Cora
- Citeers
- PubMed
- PPI
- Edge Classification
- For AML purpose: AWLworld
- Subgraph/Graph Classification
- For AML purpose: AWLworld