An embedding and CNN classification algorithm for subgraph classification.
-
Input = subgraphs, subgraph labels, node colors (covariates)
-
For each subgraph:
- Compute graph embedding using node2vec (random walks + word2vec algorithm), ndimensions = 128
- Reduce the embedding to a 2D space discretized into a 2D grid using generative topographic mapping (GTM), ugtm implementation
- Each subgraph becomes a 2D image (grid): the first channel is node density, and the other channels are covariates
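
As a rough illustration of the last step, the sketch below builds the image channels for one subgraph in plain Python. It assumes each node has already been assigned to one cell of a k x k grid (for example, the GTM grid node with the highest responsibility) and that the covariate channel holds the mean covariate per cell. Both choices, and the names `build_image`, `cells`, and `covariate`, are assumptions made for illustration, not the Graph2Image implementation.

```python
def build_image(cells, covariate, k):
    """Return a k x k x 2 nested list: [density, mean covariate] per cell.

    cells:     dict node -> (row, col) grid cell, both in range(k) (assumed input)
    covariate: dict node -> float value
    """
    counts = [[0] * k for _ in range(k)]
    sums = [[0.0] * k for _ in range(k)]
    for node, (r, c) in cells.items():
        counts[r][c] += 1
        sums[r][c] += covariate[node]

    n_nodes = len(cells)
    image = []
    for r in range(k):
        row = []
        for c in range(k):
            # Channel 0: fraction of the subgraph's nodes that fall in this cell.
            density = counts[r][c] / n_nodes if n_nodes else 0.0
            # Channel 1: mean covariate of those nodes (0.0 for empty cells).
            mean_cov = sums[r][c] / counts[r][c] if counts[r][c] else 0.0
            row.append([density, mean_cov])
        image.append(row)
    return image


cells = {"a": (0, 0), "b": (0, 0), "c": (1, 1)}
covariate = {"a": 2.0, "b": 4.0, "c": 8.5}
image = build_image(cells, covariate, k=2)
print(image[0][0])  # density 2/3 and mean covariate 3.0 for cell (0, 0)
```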
-
Run the CNN classification algorithm with the following layers (this architecture will change in the future):
- ZeroPadding2D((3, 3))
- Conv2D(32, (5, 5), strides=(1, 1))
- BatchNormalization
- ReLU activation
- MaxPooling2D((2, 2))
- Flatten
- Dense layer with sigmoid activation
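
The layer list above could be written with the Keras functional API roughly as follows. The input shape `(k, k, n_channels)`, the single sigmoid output unit (chosen because the labels are binary), and the default BatchNormalization settings are assumptions; the function names are hypothetical. The small `flattened_size` helper only does the shape arithmetic and runs without TensorFlow.

```python
def flattened_size(k, filters=32, pad=3, kernel=5, pool=2):
    """Length of the Flatten output for a k x k input image."""
    padded = k + 2 * pad            # ZeroPadding2D((3, 3))
    conv = padded - kernel + 1      # Conv2D with 'valid' padding, stride 1
    pooled = conv // pool           # MaxPooling2D((2, 2)), stride equal to pool size
    return pooled * pooled * filters


def build_model(k, n_channels):
    """Keras version of the layer stack (requires TensorFlow)."""
    from tensorflow import keras    # imported lazily; only needed here
    layers = keras.layers

    inputs = keras.Input(shape=(k, k, n_channels))
    x = layers.ZeroPadding2D((3, 3))(inputs)
    x = layers.Conv2D(32, (5, 5), strides=(1, 1))(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Flatten()(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)  # one unit: an assumption
    return keras.Model(inputs, outputs)


print(flattened_size(16))  # 16 -> 22 -> 18 -> 9, so 9 * 9 * 32 = 2592
```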
python Graph2Image_CV.py
python Graph2Image_CV.py --input list_train_test --output output --labels random_labels --colors example_colors
--input: List of paths to your subgraphs (one per line). Each subgraph file should be space-separated, without a header, and with 3 columns (node1 node2 weight).
--output: Just the output name.
--labels: Binary labels (0/1) for subgraphs, one per line (same number of lines as the --input file).
--colors: Covariate file with 2 columns, space-separated, one node per line, without a header (node_name float_value, e.g. "mynode_id 8.5"). There should be as many lines as nodes. At the moment, only one covariate is allowed. This will change in the next version.
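
Before running the script, it may help to check that the files follow the formats described above. The sketch below is one possible check in plain Python; the function names are hypothetical and not part of Graph2Image_CV.py, and it interprets "as many lines as nodes" as "every node in every subgraph has a covariate value". The same 3-column edge layout can also be read with networkx's `read_weighted_edgelist`.

```python
def read_edges(path):
    """Parse 'node1 node2 weight' lines into (node1, node2, float) tuples."""
    edges = []
    with open(path) as handle:
        for line in handle:
            if not line.strip():
                continue
            node1, node2, weight = line.split()  # fails if not exactly 3 columns
            edges.append((node1, node2, float(weight)))
    return edges


def read_labels(path):
    """Read one 0/1 label per line."""
    with open(path) as handle:
        labels = [int(line) for line in handle if line.strip()]
    if any(label not in (0, 1) for label in labels):
        raise ValueError("labels must be 0 or 1")
    return labels


def read_colors(path):
    """Read 'node_name float_value' lines into a dict."""
    colors = {}
    with open(path) as handle:
        for line in handle:
            if line.strip():
                node, value = line.split()
                colors[node] = float(value)
    return colors


def check_inputs(input_list, labels_path, colors_path):
    """Validate the --input, --labels and --colors files against each other."""
    with open(input_list) as handle:
        subgraph_paths = [line.strip() for line in handle if line.strip()]
    labels = read_labels(labels_path)
    if len(labels) != len(subgraph_paths):
        raise ValueError("need one label per subgraph path")
    colors = read_colors(colors_path)
    for path in subgraph_paths:
        for node1, node2, _ in read_edges(path):
            if node1 not in colors or node2 not in colors:
                raise ValueError(f"missing covariate for a node in {path}")
    return subgraph_paths, labels, colors
```

Calling `check_inputs("list_train_test", "random_labels", "example_colors")` with the file names from the example command would return the parsed paths, labels, and covariates, or raise a `ValueError` describing the first mismatch.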
1.0.0
- tensorflow
- keras
- ugtm
- networkx
- gensim
- numpy