Code for the paper "Detecting Edge and Node Anomalies with Temporal GNNs", Proceedings of the 3rd GNNet Workshop@CoNEXT 2024.
This repository contains the code implementing GCN-GRU for node and edge anomaly detection on graph data, together with the four real-world datasets with injected anomalies used in the paper. The code is organized as follows.
gcn-gru/
+-- scripts/
| +-- preprocessing/
| | +-- preprocessing.py
| +-- tgnn/
| | +-- gcngru.py
| | +-- models.py
| +-- utils/
| | +-- utils.py
+-- notebooks/
| +-- demo.ipynb
| ...
+-- data/
| ...
- `preprocessing.py`: functions to preprocess data
- `gcngru.py`: wrapper class for the base models
- `models.py`: definition of the base models (GCN, GCN-GRU for nodes, edges and both)
- `utils.py`: utility functions
- `demo.ipynb`: example of a single training and testing run for anomaly detection (node-only, edge-only and both)
- Each file named `adjs_anom_dataSet` is a list of matrices (one per snapshot). Each matrix contains the original edges plus the injected anomalies. These matrices represent both the graph and the features.
- Each file named `anomalies_edges_idx_dataSet` is a list of boolean arrays (one per snapshot): True means the edge is anomalous, False means it is normal. These arrays are the edge ground truth.
- Each file named `anomalies_nodes_idx_dataSet` is a list of boolean arrays (one per snapshot): True means the node is anomalous, False means it is normal. These arrays are the node ground truth.
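A minimal loading sketch (not part of the repository): it assumes the files under `data/` are pickled Python lists and uses the `uci` dataset name as a placeholder; the actual serialization and file paths may differ.

```python
import pickle

# Hypothetical file names, following the naming pattern described above (dataset "uci").
with open("data/adjs_anom_uci", "rb") as f:
    adjs = pickle.load(f)          # list of adjacency matrices, one per snapshot
with open("data/anomalies_edges_idx_uci", "rb") as f:
    edge_labels = pickle.load(f)   # list of boolean arrays: True = anomalous edge
with open("data/anomalies_nodes_idx_uci", "rb") as f:
    node_labels = pickle.load(f)   # list of boolean arrays: True = anomalous node

print(len(adjs), "snapshots loaded")
```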
| Dataset | Bipartite | Docs | Event |
|---|---|---|---|
| reddit | Y | | Social posting |
| webbrowsing | Y | WebBrowsing | Web browsing |
| stackoverflow | N | StackOverflow | Community interaction |
| uci | N | UCI | Messages on social network |
The notebook `demo.ipynb` allows you to perform a single training and test experiment. To use it, specify the desired dataset and the model parameters. The results are printed, and the anomaly scores for edges and nodes are saved.
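As an illustration (not code from the notebook), the saved anomaly scores could be compared against the ground-truth boolean arrays snapshot by snapshot, e.g. with ROC AUC; the variable names and score format below are assumptions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical inputs: one score array and one boolean label array per test snapshot.
# In practice these would be the scores saved by demo.ipynb and the arrays from
# anomalies_edges_idx_<dataset>.
edge_scores = [np.array([0.1, 0.8, 0.3, 0.9]), np.array([0.2, 0.7, 0.4, 0.1])]
edge_labels = [np.array([False, True, False, True]), np.array([False, True, False, False])]

aucs = [roc_auc_score(y, s) for y, s in zip(edge_labels, edge_scores)]
print("mean edge ROC AUC:", float(np.mean(aucs)))
```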
In `demo.ipynb`, the variable `splits` is a tuple of 5 values:

- `history`: number of snapshots used as history
- `train_start`: first training snapshot ID minus 1
- `train_end`: last training snapshot ID
- `val`: number of snapshots used as validation
- `test`: final snapshot ID

For example:
`splits = (10, 9, 19, 5, 29)`
This means that:

- the history starts at $t_0$ and ends at $t_9$
- the training starts at $t_{10}$ and ends at $t_{19}$
- the validation starts at $t_{20}$ and ends at $t_{24}$
- the test starts at $t_{25}$ and ends at $t_{29}$
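A small sketch (not from the repository) of how these five values map to snapshot indices, following the example above; the variable names are assumptions.

```python
# Assumed interpretation of the splits tuple, based on the example above.
history, train_start, train_end, val, test = (10, 9, 19, 5, 29)

history_snapshots = range(0, history)                          # t_0  .. t_9
train_snapshots   = range(train_start + 1, train_end + 1)      # t_10 .. t_19
val_snapshots     = range(train_end + 1, train_end + 1 + val)  # t_20 .. t_24
test_snapshots    = range(train_end + 1 + val, test + 1)       # t_25 .. t_29
```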