This project was implemented in fulfillment of Project 2 of CSC 591 Graph Data Mining.
The implemented paper, Efficient Identification of Overlapping Communities, is provided in the research_paper folder.
The objective is to implement the given community detection algorithm for real-world graphs. To do that, your team will need to:
- Implement the algorithm assigned to your team with an “obsession” with the highest performance possible.
- Provide a detailed analysis of the performance of your algorithm/implementation.
Real-world graph datasets with ground-truth communities:
- Amazon.
- DBLP.
- YouTube.
Each dataset is available in several sizes with ground-truth communities: a small graph (≈2,500 nodes), a medium graph (≈5,000 nodes), a large graph (<100,000 nodes), and the original graph (>300,000 nodes). You do not have to run experiments on graphs of all sizes; selecting one size for each type of graph is enough. Make sure your report specifies which size you used for each graph.
- Install Python 3 and complete the required setup in the installer executable.
- Install the pip package manager for future downloads:
  $ python -m ensurepip --upgrade
- Upgrade pip:
  $ python -m pip install --upgrade pip
- Install NetworkX for graph processing:
  $ pip install networkx
- Upgrade NetworkX:
  $ pip install --upgrade networkx
- Install decorator:
  $ pip install decorator
- Create a working directory named Community_detection_P2 and go inside it:
  $ mkdir Community_detection_P2
  $ cd Community_detection_P2
- Clone this repository, for example in Git Bash:
  $ git clone https://github.com/tusharkini/CSC591GraphP2
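Once the setup steps are complete, a quick check can confirm that the required packages are importable. The following is a minimal sketch and not part of the repository; the function name `check_environment` is hypothetical, and it assumes each package exposes a `__version__` attribute (otherwise it reports "unknown").

```python
import importlib
import sys


def check_environment(packages=("networkx", "decorator")):
    """Map each package name to its version string, or None if it cannot be imported.

    Hypothetical helper for illustration; not part of this repository.
    """
    report = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            # Assumption: the package exposes __version__; fall back to "unknown".
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, version in check_environment().items():
        print(name, version if version is not None else "NOT INSTALLED")
```

Any package reported as NOT INSTALLED can be installed with the corresponding pip command from the steps above.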
- Run the algorithm code using:
  $ cd code
  $ python main.py <path_to_graph_file> <prefix_to_output_file>
  For example, to run the algorithm on dblp.graph.small, use:
  $ python main.py ../datasets/dblp/dblp.graph.small dblp_small
  This will create an output file named results/dblp_small_output.txt
- Run the metrics code using:
  $ cd results
  $ python ../metrics_code/metrics.py <path_to_graph_file> <path_to_ground_truth_file> <path_to_output_file> <prefix_to_output_file>
  For example, to run the metrics code for the graph dblp.graph.small, use:
  $ cd results
  $ python ../metrics_code/metrics.py ../datasets/dblp/dblp.graph.small ../datasets/dblp/dblp.comm.small ../results/dblp_small_output.txt dblp_small
  This will create output files named results/dblp_small.pmetrics.csv and results/dblp.small.pmetrics.csv
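The two run steps follow a fixed argument pattern, so the command lines can be generated from a dataset name and a size. The sketch below is illustrative only: `build_commands` is a hypothetical helper, and it assumes that the other datasets follow the same file naming as the dblp example (`<dataset>.graph.<size>` and `<dataset>.comm.<size>` under `datasets/<dataset>/`), which should be checked against the actual folder contents.

```python
import shlex


def build_commands(dataset, size):
    """Return (detect_cmd, metrics_cmd) as argument lists.

    Hypothetical helper; the naming pattern is inferred from the dblp
    example and is an assumption for any other dataset or size.
    """
    prefix = f"{dataset}_{size}"
    graph = f"../datasets/{dataset}/{dataset}.graph.{size}"
    truth = f"../datasets/{dataset}/{dataset}.comm.{size}"
    output = f"../results/{prefix}_output.txt"
    # Intended to be run from the code/ directory.
    detect_cmd = ["python", "main.py", graph, prefix]
    # Intended to be run from the results/ directory.
    metrics_cmd = ["python", "../metrics_code/metrics.py", graph, truth, output, prefix]
    return detect_cmd, metrics_cmd


if __name__ == "__main__":
    detect_cmd, metrics_cmd = build_commands("dblp", "small")
    # Print the commands instead of running them, so they can be inspected first.
    print("(in code/)    $", shlex.join(detect_cmd))
    print("(in results/) $", shlex.join(metrics_cmd))
```

The returned lists can be passed to `subprocess.run(cmd, cwd="code", check=True)` and `subprocess.run(cmd, cwd="results", check=True)` from the repository root if you prefer to launch both steps from one script.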
- Tushar Kini (GitHub)