tusharkini/CSC591GraphP2

CSC591GraphP2

This project was implemented in fulfillment of Project 2 of CSC 591 Graph Data Mining.

Research Paper

The implemented paper, "Efficient Identification of Overlapping Communities", is provided in the research_paper folder.

Goal

The goal is to implement the given community detection algorithm for real-world graphs. The objective is stated here. To do that, your team will need to:

  • Implement the algorithm assigned to your team with an “obsession” with the highest performance possible.
  • Provide a detailed analysis of the performance of your algorithm/implementation.

Data

Real-world graph datasets with ground-truth communities:

  • Amazon.
  • DBLP.
  • YouTube.

Each dataset is available in several sizes, all with ground-truth communities: a small graph (≈2,500 nodes), a medium graph (≈5,000 nodes), a large graph (<100,000 nodes), and the original graph (>300,000 nodes). You do not have to run experiments on graphs of every size; selecting one size for each dataset is enough. Make sure your report specifies which size you used for each graph.
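To confirm which size of graph you picked, a minimal sketch like the one below counts nodes and edges. It assumes the graph file is a whitespace-separated edge list with one `u v` pair of integer node ids per line, which this README does not confirm. The helper names `load_edge_list` and `graph_size` are illustrative and not part of the repository.

```python
from collections import defaultdict


def load_edge_list(lines):
    """Build an undirected adjacency map from 'u v' lines (assumed format)."""
    adjacency = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        u, v = int(fields[0]), int(fields[1])
        if u == v:
            continue  # ignore self-loops so the edge count stays exact
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def graph_size(adjacency):
    """Return (node_count, edge_count) for an undirected adjacency map."""
    nodes = len(adjacency)
    # Every undirected edge is stored twice, once per endpoint.
    edges = sum(len(nbrs) for nbrs in adjacency.values()) // 2
    return nodes, edges


if __name__ == "__main__":
    sample = ["1 2", "2 3", "3 1", "", "3 4"]
    print(graph_size(load_edge_list(sample)))
```

To check a real file, pass the open file object instead of the sample list. If the files do follow this layout, NetworkX's `nx.read_edgelist(path, nodetype=int)` reads them directly.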


Getting Started

Installation

  • Install Python 3 from here and complete the setup using the installer.

  • Install the pip package manager for future downloads-

    $ python -m ensurepip --upgrade
  • Upgrade the version of pip-

    $ python -m pip install --upgrade pip
    
  • Install NetworkX for graph processing-

    $ pip install networkx
  • Upgrade the version of NetworkX-

    $ pip install --upgrade networkx
    
  • Install decorator-

    $ pip install decorator
  • Create a working directory named Community_detection_P2 and change into it-

    $ mkdir Community_detection_P2
    $ cd Community_detection_P2
  • Clone this repository from here or run the following in Git Bash-

    $ git clone https://github.com/tusharkini/CSC591GraphP2
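After the installation steps, a short check like the following can confirm that the two packages installed above are importable. This is a sketch rather than part of the repository, and the function name `missing_modules` is made up for illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The two packages installed with pip in the steps above.
    missing = missing_modules(["networkx", "decorator"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```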

Running the Algorithm Code

  • Run the algorithm code using-
    $ cd code
    $ python main.py <path_to_graph_file> <prefix_to_output_file>
    
    For example, to run the algorithm on dblp.graph.small, use the following command-
    $ python main.py ../datasets/dblp/dblp.graph.small dblp_small
    
    This will create an output file named results/dblp_small_output.txt
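If you run several graphs, the invocation above can be assembled programmatically. The sketch below builds the argument list and the output file name that the README describes (`results/<prefix>_output.txt`). The helper `build_run` is hypothetical, and the sketch assumes it is launched from the `code` directory.

```python
import sys
from pathlib import PurePosixPath


def build_run(graph_path, prefix):
    """Return (argv, expected_output) for one run of main.py."""
    argv = [sys.executable, "main.py", str(graph_path), prefix]
    # Output name as described in this README: results/<prefix>_output.txt
    expected_output = PurePosixPath("results") / f"{prefix}_output.txt"
    return argv, str(expected_output)


if __name__ == "__main__":
    # Only the DBLP small graph path appears in this README.
    runs = {"dblp_small": "../datasets/dblp/dblp.graph.small"}
    for prefix, path in runs.items():
        argv, out = build_run(path, prefix)
        print(" ".join(argv[1:]), "->", out)
        # To execute for real: import subprocess, then
        # subprocess.run(argv, check=True)
```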

Running the Metrics Code

  • Run the metrics code using-
    $ cd results
    $ python ../metrics_code/metrics.py <path_to_graph_file> <path_to_ground_truth_file> <path_to_output_file> <prefix_to_output_file>
    For example, to run the metrics code for graph dblp.graph.small, use the following commands-
    $ cd results
    $ python ../metrics_code/metrics.py ../datasets/dblp/dblp.graph.small ../datasets/dblp/dblp.comm.small ../results/dblp_small_output.txt dblp_small
    
    This will create output files named results/dblp_small.pmetrics.csv and results/dblp.small.pmetrics.csv
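This README does not say which metrics `metrics.py` computes. As a general illustration of comparing detected communities with ground truth, the sketch below scores each detected community by its best F1 overlap with any ground-truth community and averages the result. It is not the repository's method. The function names are hypothetical, and communities are assumed to be already loaded as sets of node ids.

```python
def f1(detected, truth):
    """F1 overlap between two communities given as sets of node ids."""
    common = len(detected & truth)
    if common == 0:
        return 0.0
    precision = common / len(detected)
    recall = common / len(truth)
    return 2 * precision * recall / (precision + recall)


def average_best_f1(detected_comms, truth_comms):
    """Mean, over detected communities, of the best F1 against any truth community."""
    if not detected_comms or not truth_comms:
        return 0.0
    best = [max(f1(d, t) for t in truth_comms) for d in detected_comms]
    return sum(best) / len(best)


if __name__ == "__main__":
    detected = [{1, 2, 3}, {3, 4, 5, 6}]
    truth = [{1, 2, 3}, {4, 5, 6}]
    print(average_best_f1(detected, truth))
```

Because communities may overlap, a node such as 3 in this example can belong to more than one detected community. Set-based comparison handles that without special cases.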

Authors


Tech Stack

Python

