pratikshrivastava/NA_PROJECT_SP2018

A repository for Network Analysis Project 2018

Problem Statement:

**Improving the student experience of the course catalog in a time-constrained, high-stakes environment**

  • Same word, different context.
  • Gaps in descriptions.
  • Relatively new field.
  • Same soft skill, different unfamiliar contexts.
  • Course descriptions are domain-specific or school-specific, but not student-specific.

Data Source:

Approaches Taken:

  1. Entity-based: We used the Google NLP API to retrieve the nouns, pronouns, etc. from the course descriptions, which helped us define the entities in them. Once we had the entities, we tried to manually classify them into categories.
  • Network graph: [figure]
  • Communities formed: [figure]
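
To make the entity-based idea concrete, here is a minimal sketch of how courses might be linked through shared entities. It does not call the Google NLP API: the `course_entities` dictionary, the course names, and both helper functions are hypothetical, and grouping by connected components is only a crude stand-in for whatever community detection the project actually used.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical entities per course, standing in for the output of an
# entity-extraction API. None of these values come from the project data.
course_entities = {
    "Network Analysis": {"graph", "community", "python"},
    "Data Mining": {"clustering", "python", "classification"},
    "Machine Learning": {"classification", "regression"},
    "Business Ethics": {"ethics", "policy"},
}

def shared_entity_edges(course_entities):
    """Map each course pair that shares at least one entity to the shared set."""
    edges = {}
    for a, b in combinations(sorted(course_entities), 2):
        shared = course_entities[a] & course_entities[b]
        if shared:
            edges[(a, b)] = shared
    return edges

def connected_groups(nodes, edges):
    """Group nodes into connected components with an iterative depth-first search."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups

edges = shared_entity_edges(course_entities)
groups = connected_groups(course_entities, edges)
for group in groups:
    print(sorted(group))
```

In this toy data, "Business Ethics" shares no entity with the other courses and ends up in a group of its own, while the remaining three courses are chained together through "python" and "classification".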
  2. Cosine similarity: We used the gensim library for Python to create the corpus and generate similarity scores between the courses. Once we had the cosine similarity scores, we used networkx to build a graph with edges between the nodes whose scores were greater than the set threshold.

This approach had a drawback: each course description is written to be distinct from those of other courses, so the similarity scores between documents were very low. This can be observed in the plot below.

  • Histogram of similarity scores between courses: [figure]
  • Network plot generated from the similarity scores: [figure]
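
The thresholding step in the cosine similarity approach can be sketched as follows. This is a pure-Python stand-in, not the gensim and networkx code the project used: it scores raw term counts, and the course names, descriptions, and the threshold of 0.3 are all assumptions made for illustration (the README does not state the threshold that was used).

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical course descriptions; not taken from the project data.
descriptions = {
    "Course A": "graph theory and network analysis",
    "Course B": "network analysis with python",
    "Course C": "financial accounting principles",
}

def bag_of_words(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def similarity_edges(descriptions, threshold):
    """Return (course_a, course_b, score) for pairs scoring above the threshold."""
    vectors = {name: bag_of_words(text) for name, text in descriptions.items()}
    edges = []
    for a, b in combinations(sorted(vectors), 2):
        score = cosine_similarity(vectors[a], vectors[b])
        if score > threshold:
            edges.append((a, b, score))
    return edges

# Assumed threshold, chosen only for this example.
edges = similarity_edges(descriptions, threshold=0.3)
for a, b, score in edges:
    print(f"{a} -- {b}: {score:.3f}")
```

With short descriptions and little shared vocabulary, most pairs score at or near zero, which is the same effect behind the drawback described above: raising the threshold quickly leaves a graph with few or no edges.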

Analysis:

Comparing the suggested electives with the communities.
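
The README does not say how this comparison was carried out. One possible way, sketched below purely as an assumption, is to measure the Jaccard overlap between a list of suggested electives and each detected community. The course codes, community names, and helper functions are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_matching_community(electives, communities):
    """Return the name of the community with the highest Jaccard overlap."""
    return max(communities, key=lambda name: jaccard(electives, communities[name]))

# Hypothetical community assignments and elective list.
communities = {
    "community_0": {"C1", "C2", "C3"},
    "community_1": {"C4", "C5"},
}
suggested_electives = {"C2", "C3", "C6"}

best = best_matching_community(suggested_electives, communities)
print(best, jaccard(suggested_electives, communities[best]))
```

A high overlap would suggest that a community lines up with the suggested electives, while a low overlap across all communities would suggest the graph groups courses along some other dimension.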
