arnaujc91/keyword_graph

keyword_graph

To correlate keywords in a document, I split the document into smaller pieces that I call paragraphs. Paragraphs are obtained with the string method .splitlines(), so each line of the text counts as one paragraph. I then identify which keywords appear in each paragraph and choose an arbitrary maximum distance between paragraphs, an integer I call 'k'. If two keywords are found in two paragraphs that are less than 'k' paragraphs apart, I consider them linked. Naturally, frequent keywords produce many links simply because they occur often, so the raw number of links is not by itself representative of the correlation between keywords. For this reason the number of links is normalized by a value chosen so that the resulting correlation is meaningful.
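
The procedure described above can be sketched roughly as follows. This is only an illustration, not the repository's actual code: the function names (`keyword_positions`, `count_links`, `normalized_links`) are hypothetical, keywords are matched as case-insensitive substrings, and the normalization (dividing by the product of the two keyword frequencies, i.e. the number of possible paragraph pairs) is an assumption, since the text does not say which value is used.

```python
from collections import defaultdict
from itertools import combinations


def keyword_positions(text, keywords):
    """Map each keyword to the indices of the paragraphs (lines) containing it."""
    positions = defaultdict(list)
    for index, paragraph in enumerate(text.splitlines()):
        lowered = paragraph.lower()
        for keyword in keywords:
            # Assumption: a simple case-insensitive substring match.
            if keyword.lower() in lowered:
                positions[keyword].append(index)
    return positions


def count_links(positions, k):
    """Count, for each keyword pair, the paragraph pairs less than k apart."""
    links = {}
    for a, b in combinations(sorted(positions), 2):
        links[(a, b)] = sum(
            1
            for i in positions[a]
            for j in positions[b]
            if abs(i - j) < k
        )
    return links


def normalized_links(positions, links):
    """Divide each link count by the number of possible paragraph pairs.

    Assumption: this normalization is one plausible choice, not
    necessarily the one used in the repository.
    """
    return {
        (a, b): n / (len(positions[a]) * len(positions[b]))
        for (a, b), n in links.items()
    }
```

A call such as `normalized_links(pos, count_links(pos, k=2))`, with `pos = keyword_positions(text, keywords)`, returns a score between 0 and 1 for every pair of keywords that occur in the text. Note that empty lines also count as paragraphs here, because .splitlines() keeps them, so they add to the distance between neighbouring paragraphs.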
