In-depth-NLP

Parts of Speech Tagging (POS)

The challenge of correctly identifying parts of speech is summed up nicely in the spaCy docs:

Processing raw text intelligently is difficult: most words are rare, and it's common for words that look completely different to mean almost the same thing. The same words in a different order can mean something completely different. Even splitting text into useful word-like units can be difficult in many languages. While it's possible to solve some problems starting from only the raw characters, it's usually better to use linguistic knowledge to add useful information. That's exactly what spaCy is designed to do: you put in raw text, and get back a Doc object, that comes with a variety of annotations.

The POS & NER notebook takes a closer look at coarse POS tags (noun, verb, adjective) and fine-grained tags (plural noun, past-tense verb, superlative adjective).
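The difference between coarse and fine-grained tags can be sketched roughly as follows. This is a minimal illustration, not the notebook's own code: the helper names `describe_tags` and `main`, the example sentence, and the choice of the `en_core_web_sm` model are all assumptions made here. In spaCy, `token.pos_` holds the coarse tag, `token.tag_` holds the fine-grained tag, and `spacy.explain` returns a short description of a tag.

```python
import spacy


def describe_tags(doc):
    """Return (text, coarse tag, fine-grained tag, explanation) for each token."""
    return [
        (token.text, token.pos_, token.tag_, spacy.explain(token.tag_))
        for token in doc
    ]


def main():
    # Assumption: the small English model has been installed beforehand with
    #   python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("The quick brown fox jumped over the laziest dogs.")
    for text, coarse, fine, meaning in describe_tags(doc):
        print(f"{text:10} {coarse:6} {fine:5} {meaning}")
```

Calling `main()` prints one row per token, so a coarse tag such as `NOUN` can be compared with the fine-grained tag that refines it.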

Named Entity Recognition (NER)

spaCy has an 'ner' pipeline component that identifies token spans fitting a predetermined set of named entities. These are available as the ents property of a Doc object.
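As a rough sketch of how the `ents` property might be used, the snippet below collects each entity's text and label. The helper names `list_entities` and `main`, the example sentence, and the `en_core_web_sm` model are assumptions for illustration only. Which spans are recognized depends on the model that is loaded.

```python
import spacy


def list_entities(doc):
    """Return (text, label, label explanation) for every entity span in doc.ents."""
    return [(ent.text, ent.label_, spacy.explain(ent.label_)) for ent in doc.ents]


def main():
    # Assumption: the small English model has been installed beforehand with
    #   python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")
    entities = list_entities(doc)
    if not entities:
        print("No named entities found.")
    for text, label, meaning in entities:
        print(f"{text:12} {label:8} {meaning}")
```

Each item in `doc.ents` is a `Span`, so it also exposes token offsets (`ent.start`, `ent.end`) when the position of an entity matters.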

NLTK's VADER module

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a rule-based sentiment analyzer included in NLTK. It scores text based on the words used ("completely" boosts a score, while "slightly" reduces it), on capitalization and punctuation ("GREAT!!!" is stronger than "great."), and on negations (words like "isn't" and "doesn't" change the polarity of the words that follow).
To view the source code visit https://www.nltk.org/_modules/nltk/sentiment/vader.html

