Sentiment analysis of three Twitter datasets using lexicon-based and rule-based sentiment analysis tools
Data Collection:
- #StrangerThings
- #Weather
- #USAirlines
Pre-Processing:
- Conversion of Tweet texts into Lower-case
- Tokenizing the sentences (using NLTK Tokenizer)
- Removing Twitter Usernames (using Regular Expressions)
- Removing Tweets that contain URLs (using Regular Expressions)
- Removing Stop Words (using an English stop-word list)
- Rejoining the remaining meaningful words into a single string after tokenization
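
The pre-processing steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses a plain regular expression in place of the NLTK tokenizer, a tiny hand-written `STOP_WORDS` set in place of a full English stop-word list, and it filters URLs and usernames before tokenizing for simplicity. The names `preprocess`, `STOP_WORDS`, `URL_RE`, `MENTION_RE`, and `TOKEN_RE` are hypothetical.

```python
import re

# Assumption: a tiny stand-in for a full English stop-word list.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of",
              "and", "in", "on", "for", "this", "it", "i", "my"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")  # matches common URL forms
MENTION_RE = re.compile(r"@\w+")               # matches Twitter usernames
TOKEN_RE = re.compile(r"[a-z0-9#']+")          # simple stand-in tokenizer


def preprocess(tweets):
    """Return cleaned tweet strings; tweets containing URLs are dropped."""
    cleaned = []
    for tweet in tweets:
        if URL_RE.search(tweet):
            continue  # remove the whole tweet if it contains a URL
        text = MENTION_RE.sub("", tweet.lower())  # lower-case, strip @usernames
        tokens = TOKEN_RE.findall(text)           # split into word tokens
        kept = [t for t in tokens if t not in STOP_WORDS]
        cleaned.append(" ".join(kept))            # rejoin the remaining words
    return cleaned


if __name__ == "__main__":
    sample = [
        "@united The flight was delayed again",
        "Check this out http://t.co/abc",
        "I love the weather in #Boston",
    ]
    for line in preprocess(sample):
        print(line)
```

Swapping `TOKEN_RE.findall` for an NLTK tokenizer and `STOP_WORDS` for NLTK's English stop-word corpus would bring the sketch closer to the tools named in the list.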
Tweets Labelling (Data Coding):
- Done using 'CrowdFlower'
Sentiment Analysis:
- SentiWordNet (Lexical Resource used for 'Opinion Mining')
- VADER - Valence Aware Dictionary and sEntiment Reasoner (Lexical and Rule-based Sentiment Analysis tool, commonly used to analyze sentiments expressed on Social Media platforms)
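
As a rough illustration of how a VADER score can be turned into a sentiment label, the sketch below maps the `compound` score to positive, negative, or neutral. The ±0.05 cut-offs are the thresholds conventionally suggested in the VADER documentation; whether this project used them is an assumption. The helper names `label_from_compound` and `classify` are hypothetical, and the demo only runs if the `vaderSentiment` package happens to be installed.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def classify(text, analyzer):
    """Label one tweet using any object exposing VADER's polarity_scores()."""
    scores = analyzer.polarity_scores(text)  # dict with neg, neu, pos, compound
    return label_from_compound(scores["compound"])


if __name__ == "__main__":
    try:
        from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    except ImportError:
        print("vaderSentiment is not installed; skipping the demo")
    else:
        analyzer = SentimentIntensityAnalyzer()
        for tweet in ["love weather today", "flight delayed again"]:
            print(tweet, "->", classify(tweet, analyzer))
```

Keeping the threshold logic separate from the analyzer makes it easy to apply the same labelling rule to scores from another lexicon, such as SentiWordNet, for a like-for-like comparison.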
Performance Evaluation was done, inferences were made and results were discussed.
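
The evaluation metrics are not specified here, so the following is only one plausible sketch: it compares tool predictions against the crowd-sourced labels and reports overall accuracy plus a confusion count. The function name `evaluate` and the label strings are assumptions for illustration.

```python
from collections import Counter


def evaluate(gold, predicted):
    """Return (accuracy, confusion) for two equal-length label lists.

    confusion maps (gold_label, predicted_label) pairs to their counts.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot evaluate an empty label list")
    correct = sum(g == p for g, p in zip(gold, predicted))
    confusion = Counter(zip(gold, predicted))
    return correct / len(gold), confusion


if __name__ == "__main__":
    gold = ["positive", "negative", "neutral", "negative"]
    predicted = ["positive", "negative", "negative", "negative"]
    accuracy, confusion = evaluate(gold, predicted)
    print("accuracy:", accuracy)
    for (g, p), count in sorted(confusion.items()):
        print(f"gold={g} predicted={p}: {count}")
```

The confusion counts show which classes a tool tends to mix up (for example, neutral tweets scored as negative), which accuracy alone hides.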