Businesses collect customer feedback through multiple channels, both offline and online. Online platforms that aggregate customer sentiment by design, social media in particular, are becoming the dominant channel. Customers share their thoughts through Facebook likes, Twitter tweets, LinkedIn comments, Pinterest pins, and more.
Data source:
- Live stream Twitter API
- https://github.com/keyreply/Bahasa-Indo-NLP-Dataset
- https://www.kaggle.com/ilhamfp31/indonesian-abusive-and-hate-speech-twitter-text
- Indonesian Wikipedia corpus: https://dumps.wikimedia.org/idwiki/latest/ (idwiki-latest-pages-articles.xml.bz2)
- Natural Language Processing (NLP) is an active area of data science research, and sentiment analysis is one of its most common applications
- From opinion polls to entire marketing strategies, sentiment analysis has reshaped the way businesses work
- Thousands of text documents can be processed for sentiment in seconds, compared to the hours it would take a team of people to manually complete the same task
- Indonesia has many regional languages and dialects, so building a corpus that covers all of them and maps each word to other words with similar meaning is a major undertaking
- With an N-gram bag-of-words representation, stopwords can carry meaning, so removing them is not always safe
- Misspelling is the most common issue in Bahasa Indonesia text, especially on social media, so we need a large word library that covers these variants and makes stemming an easier task
- Satire is painful to handle: our model cannot work out the intended meaning of satirical sentences
- Word2Vec and GloVe are the recommended algorithms for building a robust word embedding model
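
The point about stopwords and N-grams can be illustrated with a minimal sketch. It is not the project's actual preprocessing code: the tiny `STOPWORDS` set, the helper names (`tokenize`, `ngrams`, `bigrams`), and the two example sentences are all assumptions made for illustration. It shows that if the negation word "tidak" ("not") is treated as a stopword, a positive and a negative sentence can end up with identical bigrams.

```python
from typing import List, Set

# Hypothetical, deliberately tiny stopword list (illustration only).
# Real Indonesian stopword lists are much longer and often include "tidak".
STOPWORDS: Set[str] = {"tidak", "yang", "dan", "di", "ini"}


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a simplifying assumption)."""
    return text.lower().split()


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bigrams(text: str, remove_stopwords: bool) -> List[str]:
    """Build bigrams, optionally dropping stopwords first."""
    tokens = tokenize(text)
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return ngrams(tokens, 2)


positive = "pelayanan ini bagus"        # "this service is good"
negative = "pelayanan ini tidak bagus"  # "this service is not good"

# With stopword removal, the negation disappears and both sentences
# produce the same features.
print(bigrams(positive, remove_stopwords=True))
print(bigrams(negative, remove_stopwords=True))

# Without stopword removal, the bigram "tidak bagus" keeps the negation.
print(bigrams(negative, remove_stopwords=False))
```

One possible design choice, under the same assumptions, is to keep negation words out of the stopword list while still filtering the rest.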