The dataset contains news content and reader comments about smartphones, crawled from VnExpress.
We use LSTM, BiLSTM, BERT, and SVM with TF-IDF, Word2Vec, and bag-of-words representations to classify these documents as positive (labeled 1), neutral (labeled 0), or negative (labeled 2).
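As a rough illustration of the label scheme and the TF-IDF representation mentioned above, here is a minimal standard-library sketch. It is not the code used in the paper: the `tfidf` helper, the whitespace tokenization, the plain `tf * log(N / df)` weighting, and the toy comments are all assumptions made for this example.

```python
import math
from collections import Counter

# Label scheme stated in this README: 1 = positive, 0 = neutral, 2 = negative.
LABELS = {1: "positive", 0: "neutral", 2: "negative"}


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumption: lowercase whitespace tokenization and the plain
    tf * log(N / df) weighting, chosen only for illustration.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy comments, not taken from the dataset.
docs = ["pin tốt", "pin tệ", "máy bình thường"]
vectors = tfidf(docs)
```

In this toy corpus "pin" appears in two of the three comments, so it receives a lower weight than "tốt", which appears in only one. Vectors like these could then be passed to a classifier such as an SVM.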
Accepted at ICSMB 2020 (International Conference for Small and Medium Business 2020)
For Word2Vec, we used a pre-trained model retrieved from https://github.com/sonvx/word2vecVN
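One common way to turn pre-trained word vectors into a fixed-size document feature is to average the vectors of the words in the document. The sketch below shows that idea only; whether the paper used averaging is not stated here, so treat it as an assumption. The `average_vector` helper and the 3-dimensional `toy_embeddings` are hypothetical stand-ins for the real pre-trained model.

```python
def average_vector(tokens, embeddings, dim):
    """Mean of the vectors of in-vocabulary tokens; zero vector if none match."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    # zip(*known) groups the values of each dimension across all known vectors.
    return [sum(values) / len(known) for values in zip(*known)]


# Hypothetical 3-dimensional toy embeddings standing in for the pre-trained model.
toy_embeddings = {
    "pin": [1.0, 0.0, 2.0],
    "tốt": [3.0, 2.0, 0.0],
}

# "quá" is out of vocabulary in the toy table, so it is skipped.
doc_vector = average_vector(["pin", "tốt", "quá"], toy_embeddings, dim=3)
```

Out-of-vocabulary words are ignored here, and a document with no known words maps to a zero vector; both are design choices of this sketch.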
If you want to use our data or have any questions, please feel free to contact us at anhthuan1389@gmail.com.
Best regards
Thuan Tran Anh, Faculty of Information Systems, University of Economics and Law, Vietnam National University Ho Chi Minh City
Nhat Nguyen Anh, Faculty of Information Systems, University of Economics and Law, Vietnam National University Ho Chi Minh City
Thanh Bui Xuan, Faculty of Information Systems, University of Economics and Law, Vietnam National University Ho Chi Minh City
An Vo Nguyen Tam, Faculty of Information Systems, University of Economics and Law, Vietnam National University Ho Chi Minh City
Su Le Hoanh, Faculty of Information Systems, University of Economics and Law, Vietnam National University Ho Chi Minh City