vegarrsm/twitterAuthorIdentification: Naïve Bayes and BERT efficiency and accuracy comparison for recognizing authorship of tweets

To choose the Twitter accounts to train on, run authorSelect.py and either pick the premade list or add your own selection. Alternatively, skip this step and use the data I have included. I recommend this option, as tweepy tends to lose its connection sometimes.
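If you do fetch tweets yourself, dropped connections can be worked around by retrying the failing call. The following is only a minimal sketch, not code from this repository: `with_retries` and the `fetch` callable are hypothetical names, and it assumes the failure surfaces as Python's built-in `ConnectionError`. The exception tweepy actually raises may differ, so adjust the `except` clause to what you observe.

```python
import time


def with_retries(fetch, attempts=3, delay=2.0):
    """Call fetch() until it succeeds or the attempts run out.

    fetch is a hypothetical zero-argument callable that wraps one download
    step. Assumption: a dropped connection raises ConnectionError.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except ConnectionError:
            if attempt == attempts:
                # Out of attempts: let the caller see the original error.
                raise
            # Wait a little longer after each failure (linear backoff).
            time.sleep(delay * attempt)
```

A download step would then be wrapped as `with_retries(lambda: download_one_author(name))`, where `download_one_author` stands in for whatever function performs the request.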

The project was mostly run in Google Colab (for increased processing power), so if any errors occur when running locally, I recommend trying it there. In Google Colab, click "Runtime" -> "Change runtime type" -> "Hardware accelerator" -> "GPU". The project can run without a GPU, but it takes a very long time. If running in Google Colab, run !pip install pytorch-pretrained-bert pytorch-nlp before running BERT.py.
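Before starting a long BERT run, it can help to confirm that the GPU runtime is actually active. This is a small sketch rather than part of the repository: `pick_device` is a hypothetical helper, and it assumes PyTorch is the backend (it is installed as a dependency of pytorch-pretrained-bert). If PyTorch is missing, the helper simply reports the CPU.

```python
def pick_device():
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is the installed backend
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print("Using device:", pick_device())
```

If this prints "cpu" in Colab, the hardware accelerator setting has probably not been applied to the current session.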

To run Naive Bayes, run script.py

To run BERT, run BERT.py

Repository files (4 commits):
__pycache__
Author.py
BERT.py
README.md
authorSelect.py
authors.tsv
script.py
testInputFile.tsv
