A set of methods that predict the future values of popularity indices for news posts using a variety of features.
This is the supplementary code for the SNOW/WWW'16 workshop paper "Predicting News Popularity by Mining Online Discussions".
The presentation slides from SNOW/WWW'16 can be found here.
This study is also described in a non-technical way in this blog post.
- numpy
- scipy
- pandas
- scikit-learn
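As a quick sanity check before installing, the dependencies listed above can be probed from Python. The following is a minimal sketch and not part of this repository: the helper name `find_missing_modules` and the `REQUIRED_MODULES` list are illustrative assumptions. It relies only on the standard library, and on the fact that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

# Import names for the packages listed above.
# Note: scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ["numpy", "scipy", "pandas", "sklearn"]


def find_missing_modules(module_names):
    """Return the names in module_names that cannot be located for import.

    Hypothetical helper for illustration; it is not provided by this project.
    """
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are importable.")
```

The check only locates the modules; it does not import them or verify their versions.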
To install for all users on Unix/Linux:
python3.4 setup.py build
sudo python3.4 setup.py install
Three datasets were used in the context of the paper:
- RedditNews
- SlashDot
- BarraPunto
We collected the RedditNews dataset for this paper, so details on the collection process can be found there. Please cite the paper if you intend to use the dataset in your own studies. An anonymized version can be found on the GitHub project page, at news-popularity-prediction.news_popularity_prediction.news_post_data.reddit_news.anonymized_discussions.
The SlashDot and BarraPunto datasets were made available to us by Drs. Vicenc Gomez and Andreas Kaltenbrunner. We include anonymized versions of these datasets with their permission. Please cite this paper if you intend to use them in your own studies.
To run all the experiments, execute:
python3.4 run_all_experiments.py
from news_popularity_prediction.entry_points.snow_2016_workshop. Before running the script, open it and set the input and output folders.
To make all the figures, execute:
python3.4 make_all_figures.py
from news_popularity_prediction.entry_points.snow_2016_workshop. Before running the script, open it and set the input and output folders.
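Because both scripts expect the input and output folders to be set by hand, it can help to validate the paths before starting a long run. The sketch below is an assumption-laden illustration, not code from this repository: `prepare_folders` is a hypothetical helper, and the variable names used inside the actual scripts may differ. It checks that the input folder exists and creates the output folder if it is missing.

```python
import os
import tempfile


def prepare_folders(input_folder, output_folder):
    """Check that input_folder exists and create output_folder if needed.

    Hypothetical helper for illustration; it is not provided by this project.
    Returns the absolute paths of both folders.
    """
    if not os.path.isdir(input_folder):
        raise FileNotFoundError("Input folder not found: " + input_folder)
    # exist_ok=True makes this a no-op when the folder is already there.
    os.makedirs(output_folder, exist_ok=True)
    return os.path.abspath(input_folder), os.path.abspath(output_folder)


if __name__ == "__main__":
    # Demonstration with a temporary directory standing in for real data.
    with tempfile.TemporaryDirectory() as base:
        input_dir = os.path.join(base, "input")
        os.makedirs(input_dir)
        paths = prepare_folders(input_dir, os.path.join(base, "output"))
        print("Input folder:  " + paths[0])
        print("Output folder: " + paths[1])
```

In practice, the two arguments would be the same folders that are set inside run_all_experiments.py or make_all_figures.py.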