Work in progress: Political Science MSc thesis data prep and modelling
Project: Assessing the Relevance of Explanatory Factors in Forecasting Terrorist Threats
The scripts "dataprep.py" and "var_edits.py" handle data preparation, while "dataexpl.py" and, more importantly, "models.py" contain the analysis. The main dataset used in the models is built by combining the third-party datasets in datasets_input/ and is saved as "merged_data.csv". "concordance_table.csv" maps the country names in those datasets onto the names used in the Global Terrorism Database, which serves as the baseline; "country_names.csv" and "updated_country_names.csv" were intermediate files used to build the concordance table and can be ignored. The folder output_files/ contains log files recording the performance of each model variation and the relative importance of its features.
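
For orientation, here is a minimal sketch of how a concordance table can be applied to harmonise country names before merging. It is not the code in "dataprep.py". The column names "source_name" and "gtd_name", the helper functions, and the sample rows are all hypothetical, so check the real header of "concordance_table.csv" before adapting it.

```python
import csv
import io

# Hypothetical layout: one row per name variant, mapped to the baseline spelling.
# The actual columns and entries in concordance_table.csv may differ.
CONCORDANCE_CSV = """source_name,gtd_name
Russian Federation,Russia
Viet Nam,Vietnam
"""


def load_concordance(handle):
    """Read a concordance CSV into a {variant: baseline_name} dict."""
    reader = csv.DictReader(handle)
    return {row["source_name"].strip(): row["gtd_name"].strip() for row in reader}


def harmonise(names, mapping):
    """Map each name to its baseline spelling; names without an entry pass through.

    Also returns the sorted set of names that had no entry, so they can be
    checked by hand (they may already be in baseline form, or may be missing
    from the table).
    """
    cleaned = [name.strip() for name in names]
    harmonised = [mapping.get(name, name) for name in cleaned]
    unmatched = sorted({name for name in cleaned if name not in mapping})
    return harmonised, unmatched


# In practice the handle would come from open("concordance_table.csv", newline="").
mapping = load_concordance(io.StringIO(CONCORDANCE_CSV))
harmonised, unmatched = harmonise(["Russian Federation", "Viet Nam", "France"], mapping)
print(harmonised)
print(unmatched)
```

Returning the unmatched names separately makes gaps in the table visible instead of letting rows drop out silently when the datasets are later joined on country name.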