Partial Code Belonging to the Project PROBLEMSHIFTING (www.problemshifting.org)
- Clean Code is the annotated code. Do not run the Docx/PDF conversion cell, as the folder structure in the git repository is incomplete (and all the files are already converted in the training data)
- read_meta_data.py and read_meta_data_treaties.py convert the lists of metadata to usable CSV files (they do not need to be run, as the CSV files are already uploaded)
- web_scraping.py downloads the treaties and their metadata from the website (it can be run, but this is not strictly necessary, as some training data is already uploaded)
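
For orientation, here is a minimal sketch of the kind of metadata-to-CSV conversion the read_meta_data scripts perform. It is not the project's actual code: the function name `metadata_to_csv` and the field names `title` and `year` are hypothetical, and the real scripts may read and write different columns.

```python
import csv
import io


def metadata_to_csv(records, fieldnames):
    """Serialize a list of metadata dicts to CSV text.

    Hypothetical helper for illustration only; not part of the repository.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",  # drop keys that are not listed in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({name: record.get(name, "") for name in fieldnames})
    return buffer.getvalue()


# Example with made-up records (assumed field names).
example_records = [
    {"title": "Treaty A", "year": 1990},
    {"title": "Treaty B"},
]
csv_text = metadata_to_csv(example_records, ["title", "year"])
```

To write a file instead of a string, the same `csv.DictWriter` calls work on a file opened with `open(path, "w", newline="")`.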