URL = https://ceopuducherry.py.gov.in/rolls/rolls.html
Year = 2018
Total number of files = 913
Languages = Tamil, English
The script does two things:
Produces puducherry.csv, which contains metadata about the PDFs. The CSV has the following fields:
constituency_name, part_no, poll_station_name, area_covered, file_names_en, file_names_ta
Downloads all the PDFs to a directory called
To install the dependencies and run the script:
pip install -r requirements.txt
python puducherry.py
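
Once the script has finished, the metadata file can be checked with a few lines of Python. The following is only a minimal sketch, not part of puducherry.py: it assumes puducherry.csv has a header row using the field names listed above, is UTF-8 encoded, and holds one row per part. The helper names load_metadata and count_parts_by_constituency are hypothetical.

```python
import csv

# Field names as listed in this README.
EXPECTED_FIELDS = [
    "constituency_name",
    "part_no",
    "poll_station_name",
    "area_covered",
    "file_names_en",
    "file_names_ta",
]


def load_metadata(csv_path="puducherry.csv"):
    """Read the metadata CSV and return its rows as a list of dicts.

    Assumes a header row and UTF-8 encoding (not confirmed by the README).
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        header = reader.fieldnames or []
        missing = [name for name in EXPECTED_FIELDS if name not in header]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        return list(reader)


def count_parts_by_constituency(rows):
    """Count how many rows (assumed: one per part) each constituency has."""
    counts = {}
    for row in rows:
        name = row["constituency_name"]
        counts[name] = counts.get(name, 0) + 1
    return counts


if __name__ == "__main__":
    rows = load_metadata()
    print(f"{len(rows)} rows in puducherry.csv")
    for name, n in sorted(count_parts_by_constituency(rows).items()):
        print(f"{name}: {n} parts")
```

If the CSV turns out to have no header row or a different encoding, the call to csv.DictReader and the encoding argument would need to be adjusted accordingly.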