A Python web scraper that extracts links to lottery result PDFs from KeralaLottery.com.
The script outputs the data in .csv format and sends an email to a default email address. It can also take an email address as a command-line argument.
The script runs in headless Chrome mode, which allows it to be deployed on Heroku and executed with the Heroku CLI.
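As a rough illustration of the headless setup, the sketch below shows one way a Selenium Chrome driver might be configured for a display-less host. It is not taken from scraper.py: the `make_driver` helper, the exact flag list, and the `GOOGLE_CHROME_BIN` environment variable (a convention of some Heroku Chrome buildpacks) are all assumptions.

```python
import os

# Flags commonly passed to Chrome when no display is available (assumed set,
# not necessarily the one scraper.py uses).
HEADLESS_FLAGS = ["--headless", "--no-sandbox", "--disable-dev-shm-usage"]


def make_driver():
    """Return a Chrome WebDriver configured with the headless flags above."""
    # Imported lazily so this module still loads where selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for flag in HEADLESS_FLAGS:
        options.add_argument(flag)

    # Assumption: the Chrome buildpack exposes the browser path in this variable.
    chrome_bin = os.environ.get("GOOGLE_CHROME_BIN")
    if chrome_bin:
        options.binary_location = chrome_bin

    return webdriver.Chrome(options=options)
```

On a local machine with Chrome installed, calling `make_driver()` should start a browser without opening a window; on Heroku, a Chrome buildpack would also need to be added to the app.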
- pandas
- selenium
See Requirements.txt.
I hope you built your project in a virtual environment. It's going to be a lot easier that way.
Make sure you download and install the Heroku CLI on your machine.
Create an account for yourself on Heroku.com.
Steps:
- Open Command Prompt/Shell and move to your project directory (ex: cd d:\projects\scraper)
- Create a new repository in your project directory
  git init
- Add everything in the directory to the repository
  git add .
- Commit
  git commit -am "Initial Commit"
- Log in to Heroku from the command prompt
  heroku login
  A new browser tab will open automatically, allowing you to log in. Click the login button and wait for the confirmation, after which you can close the tab and return to the command prompt.
- Create a new app
  heroku create
- Push the repository to Heroku
  git push heroku master
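The steps above can also be sketched as a small Python helper that runs the same commands in order. This is only an illustration: the `deploy` function and its `dry_run` flag are hypothetical and not part of this project, and `heroku login` still needs a browser.

```python
import shlex
import subprocess

# The same commands as in the steps above, in the same order.
DEPLOY_COMMANDS = [
    ["git", "init"],
    ["git", "add", "."],
    ["git", "commit", "-am", "Initial Commit"],
    ["heroku", "login"],
    ["heroku", "create"],
    ["git", "push", "heroku", "master"],
]


def deploy(project_dir, dry_run=True):
    """Run each step inside project_dir and return the command lines.

    With dry_run=True (the default) nothing is executed; the commands are
    only formatted and returned so they can be reviewed first.
    """
    command_lines = []
    for command in DEPLOY_COMMANDS:
        command_lines.append(shlex.join(command))
        if not dry_run:
            # check=True stops at the first step that fails.
            # On Windows the heroku command may need shell=True to be found.
            subprocess.run(command, cwd=project_dir, check=True)
    return command_lines
```

Calling `deploy(".")` only lists the commands; `deploy(".", dry_run=False)` would actually run them in the current directory.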
Your app is now deployed. To run the scraper on Heroku:
heroku run python scraper.py
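When run this way, the script writes its results as CSV and emails them to a default address, or to one given on the command line. The sketch below shows roughly how that flow could be structured. It is not the code in scraper.py: the function names, the `title`/`url` columns, the placeholder default address, and the positional-argument form are all assumptions, and it uses the standard library `csv` module instead of pandas to stay dependency-free. Sending the message (for example with `smtplib`) is left out.

```python
import argparse
import csv
import io
from email.message import EmailMessage

DEFAULT_RECIPIENT = "results@example.com"  # hypothetical placeholder address


def parse_recipient(argv):
    """Return the email address given on the command line, or the default."""
    parser = argparse.ArgumentParser(description="Scrape lottery result links.")
    parser.add_argument("email", nargs="?", default=DEFAULT_RECIPIENT)
    return parser.parse_args(argv).email


def links_to_csv(rows):
    """Format (title, url) pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["title", "url"])  # assumed column names
    writer.writerows(rows)
    return buffer.getvalue()


def build_message(recipient, csv_text):
    """Build (but do not send) an email with the CSV attached."""
    message = EmailMessage()
    message["To"] = recipient
    message["Subject"] = "Kerala lottery result links"
    message.set_content("The scraped links are attached.")
    message.add_attachment(
        csv_text.encode("utf-8"),
        maintype="text",
        subtype="csv",
        filename="results.csv",
    )
    return message
```

Under these assumptions, a run such as `heroku run python scraper.py someone@example.com` would pass the address through `parse_recipient(sys.argv[1:])`, and omitting it would fall back to the default.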
Take a look at this quick guide to Getting Started on Heroku with Python
Link: scraper.py
Disclaimer: This script and the information provided in this project are for educational purposes only.