Scrapes the title and body of the article at each given URL.
Python, BeautifulSoup
- Fork the repo by clicking the Fork button in the top right corner.
- Clone the repo to your local machine using the following command:
git clone https://github.com/<your-github-username>/article-scraper.git
- Install the required packages, beautifulsoup4==4.10.0 and requests==2.22.0 (BeautifulSoup is published on PyPI as beautifulsoup4 and imported as bs4):
pip install beautifulsoup4==4.10.0 requests==2.22.0
- Run scrape.py with one or more URLs as command-line arguments:
python scrape.py https://www.link-to-the-article.comes/here https://www.maybe-another-article.com/
- The saved articles are written to the same directory, named by their index numbers.
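
For orientation, here is a minimal sketch of how a script like this could work. It is not the actual contents of scrape.py. The function names, the choice of the `<title>` and `<p>` tags, the `.txt` extension, and indexing from 0 are all assumptions made for illustration.

```python
import sys
from pathlib import Path

import requests
from bs4 import BeautifulSoup


def extract_article(html):
    """Return (title, body) parsed from an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: the title lives in <title> and the body is the text of all <p> tags.
    title = soup.title.get_text(strip=True) if soup.title else ""
    body = "\n".join(p.get_text().strip() for p in soup.find_all("p"))
    return title, body


def save_article(index, title, body, out_dir="."):
    """Write one article to <out_dir>/<index>.txt (hypothetical naming scheme)."""
    path = Path(out_dir) / f"{index}.txt"
    path.write_text(f"{title}\n\n{body}\n", encoding="utf-8")
    return path


def main(urls):
    # Assumption: files are numbered by the position of the URL, starting at 0.
    for index, url in enumerate(urls):
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # stop on HTTP errors such as 404
        title, body = extract_article(response.text)
        print(f"Saved {url} to {save_article(index, title, body)}")


if __name__ == "__main__":
    main(sys.argv[1:])
```

Real article pages often wrap the text in site-specific containers, so the `<p>` selector above would likely need adjusting per site.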