README.md

Indeed Job Scraper

Project Title: Indeed Job Scraper

I built this to automate scraping job listings from Indeed.com, which makes it easier to collect and analyze job postings for a specific location. The project is written in Python and uses Selenium for scraping and JSON for the parsed output.

Description

Indeed Job Scraper fetches job listings from Indeed.com that match specified criteria (e.g., the search term "sponsorship" in Chicago, IL), then parses the extracted data into structured JSON for further analysis. The tool rate-limits its requests so that it does not overload the website.
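
To make the search criteria concrete, here is a minimal sketch of how a keyword and a location could be turned into a search URL. The base URL, the `q`, `l`, and `start` parameter names, and the `build_search_url` helper are assumptions for illustration; the actual script may build its requests differently.

```python
from urllib.parse import urlencode

# Assumed base URL and parameter names ("q" for keywords, "l" for location).
BASE_URL = "https://www.indeed.com/jobs"


def build_search_url(query, location, start=0):
    """Return a search URL for the given keywords and location."""
    params = {"q": query, "l": location}
    if start:
        params["start"] = start  # assumed result offset for pagination
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("sponsorship", "Chicago, IL"))
# prints https://www.indeed.com/jobs?q=sponsorship&l=Chicago%2C+IL
```

Keeping the criteria as function arguments means the same helper can be reused for other keywords or cities without editing the URL by hand.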

Features

Web Scraping: Utilizes Selenium to fetch job listings from Indeed.com.
Rate Limiting: Includes a retry mechanism with delays to avoid overwhelming the website.
JSON Output: Exports extracted data in JSON format for further processing.
CSV Conversion: Optionally converts the JSON output into a CSV file.
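
The retry-with-delays idea behind the rate limiting feature can be sketched roughly as follows. This is an illustration, not the script's actual code: the `fetch_with_retry` name, the default attempt count, and the exponential backoff with jitter are all assumptions, and the real scraper may simply use fixed delays.

```python
import random
import time


def fetch_with_retry(fetch, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is reached.

    Waits base_delay * 2**attempt seconds (plus up to 0.5 s of jitter)
    between attempts. The sleep function is injectable for testing.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            sleep(delay)


# Usage (hypothetical): wrap whatever callable loads a results page, e.g.
# html = fetch_with_retry(lambda: load_page(url))
```

Growing the delay after each failure gives the site time to recover, and the small random jitter avoids retrying at perfectly regular intervals.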

Installation

Prerequisites

Python 3.x (preferably 3.9 or later)
Selenium WebDriver (ChromeDriver)
json and csv modules (included in the Python standard library, so no separate installation is needed)

Installation Steps

Clone this repository using Git.
Install required libraries using pip: pip install selenium
Download the ChromeDriver release that matches your installed Chrome version and add it to your system's PATH.

Usage

Running the Scraper

Execute the job_scraper_with_rate_limiting.py script.
The tool will fetch job listings based on the specified criteria (sponsorship, Chicago, IL).
It will parse extracted data into JSON format and save it to a file named log_{timestamp}.json.
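
As a rough sketch of the last step, the snippet below writes a list of job records to a file named log_{timestamp}.json. The `save_jobs` helper, the `%Y%m%d_%H%M%S` timestamp format, and the shape of the records are assumptions; the script's real timestamp format and fields may differ.

```python
import json
from datetime import datetime
from pathlib import Path


def save_jobs(jobs, out_dir=".", now=None):
    """Write a list of job dicts to log_{timestamp}.json and return the path."""
    now = now or datetime.now()
    # The timestamp format is an assumption; the script may use another one.
    timestamp = now.strftime("%Y%m%d_%H%M%S")
    path = Path(out_dir) / f"log_{timestamp}.json"
    with path.open("w", encoding="utf-8") as fh:
        json.dump(jobs, fh, indent=2, ensure_ascii=False)
    return path
```

Putting the timestamp in the file name means each run produces a new file instead of overwriting the previous one.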

Optional CSV Conversion

After running the scraper, execute the parse_json_file_to_csv.py script.
This will convert the JSON output from the previous step into a CSV file named job_data_extended.csv.
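
The conversion can be pictured with the minimal sketch below, which uses only the standard json and csv modules. It assumes the JSON file holds a list of flat dictionaries; the `json_to_csv` helper and any field names are hypothetical and not taken from parse_json_file_to_csv.py.

```python
import csv
import json


def json_to_csv(json_path, csv_path):
    """Convert a JSON file holding a list of flat dicts into a CSV file.

    Returns the number of records written.
    """
    with open(json_path, encoding="utf-8") as fh:
        records = json.load(fh)

    # Collect column names in first-seen order so that records with
    # differing keys still fit into one table.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(records)
    return len(records)
```

Records that lack a field get an empty cell in that column, so listings with missing data do not break the export.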

Contributing

Contributions are welcome! If you'd like to enhance this project or add new features, please follow these steps:

Fork this repository on GitHub.
Make your changes in a new branch (e.g., feature/new-feature).
Commit your changes with descriptive commit messages.
Submit a pull request for review.

License

Indeed Job Scraper is released under the MIT License.

Tags/Keywords

Indeed, web scraping, Selenium, rate limiting, JSON parsing, CSV conversion

