HTML Fetcher Script

The HTML Fetcher Script is a Python script that fetches and optionally saves the HTML content of a specified URL using the requests library. It prompts for and validates user input, such as the URL format and the timeout value, and lets users customize settings such as redirect handling and timeout duration. It handles a range of exceptions and prints informative error messages to guide the user when something goes wrong.

Project Structure

HTML-Fetcher-Script/
├── src/
│   └── fetch_html.py       # Main Python script
├── .gitignore              # Ignored files
├── LICENSE                 # License file
└── README.md               # Documentation

Features

Detailed Console Messages: Guides users through every step with clear instructions and error messages.
Robust Error Handling: Catches and handles various exceptions like invalid URLs, connection errors, timeouts, and HTTP issues.
URL Validation: Ensures the entered URL starts with http:// or https://.
Redirect Options: Allows users to choose whether to follow HTTP redirects.
Timeout Configuration: Lets users specify a timeout duration (in seconds) for the request.
Fetch HTML: Fetches the HTML content from the specified URL and displays both the response headers and the HTML content.
HTML Saving: Provides an option to save the fetched HTML content to a local file and handles overwriting conflicts.
Repeat or Exit: After each request, gives the user the option to fetch another URL or exit the program.
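
The URL and timeout checks described above can be sketched roughly as follows. This is a minimal illustration, not the code in fetch_html.py: the function names `is_valid_url` and `parse_timeout` are hypothetical, and stripping surrounding whitespace before checking is an assumption.

```python
def is_valid_url(url: str) -> bool:
    """Return True if the URL starts with http:// or https://."""
    # Hypothetical helper: whitespace stripping is an assumption.
    return url.strip().startswith(("http://", "https://"))


def parse_timeout(text: str):
    """Return a positive integer timeout in seconds, or None if invalid."""
    try:
        value = int(text.strip())
    except ValueError:
        # Not an integer at all (e.g. "abc" or "2.5").
        return None
    # Zero and negative values are rejected.
    return value if value > 0 else None
```

A prompt loop could call these helpers repeatedly until the user enters an accepted value.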

Getting Started

Prerequisites

Python 3 (available from the official Python website)
requests library (install it by running pip install requests in the terminal)

Installation

Clone this repository to your local machine:

git clone https://github.com/Atia-Farha/HTML-Fetcher-Script.git

Navigate to the directory containing the script:
```
cd HTML-Fetcher-Script
```

Usage

Run the fetch_html.py file located inside the src folder of this project.
Follow the on-screen prompts:
- Enter the website link: Provide a valid URL starting with http:// or https:// (e.g., https://example.com).
- Allow redirects: Specify whether to allow HTTP redirects (yes or no).
- Set timeout value: Enter a positive integer for the request timeout in seconds (e.g., 5, 10, or 12).
- Save HTML content: Choose whether to save the fetched HTML content to a file (yes or no).
If saving the file:
- Enter a file name.
- If the file already exists, you will be prompted to overwrite or choose a different name.
To fetch another URL when prompted, answer yes and repeat the same process. To exit the script, answer no.
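
The save step with its overwrite check might look something like the sketch below. The function name `save_html`, its `overwrite` flag, and the UTF-8 encoding are assumptions made for illustration; the actual script prompts the user interactively instead of taking a flag.

```python
from pathlib import Path


def save_html(html: str, file_name: str, overwrite: bool = False) -> bool:
    """Write html to file_name.

    Returns False without writing if the file already exists and
    overwrite is not allowed, so the caller can ask for another name.
    """
    path = Path(file_name)
    if path.exists() and not overwrite:
        return False
    # UTF-8 is an assumption; the real script may use a different encoding.
    path.write_text(html, encoding="utf-8")
    return True
```

Returning a boolean rather than raising keeps the calling loop simple: on False, the script can prompt again for overwrite confirmation or a different file name.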

Error Handling

The script includes comprehensive error handling for:

Invalid URL
Content decoding errors
HTTP errors
Connection errors
Timeout errors
Chunked encoding errors
General request exceptions
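
One way these categories could map onto the exception classes in requests is sketched below. The mapping, the function name `fetch_html`, the `get` parameter (included only so the function can be exercised without a network), and the message wording are all assumptions, not the script's actual implementation.

```python
import requests


def fetch_html(url, allow_redirects=True, timeout=10, get=requests.get):
    """Return (html, error_message); exactly one of the two is None."""
    exc_mod = requests.exceptions
    try:
        response = get(url, allow_redirects=allow_redirects, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx responses into HTTPError
        return response.text, None
    except (exc_mod.MissingSchema, exc_mod.InvalidSchema, exc_mod.InvalidURL):
        return None, "Invalid URL."
    except exc_mod.ContentDecodingError:
        return None, "Could not decode the response content."
    except exc_mod.HTTPError as exc:
        return None, f"HTTP error: {exc}"
    except exc_mod.Timeout:
        # Checked before ConnectionError because ConnectTimeout is both.
        return None, "The request timed out."
    except exc_mod.ConnectionError:
        return None, "Could not connect to the server."
    except exc_mod.ChunkedEncodingError:
        return None, "The server sent an invalid chunked response."
    except exc_mod.RequestException as exc:
        # Catch-all for any other requests failure.
        return None, f"Request failed: {exc}"
```

The specific handlers come before the general `RequestException` handler because Python uses the first matching `except` clause.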

Reporting Issues

If you encounter any bugs or have suggestions for improvement, please report them in the Issues section of this GitHub repository. I will address them promptly.

License

This script is provided for personal, non-commercial use only; commercial use is not permitted. When using this script, credit must be given to the original author. For more details, see the LICENSE file.
