🔞 Rule34 Scraper Bot 🤖

📃 Description

A Node.js console-based scraper using Puppeteer to extract images, videos and metadata into a json file, you can also download images and videos along with json data, from rule34.xxx based on search tags.

✅ Features

Added plugins that help prevent detection.
Scrapes images & videos URLs, post details, and other metadata.
Download images and videos along with json file.
Handles pagination and avoids duplicate data.
Saves data in a dynamic JSON file named after the search tags.

📦 Packages Used

Puppeteer : Puppeteer is a Node.js library which provides a high-level API to control Chrome/Chromium over the DevTools Protocol. Puppeteer runs in headless mode by default, but can be configured to run in full ("headful") Chrome/Chromium.
Puppeteer Extra : A light-weight wrapper around puppeteer and friends to enable cool plugins through a clean interface.
Puppeteer Extra Plugin Stealth : A plugin for puppeteer-extra and playwright-extra to prevent detection.
Node : Path : module provides utilities for working with file and directory path.
Axios : is a promise based HTTP client for the browser and node.js.

🤔 How to Use

Clone the repository.
Move inside the folder and install dependencies with npm install.
Run the app with node scraper_bot.js.
Enter your search term when prompted.

🫡 Example Usage

$ node scraper_bot.js
-------------------------------------------
       Welcome to Rule34 Scraper Bot
-------------------------------------------
Just enter the appropriate tags, follow the same convention used in Rule34.
Json file, Images and Videos folder will be generated in your current path.
Generated json file will include images, videos and other meta data.
If you encounter an bug, please open a issue: https://github.com/Shivam171/rule34-scraper-bot

NOTE: To close the program enter {ctrl + c}
-------------------------------------------
Enter your search tags: cartoon
-------------------------------------------
            Choose your option
1. JSON data Only
2. Images Only
3. Videos Only
4. All above
-------------------------------------------
Enter your choice: 4
-------------------------------------------
Searching...
Retriving data from rule34...
Note: This may take a while, Please be patient...
-------------------------------------------
Navigating to 1 of 42
Navigating to 2 of 42
Navigating to 3 of 42
.
.

⚠️ Disclaimer

Web scraping can violate website terms of service. Contents of this website can be "NSFW" (not safe for work), Use responsibly and ensure legality and ethical compliance before scraping any website. This project is for educational purposes only.

🫂 Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Name		Name	Last commit message	Last commit date
Latest commit History 26 Commits
.gitignore		.gitignore
README.md		README.md
jsconfig.json		jsconfig.json
package-lock.json		package-lock.json
package.json		package.json
scraper_bot.js		scraper_bot.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🔞 Rule34 Scraper Bot 🤖

📃 Description

✅ Features

📦 Packages Used

🤔 How to Use

🫡 Example Usage

⚠️ Disclaimer

🫂 Contributing

About

Languages

Shivam171/rule34-scraper-bot

Folders and files

Latest commit

History

Repository files navigation

🔞 Rule34 Scraper Bot 🤖

📃 Description

✅ Features

📦 Packages Used

🤔 How to Use

🫡 Example Usage

⚠️ Disclaimer

🫂 Contributing

About

Topics

Resources

Stars

Watchers

Forks

Languages