🚀HyperCrawler🚀 - Scalable web crawler based on a microservice architecture

avollmaier/hypercrawler

Hypercrawler Project

A web crawler is a program that systematically searches the internet for web pages. The goal of web crawling is to analyze and persist the pages it examines.

Description

This project addresses the problem described above: crawling, analyzing, and persisting web pages at scale. That problem feeds directly into every stage of the software development process. The objective is to develop a scalable, high-performance architecture for a web crawler that processes large data volumes efficiently.

The overall architecture is based on microservices so that the system can be scaled easily, both horizontally and vertically. Once the architecture concept is complete, it is to be implemented using modules of the Spring framework, which already provide extensive mechanisms for building a stable and scalable application.

Status

config-service: Commit Stage

edge-service: Commit Stage

manager-service: Commit Stage

frontier-service: Commit Stage

crawler-service: Commit Stage

filter-service: Commit Stage

Prerequisites

To build, containerize, and deploy the services locally, you need the tools used in the steps below: Git, minikube, and Tilt.
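Before running the steps under Run Locally, it can help to confirm that the command-line tools they invoke (`git`, `minikube`, `tilt`) are on your `PATH`. The following is a minimal sketch, not part of the repository; the helper name `missing_tools` is hypothetical, and the tool list is taken only from the commands shown in this README.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of the hypercrawler repository):
# prints the names of the given commands that cannot be found on PATH,
# separated by spaces. Prints an empty line if all are present.
missing_tools() {
  missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Strip the leading space left by the concatenation above.
  echo "${missing# }"
}

# Tools invoked by the "Run Locally" steps in this README.
result="$(missing_tools git minikube tilt)"
if [ -n "$result" ]; then
  echo "Missing tools: $result"
else
  echo "All required tools found."
fi
```

The check only reports what is missing; it does not install anything or verify versions, so a passing result does not guarantee that the installed versions are compatible.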

Run Locally

Clone the project

  git clone --recursive https://github.com/avollmaier/hypercrawler.git

Go to the project directory

  cd hypercrawler

Create a minikube Kubernetes cluster (see the hypercrawler-deployment project)

  cd hypercrawler-deployment/kubernetes/platform/

  ./create-cluster.sh 

Start the Tilt server for fast local deployment

  cd ../development

  tilt up
