In this project, we will implement two Spring Boot Java Web applications: `streamer-data-jpa` and `streamer-data-r2dbc`. Both will fetch 1 million customer records from MySQL and stream them to Kafka. The main goal is to compare the applications' performance and resource utilization.
On ivangfr.github.io, I have compiled my Proof-of-Concepts (PoCs) and articles. You can easily search for the technology you are interested in by using the filter. Who knows, perhaps I have already implemented a PoC or written an article about what you are looking for.
- `streamer-data-jpa`

  Spring Boot Web Java application that connects to MySQL using Spring Data JPA and to Kafka. It provides some endpoints such as:
  - `PATCH api/customers/stream-naive[?limit=x]`: to stream customer records using a naive implementation with Spring Data JPA;
  - `PATCH api/customers/stream[?limit=x]`: to stream customer records using a better implementation with Java 8 Streams and Spring Data JPA, as explained in this article (see the `streamer-data-jpa` sketch after this list);
  - `PATCH api/customers/load?amount=x`: to create a specific amount of random customer records.
- `streamer-data-r2dbc`

  Spring Boot Web Java application that connects to MySQL using Spring Data R2DBC and to Kafka. It provides some endpoints such as:
  - `PATCH api/customers/stream[?limit=x]`: to stream customer records (see the reactive sketch after this list);
  - `PATCH api/customers/load?amount=x`: to create a specific amount of random customer records.
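For `streamer-data-jpa`, the sketch below illustrates, in a simplified and hypothetical form, how a naive fetch differs from a Java 8 Streams fetch with Spring Data JPA. The identifiers (`Customer`, `CustomerRepository`, `CustomerStreamService`, `customer.topic`) are placeholders and are not taken from the project's source code.

```java
// Hypothetical sketch -- Customer, CustomerRepository and "customer.topic" are
// placeholder names, not this project's actual code.
// Customer is assumed to be a plain JPA @Entity mapped to the customer table.

// --- CustomerRepository.java ---
import java.util.stream.Stream;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;

public interface CustomerRepository extends JpaRepository<Customer, Long> {

    // Returning a Stream keeps the JDBC cursor open instead of loading every row into memory.
    @Query("SELECT c FROM Customer c")
    Stream<Customer> streamAll();
}

// --- CustomerStreamService.java ---
import java.util.stream.Stream;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class CustomerStreamService {

    private final CustomerRepository customerRepository;
    private final KafkaTemplate<String, Customer> kafkaTemplate;

    public CustomerStreamService(CustomerRepository customerRepository,
                                 KafkaTemplate<String, Customer> kafkaTemplate) {
        this.customerRepository = customerRepository;
        this.kafkaTemplate = kafkaTemplate;
    }

    // Naive variant: findAll() materializes the whole table in memory before publishing.
    public void streamNaive() {
        customerRepository.findAll()
                .forEach(customer -> kafkaTemplate.send("customer.topic", customer));
    }

    // Stream variant: the read-only transaction keeps the cursor alive while records
    // are read and published one by one; try-with-resources closes the Stream.
    @Transactional(readOnly = true)
    public void stream() {
        try (Stream<Customer> customers = customerRepository.streamAll()) {
            customers.forEach(customer -> kafkaTemplate.send("customer.topic", customer));
        }
    }
}
```

Note that, for MySQL, the JDBC driver usually also needs a fetch-size hint before it really streams rows instead of buffering the whole result set; the article referenced above covers the approach used in this project.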
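For `streamer-data-r2dbc`, here is a rough sketch of the reactive pipeline: rows are read as a `Flux` from a Spring Data R2DBC repository and published to Kafka one by one. Again, all identifiers are placeholders; in particular, `ReactiveKafkaProducerTemplate` is just one way to publish reactively and may not be what the project actually uses.

```java
// Hypothetical sketch -- Customer, CustomerRepository and "customer.topic" are
// placeholder names, not this project's actual code.
// Customer is assumed to be the entity/record mapped to the customer table.

// --- CustomerRepository.java ---
import org.springframework.data.repository.reactive.ReactiveCrudRepository;

public interface CustomerRepository extends ReactiveCrudRepository<Customer, Long> {
}

// --- CustomerStreamService.java ---
import org.springframework.kafka.core.reactive.ReactiveKafkaProducerTemplate;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Mono;

@Service
public class CustomerStreamService {

    private final CustomerRepository customerRepository;
    private final ReactiveKafkaProducerTemplate<String, Customer> kafkaTemplate;

    public CustomerStreamService(CustomerRepository customerRepository,
                                 ReactiveKafkaProducerTemplate<String, Customer> kafkaTemplate) {
        this.customerRepository = customerRepository;
        this.kafkaTemplate = kafkaTemplate;
    }

    // Records are emitted as the driver reads them and published one by one;
    // back-pressure propagates from Kafka back to the database driver.
    public Mono<Long> stream() {
        return customerRepository.findAll()
                .flatMap(customer -> kafkaTemplate.send("customer.topic", customer))
                .count();
    }
}
```

The point of the sketch is only the shape of the flow; how it compares to the JPA variants in practice is what the response-time simulation below measures.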
- Open a terminal and, inside the `spring-data-jpa-r2dbc-mysql-stream-million-records` root folder, run: `docker compose up -d`
- Wait for the Docker containers to be up and running. To check it, run: `docker ps -a`
- Once MySQL, Kafka and Zookeeper are up and running, run the following scripts:
  - To create two Kafka topics: `./init-kafka-topics.sh`
  - To initialize the MySQL database: `./init-mysql-db.sh 1M`

    Note: we can provide the following load amount values: 0, 100k, 200k, 500k or 1M.
- Inside the `spring-data-jpa-r2dbc-mysql-stream-million-records` root folder, run the following Maven commands in different terminals:
  - streamer-data-jpa: `./mvnw clean spring-boot:run --projects streamer-data-jpa`
  - streamer-data-r2dbc: `./mvnw clean spring-boot:run --projects streamer-data-r2dbc`
- In a terminal, make sure you are in the `spring-data-jpa-r2dbc-mysql-stream-million-records` root folder;
- Run the following script to build the Docker images: `./build-docker-images.sh`
- `streamer-data-jpa`

  | Environment Variable | Description |
  | -------------------- | ----------- |
  | `MYSQL_HOST` | Specify host of the MySQL database to use (default `localhost`) |
  | `MYSQL_PORT` | Specify port of the MySQL database to use (default `3306`) |
  | `KAFKA_HOST` | Specify host of the Kafka message broker to use (default `localhost`) |
  | `KAFKA_PORT` | Specify port of the Kafka message broker to use (default `29092`) |
- `streamer-data-r2dbc`

  | Environment Variable | Description |
  | -------------------- | ----------- |
  | `MYSQL_HOST` | Specify host of the MySQL database to use (default `localhost`) |
  | `MYSQL_PORT` | Specify port of the MySQL database to use (default `3306`) |
  | `KAFKA_HOST` | Specify host of the Kafka message broker to use (default `localhost`) |
  | `KAFKA_PORT` | Specify port of the Kafka message broker to use (default `29092`) |
- Run the following `docker run` commands in different terminals:
  - `streamer-data-jpa`

        docker run --rm --name streamer-data-jpa -p 9080:9080 \
          -e MYSQL_HOST=mysql -e KAFKA_HOST=kafka -e KAFKA_PORT=9092 \
          --network spring-data-jpa-r2dbc-mysql-stream-million-records_default \
          ivanfranchin/streamer-data-jpa:1.0.0

  - `streamer-data-r2dbc`

        docker run --rm --name streamer-data-r2dbc -p 9081:9081 \
          -e MYSQL_HOST=mysql -e KAFKA_HOST=kafka -e KAFKA_PORT=9092 \
          --network spring-data-jpa-r2dbc-mysql-stream-million-records_default \
          ivanfranchin/streamer-data-r2dbc:1.0.0
- Previously, during the Start Environment step, we initialized MySQL with 1 million customer records.

- Running applications with Maven

  We will use the JConsole tool. In order to run it, open a new terminal and run: `jconsole`
- Running applications as Docker containers

  We will use the cAdvisor tool. In a browser, access:
  - to explore the running containers: http://localhost:8080/docker/
  - to go directly to a specific container:
    - streamer-data-jpa: http://localhost:8080/docker/streamer-data-jpa
    - streamer-data-r2dbc: http://localhost:8080/docker/streamer-data-r2dbc
In another terminal, call the following curl commands to trigger the streaming of customer records from MySQL to Kafka. At the end of each curl command, the total time (in seconds) it took to process will be displayed.

We can monitor the number of messages, and the messages themselves, being streamed using Kafdrop, a Kafka Web UI, at http://localhost:9000
- `streamer-data-jpa`

  Naive implementation: `curl -w "Response Time: %{time_total}s" -s -X PATCH localhost:9080/api/customers/stream-naive`

  Better implementation: `curl -w "Response Time: %{time_total}s" -s -X PATCH localhost:9080/api/customers/stream`

- `streamer-data-r2dbc`

  `curl -w "Response Time: %{time_total}s" -s -X PATCH localhost:9081/api/customers/stream`
A simulation sample, running the applications with Maven and using the JConsole tool:

- `streamer-data-jpa`

  Naive implementation: `Response Time: 414.486126s`

  Better implementation: `Response Time: 453.692525s`

- `streamer-data-r2dbc`

  `Response Time: 476.951654s`
- Kafdrop

  Kafdrop can be accessed at http://localhost:9001
- MySQL monitor

  To check data in the `customerdb` database:

      docker exec -it -e MYSQL_PWD=secret mysql mysql -uroot --database customerdb
      SELECT count(*) FROM customer;

  To create a dump of the `customer` table in the `customerdb` database, make sure you are in the `spring-data-jpa-r2dbc-mysql-stream-million-records` root folder and run: `./dump-mysql-db.sh`
- To stop `streamer-data-jpa` and `streamer-data-r2dbc`, go to the terminals where they are running and press `Ctrl+C`;
- To stop and remove Docker Compose containers, network and volumes, go to a terminal and, inside the `spring-data-jpa-r2dbc-mysql-stream-million-records` root folder, run the command below: `docker compose down -v`
To remove all Docker images created by this project, go to a terminal and, inside the `spring-data-jpa-r2dbc-mysql-stream-million-records` root folder, run the following script: `./remove-docker-images.sh`