Syncer: MySQL/MongoDB => Elasticsearch/MySQL/Kafka/HBase

Chinese documentation

Features

Sync data consistently: data consistency is guaranteed
Sync data asynchronously: little impact on the origin server
Sync and manipulate data: write Java code to customize the sync
Sync data from MySQL/MongoDB to Kafka/ES/MySQL/HBase
ETL and real-time data sync combined: no data is missed; see the sample config

Use Syncer

Preparation

MySQL config:
- binlog_format: row
- binlog_row_image: full
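
The two settings above normally go in the [mysqld] section of the server's option file. The following is a minimal sketch, not Syncer's own tooling: the file name syncer-binlog.cnf is hypothetical, and the server-id and log-bin lines are assumptions added because row events only exist when binary logging is enabled.

```shell
#!/bin/sh
# Hypothetical output path; override with CNF=/etc/mysql/conf.d/syncer-binlog.cnf
CNF="${CNF:-./syncer-binlog.cnf}"

cat > "$CNF" <<'EOF'
[mysqld]
# Assumptions (not listed above): binary logging enabled, unique server id
server-id = 1
log-bin = mysql-bin
# Settings required in the preparation list above
binlog_format = row
binlog_row_image = full
EOF

# After restarting mysqld, the values can be checked with (not executed here):
#   mysql -e "SHOW VARIABLES WHERE Variable_name IN ('log_bin','binlog_format','binlog_row_image')"
```
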
MongoDB config:
- (optional) update bind_ip so that mongod listens for connections from applications on the configured addresses.
- enable a replica set:
  - mongod --replSet myapp
  - Or use docker: docker run -d --name mongodb -p 27017:27017 -v /root/mongodb-container/db:/data/db mongo:3.2 mongod --replSet chat
- initialize the replica set in the mongo shell: rs.initiate()

Run

git clone https://github.com/zzt93/syncer
cd syncer/ && mvn package
# /path/to/config/: producer.yml, consumer.yml, password-file
# use `-XX:+UseParallelOldGC` if you have less memory and a lower event input rate
# use `-XX:+UseG1GC` if you have at least 4 GB of memory and an event input rate above 2*10^4/s
java -server -XX:+UseG1GC -jar ./syncer-core/target/syncer-core-1.0-SNAPSHOT.jar [--debug] [--port=40000] [--config=/absolute/path/to/syncerConfig.yml] --producerConfig=/absolute/path/to/producer.yml --consumerConfig=/absolute/path/to/consumer1.yml,/absolute/path/to/consumer2.yml
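
Because --consumerConfig takes a comma-separated list, a small wrapper can assemble it from individual files. This is only a sketch: the helper join_by_comma is a hypothetical name, the paths are the placeholders from the command above, and the script prints the command rather than running it.

```shell
#!/bin/sh
# Join all arguments with commas: a b c -> a,b,c
# (the function body runs in a subshell, so the IFS change does not leak)
join_by_comma() ( IFS=,; printf '%s\n' "$*" )

JAR=./syncer-core/target/syncer-core-1.0-SNAPSHOT.jar
PRODUCER=/absolute/path/to/producer.yml
CONSUMERS=$(join_by_comma /absolute/path/to/consumer1.yml /absolute/path/to/consumer2.yml)

# Dry run: print the command instead of executing it; remove `echo` to start Syncer.
echo java -server -XX:+UseG1GC -jar "$JAR" \
  "--producerConfig=$PRODUCER" "--consumerConfig=$CONSUMERS"
```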

Complete, usable sample configs can be found under test/config/, for example test/config/simplest

How to?

If you have questions about how to use Syncer, or you find a bug, open an issue. I will handle it as soon as I can.

FAQ

Q: "Got error produce response in correlation id xxx on topic-partition xxx.xxPartition-0, splitting and retrying (5 attempts left). Error: MESSAGE_TOO_LARGE"?
- A: Reduce the producer's batch.size, or configure Kafka to accept larger messages
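
The two options in this answer could look roughly like the sketch below. batch.size is a standard Kafka producer setting and max.message.bytes is the standard topic-level size limit; the file name kafka-producer-sketch.properties, the broker address, the <topic> placeholder, and the byte values are illustrative assumptions. How producer properties are passed through Syncer's consumer config is not shown here.

```shell
#!/bin/sh
# Hypothetical output path; override with PRODUCER_PROPS=/some/where.properties
PRODUCER_PROPS="${PRODUCER_PROPS:-./kafka-producer-sketch.properties}"

# Option 1: keep producer batches small (16384 bytes is Kafka's default batch.size).
cat > "$PRODUCER_PROPS" <<'EOF'
batch.size=16384
EOF

# Option 2 (not executed here): raise the topic's size limit instead, e.g. to 2 MiB.
# Replace <topic> and the broker address with your own values.
#   kafka-configs.sh --bootstrap-server localhost:9092 --alter \
#     --entity-type topics --entity-name <topic> \
#     --add-config max.message.bytes=2097152
```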

Used In Production

Search system: search data sync
Micro-service: auth/recommend/chat data sync
- Sync requirement: low latency, high availability
Join table: avoid joins in the production environment; trade space for speed by pre-joining tables
- Sync requirement: low latency, high availability
Kafka: sync data to Kafka for other heterogeneous systems to consume
Data recovery: for cases where an entity was dropped by mistake, or where you know where to start and end
Alter table sync:
- ALTER TABLE is very slow in MySQL
- MySQL 8.0: InnoDB now supports instant ADD COLUMN
Data warehouse sync

TODO

See Issue 1

Implementation

Implementation details can be found in doc
