
Add read balancer info #4

Open
JoshSalomon opened this issue Jun 22, 2024 · 5 comments

Comments

@JoshSalomon

Hi JJ - great page!
I believe it is worth adding information about the read balancer, especially since the Squid version will support OSDs of different sizes. Would you like to work with @ljflores and me on it?

@TheJJ
Owner

TheJJ commented Jun 22, 2024

sure, what do you have in mind?

@JoshSalomon
Author

Wondering where to start: Have you heard anything about the read balancer (available since Reef)?

@TheJJ
Owner

TheJJ commented Jun 25, 2024

Yes, I saw the initial presentation slides and wondered how it compares to my balancer, but I didn't prioritize just remapping primaries so far; I think this can be added too.
I haven't used it in a production cluster yet.

Thinking about balancers, it seems the whole CRUSH approach may not be ideal after all; an efficient pg->osd mapping lookup table is probably suitable for nearly all clusters. Then we wouldn't have to fight CRUSH with one hack after another to nudge it toward the desired mapping, instead of (re)mapping it directly.
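To make the lookup-table idea concrete, here is a minimal sketch (my own illustration, not Ceph code): instead of deriving placement from CRUSH and then patching it with upmap entries, the desired pg -> osd mapping is stored directly, and changing a primary is just reordering a list. The pool name "rbd" and the OSD ids are hypothetical.

```python
# Hypothetical direct pg -> acting-set table; first OSD in each list is the primary.
pg_map = {
    ("rbd", 0): [3, 7, 12],   # pg 0 of a made-up "rbd" pool
    ("rbd", 1): [7, 3, 9],
}

def acting_set(pool, pg):
    """Placement is a plain dictionary lookup, no CRUSH computation."""
    return pg_map[(pool, pg)]

def remap_primary(pool, pg, new_primary):
    """Metadata-only change: reorder the acting set so new_primary leads."""
    osds = pg_map[(pool, pg)]
    osds.remove(new_primary)
    pg_map[(pool, pg)] = [new_primary] + osds

remap_primary("rbd", 0, 7)   # osd 7 becomes primary for pg rbd.0
```

The trade-off, of course, is that the table has to be distributed and kept consistent, which is exactly what CRUSH's computed placement avoids.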

@JoshSalomon
Author

JoshSalomon commented Jun 26, 2024

Ceph improved read balancer.pdf
If I understand correctly, your balancer is a capacity balancer, not a read balancer - but I only heard your presentation a while ago and did not dive into the code.
The read balancer is a metadata-only operation: it does not move data, so it is a completely different approach and cheap enough to run continuously (more on this later).
The first version (in Reef) just makes sure each OSD gets its fair share of primaries (the read balancer works only on replicated pools, so in each OSD we try to have pg_num/replica_num primaries). Obviously we check this against CRUSH constraints.
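The fairness criterion above can be sketched as follows (my interpretation of the description, not the actual Ceph implementation): each OSD that stores replicas of N PGs should be primary for roughly N / replica_num of them, and the balancer's job is to shrink the deviations.

```python
from collections import Counter

def primary_imbalance(pg_to_osds):
    """pg_to_osds: list of acting sets; the first OSD in each set is the primary.
    Returns, per OSD, (actual primaries) - (fair share of primaries)."""
    held = Counter()       # how many PGs each OSD stores a replica of
    primaries = Counter()  # how many PGs each OSD is primary for
    for acting in pg_to_osds:
        primaries[acting[0]] += 1
        for osd in acting:
            held[osd] += 1
    replica_num = len(pg_to_osds[0])
    # positive -> OSD leads more PGs than its fair share, negative -> fewer
    return {osd: primaries[osd] - held[osd] / replica_num for osd in held}

# Toy map: 4 PGs, 3 OSDs, replica 3; osd 0 is primary for 3 of the 4 PGs.
imbalance = primary_imbalance([[0, 1, 2], [0, 2, 1], [1, 0, 2], [0, 1, 2]])
```

In this toy map, osd 0 comes out over its fair share and osd 2 under it, so a metadata-only primary swap between them would move the map toward balance.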
In Squid, we added functionality that improves cluster performance when the devices are not all the same size. A new pool parameter holds the read_ratio of the IOs to the pool (70 means that 70% of the IOs to the pool are reads and 30% are writes). With this information we can move more reads to the smaller devices and let the larger devices handle fewer reads, so that IOPS per OSD are balanced (assuming the devices have the same performance profile). In the future, we may calculate the read ratio automatically from metrics and make it adaptive (I am not sure this is needed, but it would be easy to implement).
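A toy model of the arithmetic behind this (my own sketch under simplifying assumptions, not the actual algorithm from the presentation): writes hit every replica and are spread in proportion to OSD capacity, while reads are served only by primaries, so the read share of each OSD is the free variable we can use to equalize total device IOPS.

```python
def read_targets(capacities, read_ratio, replica_num, total_ops=1000.0):
    """Return the read IOPS each OSD should serve so that every OSD ends up
    with the same total (read + write) device IOPS. Toy model only; a very
    large device can even come out with a negative target, meaning perfect
    balance is unreachable by shifting reads alone."""
    reads = total_ops * read_ratio            # e.g. read_ratio=0.7 -> 70% reads
    writes = total_ops * (1 - read_ratio)     # client write ops
    device_write_ops = writes * replica_num   # each write lands on every replica
    per_osd = (reads + device_write_ops) / len(capacities)
    total_cap = sum(capacities)
    # write load on an OSD is proportional to its capacity share;
    # reads fill the remaining budget up to the equal per-OSD total
    return [per_osd - device_write_ops * c / total_cap for c in capacities]

# Two small OSDs and one twice as large, 70% reads, 3x replication:
targets = read_targets([1.0, 1.0, 2.0], read_ratio=0.7, replica_num=3)
```

The small devices come out with a larger read target than the big one, matching the intuition above: the big device already absorbs more write IOPS, so the reads should go elsewhere.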
Attached is a presentation explaining the model behind this balancer, with some examples.
If you think this is worth mentioning, Laura and I can open a PR adding the explanation to this Ceph guide.

@TheJJ
Owner

TheJJ commented Jul 7, 2024

Yes, sure, you can add a short section about it :)
