Speed up leader search #320

cole-miller · 2024-09-24T04:00:31Z

This PR contains two changes that are intended to address #300 by cutting the number of dqlite connections that need to be opened in typical cases.

The first change is to remember the address of the cluster leader whenever we connect to it successfully, and to try this remembered address first when opening a new leader connection. When the cluster is stable, Connector.Connect now opens only one connection; previously it would open on the order of N connections, where N is the cluster size. The tradeoff is that Connector.Connect is now slower and less efficient when the cluster is not stable, because a stale remembered address costs an extra connection attempt before the usual search begins. This optimization applies to both client.FindLeader and the driver implementation.
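To illustrate the idea (this is a simplified sketch, not the code in this PR), the "try the remembered leader first" logic could look roughly like the following. The names `leaderTracker`, `dialFunc`, and `connect` are hypothetical, the fallback scan is sequential for simplicity, and `dial` stands in for opening a dqlite connection and asking the node whether it is the leader.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// leaderTracker is a hypothetical helper that remembers the last address
// at which a leader connection succeeded. It is not a go-dqlite type.
type leaderTracker struct {
	mu         sync.Mutex
	lastLeader string
}

func (t *leaderTracker) get() string {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.lastLeader
}

func (t *leaderTracker) set(addr string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.lastLeader = addr
}

// dialFunc is an assumed stand-in for "open a connection to addr and
// report whether that node is currently the leader".
type dialFunc func(addr string) (isLeader bool, err error)

// connect tries the remembered leader first and falls back to scanning
// every known node. It returns the leader address and the number of
// connections that had to be opened.
func connect(t *leaderTracker, nodes []string, dial dialFunc) (string, int, error) {
	attempts := 0
	if addr := t.get(); addr != "" {
		attempts++
		if ok, err := dial(addr); err == nil && ok {
			return addr, attempts, nil
		}
		t.set("") // the remembered address is stale; forget it
	}
	for _, addr := range nodes {
		attempts++
		ok, err := dial(addr)
		if err != nil || !ok {
			continue
		}
		t.set(addr)
		return addr, attempts, nil
	}
	return "", attempts, errors.New("no leader found")
}

func main() {
	nodes := []string{"10.0.0.1:9001", "10.0.0.2:9001", "10.0.0.3:9001"}
	leader := "10.0.0.3:9001"
	dial := func(addr string) (bool, error) { return addr == leader, nil }

	t := &leaderTracker{}
	_, cold, _ := connect(t, nodes, dial) // nothing remembered: full scan
	_, warm, _ := connect(t, nodes, dial) // remembered leader: one connection
	leader = "10.0.0.1:9001"              // leadership moves
	_, stale, _ := connect(t, nodes, dial) // one wasted attempt, then scan
	fmt.Println(cold, warm, stale)         // prints: 3 1 2
}
```

The third call shows the tradeoff described above: when the remembered address is stale, one extra connection is opened before the scan finds the new leader.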

The second change is to enable client.FindLeader to reuse an existing connection to the leader instead of opening a new one. This is valid because the operations that can be performed on the connection using the returned Client do not depend on the logical state of the connection (open database, prepared statements). With the first change in place and a stable leader, this saves the one remaining new connection per client.FindLeader call. The long-lived connection is checked for validity and leadership before it is returned for reuse.
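A minimal sketch of the reuse check, under stated assumptions: the `conn` interface, `leaderCache`, and `fakeConn` below are hypothetical and only model the behaviour described above (check validity, check leadership, otherwise open a fresh connection). They are not the types used in this PR.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// conn is an assumed, minimal view of a connection: it can be probed for
// validity, asked about leadership, and closed.
type conn interface {
	Ping() error
	IsLeader() (bool, error)
	Close() error
}

// leaderCache holds at most one long-lived connection to the leader.
type leaderCache struct {
	mu sync.Mutex
	c  conn
}

// acquire returns the cached connection if it is still valid and still
// points at the leader; otherwise it discards it and opens a new one.
// The second result reports whether the cached connection was reused.
func (lc *leaderCache) acquire(open func() (conn, error)) (conn, bool, error) {
	lc.mu.Lock()
	defer lc.mu.Unlock()
	if lc.c != nil {
		if lc.c.Ping() == nil {
			if ok, err := lc.c.IsLeader(); err == nil && ok {
				return lc.c, true, nil
			}
		}
		lc.c.Close() // broken or no longer the leader
		lc.c = nil
	}
	c, err := open()
	if err != nil {
		return nil, false, err
	}
	lc.c = c
	return c, false, nil
}

// fakeConn is a test double used only to exercise the sketch.
type fakeConn struct{ leader, closed bool }

func (f *fakeConn) Ping() error {
	if f.closed {
		return errors.New("connection closed")
	}
	return nil
}
func (f *fakeConn) IsLeader() (bool, error) { return f.leader, nil }
func (f *fakeConn) Close() error            { f.closed = true; return nil }

func main() {
	opened := 0
	open := func() (conn, error) {
		opened++
		return &fakeConn{leader: true}, nil
	}
	lc := &leaderCache{}
	c1, reused1, _ := lc.acquire(open)
	c2, reused2, _ := lc.acquire(open)
	fmt.Println(reused1, reused2, c1 == c2, opened) // prints: false true true 1

	c2.(*fakeConn).leader = false // the node loses leadership
	_, reused3, _ := lc.acquire(open)
	fmt.Println(reused3, opened) // prints: false 2
}
```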

Both of these changes rely on storing some new state in the NodeStore using some fun embedding tricks. I did it this way because the NodeStore is the only object that is passed around to all the right places and lives long enough to persist state between connection attempts.
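For readers unfamiliar with the pattern, here is one way struct embedding can attach extra state to an interface value that is already passed everywhere. This is only an illustration: the `NodeStore` interface below is a simplified local stand-in, and `memStore`, `trackingStore`, `withTracking`, and `rememberLeader` are hypothetical names, not the ones used in this PR. Locking is omitted for brevity; shared state like this would need synchronization in real code.

```go
package main

import "fmt"

// NodeStore is a simplified stand-in for a node store interface.
type NodeStore interface {
	Get() ([]string, error)
}

// memStore is a trivial in-memory NodeStore.
type memStore struct{ addrs []string }

func (m *memStore) Get() ([]string, error) { return m.addrs, nil }

// trackingStore embeds a NodeStore, so it satisfies the interface through
// method promotion while carrying extra state alongside it.
type trackingStore struct {
	NodeStore
	lastLeader string
}

// withTracking wraps store unless it is already a trackingStore, so the
// extra state survives being passed through code that wraps again.
func withTracking(store NodeStore) *trackingStore {
	if ts, ok := store.(*trackingStore); ok {
		return ts
	}
	return &trackingStore{NodeStore: store}
}

// rememberLeader models code deep in the connection path that receives
// only a NodeStore. A type assertion recovers the extra state if present.
func rememberLeader(store NodeStore, addr string) bool {
	ts, ok := store.(*trackingStore)
	if !ok {
		return false // plain store: nothing to remember
	}
	ts.lastLeader = addr
	return true
}

func main() {
	var store NodeStore = withTracking(&memStore{addrs: []string{"10.0.0.1:9001"}})

	addrs, _ := store.Get() // promoted from the embedded store
	fmt.Println(addrs)      // prints: [10.0.0.1:9001]

	fmt.Println(rememberLeader(store, "10.0.0.1:9001")) // prints: true
	fmt.Println(withTracking(store).lastLeader)         // prints: 10.0.0.1:9001
}
```

The appeal of this shape is that no function signature has to change: callers keep passing a NodeStore, and only the code that cares about the extra state looks for it.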

On my machine, using @masnax's repro script, these two changes combined cut the spikes in CPU usage from 30% to 8-10%, with the first change being responsible for most of that improvement. The remaining spike is due to opening N connections (in parallel) within makeRolesChanges, and could perhaps be dealt with by augmenting the NodeStore further with a pool of connections to all nodes instead of just the last known leader, but I've left that for a possible follow-up.

Signed-off-by: Cole Miller <cole.miller@canonical.com>

cole-miller added 2 commits September 23, 2024 23:28

Remember the address of the last known leader

55d1495

Signed-off-by: Cole Miller <cole.miller@canonical.com>

Introduce a reusable leader connection

3bd3a69

Signed-off-by: Cole Miller <cole.miller@canonical.com>
