
Address bottleneck on offset committer thread #44

Open
mauliksoneji opened this issue Apr 6, 2020 · 0 comments

Comments


mauliksoneji commented Apr 6, 2020

Problem
Currently, there is only one offset committer thread that acknowledges successful consumption back to Kafka. As per the Beast architecture, the Consumer, BQ Worker, and Acknowledger threads work independently and are connected by blocking queues.

The push operation that puts Kafka messages onto a blocking queue does not block indefinitely; instead, a timeout is specified for getting a free slot on the queue when pushing a batch of Kafka messages.
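
For reference, a timed push onto a bounded queue looks roughly like the sketch below, using `java.util.concurrent.BlockingQueue`; the element type, capacity, and timeout values are illustrative placeholders rather than Beast's actual configuration.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of the timed push described above.
class TimedPushSketch {
    private final BlockingQueue<List<String>> commitQueue = new LinkedBlockingQueue<>(100);

    boolean pushBatch(List<String> batch) throws InterruptedException {
        // offer() waits up to the timeout for a free slot and returns false
        // (instead of blocking forever) if the queue is still full.
        return commitQueue.offer(batch, 200, TimeUnit.MILLISECONDS);
    }
}
```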

Since we can spawn any number of BQ Workers, the commit queue processed by the Acknowledger fills up, and even with sufficiently high timeouts it stays full because of the high load of messages on the single Acknowledger.

We need a mechanism to increase the processing capacity of the Acknowledger thread so that it does not become the bottleneck for the application.

Approaches

  1. Wait indefinitely when adding a batch to the commit queue
    Currently, we only wait for a limited time to get a slot in the commit queue; if the queue is still full after that, the process exits.
    One idea is to wait indefinitely to push data to the commit queue. That way, even when the queue gets full, the process does not restart.

Disadvantages:
We push data onto the queue synchronously, so if the push to the commit queue takes a long time we are bottlenecked on it and are essentially using only one thread to push data to BigQuery. This results in a significant performance degradation and diverges from Beast's philosophy of scaling.
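
A minimal sketch of this variant, assuming the commit queue is a standard `BlockingQueue` as above (the class and type names are illustrative, not Beast's actual ones):

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch: wait forever for a slot instead of failing after a timeout.
class IndefinitePushSketch {
    private final BlockingQueue<List<String>> commitQueue;

    IndefinitePushSketch(BlockingQueue<List<String>> commitQueue) {
        this.commitQueue = commitQueue;
    }

    void pushBatch(List<String> batch) throws InterruptedException {
        // put() blocks the calling BQ Worker thread until the Acknowledger frees
        // a slot, so a slow Acknowledger stalls every worker at this call; this is
        // the synchronous bottleneck described in the disadvantages above.
        commitQueue.put(batch);
    }
}
```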

  2. Batch commits
    Currently, we send one acknowledgement per batch.
    The idea is to club the acknowledgements together over a certain period of time and then send a single acknowledgement.

With this batch-commit approach, we need to make sure that there is no data loss: an offset should only be committed after every batch up to that offset has actually been written to BigQuery.
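
A rough sketch of what clubbing acknowledgements could look like, assuming the Acknowledger tracks the highest completed offset per partition and flushes them together on a timer. The class and method names here are hypothetical, not Beast's actual implementation, and the sketch assumes batches within a partition complete in order; out-of-order completion would need extra bookkeeping to avoid committing past an unfinished batch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

// Hypothetical batch-commit sketch: acknowledgements are accumulated and
// committed together once per interval instead of once per batch.
class BatchOffsetCommitter {
    private final KafkaConsumer<byte[], byte[]> consumer;
    private final Map<TopicPartition, OffsetAndMetadata> pending = new ConcurrentHashMap<>();

    BatchOffsetCommitter(KafkaConsumer<byte[], byte[]> consumer) {
        this.consumer = consumer;
    }

    // Called by BQ Workers after a batch has been written to BigQuery.
    void acknowledge(TopicPartition partition, long lastOffset) {
        // Keep only the highest acknowledged offset per partition; offset + 1 is
        // the position Kafka should resume from after a restart.
        pending.merge(partition, new OffsetAndMetadata(lastOffset + 1),
                (current, candidate) -> candidate.offset() > current.offset() ? candidate : current);
    }

    // Called periodically (e.g. from a scheduled task). Note that KafkaConsumer is
    // not thread-safe, so this must run on (or be synchronized with) the thread
    // that owns the consumer.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        Map<TopicPartition, OffsetAndMetadata> snapshot = new HashMap<>(pending);
        consumer.commitSync(snapshot);
        // Drop entries only if they have not been superseded by a newer offset.
        snapshot.forEach(pending::remove);
    }
}
```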
