Closed
Description
Status Quo Example
Setup: 3 brokers, 2 consumers (A and B) in the same group, topic with 4 partitions.
What should happen: A coordinator assigns partitions fairly -- consumer A gets partitions 0, 1 and consumer B gets partitions 2, 3. Each message is processed exactly once.
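As a rough illustration of the expected outcome, here is a minimal sketch of a fair, contiguous-range assignment. The function name `assign_partitions` is hypothetical and the range strategy is an assumption; the project's real assignor may differ.

```python
def assign_partitions(members, num_partitions):
    """Split partitions 0..num_partitions-1 into contiguous ranges, one per member."""
    members = sorted(members)
    base, extra = divmod(num_partitions, len(members))
    assignment = {}
    start = 0
    for i, member in enumerate(members):
        # The first `extra` members take one additional partition each.
        count = base + (1 if i < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


print(assign_partitions(["A", "B"], 4))  # {'A': [0, 1], 'B': [2, 3]}
```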
What actually happens in 2 out of 3 cases (whenever consumer B lands on a different broker than consumer A):
- Consumer A connects and lands on broker 0. Broker 0 becomes the coordinator for this group, assigns all 4 partitions to consumer A, and saves this to etcd.
- Consumer B connects and lands on broker 1. Broker 1 loads the group state from etcd, sees consumer A exists, adds consumer B, and starts reassigning partitions. But consumer A doesn't know this is happening -- it's still talking to broker 0, which has no idea broker 1 is doing anything with this group.
Consumer A never responds to broker 1's reassignment (it doesn't know about it), so after 30 seconds broker 1 gives up waiting and assigns all 4 partitions to consumer B alone.
Now both consumers think they own all 4 partitions. Both process every message. Both save their progress. Neither knows the other exists.
Result: Every message gets processed twice, forever, with no error or warning anywhere.
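The sequence above can be reproduced with a toy model. This is only a sketch: the `Broker` class, the plain dict standing in for etcd, and the `rejoining` parameter are all hypothetical, and the 30-second timeout is collapsed into "members that do not rejoin through this broker are dropped".

```python
NUM_PARTITIONS = 4


class Broker:
    """Toy coordinator: keeps its own in-memory view and persists to a shared store."""

    def __init__(self, store):
        self.store = store  # plain dict standing in for etcd
        self.view = {}      # member -> partitions, as this broker believes

    def join(self, member, rejoining=()):
        # Load the member list last persisted by any broker, then add the newcomer.
        known = set(self.store.get("members", [])) | {member}
        # Existing members that never rejoin via this broker are dropped on timeout.
        alive = sorted(m for m in known if m == member or m in rejoining)
        partitions = list(range(NUM_PARTITIONS))
        self.view = {m: partitions[i::len(alive)] for i, m in enumerate(alive)}
        self.store["members"] = alive
        return self.view[member]


store = {}
broker0, broker1 = Broker(store), Broker(store)

owned_by_a = broker0.join("A")  # A lands on broker 0
owned_by_b = broker1.join("B")  # B lands on broker 1; A never rejoins there

# Consumer A still acts on what broker 0 told it; nothing ever revoked it.
duplicated = sorted(set(owned_by_a) & set(owned_by_b))
print(duplicated)  # [0, 1, 2, 3]
```

Every partition appears in both assignments, and neither broker has any state that would flag the overlap.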
Proposal
Fix this using the same lease-based mechanism proposed in #120, i.e. route all requests for a given group to a single broker.
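A minimal sketch of what per-group routing could look like, assuming one lease per group ID. The details of #120 are not reproduced here: `LeaseTable`, `route`, the TTL value, and the in-memory dict are all hypothetical stand-ins, and a real implementation would need an atomic compare-and-set in the shared store rather than a local dict.

```python
class LeaseTable:
    """In-memory stand-in for a shared lease store (hypothetical)."""

    def __init__(self):
        self._leases = {}  # group_id -> (owner_broker, expires_at)

    def acquire(self, group_id, broker, now, ttl=10.0):
        """Return the broker that owns the group's lease after this call."""
        lease = self._leases.get(group_id)
        if lease is None or lease[1] <= now or lease[0] == broker:
            # Lease is free, expired, or already ours: take or renew it.
            self._leases[group_id] = (broker, now + ttl)
            return broker
        # Another broker holds a live lease: it stays the coordinator.
        return lease[0]


def route(leases, group_id, receiving_broker, now):
    """Decide which broker must handle a request for group_id."""
    return leases.acquire(group_id, receiving_broker, now)


leases = LeaseTable()
first = route(leases, "group-1", "broker-0", now=0.0)   # broker-0 takes the lease
second = route(leases, "group-1", "broker-1", now=1.0)  # forwarded to broker-0
print(first, second)  # broker-0 broker-0
```

With this shape, consumer B's join request arriving at broker 1 would be forwarded to broker 0, so a single coordinator sees both members and can rebalance them together.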