Delay topics are split by delay interval, with one topic per supported delay.
Each delay topic is consumed by its own dedicated consumer group.
Each time a dedicated consumer group consumes a message, the consumer sleeps until the message's delay has elapsed.
If the sleep lasts longer than the allowed gap between polls (max.poll.interval.ms), Kafka considers the consumer to have crashed and triggers a rebalance.
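The layout above, and the reason sleeping is risky, can be sketched in a few lines. This is only an illustration: the topic and group names are hypothetical, the delay values are the 1, 3, and 10 minute levels used as the example in these notes, and the poll interval is Kafka's default max.poll.interval.ms of 300000 ms (a real deployment may configure it differently).

```python
# Hypothetical topic and consumer group names, one pair per delay level.
DELAY_TOPICS = {
    "delay_1m": {"group": "delay_1m_consumers", "delay_ms": 60_000},
    "delay_3m": {"group": "delay_3m_consumers", "delay_ms": 180_000},
    "delay_10m": {"group": "delay_10m_consumers", "delay_ms": 600_000},
}

# Kafka's default max.poll.interval.ms (5 minutes); assumed unchanged here.
MAX_POLL_INTERVAL_MS = 300_000


def sleep_triggers_rebalance(delay_ms, max_poll_interval_ms=MAX_POLL_INTERVAL_MS):
    """Return True if sleeping for the full delay between two polls
    would exceed the allowed gap between polls."""
    return delay_ms > max_poll_interval_ms


if __name__ == "__main__":
    for topic, cfg in DELAY_TOPICS.items():
        risky = sleep_triggers_rebalance(cfg["delay_ms"])
        print(topic, cfg["group"], "rebalance risk:", risky)
```

Under these assumptions only the 10 minute topic would cross the limit, but raising the poll interval to cover every delay also slows down detection of consumers that are genuinely stuck.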
The consumer pulls a message (suppose the offset is N after consumption) and checks the remaining delay time t.
The consumer pauses consumption and waits for the delay time t.
During the pause, the consumer still sends poll requests, but they do not fetch any data.
After the wait, the consumer resumes from offset = N.
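The pause and resume steps above can be simulated without a broker. The sketch below uses a hypothetical in-memory FakeConsumer in place of a real Kafka client, and a list of integer clock readings in place of wall-clock time. It holds the not-yet-due message in memory, keeps polling while paused, and resumes from the position after the held message once the delay expires.

```python
from dataclasses import dataclass


@dataclass
class Message:
    offset: int
    deliver_at: int  # clock reading at which the delay expires
    payload: str


@dataclass
class FakeConsumer:
    """In-memory stand-in for a consumer of one delay partition (hypothetical)."""
    log: list
    position: int = 0
    paused: bool = False
    polls: int = 0

    def poll(self):
        # Every call counts as a poll, so the consumer stays "alive",
        # but a paused consumer receives no data.
        self.polls += 1
        if self.paused or self.position >= len(self.log):
            return None
        msg = self.log[self.position]
        self.position += 1
        return msg

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False


def run(consumer, clock_ticks):
    """Drive the consumer over the clock readings; return (time, payload) pairs."""
    forwarded = []
    held = None  # message waiting for its delay to expire
    for now in clock_ticks:
        if held is not None and now >= held.deliver_at:
            forwarded.append((now, held.payload))
            held = None
            consumer.resume()  # continue from the position after the held message
        msg = consumer.poll()
        if msg is None:
            continue
        if msg.deliver_at > now:
            held = msg
            consumer.pause()  # not due yet: stop fetching, keep polling
        else:
            forwarded.append((now, msg.payload))
    return forwarded


if __name__ == "__main__":
    consumer = FakeConsumer(log=[Message(0, 3, "a"), Message(1, 5, "b")])
    print(run(consumer, range(8)), "polls:", consumer.polls)
```

A real client would use its own pause and resume calls on the assigned partitions. The point of the simulation is only that poll keeps being called on every tick while no new data is fetched.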
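Commit the offset first, or forward to the business topic first?
Commit first: if the machine crashes between the commit and the forward, the message is never delivered to the business topic.
Forward first: if the machine crashes between the forward and the commit, the message is forwarded again after recovery. As long as the message receiver guarantees idempotency, this is the preferable option.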
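The trade-off between the two orderings can be made concrete with a small simulation of the forward-first order. Everything here is hypothetical: IdempotentReceiver stands in for the business-side consumer and deduplicates by message ID, and crash_before_commit_at simulates a crash after forwarding but before the offset commit.

```python
class IdempotentReceiver:
    """Hypothetical business-side receiver that deduplicates by message ID."""

    def __init__(self):
        self.seen = set()
        self.processed = []

    def handle(self, msg_id, payload):
        if msg_id in self.seen:
            return False  # duplicate delivery, ignored
        self.seen.add(msg_id)
        self.processed.append(payload)
        return True


def forward_then_commit(messages, committed_offset, receiver,
                        crash_before_commit_at=None):
    """Forward each message starting at committed_offset, then commit it.

    Returns the committed offset reached. If crash_before_commit_at equals
    an offset, the run stops after forwarding that message and before
    committing it, which models a crash in the middle.
    """
    for offset in range(committed_offset, len(messages)):
        msg_id, payload = messages[offset]
        receiver.handle(msg_id, payload)   # step 1: forward to the business side
        if offset == crash_before_commit_at:
            return committed_offset        # crash: this commit never happened
        committed_offset = offset + 1      # step 2: commit the offset
    return committed_offset


if __name__ == "__main__":
    messages = [("m1", "a"), ("m2", "b"), ("m3", "c")]
    receiver = IdempotentReceiver()
    offset = forward_then_commit(messages, 0, receiver, crash_before_commit_at=1)
    offset = forward_then_commit(messages, offset, receiver)  # restart
    print(offset, receiver.processed)
```

After the simulated restart, the second message is forwarded twice but processed once, so nothing is lost and nothing is applied twice. With the commit-first order the same crash would skip that message entirely.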
Delay times must be fixed in advance. For example, in the flowchart above, the delay is set to 1, 3, or 10 minutes.
Load may differ dramatically across the delay intervals. For example, most of the traffic (say 80%) might land on the 3-minute delay.
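Because only a fixed set of delays exists, a producer that wants an arbitrary delay has to map it onto one of the supported levels. The sketch below uses the 1, 3, and 10 minute levels from the example. Rounding up to the next level is an assumed policy, not something these notes prescribe, and pick_delay_level is a hypothetical helper.

```python
import bisect

# Supported delay levels in minutes, sorted ascending (from the example in the text).
DELAY_LEVELS_MIN = [1, 3, 10]


def pick_delay_level(requested_min):
    """Return the smallest supported delay that is >= requested_min.

    Rounding up is an assumed policy: the message is never delivered early,
    but it may wait longer than requested.
    """
    i = bisect.bisect_left(DELAY_LEVELS_MIN, requested_min)
    if i == len(DELAY_LEVELS_MIN):
        raise ValueError(f"no delay level supports {requested_min} minutes")
    return DELAY_LEVELS_MIN[i]


if __name__ == "__main__":
    for requested in (0.5, 2, 3, 7):
        print(requested, "->", pick_delay_level(requested))
```

A request for 2 minutes ends up on the 3 minute level under this policy, which is one way traffic can pile up on a single delay interval.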