Consumer Offsets and Delivery Semantics
Consumer Offsets
- Kafka stores the offsets at which a consumer group has been reading.
- The offsets committed live in a Kafka topic named _consumer_offsets.
- When a consumer in a group has processed data received from Kafka, it should be committing the offsets.
- If a consumer dies, it will be able to read back from where it left off thanks to the committed consumer offsets!
------------------------------------------Commited offset ------| || | || | |--------------|----------------------------------------| Consumer from || 4258 | 4259 | 4260 | 4261 | 4262 | 4263 | | Consumer Group |--------------|----------------------------------------| || | || | |-------------------Reads---------------->| |-----------------------
Delivery Semantics for Consumers
- Consumers choose when to commit offsets
- There are 3 delivery semantics :
- Atmost once:
- offsets are committed as soon as the message is received
- If the processing goes wrong, the message will be lost(it won't be read again)
- Atleast once:
- offsets are committed after the message is processed.
- If the processing goes wrong, the message will be read again.
- This can result in duplicate processing of messages. Make sure your processing is idempotent(ie processing again the messages won't impact your systems)
- Exactly once:
- Can be achieved for Kafka -> Kafka workflows using Kafka Streams API
- Atmost once: