Consumer Offsets and Delivery Semantics

Consumer Offsets

  • Kafka stores the offsets at which a consumer group has been reading.
  • The offsets committed live in a Kafka topic named _consumer_offsets.
  • When a consumer in a group has processed data received from Kafka, it should be committing the offsets.
  • If a consumer dies, it will be able to read back from where it left off thanks to the committed consumer offsets!
-----------------------
-------------------Commited offset ------| |
| | |
| | |
--------------|----------------------------------------| Consumer from |
| 4258 | 4259 | 4260 | 4261 | 4262 | 4263 | | Consumer Group |
--------------|----------------------------------------| |
| | |
| | |
-------------------Reads---------------->| |
-----------------------

Delivery Semantics for Consumers

  • Consumers choose when to commit offsets
  • There are 3 delivery semantics :
    • Atmost once:
      • offsets are committed as soon as the message is received
      • If the processing goes wrong, the message will be lost(it won't be read again)
    • Atleast once:
      • offsets are committed after the message is processed.
      • If the processing goes wrong, the message will be read again.
      • This can result in duplicate processing of messages. Make sure your processing is idempotent(ie processing again the messages won't impact your systems)
    • Exactly once:
      • Can be achieved for Kafka -> Kafka workflows using Kafka Streams API