Theory:
Per the documentation for the parameter auto.offset.reset: Specifies what MapR Streams should do when there is no initial offset, such as when a consumer starts reading from a partition.
- earliest -- Reset the offset to the offset of the earliest message in the partition.
- latest -- Reset the offset to the offset of the latest message in the partition. This is the default value.
- none -- Throws a NoOffsetForPartitionException when the consumer next polls for messages in its subscription and no offset exists. The consumer must unsubscribe from the partition before polling functions correctly.
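As a rough illustration of how this setting is typically supplied, the sketch below builds a consumer Properties object and rejects any value other than the three listed above. The class name OffsetResetConfig, the helper consumerProps, and the group id are assumptions made for illustration and are not taken from SampleConsumer. The Kafka client library is deliberately left out so the snippet runs on its own; in a real consumer the resulting Properties would be passed to the KafkaConsumer constructor.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class OffsetResetConfig {
    // The three values described in the documentation excerpt above.
    static final List<String> ALLOWED = Arrays.asList("earliest", "latest", "none");

    // Hypothetical helper: builds consumer properties with the given reset policy.
    static Properties consumerProps(String resetPolicy) {
        if (!ALLOWED.contains(resetPolicy)) {
            throw new IllegalArgumentException("Unsupported auto.offset.reset: " + resetPolicy);
        }
        Properties props = new Properties();
        // Assumed placeholder group id, not from the original sample.
        props.setProperty("group.id", "sample-group");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // The setting under discussion.
        props.setProperty("auto.offset.reset", resetPolicy);
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("earliest");
        System.out.println("auto.offset.reset=" + props.getProperty("auto.offset.reset"));
    }
}
```

Validating the value up front is only a convenience in this sketch; it makes a typo such as "oldest" fail before the consumer is created.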
Experiment:
1. Default value auto.offset.reset="latest"
With the default code in SampleConsumer:
mapr openkb.stream.SampleConsumer
The consumer will start to fetch the latest messages in that topic (look at the "offset"):
Consumed Record: ConsumerRecord(topic = /stream/s1:info, partition = 0, offset = 354989820, timestamp = 1461251182624, producer = root, key = null, value = Msg 0)
Consumed Record: ConsumerRecord(topic = /stream/s1:info, partition = 0, offset = 354989821, timestamp = 1461251182624, producer = root, key = null, value = Msg 1)
Consumed Record: ConsumerRecord(topic = /stream/s1:info, partition = 0, offset = 354989822, timestamp = 1461251182624, producer = root, key = null, value = Msg 5)
... ...
If no new messages are coming in, the consumer waits until the poll timeout is triggered.
2. auto.offset.reset="earliest"
We just change auto.offset.reset to "earliest" in the SampleConsumer code and then execute it:
mapr openkb.stream.SampleConsumer
The consumer will start to fetch the earliest messages in that topic (look at the "offset"):
Consumed Record: ConsumerRecord(topic = /stream/s1:info, partition = 0, offset = 1, timestamp = 1460995818285, producer = root, key = null, value = Msg 0)
Consumed Record: ConsumerRecord(topic = /stream/s1:info, partition = 0, offset = 2, timestamp = 1460995818285, producer = root, key = null, value = Msg 4)
Consumed Record: ConsumerRecord(topic = /stream/s1:info, partition = 0, offset = 3, timestamp = 1460995818285, producer = root, key = null, value = Msg 1)
... ...
This is similar to a full table scan: the consumer reads the partition from its earliest message onward.
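The two runs above can be summarised with a small, simplified model of how a starting position is chosen. This is only a sketch of the documented behaviour, not MapR Streams' actual implementation: the class OffsetResetModel and the method startPosition are hypothetical, IllegalStateException stands in for NoOffsetForPartitionException, and the offsets 1 and 354989820 are the first offsets observed in the outputs above.

```java
import java.util.OptionalLong;

public class OffsetResetModel {
    // Hypothetical model: picks the position a consumer starts reading from.
    // 'committed' is empty when there is no initial offset for the partition.
    static long startPosition(String policy, OptionalLong committed,
                              long earliestOffset, long endPosition) {
        // auto.offset.reset only matters when no initial offset exists.
        if (committed.isPresent()) {
            return committed.getAsLong();
        }
        switch (policy) {
            case "earliest":
                return earliestOffset;   // read everything from the start
            case "latest":
                return endPosition;      // read only messages arriving from now on
            case "none":
                // Stand-in for NoOffsetForPartitionException.
                throw new IllegalStateException("No offset for partition");
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }

    public static void main(String[] args) {
        long earliest = 1L;          // first offset seen in experiment 2
        long end = 354989820L;       // first offset seen in experiment 1
        System.out.println("earliest -> "
                + startPosition("earliest", OptionalLong.empty(), earliest, end));
        System.out.println("latest   -> "
                + startPosition("latest", OptionalLong.empty(), earliest, end));
    }
}
```

In this model a consumer that already has an offset resumes from it regardless of the policy, which matches the documentation's wording that the setting applies when there is no initial offset.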