Kafka Consumer Group
Generally, a Kafka consumer belongs to a particular consumer group. A consumer group basically represents the name of an application. To consume messages as part of a consumer group, the '--group' option is used.
Let's see how consumers consume messages from Kafka topics:
Step1: Open the Windows command prompt.
Step2: Use the '--group' option as: 'kafka-console-consumer --bootstrap-server localhost:9092 --topic <topic_name> --group <group_name>'. Give some name to the group. Press enter.
In the above snapshot, the name of the group is 'first_app'. No messages are displayed because no new messages were produced to this topic. If the '--from-beginning' option is added for a group that has no committed offsets yet, all the previous messages will be displayed.
Step3: To view some new messages, produce some messages from the producer console (as done in the previous section).
So, the new messages produced by the producer can be seen in the consumer's console.
Step4: So far, a single consumer was reading data in the group. Let's create more consumers to understand the power of a consumer group. For that, open a new terminal and type the exact same consumer command:
'kafka-console-consumer.bat --bootstrap-server 127.0.0.1:9092 --topic <topic_name> --group <group_name>'.
In the above snapshot, the producer is sending data to the Kafka topic, and the two consumers are consuming the messages. Look at the sequence of the messages. As three partitions were created for the 'myfirst' topic (discussed earlier), the messages are split between the consumers by partition: with two consumers, one reads from two partitions and the other from one. Ordering is preserved only within a partition.
We can create more consumers under the same group, and the partitions will be divided among them. As a partition is read by only one consumer of a group at a time, any consumers beyond the number of partitions remain idle. Try it yourself to understand better.
Note: The group id must be the same for all the consumers; only then will the messages be split between them.
However, if any of the consumers is terminated, the partitions will be reassigned to the active consumers, and these active consumers will receive the messages.
So, in this way, various consumers in a consumer group consume the messages from the Kafka topics.
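The way partitions are shared out within a group, and handed over when a consumer stops, can be pictured with a small model. The sketch below is plain Python with no Kafka client involved. The function name 'assign_partitions', the consumer names, and the round-robin rule are assumptions made for illustration; Kafka's real assignors may place partitions differently.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers round-robin (simplified model)."""
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2]  # the three partitions of the 'myfirst' topic

# Two consumers in the same group: one gets two partitions, the other one.
two = assign_partitions(partitions, ["consumer-1", "consumer-2"])

# Four consumers: only three partitions exist, so one consumer stays idle.
four = assign_partitions(
    partitions, ["consumer-1", "consumer-2", "consumer-3", "consumer-4"]
)

# consumer-2 is terminated: all partitions go to the remaining consumer.
after_stop = assign_partitions(partitions, ["consumer-1"])

print(two)
print(four)
print(after_stop)
```

In this model every partition always has exactly one owner within the group, which is the property that lets the group split the work without reading any message twice.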
Consumer with Keys
When a producer attaches a key to the data, the key determines the partition: with the default partitioner, messages with the same key are always stored in the same partition. If no key is specified, the data can go to any partition. When a consumer prints the key of a message that was produced without one, the key is displayed as null. The 'print.key' and 'key.separator' properties are required to display keys while consuming messages from the Kafka topics. The command used is:
'kafka-console-consumer --bootstrap-server localhost:9092 --topic <topic_name> --from-beginning --property print.key=true --property key.separator=,'
Using the above command, the consumer can read data with the specified keys.
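To make the role of the key concrete, here is a minimal sketch in plain Python. Both helper names are hypothetical, and the CRC32 hash is only a stand-in chosen because it is deterministic and available in the standard library; it is not the hash Kafka itself uses. The second helper mimics how a record might be shown when 'print.key=true' and 'key.separator=,' are set.

```python
import zlib


def choose_partition(key, num_partitions):
    """Map a key to a partition by hashing it (stand-in for Kafka's partitioner)."""
    # Assumption: CRC32 replaces Kafka's own hash purely for illustration.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def format_record(key, value, separator=","):
    """Render key and value joined by the separator; a missing key shows as null."""
    shown_key = "null" if key is None else key
    return f"{shown_key}{separator}{value}"


# The same key always maps to the same partition.
first = choose_partition("user-42", 3)
second = choose_partition("user-42", 3)

print(first == second)
print(format_record("user-42", "logged in"))
print(format_record(None, "no key attached"))
```

The point of hashing is that the partition depends only on the key and the partition count, so related messages stay together and keep their relative order.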
More about Consumer Group
The '--from-beginning' option is used to read the messages from the start of the topic (discussed earlier). Using it with a consumer group gives the following output:
It can be noticed that a new consumer group, 'second_app', is used to read the messages from the beginning. If the same command is run one more time, it will not display any output. This is because offsets are committed in Apache Kafka. So, once a consumer group has read all the messages written so far, it will read only the new messages the next time.
For example, in the below snapshot, when the '--from-beginning' option is used again, only the new messages are read. This is because all the previous messages were already consumed.
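The effect of committed offsets can be reproduced with a toy model. The sketch below is plain Python and does not talk to Kafka; the class 'GroupOffsets' and the single in-memory list standing in for a partition are assumptions for illustration only.

```python
class GroupOffsets:
    """Toy model of committed offsets for one partition (not a Kafka client)."""

    def __init__(self):
        self.committed = {}  # group name -> next offset to read

    def consume(self, group, log, from_beginning=False):
        """Return the unread messages for the group and commit the new position."""
        if group in self.committed:
            # A committed offset exists, so from_beginning has no effect.
            start = self.committed[group]
        else:
            # No committed offset yet: start at 0 or at the end of the log.
            start = 0 if from_beginning else len(log)
        messages = log[start:]
        self.committed[group] = len(log)
        return messages


log = ["m0", "m1", "m2"]
offsets = GroupOffsets()

first_run = offsets.consume("second_app", log, from_beginning=True)
second_run = offsets.consume("second_app", log, from_beginning=True)

log.append("m3")  # the producer sends one new message
third_run = offsets.consume("second_app", log, from_beginning=True)

print(first_run)   # ['m0', 'm1', 'm2']
print(second_run)  # []
print(third_run)   # ['m3']
```

The second run returns nothing because the group's position was committed at the end of the log, which mirrors the empty output described above.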
Running the 'kafka-consumer-groups' command without arguments prints its documentation, which covers how to list all the groups, describe a group, delete consumer info, or reset consumer group offsets.
It requires a bootstrap server for the clients to perform different functions on the consumer group.
Listing Consumer Groups
The '--list' option is used to list the consumer groups available in the Kafka cluster. The command is used as:
'kafka-consumer-groups.bat --bootstrap-server localhost:9092 --list'.
In the snapshot shown below, three consumer groups are present.
Describing a Consumer Group
The '--describe' option is used to describe a consumer group. The command is used as:
'kafka-consumer-groups.bat --bootstrap-server localhost:9092 --describe --group <group_name>'
This command shows whether any active consumer is present, the current offset value, and the lag value. A lag of 0 indicates that the consumer has read all the data.
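The lag value is simply the distance between the end of a partition and the group's current offset. As a minimal sketch in plain Python, with made-up per-partition numbers (the figures and the helper name 'consumer_lag' are hypothetical, not taken from a real listing):

```python
def consumer_lag(log_end_offset, current_offset):
    """Lag = messages written to the partition but not yet read by the group."""
    return log_end_offset - current_offset


# Hypothetical per-partition values, in the spirit of a describe listing.
partitions = {
    0: {"current_offset": 5, "log_end_offset": 5},
    1: {"current_offset": 3, "log_end_offset": 7},
}

lags = {
    partition: consumer_lag(info["log_end_offset"], info["current_offset"])
    for partition, info in partitions.items()
}
caught_up = all(lag == 0 for lag in lags.values())

print(lags)       # {0: 0, 1: 4}
print(caught_up)  # False
```

Here partition 0 is fully read, while partition 1 still has four messages waiting, so the group as a whole has not caught up.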
Resetting the Offsets
Offsets are committed in Apache Kafka. Therefore, if a user wants to read the messages again, the offsets need to be reset. The 'kafka-consumer-groups' command offers the '--reset-offsets' option for this. Resetting the offset value means defining the point from where the user wants to read the messages again. It supports only one consumer group at a time, and there should be no active instances for the group.
While resetting the offsets, the user needs to choose three arguments: an execution option, a reset specification, and a scope.
There are two execution options available:
'--dry-run': It is the default execution option. This option only shows which offsets would be reset, without changing them.
'--execute': This option is used to update the offset values.
The following reset specifications are available:
'--to-datetime': It resets the offsets to the offsets at the given datetime. The format used is: 'YYYY-MM-DDTHH:mm:SS.sss'.
'--to-earliest': It resets the offsets to the earliest offset.
'--to-latest': It resets the offsets to the latest offset.
'--shift-by': It resets the offsets by shifting the current offset value by 'n'. The value of 'n' can be positive or negative.
'--from-file': It resets the offsets to the values defined in the CSV file.
'--to-current': It resets the offsets to the current offset.
There are two scopes available to define:
'--all-topics': It resets the offset value for all the available topics within a group.
'--topic': It resets the offset value for the specified topic only. The user needs to specify the topic name for resetting the offset value.
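The three choices (execution option, reset specification, scope) combine into a single command line. The sketch below only assembles the argument list in Python and prints it; it does not run anything. The helper name 'build_reset_command' and the group and topic names passed to it are illustrative assumptions.

```python
def build_reset_command(group, spec, scope, execute=False,
                        bootstrap_server="localhost:9092"):
    """Build the argument list for an offset reset (illustrative helper)."""
    command = [
        "kafka-consumer-groups.bat",
        "--bootstrap-server", bootstrap_server,
        "--group", group,
        "--reset-offsets",
    ]
    command += spec   # reset specification, e.g. ["--to-earliest"]
    command += scope  # scope, e.g. ["--all-topics"] or ["--topic", name]
    # Execution option: preview by default, apply only when asked to.
    command.append("--execute" if execute else "--dry-run")
    return command


preview = build_reset_command(
    "first_app", ["--to-earliest"], ["--topic", "myfirst"]
)
applied = build_reset_command(
    "first_app", ["--shift-by", "2"], ["--all-topics"], execute=True
)

print(" ".join(preview))
print(" ".join(applied))
```

Keeping the preview as the default in the helper follows the same cautious idea as the tool itself: look at the planned offsets first, then rerun with the execute option.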
Let's try and see:
1) Using the '--to-earliest' option
In the above snapshot, the offsets are reset to the new offset 0. This is because the '--to-earliest' option is used, and the earliest offset available in this topic is 0.
2) Using the '--shift-by' option
In the first snapshot, the offset value is shifted by '+2', from 0 to 2. In the second one, it is shifted by '-1', from 2 to 1.
Note: To shift the offset value by a positive count, it is not necessary to use the '+' symbol. By default, the value is considered positive.
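The arithmetic behind these resets can be followed with a short model. The sketch below is plain Python and covers only some of the specifications. The function 'reset_offset' is hypothetical, and pulling an out-of-range target back to the nearest valid offset is an assumption of this model rather than something the text above states.

```python
def reset_offset(current, earliest, latest, spec, shift=0):
    """Return the offset a reset specification would select (simplified)."""
    if spec == "to-earliest":
        target = earliest
    elif spec == "to-latest":
        target = latest
    elif spec == "to-current":
        target = current
    elif spec == "shift-by":
        target = current + shift
    else:
        raise ValueError(f"unsupported specification: {spec}")
    # Assumption: a target outside the partition's range is pulled back
    # to the nearest valid offset.
    return max(earliest, min(latest, target))


start = reset_offset(current=5, earliest=0, latest=5, spec="to-earliest")
forward = reset_offset(current=start, earliest=0, latest=5,
                       spec="shift-by", shift=2)
back = reset_offset(current=forward, earliest=0, latest=5,
                    spec="shift-by", shift=-1)

print(start, forward, back)  # 0 2 1
```

The three calls follow the same sequence as the walkthrough: reset to the earliest offset, shift forward by 2, then shift back by 1.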