If you are new to Kafka, please read the first three posts of the series given below. Otherwise, dive in.
Reliable Data Delivery in Kafka
Troubleshooting Under Replicated Kafka Partitions
If you are preparing for an interview, this post contains most of the things that you should know about Kafka.
Active Controller Count
The active controller is the broker in a Kafka cluster that is designated to perform administrative tasks such as reassigning partitions. The active controller count metric tells us whether a given broker is the controller for the cluster. Its value is either 0 or 1, and it is emitted per broker.
What if two brokers say that they are the controller?
The active controller count metric indicates whether the broker is currently the controller for the cluster. The metric is either 0 or 1, with 1 showing that the broker is currently the controller. A Kafka cluster requires exactly one controller, and only one broker can be the controller at any given time, so the values reported by all brokers should add up to 1. If two brokers report 1, the cluster has a problem.
What should you do when more than one broker claims to be the controller?
This situation affects the administrative tasks of the cluster, such as partition reassignment. A reasonable first step is to restart the brokers that claim to be the controller.
Metric Name
kafka.controller:type=KafkaController,name=ActiveControllerCount
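The check described above can be sketched in a few lines of Python. This is a minimal illustration, not a monitoring integration: it assumes you have already collected the ActiveControllerCount value from every broker into a dictionary, and the function name and sample readings are hypothetical.

```python
def controller_status(counts):
    """Classify a cluster from per-broker ActiveControllerCount values.

    counts: dict mapping a broker id to the 0/1 value that broker reports
    (assumed to have been collected already by your monitoring system).
    """
    controllers = sorted(b for b, value in counts.items() if value == 1)
    if len(controllers) == 1:
        return "ok", controllers
    if not controllers:
        return "no_controller", controllers
    return "multiple_controllers", controllers


# Hypothetical sample readings from a three-broker cluster.
readings = {1: 0, 2: 1, 3: 0}
print(controller_status(readings))  # prints ('ok', [2])
```

Anything other than "ok" is worth an alert, since the cluster should have exactly one controller.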
Request Handler Idle Ratio
Kafka uses the following two thread pools to handle requests:
Network Handlers
These are responsible for reading data from and writing data to clients across the network. This does not require significant processing, so the network handlers are not easily exhausted.
Request Handlers
The request handler threads, however, are responsible for servicing the client request itself, which includes reading or writing the messages to disk. The request handler idle ratio metric indicates the percentage of time the request handlers are not in use. The lower this number, the more loaded the broker is. If the idle ratio drops below 20%, it is advisable to check whether the cluster is undersized or has some other problem.
Kafka uses purgatory to efficiently handle requests.
Read about purgatory here.
Metric Name
kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent
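As a rough sketch of the 20% rule above, the following Python snippet flags brokers whose request handlers are idle less than the threshold. It assumes the idle ratio has already been collected per broker and is expressed as a percentage from 0 to 100. Depending on how your monitoring tool exports the metric, it may arrive as a fraction between 0 and 1 and need scaling first. The function name and sample values are hypothetical.

```python
IDLE_THRESHOLD_PCT = 20.0  # the 20% guideline mentioned in the text


def overloaded_brokers(idle_pct_by_broker, threshold=IDLE_THRESHOLD_PCT):
    """Return the ids of brokers whose idle ratio is below the threshold.

    idle_pct_by_broker: dict of broker id -> idle ratio as a percentage
    (assumed scale: 0 to 100).
    """
    return sorted(
        broker
        for broker, idle_pct in idle_pct_by_broker.items()
        if idle_pct < threshold
    )


# Hypothetical readings: broker 2 is idle only 12.5% of the time.
print(overloaded_brokers({1: 65.0, 2: 12.5, 3: 20.0}))  # prints [2]
```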
All Topics Bytes In
The all topics bytes in rate, expressed in bytes per second, is useful as a measurement of how much message traffic your brokers are receiving from producing clients. This is a good metric to trend over time to help you determine when you need to expand the cluster or do other growth-related work. It is also useful for evaluating if one broker in a cluster is receiving more traffic than the others, which would indicate that it is necessary to rebalance the partitions in the cluster.
Metric Name
kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec
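One simple way to spot a broker that receives more traffic than the others is to compare each broker's bytes in rate with the cluster average. The sketch below does that in Python. The 1.5x tolerance is an arbitrary assumption chosen for illustration, not a Kafka recommendation, and the function name and sample rates are hypothetical.

```python
from statistics import mean


def hot_brokers(bytes_in_by_broker, tolerance=1.5):
    """Return brokers whose bytes in rate exceeds tolerance * cluster mean.

    bytes_in_by_broker: dict of broker id -> BytesInPerSec value.
    tolerance: hypothetical multiplier; tune it for your own cluster.
    """
    if not bytes_in_by_broker:
        return []
    average = mean(bytes_in_by_broker.values())
    if average == 0:
        return []
    return sorted(
        broker
        for broker, rate in bytes_in_by_broker.items()
        if rate > tolerance * average
    )


# Hypothetical rates in bytes per second; the mean is 17,000,000.
rates = {1: 10_000_000, 2: 11_000_000, 3: 30_000_000}
print(hot_brokers(rates))  # prints [3]
```

A broker that shows up here repeatedly is a candidate for partition rebalancing.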
All Topics Bytes Out
The all topics bytes out rate, similar to the bytes in rate, is another overall growth metric. In this case, the bytes out rate shows the rate at which consumers are reading messages out. The outbound bytes rate may scale differently than the inbound bytes rate.
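The outbound bytes rate also includes replica traffic. This means that if all of the topics are configured with a replication factor of 2, the bytes out rate will equal the bytes in rate even when there are no consumer clients. Note that recent Kafka releases also expose ReplicationBytesInPerSec and ReplicationBytesOutPerSec for inter-broker traffic, so check your version's documentation for what BytesOutPerSec includes.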
Metric Name
kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec
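The replication arithmetic above can be worked through with a small Python sketch. It assumes the model described in this section, where the bytes out rate counts replica fetches: with a replication factor of r, every produced byte is read by r - 1 followers, plus whatever consumers read. The function name and sample numbers are hypothetical, and the figures are cluster-wide totals.

```python
def expected_bytes_out(bytes_in, replication_factor, consumer_bytes_out=0.0):
    """Estimate cluster-wide bytes out under the model described above.

    bytes_in: producer traffic in bytes per second.
    replication_factor: replicas per partition (leader included).
    consumer_bytes_out: bytes per second read by consumer clients.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    # Each of the (replication_factor - 1) followers fetches every byte once.
    replica_traffic = bytes_in * (replication_factor - 1)
    return replica_traffic + consumer_bytes_out


# Replication factor 2 and no consumers: bytes out equals bytes in.
print(expected_bytes_out(5_000_000, 2))  # prints 5000000.0
```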
Other Important Kafka Broker Metrics
Name | Description | Metric Name |
All topics messages in | The messages in rate shows the number of individual messages, regardless of their size, produced per second. This is useful as a growth metric as a different measure of producer traffic. | kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec |
Partition count | The partition count for a broker generally doesn’t change that much, as it is the total number of partitions assigned to that broker. This includes every replica the broker has, regardless of whether it is a leader or follower for that partition. | kafka.server:type=ReplicaManager,name=PartitionCount |
Leader count | The leader count metric shows the number of partitions that the broker is currently the leader for. As with most other measurements in the brokers, this one should be generally even across the brokers in the cluster. | kafka.server:type=ReplicaManager,name=LeaderCount |
Offline partitions | This measurement is only provided by the broker that is the controller for the cluster (all other brokers will report 0), and shows the number of partitions in the cluster that currently have no leader. | kafka.controller:type=KafkaController,name=OfflinePartitionsCount |
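Because only the controller reports a nonzero offline partitions value, a cluster-wide reading can be derived by taking the maximum across brokers. The Python sketch below illustrates this. It assumes the OfflinePartitionsCount values have already been collected per broker, and the function name and sample readings are hypothetical.

```python
def cluster_offline_partitions(offline_by_broker):
    """Return the cluster-wide number of partitions without a leader.

    offline_by_broker: dict of broker id -> OfflinePartitionsCount value.
    Non-controller brokers report 0, so the maximum is the controller's value.
    """
    return max(offline_by_broker.values(), default=0)


# Hypothetical readings: broker 2 is the controller and sees 4 offline partitions.
readings = {1: 0, 2: 4, 3: 0}
print(cluster_offline_partitions(readings))  # prints 4
```

Any value above 0 means some partitions currently have no leader and deserves immediate attention.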
Reference:
Kafka – The Definitive Guide by Neha Narkhede, Gwen Shapira & Todd Palino