Tuesday, March 5, 2013

Finding Leader Broker in Kafka 0.8.0

With Kafka 0.8.0 the paradigm for storing data has changed. No longer can you connect to any broker and get access to the data for a topic and partition.

Starting with 0.8.0 Kafka has implemented replication, so now data is only on a subset of your Brokers (assuming you have more Brokers than count of replicas).

To figure out which Broker is the leader for your topic and partition, connect to any Broker and request the MetaData:

        kafka.javaapi.consumer.SimpleConsumer consumer  = new SimpleConsumer("mybroker01",9092,100000,64 * 1024, "test");
        List topics2 = new ArrayList();
        topics2.add("storm-anon");
        TopicMetadataRequest req = new TopicMetadataRequest(topics2);
        kafka.javaapi.TopicMetadataResponse resp = consumer.send(req);

        List<kafka.javaapi.TopicMetadata> data3 =  resp.topicsMetadata();

The reply from Kafka contains information about every partition in the topic. Iterate through the response and find the partition you want, then call 'leader().host()' to figure out what Broker to connect to.

The following code shows all the partitions for a specific topic, including where each of the replicas resides.

        for (kafka.javaapi.TopicMetadata item : data3) {
           for (kafka.javaapi.PartitionMetadata part: item.partitionsMetadata() ) {
               String replicas = "";
               String isr = "";

               for (kafka.cluster.Broker replica: part.replicas() ) {
                   replicas += " " + replica.host();
               }

               for (kafka.cluster.Broker replica: part.isr() ) {
                   isr += " " + replica.host();
               }

              System.out.println( "Partition: " +   part.partitionId()  + ": " + part.leader().host() + " R:[ " + replicas + "] I:[" + isr + "]");
           }
        }


Note that you only have to do this if you are managing your own offsets and using the SimpleConsumer. Using Consumer Groups takes care of this automatically.

Also note that you don't need to do this on the Producer side, Kafka handles figuring out which partition is on which Broker for you.

No comments: