Tuesday, July 6, 2021

Set Up a Kafka Cluster

Kafka is distributed event streaming software based on the publish-subscribe model. It is a powerful streaming platform used by many organizations.


More details can be found at the link below.

 

http://kafka.apache.org/intro

 

Prerequisite


Kafka requires Java 1.8 or above (the JAVA_HOME environment variable must be set on the system).
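
The JAVA_HOME prerequisite can be checked before starting anything. The following is a minimal sketch, not part of the Kafka distribution: the function name java_home_problem is hypothetical, and it only assumes that a usable JDK/JRE has a java executable in its bin directory.

```python
import os
from pathlib import Path


def java_home_problem(env=None):
    """Return a description of what is wrong with JAVA_HOME, or None if it looks usable."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    bin_dir = Path(java_home) / "bin"
    # Windows installs ship java.exe; other platforms ship a plain "java" binary.
    if not any((bin_dir / name).is_file() for name in ("java.exe", "java")):
        return f"no java executable under {bin_dir}"
    return None
```

Calling java_home_problem() with no arguments inspects the current environment. It does not check the Java version.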


7-Zip is needed to extract .tgz files on Windows.

 

Software and Tools



 

Windows 10

Kafka 2.8

Java 1.8 or higher

Zookeeper 3.5.9 (Embedded in Kafka)

 

 

Kafka Installation on Windows


http://www.liferaysavvy.com/2021/07/kafka-installation-on-windows.html





 

This example demonstrates a cluster with 3 Kafka brokers and 3 Zookeeper nodes. Kafka uses Zookeeper to manage the Kafka cluster.

 

The current example replicates the cluster on a single Windows machine, so we have to change some settings such as ports. In a real-world cluster, we would install one Kafka broker/Zookeeper per machine/server. The example cluster is depicted in the diagram above.


 

Download and Extract


Download the latest Kafka release from the location below.

 

http://kafka.apache.org/downloads.html

 


The direct link is below.

 

https://www.apache.org/dyn/closer.cgi?path=/kafka/2.8.0/kafka_2.13-2.8.0.tgz

 

Extract “kafka_2.13-2.8.0.tgz” to a local drive, make three copies of the extracted directory, and rename the copies as follows.

 

 

kafka-broker1

kafka-broker2

kafka-broker3
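
The copy-and-rename step can also be scripted. This is a minimal sketch under stated assumptions: the function name clone_broker_dirs is hypothetical, and the caller supplies both the extracted directory and the workspace directory that will hold the copies.

```python
import shutil
from pathlib import Path


def clone_broker_dirs(extracted: Path, workspace: Path, count: int = 3) -> list:
    """Copy the extracted Kafka directory to kafka-broker1..kafka-brokerN."""
    targets = []
    for n in range(1, count + 1):
        target = workspace / f"kafka-broker{n}"
        # copytree refuses to overwrite, so existing directories are left alone.
        if not target.exists():
            shutil.copytree(extracted, target)
        targets.append(target)
    return targets
```

For the layout used in this post, the workspace would be C:\kafka-workspace and the extracted directory would be wherever the archive was unpacked.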

 

 



 

 

Cluster Configuration



Zookeeper Cluster Configuration


The latest version of Kafka ships with Zookeeper. We have to configure the Zookeeper cluster first.


Zookeeper Node1

 

Go to the config directory of “kafka-broker1” and update “zookeeper.properties” with the following properties.

 



 

Zookeeper Data Directory


Create a Zookeeper data directory and set the same path in the zookeeper.properties file.


Example:

C:\\kafka-workspace\\kafka-broker1\\zookeeper\\data

 

Zookeeper myid


Create a “myid” file in the Zookeeper data directory and put the Zookeeper id in it, which is “1” for this node. These ids must be unique for each Zookeeper instance in the cluster.
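
The data directory and the myid file can be created in one step. The sketch below is illustrative only; init_zookeeper_data_dir is a hypothetical helper name, and the path is passed in by the caller.

```python
from pathlib import Path


def init_zookeeper_data_dir(data_dir: Path, myid: int) -> Path:
    """Create the Zookeeper data directory and write its myid file."""
    data_dir.mkdir(parents=True, exist_ok=True)
    # The file is named exactly "myid" (no extension) and holds only the id.
    myid_file = data_dir / "myid"
    myid_file.write_text(str(myid), encoding="ascii")
    return myid_file
```

For node 1 in this post, that would be init_zookeeper_data_dir(Path(r"C:\kafka-workspace\kafka-broker1\zookeeper\data"), 1). Writing the file from a script also avoids a text editor silently saving it as myid.txt.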

 



 

Update “dataDir” to the valid path that was created before. We are managing 3 Zookeeper nodes in the cluster, so we need to provide the “server.[myid]” properties as follows, where myid is the unique number assigned to each Zookeeper instance in the cluster. A few other properties also need to be updated as follows.

 

 

 

dataDir=C:\\kafka-workspace\\kafka-broker1\\zookeeper\\data

 

clientPort=2181

 

server.1=localhost:2666:3666

server.2=localhost:2667:3667

server.3=localhost:2668:3668

 

tickTime=2000

initLimit=5

syncLimit=2

 

 

 

 

server.<myid>=<hostname>:<leaderport>:<electionport>
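
Because the three Zookeeper configurations in this post differ only in dataDir and clientPort, they can be generated from the node number. The following is a sketch that reproduces the values used in this post; the function name zookeeper_properties is hypothetical, and the port arithmetic (2180 + node, 2665 + myid, 3665 + myid) is simply a compact way of writing the ports listed here.

```python
def zookeeper_properties(node: int, nodes: int = 3) -> str:
    """Render zookeeper.properties for one node of the single-machine cluster."""
    lines = [
        # Backslashes are doubled because "\" is an escape character in .properties files.
        rf"dataDir=C:\\kafka-workspace\\kafka-broker{node}\\zookeeper\\data",
        # Each node on the same machine needs its own client port: 2181, 2182, 2183.
        f"clientPort={2180 + node}",
    ]
    # Every node lists the whole cluster: server.<myid>=<hostname>:<leaderport>:<electionport>
    for myid in range(1, nodes + 1):
        lines.append(f"server.{myid}=localhost:{2665 + myid}:{3665 + myid}")
    lines += ["tickTime=2000", "initLimit=5", "syncLimit=2"]
    return "\n".join(lines) + "\n"
```

initLimit and syncLimit are counted in ticks, so with tickTime=2000 (milliseconds) they correspond to 10 seconds and 4 seconds respectively.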

 

 

For Zookeeper Node1, we don’t need to update “clientPort”; keep the default port 2181.

 

Zookeeper Node2

 

Go to the config directory of “kafka-broker2” and update “zookeeper.properties” with the following properties.




Zookeeper Data Directory

 

Create a Zookeeper data directory and set the same path in the zookeeper.properties file.


Example:


C:\\kafka-workspace\\kafka-broker2\\zookeeper\\data

 

Zookeeper myid


Create a “myid” file in the Zookeeper data directory and put the Zookeeper id in it, which is “2” for this node.

 



 

Update “dataDir” to the valid path that was created before. We are managing 3 Zookeeper nodes, so we need to provide the “server.[myid]” properties as follows, where myid is the unique number assigned to each Zookeeper instance in the cluster. A few other properties also need to be updated as follows.

 

 

dataDir=C:\\kafka-workspace\\kafka-broker2\\zookeeper\\data

 

clientPort=2182

 

server.1=localhost:2666:3666

server.2=localhost:2667:3667

server.3=localhost:2668:3668

 

tickTime=2000

initLimit=5

syncLimit=2

 

 

 

 

server.<myid>=<hostname>:<leaderport>:<electionport>

 

 

 

For Zookeeper Node2, we need to update “clientPort” to 2182.



Zookeeper Node3

 

Go to the config directory of “kafka-broker3” and update “zookeeper.properties” with the following properties.




Zookeeper Data Directory

 

Create a Zookeeper data directory and set the same path in the zookeeper.properties file.


Example:


C:\\kafka-workspace\\kafka-broker3\\zookeeper\\data

 

Zookeeper myid

 

Create a “myid” file in the Zookeeper data directory and put the Zookeeper id in it, which is “3” for this node.

 



 

Update “dataDir” to the valid path that was created before. We are managing 3 Zookeeper nodes, so we need to provide the “server.[myid]” properties as follows, where myid is the unique number assigned to each Zookeeper instance in the cluster. A few other properties also need to be updated as follows.

 

 

dataDir=C:\\kafka-workspace\\kafka-broker3\\zookeeper\\data

 

clientPort=2183

 

server.1=localhost:2666:3666

server.2=localhost:2667:3667

server.3=localhost:2668:3668

 

tickTime=2000

initLimit=5

syncLimit=2

 

 

 

 

server.<myid>=<hostname>:<leaderport>:<electionport>

 

 

 

For Zookeeper Node3, we need to update “clientPort” to 2183.

 

Note:


Zookeeper uses the server.[myid] ports to communicate between nodes, maintain the cluster, and sync data. This configuration must be present in every Zookeeper instance in the cluster.
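
Since the server.[myid] entries must match on every node while clientPort must differ on a shared machine, a quick consistency check can catch copy-and-paste mistakes. This is a minimal sketch; parse_properties and check_cluster_config are hypothetical helper names, and the parser handles only plain key=value lines, not the full .properties syntax.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, ignoring blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_cluster_config(configs: list) -> list:
    """Return a list of problems found across the nodes' zookeeper.properties texts."""
    if not configs:
        return []
    problems = []
    parsed = [parse_properties(text) for text in configs]
    # The server.[myid] entries must be the same on every node.
    server_sets = [{k: v for k, v in p.items() if k.startswith("server.")} for p in parsed]
    if any(s != server_sets[0] for s in server_sets[1:]):
        problems.append("server.[myid] entries differ between nodes")
    # On a single machine every node needs its own client port.
    client_ports = [p.get("clientPort") for p in parsed]
    if len(set(client_ports)) != len(client_ports):
        problems.append("clientPort values are not unique")
    return problems
```

Passing the contents of the three zookeeper.properties files should return an empty list when the configuration matches this post.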

 

Start Zookeeper Instances


Zookeeper Node1


Open a command prompt, go to the Kafka Broker1 root directory, and use the following command to start the Zookeeper service.


 

bin\windows\zookeeper-server-start.bat .\config\zookeeper.properties

 

 




 

If there are no errors in the logs, Zookeeper node1 has started successfully.

 



 

Note:


You may see a “java.net.ConnectException: Connection refused: connect” warning; this is because the other nodes have not started yet.


Zookeeper Node2


Open a command prompt, go to the Kafka Broker2 root directory, and use the following command to start the Zookeeper service.


 

bin\windows\zookeeper-server-start.bat .\config\zookeeper.properties

 

 




 

If there are no errors in the logs, Zookeeper node2 has started successfully.

 



 

Note:


You may see a “java.net.ConnectException: Connection refused: connect” warning; this is because the other nodes may not have started yet.

 

Zookeeper Node3


Open a command prompt, go to the Kafka Broker3 root directory, and use the following command to start the Zookeeper service.

 

 

bin\windows\zookeeper-server-start.bat .\config\zookeeper.properties

 

 



 

If there are no errors in the logs, Zookeeper node3 has started successfully.
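
Besides reading the logs, a plain TCP connection test shows whether each Zookeeper instance is listening on its client port. The sketch below uses only the Python standard library; port_is_open and zookeeper_status are hypothetical names, and the ports are the clientPort values configured in this post. An open port only shows that the process is listening, not that the nodes have formed a healthy cluster.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts.
        return False


def zookeeper_status(ports=(2181, 2182, 2183)) -> dict:
    """Map each Zookeeper client port on localhost to whether it accepts connections."""
    return {port: port_is_open("localhost", port) for port in ports}
```

The same port_is_open helper can later be pointed at the broker ports 9092, 9093, and 9094.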

 



 

Kafka Cluster Configuration


We have successfully configured the Zookeeper cluster, and it’s time to configure the Kafka broker cluster.


Kafka Broker1



Go to the Kafka Broker1 root directory and update “server.properties” in its config directory.




We need to update the log.dirs, listeners, broker.id, and zookeeper.connect properties as follows.

 

 

log.dirs=C:\\kafka-workspace\\kafka-broker1\\kafka-logs

listeners=PLAINTEXT://:9092

broker.id=1

zookeeper.connect=localhost:2181,localhost:2182,localhost:2183

 

 

zookeeper.connect is a comma-separated list of the Zookeeper cluster nodes.



Kafka Broker2


Go to the Kafka Broker2 root directory and update “server.properties” in its config directory.





We need to update the log.dirs, listeners, broker.id, and zookeeper.connect properties as follows.

 

 

log.dirs=C:\\kafka-workspace\\kafka-broker2\\kafka-logs

listeners=PLAINTEXT://:9093

broker.id=2

zookeeper.connect=localhost:2181,localhost:2182,localhost:2183

 

 

zookeeper.connect is a comma-separated list of the Zookeeper cluster nodes.

 

Kafka Broker3



Go to the Kafka Broker3 root directory and update “server.properties” in its config directory.




We need to update the log.dirs, listeners, broker.id, and zookeeper.connect properties as follows.

 

 

log.dirs=C:\\kafka-workspace\\kafka-broker3\\kafka-logs

listeners=PLAINTEXT://:9094

broker.id=3

zookeeper.connect=localhost:2181,localhost:2182,localhost:2183

 

 

zookeeper.connect is a comma-separated list of the Zookeeper cluster nodes.

 

Note:


We are running multiple brokers on the same machine, so we have to change the port numbers. In a real-world cluster, each server runs one broker, so the port does not need to change.


The broker.id.generation.enable property lets Kafka generate broker.id dynamically, so assigning broker.id manually is not required when this property is used.
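
The per-broker settings above follow a simple pattern, so they can be derived from the broker number and applied to the stock server.properties text. This is a sketch under stated assumptions: broker_overrides and apply_overrides are hypothetical helpers, the port arithmetic only reproduces the values used in this post, and the rewrite handles plain key=value lines only.

```python
def broker_overrides(broker: int, zk_nodes: int = 3) -> dict:
    """Return the server.properties values this post sets for one broker."""
    zk = ",".join(f"localhost:{2180 + n}" for n in range(1, zk_nodes + 1))
    return {
        "broker.id": str(broker),
        # Brokers 1..3 listen on 9092..9094 because they share one machine.
        "listeners": f"PLAINTEXT://:{9091 + broker}",
        # Backslashes are doubled because "\" is an escape character in .properties files.
        "log.dirs": rf"C:\\kafka-workspace\\kafka-broker{broker}\\kafka-logs",
        "zookeeper.connect": zk,
    }


def apply_overrides(text: str, overrides: dict) -> str:
    """Replace matching key=value lines and append keys that were not present."""
    remaining = dict(overrides)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_setting = "=" in line and not line.lstrip().startswith("#")
        if is_setting and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)  # comments and unrelated settings are kept as-is
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"
```

Commented-out settings in the original file are left untouched, and the new value is appended at the end instead.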


 

Start Kafka Brokers


Kafka Broker1


Open a command prompt and go to the Kafka Broker1 root directory. Run the following command to start the Kafka broker1 service.

 


 

bin\windows\kafka-server-start.bat .\config\server.properties

 

 




We can see the Kafka started message in the logs along with the broker id.





Kafka Broker2


Open a command prompt and go to the Kafka Broker2 root directory. Run the following command to start the Kafka broker2 service.

 


 

bin\windows\kafka-server-start.bat .\config\server.properties

 

 





We can see the Kafka started message in the logs along with the broker id.






Kafka Broker3


Open a command prompt and go to the Kafka Broker3 root directory. Run the following command to start the Kafka broker3 service.

 

 

bin\windows\kafka-server-start.bat .\config\server.properties

 

 



We can see the Kafka started message in the logs along with the broker id.




 

 

If all brokers started successfully, the Kafka cluster setup is complete.


Verify Kafka Cluster


We will create a Kafka topic and produce some messages on it. On the other end, we will run a consumer to receive the messages.

 

Create Kafka Topic


Open a command prompt and go to the bin\windows directory of one of the Kafka brokers. Use the following command to create a topic.


 

kafka-topics.bat --create --zookeeper localhost:2181,localhost:2182,localhost:2183 --replication-factor 3 --partitions 3 --topic first-kafka-cluster-topic

 

 

We should pass all Zookeeper cluster nodes in the --zookeeper option.
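
The create command asks for a replication factor of 3, which works here because three brokers are running; Kafka rejects a replication factor larger than the number of available brokers. The sketch below builds the same argument list and checks that constraint up front. The function name topic_create_args is hypothetical, and only the flags shown in the command above are used.

```python
def topic_create_args(topic, zookeeper, partitions, replication_factor, broker_count):
    """Build the kafka-topics.bat argument list used in this post."""
    if replication_factor > broker_count:
        raise ValueError(
            f"replication factor {replication_factor} exceeds {broker_count} brokers"
        )
    return [
        "kafka-topics.bat",
        "--create",
        "--zookeeper", ",".join(zookeeper),  # all Zookeeper nodes, comma-separated
        "--replication-factor", str(replication_factor),
        "--partitions", str(partitions),
        "--topic", topic,
    ]
```

Joining the returned list with spaces gives the command line shown above. With 3 partitions and a replication factor of 3, each of the three brokers ends up holding a replica of every partition.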




 

List topics


 

kafka-topics.bat --zookeeper localhost:2181,localhost:2182,localhost:2183 --list

 

 


The list command lists all topics in the Kafka cluster.




 

Start Producer

 

Open a command prompt and go to the bin\windows directory of one of the Kafka brokers. Use the following command to start a producer and post messages to a specific topic.

 

 

kafka-console-producer.bat --broker-list localhost:9092,localhost:9093,localhost:9094 --topic first-kafka-cluster-topic

 

 


--broker-list is the list of Kafka brokers that we configured in the cluster (localhost:9092,localhost:9093,localhost:9094).







Start Consumer

 

Open a command prompt and go to the Kafka bin\windows directory. Use the following command to start a consumer.

 


 

kafka-console-consumer.bat --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --topic first-kafka-cluster-topic --from-beginning

 

 





 

Now type some messages at the producer command prompt; the same messages are received at the consumer command prompt. This confirms that the Kafka cluster installation is successful.

 




 

References



https://docs.confluent.io/platform/current/zookeeper/deployment.html


https://docs.confluent.io/platform/current/installation/configuration/broker-configs.html

 


Notes:



This example demonstrates a Kafka cluster configuration replicated on a single machine. A real-world cluster can be created across different servers, with each server running one Kafka broker.

 

It is not necessary to keep Zookeeper and Kafka on the same server. Separating the Zookeeper cluster from the Kafka brokers is one of the best practices.

 

Separate Zookeeper binaries are available for installing standalone Zookeeper.




 

 

 

 

 
