#kafka

Kafka Lessons Learned

After working with Kafka for a while, I have a handful of commands that I use regularly and a few issues that I have run into. I am writing them down here as notes, so that you and I can come back to them as needed.

Winners are not those who never fail, but those who never quit!

Issue #1: Cannot allocate memory

This error means the JVM could not get the heap it asked for. To fix it, change the heap settings in bin/kafka-server-start.sh:
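A minimal sketch of that change, assuming the stock script sets a 1 GB heap through KAFKA_HEAP_OPTS. The smaller sizes are assumed values for a low-memory machine, not a recommendation:

```shell
# bin/kafka-server-start.sh
# Stock value is "-Xmx1G -Xms1G"; the sizes below are assumptions for a small VM.
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
    export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"
fi
```

Because the script only sets the variable when it is empty, exporting KAFKA_HEAP_OPTS in your shell before starting Kafka has the same effect without editing the file.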

After that, restart Kafka.

Issue #2: You want to try running Kafka with replication

You just need to create one configuration file per broker and start a Kafka process for each:
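As a sketch, assuming Kafka is unpacked in the current directory and all brokers run on one test machine, each extra broker gets a copy of config/server.properties with its own broker.id, port and log directory. The file names, ports and paths are illustrative; newer Kafka releases use listeners=PLAINTEXT://:9093 in place of port:

```shell
# Copy the base config once per extra broker.
cp config/server.properties config/server-1.properties
cp config/server.properties config/server-2.properties

# Edit each copy so these three values are unique per broker, for example:
#   config/server-1.properties: broker.id=1  port=9093  log.dirs=/tmp/kafka-logs-1
#   config/server-2.properties: broker.id=2  port=9094  log.dirs=/tmp/kafka-logs-2

# Start one Kafka process per config file.
bin/kafka-server-start.sh config/server-1.properties &
bin/kafka-server-start.sh config/server-2.properties &
```

Together with the original broker, that gives three brokers, so a topic can be created with a replication factor of up to 3.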

Command #1: Create Kafka topic
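For example, assuming Zookeeper listens on localhost:2181 and the topic is named test (both placeholders). This is the Zookeeper-era CLI syntax; newer Kafka releases take --bootstrap-server localhost:9092 instead of --zookeeper:

```shell
bin/kafka-topics.sh --create \
  --zookeeper localhost:2181 \
  --replication-factor 1 \
  --partitions 1 \
  --topic test
```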

Command #2: Change Kafka topic partitions
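A sketch under the same assumptions as a typical local setup (Zookeeper on localhost:2181, a placeholder topic named test). Note that the partition count can only be increased, never reduced:

```shell
# Grow the topic "test" to 2 partitions.
bin/kafka-topics.sh --alter \
  --zookeeper localhost:2181 \
  --topic test \
  --partitions 2
```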

Command #3: Delete Kafka topic

You first have to enable topic deletion in the config/server.properties file:
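Topic deletion is controlled by the broker setting delete.topic.enable, which older releases leave off by default. One way to switch it on from the shell, assuming the key is not already present in the file (otherwise edit the existing line by hand):

```shell
# Append the setting to the broker config.
echo "delete.topic.enable=true" >> config/server.properties
```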

Then restart Kafka and delete the topic with this command:
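A sketch of the delete call, again assuming Zookeeper on localhost:2181 and a placeholder topic named test:

```shell
bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic test
```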

Command #4: Delete the old data

Set the topic's retention.ms config to 1 second (1000 ms):
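One way to do this, assuming Kafka 0.9 or later (where kafka-configs.sh is available), Zookeeper on localhost:2181 and a placeholder topic named test:

```shell
bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
  --entity-type topics --entity-name test \
  --add-config retention.ms=1000
```

The broker removes old segments on its next retention check (log.retention.check.interval.ms, 5 minutes by default), so give it some time before moving on.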

Then restore retention.ms to its default value:
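A sketch of the restore step, assuming the value was set as a per-topic override with kafka-configs.sh (Kafka 0.9 or later) on a placeholder topic named test. Deleting the override makes the topic fall back to the broker-wide retention setting:

```shell
bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
  --entity-type topics --entity-name test \
  --delete-config retention.ms
```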

Command #5: List topics
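For example, assuming Zookeeper listens on localhost:2181 (newer Kafka releases take --bootstrap-server localhost:9092 instead):

```shell
bin/kafka-topics.sh --list --zookeeper localhost:2181
```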

Tips

Run Kafka under Upstart on Ubuntu, so that Kafka is restarted automatically if it crashes.

First, create an Upstart job for Zookeeper in /etc/init/zookeeper.conf:
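A minimal sketch of such a job file, written from the shell. It assumes Zookeeper is installed in /opt/zookeeper, which is a hypothetical path; adjust it to your install:

```shell
# Write the job file; /opt/zookeeper is an assumed install path.
sudo tee /etc/init/zookeeper.conf > /dev/null <<'EOF'
description "Zookeeper"

start on runlevel [2345]
stop on runlevel [!2345]

# Restart the process if it exits unexpectedly.
respawn

# Run in the foreground so Upstart can supervise the process.
exec /opt/zookeeper/bin/zkServer.sh start-foreground
EOF
```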

Then create an Upstart job for Kafka in /etc/init/kafka.conf:
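A minimal sketch, assuming Kafka is installed in /opt/kafka (a hypothetical path) and that an Upstart job named zookeeper exists, so Kafka can follow its start and stop events:

```shell
# Write the job file; /opt/kafka is an assumed install path.
sudo tee /etc/init/kafka.conf > /dev/null <<'EOF'
description "Kafka"

# Tie Kafka's lifecycle to the zookeeper job.
start on started zookeeper
stop on stopping zookeeper

# Restart the process if it exits unexpectedly.
respawn

exec /opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
EOF
```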

Now the system starts Zookeeper and Kafka at boot. You can also start them manually with these commands:
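For example, assuming the two Upstart jobs are named zookeeper and kafka after their files in /etc/init:

```shell
sudo service zookeeper start
# Not needed if the kafka job is set to start on the zookeeper job's "started" event.
sudo service kafka start
```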

Install Apache Kafka in Ubuntu 14.04

This post shows an easy way to install Kafka. Kafka needs Zookeeper to run, so we install Zookeeper first and then Kafka. You can find the download links for the latest versions of Kafka and Zookeeper on the Apache Software Foundation website.

Install Java (if needed)
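A sketch using the distribution's default Java runtime. The default-jre package is an assumption; any recent JRE or JDK should work:

```shell
sudo apt-get update
sudo apt-get install -y default-jre
java -version   # confirm the JVM is on the PATH
```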

1. Install Zookeeper

You can get the binary download link here: http://zookeeper.apache.org/releases.html#download
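The steps could look like the sketch below. ZK_URL is a placeholder for the tarball link you copy from the releases page, and /opt/zookeeper is an assumed install path:

```shell
# Placeholder: paste the binary tarball link from the releases page.
ZK_URL="PASTE_ZOOKEEPER_TARBALL_URL_HERE"

wget -O /tmp/zookeeper.tar.gz "$ZK_URL"
sudo mkdir -p /opt/zookeeper
sudo tar -xzf /tmp/zookeeper.tar.gz -C /opt/zookeeper --strip-components=1

# Start from the sample config that ships with Zookeeper, then launch the server.
sudo cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg
sudo /opt/zookeeper/bin/zkServer.sh start
```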

2. Install Kafka

You can get the binary download link here: http://kafka.apache.org/downloads.html
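A sketch of the Kafka steps, with KAFKA_URL as a placeholder for the tarball link from the downloads page and /opt/kafka as an assumed install path:

```shell
# Placeholder: paste the binary tarball link from the downloads page.
KAFKA_URL="PASTE_KAFKA_TARBALL_URL_HERE"

wget -O /tmp/kafka.tgz "$KAFKA_URL"
sudo mkdir -p /opt/kafka
sudo tar -xzf /tmp/kafka.tgz -C /opt/kafka --strip-components=1

# Zookeeper must already be running; the stock config points at localhost:2181.
sudo /opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
```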

If you want Kafka to start at boot and restart automatically when it crashes, use Upstart on Ubuntu 14.04. Below is an example kafka.conf:
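A minimal sketch of /etc/init/kafka.conf, written from the shell. The /opt/kafka path and the respawn limit are assumptions, and Zookeeper has to be running before Kafka can start successfully:

```shell
# Write the job file; /opt/kafka is an assumed install path.
sudo tee /etc/init/kafka.conf > /dev/null <<'EOF'
description "Kafka"

start on runlevel [2345]
stop on runlevel [!2345]

respawn
# Stop respawning if the job restarts more than 10 times in 5 seconds.
respawn limit 10 5

exec /opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
EOF
```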

A simple example for Python Kafka Avro

Over the weekend, I tried to use Python to write a producer and a consumer for Apache Kafka. I found the kafka-python library, which makes this easy. However, sending Avro data from the producer to the consumer is not as easy: you have to understand how both pieces work, and while there are plenty of specifications, there is no example source code. So here is a simple example that creates a producer (producer.py) and a consumer (consumer.py) to stream Avro data via Kafka in Python.

The wise man never knows all, only fools know everything.

To run this source code, please make sure that you have installed Kafka (https://sonnguyen.ws/install-apache-kafka-in-ubuntu-14-04/) and the Python libraries kafka-python and avro (the io module is part of the Python standard library). I am using Python 2.7.
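The two third-party libraries can be installed with pip, assuming pip points at the Python 2.7 interpreter:

```shell
pip install kafka-python avro
```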

Create producer.py

Create consumer.py

Time to test:
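A sketch of a test run, assuming both scripts sit in the current directory, take no arguments, and that Zookeeper and Kafka are already running:

```shell
# Terminal 1: start the consumer first so it is listening.
python consumer.py

# Terminal 2: send the Avro messages.
python producer.py
```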

I hope that this post helps you say “Hello” to Kafka, Python and Avro.

Please see the details on GitHub: https://github.com/thanhson1085/python-kafka-avro

In the source code repository above, I also created consumer_bottledwater-pg.py to decode Avro data pushed by the bottledwater-pg Kafka producer. It is based on a question on Stack Overflow.