Real-Time Data Streaming with Apache Kafka and Node.js: A Complete Tutorial

We live in an age where time is of the essence and every second counts. That’s why real-time data streaming has become a crucial aspect of modern-day applications. Imagine being able to process massive amounts of data in real time and get instant access to valuable insights! Thanks to the powerful combination of Apache Kafka and frameworks like Node.js, developers can now build robust real-time data streaming applications that can handle even the most demanding workloads.

Kafka enables applications to send and receive real-time data feeds, while Node.js offers exceptional performance and scalability, making it the perfect match for Kafka. The possibilities are endless, and the potential for innovation is immense. Get ready to embark on an exciting journey of real-time data streaming with Kafka and Node.js!

By the end of this tutorial, you’ll have a comprehensive understanding of how to use Kafka and Node.js to create real-time data streaming applications that meet the demands of modern-day users.


First, we will run Kafka in a container, and then we will implement the producer and consumer servers.

To run Kafka in a container, we need to write a YAML file for Docker Compose.

docker-compose.yaml

We are going to run two containers, Kafka and Zookeeper. In the Kafka ecosystem, Zookeeper is crucial in managing the leader election process for partitions and topics. Kafka brokers rely on Zookeeper to identify the leading broker for a specific partition and topic. Additionally, Zookeeper is responsible for storing topic configurations and permissions.

version: "3"
services:
  kafka:
    image: "bitnami/kafka:latest"
    container_name: "kafka"
    ports:
      - "9092:9092"
    environment:
      # the address the broker advertises to clients connecting from the host
      - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://127.0.0.1:9092
      - KAFKA_BROKER_ID=1
      # where the broker finds Zookeeper (the service defined below)
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
      # the interface and port the broker listens on inside the container
      - KAFKA_LISTENERS=PLAINTEXT://:9092
      # allow unauthenticated plaintext connections (fine for local development)
      - ALLOW_PLAINTEXT_LISTENER=yes
    depends_on:
      - zookeeper
  zookeeper:
    image: "bitnami/zookeeper:latest"
    container_name: "zookeeper"
    ports:
      - "2181:2181"
    environment:
      # allow connections without authentication (fine for local development)
      - ALLOW_ANONYMOUS_LOGIN=yes

Make sure Docker is set up on your system, then go to the root directory and run the following command:

docker-compose up        
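
If everything started correctly, docker ps should list both containers. The output below is trimmed for readability; the ID and timing columns will differ on your machine:

docker ps
IMAGE                      PORTS                    NAMES
bitnami/kafka:latest       0.0.0.0:9092->9092/tcp   kafka
bitnami/zookeeper:latest   0.0.0.0:2181->2181/tcp   zookeeper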

Now, it's time to implement the other part: the producer and the consumer.

Initialization of Project

To start, we have to initialize a Node project.

npm init --yes        

where --yes accepts the default values for all of the prompts.

Now, we have to install the dependency we need to work with Kafka:

npm install kafkajs        

and yes, that’s the only dependency we need to run both servers.

Now, we will create two server files named producer.js and consumer.js and run them separately.

Before that, in the package.json file, we would write two scripts. And since producer.js and consumer.js will use ES module import syntax, we also add "type": "module" so Node treats them as ES modules.

"scripts": {
  "start:consumer": "node consumer.js",
  "start:producer": "node producer.js"
 }        

producer.js

With Kafka running in the container, we now have to create the producer server.

import { Kafka, Partitioners } from "kafkajs"

// configure the client that talks to the broker we exposed on port 9092
const kafka = new Kafka({
  clientId: "my-producer",
  brokers: ["localhost:9092"],
})

const producer = kafka.producer({
  createPartitioner: Partitioners.DefaultPartitioner,
})

const produce = async () => {
  // calling connect() on every tick is harmless; kafkajs
  // reuses the existing connection once it is established
  await producer.connect()
  await producer.send({
    topic: "my-topic",
    messages: [
      {
        value: "Hello From Producer!",
      },
    ],
  })
}

// produce a message every 3 seconds
setInterval(() => {
  produce()
    .then(() => console.log("Message Produced!"))
    .catch(console.error)
}, 3000)

First, we import the Kafka class and use its constructor to configure the connection to the Kafka broker running in the container.

clientId identifies this particular Kafka client (good for debugging).
brokers is the list of broker addresses the client connects to.

While creating the producer, we tell it to use the default partitioner by setting createPartitioner to Partitioners.DefaultPartitioner. Passing a partitioner explicitly also silences the partitioner warning that kafkajs v2 logs otherwise.

To send a message, we use the producer’s send() method and pass an object with the topic name and an array of messages.
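
send() also accepts several messages in one call, and each message can carry an optional key: Kafka hashes the key to pick a partition, so messages with the same key keep their relative order. A quick sketch (the keys here are made up for illustration):

await producer.send({
  topic: "my-topic",
  messages: [
    // same key => same partition => ordering preserved per key
    { key: "user-1", value: "user 1 signed in" },
    { key: "user-1", value: "user 1 signed out" },
    { key: "user-2", value: "user 2 signed in" },
  ],
})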

In the end, we are using the setInterval() method to produce a message every 3 seconds, so our consumer can pick them up.
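
Since the producer holds an open connection to the broker, it is also good practice to disconnect cleanly when the process exits; a minimal sketch:

// close the broker connection on Ctrl+C
process.on("SIGINT", async () => {
  await producer.disconnect()
  process.exit(0)
})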

consumer.js

We are done with the first two parts: running Kafka in a container and creating the producer server. The last part of the implementation is the consumer server.

import { Kafka, logLevel } from "kafkajs"

const kafka = new Kafka({
  clientId: "my-consumer",
  brokers: ["localhost:9092"],
  // don't log anything
  logLevel: logLevel.NOTHING,
})

// consumers that share a groupId split the topic's partitions among themselves
const consumer = kafka.consumer({ groupId: "my-group" })

const consume = async () => {
  await consumer.connect()
  await consumer.subscribe({ topic: "my-topic", fromBeginning: true })

  // eachMessage is invoked once for every message received
  await consumer.run({
    eachMessage: async ({ topic, partition, message }) => {
      console.log({
        value: message.value.toString(),
      })
    },
  })
}

// start consuming
consume().catch(console.error)

We start with the same Kafka client setup as in the producer server.

But here, while creating the consumer, we pass a groupId to create a consumer group. All consumers that share the same groupId form one group, and Kafka spreads the topic’s partitions among them. Giving the group an explicit name also makes it easy to recognize later.
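
For example, if you open another terminal and start a second consumer with the same groupId (npm run start:consumer), Kafka rebalances the group and splits the topic’s partitions between the two processes. Since my-topic is auto-created with a single partition here, the second consumer would simply sit idle until the first one stops.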

After connecting, we subscribe to a specific topic so that we receive every message produced to it.

fromBeginning is set to true, which means that when our consumer group reads the topic for the first time, it consumes all the messages the topic already contains, starting from the earliest offset. Otherwise, it would only consume messages produced after the consumer server starts.
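
The handler also receives the topic name, the partition number, and the full message (including its offset and headers), which is handy for debugging. A variant of the run() call above that logs this metadata:

await consumer.run({
  eachMessage: async ({ topic, partition, message }) => {
    // log where the message came from along with its payload
    console.log({
      topic,
      partition,
      offset: message.offset,
      value: message.value.toString(),
    })
  },
})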

Running Both Servers

We run both servers, each in its own terminal, using the scripts defined in the package.json file.

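In one terminal, start the producer:

npm run start:producer

and in another, start the consumer:

npm run start:consumer

The producer logs "Message Produced!" every 3 seconds, and the consumer prints something like { value: 'Hello From Producer!' } for each message it receives.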

Whenever a message is produced to a topic, the consumers subscribed to that topic receive it.

Repository

asimhafeezz/kafka-with-node (github.com)

Summary

In this article, we learned how to use Kafka, a popular distributed messaging system, in a containerized environment. We also learned how to create producer and consumer servers using Node.js, a popular server-side JavaScript runtime. We followed step-by-step instructions to set up Kafka in a Docker container and wrote a sample Node.js application for both the producer and the consumer.

Finally, we ran both servers and verified that they were working correctly. By the end of the article, we had a good understanding of how to use Kafka and Node.js together in a containerized environment, which can be useful for building scalable and distributed applications.


I hope you found the article helpful, and if you appreciate the content, consider showing your support by liking and sharing it.

What's the most challenging part of managing your own Kafka setup? Is it the heavy time investment, costly initial setup, or the complexity of maintenance? Or maybe something else entirely? Let us know your thoughts! 👇 https://shorturl.at/LpIsV
