Real-Time Data Streaming with Apache Kafka and Node.js: A Complete Tutorial

We live in an age where time is of the essence and every second counts. That’s why real-time data streaming has become a crucial aspect of modern-day applications. Imagine being able to process massive amounts of data in real time and get instant access to valuable insights! Thanks to the powerful combination of Apache Kafka and frameworks like Node.js, developers can now build robust real-time data streaming applications that can handle even the most demanding workloads.

Kafka enables applications to send and receive real-time data feeds, while Node.js offers exceptional performance and scalability, making it the perfect match for Kafka. The possibilities are endless, and the potential for innovation is immense. Get ready to embark on an exciting journey of real-time data streaming with Kafka and Node.js!

By the end of this tutorial, you’ll have a comprehensive understanding of how to use Kafka and Node.js to create real-time data streaming applications that meet the demands of modern-day users.


First, we will run Kafka in a container, and then we will implement the producer and consumer servers.

To run Kafka in a container, we need to write a YAML file for Docker Compose.

docker-compose.yaml

We are going to run two containers, Kafka and Zookeeper. In the Kafka ecosystem, Zookeeper is crucial in managing the leader election process for partitions and topics. Kafka brokers rely on Zookeeper to identify the leading broker for a specific partition and topic. Additionally, Zookeeper is responsible for storing topic configurations and permissions.

version: "3"
services:
  kafka:
    image: "bitnami/kafka:latest"
    container_name: "kafka"
    ports:
      - "9092:9092"
    environment:
      # the address the broker advertises to clients connecting from the host
      - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://127.0.0.1:9092
      - KAFKA_BROKER_ID=1
      # where the broker finds Zookeeper (the service defined below)
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
      # the interface and port the broker listens on inside the container
      - KAFKA_LISTENERS=PLAINTEXT://:9092
      # allow unauthenticated plaintext connections (fine for local development)
      - ALLOW_PLAINTEXT_LISTENER=yes
    depends_on:
      - zookeeper
  zookeeper:
    image: "bitnami/zookeeper:latest"
    container_name: "zookeeper"
    ports:
      - "2181:2181"
    environment:
      # allow connections without authentication (fine for local development)
      - ALLOW_ANONYMOUS_LOGIN=yes

Make sure Docker is set up on your system, then go to the root directory and run the following command:

docker-compose up        
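
If everything started correctly, docker ps should list both containers. The output below is trimmed for readability; the ID and timing columns will differ on your machine:

docker ps
IMAGE                      PORTS                    NAMES
bitnami/kafka:latest       0.0.0.0:9092->9092/tcp   kafka
bitnami/zookeeper:latest   0.0.0.0:2181->2181/tcp   zookeeper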

Now, it's time to implement the other part: the producer and the consumer.

Initialization of Project

To start, we have to initialize a Node project.

npm init --yes        

where --yes accepts the default values for all of the prompts.

Now, we have to install the dependency we need to work with Kafka:

npm install kafkajs        

and yes, that’s the only dependency we need to run both servers.

Now, we will create two server files named producer.js and consumer.js and run them separately.

Before that, in the package.json file, we would write two scripts. And since producer.js and consumer.js will use ES module import syntax, we also add "type": "module" so Node treats them as ES modules.

"scripts": {
  "start:consumer": "node consumer.js",
  "start:producer": "node producer.js"
 }        

producer.js

With Kafka running in the container, we now have to create the producer server.

import { Kafka, Partitioners } from "kafkajs"

// configure the client that talks to the broker we exposed on port 9092
const kafka = new Kafka({
  clientId: "my-producer",
  brokers: ["localhost:9092"],
})

const producer = kafka.producer({
  createPartitioner: Partitioners.DefaultPartitioner,
})

const produce = async () => {
  // calling connect() on every tick is harmless; kafkajs
  // reuses the existing connection once it is established
  await producer.connect()
  await producer.send({
    topic: "my-topic",
    messages: [
      {
        value: "Hello From Producer!",
      },
    ],
  })
}

// produce a message every 3 seconds
setInterval(() => {
  produce()
    .then(() => console.log("Message Produced!"))
    .catch(console.error)
}, 3000)

First, we import the Kafka class and use its constructor to configure the connection to the Kafka broker running in the container.

clientId identifies this particular Kafka client (good for debugging).
brokers is the list of broker addresses the client connects to.

While creating the producer, we tell it to use the default partitioner by setting createPartitioner to Partitioners.DefaultPartitioner. Passing a partitioner explicitly also silences the partitioner warning that kafkajs v2 logs otherwise.

To send a message, we use the producer’s send() method and pass an object with the topic name and an array of messages.
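
send() also accepts several messages in one call, and each message can carry an optional key: Kafka hashes the key to pick a partition, so messages with the same key keep their relative order. A quick sketch (the keys here are made up for illustration):

await producer.send({
  topic: "my-topic",
  messages: [
    // same key => same partition => ordering preserved per key
    { key: "user-1", value: "user 1 signed in" },
    { key: "user-1", value: "user 1 signed out" },
    { key: "user-2", value: "user 2 signed in" },
  ],
})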

In the end, we are using the setInterval() method to produce a message every 3 seconds, so our consumer can pick them up.
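
Since the producer holds an open connection to the broker, it is also good practice to disconnect cleanly when the process exits; a minimal sketch:

// close the broker connection on Ctrl+C
process.on("SIGINT", async () => {
  await producer.disconnect()
  process.exit(0)
})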

consumer.js

We are done with the first two parts: running Kafka in a container and creating the producer server. The last part of the implementation is the consumer server.

import { Kafka, logLevel } from "kafkajs"

const kafka = new Kafka({
  clientId: "my-consumer",
  brokers: ["localhost:9092"],
  // don't log anything
  logLevel: logLevel.NOTHING,
})

// consumers that share a groupId split the topic's partitions among themselves
const consumer = kafka.consumer({ groupId: "my-group" })

const consume = async () => {
  await consumer.connect()
  await consumer.subscribe({ topic: "my-topic", fromBeginning: true })

  // eachMessage is invoked once for every message received
  await consumer.run({
    eachMessage: async ({ topic, partition, message }) => {
      console.log({
        value: message.value.toString(),
      })
    },
  })
}

// start consuming
consume().catch(console.error)

We start with the same Kafka client setup as in the producer server.

But here, while creating the consumer, we pass a groupId to create a consumer group. All consumers that share the same groupId form one group, and Kafka spreads the topic’s partitions among them. Giving the group an explicit name also makes it easy to recognize later.
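
For example, if you open another terminal and start a second consumer with the same groupId (npm run start:consumer), Kafka rebalances the group and splits the topic’s partitions between the two processes. Since my-topic is auto-created with a single partition here, the second consumer would simply sit idle until the first one stops.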

After connecting, we subscribe to a specific topic so that we receive every message produced to it.

fromBeginning is set to true, which means that when our consumer group reads the topic for the first time, it consumes all the messages the topic already contains, starting from the earliest offset. Otherwise, it would only consume messages produced after the consumer server starts.
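
The handler also receives the topic name, the partition number, and the full message (including its offset and headers), which is handy for debugging. A variant of the run() call above that logs this metadata:

await consumer.run({
  eachMessage: async ({ topic, partition, message }) => {
    // log where the message came from along with its payload
    console.log({
      topic,
      partition,
      offset: message.offset,
      value: message.value.toString(),
    })
  },
})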

Running Both Servers

We run both servers, each in its own terminal, using the scripts defined in the package.json file.

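In one terminal, start the producer:

npm run start:producer

and in another, start the consumer:

npm run start:consumer

The producer logs "Message Produced!" every 3 seconds, and the consumer prints something like { value: 'Hello From Producer!' } for each message it receives.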

Whenever a message is produced to a topic, the consumers subscribed to that topic receive it.

Repository

asimhafeezz/kafka-with-node (github.com)

Summary

In this article, we learned how to use Kafka, a popular distributed messaging system, in a containerized environment. We also learned how to create producer and consumer servers using Node.js, a popular server-side JavaScript runtime. We followed step-by-step instructions to set up Kafka in a Docker container and wrote a sample Node.js application for both the producer and the consumer.

Finally, we ran both servers and verified that they were working correctly. By the end of the article, we had a good understanding of how to use Kafka and Node.js together in a containerized environment, which can be useful for building scalable and distributed applications.


I hope you found the article helpful, and if you appreciate the content, consider showing your support by liking and sharing it.

What's the most challenging part of managing your own Kafka setup? Is it the heavy time investment, costly initial setup, or the complexity of maintenance? Or maybe something else entirely? Let us know your thoughts! 👇 https://shorturl.at/LpIsV
