Why should you put a proxy in front of Kafka?
Apache Kafka is easily considered the main solution for companies that want and need streaming in their information systems. It comes with many advantages, such as high throughput and resilience. However, it can also introduce significant complexity and encourage bad practices, making it like the Black Pearl: a legendary ship that only a few people can pilot.
Having a cursed ship that plunders resources is exactly what a company doesn't need. It needs a solution that contributes to profitability.
Abstract governance
When I discuss Kafka and its problems, the solution most often shared with me is: apply governance rules. This can be done by:
All of these solutions make it possible to apply governance, but if you want to do it at scale, with minimal effort, they are likely insufficient. The Strimzi Operator, for example, is certainly the best way to manage Kafka resources, but on its own it is not enough to enforce our governance. It must be combined with something else.
“You certainly need synchronous in this asynchronous world”
A Kafka proxy is a component that sits between the clients (consumers, producers or admins) and the cluster. The proxy is transparent, meaning it speaks the Kafka protocol directly. Clients connect to the proxy, and the proxy connects to the cluster.
Using the Kafka protocol directly makes migration effortless. Any Kafka client can move its connection to the proxy without changing its library or anything else. As the proxy's owner, you can add logic in real time for each request going to Kafka, for each response coming from Kafka, or both. You act as a man-in-the-middle for Kafka (for good purposes, of course).
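To make "logic for each request" more concrete, here is a minimal Python sketch of the first thing a proxy has to do: read the common request header to learn which API a client is calling. It relies on the Kafka wire format, where every request is a 4-byte big-endian size followed by a header (api_key, api_version, correlation_id, nullable client_id). The helper names (`parse_request_header`, `build_frame`) are illustrative assumptions, not part of any real proxy, and the sketch ignores the tagged fields that newer header versions append after client_id.

```python
import struct
from typing import NamedTuple, Optional

# A few well-known API keys from the Kafka protocol guide.
API_NAMES = {0: "Produce", 1: "Fetch", 3: "Metadata", 19: "CreateTopics"}


class RequestHeader(NamedTuple):
    api_key: int
    api_version: int
    correlation_id: int
    client_id: Optional[str]


def parse_request_header(frame: bytes) -> RequestHeader:
    """Decode the common header of one size-prefixed Kafka request frame."""
    (size,) = struct.unpack_from(">i", frame, 0)
    if size != len(frame) - 4:
        raise ValueError("frame length does not match its size prefix")
    # int16 api_key, int16 api_version, int32 correlation_id, int16 client_id length
    api_key, api_version, correlation_id, cid_len = struct.unpack_from(">hhih", frame, 4)
    # A length of -1 encodes a null client_id.
    client_id = None if cid_len < 0 else frame[14:14 + cid_len].decode("utf-8")
    return RequestHeader(api_key, api_version, correlation_id, client_id)


def build_frame(api_key: int, api_version: int, correlation_id: int,
                client_id: str, body: bytes = b"") -> bytes:
    """Hypothetical helper: build a request frame so the parser can be tried out."""
    cid = client_id.encode("utf-8")
    payload = struct.pack(">hhih", api_key, api_version, correlation_id, len(cid)) + cid + body
    return struct.pack(">i", len(payload)) + payload


if __name__ == "__main__":
    header = parse_request_header(build_frame(19, 4, 7, "billing-app"))
    print(API_NAMES.get(header.api_key, "Unknown"), header)
```

Once the proxy knows the API key, it can decide whether to forward the frame untouched or hand the body to a more specific check.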
Each request going to Kafka can be validated or transformed. Applying governance rules is now easy, regardless of the tool used in front. You just need to ensure that all communication with the Kafka cluster passes through the Kafka Proxy.
Going back to our governance use case, we can set up our rules at the proxy level:
That's it. We never transform the request protocol to or from HTTP or anything else. Just Kafka, and only Kafka.
Now, imagine another Create Topic request that doesn't follow your governance rules. The proxy will simply not forward it to the Kafka cluster, and will respond directly to the client with a rule violation:
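As a rough illustration of that decision, here is a minimal Python sketch of a governance check on an already-decoded Create Topic request. The three rules (naming pattern, partition cap, minimum replication factor) and all names in the code are hypothetical examples, not the rules of any particular company or product. The only Kafka-specific fact it relies on is the protocol error code POLICY_VIOLATION (44), which a proxy could reasonably return in this situation.

```python
import re
from dataclasses import dataclass
from typing import Optional

POLICY_VIOLATION = 44  # Kafka protocol error code for policy violations

# Hypothetical governance rules, chosen only for illustration.
NAME_PATTERN = re.compile(r"[a-z]+\.[a-z0-9-]+\.v[0-9]+")  # e.g. billing.invoices.v1
MAX_PARTITIONS = 12
MIN_REPLICATION_FACTOR = 3


@dataclass(frozen=True)
class TopicRequest:
    """Simplified view of one topic inside a decoded CreateTopics request."""
    name: str
    num_partitions: int
    replication_factor: int


@dataclass(frozen=True)
class Rejection:
    error_code: int
    message: str


def check_topic(req: TopicRequest) -> Optional[Rejection]:
    """Return None if the request may be forwarded, else the rejection to send back."""
    if not NAME_PATTERN.fullmatch(req.name):
        return Rejection(POLICY_VIOLATION, f"topic name '{req.name}' must look like <domain>.<name>.v<N>")
    if not 1 <= req.num_partitions <= MAX_PARTITIONS:
        return Rejection(POLICY_VIOLATION, f"partitions must be between 1 and {MAX_PARTITIONS}")
    if req.replication_factor < MIN_REPLICATION_FACTOR:
        return Rejection(POLICY_VIOLATION, f"replication factor must be at least {MIN_REPLICATION_FACTOR}")
    return None


if __name__ == "__main__":
    for req in (TopicRequest("billing.invoices.v1", 6, 3), TopicRequest("MyTopic", 200, 1)):
        verdict = check_topic(req)
        print(req.name, "-> forward" if verdict is None else f"-> reject: {verdict.message}")
```

In a real proxy, a rejection would be encoded as a CreateTopics response carrying the error code and message, so the client sees an ordinary Kafka error rather than a broken connection.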
Summary
We have explored a basic use case for the Kafka proxy. Keep in mind: because we can control any request and/or response, we can extend the Kafka protocol in many ways:
And we can go further with features like:
If I've piqued your interest, in a future article we'll explore how to improve business via a Kafka Proxy.
Author: Anthony Callaert - Staff engineer