How We Built an In-House Workflow Platform at Kore.ai

At Kore.ai, we empower businesses to apply AI to solve real-world problems. Over the past year, with the growing adoption of Agentic AI for complex use cases, we realized that crafting agent experiences alone wasn't enough. To truly support these experiences, we needed a full-stack agentic platform — one that not only enabled intelligent conversations but also delivered dashboards, drill-downs, dynamic tables, and a wide range of UX interaction components. This meant building a robust API layer, workflow orchestration, high-performance data storage, configurable frameworks, and analytics — all customizable through low-code/no-code capabilities so that partners and customers could easily adapt and extend their Agentic AI applications.

To bring this vision to life, our engineering team built an in-house workflow platform on technologies such as Node-RED, BullMQ, ClickHouse, and JSON Forms, among others. We selected each component for its performance, scalability, and ease of integration. The result is a flexible, extensible platform that powers agentic experiences end to end, blending the intelligence of AI agents with enterprise-grade workflows, data insights, and customization, and paving the way for the next generation of AI-driven enterprise solutions at Kore.ai.

Why Node-RED Was the Foundation for Our Workflow Platform

Node-RED’s Node.js foundation matched our team’s expertise and helped us accelerate development. Beyond its 4,000+ connectors, its visual low-code interface allowed rapid prototyping while still supporting custom JavaScript for complex logic. We prioritized open-source flexibility (Apache 2.0 license) to avoid vendor lock-in and to keep full freedom to commercialize. Temporal excelled at orchestration but lacked Node-RED’s UI-driven agility, and n8n’s licensing model conflicted with our multi-tenant SaaS architecture. Node-RED’s community-driven ecosystem further simplified scaling integrations across diverse use cases.


Multi-Tenancy: From File Storage to MongoDB

We replaced Node-RED’s default file-based storage with MongoDB, which required a complete re-architecture of flow initialization, routing, and persistence. Each tenant’s workflows are now fully isolated, with dynamic routing middleware directing API calls to tenant-specific instances. Along the way we had to optimize MongoDB indexing for fast flow retrieval and make updates atomic so that concurrent edits stay consistent. This transition significantly improved scalability, allowing us to onboard hundreds of partners while maintaining full auditability across the platform.
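To make the idea of tenant isolation with atomic updates concrete, here is a minimal in-memory sketch in TypeScript. It is not the production implementation: the FlowDoc shape, the revision field, and the TenantFlowStore class are hypothetical names chosen for illustration, and a Map stands in for a MongoDB collection with a compound unique index on tenant and flow.

```typescript
// Hypothetical document shape; the real storage schema is not shown in this article.
interface FlowDoc {
  tenantId: string;
  flowId: string;
  revision: number; // incremented on every successful save
  nodes: unknown[];
}

class TenantFlowStore {
  // The map key plays the role of a compound unique index on (tenantId, flowId).
  private docs = new Map<string, FlowDoc>();

  private key(tenantId: string, flowId: string): string {
    return JSON.stringify([tenantId, flowId]);
  }

  // Lookups are always scoped by tenant, so one tenant can never read another's flow.
  get(tenantId: string, flowId: string): FlowDoc | undefined {
    return this.docs.get(this.key(tenantId, flowId));
  }

  // Compare-and-set: the save succeeds only if the caller edited the latest revision.
  save(tenantId: string, flowId: string, expectedRevision: number, nodes: unknown[]): boolean {
    const k = this.key(tenantId, flowId);
    const current = this.docs.get(k);
    const currentRevision = current ? current.revision : 0;
    if (currentRevision !== expectedRevision) {
      return false; // stale write from a concurrent editor is rejected
    }
    this.docs.set(k, { tenantId, flowId, revision: currentRevision + 1, nodes });
    return true;
  }
}
```

Against a real MongoDB collection, the same check could plausibly be expressed as a single conditional update whose filter includes the tenant, the flow, and the expected revision, so that the comparison and the write happen atomically.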


Pause/Resume: Bridging Automation and Human-in-the-Loop

To support workflows that require approvals or external user input, we developed a custom pause/resume node. This node serializes the complete workflow state, including variables, timers, and context, into MongoDB. The pause/resume API triggers email and Slack alerts that embed deep links, so users can resume a paused workflow with one click. This enhancement transformed our workflows from rigid automation scripts into collaborative, human-in-the-loop processes, ideal for use cases like invoice approvals and exception handling.
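The pause/resume mechanism can be sketched roughly as follows. This is an illustrative stand-in, not the actual node: PausedState, PauseStore, deepLink, and the /resume URL path are all assumed names, a Map replaces MongoDB, and the counter-based token is only for readability (a real token would need to be unguessable).

```typescript
// Hypothetical snapshot of what a paused workflow needs in order to continue.
interface PausedState {
  tenantId: string;
  flowId: string;
  nodeId: string; // node to continue from
  context: Record<string, unknown>;
  resumeAt?: number; // optional timer deadline, epoch milliseconds
}

class PauseStore {
  private paused = new Map<string, string>(); // token -> serialized state
  private counter = 0;

  // Serializes the state and returns a token to embed in the notification.
  pause(state: PausedState): string {
    this.counter += 1;
    const token = `resume-${this.counter}`; // illustration only, not secure
    this.paused.set(token, JSON.stringify(state));
    return token;
  }

  // Restores the state, merges the human input into the context, and
  // invalidates the token so a link cannot resume the workflow twice.
  resume(token: string, input: Record<string, unknown>): PausedState {
    const raw = this.paused.get(token);
    if (raw === undefined) {
      throw new Error("unknown or already used resume token");
    }
    this.paused.delete(token);
    const state = JSON.parse(raw) as PausedState;
    return { ...state, context: { ...state.context, ...input } };
  }
}

// Builds the deep link placed in the email or Slack alert (path is an assumption).
function deepLink(baseUrl: string, token: string): string {
  return `${baseUrl}/resume?token=${encodeURIComponent(token)}`;
}
```

Serializing to JSON at pause time means the workflow does not have to stay in memory while it waits for a person, which is what makes long approval cycles practical.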

Securing Untrusted Code in a Multi-Tenant World

Although Node-RED’s built-in VM provided basic isolation, we hardened security further by integrating vm2 for stricter sandboxing that blocks filesystem and network access. We also adopted gVisor for OS-level containment, adding a deeper layer of system-level protection. In addition, we enforced resource limits, such as 512 MB of RAM per node and 10-second CPU timeouts, to prevent resource abuse. Finally, we used BullMQ-based throttling to cap concurrent executions, preserving tenant isolation and keeping the system stable at scale.
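The throttling idea, capping how many executions each tenant can run at once, can be illustrated with a small in-memory limiter. This is a simplified stand-in for the BullMQ-based throttling described above, not its actual configuration; TenantLimiter and its method names are hypothetical.

```typescript
// Caps the number of concurrent executions per tenant so that one busy
// tenant cannot starve the others.
class TenantLimiter {
  private running = new Map<string, number>();
  private maxPerTenant: number;

  constructor(maxPerTenant: number) {
    this.maxPerTenant = maxPerTenant;
  }

  // Returns true and reserves a slot if the tenant is under its cap.
  tryAcquire(tenantId: string): boolean {
    const current = this.running.get(tenantId) || 0;
    if (current >= this.maxPerTenant) {
      return false; // caller should delay or requeue the execution
    }
    this.running.set(tenantId, current + 1);
    return true;
  }

  // Frees a slot once an execution finishes or fails.
  release(tenantId: string): void {
    const current = this.running.get(tenantId) || 0;
    if (current <= 1) {
      this.running.delete(tenantId);
    } else {
      this.running.set(tenantId, current - 1);
    }
  }

  active(tenantId: string): number {
    return this.running.get(tenantId) || 0;
  }
}
```

A worker would call tryAcquire before starting a flow execution and release in a finally block afterwards. Because the counters are kept per tenant, a tenant that hits its cap has no effect on the slots available to any other tenant.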


ClickHouse + Node-RED: Real-Time Data at Scale

Our workflows ingest data from webhooks, APIs, and ETL jobs, which is then cleaned and normalized before streaming into ClickHouse. We introduced a custom ClickHouse node capable of reading and writing to tenant-specific schemas, validating data against JSON schemas, and enforcing role-based access control (RBAC) policies. This node batches data inserts for higher performance and includes retry logic for handling query failures.
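The batching and retry behaviour can be sketched as below. This is a simplified, synchronous illustration under stated assumptions: BatchWriter, Sink, and the table name are hypothetical, the sink callback stands in for the actual ClickHouse insert call, and a real node would be asynchronous and would wait between attempts.

```typescript
type Row = Record<string, unknown>;
// Stand-in for the real insert call; it throws when the write fails.
type Sink = (table: string, rows: Row[]) => void;

class BatchWriter {
  private buffer: Row[] = [];
  private table: string;
  private batchSize: number;
  private maxAttempts: number;
  private sink: Sink;

  constructor(table: string, batchSize: number, maxAttempts: number, sink: Sink) {
    this.table = table;
    this.batchSize = batchSize;
    this.maxAttempts = maxAttempts;
    this.sink = sink;
  }

  // Buffers a row and flushes once a full batch has accumulated.
  add(row: Row): void {
    this.buffer.push(row);
    if (this.buffer.length >= this.batchSize) {
      this.flush();
    }
  }

  // Writes the buffered rows as one insert, retrying up to maxAttempts times.
  flush(): void {
    if (this.buffer.length === 0) return;
    const rows = this.buffer;
    this.buffer = [];
    let lastError: unknown = undefined;
    for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
      try {
        this.sink(this.table, rows);
        return;
      } catch (err) {
        lastError = err;
      }
    }
    // All attempts failed: keep the rows so they are not lost, then surface the error.
    this.buffer = rows.concat(this.buffer);
    throw lastError;
  }

  pendingCount(): number {
    return this.buffer.length;
  }
}
```

Batching matters for a columnar store because many small inserts are far more expensive than a few large ones, so the writer trades a little latency for much higher throughput.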

ClickHouse’s columnar storage and distributed architecture empower us to execute sub-second queries even across terabytes of audit logs and analytics data.

JSON Forms: Low-Code Configuration as a Service

We adopted JSON Forms' declarative UI framework to allow partners to design forms dynamically using JSON schemas. Building on this, we extended the platform with a REST API layer to expose configurations, validate inputs, and manage forms dynamically. The configuration builder empowered developers to pre-define the options that business managers could later modify — enabling faster changes in the application experience without needing backend deployments or developer involvement.
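As an illustration of what such a configuration might look like, the sketch below pairs a JSON schema with a JSON Forms UI schema. The field names (approvalThreshold, notifyChannel) are invented for this example, and validateConfig is a tiny hand-written check covering only the keywords used here; a real deployment would more likely rely on a full JSON Schema validator.

```typescript
// Data schema: what a business manager is allowed to configure (example fields).
const configSchema = {
  type: "object",
  properties: {
    approvalThreshold: { type: "number", minimum: 0 },
    notifyChannel: { type: "string", enum: ["email", "slack"] },
  },
  required: ["approvalThreshold"],
};

// UI schema: how JSON Forms should lay out the controls for those fields.
const configUiSchema = {
  type: "VerticalLayout",
  elements: [
    { type: "Control", scope: "#/properties/approvalThreshold", label: "Approval threshold" },
    { type: "Control", scope: "#/properties/notifyChannel", label: "Notify via" },
  ],
};

// Minimal server-side check for this one schema, returning readable error messages.
function validateConfig(data: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const name of configSchema.required) {
    if (!(name in data)) errors.push(`${name} is required`);
  }
  const threshold = data.approvalThreshold;
  if (threshold !== undefined) {
    const min = configSchema.properties.approvalThreshold.minimum;
    if (typeof threshold !== "number" || threshold < min) {
      errors.push(`approvalThreshold must be a number >= ${min}`);
    }
  }
  const channel = data.notifyChannel;
  if (channel !== undefined) {
    const allowed = configSchema.properties.notifyChannel.enum;
    if (typeof channel !== "string" || allowed.indexOf(channel) === -1) {
      errors.push(`notifyChannel must be one of: ${allowed.join(", ")}`);
    }
  }
  return errors;
}
```

Because both the form layout and the validation rules are plain data, changing an option means editing JSON through the API rather than shipping a new backend build.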

BullMQ: Reliable Job Scheduling for Enterprise Scale

We augmented Node-RED’s contrib cron node with BullMQ to support enterprise-grade job scheduling. BullMQ provided prioritized, retriable task queues, which are essential for use cases like report generation and asynchronous API integrations. We used Redis-backed queues for cron-like daily scheduling, introduced a dead-letter queue for failed jobs, and integrated BullMQ monitoring into our SigNoz-based observability stack. This upgrade helped us maintain strict SLA compliance on both throughput and task latency.
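The interplay of priorities, retries, and a dead-letter queue can be shown with a small in-memory model. This is a conceptual sketch rather than BullMQ code: RetryQueue and Job are hypothetical names, there is no Redis behind it, and retries happen immediately instead of after a backoff delay. It follows BullMQ's convention that a lower priority number runs first.

```typescript
interface Job {
  id: string;
  priority: number; // lower number runs first
  attemptsLeft: number;
  payload: unknown;
}

class RetryQueue {
  private pending: Job[] = [];
  deadLetter: Job[] = []; // jobs that exhausted all attempts

  add(id: string, payload: unknown, priority: number, attempts: number): void {
    this.pending.push({ id, priority, attemptsLeft: attempts, payload });
  }

  // Runs the highest-priority pending job once.
  // Returns false when there is nothing left to process.
  processNext(handler: (job: Job) => void): boolean {
    if (this.pending.length === 0) return false;
    this.pending.sort((a, b) => a.priority - b.priority);
    const job = this.pending.shift() as Job;
    try {
      handler(job);
    } catch (_err) {
      job.attemptsLeft -= 1;
      if (job.attemptsLeft > 0) {
        this.pending.push(job); // schedule another attempt
      } else {
        this.deadLetter.push(job); // give up, but keep the job for inspection
      }
    }
    return true;
  }
}
```

Keeping exhausted jobs in a separate queue, rather than dropping them, is what lets an operator inspect the payload, fix the underlying issue, and replay the job later.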


Debugging Long-Running Workflows

Node-RED provides node-level debugging and tracing, enabling easy defect triaging and debugging by partner developers. The platform offers full visibility into request and response flows at every node level, which can be easily accessed and inspected through the Node-RED UI. This native capability significantly reduced the turnaround time for identifying, diagnosing, and fixing issues in long-running workflows.


ReactFlow: Modernizing the Workflow Designer

While Node-RED’s default UI was sufficient initially, it lacked the flexibility we needed to build modern experiences. We migrated the workflow editor to ReactFlow, enabling highly customizable drag-and-drop node management, real-time collaboration, and undo/redo functionality. We retained Node-RED as the backend execution engine and mapped its flow JSON to ReactFlow component definitions. This gives us a more modern editor and makes the Node-RED workflow experience far friendlier for business users.
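A mapping of this kind can be sketched as a pure function. The example below is an assumption-laden illustration, not the actual converter: it covers only the id, type, name, x, y, and wires fields of a Node-RED node, reuses the Node-RED type as the ReactFlow node type (which presumes a custom node component is registered per type), and invents the "out-N" handle naming.

```typescript
// Subset of a Node-RED node: wires holds one array of target node ids per output port.
interface NodeRedNode {
  id: string;
  type: string;
  name?: string;
  x: number;
  y: number;
  wires: string[][];
}

// Minimal ReactFlow-style node and edge shapes.
interface RfNode {
  id: string;
  type: string;
  position: { x: number; y: number };
  data: { label: string };
}

interface RfEdge {
  id: string;
  source: string;
  target: string;
  sourceHandle: string; // hypothetical handle id, one per output port
}

function toReactFlow(flow: NodeRedNode[]): { nodes: RfNode[]; edges: RfEdge[] } {
  const nodes: RfNode[] = [];
  const edges: RfEdge[] = [];
  for (const n of flow) {
    nodes.push({
      id: n.id,
      type: n.type,
      position: { x: n.x, y: n.y },
      data: { label: n.name || n.type }, // fall back to the type when unnamed
    });
    // Each wire from output port N to a target node becomes one edge.
    n.wires.forEach((targets, port) => {
      for (const target of targets) {
        edges.push({
          id: `${n.id}:${port}->${target}`,
          source: n.id,
          target,
          sourceHandle: `out-${port}`,
        });
      }
    });
  }
  return { nodes, edges };
}
```

Since the function only reshapes data, the inverse mapping can be written the same way, which keeps the Node-RED flow JSON as the single source of truth for execution.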

Conclusion: Building a Foundation for the Future

By extending Node-RED with enterprise-grade scalability, security, observability, and low-code configurability, we have empowered our partners to deploy AI-driven workflows 3x faster. We are now seeing new possibilities, such as using Agentic AI to build workflows directly through our APIs. Our modular architecture keeps us future-ready and shows that foundational open-source technologies, when thoughtfully extended, can be powerful catalysts for Agentic AI adoption across the enterprise.
