Applying Domain-Driven Design principles on serverless architecture in AWS

Introduction

When Eric Evans published his book Domain-Driven Design and introduced the methodology in 2003, most business applications used a monolithic architecture designed to run on-premises. The methodology revolutionized how we gather requirements and how we design and structure our code. With the evolution of service-oriented architecture, which was the foundation for microservices architecture, the building blocks and tools we use as software engineers have changed, and a revised approach is needed.

We no longer have a single application with a layered architecture. Instead, we have relatively small, encapsulated, independently deployed units of code (Lambda functions) and services working together.

This article attempts to apply modern architecture principles and a Domain-Driven Design approach to serverless native cloud computing.

Principles and Guidelines

This approach combines principles of serverless design, Domain-Driven Design, and OO Design to promote and utilize the power of serverless cloud offerings, such as scalability and fast adoption of innovations, microservices isolation, high availability, and DDD business agility. It assumes a multi-tenant solution that is configurable and extensible.

Asynchronous design – The system uses a hybrid of choreography and orchestration. Components react to messages at different levels of the system, and each business flow is then executed centrally by the orchestrator responsible for it.

Users get a response as soon as the business logic has been executed, before downstream processing such as persistence and notifications has completed. The principle of eventual consistency is applied to keep the system responsive.

Single responsibility – Every component in the system is responsible for a single aspect of functionality. There should be one and only one place in the system that handles that aspect. This allows for the fast adoption of newer technologies, better and faster resolution of issues in the system, and predictable behavior.

Single access path – As a result of the single responsibility principle, every resource in the hosting environment (i.e. AWS) is accessed in a single place in the application. That enhances security, control, and observability of system behavior. It allows earlier detection of issues and enables the implementation of self-healing systems.

Encapsulation and segregation – Every component in the system receives all the information it needs to execute via a message and publishes the result via an event. Following this principle allows us to fix or replace a single component without affecting any other component. There should never be direct calls from one component to another; this is the responsibility of an orchestrator.

Extensibility and configurability – The system allows certain behaviors to be configured where deviations in the process or business rules are expected. The system maintains a separate configuration set for each tenant, or at a finer granularity if needed. Configuration changes the behavior of a single piece of code without changing the code itself. Examples include user roles and rights, feature enablement, and user-defined data.

When the requirement deviates significantly from the existing code or when additional non-standard behavior is needed, the system will support extension code. This code will be tenant-specific and executed only when the criteria apply. Extension code can be written specifically for a set of requirements, have its own extension data, and access external systems.
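The configuration principle can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the in-memory dictionaries stand in for a real configuration store, and the keys, tenant names, and the `get_setting` and `validate_order` functions are hypothetical.

```python
from typing import Any

# Hypothetical configuration store: system defaults plus per-tenant overrides.
DEFAULTS: dict[str, Any] = {
    "features.bulk_import": False,
    "order.max_items": 50,
}
TENANT_OVERRIDES: dict[str, dict[str, Any]] = {
    "tenant-a": {"features.bulk_import": True, "order.max_items": 200},
}


def get_setting(tenant_id: str, key: str) -> Any:
    """Return the tenant's value for key, falling back to the system default."""
    overrides = TENANT_OVERRIDES.get(tenant_id, {})
    if key in overrides:
        return overrides[key]
    return DEFAULTS[key]


def validate_order(tenant_id: str, item_count: int) -> bool:
    """One code path for every tenant; only the configured limit differs."""
    return item_count <= get_setting(tenant_id, "order.max_items")
```

The same `validate_order` code accepts a 120-item order for `tenant-a` and rejects it for any tenant that uses the default limit, without a tenant-specific branch in the code.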

Layered Architecture


API Layer

This layer exposes every operation the application supports as an endpoint. Endpoints are invoked by external consumers and can be synchronous (REST) or asynchronous (for example, through AWS AppSync). The only purpose of this layer is to expose the functionality and forward each request to the request-handling layer.

Request handling Layer

This layer handles and manages requests as they are received from the API. The layer is unaware of the request's payload or purpose but handles the request generically. This is the ‘triage’ stage of the request journey.

The layer is responsible for persistence-first logic: it stores every request in a log before anything else happens. The create event from the log then triggers the next action, authorization.
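A minimal sketch of the persistence-first idea follows. The `RequestLog` class is a hypothetical in-memory stand-in for the request store. On AWS the store and its create event could be provided by a managed service, but the article does not name one, so none is assumed here.

```python
import uuid
from typing import Callable


class RequestLog:
    """In-memory stand-in for the request store (hypothetical)."""

    def __init__(self) -> None:
        self._records: dict[str, dict] = {}
        self._on_create: list[Callable[[dict], None]] = []

    def subscribe(self, handler: Callable[[dict], None]) -> None:
        """Register a handler for the create event (e.g. authorization)."""
        self._on_create.append(handler)

    def append(self, request: dict) -> str:
        """Persist the request first, then emit the create event."""
        request_id = str(uuid.uuid4())
        record = {**request, "request_id": request_id, "status": "RECEIVED"}
        self._records[request_id] = record
        for handler in self._on_create:
            handler(record)
        return request_id

    def get(self, request_id: str) -> dict:
        return self._records[request_id]
```

The request handler only calls `append`. It never calls the authorization step directly, because that step is triggered by the create event.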

The authorization module can access the configuration data, particularly role definitions, and determine, based on the user, action, and resource if the user is authorized to perform the action on the requested resource.
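As a sketch, the authorization decision can be reduced to a lookup over role definitions. The role names, users, and the `is_authorized` function below are hypothetical, and the dictionaries stand in for the configuration data.

```python
# Hypothetical role definitions: role -> set of (action, resource type) pairs.
ROLE_PERMISSIONS: dict[str, set[tuple[str, str]]] = {
    "viewer": {("read", "order")},
    "clerk": {("read", "order"), ("create", "order")},
}

# Hypothetical assignment of roles to users.
USER_ROLES: dict[str, set[str]] = {
    "alice": {"clerk"},
    "bob": {"viewer"},
}


def is_authorized(user: str, action: str, resource: str) -> bool:
    """True if any of the user's roles grants the action on the resource type."""
    return any(
        (action, resource) in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )
```

Unknown users and unknown roles fall through to `False`, so the check denies by default.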

If authorization is passed, the next stage in this layer is extension management. 

Extension Manager

The system allows three types of extensions. Extensions can be defined for each action on any endpoint based on the tenant and resource type. Those definitions are stored in the configuration store.

  • Pre-operation: This extension runs before the standard (core) code. The extension manager passes the original request payload to the extension and expects two values back: a potentially enriched payload to be passed to the core code, and a flag that can stop the operation from executing, with an optional message. If execution is not blocked, the extension manager continues with the action, passing the enriched payload.
  • Override: This extension replaces the core code in the system. It should be rarely used, as the core code should be configurable enough to cover the vast majority of scenarios. In cases of significant differences, the extension manager calls the override extension, passing the payload. The team that develops the override extension is responsible for performing everything required to execute the requested action. The core code is not called if an override extension is defined.
  • Post-operation: This is an asynchronous extension that is called after the domain layer logic completes. The extension receives the payload of the event emitted by the domain layer. Its main purpose is to notify external systems or perform follow-up activities required by specific tenants, such as updating extension data.
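The three extension types can be sketched as follows. This is a simplified illustration under stated assumptions: `PreResult`, `Extensions`, and `handle` are hypothetical names, the core code is called inline instead of through a queue, and the post-operation extension is called synchronously for brevity even though the article describes it as asynchronous.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PreResult:
    payload: dict                  # potentially enriched payload
    proceed: bool = True           # False stops the operation
    message: Optional[str] = None  # optional reason when blocked


@dataclass
class Extensions:
    pre: Optional[Callable[[dict], PreResult]] = None
    override: Optional[Callable[[dict], dict]] = None
    post: Optional[Callable[[dict], None]] = None


def handle(payload: dict, core: Callable[[dict], dict], ext: Extensions) -> dict:
    # Pre-operation: may enrich the payload or block execution.
    if ext.pre is not None:
        result = ext.pre(payload)
        if not result.proceed:
            return {"status": "blocked", "message": result.message}
        payload = result.payload
    # Override: replaces the core code entirely when defined.
    action = ext.override if ext.override is not None else core
    event = action(payload)
    # Post-operation: receives the event produced by the action.
    if ext.post is not None:
        ext.post(event)
    return {"status": "ok", "event": event}
```

An `Extensions` instance would be looked up per tenant, resource type, and action from the configuration store. That lookup is omitted here.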

Once request handling is complete, a message is put on a queue to be handled by the next layer: the domain layer for commands, or the query processor for queries.

Domain Layer

This layer is the heart of the system and contains all the business logic to carry out the requested command. It is responsible for changing the state of any entity in the domain model. Requests that do not change the state of any domain entity (queries) should skip that layer.

The domain layer is divided into sub-domains, each responsible for a different aspect of the business. Each such sub-domain will have its own data store and will manage the state of the related entities. 

The execution units (lambdas) responsible for the operation will respond to messages in the queue and process the event. An execution unit can enrich the data received from the request by calling a repository connected to the read model of the DB. The execution unit will then perform all validation, calculate any required values, and construct the new state of the entity (Create/Update/Delete). This layer should have no external dependencies and no direct connection to any other component except the repository. The code in this layer will be pure computation code, and the entity will conform to the entity model. The layer should not be aware of the physical storage model.
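A minimal sketch of such pure computation code is shown below. The `Order` entity, the `place_order` function, the command fields, and the `OrderCreated` event name are all hypothetical. The existing entity (if any) is passed in as an argument, standing in for data fetched from the repository.

```python
from dataclasses import dataclass
from typing import Optional


class ValidationError(Exception):
    """Raised when a command violates a business rule."""


@dataclass(frozen=True)
class Order:
    order_id: str
    quantity: int
    unit_price_cents: int
    total_cents: int
    status: str


def place_order(command: dict, existing: Optional[Order]) -> dict:
    """Validate the command, compute the new entity state, and return an event.

    No I/O happens here: persistence is left to the layers that handle the event.
    """
    if existing is not None:
        raise ValidationError("order already exists")
    if command["quantity"] <= 0:
        raise ValidationError("quantity must be positive")
    order = Order(
        order_id=command["order_id"],
        quantity=command["quantity"],
        unit_price_cents=command["unit_price_cents"],
        total_cents=command["quantity"] * command["unit_price_cents"],
        status="PLACED",
    )
    return {"type": "OrderCreated", "entity": order}
```

Because the function depends only on its inputs, it can be tested without any cloud resources.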

When the resulting entity state is constructed, the layer emits an event. The response handler and the post-operation extension manager each handle this event, and a message is put on the event bus for the application layer.

Repository Component

This component will access the read model based on the physical storage structure and provide data to consumers in DTO or domain model format. This service will never alter the data. By implementing a tri-lateral API pattern, the service could be accessed from any layer in the system and respond both synchronously and asynchronously. This module will not expose the physical storage structure, reducing the dependency between the persistence layer and the rest of the application.
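The way the repository hides the physical storage structure can be sketched as follows. The key layout, the abbreviated column names, and the `OrderDTO` and `get_order` names are assumptions made for illustration and do not reflect a storage design from the article.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical physical read model: composite keys and abbreviated attributes.
_READ_MODEL: dict[tuple[str, str], dict] = {
    ("TENANT#t1", "ORDER#o1"): {"qty": 3, "tot_c": 750, "st": "P"},
}
_STATUS_NAMES = {"P": "PLACED", "C": "CANCELLED"}


@dataclass(frozen=True)
class OrderDTO:
    order_id: str
    quantity: int
    total_cents: int
    status: str


def get_order(tenant_id: str, order_id: str) -> Optional[OrderDTO]:
    """Read-only lookup that maps the physical row to a DTO."""
    row = _READ_MODEL.get((f"TENANT#{tenant_id}", f"ORDER#{order_id}"))
    if row is None:
        return None
    return OrderDTO(
        order_id=order_id,
        quantity=row["qty"],
        total_cents=row["tot_c"],
        status=_STATUS_NAMES[row["st"]],
    )
```

Consumers see only `OrderDTO`. If the key layout or attribute names change, only this mapping needs to change.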

Response handler

This component will respond to domain layer events and, in the case of queries, application layer events. It is responsible for constructing the response and delivering it back to the consumer. It will have all the logic to re-establish communication with the consumer through the API.

Application layer (service orchestrators)

This layer orchestrates the operations that follow from the domain layer's decisions. Each message (resulting from a domain layer event) will have an orchestrator responsible for carrying out the required actions.

The orchestrator will not interact with the hosting system (i.e. AWS) and will not alter the state of the entity coming from the domain layer. The complete state of the entity should be handled only by the domain layer.

For example, when a domain entity is created, the domain layer will emit an “Entity Created” event with the complete aggregate root of the new entity in the domain model structure. This event will be converted to a message waiting to be handled. When the application layer orchestrator for this event grabs that message, it can call the CQRS/DB component and wait for a result event. Upon successful persistence, the orchestrator will call the notification, cache, and reporting service components, waiting for the response and updating observability.
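The example above can be sketched as a small orchestrator function. This is a simplification under stated assumptions: the services are plain callables that return a completion event synchronously, whereas the article describes waiting for result events. The function name, event names, and the `status` field are hypothetical.

```python
from typing import Callable

# A service takes a message and returns a completion event (hypothetical shape).
Service = Callable[[dict], dict]


def orchestrate_entity_created(
    event: dict,
    persist: Service,
    followers: dict[str, Service],
    log: list,
) -> dict:
    """Persist first, then fan out to the follow-up services, logging each step."""
    result = persist(event)
    log.append(("persist", result["status"]))
    if result["status"] != "ok":
        # Follow-up services are not called when persistence fails.
        return {"type": "EntityCreateFailed", "reason": result.get("reason")}
    for name, service in followers.items():
        outcome = service(event)
        log.append((name, outcome["status"]))
    return {"type": "EntityCreateCompleted", "entity_id": event["entity"]["id"]}
```

The orchestrator passes the entity along unchanged and touches no cloud resource itself. It only decides which service is called next and records the outcome.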

This layer is responsible for handling errors, time-outs, observability, and other general aspects of the execution flow.

Services layer

This layer consists of individual, encapsulated service components, each responsible for a different aspect of the execution. Those services never talk to each other; they communicate only with the application layer service orchestrator.

Some common service handler components are:

Database ES/CQRS service component

This service will be responsible for persisting any changes to the database. It will receive the entity to be modified with its new state, update the write model (event store), and project the changes to the read model.

The service should have an internal orchestrator to handle persistence, projections, and all CQRS-related versioning, conflicts, and events. It will always receive the data in the domain model structure and will be the only component that accesses the DB (both read and write) for modifications. This is the only path to alter data in the whole system.

The data will be stored in the write DB (Event Store) for a single aggregate root, and the projections will be handled separately for each insert. If the data arrives at this component in batch form, the internal orchestrator is responsible for breaking the request down into one entity at a time. This approach ensures stateless operation, scalability, better error isolation (a failure affects a single entity), and traceability of each entity in the system.
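A minimal in-memory sketch of this behavior follows. The `EventStore` class, the `expected_version` conflict check, and `persist_batch` are assumptions chosen for illustration. The article does not specify how versioning or conflicts are detected.

```python
class ConcurrencyError(Exception):
    """Raised when the caller's expected version does not match the stream."""


class EventStore:
    """In-memory stand-in for the write model and its read-model projection."""

    def __init__(self) -> None:
        self._streams: dict[str, list[dict]] = {}
        self.read_model: dict[str, dict] = {}

    def append(self, aggregate_id: str, event: dict, expected_version: int) -> int:
        """Append one event for one aggregate root, then project it."""
        stream = self._streams.setdefault(aggregate_id, [])
        if len(stream) != expected_version:
            raise ConcurrencyError(f"{aggregate_id}: expected {expected_version}, found {len(stream)}")
        stream.append(event)
        self._project(aggregate_id, event)
        return len(stream)

    def _project(self, aggregate_id: str, event: dict) -> None:
        """Apply the event's data to the read-model row for this aggregate."""
        row = self.read_model.setdefault(aggregate_id, {})
        row.update(event["data"])
        row["version"] = len(self._streams[aggregate_id])


def persist_batch(store: EventStore, items: list[dict]) -> dict[str, str]:
    """Break a batch into single-entity writes so one failure stays isolated."""
    results: dict[str, str] = {}
    for item in items:
        try:
            store.append(item["aggregate_id"], item["event"], item["expected_version"])
            results[item["aggregate_id"]] = "ok"
        except ConcurrencyError:
            results[item["aggregate_id"]] = "conflict"
    return results
```

In this sketch a conflict on one entity is recorded in the per-entity result and the rest of the batch still proceeds.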

The projection code will be aware of the physical model (of the read DB), and together with the repository component, it will be the only service aware of the physical storage structure and technology.

Notification service component 

This component receives the same message information that originated in the domain layer, reads the configuration data, and decides whether notifications should be sent.

This component is responsible for determining the notification method, the template to be used, and the distribution list for each notification. It will gather any additional data required for the template processing, construct the final message, and then send it via email, text, etc. All the handling of the send queues and logging of internal operations will be internal to this component. If a different domain model entity needs to be updated, the responsibility will be on the CQRS/DB component and will be orchestrated by the application layer based on the completion event of the notification service. In the case of time-triggered notifications, the application layer will maintain the process internally based on configuration data and call this component.
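The decision and message construction steps can be sketched as below. The rule table, its keys, the example address, and the `build_notifications` function are hypothetical. Actual sending through email or text is left out.

```python
from string import Template

# Hypothetical per-tenant notification rules keyed by (tenant, event type).
NOTIFICATION_RULES: dict[tuple[str, str], dict] = {
    ("tenant-a", "OrderCreated"): {
        "channel": "email",
        "template": "Order $order_id was created for $customer.",
        "recipients": ["sales@example.com"],
    },
}


def build_notifications(tenant_id: str, event: dict) -> list[dict]:
    """Return the messages to send for this event, or an empty list if none apply."""
    rule = NOTIFICATION_RULES.get((tenant_id, event["type"]))
    if rule is None:
        return []
    body = Template(rule["template"]).substitute(event["data"])
    return [
        {"channel": rule["channel"], "to": recipient, "body": body}
        for recipient in rule["recipients"]
    ]
```

Returning the constructed messages separately from sending them keeps the decision logic easy to test.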

Cache Service component

This service is responsible for updating and refreshing the cache system. Once called by the application layer, it should check if any of the cache index values should be updated based on configuration – and update the cache accordingly.

Reporting service component 

This service will update the reporting data in the reporting DB. This is a separate and independent component from the CQRS/DB service and has its own projection.

Other services (Analytics, AI/ML, etc.) should follow the same logic and communication patterns.

Request Process Flow

The following steps describe the path a single request takes through the system after it is received by the API:

  • Request received by the API and transferred to the request handling lambda
  • The request is persisted in a request store
  • The request store's create event is emitted and transferred to the authorization lambda
  • Based on the metadata of the request payload (user/resource/action), the request is authorized and passed to the extension handler
  • The extension handler checks if there is a pre-process extension and, if so – calls it 
  • If a pre-process has been called, the extension handler checks if further execution is allowed by the extension response
  • The request payload (or the enriched request payload, if a pre-process extension was called) is put on a message bus to be handled by the domain layer
  • The domain layer handler picks up the message, gets additional information from the repository if needed, and determines if a state change is required. The new status is then calculated, and an event is emitted.
  • Event is put in a message queue to be handled by the application layer
  • The response handler constructs a response message to the consumer
  • The post-extension handler checks if post-processing is defined and calls it asynchronously if required.
  • The responsible orchestrator in the application layer picks up the message
  • The orchestrator calls the CQRS/DB service
  • The DB service persists the data in the write DB following the domain model, then handles the creation event from the write DB to project the data into the read DB using the physical read model. When the process completes, an event is emitted
  • Orchestrator handles the event and calls notification and cache services.
  • Notifications are sent by the notification service, and an event is emitted
  • The cache is updated by the cache service, and an event is emitted
  • The orchestrator handles both events and updates all logs.
  • The orchestrator emits a completion event
  • If notification to the consumer is requested or required, the response component will call the consumer with a completion message.
  • If the request fails, the orchestrator will invoke a notification service with an alert message.

Conceptual Design Diagram

(Figure: conceptual design diagram)

Conclusion

Architecting and designing an enterprise-scale application on AWS using cloud-native architecture can be challenging because the paradigm we are used to has changed. This document describes how to carry the intent and principles we followed before the serverless era into a serverless design. Feel free to tweak the design to your specific needs and apply the principles that are important to you.


More articles by Motty Chen
