With the pandemic and five lockdowns in Melbourne giving me some spare time at home, I thought I would share some of the knowledge I have acquired over the last twenty years working in the integration domain. I have worked as an architect across a wide range of industries, including Telecommunications, Retail, Logistics, Financial Services and Scientific Organisations, in Australia and Europe. One of the documents I most often find missing at organisations is a good Reference Architecture, so in this article I will share an example Reference Architecture for the Integration Domain.
The original article published on LinkedIn is difficult to read and navigate, so I have split it up and published the following posts on my blog:
According to Wikipedia “a reference architecture in the field of software architecture or enterprise architecture provides a template solution for an architecture for a particular domain. It also provides a common vocabulary with which to discuss implementations, often with the aim to stress commonality.” Adopting a reference architecture within an organization accelerates delivery through the re-use of an effective solution and provides a basis for governance to ensure the consistency and applicability of technology use within an organization.
Reference architectures can be an asset to an organisation as they provide the following benefits:
Providing a common language for stakeholders
Providing consistency of implementation of technology solutions
Encouraging adherence to common standards, specifications, and patterns
Component Model
The below component model provides a viewpoint of the components that make up the reference architecture. The model shows a fully featured integration platform; however, many organisations may not require every component. It should be tailored for each organisation by colour coding the component model to show which components are available today and which are on the roadmap, and by removing components that are not being actively considered.
The below section outlines the layers and key components of the Integration Reference Architecture, together they form an Integration Platform. The purpose of an Integration Platform is to provide an infrastructure which can connect producers of data and services with interested consumers.
The Consumers Layer represents the various consumers of an organisation's data and services, which are accessed via the integration platform.
Integration Layer
The Integration Layer consists of several technology components grouped together by common functionality. Reference architectures are often too abstract and stakeholders cannot relate to the concepts, so I try to use real-world component names.
API Management is the process of creating Application Programming Interfaces (APIs), defining and enforcing usage policies, controlling access, collecting and analysing usage statistics, reporting on performance, and interacting with subscribers.
Event Stream Processing (ESP) is software that can filter, aggregate, enrich, and analyse high-throughput data from multiple sources to identify events in real time, detect threats and opportunities, and automate immediate actions. This component would typically interface with Internet of Things (IoT) devices.
Workflow is the automation of a business process, in whole or in part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules. Two major components are typically used when implementing workflow:
Business Rule Management System (BRMS) is a component used to define, deploy, execute, monitor, and maintain the decision logic for an organization.
Business Process Management (BPM) is a component used to automate the sequence of actions or activities required to complete a business process. This may involve documents, information or tasks being passed from one participant to another for action according to a set of procedural rules. Each step of the process is tracked and can be escalated if not completed in a timely manner.
Artificial Intelligence (AI) is the ability of machines to imitate intelligent human behaviour and perform tasks that normally require humans. Machine Learning (ML) is the ability of machines to learn without being explicitly programmed. Data passing through the integration layer can be fed to the ML models to train the models to recognise patterns which can then be used to make intelligent decisions.
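As an illustrative sketch only of training a model on data captured from the integration layer and then using it at runtime to make a decision (the scikit-learn library, the features and the fraud-detection use case are assumptions for the example, not part of the reference architecture):

```python
from sklearn.ensemble import RandomForestClassifier

# Illustrative feature vectors captured from messages flowing through the integration layer
# (order value, item count, hour of day) with a label indicating whether the order was fraudulent
training_features = [[120.0, 3, 14], [9500.0, 1, 2], [45.5, 2, 11], [8800.0, 1, 3]]
training_labels = [0, 1, 0, 1]

# Train the model on historical integration data
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(training_features, training_labels)

# At runtime, score a new message as it passes through the platform
new_order = [[7600.0, 1, 1]]
print("fraud risk:", model.predict(new_order)[0])
```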
Middleware Services are a collection of software components that connect consumers of data with providers of data in real-time. Also known as an Enterprise Service Bus (ESB). Middleware Services typically provide the following capabilities:
Message Broker - software or hardware infrastructure supporting the sending and receiving of messages between distributed systems. Message brokers provide an asynchronous communications protocol, meaning that the sender and receiver of a message do not need to interact with the message queue at the same time. Messages placed onto the queue are stored until the recipient retrieves them (a minimal sketch follows this list).
Routing – the ability to determine the consumers of data based on both pre-configured rules and dynamically based on content.
Protocol Translation - transparently translates between communication protocols e.g., HTTP(S), JMS.
Data Transformation is the ability to transform data from one format to another.
Cross Reference supports the creation and management of cross-referencing relationships and code conversions between the applications being integrated.
Security Policies - can be applied to middleware services to limit who can invoke the service and to encrypt data.
Service Orchestration - the coordination of multiple services exposed as a single aggregate service.
Adapters or connectors – provide a bridge that allows middleware services to interact with packaged applications or technologies e.g., SAP or LDAP.
Services - encapsulate business activity into discrete services which can make use of any of the middleware services listed above.
Service Container is a runtime which allows integration services to be deployed, scaled, and changed independently of other services.
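The following is a minimal sketch of the asynchronous messaging capability described in the Message Broker item above, assuming a RabbitMQ broker on localhost and the pika Python client; the queue name and payload are illustrative only.

```python
import json
import pika

# Connect to an assumed RabbitMQ broker on localhost (illustrative only)
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# Declare a durable queue so messages survive a broker restart
channel.queue_declare(queue="orders", durable=True)

# Producer: publish a message; the consumer does not need to be online yet
order = {"orderId": "12345", "status": "CREATED"}
channel.basic_publish(
    exchange="",
    routing_key="orders",
    body=json.dumps(order),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)

# Consumer: retrieve the stored message whenever it is ready
def handle_order(ch, method, properties, body):
    print("Received:", json.loads(body))
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="orders", on_message_callback=handle_order)
channel.start_consuming()
```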
Data Integration is used to collect, aggregate, transform and transport large data sets. Data Integration may include the following components:
Extract Transform Load (ETL) capabilities for the purpose of moving large amounts of data in batches between systems.
Change Data Capture (CDC) is the process of capturing changes made at a data source and making the data available to other systems.
Master Data Management (MDM) provides the tools to consistently define and manage the critical data of an organization, providing a single point of reference and the ability to synchronise the data with interested systems.
Secure File Transfer (SFT) provides the secure transfer of files in an efficient and a reliable manner. SFT core functionalities include the ability to secure files in transit and at rest, and the reporting and auditing of file transfer activity.
Information
The Information Layer provides a unified view of an organisation’s information assets and tools to support the use of the information in integration services.
Data Definition and Modelling tools support the creation of data models for use in service definitions and data persistence.
Common Vocabulary is a repository of common business objects, their attributes, and relationships. This is not a Common or Canonical Data Model (CDM). Using a Canonical Data Model in integration services can make them more complex and costly to maintain, because all attributes of an information object must be included even when a service only needs a few simple attributes. A more modern approach is to define integration services based on the contents of the common vocabulary; domain teams are more likely to leverage the common vocabulary when designing services.
Management and Monitoring
This layer is responsible for the management of components including deployment and runtime. It also provides monitoring of components to detect when they are not operating within acceptable thresholds and raising alerts to support teams when human intervention is required.
Service Management manages integration services including deployment, versioning, rollback, starting and stopping.
Metrics Monitoring captures metrics from components and underlying infrastructure at runtime to measure their availability, performance, integrity, and reliability. Alerts can be raised when components are not operating within acceptable thresholds and can clear automatically once operations return to normal.
Logging & Auditing captures logs from the various components which can then be used to raise alerts or perform root cause analysis of problems. It also captures and stores audit data for tracking purposes and to comply with any regulatory requirements.
Error Handling captures and records exceptions raised by components, attempts to remediate known errors, and alerts operations in unhandled circumstances.
Job Scheduling controls unattended background program execution or batch processing.
Governance
The Governance Layer includes processes and tools to manage artifacts, policies, and lifecycle of services.
Service Catalogue maintains a central catalogue of design time artifacts for services including service definitions, policies and documentation. The catalogue can be used to govern the lifecycle of services.
Service Registry provides service consumers with the location of an available service instance at runtime.
Security
This layer provides security policy enforcement, identity and access control, and data security features. The following components are often embedded into multiple components of the Integration Platform as they provide common features, so they are described here:
Authentication & Authorization manages the verification of a user's identity and role-based access control to platform functions such as service invocation, viewing service configurations and definitions, and monitoring information.
Data Security focuses on ensuring the confidentiality, integrity and availability of individual data flowing through the integration platform.
Confidentiality is concerned with protecting data from disclosure to unauthorized parties.
Integrity is concerned with protecting data from being modified by unauthorized parties.
Availability is concerned with ensuring that services are available when required.
Non-repudiation is the ability to prove that a specific transaction was sent at particular time by a particular party to another party.
Secrets Management is a component used to store digital secrets such as usernames, passwords, API keys and tokens in an encrypted datastore. Secrets are accessed via a command line or API call, either at deployment time or at runtime, to retrieve the current secrets required by an application or script.
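As a minimal sketch of the runtime retrieval described above, the following assumes a HashiCorp Vault server and the hvac Python client; the Vault address, token source and secret path are illustrative only.

```python
import os
import hvac

# Connect to an assumed Vault server; address and token come from the environment
client = hvac.Client(
    url=os.environ.get("VAULT_ADDR", "https://vault.example.internal:8200"),
    token=os.environ["VAULT_TOKEN"],
)

# Retrieve the current database credentials at runtime (path is illustrative)
response = client.secrets.kv.v2.read_secret_version(path="integration/orders-service")
credentials = response["data"]["data"]

# Use the secret without ever writing it to disk or source control
db_user = credentials["username"]
db_password = credentials["password"]
```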
Transport Security is used to provide point-to-point security at the transport layer between service consumers and service providers.
Policy Enforcement & Monitoring serves as a policy enforcement point and enables automated monitoring of policy violations, for example the number of requests per minute.
DevOps
This layer provides all the necessary modelling, design, testing and deployment capabilities to develop and deploy services.
Integrated Development Environment provides tools and processes that enable integration services to be designed and built during delivery phases.
Testing provides testing tools to facilitate quality assurance and verify that the integration meets predefined requirements.
Continuous Integration / Continuous Delivery and Deployment (CI/CD) is a method where steps of the development lifecycle are automated. CI is where code is checked in, tested and merged into source control. CD is the packaging of a service and its deployment to one or more environments.
IT Automation Tools - the software development life cycle includes many steps that are repeated each time code is deployed to production. To increase the speed and reliability of deployments, frequently repeated tasks should be automated using one of these tools.
The Producers Layer represents the various producers of an organisation's data and services that are connected to consumers via the Integration Platform.
The Evolution of Integration Architectures
Over the years there have been several changes in the architecture of enterprise integration systems used in organisations. An architect should be aware of this evolution and of what stage their organisation may be at.
Hub and Spoke
In Hub and Spoke architecture all applications connect to a central integration application (hub) which routes data to interested applications via adapters (spokes). The centralised hub introduces a single point of failure and a potential performance bottleneck under load. Most of these solutions were developed using custom or proprietary technologies which made them difficult and costly to integrate with many applications.
Enterprise Service Bus
An Enterprise Service Bus (ESB) is a software architecture that enables communication between applications. Instead of applications communicating directly, each application communicates with the central bus, which handles transforming and routing messages to their appropriate destinations.
Service Oriented Architecture
Service-oriented architecture (SOA) is an enterprise-wide approach to software development that makes reusable software components, or services, available over a network. Each service comprises the logic and integrations required to execute a specific business function, e.g. checking a customer's credit rating. SOA services are typically deployed on an Enterprise Service Bus to take advantage of its message routing and data transformation capabilities when communicating with legacy systems.
Important SOA concepts to be aware of:
The Service Consumer is an application, service, or some other type of software module that requires a service. It is the entity that initiates the locating of the service in the service registry, binding to the service over a transport, and executing the service function by sending it a request formatted according to the contract.
The Service Provider is the network-addressable entity that accepts and executes requests from consumers. It can be a mainframe system, a component, or some other type of software system that executes the service request.
A Service Contract is a specification of the way a consumer of a service will interact with the service provider. It specifies the format of the request and response from the service typically in SOAP WSDL/XML.
A Service Registry is a tool where service providers can publish their contracts so that they can be found by service consumers.
Many ESBs have been deployed successfully in organisations. However, an ESB can become a bottleneck in a SOA deployment. Typically, all applications must communicate through the same ESB, which is configured by a central team, and a change to one service can often impact other services, requiring extensive regression testing. For this reason, most ESB vendors are rebranding their products away from ESB and using terms such as middleware and integration platform.
Microservices
More recently, organisations have been adopting a microservices architecture. It is similar to SOA in that applications consist of loosely coupled services communicating over a network; however, in a microservices architecture services are typically owned by domain teams, they are fine-grained, and the protocols are lightweight, such as REST/JSON.
Important concepts in microservices architecture:
Services are small
Services own their own data including the data model and datastore i.e., they avoid shared databases
Services communicate using REST APIs and asynchronous messaging
Services are organised by business capabilities with a logical boundary referred to as “bounded context”
Services are implemented in different programming languages and datastores depending on the best fit for the solution
Services are autonomously developed, deployed and scaled.
A combination of the above factors makes services faster to develop and change
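As an illustrative sketch only, the following shows a small, fine-grained REST service of the kind described above, owning its own data and exposing a lightweight JSON API; the Flask framework, resource names and in-memory datastore are assumptions for the example.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# The service owns its own datastore; an in-memory dict stands in for a real database
customers = {"c-001": {"id": "c-001", "name": "Acme Pty Ltd", "creditRating": "A"}}

@app.route("/customers/<customer_id>", methods=["GET"])
def get_customer(customer_id):
    """Return a single customer owned by this service's bounded context."""
    customer = customers.get(customer_id)
    if customer is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(customer)

@app.route("/customers", methods=["POST"])
def create_customer():
    """Create a customer; other services interact only via this API, never the datastore."""
    payload = request.get_json()
    customers[payload["id"]] = payload
    return jsonify(payload), 201

if __name__ == "__main__":
    app.run(port=8080)
```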
SOA and Microservice architectures are compared in the table below:
The integration reference architecture above is essentially connecting service consumers with service providers via the integration platform components.
To determine the most appropriate integration solution pattern and platform components to meet the business requirements, follow the solution decision framework below.
Requirements – basic integration requirements must be captured to select the correct pattern.
Reuse – Is there an existing integration service that can satisfy the requirements?
Boundaries – Across what boundaries will the integration take place?
Interaction – What style of communication is required?
Platform – what is the most appropriate platform component to implement the solution?
Solution – select an approved pattern based on the analysis.
High-level Requirements
The following high-level integration requirements must be captured when analysing an integration solution:
Purpose of the service. What real world effect will the required service have in business terms? E.g., Send a Purchase Order to a supplier electronically.
What Business Function/Process/Capability does the integration enable or orchestrate? E.g., Procurement.
What Information Objects are exchanged by the integration? Objects can be selected from a common vocabulary or Enterprise Information Model. E.g., Customer
What Volume and frequency of data will pass through the integration? E.g., a daily batch of 10,000 records, or 1 request per minute per device
Where possible determine the number of Consumers that may consume this integration. E.g., 500 field workers
Where possible determine how many Provider systems may provide data for this integration.
What will trigger the service and when. E.g., Real-time on Employee update, or daily batch extract of all employee changes at 9 PM.
Determine where service providers and consumers are located. Will the integration cross any boundaries? E.g., internal systems, external systems.
How will service consumers and providers interact: Request/Reply, Publish Subscribe, Batch. This will be covered in detail in section Interaction.
What integration technologies do the service consumers and providers support? Do they support standard protocols such as HTTP or messaging such as JMS? Do they have native integration plugins which accelerate integration, e.g., a Salesforce Adapter?
What is the Data Classification of the data exchanged? This will drive the security controls required.
If the above high-level integration requirements are not known at the time of analysis, then there is a risk of rework due to an incorrect pattern being selected. Implementation estimates should be increased to reflect the level of uncertainty.
Reuse
An organisation’s Service Catalogue or Registry will contain high level details for each service including descriptions of services and their data structures described in real world business language.
Search the Service Catalogue or Registry for an existing service which can be reused to meet the integration requirements captured above.
If a service is found which partially meets the requirements, investigate creating a new version of the service that is expanded to meet the new requirements. If no suitable service can be found, continue through the solution framework to determine the recommended end-to-end solution pattern for the new service.
Boundaries
The boundaries that the integration crosses must be understood. Please choose the boundary from the table below.
Interaction
The integration interaction style between consumers and providers must be understood. Please choose an interaction style from the table below. The overall end-to-end interaction style should be considered as per the requirements captured above. Composite interactions are one or more of the interaction styles aggregated together.
Use the Interaction Style selection tree below to determine the appropriate interaction style:
Platform Component
Use the component list below to match the business requirements of the new integration service with the key characteristics of each Integration Platform component to determine the appropriate platform to implement the integration solution.
Manual - (Manual Process) - A human can perform the integration with a manual process. The manual process can meet the service-level requirements. The integration involves a small number of low-frequency transactions, e.g., a weekly data extract. The cost of the manual process is far less than the cost of creating and supporting an automated service for the lifetime of the service.
Middleware - (Middleware Services) – The integration must be performed in near real-time. The integration requires asynchronous messaging, e.g., a message is asynchronously passed from a producer to a consumer via a topic or queue. The integration may require mediation from one protocol to another, e.g., HTTP to JMS. The integration may require data transformation from one format to another, e.g., JSON to CSV. The integration needs to be routed based on rules or the content of the transaction.
APIM - (API Management) - The integration involves an HTTP web API such as REST or SOAP. Policies need to be applied to the API to protect the backend systems. Examples of policies include performance policies (throttling the number of API calls per consumer or provider system) and security policies (verifying a consumer's identity before allowing access to API functionality).
ETL - (Extract Transform Load) - The integration is performed in batches. The integration extracts large sets of data from one datastore and makes it available in another datastore. The integration may require some transformation between datastores. The integration needs to be resilient and able to resume if interrupted
CDC - (Change Data Capture) - The integration requires the changes made in one system's datastore to be made available to other systems. The integration only requires data that has changed. The changes need to be made available in near real-time or in small batches, typically less than 15-minute windows. The system may be a legacy system which the organisation does not want to invest in.
SFT - (Secure File Transfer) - The integration requires files to be transferred between systems. The integration needs to be resilient and able to resume if interrupted.
MDM - (Master Data Management) - The organisation's reference data needs to be managed in a system which provides a single point of reference. There are four major patterns for storing reference data in an MDM system: Registry – a central registry links records across various source systems; Consolidation – data is taken from multiple source systems to create a central golden copy; Coexistence – data is synchronised across multiple systems to keep all systems aligned to the golden record; Centralized – the golden record is authored in the MDM system and pushed out to systems. There are three main patterns for integrating MDM systems with business systems: Data consolidation – source systems push their changes to the MDM to create the golden record; Data federation – systems query the MDM, which returns the golden record data; Data propagation – the golden record is synchronised with interested systems.
ESP - (Event Stream Processing) - The integration requires events to be captured from sources at extremely high frequency, e.g., several thousand messages per second. These events need to be queried, filtered, aggregated, enriched, or analysed. The events are retained for a period in a message log so they can be replayed. Event consumers maintain their own state and request the record they are up to.
Workflow - (Workflow) - A business process exists which can be automated by breaking the process up into documents, information or tasks that can be passed between the participants of the business process and tracked. The process can be modelled and executed in a Workflow component such as a Business Process Management (BPM) tool. Where multiple interfaces or applications require the same decision logic to be defined and executed, this can be centrally managed in a Business Rule Management System (BRMS) and invoked via an API call. In more recent microservices architectures there has been a shift away from services being orchestrated by a single component such as a BPM tool invoking multiple services, as it can become a bottleneck that is difficult to update. Microservices architectures typically implement independent microservices that perform their part of the business process in isolation by reacting to published events; this is called Choreography. It is a more scalable and adaptable architecture; however, it requires supporting services like distributed logging and tracing.
AIML - (Artificial Intelligence / Machine Learning) - Artificial intelligence models exist that need to be trained with data flowing through the integration platform. The nominated data can be captured and sent to the artificial intelligence model for training. Once trained the model can be invoked at runtime to make intelligent decisions.
Native - (Native Integration) - Some enterprise applications and SaaS software come with native integration capabilities; these often provide a set of supported integrations with other applications or common technologies, e.g., identity providers. Native integration should be considered when it provides secure, accelerated and supported integration with other software that would be far more complex if implemented on an integration platform component.
Solution Selection
After capturing the high-level requirements of the integration solution and verifying that there is no existing service which can be reused, a new integration service may need to be built.
Use a combination of the integration boundary, interaction style and platform component selected above to determine the appropriate end-to-end integration pattern.
For example, for an internal boundary with a synchronous request/reply interaction implemented on Middleware Services, the Solution Id would be Internal-SyncReqResp-Middleware.
The table below contains a full list of solution ids. Look up the solution id in the list and it will map to a solution pattern. The solution id selected may not be a recommended combination; in that case the list will provide alternative recommendations. Details for each solution pattern can be found in the next section.
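The following is a small illustrative sketch of composing a solution id from the boundary, interaction style and platform component and looking it up in a pattern catalogue; apart from the Internal-SyncReqResp-Middleware example above, the enumeration values and catalogue entries shown are assumptions, not the full table.

```python
# Compose a solution id from the three selections made in the framework above
def solution_id(boundary: str, interaction: str, component: str) -> str:
    return f"{boundary}-{interaction}-{component}"

# A tiny, illustrative slice of a solution-pattern lookup table (not the full catalogue)
pattern_catalogue = {
    "Internal-SyncReqResp-Middleware": "Middleware-Internal",
    "Internal-SyncReqResp-APIM": "API-Internal",
    "External-Batch-SFT": "FileTransfer-External",
}

selected = solution_id("Internal", "SyncReqResp", "Middleware")
print(selected, "->", pattern_catalogue.get(selected, "no recommended pattern; see alternatives"))
```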
Solution Pattern Catalogue
This section contains the details of the solution patterns, each pattern provides details such as:
a problem description of when the pattern should be used.
a description of the solution.
guidelines that describe how the solution should be implemented on the integration platform components.
guidelines that describe how security should be enforced.
a diagram showing the flow through platform components.
The patterns consider commonly used security zones as follows:
Trusted Zone – a zone where the organisation has full control; this is typically where business applications and data are hosted.
Semi-Trusted (DMZ) – a typical demilitarised zone between the Untrusted Zone and the Trusted Zone where external connections are terminated on a gateway or proxy; the connection is validated before a new connection is created going into the Trusted Zone.
Untrusted Zone – a zone where the organisation has little to no control, such as the internet.
Solution Pattern Name: Manual-Internal
Problem: Data needs to be shared between internal applications on an infrequent basis and the cost of an automated integration cannot be justified.
Solution: Document and execute a manual process where a user manually extracts data from one application and uploads it into another
Solution Guidelines: The frequency of the integration must be incredibly low e.g., weekly, monthly, yearly task, anything more regular may be missed.
Security Guidelines:
The users who are executing the task must have access to both applications; this may require opening firewalls.
For more sensitive data, consider running the task from a virtual desktop that is restricted to specific users and only has access to the two applications.
Context Diagram:
Solution Pattern Name: Manual-External
Problem: Data needs to be shared between an internal and external application on an infrequent basis and the cost of an automated integration cannot be justified.
Solution: Document and execute a manual process where a user manually extracts data from one application and uploads it into another.
Solution Guidelines: The frequency of the integration must be incredibly low e.g., weekly, monthly, yearly task, anything more regular may be missed.
Security Guidelines:
The users who are executing the task must have access to both applications; this may require opening firewalls and configuring a proxy server out to the external application.
For more sensitive data, consider running the task from a virtual desktop that is restricted to specific users and only has access to the two applications.
Context Diagram:
Solution Pattern Name: Manual-Cloud
Problem: Data needs to be shared between two external applications on an infrequent basis and the cost of an automated integration cannot be justified.
Solution: Document and execute a manual process where a user manually extracts data from one application and uploads it into another.
Solution Guidelines: The frequency of the integration must be incredibly low e.g., weekly, monthly, yearly task, anything more regular may be missed.
Security Guidelines:
The users who are executing the task must have access to both applications; this may require opening firewalls and configuring a proxy server out to the external applications.
For more sensitive data, consider running the task from a virtual desktop that is restricted to specific users and only has access to the two applications. Ideally the user desktop connects to the cloud provider over a dedicated connection rather than over the internet, e.g., Direct Connect, Express Route, Dedicated Interconnect.
Context Diagram:
Solution Pattern Name: API-Internal
Problem: An internal application needs to share data or a service with an internal user or system in real-time.
Solution: Deploy a configuration on the API Gateway to proxy the connection from the internal system or user to the internal system that provides the data or service.
Solution Guidelines:
Implement configuration on the API Gateway that authenticates, authorises, and throttles connections to the internal system
The internal system providing the API endpoint will:
Use REST or SOAP over HTTPS
Describe the API using standard description specifications: Open API Specification (OAS), Web Service Description Language (WSDL)
Use standard data formats such as JSON and XML
Information objects exchanged should be based on common vocabulary or industry standard formats
The Interface must be documented
The organisation should define API standards to ensure APIs are presented consistently to consumers
If the APIs exposed by the provider system require data transformations or orchestration of several API calls into a single service, consider using Middleware-Internal pattern.
Security Guidelines:
Enable transport layer security on the API endpoint exposed to the consumer
All connection must be made via the API gateway to enforce authentication, authorisation and throttling policies on inbound requests
Use OAuth, API Keys, or mutual TLS certificates to authenticate consumers of the service
Use API Management to enforce authentication, authorisation and throttling policies on API requests
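As a minimal, illustrative sketch of a consumer calling an internal API through the gateway over TLS with an API key, per the guidelines above (the gateway hostname, path and header name are assumptions for the example):

```python
import os
import requests

# Gateway endpoint and API key header are illustrative; use your organisation's values
GATEWAY_URL = "https://api-gateway.example.internal/customer-service/v1/customers/c-001"
API_KEY = os.environ["CUSTOMER_API_KEY"]  # retrieved from secrets management, never hard-coded

response = requests.get(
    GATEWAY_URL,
    headers={"X-API-Key": API_KEY, "Accept": "application/json"},
    timeout=10,   # fail fast rather than hanging the consumer
    verify=True,  # enforce TLS certificate validation
)
response.raise_for_status()
print(response.json())
```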
Context Diagram:
Solution Pattern Name: API-External
Problem: An internal application needs to share data or a service with an external user or system in real-time.
Solution:
For incoming APIs, deploy a configuration on the API Gateway to proxy the connection from external system or user to the internal system that provides the data or service.
For outbound APIs, deploy a configuration on a forward proxy to enable connectivity to the external provider.
Solution Guidelines:
Implement configuration on the API Gateway that authenticates, authorises, and throttles connections to the internal system
The internal system providing the API endpoint will:
Use REST or SOAP over HTTPS
Describe the API using standard description specifications: Open API Specification (OAS), Web Service Description Language (WSDL)
Use standard data formats such as JSON and XML
Information objects exchanged should be based on common vocabulary or industry standard formats
The Interface must be documented
The organisation should define API standards to ensure APIs are presented consistently to consumers
If the APIs exposed by the provider system require data transformations or orchestration of several API calls into a single service, consider using Middleware-External pattern.
If the internal application initiates the connection, then the outbound connection must connect via a forward proxy.
Security Guidelines:
Inbound connections must be made via the API gateway in the DMZ
Enable transport layer security on the API endpoint
Use API Keys or digital certificates to authenticate consumers of the service
Use API Management to enforce authentication, authorisation and throttling policies on inbound requests.
For outbound connections configure the forward proxy and firewall to allow connections from the internal application to the external application.
Context Diagram:
Solution Pattern Name: Middleware-Internal
Problem: An internal system is interested in consuming data from another internal system in real-time
Solution: Deploy a real-time service on the Middleware Services component which can connect interested consumers with a data provider over a variety of protocols and data formats
Solution Guidelines:
Implement an Integration Service using Middleware Services such as a message broker or an Integration Service
Use standard protocols such as HTTPS, AMQP, MQTT
Use common data formats such as JSON or XML
Information objects exchanged should be based on common vocabulary
The Interface must be documented ideally using common specification languages such as Open API Specification (OAS) and Web Service Description Language (WSDL)
Security Guidelines:
Enable transport layer security on all interfaces
Use OAuth, API Keys, mutual TLS certificates, username/password to authenticate consumers of the service
If the Integration Service is exposing an API use API Management to enforce security and performance policies on API requests
Context Diagram:
Solution Pattern Name: Middleware-External
Problem: An internal system needs to exchange data in real-time with an external system. The exchange may be inbound or outbound.
Solution:
Deploy a real-time service on the Middleware Services component which can connect interested consumers with a data provider over a variety of protocols and data formats
For inbound APIs deploy a configuration on the API Gateway to make the connection between the Middleware Services and the external system
For outbound APIs, deploy a configuration on a forward proxy to enable connectivity to the external provider.
For interfaces that use messaging deploy a message broker in the DMZ to connect with the external system
Solution Guidelines:
Implement an Integration Service using Middleware Services such as a message broker or an Integration Service
Use standard protocols such as HTTPS, AMQP, MQTT
Use common data formats such as JSON or XML
Information objects exchanged should be based on common vocabulary
Interfaces should be documented using common specification languages such as Open API Specification (OAS) and Web Service Description Language (WSDL)
Security Guidelines:
Inbound connections must be made via the API gateway in the DMZ
Enable transport layer security on the API endpoint
Use API Keys or digital certificates to authenticate consumers of the service
Use API Management to enforce authentication, authorisation and throttling policies on inbound requests.
For outbound connections configure the forward proxy and firewall to allow connections from the internal application to the external application.
For message broker connections, the firewall must whitelist the client's IP address and client certificate. The message broker in the DMZ will exchange messages with the message broker in the trusted network. External clients must not directly connect to the internal message broker.
Context Diagram:
Solution Pattern Name: Middleware-Cloud
Problem: Two external systems need to exchange data in real-time and a native solution is not available
Solution: Deploy a real-time service on the Middleware Services component which can connect the two systems
Solution Guidelines:
Implement an Integration Service using Middleware Services such as a message broker or an Integration Service
Use standard protocols such as HTTPS, AMQP, MQTT
Use common data formats such as JSON or XML
Information objects exchanged should be based on common vocabulary
Interfaces should be documented using common specification languages such as Open API Specification (OAS) and Web Service Description Language (WSDL)
Consider deploying the solution on components deployed in the cloud as part of a hybrid integration platform.
Security Guidelines:
Inbound connections must be made via the API gateway in the DMZ
Enable transport layer security on the API endpoint
Use API Keys or digital certificates to authenticate consumers of the service
Use API Management to enforce authentication, authorisation and throttling policies on inbound requests.
For outbound connections configure the forward proxy and firewall to allow connections to the external application.
For message broker connections, the firewall must whitelist the client's IP address and client certificate. The message broker in the DMZ will exchange messages with the message broker in the trusted network. External clients must not directly connect to the internal message broker.
Consider a dedicated connection to the cloud provider rather than over the internet, e.g., Direct Connect, Express Route, Dedicated Interconnect
Context Diagram:
Solution Pattern Name: FileTransfer-Internal
Problem: An internal system needs to exchange files with another internal system
Solution: Configure a file transfer on the Secure File Transfer component that will automatically transfer the file from the data provider to the interested consumers or allow them to pick up the file when required
Solution Guidelines:
Configure the internal system credentials on the Secure File Transfer component
Implement a file transfer configuration on the Secure File Transfer component
Use standard protocols such as SFTP
File formats should be agreed between consumers and providers, with a preference for industry standard formats
The Interface must be documented including account names, directory structure, file naming, file formats, schedule
Middleware Services or ETL components should be used to perform transformations if required
Security Guidelines:
Enable transport layer security on consumers and providers connections
Use credentials or SSH keys to authenticate consumers and providers
Files should be scanned for malware during transfer
The FTP protocol must not be used as it sends credentials in clear text
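As an illustrative sketch of an SFTP transfer authenticated with an SSH key rather than FTP, using the paramiko Python library (the hostnames, account, paths and key location are assumptions for the example):

```python
import paramiko

# Connect to the assumed Secure File Transfer endpoint using an SSH key, never FTP
ssh = paramiko.SSHClient()
ssh.load_system_host_keys()
ssh.set_missing_host_key_policy(paramiko.RejectPolicy())  # do not trust unknown hosts
ssh.connect(
    hostname="sft.example.internal",
    username="payroll-integration",
    key_filename="/etc/integration/keys/payroll_id_rsa",  # illustrative path
)

# Transfer the file over the encrypted channel
sftp = ssh.open_sftp()
sftp.put("/data/outbound/payroll_20240630.csv", "/inbound/payroll_20240630.csv")
sftp.close()
ssh.close()
```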
Context Diagram:
Solution Pattern Name: FileTransfer-External
Problem: An internal system needs to exchange files with an external party.
Solution: Configure a file transfer on the Secure File Transfer server component that will transfer the file between the internal system and the external party.
Solution Guidelines:
Configure the external party’s credentials on the Secure File Transfer server component
Implement a file transfer configuration on the Secure File Transfer server component
Use standard protocols such as SFTP
File formats should be agreed between consumers and providers, with a preference for industry standard formats
The Interface must be documented including account names, directory structure, file naming, file formats, schedule
Middleware Services or ETL components should be used to perform transformations if required
Security Guidelines:
Enable transport layer security on consumers and providers connections
Use credentials or SSH keys to authenticate consumers and providers
Files should be scanned for malware during transfer, especially files coming from an external system
The FTP protocol must not be used as it sends credentials in clear text
A Secure File Transfer Gateway must be used in the DMZ to authenticate users before files can be transferred to the internal server component
Files should not be stored in the DMZ
Context Diagram:
Solution Pattern Name: FileTransfer-Cloud
Problem: Two external systems need to exchange files and there is no native solution available.
Solution: Configure a file transfer on the Secure File Transfer server component that will transfer the file between the external systems.
Solution Guidelines:
Configure the external parties’ credentials on the Secure File Transfer component
Implement a file transfer configuration on the Secure File Transfer component
Use standard protocols such as SFTP
File formats should be agreed between consumers and providers, with a preference for industry standard formats
The Interface must be documented including account names, directory structure, file naming, file formats, schedule
Middleware Services or ETL components should be used to perform transformations if required
Consider deploying the solution on components deployed in the cloud as part of a hybrid integration platform.
Security Guidelines:
Enable transport layer security on consumers and providers connections
Use credentials or SSH keys to authenticate consumers and providers
Files should be scanned for malware during transfer, especially files coming from an external system
The FTP protocol must not be used as it sends credentials in clear text
A Secure File Transfer Gateway must be used in the DMZ to authenticate users before files can be transferred to the internal server component
Files should not be stored in the DMZ
Consider a dedicated connection to the cloud provider rather than over the internet, e.g., Direct Connect, Express Route, Dedicated Interconnect
Context Diagram:
Solution Pattern Name: ESP-Internal
Problem: An internal system needs to stream data to another interested internal system
Solution: Deploy a topic and partitions for consumers to consume messages that have been streamed
Solution Guidelines:
Configure credentials for the internal applications on the ESP component
Configure a topic for the producer to publish messages to
Create queues or partitions for consumers to consume messages from
Optionally create streaming APIs that process the topic data in real time, for example aggregating messages over a time window or joining data, and publish the result to another topic
Use a common streaming protocol like Kafka
Information objects exchanged should be based on common vocabulary
The Interface must be documented using standards like CloudEvents, OpenTelemetry, AsyncAPI
Security Guidelines:
Enable transport layer security on all connections
Configure credentials or mutual certificates to authenticate producers and consumers
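A minimal sketch of streaming over a topic using the kafka-python client; the broker address, topic name and payloads are assumptions for the example, and TLS/SASL settings would be added per the security guidelines above.

```python
import json
from kafka import KafkaProducer, KafkaConsumer

BOOTSTRAP = "kafka.example.internal:9093"  # illustrative broker address

# Producer: publish telemetry events to the agreed topic
producer = KafkaProducer(
    bootstrap_servers=BOOTSTRAP,
    value_serializer=lambda event: json.dumps(event).encode("utf-8"),
)
producer.send("device-telemetry", {"deviceId": "sensor-42", "temperature": 21.7})
producer.flush()

# Consumer: each consumer group tracks its own offset, so messages can be replayed
consumer = KafkaConsumer(
    "device-telemetry",
    bootstrap_servers=BOOTSTRAP,
    group_id="building-analytics",
    auto_offset_reset="earliest",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)
for message in consumer:
    print(message.value)
```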
Context Diagram:
Solution Pattern Name: ESP-External
Problem: An internal system needs to stream data to an external system and vice versa
Solution: Deploy an ESP topic for the producer system to publish messages to and a queue/partition for the consumer system to consume messages from via a proxy in the DMZ
Solution Guidelines:
Configure credentials for the systems on the ESP component
Configure a topic for the producer to publish messages to
Create queues/partitions for the consumer system to consume messages from
Optionally create streaming APIs that process the topic data in real time, for example aggregating messages over a time window or joining data, and publish the result to another topic
Use a common streaming protocol like Kafka
Information objects exchanged should be based on common vocabulary
The Interface must be documented; consider the following specifications: CloudEvents, OpenTelemetry, AsyncAPI
Security Guidelines:
Enable transport layer security on all connections
Configure credentials to authenticate producers and consumers
Consider using client certificates for external parties
Whitelist the client IPs on the firewall
A proxy in the DMZ will proxy connections through to the server in the trusted zone; external clients must not directly connect to the internal ESP server.
Context Diagram:
Solution Pattern Name: ESP-Cloud
Problem: Two external systems need to stream data to each other and no native solution exists
Solution: Deploy an ESP topic for the producer system to publish messages to and a queue/partition for the consumer system to consume messages from via a proxy in the DMZ
Solution Guidelines:
Configure credentials for the systems on the ESP component
Configure a topic for the producer to publish messages to
Create queues/partitions for the consumer system to consume messages from
Optionally create streaming APIs that process the topic data in real time, for example aggregating messages over a time window or joining data, and publish the result to another topic
Use a common streaming protocol like Kafka
Information objects exchanged should be based on common vocabulary
The Interface must be documented; consider the following specifications: CloudEvents, OpenTelemetry, AsyncAPI
Consider deploying the solution on components deployed in the cloud as part of a hybrid integration platform.
Security Guidelines:
Enable transport layer security on all connections
Configure credentials to authenticate producers and consumers
Consider using client certificates for external parties
Whitelist the client IPs on the firewall
A proxy in the DMZ will proxy connections through to the server in the trusted zone; external clients must not directly connect to the internal ESP server.
Consider a dedicated connection to the cloud provider rather than over the internet, e.g., Direct Connect, Express Route, Dedicated Interconnect
Context Diagram:
Solution Pattern Name: ETL-Internal
Problem: An internal system is interested in consuming a bulk data set from another internal system that has a low frequency of change but a high volume of data.
Solution: Configure a job on the Extract Transform Load component that will extract the data from the source system, transform the data into the required format and load it into the target system. The load step can occur before the transform step for performance reasons.
Solution Guidelines:
Implement an ETL job on the Extract Transform Load component
The job should be triggered at a scheduled time, by an event or manually
Error handling must be implemented to meet requirements e.g., restart at regular checkpoints, rollback on failure, save error records to a file
The Interface must be documented
Security Guidelines:
Enable transport layer security on source and target connections
Do not store credentials to datastores in cleartext.
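An illustrative sketch only of a simple extract-transform-load job; the in-memory sqlite databases, table names and transformation below are assumptions standing in for the real source and target systems and the ETL component's own connectors.

```python
import sqlite3

# Illustrative in-memory source and target datastores standing in for the real systems
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT, country_code TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("c-001", " acme pty ltd ", "AU"), ("c-002", "kiwi traders", "NZ")],
)

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT, country TEXT)")

# Extract: pull the full batch from the source system
rows = source.execute("SELECT id, name, country_code FROM customers").fetchall()

# Transform: map the source format to the target format (tidy names, expand country codes)
country_names = {"AU": "Australia", "NZ": "New Zealand"}
transformed = [(cid, name.strip().title(), country_names.get(code, code)) for cid, name, code in rows]

# Load: write the batch in a single transaction so it can be rolled back on failure
with target:
    target.executemany("INSERT OR REPLACE INTO customers VALUES (?, ?, ?)", transformed)

print(target.execute("SELECT * FROM customers").fetchall())
```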
Context Diagram:
Solution Pattern Name: ETL-External
Problem: An internal system is interested in exchanging a bulk data set with an external system that has a low frequency of change but a high volume of data.
Solution: Configure a job on the Extract Transform Load component that will extract the data from the source system, transform the data into the required format and load it into the target system. The load step can occur before the transform step for performance reasons. Data exchange with an external system should be either via API calls or file transfers.
Solution Guidelines
Implement an ETL job on the Extract Transform Load component
The job should be triggered at a scheduled time, by an event or manually
Error handling must be implemented to meet requirements e.g., restart at regular checkpoints, rollback on failure, save error records to a file
The Interface must be documented
Direct database connections with external parties must be avoided, use API calls instead
Security Guidelines:
Enable transport layer security on source and target connections
Do not store credentials to datastores in cleartext.
The connection to the external system must be via a HTTP proxy or Secure File Transfer Gateway in the DMZ
If using files, it is highly recommended that the FileTransfer-External pattern be followed
Context Diagram:
Solution Pattern Name: ETL-Cloud
Problem: Two external systems are interested in exchanging a bulk data set with each other, the data set has a low frequency of change but a high volume of data and there is no native solution available.
Solution: Configure a job on the Extract Transform Load component that will extract the data from the source system, transform the data into the required format and load it into the target system. The load step can occur before the transform step for performance reasons. Data exchange with an external system should be either via API calls or file transfers.
Solution Guidelines:
Implement an ETL job on the Extract Transform Load component
The job should be triggered at a scheduled time, by an event or manually
Error handling must be implemented to meet requirements e.g., restart at regular checkpoints, rollback on failure, save error records to a file
The Interface must be documented
Direct database connections with external parties must be avoided, use API calls instead
Consider deploying the solution on components deployed in the cloud as part of a hybrid integration platform.
Security Guidelines
Enable transport layer security on source and target connections
Do not store credentials to datastores in cleartext.
The connection to the external system must be via a HTTP proxy or Secure File Transfer Gateway in the DMZ
If using files, it is highly recommended that the FileTransfer-Cloud pattern be followed
Context Diagram:
Solution Pattern Name: CDC-Internal
Problem: An internal system is interested in consuming recently changed data from another internal system’s datastore without making changes to the system.
Solution: Configure a job on the Change Data Capture component to monitor the source system's datastore for changes and, at regular intervals, publish the changes as either a message or a file for other systems to consume.
Solution Guidelines:
Implement a job on the Change Data Capture component to capture the changes that have occurred since the last trigger
The job should be triggered at regular intervals e.g., every 15 minutes
The changes should be published in an agreed format and protocol typically as either a message or a file
Error handling must be implemented to meet requirements e.g., keep track of the last successful publish time, on outage limit the amount of data published to avoid overloading subscribers.
The Interface must be documented
Security Guidelines:
Enable transport layer security on source and target connections
Do not store credentials to datastores in cleartext.
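An illustrative sketch of the change-capture loop described above, using a watermark column to pick up rows changed since the last run; the in-memory sqlite datastore, table and publish step are assumptions for the example, and real CDC components typically read the database transaction log rather than polling.

```python
import json
import sqlite3

# Illustrative in-memory datastore standing in for the monitored source system
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT, updated_at TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("c-001", "Acme Pty Ltd", "2024-06-30T08:55:00Z"),
        ("c-002", "Kiwi Traders", "2024-06-30T09:10:00Z"),
    ],
)

# Watermark persisted from the previous trigger; only rows changed after it are captured
last_run = "2024-06-30T09:00:00Z"
changes = source.execute(
    "SELECT id, name, updated_at FROM customers WHERE updated_at > ? ORDER BY updated_at",
    (last_run,),
).fetchall()

# Publish each change in the agreed format; printing stands in for a message or file publish
for cid, name, updated_at in changes:
    print(json.dumps({"id": cid, "name": name, "updatedAt": updated_at}))

# Advance the watermark only after a successful publish
if changes:
    last_run = changes[-1][2]
print("new watermark:", last_run)
```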
Context Diagram:
Solution Pattern Name: CDC-External
Problem: An internal system is interested in exchanging recently changed data with an external system without making changes to the system.
Solution: Configure a job on the Change Data Capture component to monitor the source system's datastore for changes and, at regular intervals, publish the changes as either a message or a file for other systems to consume. Data exchange with an external system should be either via API calls or file transfers.
Solution Guidelines:
Implement a job on the Change Data Capture component to capture the changes that have occurred since the last trigger
The job should be triggered at regular intervals e.g., every 15 minutes
The changes should be published in an agreed format and protocol typically as either a message or a file
Error handling must be implemented to meet requirements e.g., keep track of the last successful publish time, on outage limit the amount of data published to avoid overloading subscribers.
Direct database connections with external parties must be avoided, use API calls instead
Security Guidelines:
Enable transport layer security on source and target connections
Do not store credentials to datastores in cleartext.
The connection to the external system must be via a HTTP proxy, Secure File Transfer Gateway or Message Broker in the DMZ
If using files, it is highly recommended that the FileTransfer-External pattern be followed
Context Diagram:
Solution Pattern Name: CDC-Cloud
Problem: Two external systems are interested in exchanging recently changed data with each other, there is no native solution available, and the organisation wants to avoid making changes to the systems.
Solution: Configure a job on the Change Data Capture component to monitor the source system's datastore for changes and, at regular intervals, publish the changes as either a message or a file for other systems to consume. Data exchange with an external system should be either via API calls or file transfers.
Solution Guidelines:
Implement a job on the Change Data Capture component to capture the changes that have occurred since the last trigger
The job should be triggered at regular intervals e.g., every 15 minutes
The changes should be published in an agreed format and protocol typically as either a message or a file
Error handling must be implemented to meet requirements e.g., keep track of the last successful publish time, on outage limit the amount of data published to avoid overloading subscribers.
Direct database connections with external parties must be avoided; use API calls instead. This may limit the functionality that can be used in the CDC component with two external parties
Consider deploying the solution on components deployed in the cloud as part of a hybrid integration platform.
Security Guidelines:
Enable transport layer security on source and target connections
Do not store credentials to datastores in cleartext.
The connection to the external system must be via a HTTP proxy, Secure File Transfer Gateway or Message Broker in the DMZ
If using files, it is highly recommended that the FileTransfer-Cloud pattern be followed
Context Diagram:
Solution Pattern Name: MDM-Internal
Problem: Two internal systems need to keep their reference data synchronised with the organisation's single source of truth for reference data.
Solution: Configure the reference data set in the Master Data Management system. Depending on the style of MDM several interfaces will need to be created that may query, push, or synchronise the reference data in the MDM and downstream systems. The interfaces typically use direct database access, API calls or messaging.
Solution Guidelines:
Analyse the use of the master data entity in the organisation and determine each system that creates and updates the entity.
Configure the master data entity in the MDM solution.
Depending on the style of MDM being adopted (Registry, Consolidation, Coexistence, Centralized) several interfaces will need to be created with downstream systems.
Each interface may query, push, or synchronise the reference data
Error handling must be implemented to meet requirements, e.g., retry changes a number of times when external systems are not available; survivorship rules need to be configured if some systems have better quality data than others and they make a conflicting change.
Security Guidelines:
Enable transport layer security on source and target connections.
Do not store credentials to datastores in cleartext.
Context Diagram:
Solution Pattern Name: MDM-External
Problem: An internal and an external system need to keep their reference data synchronised with the organisation's single source of truth for reference data.
Solution: Configure the reference data set in the Master Data Management system. Depending on the style of MDM several interfaces will need to be created that may query, push, or synchronise the reference data in the MDM and downstream systems. External systems should be limited to API calls or messaging.
Solution Guidelines:
Analyse the use of the master data entity in the organisation and determine each system that creates and updates the entity.
Configure the master data entity in the MDM solution.
Depending on the style of MDM being adopted (Registry, Consolidation, Coexistence, Centralized) several interfaces will need to be created with downstream systems.
Each interface may query, push, or synchronise the reference data.
Error handling must be implemented to meet requirements, e.g., retry changes several times when a system is not available; survivorship rules need to be configured if some systems have better quality data than others and they make a conflicting change.
Direct database connections with external parties must be avoided, use API calls instead
Security Guidelines:
Enable transport layer security on source and target connections.
Do not store credentials to datastores in cleartext.
The connection to the external system must be via a HTTP proxy, or Message Broker in the DMZ
Context Diagram:
Solution Pattern Name: MDM-Cloud
Problem: Two external systems need to keep their reference data synchronised with the organisation's single source of truth for reference data.
Solution: Configure the reference data set in the Master Data Management system. Depending on the style of MDM several interfaces will need to be created that may query, push, or synchronise the reference data in the MDM and downstream systems. External systems should be limited to API calls or messaging.
Solution Guidelines:
Analyse the master data entity's use in the organisation and determine each system that creates and updates the entity.
Configure the master data entity in the MDM solution.
Depending on the style of MDM being adopted (Registry, Consolidation, Coexistence, Centralized) several interfaces will need to be created with downstream systems.
Each interface may query, push, or synchronise the reference data.
Error handling must be implemented to meet requirements, e.g., retry changes several times when a system is not available; configure survivorship rules if some systems have better quality data than others and they make conflicting changes.
Direct database connections with external parties must be avoided; use API calls instead.
Consider deploying the solution on cloud-hosted components as part of a hybrid integration platform.
Security Guidelines:
Enable transport layer security on source and target connections.
Do not store credentials to datastores in cleartext.
The connection to the external system must be via an HTTP proxy or Message Broker in the DMZ.
Context Diagram:
Solution Pattern Name: Workflow-Internal
Problem: A business process that includes interactions with one or more internal systems needs to be automated.
Solution: Configure a workflow on the Business Process Management component and decision logic on the Business Rules Management system. Configure Service tasks on the Business Process Management component to invoke downstream systems.
Solution Guidelines:
Configure the workflow of steps to perform the business process in the Business Process Management tool.
Configure any decision logic on the Business Rules Management System
If a business process needs to interact with an application, configure a Service Task that invokes the application via an API or sends a message to a Message Broker and waits for a response.
Configure Web Forms or Email for Human tasks such as manual approvals
Error handling must be implemented to meet requirements, e.g., if a task is not completed in an acceptable time, raise an alert to the support team; if a human has not responded, escalate to their manager.
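The escalation guideline could look something like the sketch below; the SLA, task fields, and notification callables are placeholders for whatever the Business Process Management product exposes.

```python
from datetime import datetime, timedelta

TASK_SLA = timedelta(hours=24)  # illustrative service-level target

def check_task(task: dict, now: datetime, alert_support, escalate_to) -> None:
    # task: {"assignee": str, "manager": str, "created_at": datetime, "done": bool}
    if task["done"]:
        return
    if now - task["created_at"] > TASK_SLA:
        alert_support(f"Task for {task['assignee']} has breached its SLA")
        escalate_to(task["manager"])  # escalate the human task to the manager
```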
Security Guidelines:
Enable transport layer security on all interfaces.
Protect interfaces using credentials, API keys or certificates
Context Diagram:
Solution Pattern Name: Workflow-External
Problem: A business process that includes interactions with one or more internal and external systems needs to be automated.
Solution: Configure a workflow on the Business Process Management component and decision logic on the Business Rules Management system. Configure Service tasks on the Business Process Management component to invoke downstream systems.
Solution Guidelines:
Configure the workflow of steps to perform the business process in the Business Process Management tool.
Configure any decision logic on the Business Rules Management System
If a business process needs to interact with an application, configure a Service Task that invokes the application via an API or sends a message to a Message Broker and waits for a response.
Configure Web Forms or Email for Human tasks such as manual approvals
Error handling must be implemented to meet requirements, e.g., if a task is not completed in an acceptable time, raise an alert to the support team; if a human has not responded, escalate to their manager.
Security Guidelines:
Enable transport layer security on all interfaces.
Protect interfaces using credentials, API keys or certificates
The connection to the external system must be via an HTTP proxy or Message Broker in the DMZ.
Context Diagram:
Solution Pattern Name: Workflow-Cloud
Problem: A business process that includes interactions with one or more external systems needs to be automated.
Solution: Configure a workflow on the Business Process Management component and decision logic on the Business Rules Management system. Configure Service tasks on the Business Process Management component to invoke downstream systems.
Solution Guidelines:
Configure the workflow of steps to perform the business process in the Business Process Management tool.
Configure any decision logic on the Business Rules Management System
If a business process needs to interact with an application, configure a Service Task that invokes the application via an API or sends a message to a Message Broker and waits for a response.
Configure Web Forms or Email for Human tasks such as manual approvals
Error handling must be implemented to meet requirements, e.g., if a task is not completed in an acceptable time, raise an alert to the support team; if a human has not responded, escalate to their manager.
Consider deploying the solution on cloud-hosted components as part of a hybrid integration platform.
Security Guidelines:
Enable transport layer security on all interfaces.
Protect interfaces using credentials, API keys or certificates
The connection to the external system must be via an HTTP proxy or Message Broker in the DMZ.
Context Diagram:
Solution Pattern Name: Native-Internal
Problem: An internal system needs to consume data from another internal system for which a native integration component exists.
Solution: Configure the native integration component on the internal application to integrate with the other application
Solution Guidelines:
Configure the native integration plugin in the application
Document the interface and walk the integration support team through it in case they receive calls relating to it.
Security Guidelines:
Enable transport layer security on all interfaces
Use OAuth, API keys, mutual TLS certificates, or username/password to secure the interface where possible (a token-retrieval sketch follows these guidelines).
Perform security testing on the interface to ensure it is secure
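For example, an OAuth 2.0 client-credentials token could be obtained before calling the secured interface, roughly as sketched below; the token endpoint and client details are illustrative placeholders.

```python
import os
import requests  # assumes the requests package is available

TOKEN_URL = "https://idp.example.internal/oauth2/token"  # illustrative identity provider

def get_access_token() -> str:
    # Exchange client credentials (injected via environment variables) for
    # a short-lived bearer token, then send it on each call to the interface.
    resp = requests.post(
        TOKEN_URL,
        data={"grant_type": "client_credentials"},
        auth=(os.environ["CLIENT_ID"], os.environ["CLIENT_SECRET"]),
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]

# Usage: requests.get(api_url, headers={"Authorization": f"Bearer {get_access_token()}"})
```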
Context Diagram:
Solution Pattern Name: Native-External
Problem: An internal system and an external system need to exchange data for which a native integration component exists.
Solution: Configure the native integration component on the source application to integrate with the target application
Solution Guidelines:
Configure the native integration plugin in the application
Document the interface and walk the integration support team through it in case they receive calls relating to it.
Security Guidelines:
Enable transport layer security on all interfaces
Use OAuth, API Keys, mutual TLS certificates, username/password to secure the interface where possible
The connection to the external system must be via an HTTP proxy in the DMZ (a proxy routing sketch follows these guidelines).
Perform security testing on the interface to ensure it is secure
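A minimal sketch of routing an outbound call through an HTTP proxy in the DMZ, as required above; the proxy host is an illustrative placeholder.

```python
import requests  # assumes the requests package is available

DMZ_PROXY = "http://dmz-proxy.example.internal:3128"  # illustrative proxy address

def call_external(url: str) -> dict:
    # All traffic to the external party is forced through the DMZ proxy
    # instead of connecting directly from the internal network.
    resp = requests.get(url, proxies={"http": DMZ_PROXY, "https": DMZ_PROXY}, timeout=15)
    resp.raise_for_status()
    return resp.json()
```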
Context Diagram:
Solution Pattern Name: Native-Cloud
Problem: Two external systems need to exchange data for which a native integration component exists.
Solution: Configure the native integration component on the source application to integrate with the target application
Solution Guidelines:
Configure the native integration plugin in the application
Document the interface and walk the integration support team through it in case they receive calls relating to it.
Security Guidelines:
Enable transport layer security on all interfaces
Use OAuth, API Keys, mutual TLS certificates, username/password to secure the interface where possible
Perform security testing on the interface to ensure it is secure
Context Diagram:
Solution Pattern Name: AIML-Internal
Problem: A Machine Learning model needs to be trained on data from an internal data source before it is deployed for use.
Solution:
Capture the data from the internal data source.
Prepare the data for use by the model by validating and extracting the required information
Train and test the model using the data prepared
Deploy the model and inference it with data passing through the integration layer to gain predictions
Solution Guidelines:
Capture the data from the internal data source.
API calls can be accepted by the API Management component and processed by Middleware Services.
Messages can be accepted by the Message Broker and processed by Middleware Services.
Bulk extracts out of a database can be captured by ETL.
Files can be captured by the Secure File Transfer component and processed by Middleware Services or ETL.
High-volume streams of messages can be accepted by the Event Stream Processing component.
Prepare the data for use by the model by validating and extracting the required information
The data can be processed by the Middleware Services or ETL after being captured
A script or program in any language can also be used here
Train and test the model using the data prepared
This depends on how the model expects to receive its source data.
Data can be made available as files, or as messages via a Message Broker or the Event Stream Processing component.
Deploy the model and inference it with data passing through the integration layer to gain predictions
This depends on whether the model runs in real time or over batches of data.
Real-time models can be invoked for inference as part of a Middleware Service triggered by an API call or message.
Batch models can be triggered by a schedule or event and directly run over data in files or in a database
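A hypothetical end-to-end sketch of the capture, prepare, train/test and inference steps above, using pandas and scikit-learn; the file path, column names and model choice are illustrative only.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# 1. Capture: a file landed by the Secure File Transfer / ETL components
df = pd.read_csv("landed/orders.csv")  # illustrative path

# 2. Prepare: validate records and extract the required fields
df = df.dropna(subset=["amount", "channel", "is_fraud"])
X = pd.get_dummies(df[["amount", "channel"]])
y = df["is_fraud"]

# 3. Train and test the model on the prepared data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = RandomForestClassifier().fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))

# 4. Deploy / inference: e.g. called from a Middleware Service per API request,
#    or run over a batch of records on a schedule
def predict(features: pd.DataFrame) -> list:
    return model.predict(features.reindex(columns=X.columns, fill_value=0)).tolist()
```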
Security Guidelines:
Enable transport layer security on all interfaces
Use credentials, tokens, or certificates to authenticate consumers and providers
Do not store credentials to datastores in cleartext.
Context Diagram:
Capture the data from the internal data source.
Prepare the data for use by the model by validating and extracting the required information
Train and test the model using the data prepared
Deploy the model and inference it with data passing through the integration layer to gain predictions
Solution Pattern Name: AIML-External
Problem: A Machine Learning model needs to be trained on data from an external data source before it is deployed for use.
Solution: As per AIML-Internal
Solution Guidelines: As per AIML-Internal
Security Guidelines:
Enable transport layer security on all interfaces
Use credentials, tokens, or certificates to authenticate consumers and providers
Do not store credentials to datastores in cleartext.
Direct database connections with external parties must be avoided; use API calls instead.
The connection to the external system must be via an HTTP proxy, Message Broker or Secure File Transfer Gateway in the DMZ.
Context Diagram: As per AIML-Internal.
Capture the data from the external data source.
Prepare the data for use by the model by validating and extracting the required information
Train and test the model using the data prepared
Deploy the model and inference it with data passing through the integration layer to gain predictions