AI & Agentic-ready Data Platforms: A Roadmap for 2025
Point of view, as of January 2025
The era of traditional Business Intelligence (BI) platforms as we know them is coming to an end. As we move into 2025, organizations must reconceptualize their data platforms to support AI and agentic systems, beyond the use cases they serve today. This transformation is not just about technology: it represents a fundamental shift in how we think about, organize, and utilize data.
The End of Traditional BI-Centric Platforms
Traditional BI platforms, designed primarily for reporting and analytics, are no longer sufficient. In fact, BI is expected to represent less than 50% of data platform usage in the near future. The new paradigm requires platforms that can handle both structured and unstructured data in near real-time, with a strong emphasis on a centralized semantic layer, active data management, and observability.
5 Key Pillars of Modern Data Platforms for AI & Agentic Systems
The next sections dig into the five key pillars of modern data platforms for AI and agentic systems:
1. Centralized Semantic Layer: The Foundation for AI and Agents
The semantic layer has emerged as the cornerstone of modern data platforms, particularly for AI agents and LLMs. Unlike traditional BI implementations where semantic definitions were trapped within specific visualization tools, the new semantic layer serves as a universal translator between human intent and data structures.
Why Semantic Layer is Critical for AI Agents
AI agents, powered by LLMs, primarily interact through natural language. The semantic layer bridges the gap between these text-based interactions and the underlying data structures, as the following subsections describe.
Universal Data Access Through Semantic Understanding
The semantic layer transcends its traditional role in BI to become a universal knowledge graph that connects data across domains, a central repository of business logic and metric definitions, an interpreter between natural language and technical implementations, and a guarantor of consistent data interpretation across all tools and platforms.
This universality means that whether data is being accessed by a business user using natural language (and it works!), an AI agent answering queries, an automated process making decisions, or a data scientist building models, the same semantic understanding, business rules, and data relationships are applied consistently.
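To make this concrete, here is a minimal sketch of what a centrally defined metric might look like and how every consumer resolves it to the same query. The class, metric, and table names are illustrative assumptions, not tied to any specific semantic-layer product.

```python
# Minimal sketch of a centralized semantic definition (illustrative names only).
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    sql_expression: str   # how the metric is computed
    grain: tuple          # dimensions it can be sliced by


# One definition, owned by the semantic layer, reused everywhere.
NET_REVENUE = Metric(
    name="net_revenue",
    description="Gross sales minus refunds, in EUR",
    sql_expression="SUM(gross_amount) - SUM(refund_amount)",
    grain=("order_date", "country", "product_line"),
)


def to_sql(metric: Metric, dimension: str, source_table: str) -> str:
    """A BI dashboard and an LLM agent both call the same resolver."""
    if dimension not in metric.grain:
        raise ValueError(f"{dimension} is not a valid grain for {metric.name}")
    return (
        f"SELECT {dimension}, {metric.sql_expression} AS {metric.name} "
        f"FROM {source_table} GROUP BY {dimension}"
    )


# A natural-language request ("net revenue by country") and a dashboard widget
# both end up here, guaranteeing one consistent interpretation.
print(to_sql(NET_REVENUE, "country", "sales.orders"))
```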
Breaking Free from Tool-Specific Semantics
Historical approaches, where semantic definitions were embedded within specific tools (like BI platforms), created inconsistent interpretations across systems, duplicated definitions and business logic, barriers to AI adoption and automation, and a limited ability to scale data understanding across the organization.
The new semantic layer solves these challenges by centralizing semantic definitions outside of any specific tool, making business context available as a service, enabling dynamic adaptation of data understanding, and supporting multi-modal data interpretation for AI systems.
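As a sketch of "business context available as a service", the same definitions can be exposed over HTTP so that any tool or agent can discover them. The endpoint path and payload below are assumptions, not a specific vendor API.

```python
# Hypothetical sketch: exposing semantic definitions as a service with FastAPI.
# Endpoint names and payload fields are illustrative assumptions.
from fastapi import FastAPI, HTTPException

app = FastAPI(title="Semantic Layer Service")

METRICS = {
    "net_revenue": {
        "description": "Gross sales minus refunds, in EUR",
        "sql_expression": "SUM(gross_amount) - SUM(refund_amount)",
        "grain": ["order_date", "country", "product_line"],
    },
}


@app.get("/metrics/{name}")
def get_metric(name: str) -> dict:
    """BI tools, AI agents, and pipelines all read the same definition here."""
    if name not in METRICS:
        raise HTTPException(status_code=404, detail="unknown metric")
    return METRICS[name]
```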
2. Becoming Active: From Passive Storage to Operational Hub
The transformation from passive BI platforms to active operational data platforms represents a fundamental shift in how organizations leverage their data assets. Traditional data lakes served primarily as static repositories for historical analysis, but modern platforms must become dynamic operational hubs that actively participate in business processes.
Beyond the Analytics-Only Paradigm
Traditional BI platforms were characterized by batch-oriented, one-directional data flows: data was extracted and stored primarily to be read back later for reporting and historical analysis.
Modern and active data platforms break this paradigm by fundamentally transforming the way data flows and is processed. They enable bi-directional data flows, allowing for seamless interaction between different systems and applications. Additionally, they support real-time data processing, which facilitates immediate insights and decision-making.
Furthermore, modern active data lakes directly power operational decisions, providing the necessary data and analytics to inform business actions. Ultimately, they serve as a central nervous system for business operations, integrating and orchestrating various processes to drive efficiency and effectiveness.
Operational Capabilities and Requirements
The shift to operational data platforms demands:
Near Real-Time Processing
Streaming or near real-time processing capabilities are essential for handling high-volume, high-velocity data streams in today's operations. This enables the platform to process and analyze data as it arrives, allowing for timely insights and decision-making. Event-driven architectures also help, as they let the platform react to specific events or changes in the data, triggering actions or notifications as needed. CDC or data movement solutions (such as Fivetran/HVR or Airbyte) can help bypass traditional ETL and batch operations.
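As an illustration of the event-driven pattern, here is a minimal consumer reacting to change events as they arrive. The topic name and the Debezium-style payload shape are assumptions, not a specific connector's output.

```python
# Minimal sketch of event-driven processing of CDC events with kafka-python.
# Topic name and event payload shape are illustrative assumptions.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders.cdc",                        # hypothetical CDC topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="latest",
)

for message in consumer:
    change = message.value
    # React to each change as it arrives instead of waiting for a nightly batch.
    if change.get("op") == "u" and change.get("table") == "orders":
        order = change["after"]
        if order["status"] == "cancelled":
            # Trigger a downstream action, e.g. release reserved inventory.
            print(f"Releasing inventory for order {order['order_id']}")
```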
Bi-directional Integration
Reverse ETL capabilities are essential for reinjecting data back into operational systems, ensuring that insights and analytics are not only generated but also acted upon in real-time. This capability enables the active data lake to not only consume data but also to influence operational processes directly.
An API-first architecture is critical for facilitating interaction between the active data lake and the various systems within the organization. This approach ensures that data and analytics are easily accessible and can be integrated into different applications and services, fostering a culture of data-driven decision-making.
Automated data sharing mechanisms are necessary for streamlining the exchange of data between different systems and applications. By automating this process, organizations can reduce manual data transfer errors, increase efficiency, and ensure that data is shared in a secure and controlled manner.
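A minimal sketch of the reverse-ETL idea follows: attributes computed in the warehouse are pushed back into an operational tool through its API instead of staying in a dashboard. The CRM endpoint and field names are hypothetical.

```python
# Minimal reverse-ETL sketch: push warehouse-computed attributes back to an
# operational system via its REST API. Endpoint and field names are hypothetical.
import requests


def sync_segment_to_crm(customers: list[dict], crm_base_url: str, api_token: str) -> None:
    """Send warehouse-computed attributes back to the CRM so they drive operations."""
    headers = {"Authorization": f"Bearer {api_token}"}
    for customer in customers:
        payload = {
            "external_id": customer["customer_id"],
            "churn_risk": customer["churn_risk"],           # computed in the data platform
            "lifetime_value": customer["lifetime_value"],
        }
        response = requests.patch(
            f"{crm_base_url}/contacts/{customer['customer_id']}",
            json=payload,
            headers=headers,
            timeout=10,
        )
        response.raise_for_status()


# Example usage (requires a reachable CRM API):
# sync_segment_to_crm(
#     [{"customer_id": "C-42", "churn_risk": 0.81, "lifetime_value": 1250.0}],
#     crm_base_url="https://crm.example.com/api",
#     api_token="...",
# )
```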
Operational Use Cases
Direct integration with customer-facing applications is a key use case for active data lakes. This integration allows for real-time data sharing and analysis, enabling businesses to provide personalized experiences to their customers.
Real-time personalization engines are another important application. These engines use real-time data to tailor content and experiences to individual users, increasing engagement and satisfaction.
Dynamic pricing and inventory management is a use case that benefits from real-time data. By analyzing market conditions and customer behavior in real time, businesses can optimize pricing and inventory levels to maximize revenue and customer satisfaction.
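As a toy illustration of the dynamic-pricing idea, a price can be adjusted from real-time demand and stock signals. The thresholds and adjustment factors below are made up for the example.

```python
# Toy dynamic-pricing sketch driven by real-time signals.
# Thresholds and adjustment factors are illustrative assumptions.
def dynamic_price(base_price: float, stock_level: int, demand_last_hour: int) -> float:
    """Adjust the price from current stock and short-term demand."""
    price = base_price
    if stock_level < 10:            # scarce inventory -> modest premium
        price *= 1.10
    if demand_last_hour > 100:      # demand spike -> additional premium
        price *= 1.05
    elif demand_last_hour < 5:      # demand lull -> small discount
        price *= 0.95
    return round(price, 2)


print(dynamic_price(base_price=49.90, stock_level=7, demand_last_hour=130))
```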
Impact on Business & Technical Operations
This transformation enables immediate action on insights rather than delayed responses, automated decision-making based on real-time data, seamless integration between analytical and operational processes, dynamic business process optimization, and real-time customer experience personalization.
When making your data platform more operational, you must understand that it becomes a genuine operational component of your information systems, requiring the same careful monitoring and operational surveillance as any other active component in your IS. Traditional BI teams may not be used to this level of requirements.
3. Domain-Driven Architecture: Breaking Free from Traditional Constraints
The shift from traditional three-tier/medallion architectures (Bronze/Silver/Gold or Raw/Refined/Curated) to a domain-driven approach represents a fundamental rethinking of how we organize and manage enterprise data platforms. This transformation is essential for enabling AI and agentic systems at scale.
Limitations of Traditional Architectures
The traditional three-tier architecture creates several challenges:
Organizational Bottlenecks
Centralized teams are prone to becoming overwhelmed with requests, leading to a slow response to business needs. This is exacerbated by the limited domain expertise within these central teams, making it challenging to effectively address the unique needs of each business domain. Furthermore, the coordination between business and IT stakeholders becomes increasingly complex, hindering the ability to respond quickly to changing business requirements.
Technical Constraints
The traditional architecture is characterized by rigid data models that are difficult to evolve in response to changing business needs. The complex dependencies between layers make it challenging to modify or update individual components without affecting the entire system. Additionally, the architecture's inability to optimize for specific use cases results in a one-size-fits-all approach to data transformation, which can be inefficient and ineffective.
Scalability Issues
As the organization grows, the central team resources scale linearly, leading to increased coordination overhead and difficulty in maintaining data quality at scale. Moreover, the traditional architecture's limitations make it challenging to parallelize development, hindering the organization's ability to scale efficiently and effectively.
Domain-Driven Data Products Approach
The new approach organizes data around business domains, where each domain:
Ownership and Autonomy
Clear domain ownership and accountability are essential, ensuring that each domain is responsible for its actions and outcomes. This autonomy allows for decision-making within domains to be self-sufficient, enabling local optimization tailored to the specific needs of each domain. This approach ensures direct alignment with business objectives, as each domain is focused on achieving its unique goals.
Data Product Thinking
Data is treated as a product, complete with clear Service Level Agreements (SLAs) that outline its quality, availability, and performance. Well-defined interfaces and contracts ensure seamless integration and consumption of data products. Built-in data quality and observability mechanisms guarantee the data product's integrity and facilitate its monitoring. The primary focus is on understanding user needs and consumption patterns to ensure the data product meets its intended purpose.
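A lightweight way to make "data as a product" tangible is a machine-readable contract that carries the SLA, interface, and quality checks alongside the data. The structure below is a sketch with assumed field names, not tied to any particular contract standard.

```python
# Sketch of a machine-readable data product contract.
# Field names are illustrative assumptions, not a specific standard.
DATA_PRODUCT_CONTRACT = {
    "name": "customer_orders",
    "domain": "sales",
    "owner": "sales-data-team@company.example",
    "interface": {
        "format": "iceberg_table",
        "location": "lakehouse.sales.customer_orders",
        "schema": {"order_id": "string", "customer_id": "string", "amount": "decimal(10,2)"},
    },
    "sla": {
        "freshness_minutes": 15,        # data no older than 15 minutes
        "availability_pct": 99.5,
        "completeness_pct": 99.0,
    },
    "quality_checks": ["order_id is unique", "amount >= 0"],
}
```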
Architecture Principles
The architecture is designed to facilitate the independent evolution of domains, allowing each domain to progress at its own pace without being hindered by dependencies on other domains. Clear boundaries and responsibilities are established to avoid confusion or overlap between domains. Standardized inter-domain communication ensures seamless interaction between domains, promoting a cohesive and integrated system. A federated governance model is implemented to oversee the entire system, ensuring consistency and coordination across domains.
Enabling Scale Through Domain Architecture
This approach enables scaling by:
Parallel Development
Independent domain teams work autonomously, reducing coordination overhead and enabling faster iteration cycles. This autonomy allows for domain-specific optimization, ensuring that each domain is tailored to its unique needs and objectives.
Clear Responsibilities
Domain teams are responsible for their data products, ensuring that they are aligned with the specific needs of their domain. Central teams focus on developing platform capabilities, providing a foundation for the domains to build upon. Standardized interfaces between domains facilitate seamless integration, while a shared governance framework ensures consistency across the system.
Flexible Evolution
The domain-driven approach enables domains to evolve at different speeds, allowing for independent technology choices where appropriate. This flexibility enables experimentation within domains, fostering innovation and improvement. A gradual modernization path is also facilitated, ensuring that domains can adapt to changing requirements without disrupting the entire system.
Implementation Considerations
Success with domain-driven architecture requires:
Organizational Alignment
Clear domain boundaries must be established and aligned with the organization's business objectives. This ensures that each domain is focused on specific business needs and outcomes. Empowered domain teams are essential, as they are responsible for making decisions and taking actions within their respective domains. A balanced distribution of responsibilities between central and domain teams is crucial, ensuring that each team is accountable for its actions and outcomes. Strong communication channels must be established to facilitate collaboration and coordination between teams, ensuring that information flows seamlessly across domains.
Technical Standards and tooling
To ensure seamless integration and data exchange between domains, shared data contracts must be defined and agreed upon.
Standard integration patterns should be established to simplify the integration process and reduce complexity. Common quality metrics are necessary to ensure that data products meet the required standards across domains.
A unified metadata management system is essential for maintaining a consistent understanding of data across the organization.
Modern data governance is no longer only about rules and PowerPoint decks; it also relies on robust tooling and deep platform involvement to help engineers and business users work seamlessly with data.
Internal data platform marketplaces are increasingly becoming the standard way to package and deliver data products to internal stakeholders, and potentially to external users as well.
Governance Framework
A federated governance model is necessary to oversee the entire system, ensuring consistency and coordination across domains. Clear decision rights must be defined to avoid confusion or overlap between domains. Standardized quality measures are essential to ensure that data products meet the required standards across domains. Cross-domain coordination mechanisms must be established to facilitate collaboration and ensure that domains work together effectively.
This approach creates a more resilient, scalable, and agile data platform that can better support the diverse needs of modern enterprises, particularly in the context of AI and agentic systems deployment.
My point of view
I want to make it clear that the solution does not really lie in the adoption of any particular data modelling technique (entity-relationship models, star schemas, snowflake schemas and other data vaults, which today often seem to create more problems than they solve), but rather in a data architecture that reuses digital platform design, where two-pizza teams, APIs, and agility were foundational to delivering real time-to-value.
4. From Data Quality to Data Observability
The increasing prevalence of near real-time data and massive volumes has rendered traditional data quality approaches ineffective. To address these challenges, modern data platforms must incorporate advanced capabilities that ensure data quality and integrity. Specifically, these platforms require end-to-end visibility across the entire data value chain, enabling the tracking of data from its source to its final destination. This visibility is crucial for identifying and addressing data quality issues promptly.
Automated quality testing and anomaly detection are also essential components of modern data platforms. These features enable the identification of data quality issues in real-time, allowing for swift corrective action to be taken. Furthermore, advanced monitoring capabilities are necessary to proactively identify potential issues before they impact the data pipeline. This proactive approach ensures that data quality issues are addressed before they affect downstream applications or users.
Comprehensive lineage and impact analysis are also critical components of modern data platforms. These capabilities provide a detailed understanding of how data flows through the system, enabling the identification of the root cause of data quality issues and the impact of these issues on downstream applications. This understanding is essential for making informed decisions about data quality and for optimizing data processing workflows to ensure the highest levels of data integrity.
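To illustrate the shift from one-off quality tests to continuous observability, here is a minimal sketch of freshness and volume monitors that could run on a schedule. The table history, thresholds, and alerting logic are assumptions for the example.

```python
# Minimal data observability sketch: monitor freshness and volume anomalies.
# History values and thresholds are illustrative assumptions.
import datetime as dt
import statistics


def check_freshness(last_loaded_at: dt.datetime, max_lag_minutes: int = 30) -> bool:
    """Return True if the table has been updated recently enough."""
    lag = dt.datetime.utcnow() - last_loaded_at
    return lag <= dt.timedelta(minutes=max_lag_minutes)


def check_volume(daily_row_counts: list[int], today_count: int, z_threshold: float = 3.0) -> bool:
    """Return True if today's row count is in line with recent history."""
    mean = statistics.mean(daily_row_counts)
    stdev = statistics.stdev(daily_row_counts) or 1.0
    z_score = abs(today_count - mean) / stdev
    return z_score <= z_threshold


history = [10_120, 9_980, 10_340, 10_050, 10_210]
print(check_freshness(dt.datetime.utcnow() - dt.timedelta(minutes=12)))  # True: fresh
print(check_volume(history, today_count=4_500))                          # False: anomaly
```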
If you really want to know more about data observability, take a look at pure players like Sifflet Data (my favourite, yes it's French) or Monte Carlo Data.
5. Your unstructured Data is just … Data
In the era of AI and agentic systems, the distinction between structured and unstructured data is becoming increasingly irrelevant. AI agents, with their advanced capabilities, can seamlessly process and analyze both types of data, extracting valuable insights and making informed decisions. Therefore, organizations should adopt a unified approach to data management, treating unstructured data with the same level of importance and rigor as structured data. The recent acquisition of Datavolo by Snowflake shows how platforms are preparing for this approach.
Why Unify Data Management?
AI agents are agnostic to data structures. They can leverage advanced techniques like natural language processing (NLP) and machine learning (ML) to extract meaning and value from both structured and unstructured data. By unifying data management, organizations can unlock the full potential of their data assets, enabling AI agents to leverage all available information for decision-making.
How to Achieve Unified Data Management
By adopting a unified approach to data management, organizations can empower their AI agents to leverage the full spectrum of their data assets, driving innovation and unlocking new opportunities for growth.
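As one possible illustration of what this looks like in practice, unstructured documents can be vectorized and stored alongside their structured keys so that agents query both through the same governed pipeline. The embedding model and record layout are assumptions for the sketch.

```python
# Sketch: treat documents like any other data by storing embeddings next to
# structured metadata. Model name and record layout are illustrative assumptions.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    {"doc_id": "contract-001", "customer_id": "C-42",
     "text": "Renewal terms: 12 months, 5% uplift."},
    {"doc_id": "ticket-873", "customer_id": "C-42",
     "text": "Customer reports latency in the EU region."},
]

records = []
for doc in documents:
    records.append({
        "doc_id": doc["doc_id"],
        "customer_id": doc["customer_id"],                 # structured key, joinable with orders
        "embedding": model.encode(doc["text"]).tolist(),   # unstructured content, vectorized
    })

# `records` can now be loaded into the same governed store as structured data,
# so an agent can join "what the customer bought" with "what the customer said".
print(len(records[0]["embedding"]))
```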
Preparing for AI and Agentic Systems
Infrastructure Requirements
Universal Storage: The data platform of 2025 must be capable of efficiently handling both structured and unstructured data. This is essential to accommodate the diverse data types and sources that modern enterprises deal with.
For structured data, 2025 will certainly see Apache Iceberg become the de facto open standard for multi- and hybrid-cloud structured data storage.
Only advanced data platforms can guarantee universal, consistently governed storage for both structured and unstructured data.
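For instance, an Iceberg table can be created with plain Spark SQL, which keeps the storage format open and engine-agnostic. The catalog configuration, warehouse path, runtime version, and table name below are assumptions for the sketch.

```python
# Sketch: creating an Apache Iceberg table with Spark SQL.
# Catalog configuration, runtime version, warehouse path, and table name
# are illustrative assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-demo")
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.4.3")  # assumed version
    .config("spark.sql.catalog.lakehouse", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lakehouse.type", "hadoop")
    .config("spark.sql.catalog.lakehouse.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)

spark.sql("CREATE NAMESPACE IF NOT EXISTS lakehouse.sales")
spark.sql("""
    CREATE TABLE IF NOT EXISTS lakehouse.sales.customer_orders (
        order_id STRING,
        customer_id STRING,
        amount DECIMAL(10, 2),
        order_date DATE
    )
    USING iceberg
    PARTITIONED BY (order_date)
""")
```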
Near Real-Time Processing: The ability to access and process data in near real-time is a critical requirement for the data platform of 2025. This capability enables organizations to make timely and informed decisions based on the most current data.
Semantic Understanding: The data platform of 2025 must possess advanced semantic understanding capabilities. This includes the ability to understand the context and relationships between different data elements. This understanding is crucial for enabling more intelligent and context-aware data processing.
Governance Evolution
The governance model is evolving from a centralized to a federated model, where:
The IT department is responsible for managing the platform infrastructure, ensuring its stability and scalability.
Individual domains are granted a high degree of autonomy, allowing them to make decisions that are best suited to their specific needs and objectives. Shared governance rules are established to ensure consistency and alignment across all domains, promoting a cohesive and integrated system. Mandatory data cataloging and quality reporting are enforced, ensuring that all data products meet the required standards and are well-documented.
Clear data contracts are established between domains, ensuring that data is exchanged in a standardized and secure manner.
To ensure modern governance is more than rules and PowerPoint decks, it must be enforced in the core of your data platform. This is called Federated Computational Governance: governance policies expressed and enforced as code within the platform itself.
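A minimal illustration of the "computational" part: global rules are checked automatically against every data product's contract, for example in CI, instead of living in a slide deck. The rule names and contract fields below are assumptions.

```python
# Sketch of federated computational governance: global rules checked automatically
# against each domain's data product contract. Rules and fields are illustrative.
def check_contract(contract: dict) -> list[str]:
    """Return the list of governance violations for one data product contract."""
    violations = []
    if not contract.get("owner"):
        violations.append("every data product must declare an owner")
    if not contract.get("sla", {}).get("freshness_minutes"):
        violations.append("an SLA with a freshness target is mandatory")
    if "pii" in contract.get("tags", []) and not contract.get("masking_policy"):
        violations.append("PII data products must define a masking policy")
    return violations


# Example: run in CI against the contract published by a domain team.
contract = {
    "name": "customer_orders",
    "owner": "sales-data-team@company.example",
    "sla": {"freshness_minutes": 15},
    "tags": ["pii"],
}
print(check_contract(contract))  # ['PII data products must define a masking policy']
```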
The Road Ahead
To sum up what we've just been through, organizations must focus on several key areas to prepare their data platforms for AI and agentic systems:
Industrialization
Data Synchronization
Accessibility
Conclusion
The transition to AI and agentic-ready data platforms represents a fundamental shift in enterprise data architecture. Success requires organizations to move beyond traditional BI-centric thinking and embrace a more dynamic, interconnected, and semantically rich approach to data management. The platforms of 2025 will not just store and analyze data – they will actively participate in the organization's AI and agentic ecosystem, enabling new levels of automation, insight, and innovation.