The Data Center Series, Second Intermezzo: Tracing the Evolution of AI
In this second part of the Intermezzo I want to give a historical perspective on the evolution of AI and how it is becoming a multi-trillion-dollar market: today, AI stands at the forefront of technological innovation. From generating human-like text and stunning artwork to a future in which it could predict market trends and diagnose diseases, AI has woven itself into the fabric of our daily lives. The rise of Generative AI, like ChatGPT and DALL-E, represents a leap forward in creativity and capability. These systems draw on vast unstructured data pipelines of texts, images, and videos to mimic human cognition in astonishing ways. (OpenAI Research)
However, despite these advancements, AI has its limitations. Current models are far from achieving Artificial General Intelligence (AGI), that is, machines with human-like reasoning and the ability to generalize knowledge across domains. Many researchers argue that fundamental constraints in energy efficiency, data requirements, and the integration of symbolic reasoning make AGI, and its hypothetical successor, Artificial Superintelligence (ASI), unattainable. (MIT Technology Review)
To understand how we arrived at today’s AI, we must look back through its evolution, tracing its reliance on structured and unstructured data pipelines and the technological breakthroughs that defined its path.
Generative AI: Creativity Born from Unstructured Data
At the peak of AI’s current capabilities, Generative AI shines as a transformative force. Leveraging vast unstructured data pipelines, models such as Gemini, GPT, and Claude are trained on terabytes of raw data sourced from the Internet. These pipelines, comprising text documents, images, and even multimedia, are pre-processed to remove noise and standardize formats, feeding the insatiable appetite of large-scale models and fuelling billions upon billions of dollars in investment. (World Economic Forum)
Generative AI models, such as transformers, excel at identifying patterns in unstructured data. Imagine a teacher highlighting key points in a lecture: transformers use a similar mechanism to "pay attention" to specific parts of the data, ensuring coherence in text or consistency in images. (Vaswani et al., "Attention Is All You Need")
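To make the attention analogy concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core operation described in the paper cited above. The matrices are random toy data, and the sketch omits everything else a real transformer needs (multiple heads, learned projections, training); it only shows how each position ends up weighting every other position.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal scaled dot-product attention (after Vaswani et al., 2017).

    Q, K, V are (seq_len, dim) arrays. Each output row is a weighted mix
    of V, where the weights express how much one position "pays attention"
    to every other position.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # similarity between positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over positions
    return weights @ V, weights

# Toy example: 4 token positions with 8-dimensional representations.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
output, attn = scaled_dot_product_attention(Q, K, V)
print(attn.round(2))  # each row sums to 1: how one token attends to the others
```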
Yet, despite their brilliance, Generative AI systems have limitations. They cannot reason or extrapolate beyond the patterns in their training data. Ethical concerns, such as bias or misinformation, also loom large. Generative AI represents the triumph of unstructured pipelines but also reveals the challenges of relying solely on them.
Predictive AI: Optimizing Structured Data
In the 2010s, Predictive AI ruled the landscape, powering applications across e-commerce, healthcare, and manufacturing. This era relied heavily on structured data pipelines: organized, labelled data from databases, sensors, and transactional systems.
Netflix’s recommendation engine, for instance, analyses structured data such as watch history and user preferences to suggest new content. Similarly, predictive maintenance in aviation leverages structured sensor data to forecast engine failures, saving millions in operational costs. (Netflix Technology Blog)
Predictive AI also integrated unstructured data during this period. Retailers, for instance, combined structured sales data with social media sentiment to enrich forecasts. These hybrid pipelines allowed models to generate more nuanced predictions. (Predicting Consumer Demand in an Unpredictable World)
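As a rough illustration of such a hybrid pipeline, the sketch below joins made-up weekly sales figures (a structured source) with an equally made-up sentiment score per week (distilled from unstructured text in an earlier step) and fits a simple forecast. It assumes pandas and scikit-learn; all column names and numbers are invented for the example.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Structured pipeline: weekly sales from a transactional system (hypothetical data).
sales = pd.DataFrame({
    "week": [1, 2, 3, 4, 5, 6],
    "units_sold": [120, 135, 150, 110, 160, 175],
    "avg_price": [9.99, 9.99, 8.99, 10.49, 8.49, 8.49],
})

# Unstructured pipeline, already reduced to a number: average social-media
# sentiment per week, produced by a separate text-processing step.
sentiment = pd.DataFrame({
    "week": [1, 2, 3, 4, 5, 6],
    "sentiment": [0.1, 0.3, 0.5, -0.2, 0.6, 0.7],
})

# Hybrid pipeline: join the two sources on a shared key, then forecast demand.
features = sales.merge(sentiment, on="week")
X = features[["avg_price", "sentiment"]]
y = features["units_sold"]

model = LinearRegression().fit(X, y)
next_week = pd.DataFrame({"avg_price": [8.99], "sentiment": [0.4]})
print(model.predict(next_week))  # enriched forecast for the coming week
```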
Descriptive AI: Patterns and Insights from Structured Data
Rewind to the early 2000s, and Descriptive AI dominated enterprise analytics. Tools like Tableau and SAP relied heavily on structured data pipelines to analyse historical trends and generate actionable insights.
For instance, retailers might use Descriptive AI to identify best-selling products during specific seasons. Dashboards and reports presented this information in clear, digestible formats. However, these systems were reactive: they could describe “what happened” but couldn’t predict future trends or suggest solutions.
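A minimal pandas sketch of that kind of descriptive question might look like the following; the order records and product names are invented, and the point is simply that everything here summarizes the past rather than forecasting anything.

```python
import pandas as pd

# Hypothetical transactional records from a structured pipeline.
orders = pd.DataFrame({
    "season":  ["winter", "winter", "summer", "summer", "summer", "spring"],
    "product": ["coat",   "scarf",  "sandals", "sandals", "hat",   "jacket"],
    "units":   [40,        25,       60,        30,        15,      20],
})

# Descriptive analytics answers "what happened": total units per product and
# season, then the best seller within each season.
by_season = orders.groupby(["season", "product"], as_index=False)["units"].sum()
best_sellers = by_season.loc[by_season.groupby("season")["units"].idxmax()]
print(best_sellers)  # e.g. sandals dominate summer, coats dominate winter
```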
During this era, unstructured data was rarely incorporated due to the lack of processing capabilities. Descriptive systems were limited by static, siloed pipelines that required manual updates and offered little scalability.
The Foundations of AI: Rule-Based Systems and Machine Learning
The story of AI begins with Symbolic AI, also known as rule-based systems, which dominated from the 1950s to the 1980s. These early systems relied on explicit rules and logic programming to emulate human decision-making. Examples include diagnostic tools in medicine, which followed a strict sequence of "if-then" rules. (Evolution of Symbolic AI)
Symbolic AI depended on small, curated structured datasets, such as patient medical records or financial ledgers. However, it struggled to handle ambiguity or scale beyond predefined rules. These limitations became apparent as the real-world variability of data increased.
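A toy sketch of that "if-then" style is shown below. The rules are entirely invented (and obviously not medical advice); what matters is that every behaviour is hand-written, so any case the author did not anticipate simply falls through, which is exactly the brittleness described above.

```python
def diagnose(symptoms: set[str]) -> str:
    """Toy rule-based system: explicit if-then rules, no learning involved.

    The rules are made up for illustration only.
    """
    if {"fever", "cough", "shortness_of_breath"} <= symptoms:
        return "suspect respiratory infection; order chest X-ray"
    if {"fever", "rash"} <= symptoms:
        return "suspect viral infection; order blood test"
    if "headache" in symptoms:
        return "recommend rest and fluids"
    return "no rule matched; refer to a physician"

print(diagnose({"fever", "cough", "shortness_of_breath"}))
print(diagnose({"fatigue"}))  # falls through: the system cannot generalize
```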
The 1990s marked the rise of Machine Learning (ML). Unlike rule-based systems, ML models learned from data. Early neural networks demonstrated the ability to recognize patterns without explicit programming. This period saw the digitization of records, leading to more dynamic structured pipelines. However, unstructured data remained out of reach due to technological limitations.
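The contrast with the hand-written rules above can be sketched in a few lines of scikit-learn: here a small classifier infers its own decision rules from labelled examples instead of being programmed with them. The dataset is tiny and fabricated, purely to illustrate the shift from coding rules to learning them.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Tiny structured dataset: [fever, cough] as 0/1 flags, with labels taken
# from past cases rather than hand-written rules (illustrative only).
X = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
y = ["infection", "other", "other", "healthy", "infection", "healthy"]

model = DecisionTreeClassifier(random_state=0).fit(X, y)
print(export_text(model, feature_names=["fever", "cough"]))  # rules learned from data
print(model.predict([[1, 1]]))
```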
Conclusion: The Future of AI and Data Pipelines
AI’s evolution highlights a recurring theme: its progress has been driven by advancements in data pipelines. Descriptive AI relied on static structured pipelines, Predictive AI optimized these pipelines for forecasting, and Generative AI unlocked the potential of unstructured data. Each stage reflected the prevailing technology of its time and its constraints.
Today, Generative AI dominates discussions, yet its limitations remind us that AGI remains elusive. The computational demands of unstructured pipelines and the lack of reasoning capabilities present significant hurdles. However, the progress made so far suggests a promising future, where AI and data pipelines continue to co-evolve.
Looking ahead, pipelines that seamlessly integrate structured and unstructured data could redefine the boundaries of AI. Could the next generation of AI models combine reasoning with creativity, bridging the gap to AGI? Only time will tell. For now, AI’s story remains one of relentless innovation, with its greatest breakthroughs likely still ahead.