!1
Full Human-Level Artificial Intelligence
and … Data Stream Processing
Heh, I got this!
Paris Carbone
▪ There is one known runtime for human-level intelligence
▪ What is so special about the human brain structure?
!3
Neurobiological Foundations of Action Planning and Execution - Human Action Control — B.Hommel et al.
▪ Diverse functionality/workloads
▪ Common runtime (neurons)
▪ The Brain Neural Network Runtime
!4
▪ Distributed
▪ Organised in Logical Units
▪ Embedded State with Computation
▪ Shared Network
▪ Configured Data Dependencies
▪ Messages (signals)
▪ Supports Low-latency Serving
▪ Supports Incremental Updates
▪ Supports Iterative Tasks
▪ Asynchronous Processing
▪ 100% Organic
▪ The Data Stream Processing Runtime
!5
▪ Distributed
▪ Organised in Logical Units
▪ Embedded State with Computation
▪ Shared Network
▪ Configured Data Dependencies
▪ Messages (signals)
▪ Supports Low-latency Serving
▪ Supports Incremental Updates
▪ Supports Iterative Tasks
▪ Asynchronous Processing
▪ 100% Organic
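Two of the properties listed above, "Embedded State with Computation" and "Supports Incremental Updates", can be sketched in plain Scala. This is only an illustration under assumptions: `RunningAverage`, `onMessage`, and `RunningAverageDemo` are hypothetical names, not part of any system mentioned in the talk.

```scala
// Hypothetical stateful operator: the state (sum, count) lives next to the
// computation and is updated incrementally, one message at a time.
final class RunningAverage {
  private var sum: Double = 0.0
  private var count: Long = 0L

  // Consume one message and return the updated average.
  def onMessage(value: Double): Double = {
    sum += value
    count += 1
    sum / count
  }
}

object RunningAverageDemo {
  // Feed a finite batch of messages through a fresh operator and collect
  // the average emitted after each message.
  def run(messages: Seq[Double]): Seq[Double] = {
    val op = new RunningAverage
    messages.map(op.onMessage)
  }
}
```

Each call touches only the operator's local state, so no earlier message has to be re-read to produce the next result.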
!6
▪ Compilers - Our first and best “super-human” invention
▪ Instead, compilers can understand instructions…
▪ explained by humans in a high-level declarative language
▪ and then optimise them
▪ and translate to stupid machines to execute them reliably
“A revolutionary technology
that does NOT require you to throw tons of data
at your problem to be able to solve it”
!7
▪ Our ‘Continuous Deep Analytics’ Project
Compilers + Data Streams
▪ Modern Data Pipelines need to combine diverse workloads!
(ML Training & Serving, Relational Algebra, Streams, Tensors, Graphs)
!8
[Figure: example workloads: Relational Data Streams (operator tree of joins ⋈, selections σθ, and projections π), Feature Learning, Tensor Programming, Dynamic Graphs]
!9
Arc Compiler
▪ diverse workloads
▪ common runtime
!10
Intelligence: Smart Choice / Response Time
[Figure: "time until decision" for four pipelines: Pipeline (CPU), Pipeline (GPU/TPU), Pipeline (CPU) - Optimised, Pipeline (GPU/TPU) - Optimised; annotated "critical decision making"]
!11
▪ It will be able to solve complex Climate Science problems, fast
val rawStreams = streams("models/*/ts*.nc").
  withType[LabelledTensor[Inf x Int x Int -> Double,
                          Float x (Float, Float) x (Float, Float)]].
  dimensionLabels('time x 'lat x 'lon);

// Per input stream: slice by time, tile to 360 x 720 and average each tile,
// then average over four kinds of time window.
val averageStreams = rawStreams.map { raw =>
  val timeSliced = raw.sliceBy('time);
  val aligned = timeSliced.tile(360 x 720).
    map(grid => average(grid));
  val gridSlices = aligned.sliceBy('lat, 'lon);
  val agg12h = gridSlices.window('time, t => t.between(TimeOfDay(6.h), TimeOfDay(18.h))).
    average;
  val agg1d = gridSlices.window('time, t => Day(t)).average;
  val agg1month = gridSlices.window('time, t => Month(t)).average;
  val agg1season = gridSlices.window('time, t => Month(t).in(
    Set(Dec, Jan, Feb),
    Set(Mar, Apr, May),
    Set(Jun, Jul, Aug),
    Set(Sep, Oct, Nov))).average;
  (agg12h, agg1d, agg1month, agg1season)
}.unzip4;

// Merge the streams on time/lat/lon, then subtract the average across
// models from each model.
val diffs = averageStreams.map { inv =>
  val merged = inv.mergeOn('time, 'lat, 'lon);
  val averageModels = merged.map(models => (models, average(models)));
  averageModels.map {
    case (models, avg) => models.map(t => t - avg)
  }
}
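The `window('time, ...).average` steps in the program above assign each element to a window by a key function and then average per window. A minimal sketch of that idea with ordinary Scala collections follows; `Reading`, `windowAverage`, and `daily` are hypothetical names, and batch collections stand in for the streaming DSL shown on the slide.

```scala
object WindowSketch {
  // Hypothetical record: a timestamp in hours since some epoch and a value.
  final case class Reading(hour: Long, value: Double)

  // Assign each reading to a window via `key`, then average per window.
  def windowAverage[K](readings: Seq[Reading])(key: Reading => K): Map[K, Double] =
    readings.groupBy(key).map { case (k, rs) =>
      k -> (rs.map(_.value).sum / rs.size)
    }

  // Daily windows: readings whose hour falls in the same 24-hour block.
  def daily(readings: Seq[Reading]): Map[Long, Double] =
    windowAverage(readings)(r => r.hour / 24)
}
```

Swapping the key function (day, month, season) changes the window type without touching the aggregation, which mirrors how the four aggregates in the slide differ only in their window argument.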
!12
▪ And generate an optimised stream process graph (program)
[Figure: stream process graph. Sources (src1 and src20 are labelled) feed tile, then map: average, then windows of 12h, 1d, month, and season, each aggregated (average) with a shared tree of partials. Per window type, an equi-join on time slices followed by map: average then diff feeds a sink (12h, 1d, month, season).]
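The graph labels several windows as "aggregate with shared tree of partials: average". One way to read this is that fine-grained partial aggregates are computed once and reused by coarser windows. The sketch below illustrates that reading only; `Partial`, `dailyPartials`, `coarser`, and `daysPerBucket` are hypothetical names, and fixed-size buckets are a stand-in for month or season boundaries.

```scala
object SharedPartials {
  // A partial aggregate for averages: combinable without revisiting raw data.
  final case class Partial(sum: Double, count: Long) {
    def combine(other: Partial): Partial =
      Partial(sum + other.sum, count + other.count)
    def average: Double = sum / count
  }

  // Leaf level: one partial per day, computed once from (day, value) pairs.
  def dailyPartials(values: Seq[(Int, Double)]): Map[Int, Partial] =
    values.groupBy(_._1).map { case (day, vs) =>
      day -> Partial(vs.map(_._2).sum, vs.size.toLong)
    }

  // Coarser level: reuse the daily partials instead of the raw values.
  def coarser(daily: Map[Int, Partial], daysPerBucket: Int): Map[Int, Partial] =
    daily.groupBy { case (day, _) => day / daysPerBucket }.map {
      case (bucket, ps) => bucket -> ps.values.reduce(_ combine _)
    }
}
```

Because `(sum, count)` pairs combine associatively, the same daily partials can serve several coarser windows, so the raw values are read once.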
!13
Using an Intermediate Representation (IR)
[Figure: functions f and f’ are each translated to IR; together with data knowledge, the two IRs are combined into a single IR for f+f’]
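Combining f and f’ at the IR level can be illustrated with a toy rewrite rule that fuses two adjacent map operators into one. This is a sketch under assumptions, not the Weld or Arc IR: `Ir`, `Source`, `MapOp`, `fuse`, `size`, and `eval` are hypothetical names chosen for the example.

```scala
object FusionSketch {
  // A toy IR with a source and a map operator over Double values.
  sealed trait Ir
  final case class Source(data: Vector[Double]) extends Ir
  final case class MapOp(input: Ir, f: Double => Double) extends Ir

  // Rewrite rule: MapOp(MapOp(x, f), g) becomes MapOp(x, f andThen g),
  // so the data is traversed once instead of twice.
  def fuse(ir: Ir): Ir = ir match {
    case MapOp(inner, g) =>
      fuse(inner) match {
        case MapOp(x, f) => MapOp(x, f andThen g)
        case other       => MapOp(other, g)
      }
    case source: Source => source
  }

  // Count operators, to observe the effect of the rewrite.
  def size(ir: Ir): Int = ir match {
    case MapOp(input, _) => 1 + size(input)
    case _: Source       => 1
  }

  // Reference interpreter for the toy IR.
  def eval(ir: Ir): Vector[Double] = ir match {
    case Source(data)    => data
    case MapOp(input, f) => eval(input).map(f)
  }
}
```

The fused plan evaluates to the same result as the original with one operator fewer. The rewrite needs no knowledge of which frontend produced each function, which is the point of going through a common IR.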
!14
Weld IR (Stanford DAWN Project)
+ supports large number of existing libraries
- currently limited to short-lived local task execution
Matei Zaharia (Spark architect) et al.
The Arc Compilation Stack
Arc: Weld for Streams
[Figure: compilation stack: Frontends → Intermediate Representation (IR) → Logically Optimised IR → Physically Optimised IR → Binaries; Stream Metadata and Available Resources are shown as inputs]
!16
JIT - Live Rewiring of Continuous Programs
[Figure: feedback loop: Physically Optimised IR → Binaries → Monitoring (Change in Resources, Change in Load Distribution) → Discovered better Plan]
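The decision step in such a loop could look roughly like the sketch below: monitoring reports metrics, candidate plans are re-costed, and the running plan is replaced only when a candidate is clearly better. Everything here is an assumption made for illustration: `Metrics`, `Plan`, the toy cost model in `load`, and the `margin` parameter do not come from the talk.

```scala
object ReplanSketch {
  // Hypothetical runtime metrics reported by monitoring.
  final case class Metrics(availableCores: Int, eventsPerSecond: Double)

  // Hypothetical physical plan with a toy cost model:
  // estimated seconds of work per core for one second of input.
  final case class Plan(name: String, parallelism: Int, costPerEvent: Double) {
    def load(m: Metrics): Double = {
      val usable = math.max(1, math.min(parallelism, m.availableCores))
      m.eventsPerSecond * costPerEvent / usable
    }
  }

  // Switch only if a candidate beats the running plan by `margin`,
  // which avoids flapping between near-equal plans.
  def choose(current: Plan, candidates: Seq[Plan], m: Metrics,
             margin: Double = 0.1): Plan = {
    val best = (current +: candidates).minBy(_.load(m))
    if (best.load(m) < current.load(m) * (1.0 - margin)) best else current
  }
}
```

With more cores available a wider plan wins. When resources shrink so that both plans cost the same, the running plan is kept.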
!17
The Current CDA Team (RISE SICS + KTH)
Areas: Computer Systems, Machine Learning
Members: Lars Kroll, Paris Carbone, Christian Schulte, Seif Haridi, Theodore Vasiloudis, Daniel Gillblad
MSc Students
• Klas Segeljakt
• Oscar Bjuhr
• Johan Mickos
▪ The Brain Neural Network Runtime
!18
▪ Distributed
▪ Organised in Logical Units
▪ Embedded State with Computation
▪ Shared Network
▪ Configured Data Dependencies
▪ Messages (signals)
▪ Supports Low-latency Serving
▪ Supports Incremental Updates
▪ Supports Iterative Tasks
▪ Asynchronous Processing
▪ 100% Organic
▪ Just in Time Reconfiguration
▪ Executes Declarative Instructions Reliably

A Future Look of Data Stream Processing as an Architecture for AI
