© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Customer Journey with Streaming Data on AWS
Amazon Web Services (AWS) offers over 165 fully featured cloud services from data
centers globally. AWS launched its first data streaming service, Amazon Kinesis Data
Streams, over five years ago. Now, customers are using streaming data across most
AWS services, including two that support running Apache Flink: Amazon EMR and
Amazon Kinesis Data Analytics. In this keynote, we will describe how customers and
their use of streaming data have evolved on AWS. We will look at how streaming data
and Apache Flink are used externally and internally on AWS, and where we see usage
of Apache Flink growing.
Rahul Pathak
General Manager of Databases, Analytics, and Blockchain
Amazon Web Services
Customer Journey with Streaming
Data on AWS
Building with our customers
• 2013: First fully managed streaming service, Amazon Kinesis Data Streams
• 2016: Support for Apache Flink in Amazon EMR
• 2018: Support for Apache Flink-based apps in Amazon Kinesis Data Analytics
• 2019: 50+ fully managed streaming capabilities deployed globally in 22 AWS Regions
Streaming data on AWS in 2013
Internal and external customers struggling with high-volume data
• Low-latency, continuous data capture
• Durable storage to quickly get data from unreliable sources
• Scale to handle large volumes of data cost-effectively
AWS Metering and Billing
What did customers look like in the past?
The “Average” customer… was attracted by ease of use
• Java developer
• Had one application processing 10s of millions of events per day
• Application performed extract, buffer, and load to Amazon S3
The “Large” customer… was attracted by performance and scale
• Had distributed systems experience and was familiar with Hadoop
• Had two applications processing billions of events per day
• Application performed advanced ETL like joins
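The "extract, buffer, and load" pattern above can be sketched roughly as follows. This is a minimal illustration, not the code any customer actually ran: the `BufferedLoader` class, its thresholds, and the `sink` callable are all hypothetical, and a real application would replace the sink with a write to Amazon S3.

```python
import time
from typing import Callable, List, Optional


class BufferedLoader:
    """Hypothetical helper: buffers records and hands them to a sink in batches."""

    def __init__(self, sink: Callable[[List[bytes]], None],
                 max_records: int = 500, max_age_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.sink = sink                      # assumed stand-in for an S3 writer
        self.max_records = max_records        # flush when this many records are buffered
        self.max_age_seconds = max_age_seconds  # or when the oldest record is this old
        self.clock = clock
        self.buffer: List[bytes] = []
        self.first_record_at: Optional[float] = None

    def add(self, record: bytes) -> None:
        # Remember when the current batch started so age-based flushing works.
        if not self.buffer:
            self.first_record_at = self.clock()
        self.buffer.append(record)
        if self._should_flush():
            self.flush()

    def _should_flush(self) -> bool:
        too_many = len(self.buffer) >= self.max_records
        too_old = self.clock() - self.first_record_at >= self.max_age_seconds
        return too_many or too_old

    def flush(self) -> None:
        # Hand the whole batch to the sink, then start a fresh buffer.
        if self.buffer:
            self.sink(self.buffer)
            self.buffer = []
            self.first_record_at = None


batches: List[List[bytes]] = []
loader = BufferedLoader(sink=batches.append, max_records=3)
for i in range(7):
    loader.add(f"event-{i}".encode())
loader.flush()  # flush the final partial batch
print([len(b) for b in batches])  # [3, 3, 1]
```

Batching by both count and age is one common design choice: it keeps objects reasonably large under heavy traffic while bounding delivery delay when traffic is light.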
Early streaming data customers on AWS
• Streaming extract and load to Amazon S3
• 50 billion daily ad impressions, sub-50 ms responses
• 100 GB/day clickstreams from 250+ sites
Supercell Delivers World-Class Mobile Games
Supercell is a Finnish game company known for the hit
games Clash of Clans, Hay Day, and Boom Beach.
“The world of gaming never sleeps ... We owe every player a great
experience, and AWS is our platform to make that happen.”
Sami Yliharju, Services Lead, Supercell
• Started with a non-streaming architecture
• Uses streaming data for faster ETL and analytics
• Started with archival apps and kept adding use cases
Supercell’s data pipeline circa 2012
Supercell’s streaming pipeline circa 2018
Simplify common use cases
Every customer built a streaming delivery app
• Load streaming data into streams, data lakes and
warehouses
• Zero administration and seamless elasticity
• Direct-to-data store integration
• Serverless continuous data transformations
Make advanced use cases accessible
Few customers were able to move to real-time analytics
• Analyze data streams in real time
• Interact with streaming data in real time using Apache Flink-based apps
• Build fully managed and serverless stream processing
applications
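To make "analyze data streams in real time" concrete, here is a minimal sketch of a tumbling-window count, one of the basic building blocks of stream analytics. It is plain Python over an in-memory list, not Apache Flink or Kinesis Data Analytics code; the function name, the `(timestamp, key)` event shape, and the sample data are assumptions for illustration. A real stream processor would also handle out-of-order events, state, and fault tolerance.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_counts(events: Iterable[Tuple[float, str]],
                           window_seconds: float) -> Dict[int, Counter]:
    """Count events per key in fixed, non-overlapping time windows.

    Each event is a (timestamp_in_seconds, key) pair (hypothetical shape).
    The result maps each window's start time to a Counter of keys.
    """
    windows: Dict[int, Counter] = defaultdict(Counter)
    for timestamp, key in events:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds * window_seconds)
        windows[window_start][key] += 1
    return dict(windows)


# Illustrative clickstream: two 10-second windows starting at 0 and 10.
events = [(0.5, "click"), (3.0, "view"), (9.9, "click"),
          (10.0, "click"), (14.2, "view")]
result = tumbling_window_counts(events, window_seconds=10)
for start in sorted(result):
    print(start, dict(result[start]))
```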
Apache Flink at AWS
Customers run Apache Flink on AWS on different services with varying degrees
of flexibility and management
• Amazon Kinesis Data Analytics
• Amazon Elastic Kubernetes Service
• Amazon EMR
Streaming data on AWS today
Streaming data on AWS in 2019
• <40% delivery apps
• 3 apps
• 3 data stores
What do customers do today?
The “Average” customer… is attracted by the ease of use and rich features
• New to streaming data (~50% of customers are still net new to streaming)
• Has several applications processing 100s of millions of events per day
The “Large” customer… is attracted to the above but needs flexibility and elasticity
• Uses many languages including SQL, Java, Python
• Has 10s of applications processing 10s of billions of events per day
The “Platform” customer… is attracted to all of the above plus performance
• Builds abstracted platforms on top of streaming services
• Has 100s of applications processing trillions of events per day
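For a sense of scale, the daily volumes above can be converted into average events per second. The specific figures below (300 million, 30 billion, and 1 trillion events per day) are assumed sample points within the ranges on the slide, not numbers from the talk, and real traffic is rarely spread evenly across the day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_events_per_second(events_per_day: float) -> float:
    """Convert a daily event volume into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


# Assumed sample volumes, chosen only to illustrate each tier's order of magnitude.
tiers = {
    "average (assumed 300 million/day)": 300e6,
    "large (assumed 30 billion/day)": 30e9,
    "platform (assumed 1 trillion/day)": 1e12,
}

for name, per_day in tiers.items():
    print(f"{name}: ~{average_events_per_second(per_day):,.0f} events/s")
```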
Streaming services are foundational at Amazon
• Amazon Go video analytics
• Amazon.com online catalog
• Amazon CloudWatch logs
• Amazon S3 events
• AWS metering
Customers built a wide variety of use cases
• Near-real-time home valuation (Zestimates)
• Live clickstream dashboards refreshed in under 10 s
• 1 billion events per week from connected devices
• Real-time game events analytics
• Event-driven microservices architecture (StreamHub)
• Online stylist processing 10 million events/day
• Communications between 100+ microservices
• IoT predictive analytics
• Log analytics for a real-time “single pane of glass”
• Serverless event bus and ingestion pipeline
Autodesk makes software for people who make things
Autodesk, a leading provider of 3D design and
engineering software, wants to do more than create
and deliver software.
“Ultimately, we are improving our software products and offering
better service to our customers because of the real-time visibility
we’re getting into log data.”
Tommy Li, Senior Software Architect, Autodesk
• Provides cloud services for its design software
• Uses streaming analytics to monitor and improve its customer experience
Autodesk’s Streaming Architecture
• Amazon Kinesis Data Streams
• Amazon EC2
• Amazon Elastic Container Service
• AWS Lambda
• Amazon Kinesis Data Firehose
• Amazon Kinesis Data Analytics
• Amazon Elasticsearch Service
• Amazon Athena
• Amazon CloudWatch
AWS Data Center Services (DCS) uses Apache Flink
DCS maintains and monitors electrical and physical
topologies for all AWS Data Centers
“We chose Apache Flink because it provided simple, extensible interfaces and
scalability. We chose Kinesis Data Analytics because of its guaranteed uptime,
simple deployment mode, and lower ops cost.”
AWS Data Center Team
• Writes software to interface with equipment in data centers
• Collected data includes electrical power draw, water usage, weather, fan speeds, and host temperatures
• Uses insights to drive down the cost of data center operations
AWS Data Center Services
• Data is captured from Kinesis Data Streams and from change data capture (CDC) on a NoSQL store (Amazon DynamoDB Streams)
• Analytics are calculated using Apache Flink on Kinesis Data Analytics
• Analytics range from drift in circuit breaker settings to power utilization, and much more
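The slides do not say how drift in circuit breaker settings is computed. As one possible illustration only, the sketch below compares a stream of setting changes (standing in for CDC records) against a known baseline and reports any breaker whose setting has moved. The function name, the record shape, the breaker IDs, and the numeric values are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def detect_setting_drift(baseline: Dict[str, float],
                         changes: Iterable[Tuple[str, float]],
                         tolerance: float = 0.0) -> List[Tuple[str, float, float]]:
    """Return (breaker_id, expected, observed) for settings that left the baseline.

    `changes` is a hypothetical stand-in for change records read from a
    CDC stream: each item is (breaker_id, new_setting).
    """
    drifted = []
    for breaker_id, new_value in changes:
        expected = baseline.get(breaker_id)
        if expected is None:
            continue  # unknown breaker: no baseline to compare against
        if abs(new_value - expected) > tolerance:
            drifted.append((breaker_id, expected, new_value))
    return drifted


# Illustrative data only; IDs and values are made up.
baseline = {"breaker-a": 400.0, "breaker-b": 225.0}
changes = [("breaker-a", 400.0), ("breaker-b", 250.0), ("breaker-c", 100.0)]
print(detect_setting_drift(baseline, changes))
# [('breaker-b', 225.0, 250.0)]
```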
What will customers look like in the future?
The “Average” customer… doesn’t even know they are using Flink or a
streaming data service
The “Large” customer… has teams across the company with varying levels of
technical sophistication
The “Platform” customer… is any company, no longer requiring teams of
engineers and years of investment
You may already be here, but streaming is still new for most
Apache Flink at AWS in the future
• Flink is the fastest-growing framework reading Kinesis Data Streams
• Usage is still relatively small compared to the simplest solutions (e.g., KafkaConsumer, Kinesis clients) running on EC2
• Excited to work with the community on further simplifying running Flink, both on AWS and anywhere else
It is still Day 1
Thank you!
Flink Forward
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial Services
Flink Forward
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
Flink Forward
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
Flink Forward
 

Keynote: Customer Journey with Streaming Data on AWS - Rahul Pathak, AWS

  • 1. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Customer Journey with Streaming Data on AWS: Amazon Web Services (AWS) offers over 165 fully featured cloud services from data centers globally. AWS launched its first data streaming service, Amazon Kinesis Data Streams, over five years ago. Now, customers are using streaming data across most AWS services, including two that support running Apache Flink: Amazon EMR and Amazon Kinesis Data Analytics. In this keynote, we will describe how customers and their use of streaming data have evolved on AWS. We will look at how streaming data and Apache Flink are used externally and internally on AWS, and where we see usage of Apache Flink growing.
  • 2. Customer Journey with Streaming Data on AWS. Rahul Pathak, General Manager of Databases, Analytics, and Blockchain, Amazon Web Services
  • 3. Building with our customers
      • 2013: First fully managed streaming service, Amazon Kinesis Data Streams
      • 2016: Support for Apache Flink in Amazon EMR
      • 2018: Support for Apache Flink based apps in Amazon Kinesis Data Analytics
      • 2019: 50+ fully managed streaming capabilities deployed globally in 22 AWS Regions
  • 4. Streaming data on AWS in 2013
      Internal and external customers struggling with high-volume data (internal example: AWS Metering and Billing)
      • Low-latency, continuous data capture
      • Durable storage to quickly get data from unreliable sources
      • Scale to cost-effectively handle lots of data
  • 5. What did customers look like in the past?
      The “Average” customer… was attracted by ease of use
      • Java developer
      • Had one application processing 10s of millions of events per day
      • Application performed extract, buffer, and load to Amazon S3
      The “Large” customer… was attracted by performance and scale
      • Had distributed systems experience and was familiar with Hadoop
      • Had two applications processing billions of events per day
      • Application performed advanced ETL like joins
  • 6. Early streaming data customers on AWS
      • Streaming extract and load to Amazon S3
      • 50 billion daily ad impressions, sub-50 ms responses
      • 100 GB/day clickstreams from 250+ sites
  • 7. Supercell Delivers World-Class Mobile Games
      Supercell is a Finnish game company known for the hit games Clash of Clans, Hay Day, and Boom Beach.
      “The world of gaming never sleeps ... We owe every player a great experience, and AWS is our platform to make that happen.” Sami Yliharju, Services Lead, Supercell
      • Started with a non-streaming architecture
      • Use streaming data for faster ETL and analytics
      • Started with archival apps and kept adding use cases
  • 8. Supercell’s data pipeline circa 2012
  • 9. Supercell’s streaming pipeline circa 2018
  • 10. Simplify common use cases
      Every customer built a streaming delivery app
      • Load streaming data into streams, data lakes, and warehouses
      • Zero administration and seamless elasticity
      • Direct-to-data store integration
      • Serverless continuous data transformations
  • 11. Make advanced use cases accessible
      Few customers were able to move to real-time analytics
      • Analyze data streams in real time
      • Interact with streaming data in real time using Apache Flink-based apps
      • Build fully managed and serverless stream processing applications
  • 12. Apache Flink at AWS
      Customers run Apache Flink on AWS on different services with varying degrees of flexibility and management
      • Amazon Kinesis Data Analytics
      • Amazon Elastic Kubernetes Service
      • Amazon EMR
  • 13. Streaming data on AWS today
  • 14. Streaming data on AWS in 2019: <40% delivery apps; 3 apps; 3 data stores
  • 15. What do customers do today?
      The “Average” customer… is attracted by the ease of use and rich features
      • New to streaming data (~50% of customers are still net new to streaming)
      • Has several applications processing 100s of millions of events per day
      The “Large” customer… is attracted to the above but needs flexibility and elasticity
      • Uses many languages including SQL, Java, Python
      • Has 10s of applications processing 10s of billions of events per day
      The “Platform” customer… is attracted to all of the above plus performance
      • Builds abstracted platforms on top of streaming services
      • Has 100s of applications processing trillions of events per day
  • 16. Streaming services are foundational at Amazon
      • Amazon Go video analytics
      • Amazon.com online catalog
      • Amazon CloudWatch logs
      • Amazon S3 events
      • AWS metering
  • 17. Customers built a wide variety of use cases
      • Near-real-time home valuation (Zestimates)
      • Live clickstream dashboards refreshed under 10s
      • 1 billion events per week from connected devices
      • Real-time game events analytics
      • Built event-driven, microservices architecture StreamHub
      • Online stylist processing 10 million events/day
      • Facilitate communications between 100+ microservices
      • IoT predictive analytics
      • Log analytics for real-time “single pane of glass”
      • Serverless event bus and ingestion pipeline
  • 18. Autodesk makes software for people who make things
      Autodesk, a leading provider of 3D design and engineering software, wants to do more than create and deliver software.
      “Ultimately, we are improving our software products and offering better service to our customers because of the real-time visibility we’re getting into log data.” Tommy Li, Senior Software Architect, Autodesk
      • Provides cloud services for its design software
      • Uses streaming analytics to monitor and improve its customer experience
  • 19. Autodesk’s Streaming Architecture: Amazon Kinesis Data Streams, Amazon EC2, Amazon Elastic Container Service, AWS Lambda, Amazon Kinesis Data Firehose, Amazon Kinesis Data Analytics, Amazon Elasticsearch Service, Amazon Athena, Amazon CloudWatch
  • 20. AWS Data Center Services (DCS) uses Apache Flink
      DCS maintains and monitors electrical and physical topologies for all AWS Data Centers
      “We chose Apache Flink because it provided simple, extensible interfaces and scalability. We chose Kinesis Data Analytics because of its guaranteed uptime, simple deployment mode, and lower ops cost.” AWS Data Center Team
      • Writes software to interface with equipment in data centers
      • Data includes electrical power draw, water usage, weather, fan speeds, and host temperatures
      • Uses insights to drive down the cost of data center operations
  • 21. AWS Data Center Services
      • Data is captured from Kinesis Data Streams and via CDC from a NoSQL store (Amazon DynamoDB Streams)
      • Analytics are calculated using Apache Flink on Kinesis Data Analytics
      • Analytics range from drift in circuit breaker settings to power utilization and much more
  • 22. What will customers look like in the future?
      • The “Average” customer… doesn’t even know they are using Flink or a streaming data service
      • The “Large” customer… has teams across the company with varying levels of technical sophistication
      • The “Platform” customer… is any company, no longer requiring teams of engineers and years of investment
      You may already be here, but streaming is still new for most
  • 23. Apache Flink at AWS in the future
      • Flink is the fastest-growing framework reading Kinesis Data Streams
      • Usage is still relatively small compared to the simplest solutions (e.g. KafkaConsumer, Kinesis Clients) running on EC2
      • Excited to work with the community on further simplifying running Flink, both on AWS and anywhere else
  • 24. It is still Day 1
  • 25. Thank you!
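
Slides 5 and 10 describe the delivery pattern that almost every early customer built: extract records from a stream, buffer them, and load batches into a store such as Amazon S3. The following is a minimal sketch of that pattern in Python, not the implementation of any AWS service. `BufferedLoader`, its thresholds, and the `sink` callable are hypothetical names chosen for illustration; the sink stands in for an upload to a data store, and no AWS API is called.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BufferedLoader:
    """Buffers records and hands them to a sink in newline-joined batches."""

    sink: Callable[[bytes], None]  # hypothetical stand-in for a data store upload
    max_records: int = 3           # illustrative thresholds, not service defaults
    max_bytes: int = 1024
    _buffer: List[bytes] = field(default_factory=list)
    _size: int = 0

    def add(self, record: bytes) -> None:
        # Flush first if this record would push the buffer past the byte limit.
        if self._buffer and self._size + len(record) > self.max_bytes:
            self.flush()
        self._buffer.append(record)
        self._size += len(record)
        # Flush once the record-count limit is reached.
        if len(self._buffer) >= self.max_records:
            self.flush()

    def flush(self) -> None:
        # Deliver whatever is buffered as one batch, then reset the buffer.
        if not self._buffer:
            return
        self.sink(b"\n".join(self._buffer))
        self._buffer = []
        self._size = 0


# Usage: collect batches in a list instead of writing to a real store.
batches: List[bytes] = []
loader = BufferedLoader(sink=batches.append)
for i in range(7):
    loader.add(f"event-{i}".encode())
loader.flush()  # deliver the final partial batch
```

The sketch flushes on record count and byte size only. A production delivery app would typically also flush on a time interval, so that a slow stream does not hold records indefinitely, and would retry failed loads.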