Building scalable IoT apps
using OSS technologies
Pavel Hardak
Basho Technologies
Disclaimer: some of the opinions expressed here are mine and might not fully agree with those of
IOT & INDUSTRY VERTICALS
IoT market - growth prediction
Number of connected “things”
• 2016 – about 6.4 B
• 30% YoY growth, 5.5M activations per day
• 2020 – about 21 B
“By 2020 more than half of new major business processes
and systems will incorporate some element of Internet of
Things”
Reality Check - let us get a second opinion
Building Scalable IoT Apps (QCon S-F)
IoT Project Plan
• Investigate those “things” and figure out
• What protocols they support (CoAP, MQTT, HTTP, …)
• What data they generate (temperature, humidity, location, speed, ...)
• Collect this data in our data center
• Implement protocols and parsing routines
• Store into persistent storage (“Data Lake” architecture)
• Once stored in Data Lake
• Analyze, summarize, “slice and dice”
• Predict, discover insights
• Declare victory – make a profit & go for an IPO
[Diagram: IoT Devices → MQTT, CoAP and HTTP → Data Lake → SQL → Apps & Analytics]
REFERENCE ARCHITECTURE (?)
Not so fast, my friend.
What is wrong with “Data Lake” for IoT ?
Auto Insurance - Micro Case Study
• One of the top 5 auto insurance companies in the USA, on the Fortune 500 list
• More than $10B in annual revenue, above $15B in assets
• About 20,000 employees and 50,000 insurance agents
• More than 19 million individual policies across all 50 states
How does this “rating info” influence your payment?
• Garaging Zip – in what neighborhood is the car parked when it is
not in use? There is a high correlation between Zip code and the
probability of the car being stolen or vandalized.
• Current and Previous Annual Mileage – the longer the distances
the insured drives, the higher the probability of road
accidents or car malfunctions.
• Vehicle Usage – do you use your car for work or pleasure? Are
you a commuter, a student, a stay-at-home parent or an Uber driver?
Depending on your usage, the company will calculate the risk
and adjust the rate.
• Years of Driving Experience – young drivers are put into
higher risk categories, while older drivers are considered safer
due to more time behind the wheel. Note: this compares the average
young driver with the average experienced driver.
Sampling Frequency and Dataset Size
• Mileage
• From one sample per year to 52 (weekly) or 365 (daily)
• Better - let us do hourly to “see” the car usage (commuter, …)
• Location (used to be “Garaging Zip”)
• From one sample per year to 365 (daily)
• Better - hourly, which shows when the car is parked for several hours
• New factors for rating algorithm based on weekly summaries
• Hard brakes, hard accelerations, going above the speed limit, …
• Amount of time series data to be stored and analyzed
• Grows by factor of 365x, then by another 24x = 8760x
Each week – at least 50x more data than the whole previous year.
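The growth factors above can be checked with a few lines of arithmetic. This is a minimal sketch: the 19 million policies come from the case study, while the assumption of exactly one sample per policy per interval is made here purely for illustration.

```python
# Samples per car per year at each sampling frequency named on the slide.
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365
WEEKS_PER_YEAR = 52

samples_per_year = {
    "yearly": 1,
    "weekly": WEEKS_PER_YEAR,
    "daily": DAYS_PER_YEAR,
    "hourly": DAYS_PER_YEAR * HOURS_PER_DAY,
}


def growth_factor(frequency: str) -> int:
    """How many times more samples than the old one-per-year baseline."""
    return samples_per_year[frequency] // samples_per_year["yearly"]


def samples_per_week(frequency: str) -> float:
    """Average number of samples collected per car in one week."""
    return samples_per_year[frequency] / WEEKS_PER_YEAR


# Assumption for illustration: one sample per policy per interval.
POLICIES = 19_000_000
hourly_samples_per_year = POLICIES * samples_per_year["hourly"]

print(growth_factor("daily"))    # 365
print(growth_factor("hourly"))   # 8760
print(hourly_samples_per_year)   # 166440000000
```

With hourly sampling, one week already holds well over 50 samples per car, compared with the single sample per year collected before.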
What is special about IoT?
It is about the “things”… and more.
IoT Data Categories
Category | Subcategory | Description
Metadata & Profiles | Devices | Device info (model, SN, firmware, sensors, ...), configuration, owner, …
Metadata & Profiles | Users | Personal info, preferences, billing info, registered devices, …
Time Series | Ingested (“Raw”) | Measurements, statuses and events from devices.
Time Series | Aggregated (“Derived”) | Calculated data - from devices & profiles
• Rollups – aggregate metrics from finer time intervals into coarser ones (minute → hour → day) using min, max, avg, ...
• Aggregations – aggregate measurements, configuration and profiles (model, region, …) over time ranges
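A rollup can be sketched in a few lines. This is an illustrative example only, not code from any particular database: the function name `rollup`, the `(epoch_seconds, value)` sample layout and the readings are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def rollup(samples, bucket_seconds):
    """Group (epoch_seconds, value) samples into fixed-size time buckets
    and compute min, max, avg and count for each bucket."""
    buckets = defaultdict(list)
    for ts, value in samples:
        bucket_start = ts - (ts % bucket_seconds)  # align to bucket boundary
        buckets[bucket_start].append(value)
    return {
        start: {
            "min": min(values),
            "max": max(values),
            "avg": mean(values),
            "count": len(values),
        }
        for start, values in sorted(buckets.items())
    }


# Hypothetical per-minute temperature readings: (epoch seconds, degrees).
raw = [(0, 20.0), (60, 21.0), (120, 22.0), (3600, 30.0), (3660, 32.0)]
hourly = rollup(raw, bucket_seconds=3600)
```

Keeping `count` next to `avg` allows hourly rollups to be combined into daily ones later with a weighted average, without going back to the raw samples.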
IOT - NETWORKING TECHNOLOGIES
NETWORK WISH LIST
• Extreme Reliability
• Guaranteed Delivery
• End-to-End Low Latency
• Quality of Service
• Engineered Topology
• Committed Bandwidth (CIR)
• Fiber-optic network
• Dedicated Channel
• Strong Signal
• Interference and Crosstalk Resistant
• High SNR (Signal to Noise Ratio)
• Very Low BER (Bit Error Rate)
REALITY CHECK - LET US LOOK AGAIN
IOT & NETWORK - REALITY
• Wireless technologies
• Shared transmission media
• Limited bandwidth
• Mesh or Ad-hoc Topology
• Possible signal interference
• Mis-ordered or lost packets
• Low cost hardware components
• Low power radio transmitters
• Very small antennas
• “Custom-made” firmware
• Constrained Application Protocol (CoAP)
• “Best Effort” QoS (“shoot and forget”)
IoT is “Big Data” - by definition.
Actually, lots and lots of Big Data.
Five “V”s of IoT data
Velocity | Torrent of small writes (sensors). Reads – millions of low-latency queries: user and device profiles, range queries for TS data (slices). Stream of updates (profiles) - beware of conflicts.
Variety | Sensor data (time series), user and device profiles, and time series “derived” data (e.g. rollups, aggregations).
Volume | Starts small, grows quickly, keeps coming 24x7x365 (nights, weekends and holidays). Spikes on new model launches or successful marketing campaigns. Growth can slow down, but the volume keeps growing. An efficient data retention policy is critical to prevent overflows.
Veracity | Generally trustworthy, but beware of “low cost” sensors with low accuracy. Sent over not-so-reliable transport - expect that some data will be corrupted, arrive late or be lost. (Hopefully the devices were not hijacked or impersonated by hackers.)
Value | Profiles and summaries are much more valuable than raw data samples. The value of “raw” time series goes down quickly once it has been processed and as the clock advances. Aggregated (“derived”) data are more valuable than raw data. Exceptions: financial transactions, life support, nuclear plants, oil rigs, …
Complexity | Poly-structured, using simple schemas and simple relations (usually implicit). Some data is treated as unstructured (“opaque”) for speed or flexibility.
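The Veracity concerns (corrupted, duplicated or late data) can be handled with a small cleaning step before storage. This is a minimal sketch under stated assumptions: the function name `clean`, the tuple layout and the valid range are hypothetical, and real pipelines usually need device-specific checks.

```python
def clean(samples, valid_range, seen=None):
    """Drop corrupted and duplicate samples, then restore time order.

    samples: iterable of (device_id, epoch_seconds, value) tuples.
    valid_range: (low, high) bounds; values outside are treated as corrupted.
    seen: optional set of (device_id, epoch_seconds) keys already stored.
    """
    low, high = valid_range
    seen = set() if seen is None else seen
    accepted = []
    for device_id, ts, value in samples:
        key = (device_id, ts)
        if key in seen:
            continue  # duplicate delivery, e.g. a retransmission
        if value is None or not (low <= value <= high):
            continue  # corrupted or implausible reading
        seen.add(key)
        accepted.append((device_id, ts, value))
    # Late or mis-ordered arrivals are fixed by sorting on device and time.
    accepted.sort(key=lambda sample: (sample[0], sample[1]))
    return accepted


# Hypothetical batch: out of order, one duplicate, two bad readings.
incoming = [
    ("car-1", 120, 21.5),
    ("car-1", 60, 21.0),
    ("car-1", 120, 21.5),
    ("car-1", 180, 999.0),
    ("car-1", 240, None),
]
cleaned = clean(incoming, valid_range=(-40.0, 85.0))
```

Lost samples cannot be recovered this way; they only show up as gaps in the cleaned series.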
What architecture would work ?
Architectural Blueprints
• Lambda Architecture by Nathan Marz (ex-Twitter)
• Kappa Architecture by Jay Kreps (Confluent)
• Zeta Architecture by Jim Scott (MapR)
• … and their variants
Lambda
Kappa
Zeta
Data Processing Framework for IoT
• Uses “Best of breed” OSS technologies
• Combines two paradigms
• “Speed Layer” – pipeline for Stream Processing for “Data in Motion”
• “Serving Layer” – analytics for “Data in Motion” and “Data at Rest”
• Every component is “Distributed by Design”
• Collection Layer
• Message Queue
• Stream Processing
• Data Storage (Database, Object System, Data Warehouse)
• Query and Analytics Engines
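The layers listed above can be illustrated with a single-process toy pipeline. This is only a sketch: every name in it is hypothetical, and the in-memory queue and dictionaries merely stand in for the distributed components a real deployment would use.

```python
import queue
from collections import defaultdict

message_queue = queue.Queue()    # stands in for the Message Queue
raw_store = defaultdict(list)    # "Data at Rest": raw time series per device
latest = {}                      # "Data in Motion": last value per device


def collect(device_id, ts, value):
    """Collection layer: accept a parsed reading and enqueue it."""
    message_queue.put({"device": device_id, "ts": ts, "value": value})


def process_stream():
    """Speed layer: drain the queue, update live state, persist raw data."""
    while not message_queue.empty():
        msg = message_queue.get()
        latest[msg["device"]] = msg["value"]
        raw_store[msg["device"]].append((msg["ts"], msg["value"]))


def query_average(device_id):
    """Serving layer: a simple analytic over data at rest."""
    values = [value for _, value in raw_store.get(device_id, [])]
    return sum(values) / len(values) if values else None


collect("thermostat-1", 0, 20.0)
collect("thermostat-1", 60, 22.0)
process_stream()
print(latest["thermostat-1"])         # 22.0
print(query_average("thermostat-1"))  # 21.0
```

Each function maps to one layer, so any of them can be replaced by a distributed component without changing the others.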
Data Access Patterns
Category | Subcategory | Description | R:W %
Metadata & Profiles | Devices & Users | Many low-latency small reads all over the dataset. Occasional updates, possibly by different “actors” (web, device, app); conflicts need to be prevented or resolved. Fewer creates and deletes. | 90:10
Time Series | Ingested (“Raw”) | Very high throughput of relatively small writes. Most reads are over a recent time range “slice”. Updates are rare (corrections). This category is the biggest part of the IoT application dataset. | 10:90
Time Series | Aggregated (“Derived”) | Mostly reads – users, platform services, reports. Writes are periodic, on each time interval or from batch jobs. | 80:20
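One possible way to resolve conflicting profile updates from different “actors” is a field-by-field, last-write-wins merge. This is a sketch of that one strategy, not a method prescribed by the talk; the function name, the `(timestamp, value)` layout and the sample profiles are assumptions.

```python
def merge_profiles(a, b):
    """Merge two replicas of a profile field by field (last write wins).

    Each replica maps field name -> (timestamp, value). For every field the
    entry with the newer timestamp is kept, so concurrent updates to
    different fields by different actors are both preserved.
    On equal timestamps the entry from replica `a` is kept.
    """
    merged = dict(a)
    for field, (ts, value) in b.items():
        if field not in merged or ts > merged[field][0]:
            merged[field] = (ts, value)
    return merged


# Hypothetical concurrent edits: the web UI changed units, the app changed email.
from_web = {"email": (100, "old@example.com"), "units": (250, "metric")}
from_app = {"email": (200, "new@example.com"), "units": (150, "imperial")}
profile = merge_profiles(from_web, from_app)
```

Last-write-wins depends on reasonably synchronized clocks and silently drops the older write, so it suits preferences better than billing data.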
Data store for IoT – “Wish list”
• Ingested (Raw) Time Series
• Very high write throughput
• Fast slice (time range) reads
• Aggregated (Derived) Time Series
• Auto-distributed + slice locality
• SQL-like queries
• Aggregations
• Bulk queries (analytics)
• Secondary Indexes (Tags)
• Efficient Storage
• Auto Data Retention (TTL)
• Built-in anti-entropy
• Compression
• Hot Backups
• Profiles and Metadata
• Many concurrent reads with low latency
• Reliable writes (ACID or conflict resolution)
• Unstructured or partially structured
• Secondary Indexes + Text Search
• Scalability and Availability
• Distributed architecture, no SPoF
• Linearly scalable - up and down
• Operational simplicity
• Master-less architecture
• Automatic rebalancing
• Metrics, logs, events
• Rolling upgrades
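Two wish-list items, “fast slice (time range) reads” and “slice locality”, can be sketched with a partition key built from the device and a time quantum. This is an in-memory illustration only: `SliceStore`, `partition_key` and the 15-minute quantum are hypothetical names and values, not the API of any product named in this deck.

```python
import bisect
from collections import defaultdict

QUANTUM_SECONDS = 15 * 60  # hypothetical partition width


def partition_key(device_id, ts):
    """All samples of one device within one time quantum share a partition."""
    return (device_id, ts // QUANTUM_SECONDS)


class SliceStore:
    def __init__(self):
        # partition key -> list of (ts, value), kept sorted by timestamp
        self.partitions = defaultdict(list)

    def write(self, device_id, ts, value):
        bisect.insort(self.partitions[partition_key(device_id, ts)], (ts, value))

    def read_slice(self, device_id, start, end):
        """Return samples with start <= ts < end, touching only the
        partitions that overlap the requested range."""
        result = []
        if end <= start:
            return result
        first = start // QUANTUM_SECONDS
        last = (end - 1) // QUANTUM_SECONDS
        for quantum in range(first, last + 1):
            rows = self.partitions.get((device_id, quantum), [])
            lo = bisect.bisect_left(rows, (start,))
            hi = bisect.bisect_left(rows, (end,))
            result.extend(rows[lo:hi])
        return result


store = SliceStore()
for ts, value in [(0, 1.0), (899, 2.0), (900, 3.0), (1800, 4.0)]:
    store.write("d1", ts, value)
store.write("d2", 900, 9.0)
recent = store.read_slice("d1", 899, 1800)
```

Because a slice read only visits the partitions for the requested quanta, its cost depends on the size of the time range rather than on the total amount of stored data.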
What DB type is a good fit for TS use cases?
Database Type For IoT or Time Series
Relational | Key Value | Document | Wide Column | Graph
MySQL | Riak KV | MongoDB | Cassandra | Neo4j
PostgreSQL | DynamoDB | Couchbase | HBase | Titan
Oracle | Voldemort | RethinkDB | Accumulo | InfiniteGraph
There is a need for a new type of NoSQL database – Time Series
None of the existing DB types was designed to handle time series data
• Wide column DBs have high write throughput, but reads and updates are not their strength
• Key Value and Document DBs handle metadata well, but struggle with heavy writes and time-slicing reads
• Relational DBs are good with metadata (unless the number of updates is high), but a bad choice for TS data
• Graph DBs are not a good choice for either time series or metadata; they can be added later on
Database Type For IoT or Time Series
Relational | Key Value | Document | Wide Column | Graph | Time Series
MySQL | Riak KV | MongoDB | Cassandra | Neo4j | InfluxDB, Riak TS, Blueflood
PostgreSQL | DynamoDB | Couchbase | HBase | Titan | KairosDB, Prometheus, Druid
Oracle | Voldemort | RethinkDB | Accumulo | InfiniteGraph | OpenTSDB, Dalmatiner, Graphite
IoT Sensors Data – Hot to Cold
SENSORS DATA – HOT N’ COLD
Temp | Purpose | Description | Immutable?
Boiling Hot | App usage | Last known value(s) and/or values for the last N minutes; useful for immediate responses, very frequently accessed | No
Hot | Operational dataset | Last 24 hours to several days or weeks (rarely months); frequently accessed, dashboards and online analytics | Almost*
Warm | Historical data | Older data, less frequently accessed, used mostly for offline analytics and historical analysis | Yes
Cold | Archives | Used only in rare situations, kept in long-term storage for regulatory or unpredicted purposes | Yes
STORAGE TIERS – FROM HOT TO COLD
RAM → Database (TSDB) → Object Storage → Archive
Data Lake
Temp | Purpose | Storage Products | Immutable?
Boiling Hot | App usage | Internal app cache, Redis or Memcached | No
Hot | Operational dataset | NoSQL Database (preferably Time Series DB): Riak TS, OpenTSDB, KairosDB, Cassandra, HBase | Almost*
Warm | Historical data | Object storage: HDFS (Hadoop), Ceph, Minio, Riak S2 or AWS S3 | Yes
Cold | Archives | Various | Yes
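Routing data to a tier by its age can be sketched as a simple lookup. The slides give only rough ranges, so the thresholds below (15 minutes, 30 days, one year) are assumptions chosen for illustration, as is the function name `tier_for`.

```python
from datetime import timedelta

# Hypothetical age limits; each tier holds data younger than its limit.
TIERS = [
    (timedelta(minutes=15), "Boiling Hot"),  # in-memory cache
    (timedelta(days=30), "Hot"),             # time series database
    (timedelta(days=365), "Warm"),           # object storage
]


def tier_for(age: timedelta) -> str:
    """Return the storage tier for a sample of the given age."""
    for limit, tier in TIERS:
        if age < limit:
            return tier
    return "Cold"  # archive


print(tier_for(timedelta(minutes=5)))  # Boiling Hot
print(tier_for(timedelta(days=2)))     # Hot
print(tier_for(timedelta(days=90)))    # Warm
print(tier_for(timedelta(days=800)))   # Cold
```

A periodic job could use such a function to decide which samples to move from the database to object storage, and later to the archive.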
STORAGE TIERS – REALITY CHECK
RAM → Database (TSDB) → Object Storage → Archive
ElastiCache (Redis) → Database (Postgres, DynamoDB) → AWS S3 → Glacier
Data Lake
Temp | AWS Service | Storage price, GB per month
Boiling Hot | ElastiCache (Redis) | $15-45
Hot | DynamoDB, RDS (Postgres) | $0.25-0.35 (SSD), from $0.1 (Magnetic)
Warm | Simple Storage Service (S3) | $0.024 to $0.030
Cold | Glacier | $0.007
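The price gap between tiers is easier to see as a monthly bill. The sketch below uses the low end of each price range quoted in the table above; these are the figures from the slide, not current AWS pricing, and the 1,000 GB dataset size is an assumption.

```python
# Low-end storage prices in USD per GB per month, as quoted on the slide.
PRICE_PER_GB_MONTH = {
    "Boiling Hot": 15.0,   # ElastiCache (Redis)
    "Hot": 0.25,           # DynamoDB / RDS (Postgres) on SSD
    "Warm": 0.024,         # S3
    "Cold": 0.007,         # Glacier
}


def monthly_cost(tier: str, gigabytes: float) -> float:
    """Storage cost in USD per month for the given tier and dataset size."""
    return PRICE_PER_GB_MONTH[tier] * gigabytes


def cost_ratio(tier_a: str, tier_b: str) -> float:
    """How many times more expensive tier_a is than tier_b per GB."""
    return PRICE_PER_GB_MONTH[tier_a] / PRICE_PER_GB_MONTH[tier_b]


DATASET_GB = 1_000  # assumed dataset size for illustration
for tier in PRICE_PER_GB_MONTH:
    print(f"{tier}: ${monthly_cost(tier, DATASET_GB):,.2f} per month")
```

At these prices, keeping data in the in-memory tier costs more than 2,000 times as much per GB as archiving it, which is why only the most recent values belong there.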
OSS technologies for scalable IoT apps
Component | Open Source Technologies
Load balancer | Nginx, HAProxy
Ingestion | Kafka, RabbitMQ, ZeroMQ, Flume
Stream Computing | Spark Streaming, Apache Flink, Kafka Streams, Samza
Time Series Store | InfluxDB, KairosDB, Riak, Cassandra, OpenTSDB
Profiles Store | Couchbase, Riak, MySQL, Postgres, MongoDB
Search | Solr, Elasticsearch
Object Storage | HDFS (Hadoop), Minio, Riak S2, Ceph
Analytics Framework | Apache Spark, MapReduce, Hive
SQL Query Engine | Spark SQL, Presto, Impala, Drill
Cluster Manager | Mesosphere DC/OS or Mesos, Kubernetes, Docker Swarm
Checklist for IoT technology stack
❑Is it vendor lock-in or open source software? Are there open APIs?
❑Can it be deployed in the cloud? At the edge? In a data center? Using a hybrid approach?
❑Can it be used for free or at low cost (no big upfront investment)?
❑Can you develop your app on your laptop? How many “moving parts”?
❑Are the components pre-integrated, or can they be easily integrated together?
❑Can you easily scale each component in this architecture by 10x? 20x? 50x?
❑Is there a roadmap, actively worked on, which is aligned with your vision?
❑Is there a company behind the technology to provide 24x7 support when needed?
Come to Basho booth to learn about
• Riak TS (Time Series) - highly scalable NoSQL database for IoT and Time Series
… and more
• Riak Spark Connector for Apache Spark
• Riak Integrations with Redis and Kafka
• Riak Mesos Framework (RMF) for DC/OS
QUESTIONS?
Partner webinar presentation aws pebble_treasure_dataPartner webinar presentation aws pebble_treasure_data
Partner webinar presentation aws pebble_treasure_data
Treasure Data, Inc.
 
Big Data Analytics PPT - S1 working .pptx
Big Data Analytics PPT - S1 working .pptxBig Data Analytics PPT - S1 working .pptx
Big Data Analytics PPT - S1 working .pptx
VivekChaurasia43
 
Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...
Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...
Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...
Spark Summit
 
Unushs susus susujss. Ssuusussjjsjsit 4.pptx
Unushs susus susujss. Ssuusussjjsjsit 4.pptxUnushs susus susujss. Ssuusussjjsjsit 4.pptx
Unushs susus susujss. Ssuusussjjsjsit 4.pptx
AshishHiwale1
 
Lecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in detailsLecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in details
AbhishekKumarAgrahar2
 
Lecture 1-big data engineering (Introduction).pdf
Lecture 1-big data engineering (Introduction).pdfLecture 1-big data engineering (Introduction).pdf
Lecture 1-big data engineering (Introduction).pdf
ahmedibrahimghnnam01
 
IARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxIARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptx
AIMLSEMINARS
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2
RojaT4
 
Barga ACM DEBS 2013 Keynote
Barga ACM DEBS 2013 KeynoteBarga ACM DEBS 2013 Keynote
Barga ACM DEBS 2013 Keynote
Roger Barga
 
Dell Digital Transformation Through AI and Data Analytics Webinar
Dell Digital Transformation Through AI and  Data Analytics WebinarDell Digital Transformation Through AI and  Data Analytics Webinar
Dell Digital Transformation Through AI and Data Analytics Webinar
Bill Wong
 
SplunkLive! London - Splunk App for Stream & MINT Breakout
SplunkLive! London - Splunk App for Stream & MINT BreakoutSplunkLive! London - Splunk App for Stream & MINT Breakout
SplunkLive! London - Splunk App for Stream & MINT Breakout
Splunk
 
Big Data Session 1.pptx
Big Data Session 1.pptxBig Data Session 1.pptx
Big Data Session 1.pptx
ElsonPaul2
 
How to Use Big Data to Transform IT Operations
How to Use Big Data to Transform IT OperationsHow to Use Big Data to Transform IT Operations
How to Use Big Data to Transform IT Operations
ExtraHop Networks
 
CrateDB Machine Data Platform Webinar
CrateDB Machine Data Platform Webinar CrateDB Machine Data Platform Webinar
CrateDB Machine Data Platform Webinar
Caroline Stewart
 
bigdataintro.pptx
bigdataintro.pptxbigdataintro.pptx
bigdataintro.pptx
Albert Alex
 
Harnessing Big Data_UCLA
Harnessing Big Data_UCLAHarnessing Big Data_UCLA
Harnessing Big Data_UCLA
Paul Barsch
 
Partner webinar presentation aws pebble_treasure_data
Partner webinar presentation aws pebble_treasure_dataPartner webinar presentation aws pebble_treasure_data
Partner webinar presentation aws pebble_treasure_data
Treasure Data, Inc.
 
Big Data Analytics PPT - S1 working .pptx
Big Data Analytics PPT - S1 working .pptxBig Data Analytics PPT - S1 working .pptx
Big Data Analytics PPT - S1 working .pptx
VivekChaurasia43
 
Ad

Recently uploaded (20)

A Comprehensive Guide to CRM Software Benefits for Every Business Stage
A Comprehensive Guide to CRM Software Benefits for Every Business StageA Comprehensive Guide to CRM Software Benefits for Every Business Stage
A Comprehensive Guide to CRM Software Benefits for Every Business Stage
SynapseIndia
 
Programs as Values - Write code and don't get lost
Programs as Values - Write code and don't get lostPrograms as Values - Write code and don't get lost
Programs as Values - Write code and don't get lost
Pierangelo Cecchetto
 
Time Estimation: Expert Tips & Proven Project Techniques
Time Estimation: Expert Tips & Proven Project TechniquesTime Estimation: Expert Tips & Proven Project Techniques
Time Estimation: Expert Tips & Proven Project Techniques
Livetecs LLC
 
Exchange Migration Tool- Shoviv Software
Exchange Migration Tool- Shoviv SoftwareExchange Migration Tool- Shoviv Software
Exchange Migration Tool- Shoviv Software
Shoviv Software
 
Why Tapitag Ranks Among the Best Digital Business Card Providers
Why Tapitag Ranks Among the Best Digital Business Card ProvidersWhy Tapitag Ranks Among the Best Digital Business Card Providers
Why Tapitag Ranks Among the Best Digital Business Card Providers
Tapitag
 
Top Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdf
Top Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdfTop Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdf
Top Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdf
evrigsolution
 
Reinventing Microservices Efficiency and Innovation with Single-Runtime
Reinventing Microservices Efficiency and Innovation with Single-RuntimeReinventing Microservices Efficiency and Innovation with Single-Runtime
Reinventing Microservices Efficiency and Innovation with Single-Runtime
Natan Silnitsky
 
[gbgcpp] Let's get comfortable with concepts
[gbgcpp] Let's get comfortable with concepts[gbgcpp] Let's get comfortable with concepts
[gbgcpp] Let's get comfortable with concepts
Dimitrios Platis
 
Memory Management and Leaks in Postgres from pgext.day 2025
Memory Management and Leaks in Postgres from pgext.day 2025Memory Management and Leaks in Postgres from pgext.day 2025
Memory Management and Leaks in Postgres from pgext.day 2025
Phil Eaton
 
Passive House Canada Conference 2025 Presentation [Final]_v4.ppt
Passive House Canada Conference 2025 Presentation [Final]_v4.pptPassive House Canada Conference 2025 Presentation [Final]_v4.ppt
Passive House Canada Conference 2025 Presentation [Final]_v4.ppt
IES VE
 
Download 4k Video Downloader Crack Pre-Activated
Download 4k Video Downloader Crack Pre-ActivatedDownload 4k Video Downloader Crack Pre-Activated
Download 4k Video Downloader Crack Pre-Activated
Web Designer
 
Unit Two - Java Architecture and OOPS
Unit Two  -   Java Architecture and OOPSUnit Two  -   Java Architecture and OOPS
Unit Two - Java Architecture and OOPS
Nabin Dhakal
 
The Elixir Developer - All Things Open
The Elixir Developer - All Things OpenThe Elixir Developer - All Things Open
The Elixir Developer - All Things Open
Carlo Gilmar Padilla Santana
 
Solar-wind hybrid engery a system sustainable power
Solar-wind  hybrid engery a system sustainable powerSolar-wind  hybrid engery a system sustainable power
Solar-wind hybrid engery a system sustainable power
bhoomigowda12345
 
Adobe Media Encoder Crack FREE Download 2025
Adobe Media Encoder  Crack FREE Download 2025Adobe Media Encoder  Crack FREE Download 2025
Adobe Media Encoder Crack FREE Download 2025
zafranwaqar90
 
Download MathType Crack Version 2025???
Download MathType Crack  Version 2025???Download MathType Crack  Version 2025???
Download MathType Crack Version 2025???
Google
 
How to Install and Activate ListGrabber Plugin
How to Install and Activate ListGrabber PluginHow to Install and Activate ListGrabber Plugin
How to Install and Activate ListGrabber Plugin
eGrabber
 
Digital Twins Software Service in Belfast
Digital Twins Software Service in BelfastDigital Twins Software Service in Belfast
Digital Twins Software Service in Belfast
julia smits
 
Deploying & Testing Agentforce - End-to-end with Copado - Ewenb Clark
Deploying & Testing Agentforce - End-to-end with Copado - Ewenb ClarkDeploying & Testing Agentforce - End-to-end with Copado - Ewenb Clark
Deploying & Testing Agentforce - End-to-end with Copado - Ewenb Clark
Peter Caitens
 
Top 12 Most Useful AngularJS Development Tools to Use in 2025
Top 12 Most Useful AngularJS Development Tools to Use in 2025Top 12 Most Useful AngularJS Development Tools to Use in 2025
Top 12 Most Useful AngularJS Development Tools to Use in 2025
GrapesTech Solutions
 
A Comprehensive Guide to CRM Software Benefits for Every Business Stage
A Comprehensive Guide to CRM Software Benefits for Every Business StageA Comprehensive Guide to CRM Software Benefits for Every Business Stage
A Comprehensive Guide to CRM Software Benefits for Every Business Stage
SynapseIndia
 
Programs as Values - Write code and don't get lost
Programs as Values - Write code and don't get lostPrograms as Values - Write code and don't get lost
Programs as Values - Write code and don't get lost
Pierangelo Cecchetto
 
Time Estimation: Expert Tips & Proven Project Techniques
Time Estimation: Expert Tips & Proven Project TechniquesTime Estimation: Expert Tips & Proven Project Techniques
Time Estimation: Expert Tips & Proven Project Techniques
Livetecs LLC
 
Exchange Migration Tool- Shoviv Software
Exchange Migration Tool- Shoviv SoftwareExchange Migration Tool- Shoviv Software
Exchange Migration Tool- Shoviv Software
Shoviv Software
 
Why Tapitag Ranks Among the Best Digital Business Card Providers
Why Tapitag Ranks Among the Best Digital Business Card ProvidersWhy Tapitag Ranks Among the Best Digital Business Card Providers
Why Tapitag Ranks Among the Best Digital Business Card Providers
Tapitag
 
Top Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdf
Top Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdfTop Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdf
Top Magento Hyvä Theme Features That Make It Ideal for E-commerce.pdf
evrigsolution
 
Reinventing Microservices Efficiency and Innovation with Single-Runtime
Reinventing Microservices Efficiency and Innovation with Single-RuntimeReinventing Microservices Efficiency and Innovation with Single-Runtime
Reinventing Microservices Efficiency and Innovation with Single-Runtime
Natan Silnitsky
 
[gbgcpp] Let's get comfortable with concepts
[gbgcpp] Let's get comfortable with concepts[gbgcpp] Let's get comfortable with concepts
[gbgcpp] Let's get comfortable with concepts
Dimitrios Platis
 
Memory Management and Leaks in Postgres from pgext.day 2025
Memory Management and Leaks in Postgres from pgext.day 2025Memory Management and Leaks in Postgres from pgext.day 2025
Memory Management and Leaks in Postgres from pgext.day 2025
Phil Eaton
 
Passive House Canada Conference 2025 Presentation [Final]_v4.ppt
Passive House Canada Conference 2025 Presentation [Final]_v4.pptPassive House Canada Conference 2025 Presentation [Final]_v4.ppt
Passive House Canada Conference 2025 Presentation [Final]_v4.ppt
IES VE
 
Download 4k Video Downloader Crack Pre-Activated
Download 4k Video Downloader Crack Pre-ActivatedDownload 4k Video Downloader Crack Pre-Activated
Download 4k Video Downloader Crack Pre-Activated
Web Designer
 
Unit Two - Java Architecture and OOPS
Unit Two  -   Java Architecture and OOPSUnit Two  -   Java Architecture and OOPS
Unit Two - Java Architecture and OOPS
Nabin Dhakal
 
Solar-wind hybrid engery a system sustainable power
Solar-wind  hybrid engery a system sustainable powerSolar-wind  hybrid engery a system sustainable power
Solar-wind hybrid engery a system sustainable power
bhoomigowda12345
 
Adobe Media Encoder Crack FREE Download 2025
Adobe Media Encoder  Crack FREE Download 2025Adobe Media Encoder  Crack FREE Download 2025
Adobe Media Encoder Crack FREE Download 2025
zafranwaqar90
 
Download MathType Crack Version 2025???
Download MathType Crack  Version 2025???Download MathType Crack  Version 2025???
Download MathType Crack Version 2025???
Google
 
How to Install and Activate ListGrabber Plugin
How to Install and Activate ListGrabber PluginHow to Install and Activate ListGrabber Plugin
How to Install and Activate ListGrabber Plugin
eGrabber
 
Digital Twins Software Service in Belfast
Digital Twins Software Service in BelfastDigital Twins Software Service in Belfast
Digital Twins Software Service in Belfast
julia smits
 
Deploying & Testing Agentforce - End-to-end with Copado - Ewenb Clark
Deploying & Testing Agentforce - End-to-end with Copado - Ewenb ClarkDeploying & Testing Agentforce - End-to-end with Copado - Ewenb Clark
Deploying & Testing Agentforce - End-to-end with Copado - Ewenb Clark
Peter Caitens
 
Top 12 Most Useful AngularJS Development Tools to Use in 2025
Top 12 Most Useful AngularJS Development Tools to Use in 2025Top 12 Most Useful AngularJS Development Tools to Use in 2025
Top 12 Most Useful AngularJS Development Tools to Use in 2025
GrapesTech Solutions
 

Building Scalable IoT Apps (QCon S-F)

  • 1. Building scalable IoT apps using OSS technologies. Pavel Hardak, Basho Technologies. Disclaimer: some of the opinions expressed here are mine and might not fully agree with those of
  • 2. IOT & INDUSTRY VERTICALS
  • 3. IoT market - growth prediction
      Number of connected “things”:
      • 2016 – about 6.4 B
      • 30% YoY growth, 5.5M activations per day
      • 2020 – about 21 B
      “By 2020 more than half of new major business processes and systems will incorporate some element of Internet of Things”
  • 4. Reality Check - let us get a second opinion
  • 6. IoT Project Plan
      • Investigate those “things” and figure out
          • What protocols they support (CoAP, MQTT, HTTP, …)
          • What data they generate (temperature, humidity, location, speed, ...)
      • Collect this data in our data center
          • Implement protocols and parsing routines
          • Store into persistent storage (“Data Lake” architecture)
      • Once stored in Data Lake
          • Analyze, summarize, “slice and dice”
          • Predict, discover insights
      • Declare a victory – make profit & go for IPO
  • 7. REFERENCE ARCHITECTURE (?)
      [Diagram: IoT Devices → (MQTT, CoAP and HTTP) → Data Lake → (SQL) → Apps & Analytics]
      Not so fast, my friend.
  • 8. What is wrong with a “Data Lake” for IoT?
  • 13. Auto Insurance - Micro Case Study
      • One of the top 5 auto insurance companies in the USA, appears in the Fortune 500 list
      • More than $10B in annual revenue, above $15B in assets
      • About 20,000 employees and 50,000 insurance agents
      • More than 19 million individual policies across all 50 states
  • 14. How does this “rating info” influence your payment?
      • Garaging Zip – in what neighborhood is the car parked when it is not used? There is a high correlation between Zip code and the probability of the car being stolen or vandalized.
      • Current and Previous Annual Mileage – if the insured drives longer distances, the probability of road accidents or car malfunctions is higher.
      • Vehicle Usage – do you use your car for work or pleasure? Are you a commuter, student, stay-at-home parent or Uber driver? Depending on your usage, the company will calculate the risk and adjust the rate.
      • Years of Driving Experience – young drivers are put into higher risk categories, whereas older people are considered safer drivers due to more time behind the wheel. Note: this compares the average young driver with the average experienced driver.
  • 16. Sampling Frequency and Dataset Size
      • Mileage
          • From one sample per year to 52 (weekly) or 365 (daily)
          • Better: sample hourly to “see” the car usage (commuter, …)
      • Location (used to be “Garaging Zip”)
          • From one sample per year to 365 (daily)
          • Better: hourly, which shows when the car is parked for several hours
      • New factors for the rating algorithm, based on weekly summaries
          • Hard brakes, hard accelerations, going above the speed limit, …
      • Amount of time series data to be stored and analyzed
          • Grows by a factor of 365x, then by another 24x = 8760x
      Each week – at least 50x more data than the whole previous year.
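
The growth factors on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration for a single metric of a single car; the dictionary and function names are made up for this example.

```python
# Samples per car per year for ONE metric at each sampling frequency.
SAMPLES_PER_YEAR = {
    "yearly": 1,
    "weekly": 52,
    "daily": 365,
    "hourly": 365 * 24,
}


def growth_factor(old: str, new: str) -> float:
    """How many times more samples the new frequency produces than the old one."""
    return SAMPLES_PER_YEAR[new] / SAMPLES_PER_YEAR[old]


# Hourly samples collected in one week, to compare with one sample per year.
hourly_samples_per_week = 7 * 24

print(growth_factor("yearly", "daily"))    # 365.0
print(growth_factor("daily", "hourly"))    # 24.0
print(growth_factor("yearly", "hourly"))   # 8760.0
print(hourly_samples_per_week)             # 168
```

With hourly sampling, one week already holds 168 samples per metric where the yearly scheme held one, which is consistent with the slide's "at least 50x" lower bound.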
  • 18. What is special about IoT? It is about the “things”… and more.
  • 21. IoT Data Categories
      • Metadata & Profiles
          • Devices – Device info (model, SN, firmware, sensors, ...), configuration, owner, …
          • Users – Personal info, preferences, billing info, registered devices, …
      • Time Series
          • Ingested (“Raw”) – Measurements, statuses and events from devices.
          • Aggregated (“Derived”) – Calculated data, from devices & profiles
              • Rollups – aggregate metrics from finer time resolution to coarser ones (minute → hour → day) using min, max, avg, ...
              • Aggregations – aggregate measurements, configuration and profiles (model, region, …) over time ranges
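
A rollup as described above can be sketched in a few lines. This is a minimal in-memory illustration, not how any particular database implements it; the `rollup` function, the `(epoch_seconds, value)` tuple layout and the sample readings are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def rollup(samples, bucket_seconds):
    """Group (epoch_seconds, value) samples into fixed-size time buckets
    and summarize each bucket with min, max, avg and count."""
    buckets = defaultdict(list)
    for ts, value in samples:
        bucket_start = ts - ts % bucket_seconds  # align to the bucket boundary
        buckets[bucket_start].append(value)
    return {
        start: {
            "min": min(values),
            "max": max(values),
            "avg": mean(values),
            "count": len(values),
        }
        for start, values in sorted(buckets.items())
    }


# Hypothetical per-minute temperature readings spanning two hours.
raw = [(0, 20.0), (60, 21.0), (120, 22.0), (3600, 30.0), (3660, 32.0)]
hourly = rollup(raw, bucket_seconds=3600)

for start, summary in hourly.items():
    print(start, summary)
```

Running the same function over the hourly output with a daily bucket would need weighted averages (using `count`), which is why the count is kept alongside the other statistics.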
  • 22. IOT - NETWORKING TECHNOLOGIES
  • 23. NETWORK WISH LIST
      • Extreme Reliability
      • Guaranteed Delivery
      • End-to-End Low Latency
      • Quality of Service
      • Engineered Topology
      • Committed Bandwidth (CIR)
      • Fiber-optic network
      • Dedicated Channel
      • Strong Signal
      • Interference and Crosstalk Resistant
      • High SNR (Signal to Noise Ratio)
      • Very Low BER (Bit Error Rate)
  • 24. REALITY CHECK - LET US LOOK AGAIN
  • 25. IOT & NETWORK - REALITY
      • Wireless technologies
      • Shared transmission media
      • Limited bandwidth
      • Mesh or Ad-hoc Topology
      • Possible signal interference
      • Mis-ordered or lost packets
      • Low cost hardware components
      • Low power radio transmitters
      • Very small antennas
      • “Custom-made” firmware
      • Constrained Application Protocol (CoAP)
      • “Best Effort” QoS (“shoot and forget”)
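
With best-effort delivery, the receiving side has to cope with duplicates, gaps and late arrivals itself. One common approach, sketched below, is to have each device attach a per-device sequence number to every message. That sequence number, the `SequenceTracker` class and its labels are assumptions made for this illustration and are not part of the talk or of any protocol library.

```python
class SequenceTracker:
    """Classifies incoming messages by a per-device sequence number.

    Keeps every seen number in memory for simplicity; a real collector
    would bound this state (for example with a sliding window).
    """

    def __init__(self):
        self._seen = {}     # device_id -> set of sequence numbers already seen
        self._highest = {}  # device_id -> highest sequence number seen so far

    def classify(self, device_id, seq):
        seen = self._seen.setdefault(device_id, set())
        if seq in seen:
            return "duplicate"          # retransmission or replay
        seen.add(seq)
        highest = self._highest.get(device_id)
        if highest is None:
            self._highest[device_id] = seq
            return "in_order"           # first message from this device
        if seq > highest:
            self._highest[device_id] = seq
            # A jump of more than one means packets are missing (so far).
            return "in_order" if seq == highest + 1 else "gap"
        return "late"                   # fills an earlier gap


tracker = SequenceTracker()
labels = [tracker.classify("sensor-1", seq) for seq in (1, 2, 4, 3, 3)]
print(labels)  # ['in_order', 'in_order', 'gap', 'late', 'duplicate']
```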
  • 26. IoT is “Big Data” - by definition. Actually, lots and lots of Big Data.
  • 32. Five “V”s of IoT data
      • Velocity – Torrent of small writes (sensors). Reads: millions of low-latency queries for user and device profiles, plus range queries for TS data (slices). Stream of updates (profiles); beware of conflicts.
      • Variety – Sensor data (time series), user and device profiles, and “derived” time series data (e.g. rollups, aggregations).
      • Volume – Starts small, grows quickly, keeps coming 24x7x365 (nights, weekends and holidays). Spikes on new model launches or successful marketing campaigns. Growth can slow down, but the volume will keep growing. An efficient data retention policy is critical to prevent overflows.
      • Veracity – Generally trustworthy, but beware of “low cost” sensors with low accuracy. Sent over not-so-reliable transport: expect that some data will be corrupted, arrive late or be lost. (Hopefully the devices were not hijacked or impersonated by hackers.)
      • Value – Profiles and summaries are much more valuable than raw data samples. The value of “raw” time series goes down quickly once the data has been processed and the clock advances. Aggregated (“derived”) data are more valuable than raw data. Exceptions: financial transactions, life support, nuclear plants, oil rigs, …
      • Complexity – Poly-structured, using simple schemas and simple relations (usually implicit). Some data is treated as unstructured (“opaque”) for speed or flexibility.
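
The Volume and Value points together suggest a retention policy that expires raw samples sooner than derived data. The sketch below shows the idea in memory. The retention windows (30 days for raw data, one year for hourly rollups), the record layout and the function names are hypothetical values chosen for the example, not recommendations from the talk.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention windows per data category.
RETENTION = {
    "raw": timedelta(days=30),
    "hourly_rollup": timedelta(days=365),
}


def is_expired(kind, recorded_at, now):
    """True when a record is older than the retention window for its kind."""
    return now - recorded_at > RETENTION[kind]


def purge(records, now):
    """Return only the records that are still inside their retention window."""
    return [r for r in records if not is_expired(r["kind"], r["recorded_at"], now)]


now = datetime(2017, 1, 1, tzinfo=timezone.utc)
records = [
    {"kind": "raw", "recorded_at": now - timedelta(days=40), "value": 1.0},
    {"kind": "raw", "recorded_at": now - timedelta(days=1), "value": 2.0},
    {"kind": "hourly_rollup", "recorded_at": now - timedelta(days=40), "value": 1.5},
]
kept = purge(records, now)
print([r["value"] for r in kept])  # [2.0, 1.5]
```

A database with built-in TTL support would apply this kind of rule on the server side rather than in application code.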
  • 34. Architectural Blueprints
      • Lambda Architecture by Nathan Marz (ex-Twitter)
      • Kappa Architecture by Jay Kreps (Confluent)
      • Zeta Architecture by Jim Scott (MapR)
      • … and their variants
      [Diagrams: Lambda, Kappa, Zeta]
  • 35. Data Processing Framework for IoT
      • Uses “Best of breed” OSS technologies
      • Combines two paradigms
          • “Speed Layer” – pipeline for Stream Processing for “Data in Motion”
          • “Serving Layer” – analytics for “Data in Motion” and “Data at Rest”
      • Every component is “Distributed by Design”
          • Collection Layer
          • Message Queue
          • Stream Processing
          • Data Storage (Database, Object System, Data Warehouse)
          • Query and Analytics Engines
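
To make the layering concrete, here is a toy single-process sketch of the component chain listed above. Every part is an in-memory stand-in chosen for illustration (a `deque` for the message queue, a dict for stream-processing state, a list for the data store); a real deployment would use distributed components, and the class and method names are hypothetical.

```python
from collections import deque


class ToyPipeline:
    """Single-process stand-in for collection → queue → stream processing → storage → query."""

    def __init__(self):
        self.queue = deque()  # stands in for the message queue
        self.latest = {}      # speed layer state: last value per device
        self.store = []       # stands in for the data store ("data at rest")

    def collect(self, device_id, ts, value):
        """Collection layer: accept a parsed measurement and enqueue it."""
        self.queue.append((device_id, ts, value))

    def process(self):
        """Stream processing: drain the queue, update live state, persist."""
        while self.queue:
            device_id, ts, value = self.queue.popleft()
            self.latest[device_id] = value
            self.store.append((device_id, ts, value))

    def average(self, device_id):
        """Serving layer: a simple analytic query over stored data."""
        values = [v for d, _, v in self.store if d == device_id]
        return sum(values) / len(values) if values else None


pipeline = ToyPipeline()
pipeline.collect("car-1", 0, 50.0)
pipeline.collect("car-1", 60, 70.0)
pipeline.process()
print(pipeline.latest["car-1"], pipeline.average("car-1"))  # 70.0 60.0
```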
  • 38. Data Access Patterns (category, description, read:write %)
      • Metadata & Profiles (Devices & Users), R:W 90:10 – Many low-latency small reads, all over the dataset. Occasional updates, possibly by different “actors” (web, device, app); conflicts need to be prevented or resolved. Fewer creates and deletes.
      • Time Series, Ingested (“Raw”), R:W 10:90 – Very high throughput of relatively small writes. Most reads are over a recent time range “slice”. Updates are rare (corrections). This category is the biggest part of the IoT application dataset.
      • Time Series, Aggregated (“Derived”), R:W 80:20 – Mostly reads: users, platform services, reports. Writes are periodic, on each time interval or from batch jobs.
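
One way to prevent the profile update conflicts mentioned above is optimistic concurrency: every write must state which version it was based on, and stale writes are rejected. The sketch below is a generic in-memory illustration of that technique, not the mechanism of any specific database named in the talk; `ProfileStore` and `VersionConflict` are hypothetical names.

```python
class VersionConflict(Exception):
    """Raised when a write is based on an outdated version of the profile."""


class ProfileStore:
    def __init__(self):
        self._data = {}  # key -> (version, profile dict)

    def get(self, key):
        """Return (version, profile); version 0 means the key does not exist yet."""
        return self._data.get(key, (0, {}))

    def put(self, key, expected_version, profile):
        """Write only if the caller saw the current version; return the new version."""
        current_version, _ = self.get(key)
        if current_version != expected_version:
            raise VersionConflict(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._data[key] = (current_version + 1, dict(profile))
        return current_version + 1


store = ProfileStore()
# Two "actors" (say, the web UI and the device) read the same version.
web_version, _ = store.get("device-42")
device_version, _ = store.get("device-42")

store.put("device-42", web_version, {"owner": "alice"})  # succeeds, now v1
try:
    store.put("device-42", device_version, {"firmware": "1.2"})  # stale write
except VersionConflict as exc:
    print("conflict detected:", exc)
```

The rejected writer would normally re-read the profile, merge its change and retry. Stores that accept concurrent writes instead need a resolution step after the fact.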
  • 39. Data store for IoT – “Wish list”
      • Ingested (Raw) Time Series
          • Very high write throughput
          • Fast slice (time range) reads
      • Aggregated (Derived) Time Series
          • Auto-distributed + slice locality
          • SQL-like queries
          • Aggregations
          • Bulk queries (analytics)
          • Secondary Indexes (Tags)
      • Efficient Storage
          • Auto Data Retention (TTL)
          • Built-in anti-entropy
          • Compression
          • Hot Backups
      • Profiles and Metadata
          • Many concurrent reads with low latency
          • Reliable writes (ACID or conflict resolution)
          • Unstructured or partially structured
          • Secondary Indexes + Text Search
      • Scalability and Availability
          • Distributed architecture, no SPoF
          • Linearly scalable - up and down
      • Operational simplicity
          • Master-less architecture
          • Automatic rebalancing
          • Metrics, logs, events
          • Rolling upgrades
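
“Slice locality” can be obtained by bucketing timestamps into fixed-size time quanta and making the bucket part of the partition key, so that one device's samples for a time range live together. The sketch below illustrates that idea in general terms. The 15-minute quantum, the key layout and the function names are assumptions for the example, not the exact scheme of any product on the wish list.

```python
def partition_key(device_id, ts_seconds, quantum_seconds=900):
    """Map a sample to (device, start of its time quantum)."""
    return (device_id, ts_seconds - ts_seconds % quantum_seconds)


def partitions_for_range(device_id, start, end, quantum_seconds=900):
    """List the partitions a slice query over [start, end) has to touch."""
    first = start - start % quantum_seconds
    return [(device_id, q) for q in range(first, end, quantum_seconds)]


# Samples 1 second apart on either side of a quantum boundary.
print(partition_key("car-1", 899))   # ('car-1', 0)
print(partition_key("car-1", 900))   # ('car-1', 900)

# A 30-minute slice touches only a handful of partitions.
print(partitions_for_range("car-1", 100, 2000))
# [('car-1', 0), ('car-1', 900), ('car-1', 1800)]
```

The quantum size is a trade-off: larger quanta mean fewer partitions per slice read, while smaller quanta spread a hot device's writes over more partitions.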
  • 40. What DB type is a good fit for TS use cases?
  • 41. Database Type For IoT or Time Series
      • Relational: MySQL, PostgreSQL, Oracle
      • Key Value: Riak KV, DynamoDB, Voldemort
      • Document: MongoDB, Couchbase, RethinkDB
      • Wide Column: Cassandra, HBase, Accumulo
      • Graph: Neo4J, Titan, Infinite Graph
      There is a need for a new type of NoSQL database – Time Series
      None of the existing DB types was designed to handle time series data:
      • Wide column DBs have high write throughput, but reads and updates are not their strength
      • Key Value and Document DBs handle metadata well, but struggle with heavy writes and time-slicing reads
      • Relational – good with metadata (unless the number of updates is high), but a bad choice for TS data
      • Graph DB – not a good choice for either time series or metadata, can be added later on
  • 42. Database Type For IoT or Time Series
      • Relational: MySQL, PostgreSQL, Oracle
      • Key Value: Riak KV, DynamoDB, Voldemort
      • Document: MongoDB, Couchbase, RethinkDB
      • Wide Column: Cassandra, HBase, Accumulo
      • Graph: Neo4J, Titan, Infinite Graph
      • Time Series: InfluxDB, Riak TS, Blueflood, KairosDB, Prometheus, Druid, OpenTSDB, Dalmatiner, Graphite
  • 43. IoT Sensor Data – Hot to Cold
  • 44. SENSORS DATA – HOT N’ COLD (temp, purpose, description, immutable?)
      • Boiling Hot – App usage. Last known value(s) and/or values for the last N minutes, useful for immediate responses, very frequently accessed. Immutable: No
      • Hot – Operational dataset. Last 24 hours to several days or weeks (rarely months), frequently accessed, dashboards and online analytics. Immutable: Almost*
      • Warm – Historical data. Older data, less frequently accessed, used mostly for offline analytics and historical analysis. Immutable: Yes
      • Cold – Archives. Used only in rare situations, kept in long term storage for regulatory or unpredicted purposes. Immutable: Yes
  • 45. STORAGE TIERS – FROM HOT TO COLD
      RAM → Database (TSDB) → Object Storage → Archive
      [Diagram label: Data Lake]
      • Boiling Hot – App usage. Internal app cache, Redis or Memcached. Immutable: No
      • Hot – Operational dataset. NoSQL Database (preferably Time Series DB): Riak TS, OpenTSDB, KairosDB, Cassandra, HBase. Immutable: Almost*
      • Warm – Historical data. Object storage: HDFS (Hadoop), Ceph, Minio, Riak S2 or AWS S3. Immutable: Yes
      • Cold – Archives. Various. Immutable: Yes
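
Routing data to a tier usually comes down to its age. The slides only give rough ranges (“last N minutes”, “24 hours to several days or weeks”), so the exact thresholds in this sketch (10 minutes, 30 days, 2 years) are hypothetical placeholders, as are the names.

```python
from datetime import timedelta

# (upper age limit, tier name), checked in order. Thresholds are assumptions.
TIERS = [
    (timedelta(minutes=10), "boiling_hot"),  # app cache
    (timedelta(days=30), "hot"),             # database, preferably a TSDB
    (timedelta(days=730), "warm"),           # object storage
]


def tier_for_age(age):
    """Pick the storage tier for a sample of the given age."""
    for limit, name in TIERS:
        if age <= limit:
            return name
    return "cold"  # archive


for age in (timedelta(minutes=5), timedelta(days=3),
            timedelta(days=200), timedelta(days=2000)):
    print(age, "->", tier_for_age(age))
```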
  • 46. STORAGE TIERS – REALITY CHECK
      RAM → Database (TSDB) → Object Storage → Archive
      ElastiCache (Redis) → Database (Postgres, DynamoDB) → AWS S3 → Glacier
      [Diagram label: Data Lake]
      AWS service and storage price (per GB per month), by tier:
      • Boiling Hot – ElastiCache (Redis): $15-45
      • Hot – DynamoDB, RDS (Postgres): $0.25-0.35 (SSD), from $0.1 (Magnetic)
      • Warm – Simple Storage Service (S3): $0.024 to $0.030
      • Cold – Glacier: $0.007
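
The price gap between tiers is easier to appreciate for a concrete dataset size. The sketch below uses the per-GB figures quoted on the slide, which date from the time of the talk and are not current AWS pricing. The 1,000 GB dataset size, the use of only the SSD range for the hot tier and the names are assumptions for the example.

```python
# USD per GB per month as quoted on the slide (low, high). Not current pricing.
PRICE_PER_GB_MONTH = {
    "boiling_hot": (15.0, 45.0),  # ElastiCache (Redis)
    "hot": (0.25, 0.35),          # SSD-backed database storage
    "warm": (0.024, 0.030),       # S3
    "cold": (0.007, 0.007),       # Glacier
}


def monthly_cost_range(tier, gigabytes):
    """Return the (low, high) monthly storage cost in USD, rounded to cents."""
    low, high = PRICE_PER_GB_MONTH[tier]
    return round(low * gigabytes, 2), round(high * gigabytes, 2)


for tier in PRICE_PER_GB_MONTH:
    print(tier, monthly_cost_range(tier, 1000))
```

At these figures, keeping a dataset in the boiling-hot tier costs thousands of times more than archiving it, which is the economic argument for moving data down the tiers as it ages.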
  • 47. OSS technologies for scalable IoT apps (component: open source technologies)
      • Load balancer: Nginx, HAProxy
      • Ingestion: Kafka, RabbitMQ, ZeroMQ, Flume
      • Stream Computing: Spark Streaming, Apache Flink, Kafka Streams, Samza
      • Time Series Store: InfluxDB, KairosDB, Riak, Cassandra, OpenTSDB
      • Profiles Store: Couchbase, Riak, MySQL, Postgres, MongoDB
      • Search: Solr, Elasticsearch
      • Object Storage: HDFS (Hadoop), Minio, Riak S2, Ceph
      • Analytics Framework: Apache Spark, MapReduce, Hive
      • SQL Query Engine: Spark SQL, Presto, Impala, Drill
      • Cluster Manager: Mesosphere DC/OS or Mesos, Kubernetes, Docker Swarm
  • 48. Checklist for IoT technology stack
      ❑ Is it vendor lock-in or open source software? Are there open APIs?
      ❑ Can it be deployed in the cloud? At the edge? In a data center? Using a hybrid approach?
      ❑ Can it be used for free or at low cost (no big upfront investment)?
      ❑ Can you develop your app on your laptop? How many “moving parts”?
      ❑ Are the components pre-integrated, or can they be easily integrated together?
      ❑ Can you easily scale each component in this architecture by 10x? 20x? 50x?
      ❑ Is there a roadmap, actively worked on, which is aligned with your vision?
      ❑ Is there a company behind the technology to provide 24x7 support when needed?
  • 49. Come to the Basho booth to learn about
      • Riak TS (Time Series) - highly scalable NoSQL database for IoT and Time Series … and more
      • Riak Spark Connector for Apache Spark
      • Riak Integrations with Redis and Kafka
      • Riak Mesos Framework (RMF) for DC/OS