Real-time Hadoop: The Ideal Messaging System for Hadoop

© 2016 MapR Technologies
Contact Information
Ted Dunning
Chief Applications Architect at MapR Technologies
Committer & PMC for Apache Drill, ZooKeeper & others
VP of Incubator at Apache Software Foundation
Email tdunning@apache.org tdunning@maprtech.com
Twitter @ted_dunning
Hashtags today: #stratahadoop #ojai

Streaming Architecture
by Ted Dunning and Ellen Friedman © 2016 (published by O’Reilly)
Free copies at book signing today, 3:40PM @ MapR booth
http://bit.ly/mapr-ebook-streams

Goals
• Real-time or near-time
– Includes situations with deadlines
– Also includes situations where delay is simply undesirable
– Even includes situations where delay is just fine
• Micro-services
– Streaming is a convenient idiom for design
– Micro-services … you know we wanted it
– Service isolation is a key requirement

Real-time or Near-time?
• The real point is flow versus state (see talk later today)
• One consequence of flow-based computing is that real-time and near-time become relatively easy
• Life may be a bitch, but it doesn’t happen in batches!

Agenda
• Background / micro-services
• Global requirements
• Scale

A microservice is
loosely coupled
with bounded context

How to Couple Services and Break micro-ness
• Shared schemas, relational stores
• Ad hoc communication between services
• Enterprise service busses
• Brittle protocols
• Poor protocol versioning

How to Decouple Services
• Use self-describing data
• Private databases
• Infrastructural communication between services
• Use modern protocols
• Adopt future-proof protocol practices
• Use shared storage where necessary due to scale
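As a minimal sketch of the first and fifth points above (self-describing data and future-proof protocol practices), the snippet below wraps each message in an envelope that names its own type and version, and reads it with a "tolerant reader" that ignores fields it does not know. The envelope layout and the field names (`card`, `location`, `terminal`, `currency`) are assumptions made for illustration; they do not come from the talk.

```python
import json


def encode_event(event_type, version, payload):
    """Wrap a payload in an envelope that names its own type and version."""
    return json.dumps({"type": event_type, "version": version, "payload": payload})


def decode_card_event(raw):
    """Tolerant reader: pick out the fields we understand, ignore the rest."""
    payload = json.loads(raw)["payload"]
    return {
        "card": payload["card"],
        "location": payload["location"],
        # Hypothetical field added in a later version; older producers omit it.
        "terminal": payload.get("terminal", "unknown"),
    }


# An old producer and a new producer can share one consumer.
old = encode_event("card_use", 1, {"card": "1234", "location": "Tokyo"})
new = encode_event(
    "card_use",
    2,
    {"card": "1234", "location": "Tokyo", "terminal": "POS 2", "currency": "JPY"},
)
print(decode_card_event(old))
print(decode_card_event(new))
```

Because the reader neither requires the new field nor rejects unknown ones, producers and consumers can be upgraded independently.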

What is the Right Structure for Flow Compute?
• Traditional message queues?
– Message queues are the classic answer
– Key feature/bug is out-of-order acknowledgement
– Many implementations
– You pay a huge performance hit for persistence
• Kafka-esque Logs?
– Logs are like queues, but with ordering
– Out-of-order consumption is possible; out-of-order acknowledgement, not so much
– Canonical base implementation is Kafka
– Performance plus persistence
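The log model described above can be sketched in a few lines. This is an in-memory toy, not the Kafka client API: the class `TopicLog` and its methods are hypothetical names chosen for illustration. It shows the two properties the slide relies on: messages stay in order and remain readable after consumption, and each consumer group tracks its position with a single offset rather than per-message acknowledgements.

```python
class TopicLog:
    """Append-only, ordered log; messages persist after being read."""

    def __init__(self):
        self._messages = []
        self._committed = {}  # consumer group -> next offset to read

    def append(self, message):
        """Add a message at the end and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, group, max_messages=10):
        """Return (start_offset, messages) from the group's committed position."""
        start = self._committed.get(group, 0)
        return start, self._messages[start:start + max_messages]

    def commit(self, group, next_offset):
        """Acknowledge everything before next_offset in one step."""
        self._committed[group] = next_offset

    def seek(self, group, offset):
        """Move a group's position, e.g. to replay older messages."""
        self._committed[group] = offset


log = TopicLog()
for msg in ["a", "b", "c"]:
    log.append(msg)

start, batch = log.poll("fraud")
log.commit("fraud", start + len(batch))

_, history_batch = log.poll("history")  # an independent group sees everything
log.seek("fraud", 1)                    # out-of-order consumption by seeking
_, replay = log.poll("fraud")
print(batch, history_batch, replay)
```

A single committed offset per group is what makes acknowledgement inherently in-order, while `seek` still allows reading any part of the log again.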

Scenarios
Profile Database

The task
[Diagram: POS 1 and POS 2 each send (location, t, card #) and need a yes/no answer]

Traditional Solution
[Diagram: POS 1..n, Fraud detector, Last card use]

What Happens Next?
[Diagram: three POS 1..n / Fraud detector pairs, one Last card use]

How to Get Service Isolation
[Diagram: POS 1..n, Fraud detector, Last card use, Updater, card activity]

New Uses of Data
[Diagram: POS 1..n, Fraud detector, Last card use, Updater, Card location history, Other, card activity]

Scaling Through Isolation
[Diagram: two sets of POS 1..n, Last card use, Updater and Fraud detector, sharing card activity]

Lessons
• De-coupling and isolation are key
• Private data stores/tables are important,
– but local storage of private data is a bug
• Propagate events, not table updates

Scenarios
IoT Data Aggregation

Basic Situation
[Diagram: multiple locations; each location has many pumps; pump data]

What Does a Pump Look Like?
• inlet: Temperature, Pressure, Flow
• outlet: Temperature, Pressure, Flow
• motor: Winding temperature, Voltage, Current
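One way to carry the pump measurements listed above as a single record is sketched below. The grouping into inlet, outlet and motor follows the slide; the class names, the `location`, `pump_id` and `t` fields, and the sample values are assumptions for illustration, and no units are implied because the slide does not give any.

```python
from dataclasses import dataclass, asdict


@dataclass
class FluidSensors:
    temperature: float
    pressure: float
    flow: float


@dataclass
class MotorSensors:
    winding_temperature: float
    voltage: float
    current: float


@dataclass
class PumpReading:
    location: str  # hypothetical identifier for the site
    pump_id: str   # hypothetical identifier for the pump
    t: float       # time of the reading
    inlet: FluidSensors
    outlet: FluidSensors
    motor: MotorSensors


reading = PumpReading(
    location="site-a",
    pump_id="pump-7",
    t=0.0,
    inlet=FluidSensors(temperature=20.0, pressure=1.0, flow=5.0),
    outlet=FluidSensors(temperature=21.0, pressure=3.0, flow=5.0),
    motor=MotorSensors(winding_temperature=60.0, voltage=400.0, current=12.0),
)

# asdict() recurses into nested dataclasses, giving a plain nested dict
# that can be serialized as a self-describing message.
record = asdict(reading)
print(record)
```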

Basic Architecture Reflects Business Structure
[Diagram: four pump data streams]

Lessons
• Data architecture should reflect business structure
• Even very modest designs involve multiple data centers
• Schemas cannot be frozen in the real world
• Security must follow data ownership

Scenarios
Global Data Recovery

[Diagram: Tokyo, Corporate HQ]

[Diagram: Singapore, Tokyo, Corporate HQ]

Lessons
• Arbitrary number of topics important for simplicity + performance
• Updates happen in many places
• Mobility implies change in replication patterns
• Multi-master updates simplify design massively

Converged Requirements

What Have We Learned?
• Need persistence and performance
– Possibly for years, and up to 100s of millions t/s
• Must have convergence
– Need files, tables AND streams
– Need volumes, snapshots, mirrors, permissions and …
• Must have platform security
– Cannot depend on perimeter
– Must follow business structure
• Must have global scale and scope
– Millions of topics for natural designs
– Multi-master replication and update

The Importance of Common APIs
• Commonality and interoperability are critical
– Compare the Hadoop ecosystem and the NoSQL world
• Table stakes
– Persistence
– Performance
– Polymorphism
• Major trend so far is to adopt Kafka API
– 0.9 API and beyond remove major abstraction leaks
– Kafka API supported by all major Hadoop vendors

What we do

Evolution of Data Storage
[Chart: Functionality, Compatibility, Scalability; shown: Linux, POSIX]
Over decades of progress, Unix-based systems have set the standard for compatibility and functionality

[Chart: Functionality, Compatibility, Scalability; shown: Linux, POSIX, Hadoop]
Hadoop achieves much higher scalability by trading away essentially all of this compatibility

[Chart: Functionality, Compatibility, Scalability; shown: Linux, POSIX, Hadoop]
MapR enhanced Apache Hadoop by restoring the compatibility while increasing scalability and performance

[Chart: Functionality, Compatibility, Scalability; shown: Linux, POSIX, Hadoop]
Adding tables and streams enhances the functionality of the base file system

http://bit.ly/fastest-big-data

How we do this with MapR
• MapR Streams is a C++ reimplementation of the Kafka API
– Advantages in predictability, performance, scale
– Common security and permissions with the entire MapR converged data platform
• Semantic extensions
– A cluster contains volumes, files, tables … and now streams
– Streams contain topics
– Can use a default stream or name a stream by path name
• Core MapR capabilities preserved
– Consistent snapshots, mirrors, multi-master replication

MapR Original Innovations
• Volumes
– Distributed management
– Data placement
• Read/write random access file system
– Allows distributed meta-data
– Improved scaling
– Enables NFS access
• Application-level NIC bonding
• Transactionally correct snapshots and mirrors

MapR's Containers
• Each container contains
– Directories & files
– Data blocks
• Replicated on servers
• No need to manage directly
Files/directories are sharded into blocks, which are placed into containers on disks
Containers are 16-32 GB segments of disk, placed on nodes

MapR's Containers
• Each container has a replication chain
• Updates are transactional
• Failures are handled by rearranging replication

Container locations and replication
[Diagram: CLDB; containers with replication chains (N1, N2), (N3, N2), (N1, N2), (N1, N3), (N3, N2); nodes N1, N2, N3]
Container location database (CLDB) keeps track of nodes hosting each container and replication chain order
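The bookkeeping described above can be pictured as a map from container to an ordered list of nodes. The sketch below is a toy model, not MapR's implementation: the class `ContainerLocations`, its methods, and the policy of appending the first spare node to the tail of a chain after a failure are all assumptions made for illustration.

```python
class ContainerLocations:
    """Toy location database: container id -> ordered replication chain."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.chains = {}

    def place(self, container_id, chain):
        """Record which nodes host a container, in chain order."""
        self.chains[container_id] = list(chain)

    def locate(self, container_id):
        """Return the replication chain for a container, head first."""
        return list(self.chains[container_id])

    def fail_node(self, dead):
        """Drop a failed node and rearrange every chain that used it."""
        self.nodes.discard(dead)
        for chain in self.chains.values():
            if dead not in chain:
                continue
            chain.remove(dead)
            # Hypothetical policy: restore the replica count by appending
            # the first surviving node that is not already in the chain.
            spare = sorted(self.nodes - set(chain))
            if spare:
                chain.append(spare[0])


cldb = ContainerLocations(["N1", "N2", "N3"])
cldb.place(1, ["N1", "N2"])
cldb.place(2, ["N3", "N2"])
cldb.place(3, ["N1", "N3"])
cldb.fail_node("N2")
print(cldb.locate(1), cldb.locate(2), cldb.locate(3))
```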

MapR Scaling
• Containers represent 16 - 32GB of data
– Each can hold up to 1 billion files and directories
– 100M containers = ~2 exabytes (a very large cluster)
• 250 bytes DRAM to cache a container
– 25GB to cache all containers for 2EB cluster
– But not necessary, can page to disk
– Typical large 10PB cluster needs 2GB
• Container-reports are 100x - 1000x smaller than HDFS block-reports
– Serve 100x more data-nodes
– Increase container size to 64G to serve 4EB cluster
– Map/reduce not affected
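The first two figures on this slide can be checked with a few lines of arithmetic. The sketch assumes decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes); the variable names are illustrative.

```python
GB = 10**9
EB = 10**18

containers = 100_000_000      # 100M containers
bytes_per_cache_entry = 250   # DRAM needed to cache one container's location

# Capacity range for 16 GB and 32 GB containers; the slide rounds this to ~2 EB.
low_eb = containers * 16 * GB / EB
high_eb = containers * 32 * GB / EB

# DRAM needed to cache every container location at once.
cache_gb = containers * bytes_per_cache_entry / GB

print(low_eb, high_eb, cache_gb)
```

With 100M containers the capacity lands between 1.6 and 3.2 EB, and caching all of them takes 25 GB of DRAM, matching the slide.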

But Wait, There’s More
• Directories and files are implemented in terms of B-trees
– Key is offset, value is data blob
– Internal transactional semantics guarantee safety and consistency
– Layout algorithms give very high layout linearization
• Tables are implemented in terms of B-trees
– Twisted B-tree implementation allows the virtues of a log-structured merge tree without the compaction delays
– Tablet splitting without pausing, integration with file system transactions
• Common security and permissions scheme

[Diagram: Table, Tablet, Partition]
Similar to LSM implementations, tables are decomposed by key ranges
Distinct from HBase and LevelDB, MapR tables use a fixed number (greater than 1) of decompositions
Very unusually, relative to LSM and cousins, data structures at the leaf are mutable
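Decomposition by key range can be sketched with a sorted list of split keys and a binary search. This toy shows a single level of decomposition only, whereas the slide describes more than one; the class `RangePartitionedTable`, the split keys and the use of plain dicts as tablets are assumptions for illustration, not MapR's data structures.

```python
import bisect


class RangePartitionedTable:
    """Toy table split into tablets by key range."""

    def __init__(self, split_keys):
        # n split keys define n + 1 contiguous key ranges.
        self.split_keys = sorted(split_keys)
        self.tablets = [dict() for _ in range(len(self.split_keys) + 1)]

    def tablet_index(self, key):
        """Find the tablet whose range contains key (binary search)."""
        return bisect.bisect_right(self.split_keys, key)

    def put(self, key, value):
        # Tablets are updated in place, echoing the "mutable leaf" point.
        self.tablets[self.tablet_index(key)][key] = value

    def get(self, key):
        return self.tablets[self.tablet_index(key)].get(key)


table = RangePartitionedTable(["g", "n"])
table.put("apple", 1)
table.put("kiwi", 2)
table.put("zebra", 3)
indexes = [table.tablet_index(k) for k in ("apple", "g", "kiwi", "zebra")]
print(indexes, table.get("kiwi"))
```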

Re-use of Proven Technology
Partitions are distributed just like file chunks
Same replication and transaction technology

And More …
• Streams are implemented in terms of B-trees as well
– Topics and consumer offsets are kept in stream, not ZK
– Splitting technology similar to that of MapR DB tables
– Consistent permissions, security, data replication
• Standard Kafka 0.9 API
• Plans to add OJAI for high-level structuring
• Performance is very high

Example
[Diagram: Cluster, Volume mount point, Directories, Files, Table, Streams]

[Diagram: Cluster, Volume mount point]

Lessons
• APIs matter more than implementations
• There is plenty of room to innovate ahead of the community
• POSIX, HDFS, HBase all define useful APIs
• Kafka 0.9+ does the same

Call to action:
Support the common APIs

Call to action:
Support the Kafka APIs
And come by the MapR booth to check out MapR Streams

6 Free ebooks, including Streaming Architecture (Ted Dunning & Ellen Friedman) and MapR Streams
Read online mapr.com/6ebooks-read
Download pdfs mapr.com/6ebooks-pdf

Thank you for coming today!

…helping you put data technology to work
● Find answers
● Ask technical questions
● Join on-demand training course discussions
● Follow release announcements
● Share and vote on product ideas
● Find Meetup and event listings
Connect with fellow Apache Hadoop and Spark professionals
community.mapr.com

Q&A
@mapr maprtech
tdunning@maprtech.com
Engage with us!
MapR
maprtech
mapr-technologies
