#MongoDB 
Using MongoDB and Hadoop 
Together For Success 
Buzz Moschetti 
buzz.moschetti@mongodb.com 
Enterprise Architect, MongoDB
Who is your Presenter? 
• Yes, I use “Buzz” on my business cards 
• Former Investment Bank Chief Architect at 
JPMorganChase and Bear Stearns before that 
• Over 25 years of designing and building systems 
• Big and small 
• Super-specialized to broadly useful in any vertical 
• “Traditional” to completely disruptive 
• Advocate of language leverage and strong factoring 
• Still programming – using emacs, of course
Agenda 
• (Occasionally) Brutal Truths about Big Data 
• The Key To Success in Large Scale Data Management 
• Review of Directed Content Business Architecture 
• Technical Implementation Examples 
• Recommendation Capability 
• Realtime Trade / Position Risk 
• Q & A
Truths 
• Clear definition of Big Data still maturing 
• Efficiently operationalizing Big Data is non-trivial 
• Developing, debugging, understanding MapReduce 
• Cluster monitoring & management, job scheduling/recovery 
• If you thought regular ETL Hell was bad…. 
• Big Data is not about math/set accuracy 
• The last 25,000 items in a 25,497,612-item set “don’t matter” 
• Big Data questions are best asked periodically 
• “Are we there yet?” 
• Realtime means … realtime
It’s About The Functions, not the Terms 
DON’T ASK: 
• Is this an operations or an analytics problem? 
• Is this online or offline? 
• What query language should we use? 
• What is my integration strategy across tools? 
ASK INSTEAD: 
• Am I incrementally addressing data (esp. writes)? 
• Am I computing a precise answer or a trend? 
• Do I need to operate on this data in realtime? 
• What is my holistic architecture?
Success in Big Data: MongoDB + Hadoop 
• Efficient Operationalization 
• Robust data movements 
• Clarity and fidelity of data movements 
• Designing for change 
• Analysis Feedback 
• Data computed in Hadoop integrated back into 
MongoDB
What We’re Going to “Build” today 
Realtime Directed Content System 
• Based on what users click, “recommended” 
content is returned in addition to the target 
• The example is sector (manufacturing, financial 
services, retail) neutral 
• System dynamically updates behavior in response 
to user activity
The Participants and Their Roles 
[Diagram: Directed Content System at center, surrounded by:] 
• Customers 
• Analysts/Data Scientists: operate on data to identify trends and develop tag domains 
• Content Creators: generate and tag content from a known domain of tags 
• Management/Strategy: make decisions based on trends and other summarized data 
• Developers/ProdOps: bring it all together: apps, SDLC, integration, etc.
Priority #1: Maximizing User value 
Considerations/Requirements 
Maximize realtime user value and experience 
Provide management reporting and trend analysis 
Engineer for Day 2 agility on recommendation engine 
Provide scrubbed click history for customer 
Permit low-cost horizontal scaling 
Minimize technical integration 
Minimize technical footprint 
Use conventional and/or approved tools 
Provide a RESTful service layer 
…..
The Architecture 
App(s) MongoDB Hadoop MapReduce
Complementary Strengths 
App(s) MongoDB Hadoop MapReduce 
• Standard design paradigm (objects, 
tools, 3rd party products, IDEs, test 
drivers, skill pool, etc. etc.) 
• Language flexibility (Java, C#, C++, Python, Scala, …) 
• Webscale deployment model 
• appservers, DMZ, monitoring 
• High performance rich shape CRUD 
• MapReduce design paradigm 
• Node deployment model 
• Very large set operations 
• Computationally intensive, longer 
duration 
• Read-dominated workload
“Legacy” Approach: Somewhat unidirectional 
ETL 
App(s) MongoDB Hadoop MapReduce 
• Extract data from MongoDB and other 
sources nightly (or weekly) 
• Generate reports for people to read 
• Same pains as existing ETL: 
reconciliation, transformation, change 
management …
Somewhat better approach 
ETL 
App(s) MongoDB Hadoop MapReduce 
ETL 
• Extract data from MongoDB and other 
sources nightly (or weekly) 
• Generate reports for people to read 
• Move important summary data back to 
MongoDB for consumption by apps 
• Still in ETL-dominated landscape
…but the overall problem remains: 
• How do we integrate with and operate upon both 
periodically generated data and realtime current 
data, in realtime? 
• Lackluster integration between OLTP and Hadoop 
• It’s not just about the database: you need a 
realtime profile and profile update function
The legacy problem in pseudocode 
onContentClick() { 
    String[] tags = content.getTags(); 
    Resource[] r = f1(database, tags); 
} 
• Realtime intraday state not well-handled 
• Baselining is a different problem than click handling
The Right Approach 
• Users have a specific Profile entity 
• The Profile captures trend analytics as baselining 
information 
• The Profile has per-tag “counters” that are updated with 
each interaction / click 
• Counters plus baselining are passed to fetch function 
• The fetch function itself could be dynamic!
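The bullets above can be sketched in Python. This is a minimal illustration with hypothetical function and field names (`on_content_click`, `fetch_recommendations`); the real fetch function and storage layer are application-specific, and the ranking here is just a placeholder.

```python
from datetime import datetime, timezone

def on_content_click(profile, content):
    """Update the user's per-tag counters, then fetch recommendations
    using both intraday activity and the Hadoop-computed baseline."""
    now = datetime.now(timezone.utc)
    for tag in content["tags"]:
        # Each tag entry carries intraday click history plus a baseline
        entry = profile["tags"].setdefault(tag, {"hist": [], "baseline": []})
        entry["hist"].append({"ts": now, "url": content["url"]})
    # Counters plus baseline are passed to the (possibly dynamic) fetch function
    return fetch_recommendations(profile, content["tags"])

def fetch_recommendations(profile, tags):
    # Placeholder ranking: order tags by intraday click volume
    return sorted(tags, key=lambda t: len(profile["tags"][t]["hist"]),
                  reverse=True)
```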
24 hours in the life of The System 
• Assume some content has been created and tagged 
• Two systematized tags: Pets & PowerTools
Monday, 1:30AM EST 
App(s) MongoDB Hadoop MapReduce 
• Fetch all user Profiles from MongoDB; load into Hadoop 
• Or skip if using the MongoDB-Hadoop connector!
MongoDB-Hadoop MapReduce Example 
public class ProfileMapper 
    extends Mapper<Object, BSONObject, IntWritable, IntWritable> { 

  @Override 
  public void map(final Object pKey, 
                  final BSONObject pValue, 
                  final Context pContext) 
      throws IOException, InterruptedException { 
    String user = (String) pValue.get("user"); 
    Date lastUpdate = (Date) pValue.get("lastUpdate"); 

    // Walk the per-tag subdocuments and total the click history sizes 
    BSONObject tags = (BSONObject) pValue.get("tags"); 
    Set<String> keys = tags.keySet(); 
    int count = 0; 
    for (String tag : keys) { 
      BSONObject entry = (BSONObject) tags.get(tag); 
      count += ((List) entry.get("hist")).size(); 
    } 
    int avg = count / keys.size(); 
    pContext.write(new IntWritable(count), new IntWritable(avg)); 
  } 
}
MongoDB-Hadoop v1 (today) 
Hadoop 
MR Mapper 
v1 
MongoDB-Hadoop 
ü V1 adapter draws data directly from MongoDB 
ü No ETL, scripts, change management, etc. 
ü Storage optimized: NO data copies
MongoDB-Hadoop v2 (soon) 
Hadoop 
MR Mapper 
HDFS 
ü V2 flows data directly into HDFS via a special 
MongoDB secondary 
ü No ETL, scripts, change management, etc. 
ü Data is copied – but still one data fabric 
ü Realtime data with snapshotting as an option
Monday, 1:45AM EST 
App(s) MongoDB Hadoop MapReduce 
• Grind through all content data and user Profile data to produce: 
• Tags based on feature extraction (vs. creator-applied tags) 
• Trend baseline per user for tags Pets and PowerTools 
• Load Profiles with new baseline back into MongoDB
Monday, 8AM EST 
App(s) MongoDB Hadoop MapReduce 
• User Bob logs in and Profile retrieved from MongoDB 
• Bob clicks on Content X which is already tagged as “Pets” 
• Bob has clicked on Pets tagged content many times 
• Adjust Profile for tag “Pets” and save back to MongoDB 
• Analysis = f(Profile) 
• Analysis can be “anything”; it is simply a result. It could trigger 
an ad, a compliance alert, etc.
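The "adjust Profile and save back to MongoDB" step can be expressed as a single atomic update document. A sketch using pymongo-style update operators, with field names taken from the Profile document shown later in this deck (`make_click_update` is a hypothetical helper):

```python
from datetime import datetime, timezone

def make_click_update(tag, url):
    """Build the MongoDB update document applied on each click:
    append to the tag's click history and refresh lastUpdate in one update."""
    now = datetime.now(timezone.utc)
    return {
        "$push": {"tags.%s.hist" % tag: {"ts": now, "url": url}},
        "$set": {"lastUpdate": now},
    }

# With pymongo (assumed, not executed here):
# profiles.update_one({"user": "Bob"}, make_click_update("PETS", url))
```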
Monday, 8:02AM EST 
App(s) MongoDB Hadoop MapReduce 
• Bob clicks on Content Y which is already tagged as “Spices” 
• Spice is a new tag type for Bob 
• Adjust Profile for tag “Spices” and save back to MongoDB 
• Analysis = f(profile)
Profile in Detail 
{ 
  user: "Bob", 
  personalData: { 
    zip: "10024", 
    gender: "M" 
  }, 
  tags: { 
    PETS: { algo: "A4", 
      baseline: [0,0,10,4,1322,44,23, … ], 
      hist: [ 
        { ts: datetime1, url: url1 }, 
        { ts: datetime2, url: url2 } // 100 more 
      ]}, 
    SPICE: { hist: [ 
        { ts: datetime3, url: url3 } 
      ]} 
  } 
}
Tag-based algorithm detail 
getRecommendedContent(profile, ["PETS", other]) { 
  if algo for a tag available { 
    filter = algo(profile, tag); 
  } 
  fetch N recommendations(filter); 
} 

A4(profile, tag) { 
  weight = get tag ("PETS") global weighting; 
  adjustForPersonalBaseline(weight, "PETS" baseline); 
  if "PETS" clicked more than 2 times in past 10 mins 
    then weight += 10; 
  if "PETS" clicked more than 10 times in past 2 days 
    then weight += 3; 

  return new filter({"PETS", weight}, globals); 
}
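A rough Python rendering of the A4 weighting above. `GLOBAL_WEIGHTS` and the simplified baseline handling are assumptions for illustration; in the real system the global weights and per-user baselines come from the nightly Hadoop run.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical global tag weights; the real values come from the baseline run.
GLOBAL_WEIGHTS = {"PETS": 5, "SPICE": 1}

def a4(profile, tag, now=None):
    """Sketch of the A4 tag-weighting algorithm: start from the global
    weight, then boost for recent click bursts on this tag."""
    now = now or datetime.now(timezone.utc)
    weight = GLOBAL_WEIGHTS.get(tag, 0)
    clicks = [h["ts"] for h in profile["tags"].get(tag, {}).get("hist", [])]

    def recent(delta):
        # Count clicks on this tag within the trailing window
        return sum(1 for ts in clicks if now - ts <= delta)

    if recent(timedelta(minutes=10)) > 2:
        weight += 10
    if recent(timedelta(days=2)) > 10:
        weight += 3
    return weight
```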
Tuesday, 1AM EST 
App(s) MongoDB Hadoop MapReduce 
• Fetch all user Profiles from MongoDB; load into Hadoop 
• Or skip if using the MongoDB-Hadoop connector!
Tuesday, 1:30AM EST 
App(s) MongoDB Hadoop MapReduce 
• Grind through all content data and user profile data to produce: 
• Tags based on feature extraction (vs. creator-applied tags) 
• Trend baseline for Pets, PowerTools, and Spice 
• Data can be specific to individual or by group 
• Load new baselines back into MongoDB
New Profile in Detail 
{ 
  user: "Bob", 
  personalData: { 
    zip: "10024", 
    gender: "M" 
  }, 
  tags: { 
    PETS: { algo: "A4", 
      baseline: [0,4,10,4,1322,44,23, … ], 
      hist: [ 
        { ts: datetime1, url: url1 }, 
        { ts: datetime2, url: url2 } // 100 more 
      ]}, 
    SPICE: { baseline: [1], 
      hist: [ 
        { ts: datetime3, url: url3 } 
      ]} 
  } 
}
Tuesday, 1:35AM EST 
App(s) MongoDB Hadoop MapReduce 
• Perform maintenance on user Profiles 
• Click history trimming (variety of algorithms) 
• “Dead tag” removal 
• Update of auxiliary reference data
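One possible trimming policy, sketched in Python: keep the newest N clicks per tag and drop "dead" tags with no history and no baseline. This is just one of the "variety of algorithms" the slide mentions, and the field names follow the Profile document shown earlier.

```python
def trim_profile(profile, max_hist=50):
    """Maintenance pass sketch: trim click history and remove dead tags."""
    dead = []
    for tag, entry in profile["tags"].items():
        entry["hist"] = entry["hist"][-max_hist:]   # keep most recent N clicks
        if not entry["hist"] and not entry.get("baseline"):
            dead.append(tag)                        # nothing left to keep
    for tag in dead:
        del profile["tags"][tag]
    return profile
```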
New Profile in Detail 
{ 
  user: "Bob", 
  personalData: { 
    zip: "10022", 
    gender: "M" 
  }, 
  tags: { 
    PETS: { algo: "A4", 
      baseline: [ 1322,44,23, … ], 
      hist: [ 
        { ts: datetime1, url: url1 } // 50 more 
      ]}, 
    SPICE: { algo: "Z1", baseline: [1], 
      hist: [ 
        { ts: datetime3, url: url3 } 
      ]} 
  } 
}
Feel free to run the baselining more frequently 
App(s) MongoDB Hadoop MapReduce 
… but avoid “Are We There Yet?”
Nearterm / Realtime Questions & Actions 
With respect to the Customer: 
• What has Bob done over the past 24 hours? 
• Given an input, make a logic decision in 100ms or less 
With respect to the Provider: 
• What are all current users doing or looking at? 
• Can we nearterm correlate single events to shifts in behavior?
Longterm/ Not Realtime Questions & Actions 
With respect to the Customer: 
• Any way to explain historic performance / actions? 
• What are recommendations for the future? 
With respect to the Provider: 
• Can we correlate multiple events from multiple sources 
over a long period of time to identify trends? 
• What is my entire customer base doing over 2 years? 
• Show me a time vs. aggregate tag hit chart 
• Slice and dice and aggregate tags vs. XYZ 
• What tags are trending up or down?
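For example, the "time vs. aggregate tag hit chart" could be served by a MongoDB aggregation pipeline over the Profile collection. A sketch assuming the Profile field names shown earlier and MongoDB's $dateToString operator; the exact pipeline would depend on the deployed schema.

```python
def tag_hits_pipeline(tag):
    """Aggregation pipeline sketch: unwind one tag's click history
    and bucket hits by day for charting."""
    hist = "$tags.%s.hist" % tag
    return [
        {"$unwind": hist},                      # one doc per click
        {"$group": {                            # bucket clicks by day
            "_id": {"$dateToString": {"format": "%Y-%m-%d",
                                      "date": hist + ".ts"}},
            "hits": {"$sum": 1},
        }},
        {"$sort": {"_id": 1}},                  # chronological order
    ]

# With pymongo (assumed): profiles.aggregate(tag_hits_pipeline("PETS"))
```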
Another Example: Realtime Risk 
[Diagram: Applications (Trade Processing, Risk Service) log trade activities and query trades and risk in MongoDB; Risk Calculation (Spark) produces risk params; Analysis/Reporting (Impala) runs alongside other HDFS data; Admin oversees.]
Recording a trade 
Applications 
Trade Processing 
1. Bank makes a trade 
2. Trade sent to Trade Processing 
3. Trade Processing writes trade to MongoDB 
4. Realtime replicate trade to Hadoop/HDFS 
Non-functional notes: 
• High volume of data ingestion (10,000s or more 
events per second) 
• Durable storage of trade data 
• Store trade events across all asset classes 
Querying deal / trade / event data 
1. Query on deal attributes (id, counterparty, asset 
class, termination date, notional amount, book) 
2. MongoDB performs index-optimized query and 
Trade Processing assembles Deal/Trade/Event data 
into response packet 
3. Return response packet to caller 
Non-functional notes: 
• System can support very high volume (10,000s 
or more queries per second) 
• Millisecond response times 
[Diagram: Applications → Trade Processing → MongoDB]
Updating intra-day risk data 
1. Mirror of trade data already stored in HDFS 
Trade data partitioned into time windows 
2. Signal/timer kicks off a “run” 
3. Spark ingests new partition of trade data as RDD 
and calculates and merges risk data based on 
latest trade data 
4. Risk data written directly to MongoDB and indexed 
and available for online queries / aggregations / 
applications logic 
[Diagram: Applications → Risk Service; HDFS → Risk Calculation (Spark) → MongoDB]
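The Spark merge step can be illustrated with a pure-Python stand-in. Summed notional per book is a placeholder risk measure, not a real risk model, and the function names are hypothetical; the point is the fold of the newest trade partition into existing risk data before writing back to MongoDB.

```python
def merge_risk(existing, trades):
    """Stand-in for the Spark step: fold the latest partition of trades
    into per-book risk figures (placeholder measure: summed notional)."""
    risk = dict(existing)               # do not mutate the prior snapshot
    for t in trades:
        risk[t["book"]] = risk.get(t["book"], 0.0) + t["notional"]
    return risk                         # would be written back to MongoDB
```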
Querying detail & aggregated risk on demand 
1. Applications can use full MongoDB query API to 
access risk data and trade data 
2. Risk data can be indexed on multiple fields for fast 
access by multiple dimensions 
3. Hadoop jobs periodically apply incremental 
updates to risk data with no down time 
4. Interpolated / matrix risk can be computed on-the-fly 
Non-functional notes 
• System can support very high volume (10,000s 
or more queries per second) 
• Millisecond response times 
[Diagram: Applications → Risk Service → MongoDB]
Trade Analytics & Reporting 
1. Impala provides full SQL access to all content in 
Hadoop 
2. Dashboards and Reporting frameworks deliver 
periodic information to consumers 
3. A breadth of data discovery / ad-hoc analysis tools 
can be brought to bear on all data in Hadoop 
Non-functional notes: 
• Lower query frequency 
• Full SQL query flexibility 
• Most queries / analysis yield value accessing large 
volumes of data (e.g. all events in the last 30 days 
– or 30 months) 
[Diagram: Applications, Dashboards, Reports, and Ad-hoc Analysis → Impala → Hadoop]
The Key To Success: It is One System 
App(s) ↔ MongoDB ↔ Hadoop MapReduce
Q&A 
buzz.moschetti@mongodb.com
#MongoDB 
Thank You 
Buzz Moschetti 
buzz.moschetti@mongodb.com
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 
Ad

Recently uploaded (20)

Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz
 
Mastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B LandscapeMastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B Landscape
marketing943205
 
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdfKit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Wonjun Hwang
 
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
Lorenzo Miniero
 
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier VroomAI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
UXPA Boston
 
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Cyntexa
 
Unlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web AppsUnlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web Apps
Maximiliano Firtman
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
Artificial_Intelligence_in_Everyday_Life.pptx
Artificial_Intelligence_in_Everyday_Life.pptxArtificial_Intelligence_in_Everyday_Life.pptx
Artificial_Intelligence_in_Everyday_Life.pptx
03ANMOLCHAURASIYA
 
Agentic Automation - Delhi UiPath Community Meetup
Agentic Automation - Delhi UiPath Community MeetupAgentic Automation - Delhi UiPath Community Meetup
Agentic Automation - Delhi UiPath Community Meetup
Manoj Batra (1600 + Connections)
 
IT488 Wireless Sensor Networks_Information Technology
IT488 Wireless Sensor Networks_Information TechnologyIT488 Wireless Sensor Networks_Information Technology
IT488 Wireless Sensor Networks_Information Technology
SHEHABALYAMANI
 
Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)
Kaya Weers
 
ACE Aarhus - Team'25 wrap-up presentation
ACE Aarhus - Team'25 wrap-up presentationACE Aarhus - Team'25 wrap-up presentation
ACE Aarhus - Team'25 wrap-up presentation
DanielEriksen5
 
Cybersecurity Tools and Technologies - Microsoft Certificate
Cybersecurity Tools and Technologies - Microsoft CertificateCybersecurity Tools and Technologies - Microsoft Certificate
Cybersecurity Tools and Technologies - Microsoft Certificate
VICTOR MAESTRE RAMIREZ
 
fennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solutionfennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solution
shallal2
 
ICDCC 2025: Securing Agentic AI - Eryk Budi Pratama.pdf
ICDCC 2025: Securing Agentic AI - Eryk Budi Pratama.pdfICDCC 2025: Securing Agentic AI - Eryk Budi Pratama.pdf
ICDCC 2025: Securing Agentic AI - Eryk Budi Pratama.pdf
Eryk Budi Pratama
 
論文紹介:"InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning" ...
論文紹介:"InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning" ...論文紹介:"InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning" ...
論文紹介:"InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning" ...
Toru Tamaki
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Safe Software
 
DNF 2.0 Implementations Challenges in Nepal
DNF 2.0 Implementations Challenges in NepalDNF 2.0 Implementations Challenges in Nepal
DNF 2.0 Implementations Challenges in Nepal
ICT Frame Magazine Pvt. Ltd.
 
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz
 
Mastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B LandscapeMastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B Landscape
marketing943205
 
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdfKit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Wonjun Hwang
 
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
Lorenzo Miniero
 
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier VroomAI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
UXPA Boston
 
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Cyntexa
 
Unlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web AppsUnlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web Apps
Maximiliano Firtman
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
Artificial_Intelligence_in_Everyday_Life.pptx
Artificial_Intelligence_in_Everyday_Life.pptxArtificial_Intelligence_in_Everyday_Life.pptx
Artificial_Intelligence_in_Everyday_Life.pptx
03ANMOLCHAURASIYA
 
IT488 Wireless Sensor Networks_Information Technology
IT488 Wireless Sensor Networks_Information TechnologyIT488 Wireless Sensor Networks_Information Technology
IT488 Wireless Sensor Networks_Information Technology
SHEHABALYAMANI
 
Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)
Kaya Weers
 
ACE Aarhus - Team'25 wrap-up presentation
ACE Aarhus - Team'25 wrap-up presentationACE Aarhus - Team'25 wrap-up presentation
ACE Aarhus - Team'25 wrap-up presentation
DanielEriksen5
 
Cybersecurity Tools and Technologies - Microsoft Certificate
Cybersecurity Tools and Technologies - Microsoft CertificateCybersecurity Tools and Technologies - Microsoft Certificate
Cybersecurity Tools and Technologies - Microsoft Certificate
VICTOR MAESTRE RAMIREZ
 
fennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solutionfennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solution
shallal2
 
ICDCC 2025: Securing Agentic AI - Eryk Budi Pratama.pdf
ICDCC 2025: Securing Agentic AI - Eryk Budi Pratama.pdfICDCC 2025: Securing Agentic AI - Eryk Budi Pratama.pdf
ICDCC 2025: Securing Agentic AI - Eryk Budi Pratama.pdf
Eryk Budi Pratama
 
論文紹介:"InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning" ...
論文紹介:"InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning" ...論文紹介:"InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning" ...
論文紹介:"InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning" ...
Toru Tamaki
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Safe Software
 

Using MongoDB + Hadoop Together

  • 1. #MongoDB Using MongoDB and Hadoop Together For Success Buzz Moschetti buzz.moschetti@mongodb.com Enterprise Architect, MongoDB
  • 2. Who is your Presenter? • Yes, I use “Buzz” on my business cards • Former Investment Bank Chief Architect at JPMorganChase and Bear Stearns before that • Over 25 years of designing and building systems • Big and small • Super-specialized to broadly useful in any vertical • “Traditional” to completely disruptive • Advocate of language leverage and strong factoring • Still programming – using emacs, of course
  • 3. Agenda • (Occasionally) Brutal Truths about Big Data • The Key To Success in Large Scale Data Management • Review of Directed Content Business Architecture • Technical Implementation Examples • Recommendation Capability • Realtime Trade / Position Risk • Q & A
  • 4. Truths • Clear definition of Big Data still maturing • Efficiently operationalizing Big Data is non-trivial • Developing, debugging, understanding MapReduce • Cluster monitoring & management, job scheduling/recovery • If you thought regular ETL Hell was bad…. • Big Data is not about math/set accuracy • The last 25000 items in a 25,497,612 set “don’t matter” • Big Data questions are best asked periodically • “Are we there yet?” • Realtime means … realtime
  • 5. It’s About The Functions, not the Terms DON’T ASK: • Is this an operations or an analytics problem? • Is this online or offline? • What query language should we use? • What is my integration strategy across tools? ASK INSTEAD: • Am I incrementally addressing data (esp. writes)? • Am I computing a precise answer or a trend? • Do I need to operate on this data in realtime? • What is my holistic architecture?
  • 6. Success in Big Data: MongoDB + Hadoop • Efficient Operationalization • Robust data movements • Clarity and fidelity of data movements • Designing for change • Analysis Feedback • Data computed in Hadoop integrated back into MongoDB
  • 7. What We’re Going to “Build” today Realtime Directed Content System • Based on what users click, “recommended” content is returned in addition to the target • The example is sector (manufacturing, financial services, retail) neutral • System dynamically updates behavior in response to user activity
  • 8. The Participants and Their Roles Directed Content System Customers Analysts/ Data Scientists Content Creators Management/ Strategy Operate on data to identify trends and develop tag domains Generate and tag content from a known domain of tags Make decisions based on trends and other summarized data Developers/ ProdOps Bring it all together: apps, SDLC, integration, etc.
  • 9. Priority #1: Maximizing User value Considerations/Requirements Maximize realtime user value and experience Provide management reporting and trend analysis Engineer for Day 2 agility on recommendation engine Provide scrubbed click history for customer Permit low-cost horizontal scaling Minimize technical integration Minimize technical footprint Use conventional and/or approved tools Provide a RESTful service layer …..
  • 10. The Architecture App(s) MongoDB Hadoop MapReduce
  • 11. Complementary Strengths App(s) MongoDB Hadoop MapReduce • Standard design paradigm (objects, tools, 3rd party products, IDEs, test drivers, skill pool, etc. etc.) • Language flexibility (Java, C#, C++, Python, Scala, …) • Webscale deployment model • appservers, DMZ, monitoring • High performance rich shape CRUD • MapReduce design paradigm • Node deployment model • Very large set operations • Computationally intensive, longer duration • Read-dominated workload
  • 12. “Legacy” Approach: Somewhat unidirectional ETL App(s) MongoDB Hadoop MapReduce • Extract data from mongoDB and other sources nightly (or weekly) • Generate reports for people to read • Same pains as existing ETL: reconciliation, transformation, change management …
  • 13. Somewhat better approach ETL App(s) MongoDB Hadoop MapReduce ETL • Extract data from mongoDB and other sources nightly (or weekly) • Generate reports for people to read • Move important summary data back to mongoDB for consumption by apps. • Still in ETL-dominated landscape
  • 14. …but the overall problem remains: • How to realtime integrate and operate upon both periodically generated data and realtime current data? • Lackluster integration between OLTP and Hadoop • It’s not just about the database: you need a realtime profile and profile update function
  • 15. The legacy problem in pseudocode

    onContentClick() {
        String[] tags = content.getTags();
        Resource[] r = f1(database, tags);
    }

    • Realtime intraday state not well-handled
    • Baselining is a different problem than click handling
  • 16. The Right Approach • Users have a specific Profile entity • The Profile captures trend analytics as baselining information • The Profile has per-tag “counters” that are updated with each interaction / click • Counters plus baselining are passed to fetch function • The fetch function itself could be dynamic!
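The per-tag counter idea on this slide can be sketched in plain Java. This is a minimal stand-in, not code from the deck: the `Profile` class, `recordClick`, and `recentClicks` names are illustrative, and the baseline is just the array Hadoop would hand back.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a per-user Profile with per-tag click counters and
// Hadoop-computed baselines, as described on the slide. All names are
// hypothetical stand-ins.
class Profile {
    // tag -> timestamps (millis) of recent clicks on that tag
    private final Map<String, Deque<Long>> clicks = new HashMap<>();
    // tag -> baseline vector computed offline in Hadoop
    private final Map<String, double[]> baselines = new HashMap<>();

    // Update the counter for a tag on every interaction / click.
    void recordClick(String tag, long ts) {
        clicks.computeIfAbsent(tag, t -> new ArrayDeque<>()).addLast(ts);
    }

    // Count clicks on a tag within the trailing window [now - windowMs, now];
    // this is the "counter" the fetch function consumes alongside the baseline.
    int recentClicks(String tag, long now, long windowMs) {
        Deque<Long> h = clicks.getOrDefault(tag, new ArrayDeque<>());
        int n = 0;
        for (long ts : h) if (ts >= now - windowMs) n++;
        return n;
    }

    void setBaseline(String tag, double[] b) { baselines.put(tag, b); }
    double[] baseline(String tag) { return baselines.get(tag); }
}
```

A fetch function would then take both the counters and the baseline as inputs, which is what makes the recommendation logic swappable per tag.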
  • 17. 24 hours in the life of The System • Assume some content has been created and tagged • Two systemetized tags: Pets & PowerTools
  • 18. Monday, 1:30AM EST App(s) MongoDB Hadoop MapReduce • Fetch all user Profiles from MongoDB; load into Hadoop • Or skip if using the MongoDB-Hadoop connector!
  • 19. MongoDB-Hadoop MapReduce Example

    public class ProfileMapper
        extends Mapper<Object, BSONObject, IntWritable, IntWritable> {

        @Override
        public void map(final Object pKey,
                        final BSONObject pValue,
                        final Context pContext)
            throws IOException, InterruptedException {

            String user = (String) pValue.get("user");
            Date d1 = (Date) pValue.get("lastUpdate");
            int count = 0;
            BSONObject tags = (BSONObject) pValue.get("tags");
            Set<String> keys = tags.keySet();
            for (String tag : keys) {
                BSONObject t = (BSONObject) tags.get(tag);
                count += ((List<?>) t.get("hist")).size();
            }
            int avg = count / keys.size();
            pContext.write(new IntWritable(count), new IntWritable(avg));
        }
    }
  • 20. MongoDB-Hadoop v1 (today) Hadoop MR Mapper v1 MongoDB-Hadoop ü V1 adapter draws data directly from MongoDB ü No ETL, scripts, change management, etc. ü Storage optimized: NO data copies
  • 21. MongoDB-Hadoop v2 (soon) Hadoop MR Mapper HDFS ü V2 flows data directly into HDFS via a special MongoDB secondary ü No ETL, scripts, change management, etc. ü Data is copied – but still one data fabric ü Realtime data with snapshotting as an option
  • 22. Monday, 1:45AM EST App(s) MongoDB Hadoop MapReduce • Grind through all content data and user Profile data to produce: • Tags based on feature extraction (vs. creator-applied tags) • Trend baseline per user for tags Pets and PowerTools • Load Profiles with new baseline back into MongoDB
  • 23. Monday, 8AM EST App(s) MongoDB Hadoop MapReduce • User Bob logs in and Profile retrieved from MongoDB • Bob clicks on Content X which is already tagged as “Pets” • Bob has clicked on Pets tagged content many times • Adjust Profile for tag “Pets” and save back to MongoDB • Analysis = f(Profile) • Analysis can be “anything”; it is simply a result. It could trigger an ad, a compliance alert, etc.
  • 24. Monday, 8:02AM EST App(s) MongoDB Hadoop MapReduce • Bob clicks on Content Y which is already tagged as “Spices” • Spice is a new tag type for Bob • Adjust Profile for tag “Spices” and save back to MongoDB • Analysis = f(profile)
  • 25. Profile in Detail

    {
      user: "Bob",
      personalData: {
        zip: "10024",
        gender: "M"
      },
      tags: {
        PETS: { algo: "A4",
          baseline: [0,0,10,4,1322,44,23, … ],
          hist: [
            { ts: datetime1, url: url1 },
            { ts: datetime2, url: url2 }  // 100 more
          ]},
        SPICE: { hist: [
          { ts: datetime3, url: url3 }
        ]}
      }
    }
  • 26. Tag-based algorithm detail

    getRecommendedContent(profile, ["PETS", other]) {
        if algo for a tag available {
            filter = algo(profile, tag);
        }
        fetch N recommendations (filter);
    }

    A4(profile, tag) {
        weight = get tag ("PETS") global weighting;
        adjustForPersonalBaseline(weight, "PETS" baseline);
        if "PETS" clicked more than 2 times in past 10 mins
            then weight += 10;
        if "PETS" clicked more than 10 times in past 2 days
            then weight += 3;
        return new filter({"PETS", weight}, globals)
    }
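A runnable approximation of the A4 weighting rule above, in plain Java. The thresholds mirror the pseudocode; the global weight lookup and the personal-baseline adjustment are stubbed out, and every name here is an illustrative assumption rather than the deck's actual implementation.

```java
import java.util.List;

// Sketch of the A4 tag-weighting rule: start from a global tag weight,
// then boost it when the user has clicked the tag recently.
// Thresholds come from the slide; everything else is a stand-in.
class A4Weighting {
    static final long TEN_MIN = 10L * 60 * 1000;
    static final long TWO_DAYS = 2L * 24 * 60 * 60 * 1000;

    // clickTimes: millisecond timestamps of this user's clicks on the tag
    static int weight(int globalWeight, List<Long> clickTimes, long now) {
        int w = globalWeight; // adjustForPersonalBaseline() omitted in sketch
        if (countSince(clickTimes, now - TEN_MIN) > 2) w += 10;
        if (countSince(clickTimes, now - TWO_DAYS) > 10) w += 3;
        return w;
    }

    static int countSince(List<Long> times, long cutoff) {
        int n = 0;
        for (long t : times) if (t >= cutoff) n++;
        return n;
    }
}
```

The point of storing `algo: "A4"` in the Profile is that this function can be swapped per tag without touching the click-handling path.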
  • 27. Tuesday, 1AM EST App(s) MongoDB Hadoop MapReduce • Fetch all user Profiles from MongoDB; load into Hadoop • Or skip if using the MongoDB-Hadoop connector!
  • 28. Tuesday, 1:30AM EST App(s) MongoDB Hadoop MapReduce • Grind through all content data and user profile data to produce: • Tags based on feature extraction (vs. creator-applied tags) • Trend baseline for Pets and PowerTools and Spice • Data can be specific to individual or by group • Load new baselines back into MongoDB
  • 29. New Profile in Detail

    {
      user: "Bob",
      personalData: {
        zip: "10024",
        gender: "M"
      },
      tags: {
        PETS: { algo: "A4",
          baseline: [0,4,10,4,1322,44,23, … ],
          hist: [
            { ts: datetime1, url: url1 },
            { ts: datetime2, url: url2 }  // 100 more
          ]},
        SPICE: { baseline: [1],
          hist: [
            { ts: datetime3, url: url3 }
          ]}
      }
    }
  • 30. Tuesday, 1:35AM EST App(s) MongoDB Hadoop MapReduce • Perform maintenance on user Profiles • Click history trimming (variety of algorithms) • “Dead tag” removal • Update of auxiliary reference data
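The maintenance pass described above can be sketched as follows. This is one of the "variety of algorithms" the slide alludes to (keep the most recent N clicks, drop tags left with no history); the data shapes and names are simplified stand-ins, not the deck's schema.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Map;

// Sketch of the nightly Profile maintenance step: trim each tag's click
// history to the most recent N entries, then remove "dead" tags that
// have no remaining history.
class ProfileMaintenance {
    // tagHist: tag -> ordered click timestamps, oldest first
    static void maintain(Map<String, LinkedList<Long>> tagHist, int keepLast) {
        Iterator<Map.Entry<String, LinkedList<Long>>> it =
            tagHist.entrySet().iterator();
        while (it.hasNext()) {
            LinkedList<Long> hist = it.next().getValue();
            while (hist.size() > keepLast) hist.removeFirst(); // trim oldest
            if (hist.isEmpty()) it.remove();                   // dead tag
        }
    }
}
```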
  • 31. New Profile in Detail

    {
      user: "Bob",
      personalData: {
        zip: "10022",
        gender: "M"
      },
      tags: {
        PETS: { algo: "A4",
          baseline: [ 1322,44,23, … ],
          hist: [
            { ts: datetime1, url: url1 }  // 50 more
          ]},
        SPICE: { algo: "Z1",
          baseline: [1],
          hist: [
            { ts: datetime3, url: url3 }
          ]}
      }
    }
  • 32. Feel free to run the baselining more frequently App(s) MongoDB Hadoop MapReduce … but avoid “Are We There Yet?”
  • 33. Nearterm / Realtime Questions & Actions With respect to the Customer: • What has Bob done over the past 24 hours? • Given an input, make a logic decision in 100ms or less With respect to the Provider: • What are all current users doing or looking at? • Can we nearterm correlate single events to shifts in behavior?
  • 34. Longterm/ Not Realtime Questions & Actions With respect to the Customer: • Any way to explain historic performance / actions? • What are recommendations for the future? With respect to the Provider: • Can we correlate multiple events from multiple sources over a long period of time to identify trends? • What is my entire customer base doing over 2 years? • Show me a time vs. aggregate tag hit chart • Slice and dice and aggregate tags vs. XYZ • What tags are trending up or down?
  • 35. Another Example: Realtime Risk
    [Architecture diagram labels: Applications • Trade Processing • Risk Service • Risk Calculation (Spark) • Log trade activities • Query trades • Query Risk • Risk Params • Admin • Analysis/Reporting (Impala) • Other HDFS data]
  • 36. Recording a trade Applications Trade Processing 1. Bank makes a trade 2. Trade sent to Trade Processing 3. Trade Processing writes trade to MongoDB 4. Realtime replicate trade to Hadoop/HDFS Non-functional notes: • High volume of data ingestion (10,000s or more events per second) • Durable storage of trade data • Store trade events across all asset classes 1 2 3 4
  • 37. Querying deal / trade / event data 1. Query on deal attributes (id, counterparty, asset class, termination date, notional amount, book) 2. MongoDB performs index-optimized query and Trade Processing assembles Deal/Trade/Event data into response packet 3. Return response packet to caller Non-functional notes: • System can support very high volume (10,000s or more queries per second) • Millisecond response times Applications 1 Trade Processing 2 3
  • 38. Updating intra-day risk data 1. Mirror of trade data already stored in HDFS Trade data partitioned into time windows 2. Signal/timer kicks off a “run” 3. Spark ingests new partition of trade data as RDD and calculates and merges risk data based on latest trade data 4. Risk data written directly to MongoDB and indexed and available for online queries / aggregations / applications logic Applications Risk Service 1 Risk Calculation (Spark) 2 4 3
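A toy version of step 3 above — folding the latest window of trades into existing risk aggregates — written in plain Java rather than Spark so it stands alone. Netting notional per book is a stand-in for a real risk calculation, and all names are illustrative; in the architecture on the slide, the merged result is what gets written back to MongoDB for online queries.

```java
import java.util.List;
import java.util.Map;

// Sketch of the incremental risk-merge step: each run folds the newest
// partition of trades into the running per-book aggregate. "Risk" here
// is simply net notional, standing in for a real calculation.
class RiskMerge {
    static class Trade {
        final String book;
        final double notional;
        Trade(String book, double notional) {
            this.book = book;
            this.notional = notional;
        }
    }

    // Merge a new window of trades into the existing aggregates in place.
    static void merge(Map<String, Double> riskByBook, List<Trade> newTrades) {
        for (Trade t : newTrades) {
            riskByBook.merge(t.book, t.notional, Double::sum);
        }
    }
}
```

Because each run only touches the new partition, reruns and late windows just re-merge incrementally instead of recomputing the whole book.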
  • 39. Querying detail & aggregated risk on demand 1. Applications can use full MongoDB query API to access risk data and trade data 2. Risk data can be indexed on multiple fields for fast access by multiple dimensions 3. Hadoop jobs periodically apply incremental updates to risk data with no down time 4. Interpolated / matrix risk can be computed on-the-fly Non-functional notes • System can support very high volume (10,000s or more queries per second) • Millisecond response times Applications 1 Risk Service 2 3
  • 40. Trade Analytics & Reporting 1. Impala provides full SQL access to all content in Hadoop 2. Dashboards and Reporting frameworks deliver periodic information to consumers 3. Breadth of data discovery / ad-hoc analysis tools can be brought to bear on all data in Hadoop Non-functional notes: • Lower query frequency • Full SQL query flexibility • Most queries / analysis yield value accessing large volumes of data (e.g. all events in the last 30 days – or 30 months) Applications Impala Dashboards Reports Ad-hoc Analysis
  • 41. The Key To Success: It is One System MongoDB App(s) Hadoop MapReduce
  • 43. #MongoDB Thank You Buzz Moschetti buzz.moschetti@mongodb.com