Mongo db improve the performance of your application codemotion2016

MADRID · NOV 18-19 · 2016
19th November 2016
18:30-19:15
MongoDB:
Improve the performance
of your application
by @juanroycouto
@codemotion_es
1

MADRID · NOV 18-19 · 2016
#whoami
❖ @juanroycouto
❖ www.juanroy.es
❖ www.mongodbspain.com/en/author/juanroy
❖ juanroycouto@gmail.com
❖ MongoDB DBA at Grupo Undanet
2

MADRID · NOV 18-19 · 2016
#Agenda
3
● Data Model
● Application Patterns
● .explain()
● Indexes
● Storage Engines
● Query Optimizer
● Aggregation Framework
● Profiling
● Tuning
● Sharding
● Storage & Compression

MADRID · NOV 18-19 · 2016
#Data Model
4

MADRID · NOV 18-19 · 2016
#Data Model
5
● All the information in one document:
○ No joins
○ Single read
○ Higher performance
○ ACID compliance at the document level
● MMAPv1: avoid unbounded document growth
● WiredTiger rewrites the document for each update

MADRID · NOV 18-19 · 2016
#Application Patterns
6
● Modify only those fields that have changed
● Project only the fields you need
● Avoid negation in queries
● Use Covered Queries when possible
● Avoid scatter-gather queries
● Write Concern:
○ w : 0 → faster than w : all

MADRID · NOV 18-19 · 2016
#.explain()
7
● Use .explain() to get the time a query needs
● Verbose options:
○ queryPlanner (default)
○ executionStats (stats for the winner plan)
○ allPlansExecution (stats for all candidate
plans)

MADRID · NOV 18-19 · 2016
#.explain()
8
● In-memory sorting data is limited to 32MB (you
need an index)
● My index selectivity is the relationship between
nReturned and totalKeysExamined

MADRID · NOV 18-19 · 2016
#Indexes
9

MADRID · NOV 18-19 · 2016
#Indexes
10
● Improve the performance of
read-intensive apps
● Overhead:
○ Write operations
○ Disk usage
○ Memory consumption

MADRID · NOV 18-19 · 2016
● Working Set is made of:
○ Indexes
○ Your most frequently accessed data
#Indexes-Working Set
11

MADRID · NOV 18-19 · 2016
● If your Working Set does not fit into RAM:
○ Increment your RAM
○ Shard the database
○ Avoid scan all documents in the collection
■ Page faults
■ You will fill in your Working Set with
infrequent accessed data
#Indexes-Working Set
12

MADRID · NOV 18-19 · 2016
#Indexes-Before creating indexes
13
● Understand your data
● How your queries will be
● Create as few indexes as possible
● Put your highly selective fields first in the index
● Check your current indexes before creating new
ones

MADRID · NOV 18-19 · 2016
#Indexes-Advices
14
● Do not index a little cardinality field
● Check your index performance
● Monitor your system to check if:
○ Your strategy is still up to date
○ You need new indexes
○ You can remove unnecessary indexes

MADRID · NOV 18-19 · 2016
#Indexes-Compound
15
● A compound index has better performance than:
○ Single indexes
○ Intersection
● Remove indexes that are prefixes of other
● The value of the combined fields must have high
cardinality

MADRID · NOV 18-19 · 2016
#Indexes-Compound
16
>db.collection.createIndex( { equality, sort, range } );
>db.collection.find(
{
book_author : ‘John’, #equality
book_id : { $gt : 54 } #range
} )
.sort( { book_price : 1 } ); #sort

MADRID · NOV 18-19 · 2016
> use test
> db.example.createIndex( { a : 1, b : 1 } );
> db.example.explain("executionStats").find( { a : 56, b : 87 } );
"nReturned" : 100,
"executionTimeMillis" : 2,
"totalKeysExamined" : 100,
"totalDocsExamined" : 100,
> db.example.dropIndex( { a : 1, b : 1 } );
> db.example.createIndex( { a : 1 } );
> db.example.createIndex( { b : 1 } );
"nReturned" : 100,
"totalDocsExamined" : 10000
#Indexes-Compound example
17

MADRID · NOV 18-19 · 2016
#Indexes-Covered Queries
18
● They return results from the indexes
● A covered query does not read from the
document level:
○ ndocsExamined : 0
● In the index must be all the fields:
○ Included in the query
○ Returned by the query

MADRID · NOV 18-19 · 2016
#Indexes-Covered Queries example
19
> use test
> db.example.createIndex( { a : 1, b : 1 } );
"nReturned" : 100,
> db.example.explain("executionStats").find( { a : 56, b : 87 }, {
_id : 0, a : 1, b : 1 } );
"nReturned" : 100,

MADRID · NOV 18-19 · 2016
#Indexes-TTL
20
● TTL (TimeToLive) Indexes
○ Allow the user to specify a period of time after
which the data will be automatically deleted

MADRID · NOV 18-19 · 2016
#Indexes-TTL example
21
> use test
> db.timetolive.createIndex( { date : 1 } , { expireAfterSeconds: 20 } );
> db.timetolive.insert( { date : new Date() } );
> db.timetolive.find().pretty()
{
"_id" : ObjectId("581fb43f5626d57bcf66fe73"),
"date" : ISODate("2016-11-06T22:52:47.373Z")
}
> db.timetolive.count()
0
>
The background task that removes expired documents
runs every 60 seconds.

MADRID · NOV 18-19 · 2016
#Indexes-Partial
22
● Partial Indexes
○ Allow to index a subset of your data by
specifying a filtering expression during the
index creation
○ Better query performance while reducing
system overhead

MADRID · NOV 18-19 · 2016
#Indexes-Partial example
> use test
> db.sysprofile.createIndex( { ns : 1, op : 1 }, {
partialFilterExpression : { millis : { $gt : 10000 } } } );
> db.sysprofile.explain().find( { ns : 'school2.students', op :
'query' }, { millis : 1 } );
"winningPlan" : {
"inputStage" : {
"stage" : "COLLSCAN"
> db.sysprofile.explain().find( { ns : 'school2.students', op :
'query', millis : { $gte : 20000 } }, { millis : 1 } );
"winningPlan" : {
"inputStage" : {
"stage" : "IXSCAN"
23

MADRID · NOV 18-19 · 2016
#Indexes-Sparse
24
● Sparse Indexes
○ Only contain entries for documents that
contain a specific field
○ Smaller and more efficient indexes

MADRID · NOV 18-19 · 2016
> use test
> db.test.createIndex( { b : 1 } , { sparse : true } );
> db.test.stats()
16384
>
> db.test.createIndex( { b : 1 } );
> db.test.stats()
2580480
>
#Indexes-Sparse example
25

MADRID · NOV 18-19 · 2016
#Storage Engines-WiredTiger
26
● Reduced storage space and higher I/O scalability
● It compresses indexes into RAM, freeing up
Working Set for documents
● Performance (granular concurrency control)
● Native compression:
○ per collection
○ per index

MADRID · NOV 18-19 · 2016
#Storage Engines-WiredTiger
27
● Types of compression:
○ Snappy (default): Data and Journal
○ zlib: maximum compression
○ Prefix: for indexes

MADRID · NOV 18-19 · 2016
#Storage Engines: In-Memory
28
● Extreme performance coupled with real time
analytics for the most demanding,
latency-sensitive applications
● Delivers the extreme throughput and predictable
latency
● Eliminates the need for separate caching layers

MADRID · NOV 18-19 · 2016
#Query Optimizer
29
● It selects the best index to use
● It periodically runs alternate query plans and
selects the index with the best response time for
each query type
● The results of these tests are stored as a cached
query plan and are updated periodically

MADRID · NOV 18-19 · 2016
#Aggregation Framework
30

MADRID · NOV 18-19 · 2016
#Aggregation Framework
31
● Use $indexStats to determine how frequently
your index is used
● Each shard has its own statistics
● Stats are reset when you restart your mongod
service
>db.collection.aggregate([
{ $indexStats : {} }
])

MADRID · NOV 18-19 · 2016
#Aggregation Framework example
32
> db.test.aggregate([ { $indexStats : {} } ]).pretty()
{
"name" : "b_1",
"key" : {
"b" : 1
},
"host" : "hostname:27017",
"accesses" : {
"ops" : NumberLong(4),
"since" : ISODate("2016-11-07T00:05:39.771Z")
}
}
...

MADRID · NOV 18-19 · 2016
#Tools-Compass
33

MADRID · NOV 18-19 · 2016
#Tools-Compass
34
● MongoDB Compass
○ Analyze the size and usage of the indexes
○ It offers visual plan explanations
○ Index creation
○ Real time stats
○ Identify and fix performance and data problems

MADRID · NOV 18-19 · 2016
#Tools-Ops Manager
35
● The Visual Query Profiler allows to analyze a
query performance and recommends on the
addition of indexes
● It can be used to visualize output from the
profiler when identifying slow queries

MADRID · NOV 18-19 · 2016
#Tools-Cloud Manager & Atlas
36
● Use the Query Targeting chart

MADRID · NOV 18-19 · 2016
#Tools-Profiling
37
● Guess the slowest queries for improving their
indexes
● Log information for all events or only those whose
duration exceeds a configurable threshold (100ms
default)

MADRID · NOV 18-19 · 2016
#Tools-Profiling example
38
> db.setProfilingLevel(2)
{ "was" : 0, "slowms" : 100, "ok" : 1 }
> db.example.find( { c : 98 } );
> db.system.profile.find().pretty()
{
"op" : "query",
"ns" : "test.example",
"query" : {
"find" : "example",
"filter" : {
"c" : 98
}
},
…
"millis" : 6

MADRID · NOV 18-19 · 2016
#Tuning-SSD
39
● Most disk access patterns are not sequential
● Data files benefit from SSDs
● MongoDB journal files have high sequential write
patterns (good for spinning disks)
● RAM/SSDs better than spinning drives
● Use high performance storage
● Avoid networked storage

MADRID · NOV 18-19 · 2016
#Tuning-CPU
40
● MongoDB has better performance on faster CPUs
● The MongoDB WiredTiger Storage Engine is better
able to saturate multi-core processor resources than
the MMAPv1 Storage Engine

MADRID · NOV 18-19 · 2016
#Tuning-Thread
41
● The WiredTiger Storage Engine is multi-threaded
and can take advantage of many CPU Cores

MADRID · NOV 18-19 · 2016
#Sharding
42
● The balancer has been moved from the mongos to
the primary member of the config server
● Sharding improves the performance of reads
● Benefits (even with a ‘poor’ shard key and
scatter-gather):
○ A larger amount of memory to store indexes
○ Parallelize queries across shards reducing latency

MADRID · NOV 18-19 · 2016
#Storage & Compression types
43
● Balance query latency with storage density and
cost by assigning data sets based on a value
such as a timestamp to specific storage devices

MADRID · NOV 18-19 · 2016
#Storage & Compression types
44
● Recent, frequently accessed data can be assigned to
high performance SSDs with Snappy compression
● Older, less frequently accessed data is tagged to
lower-throughput hard disk drives compressed with
zlib to attain maximum storage density with a lower
cost-per-bit
● As data ages, MongoDB will automatically migrates it
between shards

MADRID · NOV 18-19 · 2016
#Questions?
45
Questions?

MongoDB:
Improve the performance of your application
by @juanroycouto
MADRID · NOV 18-19 · 2016
46
Thank you for your attention!

Mongo db improve the performance of your application codemotion2016

Recommended

More Related Content

What's hot (20)

Similar to Mongo db improve the performance of your application codemotion2016 (20)

Recently uploaded (20)

Mongo db improve the performance of your application codemotion2016