Scaling with MongoDB

Rick Copeland @rick446
Arborian Consulting, LLC

 Now a consultant, but formerly…

 Software engineer at SourceForge, early adopter of
MongoDB (version 0.8)

 Wrote the SQLAlchemy book (I love SQL when it’s
used well)

 Mainly write Python now, but have done C++, C#,
Java, JavaScript, VHDL, Verilog, …

 You can do it with an RDBMS as long as you…
 Don’t use joins
 Don’t use transactions
 Use read-only slaves
 Use memcached
 Denormalize your data
 Use custom sharding/partitioning
 Do a lot of vertical scaling
▪ (we’re going to need a bigger box)

 Use documents to improve locality

 Optimize your indexes

 Be aware of your working set

 Scaling your disks

 Replication for fault-tolerance and read scaling

 Sharding for read and write scaling

Relational (SQL)   MongoDB      Notes
Database           Database
Table              Collection   Dynamic typing
Index              Index        B-tree (range-based)
Row                Document     Think JSON
Column             Field        Primitive types + arrays, documents

{
  title: "Slides for Scaling with MongoDB",
  author: "Rick Copeland",
  date: ISODate("2012-02-29T19:30:00Z"),
  text: "My slides are available on speakerdeck.com",
  comments: [
    { author: "anonymous",
      date: ISODate("2012-02-29T19:30:01Z"),
      text: "Fristpsot!" },
    { author: "mark",
      date: ISODate("2012-02-29T19:45:23Z"),
      text: "Nice slides" } ] }
Embed comment data in blog post document

Seek = 5+ ms; read = really, really fast

[Diagram: disk layout of a Post, its Author, and five Comments]
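The locality argument can be put into rough numbers. The sketch below is only an estimate: it assumes every separately stored record costs one disk seek at the 5 ms figure from the slide, and that sequential reads are free by comparison. The function names are made up for illustration.

```python
SEEK_MS = 5.0  # "Seek = 5+ ms" from the slide, used here as a lower bound


def seeks_normalized(num_comments: int) -> int:
    """Post, author, and each comment are separate records.

    Assumed worst case: each record needs its own seek.
    """
    return 1 + 1 + num_comments


def seeks_embedded(num_comments: int) -> int:
    """Author and comments are embedded in the post document.

    One seek finds the document; the rest is a sequential read.
    """
    return 1


def latency_ms(seeks: int) -> float:
    """Seek time only; read time is treated as negligible."""
    return seeks * SEEK_MS


if __name__ == "__main__":
    for label, seeks in (("normalized", seeks_normalized(5)),
                         ("embedded", seeks_embedded(5))):
        print(f"{label}: {seeks} seeks, at least {latency_ms(seeks)} ms")
```

With five comments, the normalized layout needs seven seeks against one for the embedded document, which is the gap the diagram is illustrating.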

Find where x equals 7 (no index)

1 2 3 4 5 6 7

Looked at 7 objects

Find where x equals 7 (with index)

      4
   2     6
  1 3   5 7

Looked at 3 objects
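The two lookups above can be reproduced by counting how many values each strategy examines. This sketch uses a binary search over a sorted list as a stand-in for the index; MongoDB's indexes are B-trees, so treat it as an illustration of the idea rather than of the real data structure.

```python
def linear_scan(values, target):
    """Examine values in order; return how many were looked at."""
    looked_at = 0
    for value in values:
        looked_at += 1
        if value == target:
            break
    return looked_at


def tree_search(sorted_values, target):
    """Binary search; each probe corresponds to visiting one tree node."""
    lo, hi = 0, len(sorted_values) - 1
    looked_at = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        looked_at += 1
        if sorted_values[mid] == target:
            break
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return looked_at


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7]
    print("scan looked at", linear_scan(data, 7))
    print("tree looked at", tree_search(data, 7))
```

For the seven values on the slide, the search probes 4, then 6, then 7, matching the three nodes in the tree diagram.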

Only small
portion in
RAM

 Working set =
 sizeof(frequently used data)
 + sizeof(frequently used indexes)

 Right-aligned indexes reduce working set size

 Working set should fit in available RAM for best
performance

 Page faults are the biggest cause of performance
loss in MongoDB

> db.foo.stats()
{
  "ns" : "test.foo",
  "count" : 1338330,
  "size" : 46915928,                 // data size
  "avgObjSize" : 35.05557523181876,  // average doc size
  "storageSize" : 86092032,          // size on disk (or RAM!)
  "numExtents" : 12,
  "nindexes" : 2,
  "lastExtentSize" : 20872960,
  "paddingFactor" : 1,
  "flags" : 0,
  "totalIndexSize" : 99860480,       // size of all indexes
  "indexSizes" : {                   // size of each index
    "_id_" : 55877632,
    "x_1" : 43982848
  },
  "ok" : 1
}
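The stats output gives enough to bound the working set from above. The sketch below copies the relevant fields into a Python dict and adds data size to total index size; this is an upper bound because the real working set counts only the frequently used portion. The helper names and the RAM figures are assumptions for illustration.

```python
# Fields copied from the db.foo.stats() output above.
stats = {
    "count": 1338330,
    "size": 46915928,
    "storageSize": 86092032,
    "totalIndexSize": 99860480,
    "indexSizes": {"_id_": 55877632, "x_1": 43982848},
}


def working_set_upper_bound(stats: dict) -> int:
    """All data plus all indexes, in bytes.

    The true working set is the frequently used subset, so this is a ceiling.
    """
    return stats["size"] + stats["totalIndexSize"]


def fits_in_ram(stats: dict, ram_bytes: int) -> bool:
    """True if even the worst-case working set fits in the given RAM."""
    return working_set_upper_bound(stats) <= ram_bytes


if __name__ == "__main__":
    bound = working_set_upper_bound(stats)
    print(f"upper bound: {bound} bytes ({bound / 2**20:.1f} MiB)")
    print("fits in 1 GiB:", fits_in_ram(stats, 2**30))
    print("fits in 64 MiB:", fits_in_ram(stats, 64 * 2**20))
```

Note that for this collection the indexes are about twice the size of the data, so they dominate the bound.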

[Diagram: three disks, each labeled ~200 seeks / second]

 Faster, but less reliable

[Diagram: three disks, each labeled ~400 seeks / second]

 Faster and more reliable ($$$ though)

 Old and busted: master/slave replication

 The new hotness: replica sets with automatic
failover
[Diagram: the application reads from and writes to the Primary; reads may also go to two Secondaries]

 Primary handles all
writes

 Application optionally
sends reads to secondaries

 Heartbeat manages
automatic failover

 Special collection (the oplog) records operations
idempotently

 Secondaries read from primary oplog and replay
operations locally

 Space is preallocated and fixed for the oplog

{
  "ts" : Timestamp(1317653790000, 2),
  "h" : -6022751846629753359,
  "op" : "i",                        // insert
  "ns" : "confoo.People",            // collection name
  "o" : {                            // object to insert
    "_id" : ObjectId("4e89cd1e0364241932324269"),
    "first" : "Rick",
    "last" : "Copeland"
  }
}
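To see why idempotent entries matter, here is a minimal in-memory sketch of a secondary replaying oplog entries. It is not MongoDB's implementation. The insert entry mirrors the one above, with the ObjectId simplified to a string; the update entry, its `o2` selector, and the `visits` field are hypothetical additions used to show a `$set`-style change.

```python
import copy


def apply_op(collections: dict, entry: dict) -> None:
    """Apply one oplog-style entry to an in-memory {ns: {_id: doc}} store."""
    coll = collections.setdefault(entry["ns"], {})
    kind = entry["op"]
    if kind == "i":
        # Insert keyed by _id: replaying stores the same document again.
        doc = entry["o"]
        coll[doc["_id"]] = copy.deepcopy(doc)
    elif kind == "u":
        # Update records final values ($set), not deltas, so it is repeatable.
        target = coll.get(entry["o2"]["_id"])
        if target is not None:
            target.update(copy.deepcopy(entry["o"]["$set"]))
    elif kind == "d":
        # Deleting an already missing document is a no-op.
        coll.pop(entry["o"]["_id"], None)
    else:
        raise ValueError(f"unsupported op: {kind!r}")


def replay(entries, collections=None) -> dict:
    """Replay entries in order and return the resulting store."""
    collections = {} if collections is None else collections
    for entry in entries:
        apply_op(collections, entry)
    return collections


PERSON_ID = "4e89cd1e0364241932324269"
oplog = [
    {"op": "i", "ns": "confoo.People",
     "o": {"_id": PERSON_ID, "first": "Rick", "last": "Copeland"}},
    # Hypothetical update entry, not from the slides.
    {"op": "u", "ns": "confoo.People",
     "o2": {"_id": PERSON_ID}, "o": {"$set": {"visits": 3}}},
]

if __name__ == "__main__":
    once = replay(oplog)
    twice = replay(oplog, replay(oplog))
    print("same state after replaying twice:", once == twice)
```

Had the update been stored as "add 1 to visits", a second replay would change the result; storing the resulting value is what makes replay safe after a restart or a partial sync.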

 Use heartbeat signal to detect failure

 When primary can’t be reached, elect a new one

 Replica that’s the most up-to-date is chosen

 If there is skew, changes not on new primary are
saved to a .bson file for manual reconciliation

 Application can require data to be replicated to a
majority to ensure this doesn’t happen
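The failover rules above can be sketched as a toy model. This is a deliberate simplification: real elections also involve voting and priorities, and the `Member` class, the integer `last_op` counter standing in for an oplog position, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    last_op: int            # index of the last oplog entry this member applied
    reachable: bool = True  # result of the heartbeat check


def elect(members):
    """Pick the most up-to-date reachable member, or None if none respond."""
    candidates = [m for m in members if m.reachable]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m.last_op)


def unreplicated_ops(old_primary: Member, new_primary: Member):
    """Operations the old primary had that the new primary never saw.

    These are the ones set aside for manual reconciliation.
    """
    return list(range(new_primary.last_op + 1, old_primary.last_op + 1))


def majority_acknowledged(op_index: int, members) -> bool:
    """True if more than half of the members have applied op_index."""
    acks = sum(1 for m in members if m.last_op >= op_index)
    return acks > len(members) // 2


if __name__ == "__main__":
    a = Member("A", last_op=10, reachable=False)  # failed primary
    b = Member("B", last_op=9)
    c = Member("C", last_op=7)
    winner = elect([a, b, c])
    print("new primary:", winner.name)
    print("ops needing reconciliation:", unreplicated_ops(a, winner))
    print("op 10 on a majority:", majority_acknowledged(10, [a, b, c]))
    print("op 9 on a majority:", majority_acknowledged(9, [a, b, c]))
```

In this scenario operation 10 existed only on the failed primary, so it is the one that would be set aside; an application that waited for majority acknowledgment would never have treated it as committed.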

 Priority
 Slower nodes with lower priority
 Backup or read-only nodes to never be primary

 slaveDelay
 Fat-finger protection

 Data center awareness and tagging
 Application can ensure complex replication
guarantees
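As an illustration of these options, the following writes a replica set configuration as a plain Python dict. The field names (`priority`, `hidden`, `slaveDelay`, `tags`, `settings.getLastErrorModes`) follow the replica set configuration format of MongoDB releases from the period of this talk and may differ in later versions; the set name, host names, tag values, and the `multiDC` mode name are invented.

```python
# Hypothetical configuration; hosts, set name, and tags are made up.
replica_set_config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "db1.example.com:27017", "priority": 2,
         "tags": {"dc": "east"}},
        {"_id": 1, "host": "db2.example.com:27017", "priority": 1,
         "tags": {"dc": "east"}},
        # Slower node: still eligible, but less preferred.
        {"_id": 2, "host": "db3.example.com:27017", "priority": 0.5,
         "tags": {"dc": "west"}},
        # Backup node: priority 0 means it never becomes primary.
        {"_id": 3, "host": "db4.example.com:27017", "priority": 0,
         "tags": {"dc": "west"}},
        # Delayed node: stays one hour behind as fat-finger protection.
        {"_id": 4, "host": "db5.example.com:27017", "priority": 0,
         "hidden": True, "slaveDelay": 3600, "tags": {"dc": "west"}},
    ],
    # Named write mode: require acknowledgment from two distinct "dc" values.
    "settings": {"getLastErrorModes": {"multiDC": {"dc": 2}}},
}


def eligible_primaries(config: dict) -> list:
    """Hosts whose priority is above zero (an omitted priority counts as 1)."""
    return [m["host"] for m in config["members"] if m.get("priority", 1) > 0]


def delayed_members_are_safe(config: dict) -> bool:
    """A delayed member should have priority 0 so stale data is never primary."""
    return all(m.get("priority", 1) == 0
               for m in config["members"] if "slaveDelay" in m)


if __name__ == "__main__":
    print("can become primary:", eligible_primaries(replica_set_config))
    print("delayed members safe:", delayed_members_are_safe(replica_set_config))
```

The two helper functions are only sanity checks on the dict; they are not part of any driver API.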

 Reads scale nicely
 As long as the working set fits in RAM
 … and you don’t mind eventual consistency

 Sharding to the rescue!
 Automatically partitioned data sets
 Scale writes and reads
 Automatic load balancing between the shards

[Diagram: sharded cluster]
Routers: MongoS, MongoS
Configuration: Config 1, Config 2, Config 3

Shard     Key range   Replica set members
Shard 1   0..10       Primary, Secondary, Secondary
Shard 2   10..20      Primary, Secondary, Secondary
Shard 3   20..30      Primary, Secondary, Secondary
Shard 4   30..40      Primary, Secondary, Secondary
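Range-based routing of the kind shown in the diagram can be sketched with a sorted list of range boundaries. The shard names and ranges come from the diagram; treating each range as including its lower bound and excluding its upper bound is an assumption, and the function names are hypothetical. This is not how mongos is implemented, only the lookup idea.

```python
import bisect

SHARD_NAMES = ["Shard 1", "Shard 2", "Shard 3", "Shard 4"]
UPPER_BOUNDS = [10, 20, 30, 40]  # exclusive upper bound of each shard's range
MIN_KEY, MAX_KEY = 0, 40


def route_key(key: int) -> str:
    """Return the single shard that owns this shard-key value."""
    if not MIN_KEY <= key < MAX_KEY:
        raise ValueError(f"key {key} is outside the sharded range")
    return SHARD_NAMES[bisect.bisect_right(UPPER_BOUNDS, key)]


def route_range(lo: int, hi: int) -> list:
    """Return the shards whose ranges overlap the half-open range [lo, hi)."""
    if not (MIN_KEY <= lo < hi <= MAX_KEY):
        raise ValueError("range must satisfy 0 <= lo < hi <= 40")
    first = bisect.bisect_right(UPPER_BOUNDS, lo)
    last = bisect.bisect_left(UPPER_BOUNDS, hi)
    return SHARD_NAMES[first:last + 1]


def route_without_shard_key() -> list:
    """A query that does not include the shard key must ask every shard."""
    return list(SHARD_NAMES)


if __name__ == "__main__":
    print("key 10 ->", route_key(10))
    print("keys 5..25 ->", route_range(5, 25))
    print("no shard key ->", route_without_shard_key())
```

A query on a single key or a narrow range touches one shard, while a query that omits the shard key fans out to all four.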

 Sharding is per-collection and range-based

 The highest-impact choice (and hardest to
change decision) you make is the shard key
 Random keys: good for writes, bad for reads
 Right-aligned index: bad for writes
 Small # of discrete keys: very bad
 Ideal: balance writes, make reads routable by mongos
 Optimal shard key selection is hard
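The trade-offs in this list can be seen by counting where writes land under different key choices. The sketch below assumes four fixed ranges of width 10 with an open-ended last range, and it ignores chunk splitting and balancing, so it shows only the immediate write pattern. The key generators are invented for illustration.

```python
import random
from collections import Counter

RANGE_WIDTH = 10
NUM_SHARDS = 4


def shard_for(key: int) -> int:
    """Map a key to a shard index; the last shard's range is open-ended."""
    return min(max(key, 0) // RANGE_WIDTH, NUM_SHARDS - 1)


def write_distribution(keys) -> Counter:
    """Count how many writes each shard index receives."""
    return Counter(shard_for(k) for k in keys)


rng = random.Random(42)  # fixed seed so the run is repeatable

# Random keys: writes spread over every shard.
random_keys = [rng.randrange(0, 40) for _ in range(1000)]

# Right-aligned (ever-increasing) keys, such as timestamps:
# every new write falls into the last range.
increasing_keys = list(range(1000, 2000))

# Only two distinct key values: at most two shards can ever hold data.
low_cardinality_keys = [rng.choice([5, 15]) for _ in range(1000)]

if __name__ == "__main__":
    for label, keys in (("random", random_keys),
                        ("increasing", increasing_keys),
                        ("low cardinality", low_cardinality_keys)):
        print(label, dict(sorted(write_distribution(keys).items())))
```

Random keys balance writes but leave related documents scattered, so range reads hit every shard. Increasing keys concentrate all inserts on one shard, and a key with few distinct values caps how far the data can ever be spread.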

[Diagram: Primary Data Center and Secondary Data Center; three Shard 1 members with priorities 1, 1, and 0; a replica set labeled RS3; config servers Config 1, Config 2, and Config 3]

 Writes and reads both scale (with good choice of
shard key)

 Reads scale while remaining strongly consistent

 Partitioning ensures you get more usable RAM

 Pitfall: don’t wait too long to add capacity
