#MongoSV 2012




Schema Design
-- Inboxes!
Jared Rosoff
Technical Director, 10gen
@forjared
Agenda
• Problem overview
• Design Options
  – Fan out on Read
  – Fan out on Write
  – Fan out on Write with Bucketing

• Conclusions




Problem Overview
Let’s get
Social
Sending Messages



               ?
Reading my Inbox



                   ?
Design Options
3 Approaches (there are more)
• Fan out on Read
• Fan out on Write
• Fan out on Write with Bucketing
Fan out on read
• Generally, not the right approach
• 1 document per message sent
• Multiple recipients in an array key
• Reading an inbox is finding all messages with
 my own name in the recipient field
• Requires scatter-gather on sharded cluster
• Then a lot of random IO on a shard to find
 everything
Fan out on Read
// Shard on "from"
sh.shardCollection("myapp.messages", { from: 1 })

// Make sure we have an index to handle inbox reads
db.messages.createIndex({ to: 1, sent: 1 })

msg = {
   from: "Joe",
   to: [ "Bob", "Jane" ],
   sent: new Date(),
   message: "Hi!"
}

// Send a message
db.messages.insertOne(msg)

// Read my inbox (matches any message whose "to" array contains "Joe")
db.messages.find({ to: "Joe" }).sort({ sent: -1 })
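Why the inbox read becomes scatter-gather can be sketched without a cluster. The following is a minimal in-memory simulation, not MongoDB code: `readShards`, `shardFor`, `sendFanOutOnRead` and `readInboxFanOutOnRead` are hypothetical names, and the toy hash stands in for real shard routing.

```javascript
// In-memory stand-in for a 3-shard cluster (assumption: toy routing only).
const NUM_SHARDS = 3;
const readShards = Array.from({ length: NUM_SHARDS }, () => []);

// Toy hash on the shard key value; not MongoDB's real chunk logic.
function shardFor(key) {
  let sum = 0;
  for (const ch of key) sum += ch.charCodeAt(0);
  return sum % NUM_SHARDS;
}

// Send: one document, written to the one shard chosen by the sender ("from").
function sendFanOutOnRead(msg) {
  readShards[shardFor(msg.from)].push(msg);
  return 1; // number of writes
}

// Read: the recipient is not the shard key, so every shard must be queried.
function readInboxFanOutOnRead(user) {
  let shardsQueried = 0;
  const results = [];
  for (const shard of readShards) {
    shardsQueried += 1;
    results.push(...shard.filter((m) => m.to.includes(user)));
  }
  results.sort((a, b) => b.sent - a.sent); // newest first
  return { results, shardsQueried };
}

sendFanOutOnRead({ from: "Joe", to: ["Bob", "Jane"], sent: new Date(1), message: "Hi!" });
sendFanOutOnRead({ from: "Ann", to: ["Bob"], sent: new Date(2), message: "Hello" });

const bobInbox = readInboxFanOutOnRead("Bob");
console.log(bobInbox.results.map((m) => m.message), bobInbox.shardsQueried);
```

Each send costs a single write, but the read has to visit all three shards regardless of where Bob's messages actually live.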
Fan out on read – Send Message
[Diagram: Send Message; Shard 1, Shard 2, Shard 3]
Fan out on read – Inbox Read
[Diagram: Read Inbox; Shard 1, Shard 2, Shard 3]
Fan out on write
• Tends to scale better than fan out on read
• 1 document per recipient
• Reading my inbox is just finding all of the
 messages with me as the recipient
• Can shard on recipient, so inbox reads hit one
 shard
• But still lots of random IO on the shard
Fan out on Write
// Shard on "recipient" and "sent"
sh.shardCollection("myapp.messages", { recipient: 1, sent: 1 })

msg = {
   from: "Joe",
   to: [ "Bob", "Jane" ],
   sent: new Date(),
   message: "Hi!"
}

// Send a message: insert one copy per recipient
// (iterate the values of msg.to; for...in would yield array indexes)
msg.to.forEach(function (recipient) {
     var copy = Object.assign({}, msg, { recipient: recipient });
     db.messages.insertOne(copy);
});

// Read my inbox
db.messages.find({ recipient: "Joe" }).sort({ sent: -1 })
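The same kind of in-memory sketch shows the shift in cost: more writes per send, but a read that touches one shard. This is a simulation under stated assumptions, not MongoDB code; `writeShards`, `shardForRecipient`, `sendFanOutOnWrite` and `readInboxFanOutOnWrite` are hypothetical names.

```javascript
// In-memory stand-in for a 3-shard cluster (assumption: toy routing only).
const NUM_WRITE_SHARDS = 3;
const writeShards = Array.from({ length: NUM_WRITE_SHARDS }, () => []);

// Toy hash on the recipient name; not MongoDB's real chunk logic.
function shardForRecipient(name) {
  let sum = 0;
  for (const ch of name) sum += ch.charCodeAt(0);
  return sum % NUM_WRITE_SHARDS;
}

// Send: one copy per recipient, each routed by its own recipient value.
function sendFanOutOnWrite(msg) {
  for (const recipient of msg.to) {
    const copy = { ...msg, recipient };
    writeShards[shardForRecipient(recipient)].push(copy);
  }
  return msg.to.length; // number of writes
}

// Read: the recipient leads the shard key, so only one shard is consulted.
function readInboxFanOutOnWrite(user) {
  const shard = writeShards[shardForRecipient(user)];
  return shard
    .filter((m) => m.recipient === user)
    .sort((a, b) => b.sent - a.sent); // newest first
}

const writes = sendFanOutOnWrite({
  from: "Joe", to: ["Bob", "Jane"], sent: new Date(1), message: "Hi!",
});
const janeInbox = readInboxFanOutOnWrite("Jane");
console.log(writes, janeInbox.length);
```

The read still filters individual message documents on that shard, which corresponds to the random IO the slide warns about.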
Fan out on write – Send Message
[Diagram: Send Message; Shard 1, Shard 2, Shard 3]
Fan out on write – Read Inbox
[Diagram: Read Inbox; Shard 1, Shard 2, Shard 3]
Fan out on write with bucketing
• Generally the best approach
• Each “inbox” document is an array of messages
• Append a message onto “inbox” of recipient
• Bucket inbox documents so there are not too many
 messages per document
• Can shard on recipient, so inbox reads hit one
 shard
• 1 or 2 documents to read the whole inbox
Fan out on Write with Bucketing
// Shard on "owner / sequence"
sh.shardCollection("myapp.inbox", { owner: 1, sequence: 1 })
sh.shardCollection("myapp.users", { user_name: 1 })
msg = {
     from: "Joe",
     to: [ "Bob", "Jane" ],
     sent: new Date(),
     message: "Hi!"
}
// Send a message
msg.to.forEach(function (recipient) {
     // Atomically bump the recipient's message counter and read the new value
     var count = db.users.findAndModify({
           query: { user_name: recipient },
           update: { $inc: { msg_count: 1 } },
           upsert: true,
           new: true }).msg_count;
     // 50 messages per bucket; floor so the sequence is an integer
     var sequence = Math.floor(count / 50);
     db.inbox.updateOne({ owner: recipient, sequence: sequence },
                        { $push: { messages: msg } },
                        { upsert: true });
});
// Read my inbox
db.inbox.find({ owner: "Joe" }).sort({ sequence: -1 }).limit(2)
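The bucket arithmetic is easy to check in isolation. Below is a minimal in-memory sketch that mirrors the counter-then-append steps; `msgCounts`, `inboxBuckets`, `bucketFor` and `appendToInbox` are hypothetical stand-ins for the `users` and `inbox` collections, and the sequence is taken as the floor of the post-increment count divided by 50.

```javascript
const BUCKET_SIZE = 50; // the constant used on the slide

// The count passed in is the value *after* the increment.
function bucketFor(msgCountAfterIncrement) {
  return Math.floor(msgCountAfterIncrement / BUCKET_SIZE);
}

// In-memory stand-ins for the two collections (assumption, not MongoDB).
const msgCounts = new Map();    // user_name -> msg_count
const inboxBuckets = new Map(); // "owner/sequence" -> array of messages

function appendToInbox(owner, msg) {
  const count = (msgCounts.get(owner) || 0) + 1; // like $inc with new: true
  msgCounts.set(owner, count);
  const key = `${owner}/${bucketFor(count)}`;
  if (!inboxBuckets.has(key)) inboxBuckets.set(key, []); // like upsert: true
  inboxBuckets.get(key).push(msg);
}

for (let i = 1; i <= 120; i++) {
  appendToInbox("Bob", { message: `msg ${i}` });
}

const bucketSizes = [...inboxBuckets.entries()].map(([k, v]) => [k, v.length]);
console.log(bucketSizes);
// → [ [ 'Bob/0', 49 ], [ 'Bob/1', 50 ], [ 'Bob/2', 21 ] ]
```

Because the counter is read after the increment, the first bucket holds 49 messages rather than 50; subtracting 1 before dividing would fill it completely, if that matters for a given application.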
Bucketed fan out on write - Send
[Diagram: Send Message; Shard 1, Shard 2, Shard 3]
Bucketed fan out on write - Read
[Diagram: Read Inbox; Shard 1, Shard 2, Shard 3]
Discussion
Tradeoffs
                 Fan out on Read        Fan out on Write      Bucketed Fan out on Write
Send Message     Best                   Good                  Worst
Performance      Single shard           Shard per recipient   Shard per recipient
                 Single write           Multiple writes       Appends (grows)
Read Inbox       Worst                  Good                  Best
Performance      Broadcast all shards   Single shard          Single shard
                 Random reads           Random reads          Single read
Data Size        Best                   Worst                 Worst
                 Message stored once    Copy per recipient    Copy per recipient
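The table can be turned into rough numbers. The sketch below is a back-of-the-envelope model, not a benchmark: `tradeoffs` and its parameter names are hypothetical, it counts operations only, and it assumes the bucketed design uses a counter update plus an append per recipient with the sequence taken as floor(count / bucketSize).

```javascript
// Rough operation counts for one send and one full inbox read.
// Assumes inboxSize >= 1; ignores IO locality, document growth, and caching.
function tradeoffs({ recipients, shards, inboxSize, bucketSize }) {
  return {
    fanOutOnRead: {
      writesPerSend: 1,          // message stored once
      shardsPerRead: shards,     // broadcast to all shards
      docsPerRead: inboxSize,    // one document per message
      copiesStored: 1,
    },
    fanOutOnWrite: {
      writesPerSend: recipients, // one copy per recipient
      shardsPerRead: 1,
      docsPerRead: inboxSize,
      copiesStored: recipients,
    },
    bucketedFanOutOnWrite: {
      writesPerSend: 2 * recipients, // counter increment + append, per recipient
      shardsPerRead: 1,
      // Buckets 0..floor(inboxSize / bucketSize) exist under that scheme.
      docsPerRead: Math.floor(inboxSize / bucketSize) + 1,
      copiesStored: recipients,
    },
  };
}

const estimate = tradeoffs({ recipients: 3, shards: 3, inboxSize: 120, bucketSize: 50 });
console.log(estimate);
```

With these inputs a full read of a 120-message inbox touches 120 documents in the first two designs and 3 bucket documents in the third; reading only the latest bucket or two, as the `limit(2)` query does, is what keeps the bucketed read to 1 or 2 documents.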
Things to consider
•   Lots of recipients
     •   Fan out on write might become prohibitive
     •   Consider introducing a “Group”

•   Very large message size
     •   Multiple copies of messages can be a burden
     •   Consider single copy of message with a “pointer” per inbox

•   More writes than reads
     •   Fan out on read might be okay
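The "pointer" idea for very large messages can be sketched as follows: the body is stored once and each inbox holds only a small reference. This is an in-memory illustration under stated assumptions; `messageStore`, `pointerInboxes`, `sendWithPointers` and `readWithPointers` are hypothetical names, not part of the talk.

```javascript
// Single copy of each message body, keyed by a generated id.
const messageStore = new Map();   // messageId -> message
// Per-recipient inbox holding small pointers instead of full copies.
const pointerInboxes = new Map(); // owner -> [{ messageId, sent }]
let nextMessageId = 1;

function sendWithPointers(msg) {
  const messageId = nextMessageId++;
  messageStore.set(messageId, msg); // body stored once
  for (const recipient of msg.to) {
    if (!pointerInboxes.has(recipient)) pointerInboxes.set(recipient, []);
    pointerInboxes.get(recipient).push({ messageId, sent: msg.sent });
  }
  return messageId;
}

function readWithPointers(owner) {
  const pointers = pointerInboxes.get(owner) || [];
  return pointers
    .slice()
    .sort((a, b) => b.sent - a.sent)            // newest first
    .map((p) => messageStore.get(p.messageId)); // second lookup per pointer
}

const bigBody = "x".repeat(100000);
sendWithPointers({ from: "Joe", to: ["Bob", "Jane"], sent: new Date(1), message: bigBody });
const bobMessages = readWithPointers("Bob");
console.log(messageStore.size, bobMessages.length);
```

The saving in storage comes at the price of an extra lookup for each message body on read, so this trades some of the read locality of fan out on write for smaller inboxes.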
Comments – where do they live?
Conclusion
Summary
• Multiple ways to model status updates
• Bucketed fan out on write is typically the better
 approach
• Think about how your model distributes across
 shards
• Think about how much random IO needs to
 happen on a shard
#MongoSV




Thank You
Jared Rosoff
Technical Director, 10gen
