Modeling Data in MongoDB
   Luke Ehresman

   http://copperegg.com
Schema Design

Wait, isn’t MongoDB schemaless?

Nope! (It just has no predefined schema.)

That means the schema is up to your application.
Schema Design (Relational)

• Tabular data - Tables, Rows, Columns
• Normalized - flatten your data
• Columns with simple values (int, varchar)
• Relate rows with foreign key references
• Reuse, don’t repeat (e.g. person)
• Indexes on values
Schema Design (MongoDB - Non-Relational)

• Databases > Collections > Documents
• Simple or complex values (ints, strings, objects, arrays)
• Documents are monolithic units
• Embedded complex data structures
• No joins - repeat data for faster access
• Difficult to relate documents together
How will you use it?
• The best way to use MongoDB is to tailor
  your schema to how it will be used
• Things to consider:
 • minimize reads and/or writes
 • more reads, fewer writes? (read heavy)
 • more writes, fewer reads? (write heavy)
How will you use it?
• Combine objects into one document if you
  will use them together.
• Example: Authors and Books
• Separate them if they need to be used
  separately -- but beware, no joins!
• Or duplicate the data -- but beware, every copy must be kept in sync!
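
The Authors and Books choice can be sketched with plain JavaScript objects. This is only an illustration: the document shapes, the field names (books, author_id) and the helper booksWithAuthorName are assumptions, not taken from the slides.

```javascript
// Hypothetical documents; names and fields are illustrative only.

// Option 1: embed the books in the author document (one read gets it all).
const embeddedAuthor = {
  _id: 'author1',
  name: 'Jane Doe',
  books: [
    { title: 'Book A', year: 2010 },
    { title: 'Book B', year: 2012 },
  ],
};

// Option 2: separate collections, related by an author_id field.
const authors = [{ _id: 'author1', name: 'Jane Doe' }];
const books = [
  { _id: 1, title: 'Book A', author_id: 'author1' },
  { _id: 2, title: 'Book B', author_id: 'author1' },
  { _id: 3, title: 'Book C', author_id: 'missing' },
];

// With separate collections the application performs the "join" itself
// and has to decide what to do when a reference points nowhere.
function booksWithAuthorName(books, authors) {
  const byId = new Map(authors.map((a) => [a._id, a]));
  return books.map((b) => {
    const author = byId.get(b.author_id);
    return { title: b.title, author: author ? author.name : null };
  });
}

const joined = booksWithAuthorName(books, authors);
```

The embedded form suits "show an author with all their books"; the separate form suits listing books on their own, at the cost of a second lookup and a possible dangling reference.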
Precompute!
• Philosophy: do work before reads occur
• Disk space is cheap - compute time is not
     (it’s expensive because users wait)
• Do joins on write, not on read
• Do complex aggregation ahead of time
• Optimize for specific use cases
• Delayed data is not always bad in real life
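
A minimal in-memory sketch of the precompute idea, assuming a page-visit counter as the use case. The arrays and maps stand in for collections, and every name here (rawVisits, visitTotals, recordVisit) is hypothetical.

```javascript
// In-memory stand-ins for two collections (hypothetical names).
const rawVisits = [];           // one entry per page view
const visitTotals = new Map();  // page -> precomputed count

// Write path: store the event and update the precomputed total.
function recordVisit(page, user) {
  rawVisits.push({ page, user });
  visitTotals.set(page, (visitTotals.get(page) || 0) + 1);
}

// Read path without precomputation: scans every raw event.
function countOnRead(page) {
  return rawVisits.filter((v) => v.page === page).length;
}

// Read path with precomputation: a single lookup.
function countPrecomputed(page) {
  return visitTotals.get(page) || 0;
}

recordVisit('/', 'lehresman');
recordVisit('/', 'billybob');
recordVisit('/profile', 'lehresman');
```

Both read paths return the same number; the difference is that the precomputed one does its work at write time, so the user waiting on the read pays for a lookup rather than a scan.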
Aggregation

• Application
• MapReduce (BEWARE!)
• Group
• Aggregation framework (coming in 2.2)
Atomicity

• MongoDB does have atomic transactions
• Scope is a single document
• Keep this in mind when designing schemas
Atomicity

• $inc
• $push
• $addToSet
• upsert (create-if-none-else-update)
Atomicity
• Upsert example
  db.stats.update({_id: 'lehresman'},
     {$inc: {logins: 1},
      $set: {last_login: new Date()}},
     true);  // third argument: upsert

• After the first call:  {_id: 'lehresman', logins: 1, last_login: A}
• After the second call: {_id: 'lehresman', logins: 2, last_login: B}
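
Why two calls produce those two documents can be shown with a toy in-memory model. This is a sketch of the semantics only, not MongoDB code: the upsert function below is hypothetical, handles just $inc and $set on top-level fields, and uses the strings 'A' and 'B' in place of real dates.

```javascript
// Toy model of update-with-upsert for $inc and $set on top-level fields.
function upsert(collection, id, update) {
  let doc = collection.get(id);
  if (!doc) {
    doc = { _id: id };          // create-if-none
    collection.set(id, doc);
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount;  // a missing field starts at 0
  }
  Object.assign(doc, update.$set || {});      // overwrite the $set fields
  return doc;
}

const stats = new Map();
upsert(stats, 'lehresman', { $inc: { logins: 1 }, $set: { last_login: 'A' } });
upsert(stats, 'lehresman', { $inc: { logins: 1 }, $set: { last_login: 'B' } });
```

The first call finds nothing and creates the document with logins at 1; the second finds it and increments. In MongoDB the whole change to that one document is applied atomically, which this single-threaded model does not need to demonstrate.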
Example: Books (Bad NoSQL Example!!)

• Many books
• Many authors
• Authors write many books
Example: User Stats


• You have users
• Track what pages they visit
Example: User Stats
“users” collection
{ _id: 'lehresman',
  first_name: 'Luke',
  last_name: 'Ehresman',
  page_visits: {
    '/': 78,
    '/profile': 33,
    '/blog/38919': 2
  }
}

Problem: What if you want aggregate stats across users?
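
With this schema, cross-user totals have to be computed by walking every user document. A rough application-side sketch, assuming the documents have already been loaded into an array; the second user's counts and the helper totalVisitsByPage are made up for illustration.

```javascript
// Sample documents shaped like the "users" collection.
// The billybob counts are invented for this example.
const users = [
  { _id: 'lehresman', page_visits: { '/': 78, '/profile': 33, '/blog/38919': 2 } },
  { _id: 'billybob', page_visits: { '/': 5, '/profile': 1 } },
];

// Sums page_visits across all users; touches every document.
function totalVisitsByPage(users) {
  const totals = {};
  for (const user of users) {
    for (const [page, count] of Object.entries(user.page_visits || {})) {
      totals[page] = (totals[page] || 0) + count;
    }
  }
  return totals;
}

const totals = totalVisitsByPage(users);
```

The cost grows with the number of users, which is the motivation for keeping a separate per-page document instead.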
Example: User Stats
“visits” collection
{ _id: '/',
  visits: 73889 }

{ _id: '/profile',
  visits: 9341 }

{ _id: '/blog/38919',
  visits: 1678 }

Problems: No user tracking;
What if you want aggregate stats by day?
Example: User Stats
“visits” collection
{ _id: '/',
  visits: 73889,
  daily: {                 // field name assumed; the slide omits it
    '2012-06-01': 839,
    '2012-06-02': 767,
    '2012-06-03': 881 }
}

Problems: No user tracking;
Possibly too large eventually.
Always grows.
Example: User Stats
“visits” collection
{ date: '2012-06-01',
  page: '/',
  visits: 839,
  users: {
    'lehresman': 78,
    'billybob': 761
  }
}

No relational integrity.
(up to your application to handle null cases)
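
One way a page view might be recorded into this per-day, per-page document is a single upsert with $inc. The sketch below only builds the arguments so it can run without a database; visitUpdate is a hypothetical helper, and the shell call in the comment assumes the same update(query, update, upsert) form used in the upsert example.

```javascript
// Builds the arguments for an upsert that counts one visit.
// visitUpdate is a hypothetical helper, not part of any MongoDB API.
function visitUpdate(user, page, date) {
  return {
    query: { date: date, page: page },
    // Increment the page total and this user's counter in one update.
    update: { $inc: { visits: 1, ['users.' + user]: 1 } },
    upsert: true,
  };
}

const u = visitUpdate('lehresman', '/', '2012-06-01');
// In the mongo shell this could be applied as:
//   db.visits.update(u.query, u.update, u.upsert);
```

Because both counters live in one document, the two increments happen together. A user name containing a dot would be read as a nested path, so the application would need to escape or reject such names.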
