Building Your First Application in Java




MongoDB is a…

• High performance
• Highly available
• Easily scalable
• Easy to use
• Feature rich

Document store
Data Model
• A Mongo system holds a set of databases
• A database holds a set of collections
• A collection holds a set of documents
• A document is a set of fields
• A field is a key-value pair
• A key is a name (string)
• A value is one of:
    - a basic type such as string, integer, float, timestamp, binary, etc.
    - a document
    - an array of values
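
The hierarchy above can be sketched in plain Java. This is only a model built from standard-library maps and lists, not the MongoDB driver's own document classes, and the field names and values (`visits`, `Palo Alto`, and so on) are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the data model; these are NOT MongoDB driver classes.
class DataModelSketch {

    // A document is a set of fields; each field maps a key (a string) to a value.
    static Map<String, Object> sampleDocument() {
        Map<String, Object> address = new LinkedHashMap<>();  // a value that is itself a document
        address.put("city", "Palo Alto");

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", "test");                    // basic type: string
        doc.put("visits", 3);                       // basic type: integer
        doc.put("score", 4.5);                      // basic type: float
        doc.put("address", address);                // nested document
        doc.put("tags", Arrays.asList("a", "b"));   // array of values
        return doc;
    }

    public static void main(String[] args) {
        // A collection holds a set of documents.
        List<Map<String, Object>> collection = Arrays.asList(sampleDocument());
        System.out.println(collection);
    }
}
```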
High Availability: Replica Sets
• Initialize -> Election
• Primary + data replication from primary to secondary

[Diagram: Node 1 (Secondary) and Node 2 (Secondary), linked by heartbeat; Node 3 (Primary) replicates to both.]
Replica Set - Failure
• Primary down/network failure
• Automatic election of new primary if majority exists

[Diagram: Node 3 (Primary) is down; Node 1 (Secondary) and Node 2 (Secondary), linked by heartbeat, hold a primary election.]
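
The "if majority exists" condition can be sketched as a simple check. This is a simplified model written for this deck, with a hypothetical method name; real elections also take member priorities, votes, and replication progress into account.

```java
// Simplified model of the majority rule; not MongoDB's election algorithm.
class ElectionSketch {

    // A new primary can be elected only while a strict majority of the
    // set's members can still reach each other.
    static boolean canElectPrimary(int reachableMembers, int totalMembers) {
        return reachableMembers > totalMembers / 2;
    }

    public static void main(String[] args) {
        System.out.println(canElectPrimary(2, 3)); // true: two of three nodes remain
        System.out.println(canElectPrimary(1, 3)); // false: one node is not a majority
    }
}
```

In the three-node set on this slide, losing the primary leaves two of three members, so an election can proceed. Losing a second node would leave the survivor unable to become primary.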
Replica Set - Failover
• New primary elected
• Replication established from new primary

[Diagram: Node 2 is the new Primary and Node 1 remains Secondary, linked by heartbeat; Node 3 is the former primary.]
Durability
• Fire and forget
• Wait for error
• Wait for journal sync
• Wait for flush to disk
• Wait for replication
Read Preferences

• PRIMARY
• PRIMARY PREFERRED
• SECONDARY
• SECONDARY PREFERRED
• NEAREST
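
How the five modes differ can be sketched with a toy node selector. Everything here is an assumption made for illustration: `Node`, `Mode`, and `choose()` are hypothetical names, not driver classes, and the sketch always picks the single lowest-latency candidate where a real driver may pick among several fast nodes.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative model only: Node, Mode and choose() are hypothetical names,
// not classes or methods of the MongoDB Java driver.
class ReadPreferenceSketch {

    enum Mode { PRIMARY, PRIMARY_PREFERRED, SECONDARY, SECONDARY_PREFERRED, NEAREST }

    static final class Node {
        final String name;
        final boolean primary;
        final int pingMillis;

        Node(String name, boolean primary, int pingMillis) {
            this.name = name;
            this.primary = primary;
            this.pingMillis = pingMillis;
        }
    }

    // Fastest reachable node with the requested role, if there is one.
    static Optional<Node> fastest(List<Node> reachable, boolean wantPrimary) {
        return reachable.stream()
                .filter(n -> n.primary == wantPrimary)
                .min(Comparator.comparingInt(n -> n.pingMillis));
    }

    // Try the preferred role first, then fall back to the other role.
    static Optional<Node> prefer(List<Node> reachable, boolean primaryFirst) {
        Optional<Node> first = fastest(reachable, primaryFirst);
        return first.isPresent() ? first : fastest(reachable, !primaryFirst);
    }

    static Optional<Node> choose(Mode mode, List<Node> reachable) {
        switch (mode) {
            case PRIMARY:             return fastest(reachable, true);
            case PRIMARY_PREFERRED:   return prefer(reachable, true);
            case SECONDARY:           return fastest(reachable, false);
            case SECONDARY_PREFERRED: return prefer(reachable, false);
            default:                  // NEAREST: role is ignored, latency decides
                return reachable.stream().min(Comparator.comparingInt(n -> n.pingMillis));
        }
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
                new Node("Node 1", false, 5),
                new Node("Node 2", false, 20),
                new Node("Node 3", true, 12));
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + choose(mode, nodes).map(n -> n.name).orElse("none"));
        }
    }
}
```

With these sample latencies, the two primary modes pick Node 3, while the secondary modes and NEAREST pick Node 1. If Node 3 is removed from the list, PRIMARY finds no node, while PRIMARY PREFERRED falls back to Node 1.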
Let’s build a location-based surf reporting app!

• Report current conditions
• Get current local conditions
• Determine best conditions per beach
Document Structure
{
    "_id" : ObjectId("504ceb3d30042d707af96fef"),    // primary key: unique, auto-indexed
    "reporter" : "test",
    "location" : {                                   // compound index, geospatial
        "coordinates" : [ -122.477222, 37.810556 ],
        "name" : "Fort Point"
    },
    "conditions" : {
        "height" : 0,
        "period" : 9,
        "rating" : 1
    },
    "date" : ISODate("2011-11-16T20:17:17.277Z")     // indexed for time-to-live
}
Get local surf conditions
 db.reports.find(
     {
         "location.coordinates" : { $near : [-122, 37], $maxDistance : 0.9 },
         "date" : { $gte : new Date(2012, 8, 9) }    // months are zero-based: 9 September 2012
     },
     { "location.name" : 1, "conditions" : 1, "_id" : 0 }
 ).sort({ "conditions.rating" : -1 })

 • Get local reports
 • Get today’s reports
 • Return only the relevant info
 • Show me the best surf first
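
What each clause of the query contributes can be imitated in memory with Java streams. This is a sketch, not driver code: `Report`, `sample()`, and `localConditions()` are hypothetical, the sample reports are made up (only the Fort Point coordinates come from the slides), and the distance check assumes a plain distance in coordinate units.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// In-memory imitation of the shell query. Report, sample() and
// localConditions() are hypothetical; no MongoDB driver is involved.
class LocalConditionsSketch {

    static final class Report {
        final String name;
        final double lon, lat;
        final int rating;
        final LocalDate date;

        Report(String name, double lon, double lat, int rating, LocalDate date) {
            this.name = name;
            this.lon = lon;
            this.lat = lat;
            this.rating = rating;
            this.date = date;
        }
    }

    static List<String> localConditions(List<Report> reports, double lon, double lat,
                                        double maxDistance, LocalDate since) {
        return reports.stream()
                // $near with $maxDistance: keep reports close to the point
                // (plain distance in coordinate units, an assumption of this sketch)
                .filter(r -> Math.hypot(r.lon - lon, r.lat - lat) <= maxDistance)
                // date : { $gte : ... }: keep reports from `since` onward
                .filter(r -> !r.date.isBefore(since))
                // sort({"conditions.rating" : -1}): best rating first
                .sorted(Comparator.comparingInt((Report r) -> r.rating).reversed())
                // projection: return only the location name and the conditions
                .map(r -> r.name + " rating=" + r.rating)
                .collect(Collectors.toList());
    }

    // Made-up sample data; only the Fort Point coordinates come from the slides.
    static List<Report> sample() {
        LocalDate today = LocalDate.of(2012, 9, 9);
        return Arrays.asList(
                new Report("Linda Mar", -122.50, 37.59, 1, today),
                new Report("Montara", -122.52, 37.54, 5, today),
                new Report("Linda Mar", -122.50, 37.59, 4, today.minusDays(8)),  // too old
                new Report("Fort Point", -122.477222, 37.810556, 3, today));     // too far
    }

    public static void main(String[] args) {
        // Prints [Montara rating=5, Linda Mar rating=1]
        System.out.println(localConditions(sample(), -122, 37, 0.9, LocalDate.of(2012, 9, 9)));
    }
}
```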
Get local surf conditions: Connecting
DBObjects




 Output:
            { "name" : "test"}
            parsed
Building the query
Results
{ "location" : { "name" : "Montara" }, "conditions" : { "height" : 6, "period" : 20, "rating" : 5 } }
{ "location" : { "name" : "Maverick's" }, "conditions" : { "height" : 5, "period" : 13, "rating" : 3 } }
{ "location" : { "name" : "Maverick's" }, "conditions" : { "height" : 3, "period" : 15, "rating" : 3 } }
{ "location" : { "name" : "Maverick's" }, "conditions" : { "height" : 3, "period" : 16, "rating" : 2 } }
{ "location" : { "name" : "Montara" }, "conditions" : { "height" : 0, "period" : 8, "rating" : 1 } }
{ "location" : { "name" : "Linda Mar" }, "conditions" : { "height" : 3, "period" : 10, "rating" : 1 } }
{ "location" : { "name" : "Sharp Park" }, "conditions" : { "height" : 1, "period" : 15, "rating" : 1 } }
{ "location" : { "name" : "Sharp Park" }, "conditions" : { "height" : 5, "period" : 6, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 1, "period" : 6, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 0, "period" : 10, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 4, "period" : 6, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 0, "period" : 14, "rating" : 1 } }
Analysis Features: Aggregation Framework

  What are the best conditions for my local beach?
Pipelining Operations
  $match     Match “Linda Mar”
  $project   Only interested in conditions
  $group     Group by rating, averaging wave height and wave period
  $sort      Order by best conditions
Aggregation Framework
{ "aggregate" : "reports" ,
   "pipeline" : [
     { "$match" : { "location.name" : "Linda Mar"}} ,
     { "$project" : { "conditions" : 1}} ,
     { "$group" : {
        "_id" : "$conditions.rating" ,
        "average height" : { "$avg" : "$conditions.height"} ,
        "average period" : { "$avg" : "$conditions.period"}}} ,
     { "$sort" : { "_id" : -1}}
   ]
}
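
The four stages can be imitated in memory to show what the pipeline computes. This is a sketch using Java streams rather than the driver's aggregation API: `Report`, `sample()`, and `bestConditions()` are hypothetical names, and the sample reports are made up.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory imitation of the pipeline. Report, sample() and
// bestConditions() are hypothetical; the data is made up.
class AggregationSketch {

    static final class Report {
        final String name;
        final int height, period, rating;

        Report(String name, int height, int period, int rating) {
            this.name = name;
            this.height = height;
            this.period = period;
            this.rating = rating;
        }
    }

    // Returns rating -> { average height, average period }, best rating first.
    static Map<Integer, double[]> bestConditions(List<Report> reports, String beach) {
        // $match on location.name, then $group with _id = conditions.rating.
        // $project is implicit here: only the conditions fields are read below.
        Map<Integer, List<Report>> byRating = reports.stream()
                .filter(r -> r.name.equals(beach))
                .collect(Collectors.groupingBy(r -> r.rating));

        // $sort : { _id : -1 } is modelled by a map ordered by descending key.
        Map<Integer, double[]> result =
                new TreeMap<Integer, double[]>(Comparator.<Integer>reverseOrder());
        for (Map.Entry<Integer, List<Report>> group : byRating.entrySet()) {
            double avgHeight = group.getValue().stream().mapToInt(r -> r.height).average().orElse(0);
            double avgPeriod = group.getValue().stream().mapToInt(r -> r.period).average().orElse(0);
            result.put(group.getKey(), new double[] { avgHeight, avgPeriod });
        }
        return result;
    }

    static List<Report> sample() {
        return Arrays.asList(
                new Report("Linda Mar", 3, 10, 1),
                new Report("Linda Mar", 1, 8, 1),
                new Report("Linda Mar", 5, 14, 4),
                new Report("Linda Mar", 6, 16, 4),
                new Report("Montara", 6, 20, 5));
    }

    public static void main(String[] args) {
        for (Map.Entry<Integer, double[]> e : bestConditions(sample(), "Linda Mar").entrySet()) {
            System.out.println("rating " + e.getKey()
                    + ": average height " + e.getValue()[0]
                    + ", average period " + e.getValue()[1]);
        }
    }
}
```

With the sample data, rating 4 comes first (average height 5.5, average period 15.0), followed by rating 1 (2.0 and 9.0). The Montara report is dropped by the match step.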
The Aggregation Helper
Scaling
• Sharding is the partitioning of data among
  multiple machines
• Balancing occurs when the load on any one
  node grows out of proportion
Scaling MongoDB

[Diagram: a client application; MongoDB as a single instance or replica set; a sharded cluster.]
The Mechanism of Sharding

[Diagram sequence: (1) the complete data set, with a shard key defined on location name: Fort Point, Linda Mar, Maverick’s, Ocean Beach, Rockaway; (2) the data set is split into chunks along the shard key; (3) the chunks are assigned to Shard 1, Shard 2, Shard 3, and Shard 4; (4) each shard holds several chunks; (5) a client application sends a query for “Linda Mar”, then one for “Maverick’s”, to the cluster.]
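
Routing a query by shard key can be sketched with a sorted map from chunk lower bounds to shards. This is a toy model: `ShardRouterSketch` and `shardFor()` are hypothetical names, and the chunk boundaries and shard assignments are made up; a real cluster keeps this mapping in its own metadata and rebalances chunks over time.

```java
import java.util.TreeMap;

// Toy router: the chunk boundaries and shard assignments below are made up
// for illustration and do not come from a real cluster.
class ShardRouterSketch {

    // Maps each chunk's inclusive lower bound (a location name) to the shard holding it.
    private final TreeMap<String, String> chunkLowerBoundToShard = new TreeMap<>();

    ShardRouterSketch() {
        chunkLowerBoundToShard.put("", "Shard 1");             // everything below "Linda Mar"
        chunkLowerBoundToShard.put("Linda Mar", "Shard 2");
        chunkLowerBoundToShard.put("Maverick's", "Shard 3");
        chunkLowerBoundToShard.put("Ocean Beach", "Shard 4");  // "Ocean Beach" and above
    }

    // A query on the shard key goes to the chunk whose range contains the key:
    // the chunk with the greatest lower bound that is <= the key.
    String shardFor(String locationName) {
        return chunkLowerBoundToShard.floorEntry(locationName).getValue();
    }

    public static void main(String[] args) {
        ShardRouterSketch router = new ShardRouterSketch();
        System.out.println(router.shardFor("Linda Mar"));   // Shard 2
        System.out.println(router.shardFor("Maverick's"));  // Shard 3
        System.out.println(router.shardFor("Rockaway"));    // Shard 4
    }
}
```

Because the query names the shard key, only one shard has to be consulted. A query that did not include the location name would have to be sent to every shard.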
Thanks!
 Office Hours
 Thursdays 4-6 pm
 555 University Ave.
 Palo Alto

     We’re hiring!
Bryan.reinero@10gen.com



Editor's Notes

  • #9: Read from any of the fastest responding nodes.