Building Your First Application in Java
    Bryan Reinero
    Bryan.reinero@10gen.com
    September 2012




      High performance
      Highly available
      Easily scalable
      Easy to use
      Feature rich


                                                Document store


©2012 Jaspersoft Corporation. Proprietary and
Confidential                                         2
Data Model

      A Mongo system holds a set of databases
      A database holds a set of collections
      A collection holds a set of documents
      A document is a set of fields
      A field is a key-value pair
      A key is a name (string)
      A value is a
                   basic type like string, integer, float, timestamp, binary, etc.,
                   a document, or
                   an array of values
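
To make the value rules concrete, here is a minimal sketch in plain JavaScript, the language of the mongo shell. The sample document and the `classifyValue` helper are illustrative assumptions, not part of any MongoDB API.

```javascript
// Hypothetical sample document: every field is a key-value pair.
const report = {
  reporter: "test",                              // basic type: string
  date: new Date("2011-11-16T20:17:17.277Z"),    // basic type: timestamp
  location: { name: "Fort Point" },              // embedded document
  coordinates: [-122.477222, 37.810556],         // array of values
};

// Illustrative helper: sort a value into the three categories on the slide.
function classifyValue(value) {
  if (Array.isArray(value)) return "array";
  if (value !== null && typeof value === "object" && !(value instanceof Date)) {
    return "document";
  }
  return "basic";
}

for (const [key, value] of Object.entries(report)) {
  console.log(key, "->", classifyValue(value));
}
```

Because a value can itself be a document or an array, documents nest to arbitrary depth, which is what the surf report later in the deck relies on.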



High Availability: Replica Sets


 Initialize -> Election
 Primary + data replication from primary to secondary


 [Diagram: Node 1 (Secondary) and Node 2 (Secondary) linked by a heartbeat; Node 3 (Primary) replicates to both.]


High Availability: Failure


 Primary down/network failure
 Automatic election of new primary if majority exists

 [Diagram: Node 3 (Primary) is unavailable; Node 1 (Secondary) and Node 2 (Secondary), linked by a heartbeat, hold a primary election.]
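
The "if majority exists" condition can be sketched as a simple count. This is a simplified illustration with hypothetical helper names; a real election also considers factors such as member priority and how current each member's data is.

```javascript
// Strict majority of all voting members, counting the unreachable ones too.
function majorityOf(totalVoters) {
  return Math.floor(totalVoters / 2) + 1;
}

// Hypothetical helper: can the members that still see each other elect a primary?
function canElectPrimary(totalVoters, reachableVoters) {
  return reachableVoters >= majorityOf(totalVoters);
}

console.log(canElectPrimary(3, 2)); // true: 2 of 3 is a majority
console.log(canElectPrimary(3, 1)); // false: an isolated node cannot elect itself
```

This is why a three-node set survives the loss of one node, while a node cut off from both of its peers stays a secondary.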


High Availability: Failover


 New primary elected
 Replication established from new primary


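 [Diagram: the three-node replica set after failover, labeled Node 1 (Secondary), Node 2 (Secondary), and Node 3 (Primary), with a heartbeat between Node 1 and Node 2.]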


Durability

      Fire and forget
      Wait for error
      Wait for journal sync
      Wait for fsync
      Wait for replication
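
One way to read these levels is as increasingly strict write-concern options. The option names `w`, `j`, and `fsync` are the ones MongoDB uses for write acknowledgement; pairing each with a slide label is an interpretation, and the `writeConcernFor` helper is hypothetical.

```javascript
// Assumed mapping from the slide's durability levels to write-concern options.
const durabilityLevels = {
  "fire and forget":       { w: 0 },              // do not wait for any acknowledgement
  "wait for error":        { w: 1 },              // wait for the primary to acknowledge
  "wait for journal sync": { w: 1, j: true },     // wait for the journal write
  "wait for fsync":        { w: 1, fsync: true }, // wait for a flush to data files
  "wait for replication":  { w: 2 },              // wait for a second member (or "majority")
};

// Hypothetical lookup helper; throws on a label that is not on the slide.
function writeConcernFor(level) {
  const concern = durabilityLevels[level.toLowerCase()];
  if (!concern) throw new Error("Unknown durability level: " + level);
  return concern;
}

console.log(writeConcernFor("Wait for journal sync"));
```

Each step down the list trades write latency for a stronger guarantee that the data survives a crash or failover.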




Read Preferences


      PRIMARY
      PRIMARY PREFERRED
      SECONDARY
      SECONDARY PREFERRED
      NEAREST
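
The five modes differ only in which members are eligible for a read. The sketch below models that choice over a hypothetical node list; `pickNode` and `nearest` are illustrative names, and real drivers pick among members within a latency window rather than always taking the single fastest one.

```javascript
// Hypothetical replica-set view: role and measured ping time per node.
const nodes = [
  { name: "node1", role: "secondary", pingMs: 12 },
  { name: "node2", role: "secondary", pingMs: 3 },
  { name: "node3", role: "primary",   pingMs: 8 },
];

// Lowest-latency member of a list, or null when the list is empty.
function nearest(candidates) {
  return candidates.reduce(
    (best, n) => (best === null || n.pingMs < best.pingMs ? n : best), null);
}

// Simplified selection: returns the chosen node, or null if none qualifies.
function pickNode(mode, members) {
  const primary = members.find((n) => n.role === "primary") || null;
  const secondary = nearest(members.filter((n) => n.role === "secondary"));
  switch (mode) {
    case "primary":            return primary;
    case "primaryPreferred":   return primary || secondary;
    case "secondary":          return secondary;
    case "secondaryPreferred": return secondary || primary;
    case "nearest":            return nearest(members);
    default: throw new Error("Unknown read preference: " + mode);
  }
}

console.log(pickNode("secondaryPreferred", nodes).name);
```

Reads from secondaries can return slightly stale data, so the "preferred" modes are a trade-off between freshness and availability.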




Let’s build a location based surf reporting app!




• Report current conditions
• Get current local conditions
• Determine best conditions per beach
Document Structure
{
     "_id" : ObjectId("504ceb3d30042d707af96fef"),      // primary key: unique, auto-indexed
     "reporter" : "test",
     "location" : {                                      // compound index, geospatial
               "coordinates" : [
                          -122.477222,
                          37.810556
               ],
               "name" : "Fort Point"
     },
     "conditions" : {
               "height" : 0,
               "period" : 9,
               "rating" : 1
     },
     "date" : ISODate("2011-11-16T20:17:17.277Z")        // indexed for time-to-live
}
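
The indexes named on this slide could be declared roughly as follows. This is a sketch: the key patterns follow the slide's annotations, the one-day expiry is an assumed value, and `toShellCommand` is a hypothetical helper that only prints the shell command. Shells from the era of this deck spell the command `ensureIndex`; the `_id` index needs no declaration because it is created automatically.

```javascript
// Index specifications as plain data.
const reportIndexes = [
  // Compound geospatial index: a "2d" field followed by the location name.
  { keys: { "location.coordinates": "2d", "location.name": 1 }, options: {} },
  // TTL index: documents expire a fixed time after their "date" value.
  // The 24-hour lifetime is an assumption for illustration.
  { keys: { date: 1 }, options: { expireAfterSeconds: 24 * 60 * 60 } },
];

// Render a spec as the mongo shell command that would create it.
function toShellCommand(collection, index) {
  return "db." + collection + ".createIndex(" +
    JSON.stringify(index.keys) + ", " + JSON.stringify(index.options) + ")";
}

reportIndexes.forEach((index) => console.log(toShellCommand("reports", index)));
```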
Get local surf conditions

  db.reports.find(
            {
              // reports within a distance of 0.9 from the point [-122, 37]
              "location.coordinates" : { $near : [-122, 37], $maxDistance : 0.9 },
              // JavaScript months are zero-based, so this is 9 September 2012
              date : { $gte : new Date(2012, 8, 9) }
            },
            { "location.name" : 1, _id : 0, "conditions" : 1 }
  ).sort({ "conditions.rating" : -1 })

  • Get local reports
  • Get today’s reports
  • Return only the relevant info
  • Show me the best surf first
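
The four steps of the query can be mirrored on an in-memory array, which makes each clause easy to inspect without a database. The sample reports and the `localConditions` function are hypothetical, and plain Euclidean distance in degrees stands in for the geospatial index.

```javascript
// Hypothetical sample reports (coordinates are [longitude, latitude]).
const reports = [
  { location: { name: "Montara", coordinates: [-122.52, 37.55] },
    conditions: { height: 6, period: 20, rating: 5 }, date: new Date(2012, 8, 9, 8) },
  { location: { name: "Linda Mar", coordinates: [-122.50, 37.59] },
    conditions: { height: 3, period: 10, rating: 1 }, date: new Date(2012, 8, 9, 9) },
  { location: { name: "Far Away", coordinates: [-118.0, 34.0] },
    conditions: { height: 8, period: 18, rating: 5 }, date: new Date(2012, 8, 9, 9) },
  { location: { name: "Montara", coordinates: [-122.52, 37.55] },
    conditions: { height: 2, period: 9, rating: 2 }, date: new Date(2012, 8, 1) },
];

function localConditions(docs, center, maxDistance, since) {
  return docs
    // Get local reports: keep documents within maxDistance of the center.
    .filter((d) => Math.hypot(d.location.coordinates[0] - center[0],
                              d.location.coordinates[1] - center[1]) <= maxDistance)
    // Get today's reports: keep documents dated on or after `since`.
    .filter((d) => d.date >= since)
    // Return only the relevant info: location name and conditions.
    .map((d) => ({ location: { name: d.location.name }, conditions: d.conditions }))
    // Show the best surf first: highest rating at the top.
    .sort((a, b) => b.conditions.rating - a.conditions.rating);
}

const results = localConditions(reports, [-122, 37], 0.9, new Date(2012, 8, 9));
console.log(results);
```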
Results

{ "location" : { "name" : "Montara" }, "conditions" : { "height" : 6, "period" : 20, "rating" : 5 } }
{ "location" : { "name" : "Maverick's" }, "conditions" : { "height" : 5, "period" : 13, "rating" : 3 } }
{ "location" : { "name" : "Maverick's" }, "conditions" : { "height" : 3, "period" : 15, "rating" : 3 } }
{ "location" : { "name" : "Maverick's" }, "conditions" : { "height" : 3, "period" : 16, "rating" : 2 } }
{ "location" : { "name" : "Montara" }, "conditions" : { "height" : 0, "period" : 8, "rating" : 1 } }
{ "location" : { "name" : "Linda Mar" }, "conditions" : { "height" : 3, "period" : 10, "rating" : 1 } }
{ "location" : { "name" : "Sharp Park" }, "conditions" : { "height" : 1, "period" : 15, "rating" : 1 } }
{ "location" : { "name" : "Sharp Park" }, "conditions" : { "height" : 5, "period" : 6, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 1, "period" : 6, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 0, "period" : 10, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 4, "period" : 6, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 0, "period" : 14, "rating" : 1 } }
Scaling

      Sharding is the partitioning of data among multiple machines
      Balancing occurs when the load on any one node grows out of proportion




Scaling MongoDB

[Diagram: a client application connected to MongoDB, which may be a single instance or replica set, or a sharded cluster.]
The Mechanism of Sharding

Define Shard Key on Location Name

[Diagram sequence: the complete data set, ordered by location name (Fort Point, Linda Mar, Maverick’s, Ocean Beach, Rockaway), is split into chunks; the chunks are distributed across Shard 1, Shard 2, Shard 3, and Shard 4; the client application then issues queries for Linda Mar and for Maverick’s against the sharded cluster.]
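
Routing by shard key can be sketched as a lookup in a table of chunk ranges. The chunk boundaries, the shard assignment, and the `shardFor` function below are assumptions chosen to resemble the slides, not MongoDB's actual routing code.

```javascript
// Hypothetical chunk table: each chunk covers shard-key values in [min, max).
// null stands in for an open-ended lower or upper bound.
const chunks = [
  { min: null,          max: "Linda Mar",   shard: "Shard 1" },
  { min: "Linda Mar",   max: "Maverick's",  shard: "Shard 2" },
  { min: "Maverick's",  max: "Ocean Beach", shard: "Shard 3" },
  { min: "Ocean Beach", max: null,          shard: "Shard 4" },
];

// Find the shard whose chunk range contains the given location name.
function shardFor(locationName, chunkTable) {
  const chunk = chunkTable.find((c) =>
    (c.min === null || locationName >= c.min) &&
    (c.max === null || locationName < c.max));
  return chunk.shard;
}

console.log(shardFor("Linda Mar", chunks));
console.log(shardFor("Maverick's", chunks));
```

Because the query includes the shard key, it can be sent to a single shard instead of being broadcast to all four.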
Analysis Features:
Aggregation Framework




 What are the best conditions for my local beach?
Pipelining Operations

   $match        Match “Linda Mar”
   $project      Only interested in conditions
   $group        Group by rating, averaging wave height and wave period
   $sort         Order by best conditions
Aggregation Framework

 { "aggregate" : "reports" ,
    "pipeline" : [
       { "$match" : { "location.name" : "Linda Mar"}} ,
       { "$project" : { "conditions" : 1}} ,
       { "$group" : {
          "_id" : "$conditions.rating" ,
          "average height" : { "$avg" : "$conditions.height"} ,
          "average period" : { "$avg" : "$conditions.period"}}} ,
       { "$sort" : { "_id" : -1}}
    ]
 }
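
To see what each stage contributes, the same pipeline can be imitated over an in-memory array. This is a sketch with hypothetical sample data and a hypothetical `bestConditions` function; it mirrors the stages and does not call MongoDB.

```javascript
// Hypothetical sample reports: three for one beach, one for another.
const sampleReports = [
  { location: { name: "Linda Mar" }, conditions: { height: 3, period: 10, rating: 1 } },
  { location: { name: "Linda Mar" }, conditions: { height: 5, period: 14, rating: 1 } },
  { location: { name: "Linda Mar" }, conditions: { height: 6, period: 16, rating: 4 } },
  { location: { name: "Montara" },   conditions: { height: 6, period: 20, rating: 5 } },
];

function bestConditions(docs, beach) {
  const groups = new Map();
  docs
    .filter((d) => d.location.name === beach)   // $match
    .map((d) => d.conditions)                   // $project
    .forEach((c) => {                           // $group: accumulate per rating
      const g = groups.get(c.rating) || { count: 0, height: 0, period: 0 };
      g.count += 1;
      g.height += c.height;
      g.period += c.period;
      groups.set(c.rating, g);
    });
  return [...groups.entries()]
    .map(([rating, g]) => ({
      _id: rating,
      "average height": g.height / g.count,     // $avg
      "average period": g.period / g.count,     // $avg
    }))
    .sort((a, b) => b._id - a._id);             // $sort descending by rating
}

console.log(bestConditions(sampleReports, "Linda Mar"));
```

Placing `$match` first matters in the real pipeline as well: it shrinks the working set before the later stages run.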
Other Features…


      Native MapReduce
      Hadoop Connector
      Tagging
      Drivers for all major languages




Thanks!

   Office Hours
Thursdays 4-6 pm
555 University Ave.
    Palo Alto

   We’re Hiring!
Bryan.reinero@10gen.com
