Building Your First Application in Java
    Bryan Reinero
    Bryan.reinero@10gen.com
    September 2012




      High performance
      Highly available
      Easily scalable
      Easy to use
      Feature rich


                                                Document store


©2012 Jaspersoft Corporation. Proprietary and
Confidential                                         2
Data Model

      A Mongo system holds a set of databases
      A database holds a set of collections
      A collection holds a set of documents
      A document is a set of fields
      A field is a key-value pair
      A key is a name (string)
      A value is a
                   basic type like string, integer, float, timestamp, binary, etc.,
                   a document, or
                   an array of values
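
The data model above can be sketched in plain Java. This is a minimal illustration that uses only `java.util` collections as a stand-in for a document; the class name `SurfReportDocument` and its methods are hypothetical and are not part of the MongoDB Java driver.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a document: a map from keys (strings) to values.
// Plain java.util code, not the MongoDB Java driver.
final class SurfReportDocument {

    // Builds a document holding basic values, embedded documents, and an array.
    static Map<String, Object> build() {
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("coordinates", Arrays.asList(-122.477222, 37.810556)); // array of values
        location.put("name", "Fort Point");                                 // string

        Map<String, Object> conditions = new LinkedHashMap<>();
        conditions.put("height", 0);   // integers
        conditions.put("period", 9);
        conditions.put("rating", 1);

        Map<String, Object> report = new LinkedHashMap<>();
        report.put("reporter", "test");
        report.put("location", location);     // embedded document
        report.put("conditions", conditions); // embedded document
        report.put("date", new Date(0L));     // timestamp (placeholder value)
        return report;
    }

    // Resolves a dotted path such as "location.name" by walking embedded documents.
    // Returns null when the path does not exist.
    static Object get(Map<String, Object> doc, String dottedPath) {
        Object current = doc;
        for (String key : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> report = build();
        System.out.println(get(report, "location.name"));
        System.out.println(get(report, "conditions.rating"));
    }
}
```

The dotted-path lookup mirrors how nested fields are addressed in queries, for example `location.name`.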



High Availability: Replica Sets


 Initialize -> Election
 Primary + data replication from primary to secondary


 [Diagram: Node 3 (primary) replicates to Node 1 and Node 2 (secondaries); heartbeat between the nodes]

High Availability: Failure


 Primary down/network failure
 Automatic election of new primary if majority exists

 [Diagram: Node 3 (primary) fails; Node 1 and Node 2 (secondaries) exchange heartbeats and hold a primary election]

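
The "if majority exists" condition for electing a new primary can be sketched as a one-line check. This is a simplified illustration, not MongoDB code: the class `ElectionMajority` and its method are hypothetical, and real elections also consider priorities and how current each member's data is.

```java
// Hypothetical helper, not part of any MongoDB API: the majority rule that
// decides whether the surviving members of a replica set may elect a primary.
final class ElectionMajority {

    // A primary can be elected only if strictly more than half of the
    // configured voting members can still reach one another.
    static boolean canElectPrimary(int votingMembers, int reachableMembers) {
        if (votingMembers <= 0 || reachableMembers < 0 || reachableMembers > votingMembers) {
            throw new IllegalArgumentException("invalid member counts");
        }
        return reachableMembers > votingMembers / 2;
    }

    public static void main(String[] args) {
        System.out.println(canElectPrimary(3, 2)); // one of three nodes down
        System.out.println(canElectPrimary(3, 1)); // two of three nodes down
    }
}
```

With three nodes, losing one still leaves a majority of two, so failover can proceed; losing two does not.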
High Availability: Failover


 New primary elected
 Replication established from new primary


 [Diagram: the replica set after failover, with nodes labeled Node 1 (secondary), Node 2 (secondary), and Node 3 (primary), and a heartbeat between Node 1 and Node 2]

Durability

      Fire and forget
      Wait for error
      Wait for journal sync
      Wait for fsync
      Wait for replication
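
As a rough sketch, each durability level above can be thought of as a different set of options on the `getLastError` command that a driver issues after a write. The enum below is hypothetical (it is not the driver's `WriteConcern` class), and the choice of `w: 2` for replication is an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical enum, not the driver's WriteConcern class: maps each durability
// level to the getLastError options a client would send after a write.
enum Durability {
    FIRE_AND_FORGET, WAIT_FOR_ERROR, WAIT_FOR_JOURNAL_SYNC, WAIT_FOR_FSYNC, WAIT_FOR_REPLICATION;

    // Returns the getLastError command document, or null when the client
    // does not wait for any acknowledgement.
    Map<String, Object> getLastErrorCommand() {
        if (this == FIRE_AND_FORGET) {
            return null; // the write is sent and the client moves on
        }
        Map<String, Object> cmd = new LinkedHashMap<>();
        cmd.put("getlasterror", 1);
        switch (this) {
            case WAIT_FOR_JOURNAL_SYNC:
                cmd.put("j", true);      // wait until the write is in the journal
                break;
            case WAIT_FOR_FSYNC:
                cmd.put("fsync", true);  // wait until data files are flushed to disk
                break;
            case WAIT_FOR_REPLICATION:
                cmd.put("w", 2);         // assumption: primary plus one secondary
                break;
            default:
                cmd.put("w", 1);         // wait for the primary to report an error or success
                break;
        }
        return cmd;
    }

    public static void main(String[] args) {
        for (Durability level : values()) {
            System.out.println(level + " -> " + level.getLastErrorCommand());
        }
    }
}
```

Stronger levels trade write latency for a lower chance of losing an acknowledged write.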




Read Preferences


      PRIMARY
      PRIMARY PREFERRED
      SECONDARY
      SECONDARY PREFERRED
      NEAREST
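
The five read preferences differ only in which replica set member a read may go to. The sketch below models that choice in plain Java under simplifying assumptions: the class `ReadRouting` is hypothetical, and it always picks the lowest-latency candidate, whereas real drivers also weigh member health, tags, and a latency window.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of read-preference routing (not driver code).
final class ReadRouting {

    enum Preference { PRIMARY, PRIMARY_PREFERRED, SECONDARY, SECONDARY_PREFERRED, NEAREST }

    static final class Member {
        final String name;
        final boolean primary;
        final int pingMillis;

        Member(String name, boolean primary, int pingMillis) {
            this.name = name;
            this.primary = primary;
            this.pingMillis = pingMillis;
        }
    }

    // Returns the member a read would go to, or null if no reachable member qualifies.
    static Member choose(Preference preference, List<Member> reachable) {
        Member primary = null;
        Member closestSecondary = null;
        Member closest = null;
        for (Member m : reachable) {
            if (m.primary) {
                primary = m;
            } else if (closestSecondary == null || m.pingMillis < closestSecondary.pingMillis) {
                closestSecondary = m;
            }
            if (closest == null || m.pingMillis < closest.pingMillis) {
                closest = m;
            }
        }
        switch (preference) {
            case PRIMARY:
                return primary;
            case PRIMARY_PREFERRED:
                return primary != null ? primary : closestSecondary;
            case SECONDARY:
                return closestSecondary;
            case SECONDARY_PREFERRED:
                return closestSecondary != null ? closestSecondary : primary;
            default:
                return closest; // NEAREST: lowest latency, primary or secondary
        }
    }

    public static void main(String[] args) {
        List<Member> set = Arrays.asList(
                new Member("node1", false, 5),
                new Member("node2", false, 20),
                new Member("node3", true, 10));
        System.out.println(choose(Preference.SECONDARY_PREFERRED, set).name);
    }
}
```

Reads from secondaries can return slightly stale data, which is why PRIMARY is the strictest choice.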




Let’s build a location based surf reporting app!




• Report current conditions
• Get current local conditions
• Determine best conditions per beach
Document Structure
{
     "_id" : ObjectId("504ceb3d30042d707af96fef"),     // primary key, unique, auto-indexed
     "reporter" : "test",
     "location" : {
               "coordinates" : [
                          -122.477222,
                          37.810556
               ],                                    // compound index,
               "name" : "Fort Point"                 // geospatial
     },
     "conditions" : {
               "height" : 0,
               "period" : 9,
               "rating" : 1
     },
     "date" : ISODate("2011-11-16T20:17:17.277Z")       // indexed for Time-To-Live
}
Get local surf conditions

  db.reports.find(
      {
          "location.coordinates" : { $near : [-122, 37], $maxDistance : 0.9 },
          // JavaScript months are zero-based, so 8 means September
          "date" : { $gte : new Date(2012, 8, 9) }
      },
      { "location.name" : 1, "_id" : 0, "conditions" : 1 }
  ).sort({ "conditions.rating" : -1 })

  • Get local reports
  • Get today’s reports
  • Return only the relevant info
  • Show me the best surf first
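
To make the four steps concrete, here is an in-memory Java sketch of what the query asks for: keep reports within a maximum distance of a point, keep those on or after a cutoff date, and sort by rating, best first. It assumes flat distance measured in coordinate units (degrees), and the class `LocalSurfQuery`, its `Report` type, and the sample values in `main` are hypothetical. It does not use the MongoDB driver or a geospatial index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory model of the surf query; no database involved.
final class LocalSurfQuery {

    static final class Report {
        final String name;
        final double lon;
        final double lat;
        final int rating;
        final long dateMillis;

        Report(String name, double lon, double lat, int rating, long dateMillis) {
            this.name = name;
            this.lon = lon;
            this.lat = lat;
            this.rating = rating;
            this.dateMillis = dateMillis;
        }
    }

    // Filters by distance and date, then sorts by rating in descending order.
    static List<Report> find(List<Report> reports, double lon, double lat,
                             double maxDistance, long sinceMillis) {
        List<Report> hits = new ArrayList<>();
        for (Report r : reports) {
            // Assumption: flat (planar) distance in degrees
            double distance = Math.hypot(r.lon - lon, r.lat - lat);
            if (distance <= maxDistance && r.dateMillis >= sinceMillis) {
                hits.add(r);
            }
        }
        hits.sort(Comparator.comparingInt((Report r) -> r.rating).reversed());
        return hits;
    }

    public static void main(String[] args) {
        List<Report> reports = Arrays.asList(
                new Report("Fort Point", -122.477222, 37.810556, 1, 100L),
                new Report("Linda Mar", -122.50, 37.59, 1, 100L),  // illustrative coordinates
                new Report("Montara", -122.51, 37.55, 5, 100L));   // illustrative coordinates
        for (Report r : find(reports, -122, 37, 0.9, 50L)) {
            System.out.println(r.name + " rating=" + r.rating);
        }
    }
}
```

In this model Fort Point lies about 0.94 degrees from [-122, 37], just outside the 0.9 limit, so it is filtered out.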
Results

{ "location" : { "name" : "Montara" }, "conditions" : { "height" : 6, "period" : 20, "rating" : 5 } }
{ "location" : { "name" : "Maverick's" }, "conditions" : { "height" : 5, "period" : 13, "rating" : 3 } }
{ "location" : { "name" : "Maverick's" }, "conditions" : { "height" : 3, "period" : 15, "rating" : 3 } }
{ "location" : { "name" : "Maverick's" }, "conditions" : { "height" : 3, "period" : 16, "rating" : 2 } }
{ "location" : { "name" : "Montara" }, "conditions" : { "height" : 0, "period" : 8, "rating" : 1 } }
{ "location" : { "name" : "Linda Mar" }, "conditions" : { "height" : 3, "period" : 10, "rating" : 1 } }
{ "location" : { "name" : "Sharp Park" }, "conditions" : { "height" : 1, "period" : 15, "rating" : 1 } }
{ "location" : { "name" : "Sharp Park" }, "conditions" : { "height" : 5, "period" : 6, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 1, "period" : 6, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 0, "period" : 10, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 4, "period" : 6, "rating" : 1 } }
{ "location" : { "name" : "South Ocean Beach" }, "conditions" : { "height" : 0, "period" : 14, "rating" : 1 } }
Scaling

      Sharding is the partitioning of data among multiple machines
      Balancing occurs when the load on any one node grows out of proportion

Scaling MongoDB

      [Diagram: a client application connected to MongoDB, deployed as a single instance or replica set, or as a sharded cluster]

The Mechanism of Sharding

      Start with the complete data set
      Define the shard key on location name
      (Fort Point, Linda Mar, Maverick’s, Ocean Beach, Rockaway)
      The data set is split into chunks by shard key range
      Chunks are distributed across Shard 1, Shard 2, Shard 3, and Shard 4
      A client query for "Linda Mar" or "Maverick’s" is routed to the shard that holds the matching chunk

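
Routing a query by shard key, as with the Linda Mar and Maverick's queries above, amounts to finding the chunk whose key range contains the value. A minimal sketch, assuming range-based chunks keyed by their inclusive lower bound; the class `ChunkRouter` and the chunk boundaries are hypothetical and chosen to mirror the four-shard picture.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of range-based chunk routing; not MongoDB code.
final class ChunkRouter {

    // Maps the inclusive lower bound of each chunk's key range to its shard.
    private final TreeMap<String, String> lowerBoundToShard = new TreeMap<>();

    void addChunk(String lowerBoundInclusive, String shard) {
        lowerBoundToShard.put(lowerBoundInclusive, shard);
    }

    // Finds the chunk with the greatest lower bound that is <= the key value.
    String shardFor(String shardKeyValue) {
        Map.Entry<String, String> chunk = lowerBoundToShard.floorEntry(shardKeyValue);
        if (chunk == null) {
            throw new IllegalStateException("no chunk covers " + shardKeyValue);
        }
        return chunk.getValue();
    }

    // Assumed layout: one chunk per shard, split on location name.
    static ChunkRouter surfExample() {
        ChunkRouter router = new ChunkRouter();
        router.addChunk("", "Shard 1");            // up to, not including, "Linda Mar"
        router.addChunk("Linda Mar", "Shard 2");
        router.addChunk("Maverick's", "Shard 3");
        router.addChunk("Ocean Beach", "Shard 4"); // also covers "Rockaway"
        return router;
    }

    public static void main(String[] args) {
        ChunkRouter router = surfExample();
        System.out.println(router.shardFor("Linda Mar"));
        System.out.println(router.shardFor("Maverick's"));
    }
}
```

Because the query includes the shard key, only one shard needs to be consulted; a query without it would have to be sent to every shard.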
Analysis Features:
Aggregation Framework




 What are the best conditions for my local beach?
Pipelining Operations


   $match        Match “Linda Mar”
   $project      Only interested in conditions
   $group        Group by rating, averaging wave height and wave period
   $sort         Order by best conditions
Aggregation Framework

 { "aggregate" : "reports" ,
    "pipeline" : [
       { "$match" : { "location.name" : "Linda Mar"}} ,
       { "$project" : { "conditions" : 1}} ,
       { "$group" : {
          "_id" : "$conditions.rating" ,
          "average height" : { "$avg" : "$conditions.height"} ,
          "average period" : { "$avg" : "$conditions.period"}}} ,
       { "$sort" : { "_id" : -1}}
    ]
 }
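
The same match, group, and sort steps can be written as an in-memory Java sketch to show what the pipeline computes. The class `BestConditions` and its nested types are hypothetical, and the sample reports in `main` are made up for illustration; nothing here calls the aggregation framework itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory equivalent of the $match / $group / $sort stages.
final class BestConditions {

    static final class Report {
        final String locationName;
        final int height;
        final int period;
        final int rating;

        Report(String locationName, int height, int period, int rating) {
            this.locationName = locationName;
            this.height = height;
            this.period = period;
            this.rating = rating;
        }
    }

    static final class Summary {
        final int rating;
        final double averageHeight;
        final double averagePeriod;

        Summary(int rating, double averageHeight, double averagePeriod) {
            this.rating = rating;
            this.averageHeight = averageHeight;
            this.averagePeriod = averagePeriod;
        }
    }

    static List<Summary> summarize(List<Report> reports, String locationName) {
        // Keys are ratings in descending order, which gives the final $sort for free.
        // Each value holds {sum of heights, sum of periods, report count}.
        TreeMap<Integer, int[]> totals = new TreeMap<>(Comparator.<Integer>reverseOrder());
        for (Report r : reports) {
            if (!r.locationName.equals(locationName)) {
                continue; // $match
            }
            int[] t = totals.get(r.rating); // $group by rating
            if (t == null) {
                t = new int[3];
                totals.put(r.rating, t);
            }
            t[0] += r.height;
            t[1] += r.period;
            t[2]++;
        }
        List<Summary> result = new ArrayList<>();
        for (Map.Entry<Integer, int[]> e : totals.entrySet()) {
            int[] t = e.getValue();
            result.add(new Summary(e.getKey(), (double) t[0] / t[2], (double) t[1] / t[2])); // $avg
        }
        return result;
    }

    public static void main(String[] args) {
        List<Report> reports = Arrays.asList(
                new Report("Linda Mar", 3, 10, 1),
                new Report("Linda Mar", 5, 12, 1),
                new Report("Linda Mar", 4, 14, 3),
                new Report("Montara", 6, 20, 5));
        for (Summary s : summarize(reports, "Linda Mar")) {
            System.out.println(s.rating + " " + s.averageHeight + " " + s.averagePeriod);
        }
    }
}
```

Grouping by rating answers the question "what do the best conditions at this beach look like" by averaging height and period within each rating.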
Other Features…


      Native MapReduce
      Hadoop Connector
      Tagging
      Drivers for all major languages




Thanks!

Office Hours
Thursdays 4-6 pm
555 University Ave.
Palo Alto

We’re hiring!
Bryan.reinero@10gen.com
