When to Use MongoDB
When should you use MongoDB, and when should you not?
Edouard Servan-Schreiber, Ph.D. 
Director for Solution Architecture 
MongoDB 
edss@mongodb.com
Agenda 
• What is MongoDB? 
• What is MongoDB for? 
• What does MongoDB do very well, and what less well?
• What do customers do very well with MongoDB, and what do they not do?
• Some unusual use cases 
• When you should use MongoDB
CREATE APPLICATIONS 
NEVER BEFORE POSSIBLE 
AGILE SCALABLE
MongoDB 
GENERAL PURPOSE DOCUMENT DATABASE OPEN-SOURCE
What is MongoDB for? 
• The data store for all systems of engagement 
– Demanding, real-time SLAs 
– Diverse, mixed data sets 
– Massive concurrency 
– Globally deployed over multiple sites 
– No downtime tolerated 
– Able to grow with user needs 
– High uncertainty in sizing 
– Fast scaling needs 
– Delivers a seamless and consistent experience
What MongoDB is NOT 
• An analytical suite 
– Not competing with SAS or SPSS 
• A data warehouse technology 
– Not competing with Teradata, Netezza, Vertica 
• A BI tool 
– Not competing with Tableau or QlikView 
• Backoffice transaction processing 
– Not competing with IBM Mainframes 
• Backend for a billing system or general ledger system 
– Not competing with Oracle RAC 
• A search engine 
– Not competing with Elasticsearch, SOLR
MongoDB and Enterprise IT Stack
OLTP OLAP
Factors Driving Modern Applications 
Data 
• 90% of data was created in the last 2 years
• 80% of enterprise data is unstructured
• Unstructured data is growing at 2x the rate of structured data
Mobile
• 2 billion smartphones by 2015
• Mobile is now >50% of internet use
• 26 billion IoT devices by 2020
Social 
• 72% of internet use is social media 
• 2 Billion active users monthly 
• 93% of businesses use social media 
Cloud 
• Compute costs declining 33% YOY 
• Storage costs declining 38% YOY 
• Network costs declining 27% YOY
MongoDB Strategic Advantages 
• Horizontally Scalable (Sharding)
• Agile
• Flexible
• High Performance & Strong Consistency
• Highly Available (Replica Sets)
Application
{ author: "eliot",
  date: new Date(),
  text: "MongoDB",
  tags: ["database", "flexible", "JSON"] }
Document Data Model 
Relational MongoDB 
{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
Do More With Your Data 
MongoDB 
{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
• Rich Queries
– Find Paul's cars
– Find everybody in London with a car built between 1970 and 1980
• Geospatial
– Find all of the car owners within 5km of Trafalgar Sq.
• Text Search
– Find all the cars described as having leather seats
• Aggregation
– Calculate the average value of Paul's car collection
• Map Reduce
– What is the ownership pattern of colors by geography over time? (is purple trending up in China?)
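The example questions on this slide can be sketched as MongoDB filter and pipeline documents. This is an illustration under stated assumptions, not taken from the deck: the GeoJSON field `loc`, the text index behind `$text`, and the approximate Trafalgar Square coordinates are hypothetical. The operators themselves ($elemMatch, $gte/$lte, $nearSphere, $text, $unwind, $group, $avg) are standard MongoDB operators. A small plain-JavaScript helper mirrors the aggregation so the block runs without a server.

```javascript
// Sample document from the slide (elided fields left out).
const paul = {
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000 },
    { model: 'Rolls Royce', year: 1965, value: 330000 }
  ]
};

// Rich query: Paul's cars (filter plus projection).
const paulsCars = {
  filter: { first_name: 'Paul', surname: 'Miller' },
  projection: { cars: 1, _id: 0 }
};

// Rich query: everybody in London with a car built between 1970 and 1980.
// $elemMatch makes both bounds apply to the same array element.
const londonSeventies = {
  city: 'London',
  cars: { $elemMatch: { year: { $gte: 1970, $lte: 1980 } } }
};

// Geospatial: owners within 5 km of Trafalgar Square.
// Assumes a hypothetical GeoJSON field `loc` with a 2dsphere index;
// coordinates are approximate, in [longitude, latitude] order.
const nearTrafalgar = {
  loc: {
    $nearSphere: {
      $geometry: { type: 'Point', coordinates: [-0.128, 51.508] },
      $maxDistance: 5000 // metres
    }
  }
};

// Text search: assumes a text index on a hypothetical description field.
const leatherSeats = { $text: { $search: '"leather seats"' } };

// Aggregation: average value of Paul's car collection.
const avgValuePipeline = [
  { $match: { first_name: 'Paul', surname: 'Miller' } },
  { $unwind: '$cars' },
  { $group: { _id: '$_id', avgValue: { $avg: '$cars.value' } } }
];

// Plain-JavaScript equivalent of the pipeline, for a single document.
function averageCarValue(owner) {
  const total = owner.cars.reduce((sum, car) => sum + car.value, 0);
  return owner.cars.length ? total / owner.cars.length : null;
}

console.log(averageCarValue(paul)); // 215000
```

In a shell session these objects would be passed to calls such as `db.owners.find(paulsCars.filter, paulsCars.projection)` or `db.owners.aggregate(avgValuePipeline)`, where the collection name `owners` is again an assumption.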
Requirements For These Challenges 
Addresses | Requirement | Description
Data Types | Hierarchical data structure | Can match the structure of objects in today's OOP languages
Data Types, Agile | Dynamic schema | Can handle differently shaped data in a table/collection and not a predefined schema
Agile | Native OOP language | Keeps developers in one environment and encapsulates functionality/validation/rules in one place
Volume | Scale | Can efficiently handle 100s of tera- & petabytes of data
Volumes, New Arch | Performance | High throughput on a single node and scales horizontally easily
Still required | Software cost | Open source with premium value added services
Still required | Data consistency | How soon you can read data that was just written
Still required | Rich querying | Querying based on any field, e.g. secondary indexes
Still required | Ease of use | Short learning curve and easy to design
How Databases Stack Up 
Requirement | RDBMS | MongoDB | Key/value | Wide column
Hierarchical data structure | Poor | Great | Poor | Good
Dynamic schema | Poor | Great | Poor | Poor
Native OOP language | Poor | Great | Great | Great
Software cost | Poor | Great | Great | Great
Performance | Poor | Great | Great | Great
Scale | Poor | Great | Great | Great
Data consistency | Great | Good | Poor | Poor
Rich querying | Great | Great | Poor | Poor
Ease of use | Good | Great | Good | Poor
(Later builds of this slide repeat the same table with the overlay labels VALUE OF NOSQL and VALUE OF MONGODB.)
As a database, where does MongoDB shine? 
MongoDB does well
• Straightforward replication
• High performance on mixed workloads of reads, writes and updates
• Scaling on demand
• Location based deployments
• Geo spatial queries
• High Availability and auto failover
• Flexible schema & secondary indexing
• Agile development in most programming languages
• Commodity infrastructure
• Real time analytics
• Text indexing
• Data consistency
• Compression
MongoDB does less well
• Resource management *
• Collection scanning under load *
• Absolute write availability
• Faceted search
• Joins across collections
• SQL*
• Transactions over multiple docs
As a database, where does MongoDB shine? 
MongoDB does well
• Straightforward replication: Easy to initiate
• High performance on mixed workloads of reads, writes and updates: All reads, mixed, and mostly writes
• Scaling on demand: No expensive overprovisioning
• Location based deployment: One cluster can span the globe
• Geo spatial queries: Easy to build relevant mobile apps
• High Availability and auto failover: Low stress operations
• Flexible schema & secondary indexing: No need for complex data modeling
• Agile development in most programming languages: No need to give up your favorite development language
• Commodity infrastructure: No vendor lock-in through hardware
• Real time analytics: Get value from data right away!
• Text indexing: Basic search feature
• Data consistency: Simpler app design
• Compression: With new version 2.8
As a database, where does MongoDB shine? 
MongoDB does less well
• Resource management *: Needs to be done at infrastructure level
• Collection scanning under load *: Concurrent scans can disrupt the working set
• Absolute write availability: Consistency vs Availability
• Faceted search: Core value of search engines
• Joins across collections: Doc model mitigates need for this
• SQL*: Some partial solutions (ODBC)
• Transactions over multiple docs: Pushed to application level. Rarely needed with good schema design
MongoDB Use Cases 
• Single View
• Internet of Things
• Mobile
• Real-Time Analytics
• Catalog
• Personalization
• Content Management
Use cases where MongoDB shines 
MongoDB is good for
• Single View
• Internet of Things – sensor data
• Mobile apps – geospatial
• Real-time analytics
• Catalog
• Personalization
• Content management
• Inventory management
• Personalization engines
• Shopping cart
• Dependent datamarts
• Archiving for fast lookup
• Collaboration tools
• Messaging applications
• Log file aggregation
• Caching
• Adserving
• ……
MongoDB is less good for
• Search engine
• Slicing and dicing of data in unplanned ways requiring joins and full scans
• Nanosecond latency writing (real time tick data)
• Uptime beyond 99.999%, instant failover
• Batch processing
Use cases where MongoDB shines 
MongoDB is good for 
• Single View 
• Internet of Things – sensor data 
• Mobile apps – geospatial 
• Real-time analytics 
• Catalog 
• Personalization 
• Content management 
• Inventory management 
• Personalization engines 
• Shopping cart 
• Dependent datamarts 
• Archiving for fast lookup 
• Collaboration tools 
• Messaging applications 
• Log file aggregation 
• Caching 
• Adserving 
• …… 
Notes on selected use cases
• Mixture of analytics and archiving
• Build information from data as it comes in
• Extract from DW for analysis
• Large volume, targeted queries
• Sharing in near real time
• Twitter-like apps
• E.g., SPLUNK
• Enable massive reads on consolidated data
Use cases where MongoDB shines 
MongoDB is less good for
• Search engine: Text indexing only for elementary uses
• Slicing and dicing of data in unplanned ways requiring joins and full scans: Classic DW usage. MongoDB needs known query pattern.
• Nanosecond latency writing (real time tick data): Specialty DBs like Kdb are built for this
• Uptime beyond 99.999%, instant failover: Requires failover in <1s
• Batch processing: That's what Hadoop is for
Note: transaction processing does not require database transactions. Moving money from account A to account B is never instantaneous and requires actual processing, usually in batch.
Data Consolidation 
Operational Data Hub
(Diagram labels: Cards, Data Source 1; Loans, Data Source 2; …; Deposits, Data Source n; Real-time or Batch; Data Warehouse; Strategic Reporting; Operational Reporting; Engagement Application, twice)
Benefits
• Real-time
• Complete details
• Agile
• Higher customer retention
• Increase wallet share
• Proactive exception handling
Data Hub for Large Investment Bank 
Feeds & Batch data
• Pricing
• Accounts
• Securities Master
• Corporate actions
(Diagram labels: Source Master Data (RDBMS); seven Batch links; Destination Data (RDBMS))
Each represents
• People $
• Hardware $
• License $
• Reg penalty $
• & other downstream problems
• Delays up to 36 hours in 
distributing data by batch 
• Charged multiple times 
globally for same data 
• Incurring regulatory 
penalties from missing 
SLAs 
• Had to manage 20 
distributed systems with 
same data
Data Hub for Large Investment Bank 
Feeds & Batch data
• Pricing
• Accounts
• Securities Master
• Corporate actions
(Diagram labels: MongoDB Primary; MongoDB Secondaries; seven Real-time links)
Each represents
• No people $
• Less hardware $
• Less license $
• No penalty $
• & many fewer problems
• Will save about 
$40,000,000 in costs and 
penalties over 5 years 
• Only charged once for data 
• Data in sync globally and 
read locally 
• Capacity to move to one 
global shared data service
Molecular Similarity Database 
• Store Chemical Compounds – 
Fingerprints 
• Want to find compounds which 
are “close” to a given 
compound 
• Need to return quickly a small 
set of reasonable candidates 
• Few researchers working 
concurrently 
• Use Tanimoto association 
coefficient to compare two 
compounds based on their 
common fingerprints
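The Tanimoto coefficient named on this slide is the number of fingerprint features two compounds share divided by the number of distinct features across both: T(A, B) = |A ∩ B| / |A ∪ B|. A minimal sketch follows. The deck does not show how the search was implemented in MongoDB, so this is a plain in-memory scan; the compound names and feature ids are made up for illustration.

```javascript
// Tanimoto coefficient between two fingerprints given as arrays of feature ids:
// T(A, B) = |A ∩ B| / |A ∪ B|
function tanimoto(fingerprintA, fingerprintB) {
  const a = new Set(fingerprintA);
  const b = new Set(fingerprintB);
  let common = 0;
  for (const feature of a) {
    if (b.has(feature)) common += 1;
  }
  const union = a.size + b.size - common;
  return union === 0 ? 0 : common / union;
}

// Return the compounds whose similarity to the query reaches the threshold,
// most similar first.
function findSimilar(query, candidates, threshold) {
  return candidates
    .map(c => ({ name: c.name, score: tanimoto(query, c.fingerprint) }))
    .filter(c => c.score >= threshold)
    .sort((x, y) => y.score - x.score);
}

// Hypothetical compounds with made-up feature ids.
const compounds = [
  { name: 'compound-1', fingerprint: [1, 2, 3, 4] },
  { name: 'compound-2', fingerprint: [3, 4, 5, 6] },
  { name: 'compound-3', fingerprint: [7, 8] }
];

// compound-1 scores 1, compound-2 scores 2/6, compound-3 scores 0 and is dropped.
console.log(findSimilar([1, 2, 3, 4], compounds, 0.3));
```

A threshold keeps the result to "a small set of reasonable candidates", as the slide asks; the value 0.3 here is arbitrary.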
Big Data Genomics 
• Very large base of DNA sample 
sequences 
– Origin, collection method, 
sequence, date, … 
• Enumeration of mutations 
relative to reference sequence 
– Positions, mutation type, 
base 
• Need to retrieve efficiently all 
sequences showing a particular 
mutation 
• Similar to Content 
Management System pattern 
• Add tag array in sequence 
document with mutation 
names 
• Index tag array 
• Queries looking for affected 
sequences are indexed and 
very fast 
• Easy to set up, flexible
representation and details for
sequences, flexible evolution
• Can scale to massive volumes
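The tag-array pattern above can be sketched as follows. Field names and mutation tags are hypothetical. `indexSpec` and `query` show the shape of a MongoDB index on an array field (a multikey index) and the matching filter; the `Map` is only an in-memory stand-in so the example runs without a server.

```javascript
// Hypothetical sequence documents; field names and mutation tags are made up.
const sequences = [
  { _id: 'seq-1', origin: 'lab-A', mutations: ['mut-101', 'mut-205'] },
  { _id: 'seq-2', origin: 'lab-B', mutations: ['mut-205'] },
  { _id: 'seq-3', origin: 'lab-A', mutations: [] }
];

// Shape of the MongoDB index and query for this pattern:
const indexSpec = { mutations: 1 };     // index on the array field (multikey)
const query = { mutations: 'mut-205' }; // matches documents whose array contains the tag

// In-memory stand-in for the index: one entry per (tag, document) pair.
function buildTagIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const tag of doc.mutations) {
      if (!index.has(tag)) index.set(tag, []);
      index.get(tag).push(doc._id);
    }
  }
  return index;
}

// Look up affected sequences without scanning every document.
function sequencesWithMutation(index, tag) {
  return index.get(tag) || [];
}

const tagIndex = buildTagIndex(sequences);
console.log(sequencesWithMutation(tagIndex, 'mut-205')); // [ 'seq-1', 'seq-2' ]
```

Because each tag points directly at the documents that carry it, a lookup touches only the affected sequences, which is why the slide describes these queries as indexed and very fast.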
IoT: Large Industrial Vehicle Manufacturer 
(Diagram labels: Shard 1 Secondary, Shard 2 Secondary, Shard 3 Secondary; three pairs of Shard 1 Primary and Shard 1 Secondary; Central Hub; three Regional Hubs)
What database do you need for your 
business?
What vehicle do you want for a race?
WHAT ARE YOU TRYING 
TO ACHIEVE?
The important aspect of MongoDB 
• MongoDB was not designed for niche use cases 
• MongoDB strives to have excellent 
characteristics applicable to a very broad range 
of use cases 
MongoDB is the most balanced database for 
Enterprise applications and performance
Technical: Why MongoDB 
• High performance (thousands to
millions of queries per second)
for reads and writes
• Need flexible schema, rich 
querying with any number of 
secondary indexes 
• Need for replication across 
multiple data centers, even 
globally 
• Need to deploy rapidly and 
scale on demand (start small 
and fast, grow easily) 
• 99.999% availability 
• Real time analysis in the 
database, under load 
• Geospatial querying 
• Processing in real time, not in 
batch 
• Need to promote agile coding 
methodologies 
• Deploy over commodity 
computing and storage 
architectures 
• Point in Time recovery 
• Need strong data consistency 
• Advanced security
Business: Why MongoDB 
• Management tooling and services 
• Ease of hiring 
• Commercial license 
• Ease of developer adoption 
• Global Support 
• Global Professional Services 
• IT ecosystem integration 
• Company stability 
• De facto standard for next generation database
Summary 
• MongoDB is for Systems of Engagement 
• Complements search engines, Hadoop and Data 
Warehouses 
– Does not replace these technologies 
• Wide range of use cases – and that’s the core point ! 
– Excellent across many possible use cases, not just a few 
• Recognized by Gartner and Forrester 
• De facto standard for next generation database 
• Enterprise maturity and integration
We Can Help 
MongoDB Enterprise Advanced 
The best way to run MongoDB in your data center 
MongoDB Management Service (MMS) 
The easiest way to run MongoDB in the cloud 
Production Support 
In production and under control 
Development Support 
Let’s get you running 
Consulting 
We solve problems 
Training 
Get your teams up to speed
When to Use MongoDB

More Related Content

PDF
Power of the Log: LSM & Append Only Data Structures
PDF
Common MongoDB Use Cases
PDF
DevOps for Databricks
PPTX
PPTX
Introduction to MongoDB
PDF
Docker 101: Introduction to Docker
PPTX
Introduction to docker
PDF
DevOps for beginners
Power of the Log: LSM & Append Only Data Structures
Common MongoDB Use Cases
DevOps for Databricks
Introduction to MongoDB
Docker 101: Introduction to Docker
Introduction to docker
DevOps for beginners

What's hot (20)

PPTX
Oracle GoldenGate for Zero Downtime Migration
PPSX
Microservices Architecture - Cloud Native Apps
PPT
Introduction to MongoDB
PPTX
Introduction to azure cosmos db
PPTX
Batch Processing vs Stream Processing Difference
PDF
Azure Cosmos DB
PPTX
The Right (and Wrong) Use Cases for MongoDB
PPTX
An Enterprise Architect's View of MongoDB
PPTX
Key-Value NoSQL Database
PDF
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
PDF
Introduction to MariaDB
PPTX
Programming in Spark using PySpark
PPTX
AWS 기반 대규모 트래픽 견디기 - 장준엽 (구로디지털 모임) :: AWS Community Day 2017
PDF
Migration From Oracle to PostgreSQL
PPTX
Azure DataBricks for Data Engineering by Eugene Polonichko
PPTX
Microsoft Azure Databricks
PDF
Modularized ETL Writing with Apache Spark
PPTX
Serverless integration with Knative and Apache Camel on Kubernetes
PDF
Storing time series data with Apache Cassandra
Oracle GoldenGate for Zero Downtime Migration
Microservices Architecture - Cloud Native Apps
Introduction to MongoDB
Introduction to azure cosmos db
Batch Processing vs Stream Processing Difference
Azure Cosmos DB
The Right (and Wrong) Use Cases for MongoDB
An Enterprise Architect's View of MongoDB
Key-Value NoSQL Database
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Introduction to MariaDB
Programming in Spark using PySpark
AWS 기반 대규모 트래픽 견디기 - 장준엽 (구로디지털 모임) :: AWS Community Day 2017
Migration From Oracle to PostgreSQL
Azure DataBricks for Data Engineering by Eugene Polonichko
Microsoft Azure Databricks
Modularized ETL Writing with Apache Spark
Serverless integration with Knative and Apache Camel on Kubernetes
Storing time series data with Apache Cassandra
Ad

Similar to When to Use MongoDB (20)

PPTX
Webinar: When to Use MongoDB
PPTX
When to Use MongoDB...and When You Should Not...
PPTX
Augmenting Mongo DB with treasure data
PPTX
Augmenting Mongo DB with Treasure Data
PDF
Mongo db 3.4 Overview
PPTX
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
PPTX
PDF
MongoDB Breakfast Milan - Mainframe Offloading Strategies
PDF
MongoDB World 2018: Data Analytics with MongoDB
PPTX
Business Jumpstart: The Right (and Wrong) Use Cases for MongoDB
PPTX
Dataweek-Talk-2014
PPTX
Nosql Now 2012: MongoDB Use Cases
PDF
Introduction to MongoDB Basics from SQL to NoSQL
PDF
Enabling Telco to Build and Run Modern Applications
PPTX
MongoDB Evenings Minneapolis: MongoDB is Cool But When Should I Use It?
PPTX
MongoDB Training
PDF
MongoDB Basics
PDF
Which Questions We Should Have
PPT
MONGODB VASUDEV PRAJAPATI DOCUMENTBASE DATABASE
PPTX
Transform your DBMS to drive engagement innovation with Big Data
Webinar: When to Use MongoDB
When to Use MongoDB...and When You Should Not...
Augmenting Mongo DB with treasure data
Augmenting Mongo DB with Treasure Data
Mongo db 3.4 Overview
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
MongoDB Breakfast Milan - Mainframe Offloading Strategies
MongoDB World 2018: Data Analytics with MongoDB
Business Jumpstart: The Right (and Wrong) Use Cases for MongoDB
Dataweek-Talk-2014
Nosql Now 2012: MongoDB Use Cases
Introduction to MongoDB Basics from SQL to NoSQL
Enabling Telco to Build and Run Modern Applications
MongoDB Evenings Minneapolis: MongoDB is Cool But When Should I Use It?
MongoDB Training
MongoDB Basics
Which Questions We Should Have
MONGODB VASUDEV PRAJAPATI DOCUMENTBASE DATABASE
Transform your DBMS to drive engagement innovation with Big Data
Ad

More from MongoDB (20)

PDF
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
PDF
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
PDF
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
PDF
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
PDF
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
PDF
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
PDF
MongoDB SoCal 2020: MongoDB Atlas Jump Start
PDF
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
PDF
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
PDF
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
PDF
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
PDF
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
PDF
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
PDF
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
PDF
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
PDF
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
PDF
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
PDF
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
PDF
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
PDF
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...

Recently uploaded (20)

PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PPTX
Machine Learning_overview_presentation.pptx
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PPTX
SOPHOS-XG Firewall Administrator PPT.pptx
PPTX
1. Introduction to Computer Programming.pptx
PPTX
Big Data Technologies - Introduction.pptx
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
Accuracy of neural networks in brain wave diagnosis of schizophrenia
PPTX
Tartificialntelligence_presentation.pptx
PPTX
Spectroscopy.pptx food analysis technology
PDF
Encapsulation theory and applications.pdf
PDF
Spectral efficient network and resource selection model in 5G networks
PPTX
Group 1 Presentation -Planning and Decision Making .pptx
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PDF
Machine learning based COVID-19 study performance prediction
PPT
Teaching material agriculture food technology
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PDF
cuic standard and advanced reporting.pdf
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Unlocking AI with Model Context Protocol (MCP)
Diabetes mellitus diagnosis method based random forest with bat algorithm
Machine Learning_overview_presentation.pptx
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
SOPHOS-XG Firewall Administrator PPT.pptx
1. Introduction to Computer Programming.pptx
Big Data Technologies - Introduction.pptx
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Accuracy of neural networks in brain wave diagnosis of schizophrenia
Tartificialntelligence_presentation.pptx
Spectroscopy.pptx food analysis technology
Encapsulation theory and applications.pdf
Spectral efficient network and resource selection model in 5G networks
Group 1 Presentation -Planning and Decision Making .pptx
Dropbox Q2 2025 Financial Results & Investor Presentation
Machine learning based COVID-19 study performance prediction
Teaching material agriculture food technology
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
cuic standard and advanced reporting.pdf
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...

When to Use MongoDB

  • 2. When should you use MongoDB …. And when you should not…. Edouard Servan-Schreiber, Ph.D. Director for Solution Architecture MongoDB [email protected]
  • 3. Agenda • What is MongoDB? • What is MongoDB for? • What does MongoDB do very well…. And less well • What do customers do very well with MongoDB, and what they do not do • Some unusual use cases • When you should use MongoDB
  • 4. CREATE APPLICATIONS NEVER BEFORE POSSIBLE AGILE SCALABLE
  • 5. MongoDB GENERAL PURPOSE DOCUMENT DATABASE OPEN-SOURCE
  • 6. What is MongoDB for? • The data store for all systems of engagement – Demanding, real-time SLAs – Diverse, mixed data sets – Massive concurrency – Globally deployed over multiple sites – No downtime tolerated – Able to grow with user needs – High uncertainty in sizing – Fast scaling needs – Delivers a seamless and consistent experience
  • 7. What MongoDB is NOT • An analytical suite – Not competing with SAS or SPSS • A data warehouse technology – Not competing with Teradata, Netezza, Vertica • A BI tool – Not competing with Tableau or QlikView • Backoffice transaction processing – Not competing with IBM Mainframes • Backend for a billing system or general ledger system – Not competing with Oracle RAC • A search engine – Not competing with Elasticsearch, SOLR
  • 9. MongoDB and Enterprise IT Stack OLTP OLAP
  • 10. Factors Driving Modern Applications Data • 90% data created in last 2 years • 80% enterprise data is unstructured • Unstructured data growing 2X rate of structured data Mobile • 2 Billion smartphones by 2015 • Mobile now >50% internet use • 26 Billion devices on IoT by 2020 Social • 72% of internet use is social media • 2 Billion active users monthly • 93% of businesses use social media Cloud • Compute costs declining 33% YOY • Storage costs declining 38% YOY • Network costs declining 27% YOY
  • 11. MongoDB Strategic Advantages Horizontally Scalable -Sharding Agile Flexible High Performance & Strong Consistency Application Highly Available -Replica Sets { author: “eliot”, date: new Date(), text: “MongoDB”, tags: [“database”, “flexible”, “JSON”]}
  • 12. Document Data Model Relational MongoDB { first_name: ‘Paul’, surname: ‘Miller’, city: ‘London’, location: [45.123,47.232], cars: [ { model: ‘Bentley’, year: 1973, value: 100000, … }, { model: ‘Rolls Royce’, year: 1965, value: 330000, … } ] }
  • 13. Do More With Your Data MongoDB { first_name: ‘Paul’, surname: ‘Miller’, city: ‘London’, location: [45.123,47.232], cars: [ { model: ‘Bentley’, year: 1973, value: 100000, … }, { model: ‘Rolls Royce’, year: 1965, value: 330000, … } } } Rich Queries Find Paul’s cars Find everybody in London with a car built between 1970 and 1980 Geospatial Find all of the car owners within 5km of Trafalgar Sq. Text Search Find all the cars described as having leather seats Aggregation Calculate the average value of Paul’s car collection Map Reduce What is the ownership pattern of colors by geography over time? (is purple trending up in China?)
  • 14. Requirements For These Challenges Addresses Requirement Description Data Types Hierarchical data structure Can match the structure of objects in today’s OOP languages Data Types, Agile Dynamic schema Can handle differently shaped data in a table/collection and not a predefined schema Agile Native OOP language Keeps developers in one environment and encapsulates functionality/validation/rules in one place Volume Scale Can efficiently handle 100s tera & petabytes of data Volumes, New Arch Performance High throughput on a single node and scales horizontally easily Still required Software cost Open source with premium value added services Still required Data consistency How soon you can read data that was just written Still required Rich querying Querying based on any field, e.g. secondary indexes Still required Ease of use Short learning curve and easy to design
  • 15. How Databases Stack Up Requirement RDBMS MongoDB Key/value Wide column Hierarchical data structure Poor Great Poor Good Dynamic schema Poor Great Poor Poor Native OOP language Poor Great Great Great Software cost Poor Great Great Great Performance Poor Great Great Great Scale Poor Great Great Great Data consistency Great Good Poor Poor Rich querying Great Great Poor Poor Ease of use Good Great Good Poor
  • 16. How Databases Stack Up Requirement RDBMS MongoDB Key/value Wide column Hierarchical data structure Poor Great Poor Good Dynamic schema Poor Great Poor Poor Native OOP language VALUE OF NOSQL Poor Great Great Great Software cost Poor Great Great Great Performance Poor Great Great Great Scale Poor Great Great Great Data consistency Great Good Poor Poor Rich querying Great Great Poor Poor Ease of use Good Great Good Poor
  • 17. How Databases Stack Up Requirement RDBMS MongoDB Key/value Wide column Hierarchical data structure Poor Great Poor Good Dynamic schema Poor Great Poor Poor Native OOP language VALUE OF NOSQL Poor Great Great Great Software cost Poor Great Great Great Performance Poor Great Great Great Scale Poor Great Great Great Data consistency Great Good Poor Poor VALUE OF MONGODB Rich querying Great Great Poor Poor Ease of use Good Great Good Poor
  • 18. As a database, where does MongoDB shine? MongoDB does well MongoDB does less well • Straightforward replication • High performance on mixed workloads of reads, writes and updates • Scaling on demand • Location based deployments • Geo spatial queries • High Availability and auto failover • Flexible schema & secondary indexing • Agile development in most programming languages • Commodity infrastructure • Real time analytics • Text indexing • Data consistency • Compression • Resource management * • Collection scanning under load * • Absolute write availability • Faceted search • Joins across collections • SQL* • Transactions over multiple docs
  • 19. As a database, where does MongoDB shine? MongoDB does well • Straightforward replication • High performance on mixed workloads of reads, writes and updates • Scaling on demand • Location based deployment • Geo spatial queries • High Availability and auto failover • Flexible schema & secondary indexing • Agile development in most programming languages • Commodity infrastructure • Real time analytics • Text indexing • Data consistency • Compression Easy to initiate All reads, mixed, and mostly writes No expensive overprovisioning One cluster can span the globe Easy to build relevant mobile apps Low stress operations No need for complex data modeling No need to give up your favorite development language No vendor lock-in through hardware Get value from data right away ! Basic search feature Simpler app design With new version 2.8
  • 20. As a database, where does MongoDB shine? MongoDB does less well • Resource management * • Collection scanning under load * • Absolute write availability • Faceted search • Joins across collections • SQL* • Transactions over multiple docs Needs to be done at infrastructure level Concurrent scans can disrupt the working set Consistency vs Availability Core value of search engines Doc model mitigates need for this Some partial solutions (ODBC) Pushed to application level. Rarely needed with good schema design
  • 21. MongoDB Use Cases Single View Internet of Things Mobile Real-Time Analytics Catalog Personalization Content Management
  • 22. Use cases where MongoDB shines MongoDB is good for: • Single View • Internet of Things – sensor data • Mobile apps – geospatial • Real-time analytics • Catalog • Personalization • Content management • Inventory management • Personalization engines • Shopping cart • Dependent datamarts • Archiving for fast lookup • Collaboration tools • Messaging applications • Log file aggregation • Caching • Adserving • …… MongoDB is less good for: • Search engine • Slicing and dicing of data in unplanned ways requiring joins and full scans • Nanosecond latency writing (real time tick data) • Uptime beyond 99.999%, instant failover • Batch processing
  • 23. Use cases where MongoDB shines MongoDB is good for: • Single View • Internet of Things – sensor data • Mobile apps – geospatial • Real-time analytics • Catalog • Personalization • Content management • Inventory management • Personalization engines • Shopping cart • Dependent datamarts • Archiving for fast lookup • Collaboration tools • Messaging applications • Log file aggregation • Caching • Adserving • …… Callouts: Mixture of analytics and archiving; Build information from data as it comes in; Extract from DW for analysis; Large volume, targeted queries; Sharing in near real time; Twitter-like apps; E.g., SPLUNK; Enable massive reads on consolidated data
  • 24. Use cases where MongoDB shines MongoDB is less good for: • Search engine: Text indexing only for elementary uses • Slicing and dicing of data in unplanned ways requiring joins and full scans: Classic DW usage. MongoDB needs known query pattern. • Nanosecond latency writing (real time tick data): Specialty DBs like Kdb are built for this • Uptime beyond 99.999%, instant failover: Requires failover in <1s • Batch processing: That’s what Hadoop is for… Note: transaction processing does not require database transactions. Moving money from account A to account B is never instantaneous and requires actual processing, usually in batch.
  • 25. Data Consolidation: Operational Data Hub. Diagram labels: Data Warehouse; Real-time or Batch; Engagement Application (×2); Strategic Reporting; Operational Reporting; sources Cards (Data Source 1), Loans (Data Source 2), …, Deposits (Data Source n). Benefits • Real-time • Complete details • Agile • Higher customer retention • Increase wallet share • Proactive exception handling
  • 26. Data Hub for Large Investment Bank Feeds & Batch data • Pricing • Accounts • Securities Master • Corporate actions Source Master Data (RDBMS) Batch Batch Batch Batch Batch Batch Batch Destination Data (RDBMS) Each represents • People $ • Hardware $ • License $ • Reg penalty $ • & other downstream problems
  • 27. Data Hub for Large Investment Bank Feeds & Batch data • Pricing • Accounts • Securities Master • Corporate actions Source Master Data (RDBMS) Batch Batch Batch Batch Batch Batch Batch Destination Data (RDBMS) Each represents • People $ • Hardware $ • License $ • Reg penalty $ • & other downstream problems • Delays up to 36 hours in distributing data by batch • Charged multiple times globally for same data • Incurring regulatory penalties from missing SLAs • Had to manage 20 distributed systems with same data
  • 28. Data Hub for Large Investment Bank Feeds & Batch data • Pricing • Accounts • Securities Master • Corporate actions Real-time Real-time Real-time Real-time Real-time Real-time Real-time Each represents • No people $ • Less hardware $ • Less license $ • No penalty $ • & far fewer problems MongoDB Primary MongoDB Secondaries
  • 29. Data Hub for Large Investment Bank Feeds & Batch data • Pricing • Accounts • Securities Master • Corporate actions Real-time Real-time Real-time Real-time Real-time Real-time Real-time Each represents • No people $ • Less hardware $ • Less license $ • No penalty $ • & far fewer problems MongoDB Primary MongoDB Secondaries • Will save about $40,000,000 in costs and penalties over 5 years • Only charged once for data • Data in sync globally and read locally • Capacity to move to one global shared data service
  • 30. Molecular Similarity Database • Store Chemical Compounds – Fingerprints • Want to find compounds which are “close” to a given compound • Need to return quickly a small set of reasonable candidates • Few researchers working concurrently • Use Tanimoto association coefficient to compare two compounds based on their common fingerprints
  • 31. Big Data Genomics • Very large base of DNA sample sequences – Origin, collection method, sequence, date, … • Enumeration of mutations relative to reference sequence – Positions, mutation type, base • Need to efficiently retrieve all sequences showing a particular mutation • Similar to Content Management System pattern • Add tag array in sequence document with mutation names • Index tag array • Queries looking for affected sequences are indexed and very fast • Easy to set up, flexible representation and details for sequences, flexible evolution • Can scale to massive volumes
  • 32. IoT: Large Industrial Vehicle Manufacturer Shard 1 Secondary Shard 2 Secondary Shard 3 Secondary Shard 1 Primary Shard 1 Secondary Shard 1 Primary Shard 1 Secondary Shard 1 Primary Shard 1 Secondary Central Hub Regional Hub Regional Hub Regional Hub
  • 33. What database do you need for your business?
  • 34. What vehicle do you want for a race?
  • 35. WHAT ARE YOU TRYING TO ACHIEVE?
  • 36. The important aspect of MongoDB • MongoDB was not designed for niche use cases • MongoDB strives to have excellent characteristics applicable to a very broad range of use cases MongoDB is the most balanced database for Enterprise applications and performance
  • 37. Technical: Why MongoDB • High performance (thousands to millions of queries/sec) for reads & writes • Need flexible schema, rich querying with any number of secondary indexes • Need for replication across multiple data centers, even globally • Need to deploy rapidly and scale on demand (start small and fast, grow easily) • 99.999% availability • Real time analysis in the database, under load • Geospatial querying • Processing in real time, not in batch • Need to promote agile coding methodologies • Deploy over commodity computing and storage architectures • Point in Time recovery • Need strong data consistency • Advanced security
  • 39. Business: Why MongoDB • Management tooling and services • Ease of hiring • Commercial license • Ease of developer adoption • Global Support • Global Professional Services • IT ecosystem integration • Company stability • De facto standard for next generation database
  • 41. Summary • MongoDB is for Systems of Engagement • Complements search engines, Hadoop and Data Warehouses – Does not replace these technologies • Wide range of use cases – and that’s the core point! – Excellent across many possible use cases, not just a few • Recognized by Gartner and Forrester • De facto standard for next generation database • Enterprise maturity and integration
  • 42. We Can Help MongoDB Enterprise Advanced The best way to run MongoDB in your data center MongoDB Management Service (MMS) The easiest way to run MongoDB in the cloud Production Support In production and under control Development Support Let’s get you running Consulting We solve problems Training Get your teams up to speed
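
Slide 30 compares compounds with the Tanimoto association coefficient: the number of fingerprint features two compounds share, divided by the number of features present in either. A minimal Python sketch follows, assuming each fingerprint is stored as a set of "on" bit positions. The representation, the names `tanimoto` and `closest`, the 0.5 threshold and the sample compounds are illustrative assumptions, not taken from the deck.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient: shared features / features present in either fingerprint."""
    if not fp_a and not fp_b:
        return 0.0  # assumed convention when both fingerprints are empty
    common = len(fp_a & fp_b)
    return common / (len(fp_a) + len(fp_b) - common)


def closest(query: Set[int], compounds: Dict[str, Set[int]],
            threshold: float = 0.5, limit: int = 5) -> List[Tuple[str, float]]:
    """Return up to `limit` compounds whose similarity to `query` is >= threshold."""
    scored = [(tanimoto(query, fp), name) for name, fp in compounds.items()]
    hits = [(score, name) for score, name in scored if score >= threshold]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, then by name
    return [(name, round(score, 3)) for score, name in hits[:limit]]


# Hypothetical fingerprints, for illustration only.
compounds = {"c1": {1, 2, 3, 4}, "c2": {1, 2, 3, 9}, "c3": {7, 8}}
query = {1, 2, 3, 5}
print(closest(query, compounds))
```

A brute-force scan like this is only a sketch; returning "a small set of reasonable candidates" quickly, as the slide asks, would normally mean narrowing the candidates with an indexed query before scoring them.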

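Slide 31's pattern is to add a tag array of mutation names to each sequence document and index that array. The sketch below mimics the idea in plain Python with an inverted index from tag to sequence ids; in MongoDB itself, an index on an array field (a multikey index) plays this role. The field names, ids and mutation names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sequence documents; field names and mutation names are illustrative.
sequences = [
    {"_id": "seq-001", "origin": "lab-A", "mutations": ["m.100A>G", "m.250C>T"]},
    {"_id": "seq-002", "origin": "lab-B", "mutations": ["m.250C>T"]},
    {"_id": "seq-003", "origin": "lab-A", "mutations": []},
]


def build_tag_index(docs, field="mutations"):
    """Map each tag to the set of document ids whose tag array contains it."""
    index = defaultdict(set)
    for doc in docs:
        for tag in doc.get(field, []):
            index[tag].add(doc["_id"])
    return index


def find_by_mutation(index, tag):
    """Look up affected sequences through the index, without scanning every document."""
    return sorted(index.get(tag, set()))


tag_index = build_tag_index(sequences)
print(find_by_mutation(tag_index, "m.250C>T"))
```

Because each document carries its own tag array, new mutation names can be added to individual sequences without a schema change, which matches the "flexible evolution" point on the slide.
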
Editor's Notes

  • #13: Here we have greatly reduced the relational data model for this application to two tables. In reality no database has two tables. It is much more common to have hundreds or thousands of tables. And as a developer, where do you begin when you have a complex data model? If you’re building an app, you’re really thinking about just a handful of common things, like products, and these can be represented in a document much more easily than in a complex relational model, where the data is broken up in a way that doesn’t really reflect the way you think about the data or write an application.
  • #16: Add H-M-L
  • #17: Add H-M-L
  • #18: Add H-M-L
  • #43: What We Sell We are the MongoDB experts. Over 1,000 organizations rely on our commercial offerings, including leading startups and 30 of the Fortune 100. We offer software and services to make your life easier: MongoDB Enterprise Advanced is the best way to run MongoDB in your data center. It’s a finely-tuned package of advanced software, support, certifications, and other services designed for the way you do business. MongoDB Management Service (MMS) is the easiest way to run MongoDB in the cloud. It makes MongoDB the system you worry about the least and like managing the most. Production Support helps keep your system up and running and gives you peace of mind. MongoDB engineers help you with production issues and any aspect of your project. Development Support helps you get up and running quickly. It gives you a complete package of software and services for the early stages of your project. MongoDB Consulting packages get you to production faster, help you tune performance in production, help you scale, and free you up to focus on your next release. MongoDB Training helps you become a MongoDB expert, from design to operating mission-critical systems at scale. Whether you’re a developer, DBA, or architect, we can make you better at MongoDB.