Indices, Query Optimizer, Performance Tuning
Aaron Staple, aaron@10gen.com
What is an index?
A set of references to your documents, efficiently ordered by key
Documents: {x:0.5,y:0.5} {x:2,y:0.5} {x:5,y:2} {x:-4,y:10} {x:3,y:'f'}
[Diagram: the same documents referenced in key order, first by index {x:1}, then by index {y:1}]
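As a rough illustration of "references ordered by key", the sketch below builds a toy index over the slide's documents. It is plain JavaScript, not MongoDB code; `buildIndex` and the use of array positions as document references are assumptions made for the example.

```javascript
// Hypothetical in-memory stand-in for a collection; array positions act as document references.
const docs = [
  { x: 0.5, y: 0.5 },
  { x: 2, y: 0.5 },
  { x: 5, y: 2 },
  { x: -4, y: 10 },
  { x: 3, y: 'f' },
];

// Build a toy "index" on one numeric field: {key, ref} pairs sorted by key.
function buildIndex(collection, field) {
  return collection
    .map((doc, ref) => ({ key: doc[field], ref }))
    .sort((a, b) => a.key - b.key);
}

const xIndex = buildIndex(docs, 'x');
console.log(xIndex.map((e) => e.key)); // keys in ascending order
console.log(xIndex.map((e) => e.ref)); // positions of the referenced documents
```

The documents themselves are never moved; only the small {key, ref} entries are kept in sorted order.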
How is an index stored?
B-tree
[Diagram: B-tree over keys {x:-4}, {x:0.5}, {x:1}, {x:2}, {x:3}, {x:5}, with node ranges x<0, 0<=x<1, 2<=x<5, 3<=x<4, 4<=x<5, x>=5]
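What if I have multiple indices?
[Diagram: document {a:3,b:'x',c:[1,2,3]} referenced from four indices: {a:1} with key {a:3}; {b:1} with key {b:'x'}; {c:1} with keys {c:1}, {c:2}, {c:3}; {d:1} with key {d:null}]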
How does a simple query work?
Tree traversal
[Diagram: B-tree over keys {x:-4}, {x:0.5}, {x:1}, {x:2}, {x:3}, {x:5}, with node ranges x<0, 0<=x<1, 2<=x<5, 3<=x<4, 4<=x<5, x>=5]
Simple document lookup
db.c.findOne( {_id:2} ), using index {_id:1}
db.c.find( {x:2} ), using index {x:1}
db.c.find( {x:{$in:[2,3]}} ), using index {x:1}
db.c.find( {'x.a':1} ), using index {'x.a':1}
  Matches {_id:1,x:{a:1}}
db.c.find( {x:{a:1}} ), using index {x:1}
  Matches {_id:1,x:{a:1}}, but not {_id:2,x:{a:1,b:2}}
QUESTION: What about db.c.find( {$where:"this.x == this.y"} ), using index {x:1}?
Indices cannot be used for $where clauses, but if the query also contains non-$where elements, indices can be used for those elements.
How does a range query work?
Tree traversal + scan: find({x:{$gte:3,$lte:5}})
[Diagram: B-tree over keys {x:-4}, {x:0.5}, {x:1}, {x:2}, {x:3}, {x:4}, {x:5}, {x:6}, with node ranges x<0, 0<=x<1, 2<=x<5, 3<=x<4, 4<=x<5, x>=5]
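The "traversal + scan" idea can be sketched with a sorted array standing in for the B-tree leaves: a binary search finds the lower bound, then a forward walk collects keys until the upper bound is passed. This is an assumption-laden simplification, not the server's implementation; `lowerBound` and `rangeScan` are hypothetical names.

```javascript
// Sorted index keys taken from the slide's diagram (a flat array standing in for B-tree leaves).
const keys = [-4, 0.5, 1, 2, 3, 4, 5, 6];

// Binary search for the first position whose key is >= target (the "traversal" step).
function lowerBound(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Walk forward from the lower bound until a key exceeds the upper bound (the "scan" step).
function rangeScan(sortedKeys, gte, lte) {
  const out = [];
  for (let i = lowerBound(sortedKeys, gte); i < sortedKeys.length && sortedKeys[i] <= lte; i++) {
    out.push(sortedKeys[i]);
  }
  return out;
}

// Equivalent in spirit to find({x:{$gte:3,$lte:5}}) over index {x:1}.
const rangeResult = rangeScan(keys, 3, 5);
console.log(rangeResult);
```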
Document range scan
db.c.find( {x:{$gt:2}} ), using index {x:1}
db.c.find( {x:{$gt:2,$lt:5}} ), using index {x:1}
db.c.find( {x:/^a/} ), using index {x:1}
QUESTION: What about db.c.find( {x:/a/} ), using index {x:1}?
The letter 'a' can appear anywhere in a matching string, so lexicographic ordering on strings won't help. However, the index can still be used to restrict the scan to documents where x is a string (e.g. not a number) or x is the regular expression /a/.
Other operations
db.c.count( {x:2} ) using index {x:1}
db.c.distinct( {x:2} ) using index {x:1}
db.c.update( {x:2}, {x:3} ) using index {x:1}
db.c.remove( {x:2} ) using index {x:1}
QUESTION: What about db.c.update( {x:2}, {$inc:{x:3}} ), using index {x:1}?
Older versions of MongoDB didn't support modifiers on indexed fields, but we now support this.
Missing fields
db.c.find( {x:null} ), using index {x:1}
  Matches {_id:5}
  Matches {_id:5,x:null}
QUESTION: What about db.c.find( {x:{$exists:true}} ), using index {x:1}?
The index is not currently used, though we will fix this in MongoDB 1.6.
Array matching
All the following match {_id:6,x:[2,10]} and use index {x:1}
db.c.find( {x:2} )
db.c.find( {x:10} )
db.c.find( {x:{$gt:5}} )
db.c.find( {x:[2,10]} )
db.c.find( {x:{$in:[2,5]}} )
QUESTION: What about db.c.find( {x:{$all:[2,10]}} )?
The index will be used to look up all documents matching {x:2}.
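One way to picture why these queries can all use index {x:1} is to assume the index holds one entry per array element, each pointing back at the same document. The sketch below models that assumption in plain JavaScript; `buildMultikeyIndex` and `lookup` are hypothetical helpers, and the lookup is a linear filter rather than a tree search.

```javascript
// Toy collection: one document with an array value, one with a scalar.
const collection = [
  { _id: 6, x: [2, 10] },
  { _id: 7, x: 3 },
];

// Assumed simplification: emit one index entry per array element, sorted by key.
function buildMultikeyIndex(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const key of values) entries.push({ key, id: doc._id });
  }
  return entries.sort((a, b) => a.key - b.key);
}

// Return the distinct _ids whose key satisfies the predicate (linear filter, for brevity).
function lookup(index, predicate) {
  return [...new Set(index.filter((e) => predicate(e.key)).map((e) => e.id))];
}

const multikey = buildMultikeyIndex(collection, 'x');
console.log(lookup(multikey, (k) => k === 2)); // like find({x:2})
console.log(lookup(multikey, (k) => k > 5));   // like find({x:{$gt:5}})
```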
What is a compound index?
[Diagram: documents {x:2,y:3}, {x:1,y:5}, {x:2,y:9}, {x:3,y:1}, {x:1,y:1} referenced in order of a compound key on x and y]
How are bounds determined for a compound index?
find( {x:{$gte:2,$lte:4},y:6} )
[Diagram: documents {x:3,y:1}, {x:2,y:6}, {x:3,y:7}, {x:3.5,y:6}, {x:2,y:3}, {x:4,y:6}, {x:1,y:5}, {x:5,y:6}, {x:1,y:1} in compound index order, with the scanned range marked]
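A small sketch of the bounds idea, assuming a compound index {x:1,y:1} ordered by x and then y: the scan runs from {x:2,y:6} to {x:4,y:6}, and every entry in between is visited even when its y does not match. The helper `boundedScan` is hypothetical, and the skip to the lower bound is linear here rather than a tree traversal.

```javascript
// Documents from the slide, in no particular order.
const entries = [
  { x: 3, y: 1 }, { x: 2, y: 6 }, { x: 3, y: 7 }, { x: 3.5, y: 6 }, { x: 2, y: 3 },
  { x: 4, y: 6 }, { x: 1, y: 5 }, { x: 5, y: 6 }, { x: 1, y: 1 },
];

// Compare by x first, then y: the ordering of a {x:1,y:1} compound index.
const compare = (a, b) => (a.x - b.x) || (a.y - b.y);
const compoundIndex = [...entries].sort(compare);

// Visit every entry between the two bounds, counting how many were scanned.
function boundedScan(index, low, high, matches) {
  let scanned = 0;
  const results = [];
  for (const entry of index) {
    if (compare(entry, low) < 0) continue; // before the lower bound
    if (compare(entry, high) > 0) break;   // past the upper bound
    scanned++;
    if (matches(entry)) results.push(entry);
  }
  return { scanned, results };
}

// Like find( {x:{$gte:2,$lte:4},y:6} ) with bounds {x:2,y:6} .. {x:4,y:6}.
const scan = boundedScan(compoundIndex, { x: 2, y: 6 }, { x: 4, y: 6 }, (e) => e.y === 6);
console.log(scan.scanned, scan.results);
```

In this toy data set five entries fall inside the bounds but only three have y equal to 6, which is the gap between entries scanned and documents returned.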
How does an ordered range query work?
Simple range scan if the index already ensures the desired ordering: find( {x:2} ).sort( {y:1} )
[Diagram: documents {x:2,y:3}, {x:1,y:5}, {x:2,y:9}, {x:3,y:1}, {x:1,y:1} under a compound index on x and y]
Otherwise, in-memory sort of matching documents: find( {x:2} ).sort( {y:1} )
[Diagram: matching documents {x:2,y:3} and {x:2,y:9} found via index {x:1}, then sorted in memory]
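The two cases can be compared in a few lines of plain JavaScript. The sketch assumes that entries with equal x come out of a {x:1,y:1} index already ordered by y, while an {x:1} index gives no guarantee about y order; the variable names are illustrative only.

```javascript
// Toy collection in insertion order.
const docs = [
  { x: 2, y: 9 }, { x: 1, y: 5 }, { x: 2, y: 3 }, { x: 3, y: 1 }, { x: 1, y: 1 },
];

// Case 1: index {x:1,y:1}. Entries with equal x are already ordered by y,
// so the x:2 range can be returned as-is.
const byXY = [...docs].sort((a, b) => (a.x - b.x) || (a.y - b.y));
const fromCompound = byXY.filter((d) => d.x === 2);

// Case 2: index {x:1} only. The x:2 matches arrive in no particular y order
// (insertion order here), so they are sorted in memory afterwards.
const fromSingle = docs.filter((d) => d.x === 2).sort((a, b) => a.y - b.y);

console.log(fromCompound, fromSingle);
```

Both paths return the same documents; the difference is whether an extra sort step over the matches is needed.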
Document ordering
db.c.find( {} ).sort( {x:1} ), using index {x:1}
db.c.find( {} ).sort( {x:-1} ), using index {x:1}
db.c.find( {x:{$gt:4}} ).sort( {x:-1} ), using index {x:1}
db.c.find( {} ).sort( {'x.a':1} ), using index {'x.a':1}
QUESTION: What about db.c.find( {y:1} ).sort( {x:1} ), using index {x:1}?
The index will be used to ensure ordering, provided there is no better index.
Compound indices and ordering
db.c.find( {x:10,y:20} ), using index {x:1,y:1}
db.c.find( {x:10,y:20} ), using index {x:1,y:-1}
db.c.find( {x:{$in:[10,20]},y:20} ), using index {x:1,y:1}
db.c.find().sort( {x:1,y:1} ), using index {x:1,y:1}
db.c.find().sort( {x:-1,y:1} ), using index {x:1,y:-1}
db.c.find( {x:10} ).sort( {y:1} ), using index {x:1,y:1}
QUESTION: What about db.c.find( {y:10} ).sort( {x:1} ), using index {x:1,y:1}?
The index will be used to ensure ordering, provided no better index is available.
What if we negate a query?
find({x:{$ne:2}})
[Diagram: index {x:1} with keys {x:1}, {x:2}, {x:3}; the query excludes {x:2}]
When indices are less helpful
db.c.find( {x:{$ne:1}} )
db.c.find( {x:{$mod:[10,1]}} )
  Uses index {x:1} to scan numbers only
db.c.find( {x:{$not:/a/}} )
db.c.find( {x:{$gte:0,$lte:10},y:5} ) using index {x:1,y:1}
  Currently must scan all elements from {x:0,y:5} to {x:10,y:5}, but some improvements may be possible
db.c.find( {$where:'this.x == 5'} )
QUESTION: What about db.c.find( {x:{$not:/^a/}} ), using index {x:1}?
The index is not used currently, but will be used in MongoDB 1.6.
How is an index chosen?
find( {x:2,y:3} )
[Diagram: candidate indices {x:1} and {y:1} scanned over documents {x:2,y:1}, {x:2,y:3}, {x:2,y:9}, {x:1,y:3}; a check mark shows the chosen plan]
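The slide does not spell out the selection algorithm, so the following is only a guess at its flavour: try each candidate index and keep the one that has to visit the fewest entries for this query. `entriesScanned` and `choosePlan` are hypothetical helpers, and "fewest entries visited" is used as a stand-in for "finishes first".

```javascript
// Documents from the slide's diagram.
const docs = [
  { x: 2, y: 1 }, { x: 2, y: 3 }, { x: 2, y: 9 }, { x: 1, y: 3 },
];
const query = { x: 2, y: 3 };

// Number of entries a single-field index would have to visit for this query:
// every entry whose indexed field equals the queried value.
function entriesScanned(collection, indexedField, q) {
  return collection.filter((d) => d[indexedField] === q[indexedField]).length;
}

// Assumed selection rule: pick the candidate that visits the fewest entries.
function choosePlan(collection, candidates, q) {
  return candidates
    .map((field) => ({ field, scanned: entriesScanned(collection, field, q) }))
    .reduce((best, plan) => (plan.scanned < best.scanned ? plan : best));
}

const plan = choosePlan(docs, ['x', 'y'], query);
console.log(plan);
```

For this toy data an index on y visits two entries and an index on x visits three, so the sketch prefers y; with different data the other index could win.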
Query pattern matching
Very simple algorithm, few complaints so far
find({x:1})
find({x:2})
find({x:100})
find({x:{$gt:4}})
find({x:{$gte:6}})
find({x:1,y:2})
find({x:{$gt:4,$lte:10}})
find({x:{$gte:6,$lte:400}})
find({x:1}).sort({y:1})
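One plausible reading of "query pattern" is that queries are grouped by shape rather than by value, so find({x:1}) and find({x:100}) share a pattern while find({x:{$gt:4}}) does not. The sketch below implements that reading; the shape names and the `queryPattern` key format are assumptions, not the server's actual representation.

```javascript
// Classify one field's condition by shape only, ignoring the concrete values.
function fieldShape(condition) {
  const isOperatorObject =
    condition !== null && typeof condition === 'object' && !Array.isArray(condition) &&
    Object.keys(condition).some((k) => k.startsWith('$'));
  if (!isOperatorObject) return 'equality';
  const lower = '$gt' in condition || '$gte' in condition;
  const upper = '$lt' in condition || '$lte' in condition;
  if (lower && upper) return 'bounded';
  if (lower) return 'lowerBound';
  if (upper) return 'upperBound';
  return 'other';
}

// Hypothetical pattern key: the field shapes plus the requested sort.
function queryPattern(query, sort = {}) {
  const fields = Object.keys(query).sort().map((f) => `${f}:${fieldShape(query[f])}`);
  return `${fields.join(',')}|sort=${JSON.stringify(sort)}`;
}

console.log(queryPattern({ x: 1 }));
console.log(queryPattern({ x: { $gt: 4, $lte: 10 } }));
console.log(queryPattern({ x: 1 }, { y: 1 }));
```

A key like this could be used to remember which index worked well for a given shape of query, so that later queries of the same shape reuse the choice.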
Query optimizer
In charge of picking which index to use for a query/count/update/delete/etc
Usually it does a good job, but if you know what you're doing you can override it
db.c.find( {x:2,y:3} ).hint( {y:1} )
  Use index {y:1} and avoid trying {x:1}
As your data changes, different indices may be chosen. Ordering requirements should be made explicit using sort().
QUESTION: How can you force a full collection scan instead of using indices?
db.c.find( {x:2,y:3} ).hint( {$natural:1} ) to bypass indices
Geospatial indices
db.c.find( {a:[50,50]} ) using index {a:'2d'}
db.c.find( {a:{$near:[50,50]}} ) using index {a:'2d'}
  Results are sorted closest to farthest
db.c.find( {a:{$within:{$box:[[40,40],[60,60]]}}} ) using index {a:'2d'}
db.c.find( {a:{$within:{$center:[[50,50],10]}}} ) using index {a:'2d'}
db.c.find( {a:{$near:[50,50]},b:2} ) using index {a:'2d',b:1}
QUESTION: Most queries can be performed with or without an index. Is this true of geospatial queries?
No. A geospatial query requires an index.
How does an insert work?
Tree traversal and insert, split if necessary
[Diagram: {x:3.5} inserted into a B-tree over keys {x:-4}, {x:0.5}, {x:1}, {x:2}, {x:3}, {x:4}, {x:5}, {x:6}]
What if my keys are increasing?
You'll always insert on the right
[Diagram: B-tree over keys {x:-4} through {x:6}, with new keys {x:7}, {x:8}, {x:9} added at the right edge]
Why is RAM important?
RAM is basically used as an LRU disk cache
[Diagram: whole index in RAM vs. portion of index in RAM]
Creating an index
{_id:1} index created automatically
  For non-capped collections
db.c.ensureIndex( {x:1} )
  Can create an index at any time, even when you already have plenty of data in your collection
  Creating an index will block MongoDB unless you specify background index creation
db.c.ensureIndex( {x:1}, {background:true} )
  Background index creation still impacts performance; run it at non-peak times if you're concerned
QUESTION: Can an index be removed during background creation?
Not at this time.
Unique key constraints
db.c.ensureIndex( {x:1}, {unique:true} )
  Don't allow {_id:10,x:2} and {_id:11,x:2}
  Don't allow {_id:12} and {_id:13} (both match {x:null})
What if duplicates exist before the index is created?
  Normally index creation fails and the index is removed
db.c.ensureIndex( {x:1}, {unique:true,dropDups:true} )
QUESTION: In dropDups mode, which duplicates will be removed?
The first document according to the collection's "natural order" will be preserved.
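The "first document in natural order is preserved" rule can be modelled in a few lines. This is a plain JavaScript sketch of the described behaviour, not the server's code; `dropDups` here is a hypothetical function, and a missing field is treated as null as stated on the slide.

```javascript
// Documents in natural (insertion) order; the _id values follow the slide, {_id:14} is made up.
const natural = [
  { _id: 10, x: 2 },
  { _id: 11, x: 2 },
  { _id: 12 },
  { _id: 13 },
  { _id: 14, x: 7 },
];

// Keep the first document seen for each key; a missing field counts as null.
function dropDups(docs, field) {
  const seen = new Set();
  const kept = [];
  for (const doc of docs) {
    const key = doc[field] === undefined ? null : doc[field];
    if (seen.has(key)) continue; // a later duplicate is dropped
    seen.add(key);
    kept.push(doc);
  }
  return kept;
}

const survivors = dropDups(natural, 'x');
console.log(survivors.map((d) => d._id));
```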
Cleaning up an index
db.system.indexes.find( {ns:'db.c'} )
db.c.dropIndex( {x:1} )
db.c.dropIndexes()
db.c.reIndex()
  Rebuilds all indices, removing index cruft that has built up over large numbers of updates and deletes. Index cruft will not exist in MongoDB 1.6, so this command will be deprecated.
QUESTION: Why would you want to drop an index?
See next slide…
Limits and tradeoffs
Max 40 indices per collection
Logically equivalent indices are not prevented (e.g. {x:1} and {x:-1})
Indices can improve speed of queries, but make inserts slower
A more specific index {a:1,b:1,c:1} can be more helpful than a less specific index {a:1}, but the more specific index will be larger and thus harder to fit in RAM
QUESTION: Do indices make updates slower? How about deletes?
It depends: finding your document might be faster, but if any indexed fields are changed the indices must be updated.
Mongod log output
query test.c ntoreturn:1 reslen:69 nscanned:100000 { i: 99999.0 }  nreturned:1 157ms
query test.$cmd ntoreturn:1 command: { count: "c", query: { type: 0.0, i: { $gt: 99000.0 } }, fields: {} } reslen:64 256ms
query:{ query: {}, orderby: { i: 1.0 } } ... query test.c ntoreturn:0 exception  1378ms ... User Exception 10128:too much key data for sort() with no index.  add an index or specify a smaller limit
query test.c ntoreturn:0 reslen:4783 nscanned:100501 { query: { type: 500.0 }, orderby: { i: 1.0 } }  nreturned:101 390ms
You may occasionally see a slow operation as a result of disk activity or mongo cleaning things up; some messages about slow ops are spurious.
Keep this in mind when you run the same op a massive number of times and it appears slow only very rarely.
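Log lines like the ones above can be screened for a large gap between entries scanned and documents returned, which is often a hint that an index is missing. The parser below is a sketch that assumes only the line layout shown on this slide; `parseSlowQuery` is a hypothetical helper and other log formats will not match.

```javascript
// Extract nscanned, nreturned and the trailing millisecond figure from one log line.
// Returns null when the line does not carry all three fields (e.g. the count command line).
function parseSlowQuery(line) {
  const nscanned = /nscanned:(\d+)/.exec(line);
  const nreturned = /nreturned:(\d+)/.exec(line);
  const millis = /(\d+)ms\s*$/.exec(line);
  if (!nscanned || !nreturned || !millis) return null;
  const scanned = Number(nscanned[1]);
  const returned = Number(nreturned[1]);
  return {
    scanned,
    returned,
    millis: Number(millis[1]),
    // Entries scanned per document returned; a high value suggests a poor index fit.
    ratio: returned === 0 ? Infinity : scanned / returned,
  };
}

const sample = 'query test.c ntoreturn:1 reslen:69 nscanned:100000 { i: 99999.0 }  nreturned:1 157ms';
console.log(parseSlowQuery(sample));
```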
Profiling
Record same info as with log messages, but in a database collection
> db.system.profile.find()
{"ts" : "Thu Jan 29 2009 15:19:32 GMT-0500 (EST)" , "info" : "query test.$cmd ntoreturn:1 reslen:66 nscanned:0  <br>query: { profile: 2 }  nreturned:1 bytes:50" , "millis" : 0}
...
> db.system.profile.find( { info: /test.foo/ } )
> db.system.profile.find( { millis : { $gt : 5 } } )
> db.system.profile.find().sort({$natural:-1})
Enable explicitly using levels (0:off, 1:slow ops (>100ms), 2:all ops)
> db.setProfilingLevel(2);
{"was" : 0 , "ok" : 1}
> db.getProfilingLevel()
2
> db.setProfilingLevel( 1 , 10 ); // slow means > 10ms
Profiling impacts performance, but not severely
Query explain
> db.c.find( {x:1000,y:0} ).explain()
{
  "cursor" : "BtreeCursor x_1",
  "indexBounds" : [
    [
      { "x" : 1000 },
      { "x" : 1000 }
    ]
  ],
  "nscanned" : 10,
  "nscannedObjects" : 10,
  "n" : 10,
  "millis" : 0,
  "oldPlan" : {
    "cursor" : "BtreeCursor x_1",
    "indexBounds" : [
      [
        { "x" : 1000 },
        { "x" : 1000 }
      ]
    ]
  },
  "allPlans" : [
    {
      "cursor" : "BtreeCursor x_1",
      "indexBounds" : [
        [
          { "x" : 1000 },
          { "x" : 1000 }
        ]
      ]
    },
    {
      "cursor" : "BtreeCursor y_1",
      "indexBounds" : [
        [
          { "y" : 0 },
          { "y" : 0 }
        ]
      ]
    },
    {
      "cursor" : "BasicCursor",
      "indexBounds" : [ ]
    }
  ]
}
Example 1
> for( i = 0; i < 100000; ++i ) { db.c.save( {i:i} ); } // populate the collection
> db.c.findOne( {i:99999} )
{ "_id" : ObjectId("4bb962dddfdcf5761c1ec6a3"), "i" : 99999 }
query test.c ntoreturn:1 reslen:69 nscanned:100000 { i: 99999.0 }  nreturned:1 157ms
> db.c.find( {i:99999} ).limit(1).explain()
{
  "cursor" : "BasicCursor",
  "indexBounds" : [ ],
  "nscanned" : 100000,
  "nscannedObjects" : 100000,
  "n" : 1,
  "millis" : 161,
  "allPlans" : [
    {
      "cursor" : "BasicCursor",
      "indexBounds" : [ ]
    }
  ]
}
> db.c.ensureIndex( {i:1} ); // fix: index the queried field
Example 2
> for( i = 0; i < 100000; ++i ) { db.c.save( {type:i%2,i:i} ); } // populate the collection
> db.c.count( {type:0,i:{$gt:99000}} )
499
query test.$cmd ntoreturn:1 command: { count: "c", query: { type: 0.0, i: { $gt: 99000.0 } }, fields: {} } reslen:64 256ms
> db.c.find( {type:0,i:{$gt:99000}} ).limit(1).explain()
{
  "cursor" : "BtreeCursor type_1",
  "indexBounds" : [
    [
      { "type" : 0 },
      { "type" : 0 }
    ]
  ],
  "nscanned" : 49502,
  "nscannedObjects" : 49502,
  "n" : 1,
  "millis" : 349,
...
> db.c.ensureIndex( {type:1,i:1} ); // fix: compound index covering both fields
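The figures in this example (a count of 499 and nscanned of 49502 for the first match) can be reproduced with simple arithmetic over the same data set. The sketch assumes that entries with equal type in an index on type alone are visited in insertion order; that ordering is an assumption made for illustration, not something the slide states.

```javascript
// Rebuild the example's data set: 100000 documents with type alternating 0/1.
const docs = [];
for (let i = 0; i < 100000; ++i) docs.push({ type: i % 2, i });

// Documents matching {type:0, i:{$gt:99000}}.
const matches = (d) => d.type === 0 && d.i > 99000;
const count = docs.filter(matches).length;

// Index on type only: walk the type:0 entries (assumed insertion order)
// until the first one that also satisfies i > 99000, as limit(1) would.
let scanned = 0;
for (const d of docs) {
  if (d.type !== 0) continue; // outside the type:0 index range
  scanned++;
  if (d.i > 99000) break;     // first match found
}

console.log(count, scanned);
```

With an index on {type:1,i:1}, a scan could instead start directly at the first type:0 entry with i greater than 99000, which is why the compound index helps here.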
Example 3
> for( i = 0; i < 1000000; ++i ) { db.c.save( {i:i} ); } // populate the collection
> db.c.find().sort( {i:1} )
error: {
  "$err" : "too much key data for sort() with no index.  add an index or specify a smaller limit"
}
> db.c.find().sort( {i:1} ).explain()
JS Error: uncaught exception: error: {
  "$err" : "too much key data for sort() with no index.  add an index or specify a smaller limit"
}
> db.c.ensureIndex( {i:1} ); // fix: index the sort field
> db.c.find().sort( {i:1} ).limit( 1000 ); //alternatively
Example 4
> for( i = 0; i < 1000000; ++i ) { db.c.save( {i:i,type:i%1000} ); } // populate the collection
> db.c.find( {type:500} ).sort( {i:1} )
{ "_id" : ObjectId("4bba4904dfdcf5761c2f917e"), "i" : 500, "type" : 500 }
{ "_id" : ObjectId("4bba4904dfdcf5761c2f9566"), "i" : 1500, "type" : 500 }
...
query test.c ntoreturn:0 reslen:4783 nscanned:100501 { query: { type: 500.0 }, orderby: { i: 1.0 } }  nreturned:101 390ms
> db.c.find( {type:500} ).sort( {i:1} ).explain()
{
  "cursor" : "BtreeCursor i_1",
  "indexBounds" : [
    [
      { "i" : { "$minElement" : 1 } },
      { "i" : { "$maxElement" : 1 } }
    ]
  ],
  "nscanned" : 1000000,
  "nscannedObjects" : 1000000,
  "n" : 1000,
  "millis" : 5388,
...
> db.c.ensureIndex( {type:1,i:1} ); // fix: compound index on the filter and sort fields
Questions?
Get involved: www.mongodb.org
  Downloads, user group, chat room
Follow @mongodb
Upcoming events: www.mongodb.org/display/DOCS/Events
  SF MongoDB office hours Mondays 4-6pm at Epicenter Café
  SF MongoDB meetup May 17 at Engine Yard
Commercial support: www.10gen.com
jobs@10gen.com
MongoDB's index and query optimize

  • 2. What is an index?A set of references to your documents, efficiently ordered by key{x:0.5,y:0.5}{x:2,y:0.5}{x:5,y:2}{x:-4,y:10}{x:3,y:’f’}
  • 3. What is an index?A set of references to your documents, efficiently ordered by key{x:1}{x:0.5,y:0.5}{x:2,y:0.5}{x:5,y:2}{x:-4,y:10}{x:3,y:’f’}
  • 4. What is an index?A set of references to your documents, efficiently ordered by key{y:1}{x:0.5,y:0.5}{x:2,y:0.5}{x:5,y:2}{x:-4,y:10}{x:3,y:’f’}
  • 5. How is an index stored?B-tree{x:2}{x:3}3<=x<44<=x<5{x:0.5}2<=x<5{x:5}0<=x<1x>=5x<0{x:-4}{x:1}
  • 6. What if I have multiple indices?{c:1}{a:3}{c:2}{c:3}{b:’x’}{d:null}{a:3,b:’x’,c:[1,2,3]}{a:1}{c:1}{b:1}{d:1}
  • 7. How does a simple query work?Tree traversal{x:2}{x:3}3<=x<44<=x<5{x:0.5}2<=x<5{x:5}0<=x<1x>=5x<0{x:-4}{x:1}
  • 8. Simple document lookup db.c.findOne( {_id:2} ), using index {_id:1}db.c.find( {x:2} ), using index {x:1}db.c.find( {x:{$in:[2,3]}} ), using index {x:1}db.c.find( {‘x.a’:1} ), using index {‘x.a’:1}Matches {_id:1,x:{a:1}}db.c.find( {x:{a:1}} ), using index {x:1}Matches {_id:1,x:{a:1}}, but not {_id:2,x:{a:1,b:2}}QUESTION: What about db.c.find( {$where:“this.x == this.y”} ), using index {x:1}?Indices cannot be used for $where type queries, but if there are non-where elements in the query then indices can be used for the non-where elements.
  • 9. How does a range query work?Tree traversal + scan: find({x:{$gte:3,$lte:5}}){x:2}{x:3}{x:4}3<=x<44<=x<5{x:0.5}2<=x<5{x:5}0<=x<1{x:6}x>=5x<0{x:-4}{x:1}
  • 10. Document range scandb.c.find( {x:{$gt:2}} ), using index {x:1}db.c.find( {x:{$gt:2,$lt:5}} ), using index {x:1}db.c.find( {x:/^a/} ), using index {x:1}QUESTION: What about db.c.find( {x:/a/} ), using index {x:1}?The letter ‘a’ can appear anywhere in a matching string, so lexicographic ordering on strings won’t help. However, we can use the index to find the range of documents where x is string (eg not a number) or x is the regular expression /a/.
  • 11. Other operationsdb.c.count( {x:2} ) using index {x:1}db.c.distinct( {x:2} ) using index {x:1}db.c.update( {x:2}, {x:3} ) using index {x:1}db.c.remove( {x:2} ) using index {x:1}QUESTION: What about db.c.update( {x:2}, {$inc:{x:3}} ), using index {x:1}?Older versions of mongoDB didn’t support modifiers on indexed fields, but we now support this.
  • 12. Missing fields
      db.c.find( {x:null} ), using index {x:1}
        Matches {_id:5}
        Matches {_id:5,x:null}
      QUESTION: What about db.c.find( {x:{$exists:true}} ), using index {x:1}?
      The index is not currently used, though we will fix this in MongoDB 1.6.
  • 13. Array matching
      All the following match {_id:6,x:[2,10]} and use index {x:1}:
      db.c.find( {x:2} )
      db.c.find( {x:10} )
      db.c.find( {x:{$gt:5}} )
      db.c.find( {x:[2,10]} )
      db.c.find( {x:{$in:[2,5]}} )
      QUESTION: What about db.c.find( {x:{$all:[2,10]}} )?
      The index will be used to look up all documents matching {x:2}.
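One way to picture why all of these queries can use index {x:1} is to assume that an array value contributes one index entry per element. The sketch below works under that assumption; `buildIndex`, `lookup` and `findAll` are hypothetical helpers, and the extra documents with _id 7 and 8 are made up for contrast.

```javascript
// Normalise a field value to a list of index keys: one key per array element.
function keysOf(doc) {
  return Array.isArray(doc.x) ? doc.x : [doc.x];
}

// Build a hypothetical sorted index on x (numeric keys assumed).
function buildIndex(docs) {
  const entries = [];
  for (const doc of docs) {
    for (const key of keysOf(doc)) entries.push({ key: key, id: doc._id });
  }
  return entries.sort((a, b) => a.key - b.key);
}

// Return the distinct _ids whose index key satisfies the predicate.
function lookup(entries, predicate) {
  return [...new Set(entries.filter(e => predicate(e.key)).map(e => e.id))];
}

// Emulates $all: use the index for the first required value only,
// then check every candidate document for the remaining values.
function findAll(docs, entries, required) {
  return lookup(entries, k => k === required[0]).filter(id => {
    const values = keysOf(docs.find(d => d._id === id));
    return required.every(v => values.includes(v));
  });
}

const docs = [{ _id: 6, x: [2, 10] }, { _id: 7, x: 3 }, { _id: 8, x: [2, 3] }];
const idx = buildIndex(docs);

console.log(lookup(idx, k => k === 10));       // [ 6 ]
console.log(lookup(idx, k => k > 5));          // [ 6 ]
console.log(lookup(idx, k => [2, 5].includes(k))); // [ 6, 8 ]
console.log(findAll(docs, idx, [2, 10]));      // [ 6 ]
```

The `findAll` step mirrors the answer on the slide: the index narrows the search to documents matching {x:2}, and the rest of the $all condition is checked per document.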
  • 14. What is a compound index?
      [Diagram: tree of compound keys {x:2,y:3}, {x:1,y:5}, {x:2,y:9}, {x:3,y:1}, {x:1,y:1}]
  • 15. How are bounds determined for a compound index?
      find( {x:{$gte:2,$lte:4},y:6} )
      [Diagram: tree of compound keys {x:3,y:1}, {x:2,y:6}, {x:3,y:7}, {x:3.5,y:6}, {x:2,y:3}, {x:4,y:6}, {x:1,y:5}, {x:5,y:6}, {x:1,y:1}]
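The bounds for this query can be illustrated with a short sketch. It assumes a compound index {x:1,y:1} orders keys by x first and then by y, and that the scan runs from {x:2,y:6} to {x:4,y:6} while testing y on every key in between. `compareKeys` and `boundedScan` are hypothetical names, and a flat sorted array stands in for the tree.

```javascript
// Compare compound keys for a hypothetical index {x:1,y:1}: x first, then y.
function compareKeys(a, b) {
  return a.x !== b.x ? a.x - b.x : a.y - b.y;
}

// The keys from the slide, put into index order.
const keys = [
  { x: 3, y: 1 }, { x: 2, y: 6 }, { x: 3, y: 7 }, { x: 3.5, y: 6 }, { x: 2, y: 3 },
  { x: 4, y: 6 }, { x: 1, y: 5 }, { x: 5, y: 6 }, { x: 1, y: 1 },
].sort(compareKeys);

// Emulates find( {x:{$gte:xMin,$lte:xMax}, y:y} ): visit every key between
// {x:xMin,y:y} and {x:xMax,y:y}, keeping only those whose y matches.
function boundedScan(sortedKeys, xMin, xMax, y) {
  const lower = { x: xMin, y: y };
  const upper = { x: xMax, y: y };
  const matches = [];
  let scanned = 0;
  for (const k of sortedKeys) {
    if (compareKeys(k, lower) < 0) continue; // a real B-tree would seek here instead
    if (compareKeys(k, upper) > 0) break;    // past the upper bound: stop
    scanned++;
    if (k.y === y) matches.push(k);
  }
  return { scanned: scanned, matches: matches };
}

const result = boundedScan(keys, 2, 4, 6);
console.log(result.scanned);        // 5 keys fall inside the bounds
console.log(result.matches.length); // 3 of them have y equal to 6
```

The gap between keys scanned and keys matched is the cost of a range on the leading field: keys such as {x:3,y:1} lie inside the bounds even though they cannot match.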
  • 16. How does an ordered range query work?
      Simple range scan if the index already ensures the desired ordering: find( {x:2} ).sort( {y:1} )
      [Diagram: tree of compound keys {x:2,y:3}, {x:1,y:5}, {x:2,y:9}, {x:3,y:1}, {x:1,y:1}]
  • 17. How does an ordered range query work?
      Otherwise, in-memory sort of the matching documents: find( {x:2} ).sort( {y:1} )
      [Diagram: index {x:1} over documents {x:2,y:3}, {x:2,y:9}, {x:1,y:5}, {x:3,y:1}; the matches {x:2,y:3}, {x:2,y:9}, … are collected and sorted in memory]
  • 18. Document ordering
      db.c.find( {} ).sort( {x:1} ), using index {x:1}
      db.c.find( {} ).sort( {x:-1} ), using index {x:1}
      db.c.find( {x:{$gt:4}} ).sort( {x:-1} ), using index {x:1}
      db.c.find( {} ).sort( {'x.a':1} ), using index {'x.a':1}
      QUESTION: What about db.c.find( {y:1} ).sort( {x:1} ), using index {x:1}?
      The index will be used to ensure ordering, provided there is no better index.
  • 19. Compound indices and ordering
      db.c.find( {x:10,y:20} ), using index {x:1,y:1}
      db.c.find( {x:10,y:20} ), using index {x:1,y:-1}
      db.c.find( {x:{$in:[10,20]},y:20} ), using index {x:1,y:1}
      db.c.find().sort( {x:1,y:1} ), using index {x:1,y:1}
      db.c.find().sort( {x:-1,y:1} ), using index {x:1,y:-1}
      db.c.find( {x:10} ).sort( {y:1} ), using index {x:1,y:1}
      QUESTION: What about db.c.find( {y:10} ).sort( {x:1} ), using index {x:1,y:1}?
      The index will be used to ensure ordering, provided no better index is available.
  • 20. What if we negate a query?
      find( {x:{$ne:2}} )
      [Diagram: index {x:1} with keys {x:1}, {x:2}, {x:3}]
  • 21. When indices are less helpful
      db.c.find( {x:{$ne:1}} )
      db.c.find( {x:{$mod:[10,1]}} )
        Uses index {x:1} to scan numbers only
      db.c.find( {x:{$not:/a/}} )
      db.c.find( {x:{$gte:0,$lte:10},y:5} ) using index {x:1,y:1}
        Currently must scan all elements from {x:0,y:5} to {x:10,y:5}, but some improvements may be possible
      db.c.find( {$where:'this.x == 5'} )
      QUESTION: What about db.c.find( {x:{$not:/^a/}} ), using index {x:1}?
      The index is not used currently, but will be used in MongoDB 1.6.
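  • 22. How is an index chosen?
      find( {x:2,y:3} )
      [Diagram: index {x:1} with entries {x:1,y:3}, {x:2,y:1}, {x:2,y:3}, {x:2,y:9} and index {y:1} with entries {y:1,x:2}, {y:3,x:1}, {y:3,x:2}, {y:9,x:2}; both are scanned for the query and one scan is marked with a check mark (√)]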
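The diagram suggests that candidate indices are tried side by side and one of them is marked as the winner. A minimal sketch of that idea follows, assuming the plans are advanced in lockstep and the first one to finish its scan is chosen. This is an illustration under those assumptions, not the server's actual algorithm; `makePlan` and `racePlans` are hypothetical names.

```javascript
// Sample documents taken from the diagram.
const docs = [
  { x: 2, y: 1 }, { x: 2, y: 3 }, { x: 2, y: 9 }, { x: 1, y: 3 },
];

// A hypothetical plan: the documents a single-field index would have to
// visit for an equality match on that field.
function makePlan(name, field, value) {
  return { name: name, candidates: docs.filter(d => d[field] === value), pos: 0 };
}

// Advance every plan by one document per round. The first plan to run out
// of candidates has completed its scan and is declared the winner.
function racePlans(plans) {
  for (;;) {
    for (const plan of plans) {
      if (plan.pos >= plan.candidates.length) return plan.name;
      plan.pos++;
    }
  }
}

// For find( {x:2,y:3} ): index x_1 must visit 3 documents, y_1 only 2.
const winner = racePlans([
  makePlan('x_1', 'x', 2),
  makePlan('y_1', 'y', 3),
]);
console.log(winner); // y_1
```

Racing the plans means the choice is driven by the actual data rather than by statistics, which fits the later remark that different indices may be chosen as the data changes.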
  • 23. Query pattern matching
      Very simple algorithm, few complaints so far
      find( {x:1} )
      find( {x:2} )
      find( {x:100} )
      find( {x:{$gt:4}} )
      find( {x:{$gte:6}} )
      find( {x:1,y:2} )
      find( {x:{$gt:4,$lte:10}} )
      find( {x:{$gte:6,$lte:400}} )
      find( {x:1} ).sort( {y:1} )
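The examples on this slide appear to be grouped by the shape of the query rather than by its values. The sketch below shows one way such a pattern could be computed, assuming a pattern records only the kind of constraint on each field plus the sort specification. The function `queryPattern` and its category names are hypothetical, not the server's internal representation.

```javascript
// Reduce a query and sort spec to a hypothetical "pattern" string that
// ignores concrete values and keeps only the kind of constraint per field.
function queryPattern(query, sort) {
  const parts = Object.keys(query).sort().map(field => {
    const cond = query[field];
    if (cond === null || typeof cond !== 'object') return field + ':equality';
    const lower = '$gt' in cond || '$gte' in cond;
    const upper = '$lt' in cond || '$lte' in cond;
    if (lower && upper) return field + ':bounded';
    if (lower) return field + ':lower bound';
    if (upper) return field + ':upper bound';
    return field + ':other';
  });
  return parts.join(',') + '|sort:' + JSON.stringify(sort || {});
}

// Queries that differ only in their values share a pattern...
console.log(queryPattern({ x: 1 }) === queryPattern({ x: 100 }));                 // true
console.log(queryPattern({ x: { $gt: 4 } }) === queryPattern({ x: { $gte: 6 } })); // true
// ...while a different field set, bound type or sort gives a new pattern.
console.log(queryPattern({ x: 1 }) === queryPattern({ x: 1, y: 2 }));             // false
console.log(queryPattern({ x: 1 }) === queryPattern({ x: 1 }, { y: 1 }));         // false
```

A pattern like this could serve as the key under which the winning index for a query shape is remembered, so that later queries of the same shape skip the selection step.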
  • 24. Query optimizer
      In charge of picking which index to use for a query/count/update/delete/etc.
      Usually it does a good job, but if you know what you’re doing you can override it:
      db.c.find( {x:2,y:3} ).hint( {y:1} )
        Use index {y:1} and avoid trying {x:1}
      As your data changes, different indices may be chosen. Ordering requirements should be made explicit using sort().
      QUESTION: How can you force a full collection scan instead of using indices?
      db.c.find( {x:2,y:3} ).hint( {$natural:1} ) to bypass indices
  • 25. Geospatial indices
      db.c.find( {a:[50,50]} ) using index {a:'2d'}
      db.c.find( {a:{$near:[50,50]}} ) using index {a:'2d'}
        Results are sorted closest to farthest
      db.c.find( {a:{$within:{$box:[[40,40],[60,60]]}}} ) using index {a:'2d'}
      db.c.find( {a:{$within:{$center:[[50,50],10]}}} ) using index {a:'2d'}
      db.c.find( {a:{$near:[50,50]},b:2} ) using index {a:'2d',b:1}
      QUESTION: Most queries can be performed with or without an index. Is this true of geospatial queries?
      No. A geospatial query requires an index.
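  • 26. How does an insert work?
      Tree traversal and insert, split if necessary
      [Diagram: new key {x:3.5} inserted into a B-tree on x holding the keys {x:-4}, {x:0.5}, {x:1}, {x:2}, {x:3}, {x:4}, {x:5}, {x:6}, with child ranges x<0, 0<=x<1, 2<=x<5, 3<=x<4, 4<=x<5, x>=5]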
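  • 27. What if my keys are increasing?
      You’ll always insert on the right
      [Diagram: new keys {x:7}, {x:8}, {x:9} inserted into a B-tree on x holding the keys {x:-4}, {x:0.5}, {x:1}, {x:2}, {x:3}, {x:4}, {x:5}, {x:6}; every new key lands in the rightmost range x>=5]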
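The two insert slides can be illustrated with a flat sorted array standing in for the B-tree (node splitting is not modelled). The helper `insertSorted` is hypothetical. It finds the insertion point by binary search, which makes it easy to see that ever-increasing keys always land at the right-hand end.

```javascript
// Insert a key into a sorted array (a flat stand-in for a B-tree) and
// return the position at which it was placed.
function insertSorted(keys, key) {
  let lo = 0, hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] <= key) lo = mid + 1; else hi = mid;
  }
  keys.splice(lo, 0, key);
  return lo;
}

const keys = [-4, 0.5, 1, 2, 3, 4, 5, 6];

// An arbitrary key lands somewhere in the middle of the ordering.
const middle = insertSorted(keys, 3.5);
console.log(middle); // 5: between the keys 3 and 4

// Increasing keys always land at the current right-hand end.
const positions = [7, 8, 9].map(k => insertSorted(keys, k));
console.log(positions); // [ 9, 10, 11 ]
```

With increasing keys only the rightmost part of the index is modified, so that is the only part that has to stay hot in memory during inserts.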
  • 28. Why is RAM important?
      RAM is basically used as an LRU (least recently used) disk cache
      [Diagram: whole index in RAM vs. portion of index in RAM]
  • 29. Creating an index
      {_id:1} index created automatically
        For non-capped collections
      db.c.ensureIndex( {x:1} )
        Can create an index at any time, even when you already have plenty of data in your collection
        Creating an index will block MongoDB unless you specify background index creation
      db.c.ensureIndex( {x:1}, {background:true} )
        Background index creation still impacts performance; run it at non-peak times if you’re concerned
      QUESTION: Can an index be removed during background creation?
      Not at this time.
  • 30. Unique key constraints
      db.c.ensureIndex( {x:1}, {unique:true} )
        Don’t allow {_id:10,x:2} and {_id:11,x:2}
        Don’t allow {_id:12} and {_id:13} (both match {x:null})
      What if duplicates exist before the index is created?
        Normally index creation fails and the index is removed
        db.c.ensureIndex( {x:1}, {unique:true,dropDups:true} )
      QUESTION: In dropDups mode, which duplicates will be removed?
      The first document according to the collection’s “natural order” will be preserved.
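The rules on this slide can be mimicked with a short sketch. It assumes a missing field is indexed under the key null and that array order stands in for the collection's natural order. `indexKey` and `buildUnique` are hypothetical helpers, not MongoDB functions, and only primitive key values are handled.

```javascript
// Key used by a hypothetical unique index on x: a missing field counts as null.
function indexKey(doc) {
  return doc.x === undefined ? null : doc.x;
}

// Emulates building {x:1} with unique:true. Without dropDups the build fails
// on the first duplicate key; with dropDups the first document in natural
// order (here: array order) is kept and later duplicates are dropped.
function buildUnique(docs, dropDups) {
  const seen = new Set();
  const kept = [];
  for (const doc of docs) {
    const key = indexKey(doc);
    if (seen.has(key)) {
      if (!dropDups) throw new Error('duplicate key: ' + JSON.stringify(key));
      continue; // drop the later duplicate
    }
    seen.add(key);
    kept.push(doc);
  }
  return kept;
}

const docs = [{ _id: 10, x: 2 }, { _id: 11, x: 2 }, { _id: 12 }, { _id: 13 }];

// With dropDups, one document survives per key (2 and null).
const kept = buildUnique(docs, true).map(d => d._id);
console.log(kept); // [ 10, 12 ]

// Without dropDups, the same data makes the build fail.
let failed = false;
try { buildUnique(docs, false); } catch (e) { failed = true; }
console.log(failed); // true
```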
  • 31. Cleaning up an index
      db.system.indexes.find( {ns:'db.c'} )
      db.c.dropIndex( {x:1} )
      db.c.dropIndexes()
      db.c.reIndex()
        Rebuilds all indices, removing index cruft that has built up over large numbers of updates and deletes. Index cruft will not exist in MongoDB 1.6, so this command will be deprecated.
      QUESTION: Why would you want to drop an index?
      See next slide…
  • 32. Limits and tradeoffs
      Max 40 indices per collection
      Logically equivalent indices are not prevented (e.g. {x:1} and {x:-1})
      Indices can improve the speed of queries, but make inserts slower
      A more specific index {a:1,b:1,c:1} can be more helpful than a less specific index {a:1}, but the more specific index will be larger and thus harder to fit in RAM
      QUESTION: Do indices make updates slower? How about deletes?
      It depends: finding your document might be faster, but if any indexed fields are changed the indices must be updated.
  • 33. Mongod log output
      query test.c ntoreturn:1 reslen:69 nscanned:100000 { i: 99999.0 } nreturned:1 157ms
      query test.$cmd ntoreturn:1 command: { count: "c", query: { type: 0.0, i: { $gt: 99000.0 } }, fields: {} } reslen:64 256ms
      query:{ query: {}, orderby: { i: 1.0 } } ... query test.c ntoreturn:0 exception 1378ms ... User Exception 10128:too much key data for sort() with no index. add an index or specify a smaller limit
      query test.c ntoreturn:0 reslen:4783 nscanned:100501 { query: { type: 500.0 }, orderby: { i: 1.0 } } nreturned:101 390ms
      Occasionally you may see a slow operation as a result of disk activity or mongo cleaning things up, so some messages about slow ops are spurious.
      Keep this in mind when you run the same op a massive number of times and it appears slow only very rarely.
  • 34. Profiling
      Records the same info as the log messages, but in a database collection
      > db.system.profile.find()
      {"ts" : "Thu Jan 29 2009 15:19:32 GMT-0500 (EST)" , "info" : "query test.$cmd ntoreturn:1 reslen:66 nscanned:0 <br>query: { profile: 2 } nreturned:1 bytes:50" , "millis" : 0}
      ...
      > db.system.profile.find( { info: /test.foo/ } )
      > db.system.profile.find( { millis : { $gt : 5 } } )
      > db.system.profile.find().sort({$natural:-1})
      Enable explicitly using levels (0: off, 1: slow ops (>100ms), 2: all ops)
      > db.setProfilingLevel(2);
      {"was" : 0 , "ok" : 1}
      > db.getProfilingLevel()
      2
      > db.setProfilingLevel( 1 , 10 ); // slow means > 10ms
      Profiling impacts performance, but not severely
  • 35. Query explain
      > db.c.find( {x:1000,y:0} ).explain()
      {
        "cursor" : "BtreeCursor x_1",
        "indexBounds" : [ [ { "x" : 1000 }, { "x" : 1000 } ] ],
        "nscanned" : 10,
        "nscannedObjects" : 10,
        "n" : 10,
        "millis" : 0,
        "oldPlan" : {
          "cursor" : "BtreeCursor x_1",
          "indexBounds" : [ [ { "x" : 1000 }, { "x" : 1000 } ] ]
        },
        "allPlans" : [
          { "cursor" : "BtreeCursor x_1", "indexBounds" : [ [ { "x" : 1000 }, { "x" : 1000 } ] ] },
          { "cursor" : "BtreeCursor y_1", "indexBounds" : [ [ { "y" : 0 }, { "y" : 0 } ] ] },
          { "cursor" : "BasicCursor", "indexBounds" : [ ] }
        ]
      }
  • 36. Example 1
      Setup:
      > for( i = 0; i < 100000; ++i ) { db.c.save( {i:i} ); }
      Slow query:
      > db.c.findOne( {i:99999} )
      { "_id" : ObjectId("4bb962dddfdcf5761c1ec6a3"), "i" : 99999 }
      Log output:
      query test.c ntoreturn:1 reslen:69 nscanned:100000 { i: 99999.0 } nreturned:1 157ms
      Explain:
      > db.c.find( {i:99999} ).limit(1).explain()
      {
        "cursor" : "BasicCursor",
        "indexBounds" : [ ],
        "nscanned" : 100000,
        "nscannedObjects" : 100000,
        "n" : 1,
        "millis" : 161,
        "allPlans" : [ { "cursor" : "BasicCursor", "indexBounds" : [ ] } ]
      }
      Fix:
      > db.c.ensureIndex( {i:1} );
  • 37. Example 2
      Setup:
      > for( i = 0; i < 100000; ++i ) { db.c.save( {type:i%2,i:i} ); }
      Slow query:
      > db.c.count( {type:0,i:{$gt:99000}} )
      499
      Log output:
      query test.$cmd ntoreturn:1 command: { count: "c", query: { type: 0.0, i: { $gt: 99000.0 } }, fields: {} } reslen:64 256ms
      Explain:
      > db.c.find( {type:0,i:{$gt:99000}} ).limit(1).explain()
      {
        "cursor" : "BtreeCursor type_1",
        "indexBounds" : [ [ { "type" : 0 }, { "type" : 0 } ] ],
        "nscanned" : 49502,
        "nscannedObjects" : 49502,
        "n" : 1,
        "millis" : 349,
      ...
      Fix:
      > db.c.ensureIndex( {type:1,i:1} );
  • 38. Example 3
      Setup:
      > for( i = 0; i < 1000000; ++i ) { db.c.save( {i:i} ); }
      Failing query:
      > db.c.find().sort( {i:1} )
      error: { "$err" : "too much key data for sort() with no index. add an index or specify a smaller limit" }
      > db.c.find().sort( {i:1} ).explain()
      JS Error: uncaught exception: error: { "$err" : "too much key data for sort() with no index. add an index or specify a smaller limit" }
      Fix:
      > db.c.ensureIndex( {i:1} );
      > db.c.find().sort( {i:1} ).limit( 1000 ); //alternatively
  • 39. Example 4
      Setup:
      > for( i = 0; i < 1000000; ++i ) { db.c.save( {i:i,type:i%1000} ); }
      Slow query:
      > db.c.find( {type:500} ).sort( {i:1} )
      { "_id" : ObjectId("4bba4904dfdcf5761c2f917e"), "i" : 500, "type" : 500 }
      { "_id" : ObjectId("4bba4904dfdcf5761c2f9566"), "i" : 1500, "type" : 500 }
      ...
      Log output:
      query test.c ntoreturn:0 reslen:4783 nscanned:100501 { query: { type: 500.0 }, orderby: { i: 1.0 } } nreturned:101 390ms
      Explain:
      > db.c.find( {type:500} ).sort( {i:1} ).explain()
      {
        "cursor" : "BtreeCursor i_1",
        "indexBounds" : [ [ { "i" : { "$minElement" : 1 } }, { "i" : { "$maxElement" : 1 } } ] ],
        "nscanned" : 1000000,
        "nscannedObjects" : 1000000,
        "n" : 1000,
        "millis" : 5388,
      ...
      Fix:
      > db.c.ensureIndex( {type:1,i:1} );
  • 40. Questions?
      Get involved: www.mongodb.org
        Downloads, user group, chat room
      Follow @mongodb
      Upcoming events: www.mongodb.org/display/DOCS/Events
        SF MongoDB office hours: Mondays 4-6pm at Epicenter Café
        SF MongoDB meetup: May 17 at Engine Yard
      Commercial support: [email protected]