1
MongoDB 2.4 and Spring
Data
June 10th, 2013
2
Who Am I?
 Solutions Architect with ICF Ironworks
 Part-time Adjunct Professor
 Started with HTML and Lotus Notes in 1992
• In the interim there was C, C++, VB, Lotus Script, PERL, LabVIEW,
Oracle, MS SQL Server, etc.
 Not so much an Early Adopter as a Fast Follower of Java
Technologies
 Alphabet Soup (MCSE, ICAAD, ICASA, SCJP, SCJD, PMP, CSM)
 LinkedIn: http://www.linkedin.com/in/iamjimmyray
 Blog: http://jimmyraywv.blogspot.com/ Avoiding Tech-sand
4
Tonight’s Agenda
 Quick introduction to NoSQL and MongoDB
• Configuration
• MongoVUE
 Introduction to Spring Data and MongoDB support
• Spring Data and MongoDB configuration
• Templates
• Repositories
• Query Method Conventions
• Custom Finders
• Customizing Repositories
• Metadata Mapping (including nested docs and DBRef)
• Aggregation Functions
• GridFS File Storage
• Indexes
5
What is NoSQL?
 Official: Not Only SQL
• In reality, it may or may not use SQL*, at least in its truest form
• Varies from the traditional RDBMS approach of the last few decades
• Not necessarily a replacement for RDBMS; more of a solution for
specific needs where an RDBMS is not a great fit
• Content Management (including CDNs), document storage, object storage,
graph, etc.
 It means different things to different folks.
• It really comes down to a different way to view our data domains for
more effective storage, retrieval, and analysis…albeit with tradeoffs
that affect our design decisions.
6
From NoSQL-Database.org
“NoSQL DEFINITION: Next Generation Databases mostly
addressing some of the points: being non-relational, distributed,
open-source and horizontally scalable. The original intention has
been modern web-scale databases. The movement began early
2009 and is growing rapidly. Often more characteristics apply such
as: schema-free, easy replication support, simple API, eventually
consistent / BASE (not ACID), a huge amount of data and more.”
7
Some NoSQL Flavors
 Document Centric
• MongoDB
• Couchbase
 Wide Column/Column Families
• Cassandra
• Hadoop HBase
 XML (JSON, etc.)
• MarkLogic
 Graph
• Neo4J
 Key/Value Stores
• Redis
 Object
• DB4O
 Other
• Lotus Notes/Domino
8
Why MongoDB
 Open Source
 Multiple platforms (Linux, Win, Solaris, Apple) and Language Drivers
 Explicitly de-normalized
 Document-centric and Schema-less (for the most part)
 Fast (low latency)
• Fast access to data
• Low CPU overhead
 Ease of scalability (replica sets), auto-sharding
 Manages complex and polymorphic data
 Great for CDN and document-based SOA solutions
 Great for location-based and geospatial data solutions
9
Why MongoDB (more)
 Because the schema-less approach is more flexible, MongoDB is
well suited to iterative (Agile) projects.
 Eliminates the object-relational “impedance mismatch” of typical RDBMS solutions
 “How do I model my object/document based application in 3NF?”
 If you are already familiar with JavaScript and JSON, MongoDB storage
and document representation is easier to understand.
 Near-real-time data aggregation support
 10gen has been responsive to the MongoDB community
10
What is schema-less?
 A.K.A. schema-free, 10gen says “flexible-schema”
 It means that MongoDB does not enforce a column data type on
the fields within your document, nor does it confine your document
to specific columns defined in a table definition.
 The schema “can be” actually controlled via the application API
layers and is implied by the “shape” (content) of your documents.
 This means that different documents in the same collection can
have different fields.
• So the schema is flexible in that way
• Only the _id field is mandatory in all documents.
 Requires more rigor on the application side.
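As a rough sketch of what "more rigor on the application side" can look like, the plain JavaScript below (runnable in Node.js, no MongoDB needed) holds two documents of different shapes and checks them against an application-defined list of required fields. The field names and the `requiredFields` rule are assumptions made up for illustration, not part of any MongoDB or Spring Data API.

```javascript
// Two documents of different "shape" that could live in the same collection.
// Field names are hypothetical, chosen only for illustration.
const employeeDocs = [
  { _id: 1, firstName: "John", lastName: "Smith", title: "Senior Engineer" },
  { _id: 2, lastName: "Baik", phones: ["555-0100", "555-0101"] }
];

// Application-side rule: the database will not enforce this for us.
const requiredFields = ["_id", "lastName"];

// Returns the names of the required fields that a document is missing.
function missingFields(doc) {
  return requiredFields.filter((name) => !(name in doc));
}

const shapeReport = employeeDocs.map((doc) => ({
  _id: doc._id,
  missing: missingFields(doc)
}));
console.log(shapeReport);
```

Both documents pass even though their fields differ; a document without `lastName` would be reported by the application, not rejected by the database.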
11
Is MongoDB really schema-less?
 Technically no.
 There is the System Catalog of system collections
• <database>.system.namespaces
• <database>.system.indexes
• <database>.system.profile
• <database>.system.users
 And…because of the nature of how docs are stored in collections
(JSON/BSON), field labels are stored in every doc*
12
Schema tips
 MongoDB has ObjectId, which can be placed in _id
• If you have a natural unique ID, use that instead
 De-normalize when needed (you must know MongoDB restrictions)
• For example: Compound indexes cannot contain parallel arrays
 Create indexes that cover queries
• Mongo only uses one index at a time for a query
• Watch out for sorts
• Watch out for field sequence in compound indexes.
 Reduce size of collections (watch out for label sizes)
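To make the point about label sizes concrete, here is a small Node.js sketch that compares the serialized size of the same values stored under long and short field names. The documents and field names are hypothetical, and the byte count is taken from the JSON text as a rough proxy; actual BSON sizes will differ, but field names are repeated in every stored document either way.

```javascript
// Two hypothetical documents holding the same values under long and short labels.
const verboseDoc = {
  customerFirstName: "John",
  customerLastName: "Smith",
  customerPostalCode: "55401"
};
const terseDoc = { fn: "John", ln: "Smith", zip: "55401" };

// Size of the JSON text in bytes; a rough proxy only, since BSON is encoded differently.
function jsonBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// The difference is exactly the extra characters spent on field names.
const bytesSavedPerDoc = jsonBytes(verboseDoc) - jsonBytes(terseDoc);
console.log(jsonBytes(verboseDoc), jsonBytes(terseDoc), bytesSavedPerDoc);
```

Multiplied across millions of documents, the per-document saving is what makes short labels worth considering, at some cost in readability.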
13
MongoDB Data Modeling and Node Setups
 Schema Design is still important
 Understand your concerns
• Do you have read-intensive or write-intensive data
• Document embedding (fastest and atomic) vs. references (normalized)
• Atomicity – Document Level Only
• Can use 2-Phase Commit Pattern
• Data Durability
• Not “truly” available in a single-server setup
• Requires write concern tuning
• Need sharding and/or replicas
 10gen offers patterns and documentation:
• http://docs.mongodb.org/manual/core/data-modeling/
14
Why Not MongoDB
 High speed and deterministic transactions:
• Banking and accounting
• See MongoDB Global Write Locking
– Improved by better yielding in 2.0
 Where SQL is absolutely required
• Where true Joins are needed*
 Traditional non-real-time data warehousing ops*
 If your organization lacks the controls and rigor to place schema
and document definition at the application level without
compromising data integrity**
15
MongoDB
 Was designed to overcome some of the performance
shortcomings of RDBMS
 Some Features
• Memory Mapped IO (32bit vs. 64bit)
• Fast Querying (atomic operations, embedded data)
• In place updates (physical writes lag in-memory changes)
• Depends on Write Concern settings
• Full Index support (including compound indexes, text, spherical)
• Replication/High Availability (see CAP Theorem)
• Auto Sharding (range-based partitioning, based on shard key) for
scalability
• Aggregation, MapReduce, geo-spatial
• GridFS
16
MongoDB – In Place Updates
 No need to get document from the server, just send update
 Physical disk writes lag in-memory changes.
• Lag depends on Write-Concerns (Write-through)
• Multiple writes in memory can occur before the object is updated on
disk
 MongoDB uses an adaptive allocation algorithm for storing its
objects.
• If an object changes and fits in its current location, it stays there.
• However, if it is now larger, it is moved to a new location. This moving
is expensive for index updates
• MongoDB looks at collections and based on how many times items
grow within a collection, MongoDB calculates a padding factor that tries
to account for object growth
• This minimizes object relocation
17
MongoDB – A Word About Sharding…
 Need to choose the right key
• Easily divisible (“splittable”– see cardinality) so that Mongo can
distribute data among shards
• “all documents that have the same value in the state field must reside on the
same shard” – 10Gen
• Enable distributed write operations between cluster nodes
• Prevents single-shard bottle-necking
• Make it possible for “mongos” to return most query operations from
multiple shards (or single shard if you can guarantee contiguous
storage in that shard**)
• Distribute writes evenly among shards
• Minimize disk seeks per mongod
• “users will generally have a unique value for this field (Phone)
– MongoDB will be able to split as many chunks as needed” – 10Gen
 Watch out for the need to perform range queries.
18
MongoDB – Cardinality…
 In most cases, when sharding for performance, you want higher
cardinality to allow chunks of data to be split among shards
• Example: Address data components
• State – Low Cardinality
• ZipCode – Potentially low or high, depending on population
• Phone Number – High Cardinality
 High cardinality is a good start for sharding, but…
• …it does not guarantee query isolation
• …it does not guarantee write scaling
• Consider computed keys (hashed, MD5, etc.)
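The cardinality comparison and the idea of a computed key can be sketched in a few lines of plain JavaScript. The address records are invented, and the hash is FNV-1a, used here only as a simple stand-in for MD5 or MongoDB's hashed index; it is not the algorithm MongoDB itself uses.

```javascript
// Hypothetical address records; only the field names come from the slide.
const addressDocs = [
  { state: "MN", zipCode: "55401", phone: "612-555-0101" },
  { state: "MN", zipCode: "55401", phone: "612-555-0102" },
  { state: "MN", zipCode: "55114", phone: "651-555-0103" },
  { state: "WI", zipCode: "54016", phone: "715-555-0104" }
];

// Cardinality = number of distinct values a candidate shard key takes.
function cardinality(docs, field) {
  return new Set(docs.map((d) => d[field])).size;
}

// FNV-1a string hash: a stand-in for MD5 or a hashed index (assumption).
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Computed key: map any value onto one of a fixed number of buckets.
function bucketFor(value, buckets) {
  return fnv1a(String(value)) % buckets;
}

console.log("state:", cardinality(addressDocs, "state"));
console.log("zipCode:", cardinality(addressDocs, "zipCode"));
console.log("phone:", cardinality(addressDocs, "phone"));
console.log(addressDocs.map((a) => bucketFor(a.phone, 4)));
```

A hashed key spreads writes, but neighbouring values land in unrelated buckets, which is one reason to watch out for range queries.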
19
CAP Theorem
 Consistency – all nodes see the same data at the same time
 Availability – all requests receive responses, guaranteed
 Partition Tolerance (network partition tolerance)
 The theorem states that you can never have all three, so you plan
for two and make the best of the third.
• For example: Perhaps “eventual consistency” is OK for a CDN
application.
• For large scalability, you would need partitioning. That leaves C & A to
choose from
• Would you ever choose consistency over availability?
 How do cloud implementations change this?
20
Example MongoDB Isolated Setup
21
Container Models: RDBMS vs. MongoDB
 RDBMS: Servers > Databases > Schemas > Tables > Rows
• Joins, Group By, ACID
 MongoDB: Servers > Databases > Collections > Documents
• No Joins**
• Instead: Db References (Linking) and Nested Documents (Embedding)
22
MongoDB Collections
 Schema-less
 Can have up to 24000 (according to 10gen)
• Cheap to resource
 Contain documents (…of varying shapes)
• 100 nesting levels (version 2.2)
 Are namespaces, like indexes
 Can be “Capped”
• Limited in max size with rotating overwrites of oldest entries
• Logging anyone?
• Example: MongoDB oplog
 TTL Collections
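The "rotating overwrites of oldest entries" behaviour of a capped collection can be imitated with a small in-memory class. This is only an analogy in plain JavaScript: the class name is hypothetical, and the cap here is a document count, whereas a real capped collection is primarily limited by size in bytes.

```javascript
// In-memory analogy for a capped collection (hypothetical class, not a driver API).
class CappedCollectionSketch {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  // Append in insertion order; once full, the oldest entry is dropped.
  insert(doc) {
    this.docs.push(doc);
    while (this.docs.length > this.maxDocs) {
      this.docs.shift();
    }
  }
}

const eventLog = new CappedCollectionSketch(3);
for (let i = 1; i <= 5; i++) {
  eventLog.insert({ seq: i, msg: "event " + i });
}
console.log(eventLog.docs.map((d) => d.seq)); // entries 1 and 2 have been overwritten
```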
23
MongoDB Documents
 JSON (what you see)
• Actually BSON (Internal - Binary JSON - http://bsonspec.org/)
 Elements are name/value pairs
 16 MB maximum size
 What you see is what is stored
• No default fields (columns)
24
MongoDB Documents
25
JSON Syntax
 Curly braces are used for documents/objects – {…}
 Square brackets are used for arrays – […]
 Colons are used to link keys to values – key:value
 Commas are used to separate multiple objects or elements or
key/value pairs – {key1:value1, key2:value2…}
 JSON has how many data types?
• 6 – String, Number, Array, Object, null, Boolean
26
JSON Syntax Example
{
"application" : "HR System",
"users" : [{"name" : "bill","age" : 60},
{"name" : "fred","age" : 29}]
}
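A quick way to check a document like this is to parse it. The sketch below feeds the same HR System example to JavaScript's built-in `JSON.parse` and inspects the resulting types; the variable names are arbitrary.

```javascript
// The slide's example as JSON text (straight double quotes are required).
const hrJson =
  '{"application":"HR System",' +
  '"users":[{"name":"bill","age":60},{"name":"fred","age":29}]}';

const hrDoc = JSON.parse(hrJson);

// Curly braces became an object, square brackets an array, and numbers stay numeric.
console.log(typeof hrDoc, Array.isArray(hrDoc.users), typeof hrDoc.users[0].age);
console.log(hrDoc.users.map((u) => u.name));
```

Curly "smart" quotes, as pasted from a word processor, would make `JSON.parse` throw a `SyntaxError`.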
27
Why BSON?
 Adds data types that JSON did not support – (ISO Dates, ObjectId,
etc.)
 Optimized for performance
 Compact binary encoding that is quick to traverse (BSON is not compressed and can be larger than the equivalent JSON)
 http://bsonspec.org/#/specification
28
MongoDB Install
 Extract MongoDB
 Build config file, or use startup script
• Need dbpath configured
• Need REST configured for Web Admin tool
 Start Mongod (daemon) process
 Use Shell (mongo) to access your database
 Use MongoVUE (or other) for GUI access and to learn shell
commands
29
MongoDB Install
30
Mongo Shell
 In Windows, mongo.exe
 Interactive JavaScript shell to mongod
 Command-line interface to MongoDB (sort of like SQL*Plus for
Oracle)
 JavaScript Interpreter, behaves like a read-eval-print loop
 Can be run without database connection (use --nodb)
 Uses a fluent API with lazy cursor evaluation
• db.locations.find({state:'MN'},{city:1,state:1,_id:0}).sort({city:-1}).limit(5).toArray();
31
MongoVUE
 GUI around MongoDB Shell
 Current version 1.61 (May 2013)
 Makes it easy to learn MongoDB Shell commands
• db.employee.find({ "lastName" : "Smith", "firstName" : "John" }).limit(50);
• show collections
 Not sure if development is continuing, but very handy still.
 Demo…
32
Mongo Explorer
 Silverlight GUI
 Current development has stopped – for now.
 http://mongoexplorer.com/
 Demo…
33
Web Admin Interface
 Localhost:<mongod port + 1000>
 Quick stats viewer
 Run commands
 Demo
 There is also Sleepy Mongoose
• http://www.kchodorow.com/blog/2010/02/22/sleepy-mongoose-a-mongodb-rest-interface/
34
Web Admin Interface
35
Other MongoDB Tools
 Edda – Log Visualizer
• http://blog.mongodb.org/post/28053108398/edda-a-log-visualizer-for-mongodb
• Requires Python
 MongoDB Monitoring Service
• Free Cloud based service that monitors MongoDB instances via
configured agents.
• Requires Python
• http://www.10gen.com/products/mongodb-monitoring-service
 Splunk
• www.splunk.com
36
MongoImport
 Binary mongoimport
 Syntax: mongoimport --stopOnError --port 29009 --db geo --collection geos --file
C:UserDataDocsJUGsTwinCitieszips.json
 Don’t use for backup or restore in production
• Use mongodump and mongorestore
37
Spring Data
 Large Spring project with many subprojects
• Category: Document Stores, Subproject MongoDB
 “…aims to provide a familiar and consistent Spring-based
programming model…”
 Like other Spring projects, Data is POJO Oriented
 For MongoDB, provides high-level API and access to low-level API
for managing MongoDB documents.
 Provides annotation-driven meta-mapping
 Will allow you into bowels of API if you choose to hang out there
38
Spring Data MongoDB Templates
 Implements MongoOperations (mongoOps) interface
• mongoOps defines the basic set of MongoDB operations for the Spring
Data API.
• Wraps the lower-level MongoDB API
 Provides access to the lower-level API
 Provides foundation for upper-level Repository API.
 Demo
39
Spring Data MongoDB Templates - Configuration
 See mongo-config.xml
40
Spring Data MongoDB Templates - Configuration
 Or…see the config class
41
Spring Data MongoDB Templates - Configuration
42
Spring Data Repositories
 Convenience for data access
• Spring does ALL the work (unless you customize)
 Convention over configuration
• Uses a method-naming convention that Spring interprets during
implementation
 Hides complexities of Spring Data templates and underlying API
 Builds implementation for you based on interface design
• Implementation is built during Spring container load.
 Is typed (parameterized via generics) to the model objects you want to
store.
• When extending MongoRepository
• Otherwise uses @RepositoryDefinition annotation
 Demo
43
Spring Data Bulk Inserts
 All things being equal, bulk inserts in MongoDB can be faster than
inserting one record at a time, if you have batch inserts to perform.
 As of MongoDB 1.8, the max BSON size of a batch insert was
increased from 4MB to 16MB
• You can check this with the shell command: db.isMaster() or
mongo.getMaxBsonObjectSize() in the Java API
 Batch sizes can be tuned for performance
 Demo
44
Transformers
 Does the “heavy lifting” by preparing MongoDB objects for
insertion
 Transforms Java domain objects into MongoDB DBObjects.
 Demo
45
Converters
 For read and write, overrides default mapping of Java objects to
MongoDB documents
 Implements the Spring…Converter interface
 Registered with MongoDB configuration in Spring context
 Handy when integrating MongoDB to existing application.
 Can be used to remove “_class” field
46
Spring Data Meta Mapping
 Annotation-driven mapping of model object fields to Spring Data
elements in specific database dialect. – Demo
47
MongoDB DBRef
 Optional
 Instead of nesting documents
 Have to save the “referenced” document first, so that DBRef exists
before adding it to the “parent” document
48
MongoDB DBRef
49
MongoDB DBRef
50
MongoDB Custom Spring Data Repositories
 Hooks into Spring Data bean type hierarchy that allows you to add
functionality to repositories
 Important: You must write the implementation for part of this
custom repository
 And…your Spring Data repository interface must extend this
custom interface, along with the appropriate Spring Data repository
 Demo
51
Creating a Custom Repository
 Write an interface for the custom methods
 Write the implementation for that interface
 Write the traditional Spring Data Repository application interface,
extending the appropriate Spring Data interface and the (above)
custom interface
 When Spring starts, it will implement the Spring Data Repository
normally, and include the custom implementation as well.
52
MongoDB Queries
 In the mongo shell using JS: db.collection.find( <query>, <projection> )
• Use the projection to limit fields returned, and therefore network traffic
 Example: db["employees"].find({"title":"Senior Engineer"})
 Or: db.employees.find({"title":"Senior Engineer"},{"_id":0})
 Or: db.employees.find({"title":"Senior Engineer"},{"_id":0,"title":1})
 In Java use DBObject or Spring Data Query for mapping queries.
 You can include and exclude fields in the projection argument.
• You either include (1) or exclude (0)
• You cannot include and exclude in the same projection, except for the
“_id” field.
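The include/exclude rule for projections can be sketched as a small in-memory function. This is an illustration in plain JavaScript of the behaviour described above, not the server's implementation; `applyProjection` and the sample document are hypothetical.

```javascript
// Apply a find()-style projection to one plain object (illustrative sketch only).
function applyProjection(doc, projection) {
  // Flags for every projected field other than _id: true = include, false = exclude.
  const flags = Object.entries(projection)
    .filter(([key]) => key !== "_id")
    .map(([, value]) => Boolean(value));

  if (flags.includes(true) && flags.includes(false)) {
    throw new Error("cannot mix include and exclude in one projection");
  }

  const including = flags.includes(true);
  // _id is returned unless it is explicitly switched off.
  const showId = !("_id" in projection) || Boolean(projection._id);

  const result = {};
  for (const [key, value] of Object.entries(doc)) {
    if (key === "_id") {
      if (showId) result._id = value;
      continue;
    }
    const listed = key in projection;
    if (including ? listed : !listed) result[key] = value;
  }
  return result;
}

const sampleEmployee = { _id: 7, title: "Senior Engineer", lastName: "Bashian" };
console.log(applyProjection(sampleEmployee, { _id: 0 }));
console.log(applyProjection(sampleEmployee, { _id: 0, title: 1 }));
console.log(applyProjection(sampleEmployee, { lastName: 0 }));
```

Passing `{ title: 1, lastName: 0 }` throws, which mirrors the restriction stated in the last bullet.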
53
DBObject and BasicDBObject
 For the Mongo Java driver, DBObject is the Interface,
BasicDBObject is the class
 This is essentially a map with additional Mongo functionality
• See partial objects when up-serting
 DBObject is used to build commands, queries, projections, and
documents
 DBObjects are used to build out the JS queries that would normally
run in the shell. Each {…} is a potential DBObject.
54
MongoDB Queries – And & Or
 Comma denotes “and”, and you can use $and
• db.employees.find({"title":"Senior Engineer","lastName":"Bashian"},{"_id":0,"title":1})
 For Or, you must use the $or operator
• db.employees.find({$or:[{"lastName":"Bashian"},{"lastName":"Baik"}]},{"_id":0,"title":1,"lastName":1})
 In Java, use DBObjects and ArrayLists…
• Nest or/and ArrayLists for compound queries
 Or use the Spring Data Query and Criteria classes with or criteria
 Also see QueryBuilder class
 Demo
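To show how the comma (implicit "and") and `$or` combine, here is a tiny in-memory matcher in plain JavaScript. It handles only exact equality plus `$and`/`$or`, and the employee records are invented; it is a teaching sketch, not how the server evaluates queries.

```javascript
// Minimal query matcher: equality on top-level fields plus $and / $or (sketch only).
function matchesQuery(doc, query) {
  return Object.entries(query).every(([key, condition]) => {
    if (key === "$or") return condition.some((q) => matchesQuery(doc, q));
    if (key === "$and") return condition.every((q) => matchesQuery(doc, q));
    return doc[key] === condition; // several keys in one object act as "and"
  });
}

// Hypothetical employee documents.
const staff = [
  { lastName: "Bashian", title: "Senior Engineer" },
  { lastName: "Baik", title: "Engineer" },
  { lastName: "Smith", title: "Senior Engineer" }
];

const andHits = staff.filter((d) =>
  matchesQuery(d, { title: "Senior Engineer", lastName: "Bashian" })
);
const orHits = staff.filter((d) =>
  matchesQuery(d, { $or: [{ lastName: "Bashian" }, { lastName: "Baik" }] })
);
console.log(andHits.length, orHits.map((d) => d.lastName));
```

Each `{…}` in the query is what would become a DBObject in the Java driver, with the `$or` array becoming a list of DBObjects.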
55
MongoDB Array Queries
 db.misc.insert({users:["jimmy", "griffin"]})
 db.misc.find({users:"griffin"})
• { "_id" : ObjectId("518a5b7e18aa54b5cf8fc333"), "users" : [ "jimmy", "griffin" ] }
 db.misc.find({users:{$elemMatch:{name:"jimmy",gender:"male"}}})
• { "_id" : ObjectId("518a599818aa54b5cf8fc332"), "users" : [ { "name" : "jimmy", "gender" : "male" }, { "name" : "griffin", "gender" : "male" } ] }
56
MongoDB Array Updates
db.misc.insert({"users":[{"name":"jimmy","gender":"male"},{"name":"griffin","gender":"male"}]})
db.misc.update({"_id":ObjectId("518276054e094734807395b6"),"users.name":"jimmy"}, {$set:{"users.$.name":"george"}})
db.employees.update({products:"Softball"},{$pull:{products:"Softball" }},false,true)
db.employees.find({products:"Softball"}).count()
0
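The array query and update operators from the last two slides can be mimicked on plain JavaScript arrays to see what each one does. The helper names below are hypothetical and the logic is a simplified analogy: `$elemMatch` requires one single element to satisfy all criteria, the positional `$` touches only the first matching element, and `$pull` removes every occurrence of a value.

```javascript
// Hypothetical document shaped like the one inserted on the slide.
const miscDoc = {
  users: [
    { name: "jimmy", gender: "male" },
    { name: "griffin", gender: "male" }
  ]
};

// True when every criterion holds for the given element.
function satisfies(element, criteria) {
  return Object.entries(criteria).every(([key, value]) => element[key] === value);
}

// $elemMatch-style test: one single array element must satisfy all criteria.
function elemMatch(array, criteria) {
  return array.some((element) => satisfies(element, criteria));
}

// Positional-$ style update: change only the first element that matched.
function setOnFirstMatch(array, criteria, changes) {
  const index = array.findIndex((element) => satisfies(element, criteria));
  if (index >= 0) Object.assign(array[index], changes);
  return index;
}

// $pull-style update: return the array without any occurrence of the value.
function pullValue(array, value) {
  return array.filter((item) => item !== value);
}

console.log(elemMatch(miscDoc.users, { name: "jimmy", gender: "male" }));
console.log(elemMatch(miscDoc.users, { name: "jimmy", gender: "female" }));
setOnFirstMatch(miscDoc.users, { name: "jimmy" }, { name: "george" });
console.log(miscDoc.users.map((u) => u.name));
console.log(pullValue(["Softball", "Bat", "Softball"], "Softball"));
```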
57
Does Field Exist
 $exists
 db.locations.find({user:{$exists:false}})
 Type “it” for more – iterates over documents - paging
58
MongoDB Advanced Queries
 http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24all
 May use Mongo Java driver and BasicDBObjectBuilder
 Spring Data fluent API is much easier
 Demo - $in, $nin, $gt ($gte), $lt ($lte), $all, ranges
59
MongoDB RegEx Queries
 In JS:
db.employees.find({ "title" : { "$regex" : "seNior EngIneer" , "$options" : "i"}})
 In Java use java.util.regex.Pattern
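The effect of the `"i"` option can be tried outside the database with JavaScript's own `RegExp`, which uses the same flag letter. The titles below are made-up sample data.

```javascript
// Same pattern and option as the shell query above, as a JavaScript RegExp.
const titlePattern = new RegExp("seNior EngIneer", "i");

// Hypothetical title values.
const sampleTitles = ["Senior Engineer", "Engineer", "SENIOR ENGINEER II"];

// An unanchored pattern matches anywhere in the string, ignoring case.
const matchingTitles = sampleTitles.filter((title) => titlePattern.test(title));
console.log(matchingTitles);
```

Anchoring the pattern with `^` and `$` would restrict it to exact titles, which also tends to be friendlier to indexes when the match is case-sensitive.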
60
Optimizing Queries
 Use $hint or hint() in JS to tell MongoDB to use specific index
 Use hint() in Java API with fluent API
 Use $explain or explain() to see MongoDB query explain plan
• Number of scanned objects should be close to the number of returned
objects
61
MongoDB Aggregation Functions
 Aggregation Framework
 Map/Reduce - Demo
 Distinct - Demo
 Group - Demo
• Similar to SQL Group By function
 Count
 Demo #7
62
More Aggregation
 $unwind
• Useful command to convert arrays of objects, within documents, into
sub-documents that are then searchable by query.
db.depts.aggregate({"$project":{"employees":"$employees"}},{"$unwind":"$employees"},{"$match":{"employees.lname":"Vural"}});
 Demo
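What `$unwind` followed by `$match` does can be sketched on an in-memory array: each array element becomes its own output document, which can then be filtered like any other field. The department data is invented, and this simplified `unwind` silently skips documents whose field is not an array, which is an assumption of the sketch rather than documented server behaviour.

```javascript
// Hypothetical department documents with embedded employee arrays.
const deptDocs = [
  { name: "Engineering", employees: [{ lname: "Vural" }, { lname: "Bashian" }] },
  { name: "Sales", employees: [{ lname: "Baik" }] }
];

// $unwind-style step: emit one copy of the document per array element.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    (Array.isArray(doc[field]) ? doc[field] : []).map((element) => ({
      ...doc,
      [field]: element
    }))
  );
}

const unwoundDocs = unwind(deptDocs, "employees");

// $match-style step on the now single-valued employees field.
const vuralDocs = unwoundDocs.filter((doc) => doc.employees.lname === "Vural");
console.log(unwoundDocs.length, vuralDocs);
```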
64
MongoDB GridFS
 “…specification for storing large files in MongoDB.”
 As the name implies, “Grid” allows the storage of very large files
divided across multiple MongoDB documents.
• Uses native BSON binary formats
 16MB per document
• Will be higher in future
 Large files added to GridFS get chunked and spread across
multiple documents.
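The chunking idea can be sketched with a Node.js `Buffer`: a file is cut into fixed-size pieces, each stored as its own document carrying the file's id and a sequence number. The 256 KB chunk size and the file id are assumptions for illustration; the `files_id`, `n`, and `data` field names follow the GridFS chunk document layout.

```javascript
// Assumed chunk size for this sketch; real drivers make this configurable.
const CHUNK_SIZE = 256 * 1024;

// Split a Buffer into GridFS-style chunk documents.
function toChunks(fileId, data, chunkSize = CHUNK_SIZE) {
  const chunks = [];
  for (let n = 0, offset = 0; offset < data.length; n++, offset += chunkSize) {
    chunks.push({
      files_id: fileId, // which file the chunk belongs to
      n: n, // position of the chunk within the file
      data: data.subarray(offset, offset + chunkSize)
    });
  }
  return chunks;
}

// A hypothetical 600 KB file filled with a dummy byte value.
const fileBytes = Buffer.alloc(600 * 1024, 1);
const fileChunks = toChunks("demo-file", fileBytes);
console.log(fileChunks.map((c) => [c.n, c.data.length]));
```

Because every chunk stays well under the 16MB document limit, the file as a whole can be far larger than any single document.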
67
MongoDB Indexes
 Similar to RDBMS indexes, B-tree (supports range queries)
 Can have many
 Can be compound
• Including indexes of array fields in document
 Makes searches, aggregates, and group functions faster
 Makes writes slower
 Sparse = true
• Only include documents in this index that actually contain a value in the
indexed field.
68
Text Indexes
 Currently in BETA, as of 2.4, not recommended for
production…yet
 Must be enabled in mongod
• --setParameter textSearchEnabled=true
 In mongo (shell)
• db["employees"].ensureIndex({"title":"text"})
• Index “title” field with text index
70
GEO Spatial Operations
 One of MongoDB’s sweet spots
 Used to store, index, search on geo-spatial data for GIS
operations.
 Requires special indexes, 2d and 2dsphere (new with 2.4)
 Requires Longitude and Latitude (in that order) coordinates
contained in double precision array within documents.
 Demo
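The coordinate order matters, so here is a small plain JavaScript sketch with two documents that store `[longitude, latitude]` arrays and a great-circle distance between them. The `loc` field name and the approximate city coordinates are assumptions, and the haversine formula stands in for the spherical math the server performs on indexed data.

```javascript
// Hypothetical location documents: [longitude, latitude], in that order.
const minneapolis = { name: "Minneapolis", loc: [-93.265, 44.9778] };
const saintPaul = { name: "Saint Paul", loc: [-93.09, 44.9537] };

function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance in kilometres using the haversine formula.
function haversineKm([lng1, lat1], [lng2, lat2]) {
  const earthRadiusKm = 6371;
  const dLat = toRadians(lat2 - lat1);
  const dLng = toRadians(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * earthRadiusKm * Math.asin(Math.sqrt(a));
}

const distanceKm = haversineKm(minneapolis.loc, saintPaul.loc);
console.log(distanceKm.toFixed(1) + " km");
```

Swapping the two numbers in `loc` is an easy mistake to make and silently places the point somewhere else on the globe.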
72
Query Pagination
 Use Spring Data and QueryDSL - http://www.querydsl.com/
 Modify the Spring Data repo to extend QueryDslPredicateExecutor
 Add appropriate Maven POM entries for QueryDSL
 Use Page and PageRequest objects to page through result sets
 QueryDSL will create Q<MODEL> Java classes
• Spares developers from writing pagination code
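The arithmetic that paging objects take care of is small but easy to get wrong. The sketch below, in plain JavaScript, turns a zero-based page number and page size into the skip/limit window and page counts; `pageWindow` is a hypothetical helper written for illustration, not part of Spring Data or QueryDSL.

```javascript
// Compute the skip/limit window for a zero-based page number (illustrative helper).
function pageWindow(pageNumber, pageSize, totalElements) {
  const totalPages = Math.ceil(totalElements / pageSize);
  return {
    skip: pageNumber * pageSize, // documents to pass over before this page
    limit: pageSize, // maximum documents on this page
    totalPages: totalPages,
    hasNext: pageNumber + 1 < totalPages
  };
}

// Third page (index 2) of 45 results, 10 per page.
console.log(pageWindow(2, 10, 45));
// Last page (index 4) holds the remaining 5 results.
console.log(pageWindow(4, 10, 45));
```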
73
Save vs. Update
 Java driver save() saves entire document.
 Use “update” to save time and bandwidth, and possibly indexing.
• Spring Data is slightly slower than lower level mongo Java driver
• Spring data fluent API is very helpful.
74
MongoDB Security
 http://www.mongodb.org/display/DOCS/Security+and+Authentication
 Default is trusted mode, no security
 --auth
 --keyFile
• Replica sets require this option
 New with 2.4:
• Kerberos Support
75
MongoDB Auth Security
 Use --auth switch to enable
 Create users with roles
 Use db.authenticate in the code (if need be)
76
MongoDB Auth Security with Spring
 May need to add credentials to Spring MongoDB config
 Do not authenticate twice
java.lang.IllegalStateException: can't call authenticate twice on
the same DBObject
at com.mongodb.DB.authenticate(DB.java:476)
77
MongoDB Write Concerns
 Describes quality of writes (or write assurances) to MongoDB
 Application (MongoDB client) is concerned with this quality
 Write concerns describe the durability of a write, and can be tuned
based on application and data needs
 Adjusting write concerns can have an effect (maybe deleterious)
on write performance.
78
MongoDB Encryption
 MongoDB does not support data encryption, per se
 Use application-level encryption and store encrypted data in BSON
fields
 Or…use TDE (Transparent Data Encryption) from Gazzang
• http://www.gazzang.com/encrypt-mongodb
79
MongoDB Licensing
 Database
• “Free Software Foundation's GNU AGPL v3.0.” – 10gen
• “Commercial licenses are also available from 10gen, including free
evaluation licenses.” – 10gen
 Drivers (API):
• “mongodb.org supported drivers: Apache License v2.0.” – 10gen
• “Third parties have created drivers too; licenses will vary there.” –
10gen
80
MongoDB 2.2
 Drop-in replacement for 1.8 and 2.0.x
 Aggregation without Map Reduce
 TTL Collections (alternative to Capped Collections)
 Tag-aware Sharding
 http://docs.mongodb.org/manual/release-notes/2.2/
81
MongoDB 2.4
 Text Search
• Must be enabled, off by default
• Introduces considerable overhead for processing and storage
• Not recommended for PROD systems; it is a BETA feature.
 Hashed Index and sharding
 http://docs.mongodb.org/manual/release-notes/2.4/
82
New JavaScript Engine – V8
 MongoDB 2.4 uses the Google V8 JavaScript Engine
• https://code.google.com/p/v8/
• Open source, written in C++,
• High performance, with improved concurrency for multiple JavaScript
operations in MongoDB at the same time.
83
Some Useful Commands
 use <db> - connects to a DB
 use admin; db.runCommand({top:1})
• Returns info about collection activity
 db.currentOp() – returns info about operations currently running in mongo db
 db.serverStatus()
 db.hostInfo()
 db.isMaster()
 db.runCommand({"buildInfo":1})
 it
 db.runCommand({touch:"employees",data:true,index:true})
• { "ok" : 1 }
84
Helpful Links
 Spring Data MongoDB - Reference Documentation: http://static.springsource.org/spring-data/data-mongodb/docs/1.0.2.RELEASE/reference/html/
 http://nosql-database.org/
 www.mongodb.org
 http://www.mongodb.org/display/DOCS/Java+Language+Center
 http://www.mongodb.org/display/DOCS/Books
 http://openmymind.net/2011/3/28/The-Little-MongoDB-Book/
 http://jimmyraywv.blogspot.com/2012/05/mongodb-and-spring-data.html
 http://jimmyraywv.blogspot.com/2012/04/mongodb-jongo-and-morphia.html
 https://www.10gen.com/presentations/webinar/online-conference-deep-dive-mongodb
 http://docs.mongodb.org/manual/faq/developers/#faq-developers-query-for-nulls
85
Questions

More Related Content

PPTX
Building Spring Data with MongoDB
PPTX
MongoDB + Spring
PDF
MongodB Internals
PPTX
Basics of MongoDB
PDF
Mongo DB
PDF
Mongo db dhruba
PPTX
MongoDB and Hadoop: Driving Business Insights
Building Spring Data with MongoDB
MongoDB + Spring
MongodB Internals
Basics of MongoDB
Mongo DB
Mongo db dhruba
MongoDB and Hadoop: Driving Business Insights

What's hot (20)

ODP
Introduction to MongoDB
PPTX
MongoDB
PPTX
MongoDB presentation
PPTX
Mongodb - NoSql Database
PPTX
Mongodb introduction and_internal(simple)
PPTX
Mongodb basics and architecture
PPTX
Mongo db operations_v2
PPTX
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
PDF
Introduction to MongoDB
PDF
Introduction to MongoDB
PDF
Using MongoDB + Hadoop Together
PPTX
PPT
Introduction to mongodb
PPTX
Back to Basics Webinar 1: Introduction to NoSQL
PPTX
An Introduction To NoSQL & MongoDB
PPTX
Mongo db intro.pptx
PPTX
Mongo DB Presentation
PPTX
Top 10 frameworks of node js
PPTX
KEY
Practical Ruby Projects With Mongo Db
Introduction to MongoDB
MongoDB
MongoDB presentation
Mongodb - NoSql Database
Mongodb introduction and_internal(simple)
Mongodb basics and architecture
Mongo db operations_v2
Mongo DB: Fundamentals & Basics/ An Overview of MongoDB/ Mongo DB tutorials
Introduction to MongoDB
Introduction to MongoDB
Using MongoDB + Hadoop Together
Introduction to mongodb
Back to Basics Webinar 1: Introduction to NoSQL
An Introduction To NoSQL & MongoDB
Mongo db intro.pptx
Mongo DB Presentation
Top 10 frameworks of node js
Practical Ruby Projects With Mongo Db
Ad

Viewers also liked (6)

PDF
Spring Data MongoDB 介紹
PPTX
MongoDB + Java + Spring Data
PPTX
MongoDB Shell Tips & Tricks
PDF
JHipster for Spring Boot webinar
PDF
MongoDB for Java Devs with Spring Data - MongoPhilly 2011
PDF
Java Persistence Frameworks for MongoDB
Spring Data MongoDB 介紹
MongoDB + Java + Spring Data
MongoDB Shell Tips & Tricks
JHipster for Spring Boot webinar
MongoDB for Java Devs with Spring Data - MongoPhilly 2011
Java Persistence Frameworks for MongoDB
Ad

Similar to MongoDB 2.4 and spring data (20)

PPTX
MongoDB Notes for BSC Students for all n
PDF
Baisc introduction of mongodb for beginn
PDF
20-NoSQLMongoDbiig data analytics hB.pdf
PPTX
MongoDB
PPTX
MongoDB
PPT
9. Document Oriented Databases
PDF
MongoDB.pdf
PDF
MongoDB Basics
PPTX
Silicon Valley Code Camp: 2011 Introduction to MongoDB
PPTX
Mongo db
PDF
MongoDB NoSQL database a deep dive -MyWhitePaper
PPTX
Techorama - Evolvable Application Development with MongoDB
PPTX
MongoDB
PPT
MongoDB Pros and Cons
PDF
Nosql part1 8th December
PPTX
Everything You Need to Know About MongoDB Development.pptx
PPTX
MongoDB_Sharan_Prakash_Babu
PDF
MongoDB: a gentle, friendly overview
PPTX
Einführung in MongoDB
PPTX
Big data technology unit 3
MongoDB Notes for BSC Students for all n
Baisc introduction of mongodb for beginn
20-NoSQLMongoDbiig data analytics hB.pdf
MongoDB
MongoDB
9. Document Oriented Databases
MongoDB.pdf
MongoDB Basics
Silicon Valley Code Camp: 2011 Introduction to MongoDB
Mongo db
MongoDB NoSQL database a deep dive -MyWhitePaper
Techorama - Evolvable Application Development with MongoDB
MongoDB
MongoDB Pros and Cons
Nosql part1 8th December
Everything You Need to Know About MongoDB Development.pptx
MongoDB_Sharan_Prakash_Babu
MongoDB: a gentle, friendly overview
Einführung in MongoDB
Big data technology unit 3

Recently uploaded (20)

PPTX
OMC Textile Division Presentation 2021.pptx
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
Mushroom cultivation and it's methods.pdf
PPT
Teaching material agriculture food technology
PPTX

MongoDB 2.4 and spring data

  • 1. 1 MongoDB 2.4 and Spring Data June 10th, 2013
  • 2. 2 Who Am I?  Solutions Architect with ICF Ironworks  Part-time Adjunct Professor  Started with HTML and Lotus Notes in 1992 • In the interim there was C, C++, VB, Lotus Script, PERL, LabVIEW, Oracle, MS SQL Server, etc.  Not so much an Early Adopter as much as a Fast Follower of Java Technologies  Alphabet Soup (MCSE, ICAAD, ICASA, SCJP, SCJD, PMP, CSM)  LinkedIn: http://www.linkedin.com/in/iamjimmyray  Blog: http://jimmyraywv.blogspot.com/ Avoiding Tech-sand
  • 3. 3 MongoDB 2.4 and Spring Data
  • 4. 4 Tonight’s Agenda  Quick introduction to NoSQL and MongoDB • Configuration • MongoVUE  Introduction to Spring Data and MongoDB support • Spring Data and MongoDB configuration • Templates • Repositories • Query Method Conventions • Custom Finders • Customizing Repositories • Metadata Mapping (including nested docs and DBRef) • Aggregation Functions • GridFS File Storage • Indexes
  • 5. 5 What is NoSQL?  Official: Not Only SQL • In reality, it may or may not use SQL*, at least in its truest form • Varies from the traditional RDBMS approach of the last few decades • Not necessarily a replacement for RDBMS; more of a solution for more specific needs where an RDBMS is not a great fit • Content Management (including CDNs), document storage, object storage, graph, etc.  It means different things to different folks. • It really comes down to a different way to view our data domains for more effective storage, retrieval, and analysis…albeit with tradeoffs that affect our design decisions.
  • 6. 6 From NoSQL-Database.org “NoSQL DEFINITION: Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable. The original intention has been modern web-scale databases. The movement began early 2009 and is growing rapidly. Often more characteristics apply such as: schema-free, easy replication support, simple API, eventually consistent / BASE (not ACID), a huge amount of data and more.”
  • 7. 7 Some NoSQL Flavors  Document Centric • MongoDB • Couchbase  Wide Column/Column Families • Cassandra • Hadoop HBase  XML (JSON, etc.) • MarkLogic  Graph • Neo4J  Key/Value Stores • Redis  Object • DB4O  Other • Lotus Notes/Domino
  • 8. 8 Why MongoDB  Open Source  Multiple platforms (Linux, Win, Solaris, Apple) and Language Drivers  Explicitly de-normalized  Document-centric and Schema-less (for the most part)  Fast (low latency) • Fast access to data • Low CPU overhead  Ease of scalability (replica sets), auto-sharding  Manages complex and polymorphic data  Great for CDN and document-based SOA solutions  Great for location-based and geospatial data solutions
  • 9. 9 Why MongoDB (more)  Because the schema-less approach is more flexible, MongoDB is intrinsically ready for iterative (Agile) projects.  Eliminates “impedance-mismatching” with typical RDBMS solutions  “How do I model my object/document based application in 3NF?”  If you are already familiar with JavaScript and JSON, MongoDB storage and document representation is easier to understand.  Near-real-time data aggregation support  10gen has been responsive to the MongoDB community
  • 10. 10 What is schema-less?  A.K.A. schema-free, 10gen says “flexible-schema”  It means that MongoDB does not enforce a column data type on the fields within your document, nor does it confine your document to specific columns defined in a table definition.  The schema “can be” actually controlled via the application API layers and is implied by the “shape” (content) of your documents.  This means that different documents in the same collection can have different fields. • So the schema is flexible in that way • Only the _id field is mandatory in all documents.  Requires more rigor on the application side.
  • 11. 11 Is MongoDB really schema-less?  Technically no.  There is the System Catalog of system collections • <database>.system.namespaces • <database>.system.indexes • <database>.system.profile • <database>.system.users  And…because of the nature of how docs are stored in collections (JSON/BSON), field labels are stored in every doc*
  • 12. 12 Schema tips  MongoDB has ObjectId, can be placed in _id • If you have a natural unique ID, use that instead  De-normalize when needed (you must know MongoDB restrictions) • For example: Compound indexes cannot contain parallel arrays  Create indexes that cover queries • Mongo only uses one index at a time for a query • Watch out for sorts • Watch out for field sequence in compound indexes.  Reduce size of collections (watch out for label sizes)
  • 13. 13 MongoDB Data Modeling and Node Setups  Schema Design is still important  Understand your concerns • Do you have read-intensive or write-intensive data • Document embedding (fastest and atomic) vs. references (normalized) • Atomicity – Document Level Only • Can use 2-Phase Commit Pattern • Data Durability • Not “truly” available in a single-server setup • Requires write concern tuning • Need sharding and/or replicas  10gen offers patterns and documentation: • http://docs.mongodb.org/manual/core/data-modeling/
  • 14. 14 Why Not MongoDB  High speed and deterministic transactions: • Banking and accounting • See MongoDB Global Write Locking – Improved by better yielding in 2.0  Where SQL is absolutely required • Where true Joins are needed*  Traditional non-real-time data warehousing ops*  If your organization lacks the controls and rigor to place schema and document definition at the application level without compromising data integrity**
  • 15. 15 MongoDB  Was designed to overcome some of the performance shortcomings of RDBMS  Some Features • Memory Mapped IO (32bit vs. 64bit) • Fast Querying (atomic operations, embedded data) • In place updates (physical writes lag in-memory changes) • Depends on Write Concern settings • Full Index support (including compound indexes, text, spherical) • Replication/High Availability (see CAP Theorem) • Auto Sharding (range-based partitioning, based on shard key) for scalability • Aggregation, MapReduce, geo-spatial • GridFS
  • 16. 16 MongoDB – In Place Updates  No need to get document from the server, just send update  Physical disk writes lag in-memory changes. • Lag depends on Write-Concerns (Write-through) • Multiple writes in memory can occur before the object is updated on disk  MongoDB uses an adaptive allocation algorithm for storing its objects. • If an object changes and fits in its current location, it stays there. • However, if it is now larger, it is moved to a new location. This moving is expensive for index updates • MongoDB looks at collections and based on how many times items grow within a collection, MongoDB calculates a padding factor that tries to account for object growth • This minimizes object relocation
  • 17. 17 MongoDB – A Word About Sharding…  Need to choose the right key • Easily divisible (“splittable” – see cardinality) so that Mongo can distribute data among shards • “all documents that have the same value in the state field must reside on the same shard” – 10Gen • Enable distributed write operations between cluster nodes • Prevents single-shard bottlenecking • Make it possible for “mongos” to return most query operations from multiple shards (or single shard if you can guarantee contiguous storage in that shard**) • Distribute writes evenly among mongos • Minimize disk seeks per mongos • “users will generally have a unique value for this field (Phone) – MongoDB will be able to split as many chunks as needed” – 10Gen  Watch out for the need to perform range queries.
  • 18. 18 MongoDB – Cardinality…  In most cases, when sharding for performance, you want higher cardinality to allow chunks of data to be split among shards • Example: Address data components • State – Low Cardinality • ZipCode – Potentially low or high, depending on population • Phone Number – High Cardinality  High cardinality is a good start for sharding, but… • …it does not guarantee query isolation • …it does not guarantee write scaling • Consider computed keys (Hashed, MD5, etc.)
  • 19. 19 CAP Theorem  Consistency – all nodes see the same data at the same time  Availability – all requests receive responses, guaranteed  Partition Tolerance (network partition tolerance)  The theorem states that you can never have all three, so you plan for two and make the best of the third. • For example: Perhaps “eventual consistency” is OK for a CDN application. • For large scalability, you would need partitioning. That leaves C & A to choose from • Would you ever choose consistency over availability?  How do cloud implementations change this?
  • 21. 21 Container Models: RDBMS vs. MongoDB  RDBMS: Servers > Databases > Schemas > Tables > Rows • Joins, Group By, ACID  MongoDB: Servers > Databases > Collections > Documents • No Joins** • Instead: Db References (Linking) and Nested Documents (Embedding)
  • 22. 22 MongoDB Collections  Schema-less  Can have up to 24000 (according to 10gen) • Cheap to resource  Contain documents (…of varying shapes) • 100 nesting levels (version 2.2)  Are namespaces, like indexes  Can be “Capped” • Limited in max size with rotating overwrites of oldest entries • Logging anyone? • Example: MongoDB oplog  TTL Collections
  • 23. 23 MongoDB Documents  JSON (what you see) • Actually BSON (Internal - Binary JSON - http://bsonspec.org/)  Elements are name/value pairs  16 MB maximum size  What you see is what is stored • No default fields (columns)
  • 25. 25 JSON Syntax  Curly braces are used for documents/objects – {…}  Square brackets are used for arrays – […]  Colons are used to link keys to values – key:value  Commas are used to separate multiple objects or elements or key/value pairs – {key1:value1, key2:value2…}  JSON has how many data types? • 6 – String, Number, Array, Object, null, Boolean
  • 26. 26 JSON Syntax Example { "application" : "HR System", "users" : [{"name" : "bill", "age" : 60}, {"name" : "fred", "age" : 29}] }
  • 27. 27 Why BSON?  Adds data types that JSON did not support – (ISO Dates, ObjectId, etc.)  Optimized for performance  Adds compression  http://bsonspec.org/#/specification
  • 28. 28 MongoDB Install  Extract MongoDB  Build config file, or use startup script • Need dbpath configured • Need REST configured for Web Admin tool  Start Mongod (daemon) process  Use Shell (mongo) to access your database  Use MongoVUE (or other) for GUI access and to learn shell commands
  • 30. 30 Mongo Shell  In Windows, mongo.exe  Interactive JavaScript shell to mongod  Command-line interface to MongoDB (sort of like SQL*Plus for Oracle)  JavaScript Interpreter, behaves like a read-eval-print loop  Can be run without database connection (use --nodb)  Uses a fluent API with lazy cursor evaluation • db.locations.find({state:'MN'},{city:1,state:1,_id:0}).sort({city:-1}).limit(5).toArray();
  • 31. 31 MongoVUE  GUI around MongoDB Shell  Current version 1.61 (May 2013)  Makes it easy to learn MongoDB Shell commands • db.employee.find({ "lastName" : "Smith", "firstName" : "John" }).limit(50); • show collections  Not sure if development is continuing, but very handy still.  Demo…
  • 32. 32 Mongo Explorer  Silverlight GUI  Current development has stopped – for now.  http://mongoexplorer.com/  Demo…
  • 33. 33 Web Admin Interface  Localhost:<mongod port + 1000>  Quick stats viewer  Run commands  Demo  There is also Sleepy Mongoose • http://www.kchodorow.com/blog/2010/02/22/sleepy-mongoose-a-mongodb-rest-interface/
  • 35. 35 Other MongoDB Tools  Edda – Log Visualizer • http://blog.mongodb.org/post/28053108398/edda-a-log-visualizer-for-mongodb • Requires Python  MongoDB Monitoring Service • Free cloud-based service that monitors MongoDB instances via configured agents. • Requires Python • http://www.10gen.com/products/mongodb-monitoring-service  Splunk • www.splunk.com
  • 36. 36 MongoImport  Binary mongoimport  Syntax: mongoimport --stopOnError --port 29009 --db geo --collection geos --file C:UserDataDocsJUGsTwinCitieszips.json  Don’t use for backup or restore in production • Use mongodump and mongorestore
  • 37. 37 Spring Data  Large Spring project with many subprojects • Category: Document Stores, Subproject MongoDB  “…aims to provide a familiar and consistent Spring-based programming model…”  Like other Spring projects, Data is POJO Oriented  For MongoDB, provides high-level API and access to low-level API for managing MongoDB documents.  Provides annotation-driven meta-mapping  Will allow you into bowels of API if you choose to hang out there
  • 38. 38 Spring Data MongoDB Templates  Implements MongoOperations (mongoOps) interface • mongoOps defines the basic set of MongoDB operations for the Spring Data API. • Wraps the lower-level MongoDB API  Provides access to the lower-level API  Provides foundation for upper-level Repository API.  Demo
  • 39. 39 Spring Data MongoDB Templates - Configuration  See mongo-config.xml
  • 40. 40 Spring Data MongoDB Templates - Configuration  Or…see the config class
  • 41. 41 Spring Data MongoDB Templates - Configuration
  • 42. 42 Spring Data Repositories  Convenience for data access • Spring does ALL the work (unless you customize)  Convention over configuration • Uses a method-naming convention that Spring interprets during implementation  Hides complexities of Spring Data templates and underlying API  Builds implementation for you based on interface design • Implementation is built during Spring container load.  Is typed (parameterized via generics) to the model objects you want to store. • When extending MongoRepository • Otherwise uses @RepositoryDefinition annotation  Demo
  • 43. 43 Spring Data Bulk Inserts  All things being equal, bulk inserts in MongoDB can be faster than inserting one record at a time, if you have batch inserts to perform.  As of MongoDB 1.8, the max BSON size of a batch insert was increased from 4MB to 16MB • You can check this with the shell command: db.isMaster() or mongo.getMaxBsonObjectSize() in the Java API  Batch sizes can be tuned for performance  Demo
  • 44. 44 Transformers  Does the “heavy lifting” by preparing MongoDB objects for insertion  Transforms Java domain objects into MongoDB DBObjects.  Demo
  • 45. 45 Converters  For read and write, overrides default mapping of Java objects to MongoDB documents  Implements the Spring…Converter interface  Registered with MongoDB configuration in Spring context  Handy when integrating MongoDB to existing application.  Can be used to remove “_class” field
  • 46. 46 Spring Data Meta Mapping  Annotation-driven mapping of model object fields to Spring Data elements in specific database dialect. – Demo
  • 47. 47 MongoDB DBRef  Optional  Instead of nesting documents  Have to save the “referenced” document first, so that DBRef exists before adding it to the “parent” document
  • 50. 50 MongoDB Custom Spring Data Repositories  Hooks into Spring Data bean type hierarchy that allows you to add functionality to repositories  Important: You must write the implementation for part of this custom repository  And…your Spring Data repository interface must extend this custom interface, along with the appropriate Spring Data repository  Demo
  • 51. 51 Creating a Custom Repository  Write an interface for the custom methods  Write the implementation for that interface  Write the traditional Spring Data Repository application interface, extending the appropriate Spring Data interface and the (above) custom interface  When Spring starts, it will implement the Spring Data Repository normally, and include the custom implementation as well.
  • 52. 52 MongoDB Queries  In mongo (shell) using JS: db.collection.find( <query>, <projection> ) • Use the projection to limit fields returned, and therefore network traffic  Example: db["employees"].find({"title":"Senior Engineer"})  Or: db.employees.find({"title":"Senior Engineer"},{"_id":0})  Or: db.employees.find({"title":"Senior Engineer"},{"_id":0,"title":1})  In Java use DBObject or Spring Data Query for mapping queries.  You can include and exclude fields in the projection argument. • You either include (1) or exclude (0) • You cannot include and exclude in the same projection, except for the “_id” field.
  • 53. 53 DBObject and BasicDBObject  For the Mongo Java driver, DBObject is the Interface, BasicDBObject is the class  This is essentially a map with additional Mongo functionality • See partial objects when up-serting  DBObject is used to build commands, queries, projections, and documents  DBObjects are used to build out the JS queries that would normally run in the shell. Each {…} is a potential DBObject.
  • 54. 54 MongoDB Queries – And & Or  Comma denotes “and”, and you can use $and • db.employees.find({"title":"Senior Engineer","lastName":"Bashian"},{"_id":0,"title":1})  For Or, you must use the $or operator • db.employees.find({$or:[{"lastName":"Bashian"},{"lastName":"Baik"}]},{"_id":0, "title":1,"lastName":1})  In Java, use DBObjects and ArrayLists… • Nest or/and ArrayLists for compound queries  Or use the Spring Data Query and Criteria classes with or criteria  Also see QueryBuilder class  Demo
  • 55. 55 MongoDB Array Queries  db.misc.insert({users:["jimmy", "griffin"]})  db.misc.find({users:"griffin"}) • { "_id" : ObjectId("518a5b7e18aa54b5cf8fc333"), "users" : [ "jimmy", "griffin" ]}  db.misc.find({users:{$elemMatch:{name:"jimmy",gender:"male"}}})  { "_id" : ObjectId("518a599818aa54b5cf8fc332"), "users" : [ { "name" : "jimmy", "gender" : "male" }, { "name" : "griffin", "gender": "male" } ] }
  • 56. 56 MongoDB Array Updates db.misc.insert({"users":[{"name":"jimmy","gender":"male"},{"name":"griffin","gender":"male"}]}) db.misc.update({"_id":ObjectId("518276054e094734807395b6"), "users.name":"jimmy"}, {$set:{"users.$.name":"george"}}) db.employees.update({products:"Softball"}, {$pull:{products:"Softball" }},false,true) db.employees.find({products:"Softball"}).count() 0
  • 57. 57 Does Field Exist  $exists  db.locations.find({user:{$exists:false}})  Type “it” for more – iterates over documents - paging
  • 58. 58 MongoDB Advanced Queries  http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24all  May use Mongo Java driver and BasicDBObjectBuilder  Spring Data fluent API is much easier  Demo - $in, $nin, $gt ($gte), $lt ($lte), $all, ranges
  • 59. 59 MongoDB RegEx Queries  In JS: db.employees.find({ "title" : { "$regex" : "seNior EngIneer" , "$options" : "i"}})  In Java use java.util.regex.Pattern
  • 60. 60 Optimizing Queries  Use $hint or hint() in JS to tell MongoDB to use specific index  Use hint() in Java API with fluent API  Use $explain or explain() to see MongoDB query explain plan • Number of scanned objects should be close to the number of returned objects
  • 61. 61 MongoDB Aggregation Functions  Aggregation Framework  Map/Reduce - Demo  Distinct - Demo  Group - Demo • Similar to SQL Group By function  Count  Demo #7
  • 62. 62 More Aggregation  $unwind • Useful command to convert arrays of objects, within documents, into sub-documents that are then searchable by query. db.depts.aggregate({"$project":{"employees":"$employees"}},{"$unwind":"$employees"},{"$match":{"employees.lname":"Vural"}});  Demo
  • 64. 64 MongoDB GridFS  “…specification for storing large files in MongoDB.”  As the name implies, “Grid” allows the storage of very large files divided across multiple MongoDB documents. • Uses native BSON binary formats  16MB per document • Will be higher in future  Large files added to GridFS get chunked and spread across multiple documents.
  • 67. 67 MongoDB Indexes  Similar to RDBMS Indexes, Btree (support range queries)  Can have many  Can be compound • Including indexes of array fields in document  Makes searches, aggregates, and group functions faster  Makes writes slower  Sparse = true • Only include documents in this index that actually contain a value in the indexed field.
  • 68. 68 Text Indexes  Currently in BETA, as of 2.4, not recommended for production…yet  Must be enabled in mongod • --setParameter textSearchEnabled=true  In mongo (shell) • db["employees"].ensureIndex({"title":"text"}) • Index “title” field with text index
  • 70. 70 GEO Spatial Operations  One of MongoDB’s sweet spots  Used to store, index, search on geo-spatial data for GIS operations.  Requires special indexes, 2d and 2dsphere (new with 2.4)  Requires Longitude and Latitude (in that order) coordinates contained in double precision array within documents.  Demo
  • 72. 72 Query Pagination  Use Spring Data and QueryDSL - http://www.querydsl.com/  Modify the Spring Data repo to extend QueryDslPredicateExecutor  Add appropriate Maven POM entries for QueryDSL  Use Page and PageRequest objects to page through result sets  QueryDSL will create Q<MODEL> Java classes • Precludes developers from writing pagination code
  • 73. 73 Save vs. Update  Java driver save() saves entire document.  Use “update” to save time and bandwidth, and possibly indexing. • Spring Data is slightly slower than lower level mongo Java driver • Spring data fluent API is very helpful.
  • 74. 74 MongoDB Security  http://www.mongodb.org/display/DOCS/Security+and+Authentication  Default is trusted mode, no security  --auth  --keyfile • Replica sets require this option  New with 2.4: • Kerberos Support
  • 75. 75 MongoDB Auth Security  Use --auth switch to enable  Create users with roles  Use db.authenticate in the code (if need be)
  • 76. 76 MongoDB Auth Security with Spring  May need to add credentials to Spring MongoDB config  Do not authenticate twice java.lang.IllegalStateException: can't call authenticate twice on the same DBObject at com.mongodb.DB.authenticate(DB.java:476)
  • 77. 77 MongoDB Write Concerns  Describes quality of writes (or write assurances) to MongoDB  Application (MongoDB client) is concerned with this quality  Write concerns describe the durability of a write, and can be tuned based on application and data needs  Adjusting write concerns can have an effect (maybe deleterious) on write performance.
  • 78. 78 MongoDB Encryption  MongoDB does not support data encryption, per se  Use application-level encryption and store encrypted data in BSON fields  Or…use TDE (Transparent Data Encryption) from Gazzang • http://www.gazzang.com/encrypt-mongodb
  • 79. 79 MongoDB Licensing  Database • “Free Software Foundation's GNU AGPL v3.0.” – 10gen • “Commercial licenses are also available from 10gen, including free evaluation licenses.” – 10gen  Drivers (API): • “mongodb.org supported drivers: Apache License v2.0.” – 10gen • “Third parties have created drivers too; licenses will vary there.” – 10gen
  • 80. 80 MongoDB 2.2  Drop-in replacement for 1.8 and 2.0.x  Aggregation without Map Reduce  TTL Collections (alternative to Capped Collections)  Tag-aware Sharding  http://docs.mongodb.org/manual/release-notes/2.2/
  • 81. 81 MongoDB 2.4  Text Search • Must be enabled, off by default • Introduces considerable overhead for processing and storage • Not recommended for PROD systems; it is a BETA feature.  Hashed Index and sharding  http://docs.mongodb.org/manual/release-notes/2.4/
  • 82. 82 New JavaScript Engine – V8  MongoDB 2.4 uses the Google V8 JavaScript Engine • https://code.google.com/p/v8/ • Open source, written in C++, • High performance, with improved concurrency for multiple JavaScript operations in MongoDB at the same time.
  • 83. 83 Some Useful Commands  use <db> - connects to a DB  use admin; db.runCommand({top:1}) • Returns info about collection activity  db.currentOp() – returns info about operations currently running in mongo db  db.serverStatus()  db.hostInfo()  db.isMaster()  db.runCommand({"buildInfo":1})  it  db.runCommand({touch:"employees",data:true,index:true}) • { "ok" : 1 }
  • 84. 84 Helpful Links  Spring Data MongoDB - Reference Documentation: http://static.springsource.org/spring-data/data-mongodb/docs/1.0.2.RELEASE/reference/html/  http://nosql-database.org/  www.mongodb.org  http://www.mongodb.org/display/DOCS/Java+Language+Center  http://www.mongodb.org/display/DOCS/Books  http://openmymind.net/2011/3/28/The-Little-MongoDB-Book/  http://jimmyraywv.blogspot.com/2012/05/mongodb-and-spring-data.html  http://jimmyraywv.blogspot.com/2012/04/mongodb-jongo-and-morphia.html  https://www.10gen.com/presentations/webinar/online-conference-deep-dive-mongodb  http://docs.mongodb.org/manual/faq/developers/#faq-developers-query-for-nulls
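
The projection rule from the "MongoDB Queries" slide (include with 1 or exclude with 0, never both in one projection, except for "_id") can be illustrated without a running server. What follows is a minimal sketch in plain JavaScript, not shell or driver code: `isValidProjection` and `applyProjection` are hypothetical helper names invented for this illustration, the sample employee document is made up, and only top-level fields are handled.

```javascript
// Hypothetical helper (not a MongoDB API): checks the include/exclude rule
// for a projection document such as {_id: 0, title: 1}.
function isValidProjection(projection) {
  // _id is the one field that may be excluded alongside included fields.
  const flags = Object.keys(projection)
    .filter((field) => field !== "_id")
    .map((field) => Boolean(projection[field]));
  const includes = flags.some((flag) => flag);
  const excludes = flags.some((flag) => !flag);
  return !(includes && excludes);
}

// Hypothetical helper: emulates how a valid projection shapes one document.
function applyProjection(doc, projection) {
  if (!isValidProjection(projection)) {
    throw new Error("cannot mix inclusion and exclusion in one projection");
  }
  const others = Object.keys(projection).filter((f) => f !== "_id");
  // Inclusion mode: some non-_id field is set to 1, or only {_id: 1} is given.
  const inclusive =
    others.some((f) => Boolean(projection[f])) ||
    (others.length === 0 && Boolean(projection._id));
  const result = {};
  for (const [field, value] of Object.entries(doc)) {
    if (field === "_id") {
      // _id is returned unless it is explicitly excluded.
      if (!("_id" in projection) || Boolean(projection._id)) result._id = value;
    } else if (inclusive) {
      if (Boolean(projection[field])) result[field] = value;
    } else if (!(field in projection)) {
      // Exclusion mode (or empty projection): keep every field not listed.
      result[field] = value;
    }
  }
  return result;
}

// Made-up sample document; field names follow the slide's examples.
const employee = { _id: 7, title: "Senior Engineer", lastName: "Bashian" };
console.log(applyProjection(employee, { _id: 0, title: 1 }));
console.log(applyProjection(employee, { _id: 0 }));
```

The first call keeps only "title"; the second drops "_id" and keeps everything else, which mirrors the two projection forms shown on the slide.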

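To make the $unwind stage from the "More Aggregation" slide concrete: it emits one output document per array element, with the array field replaced by that element. Below is a small emulation in plain JavaScript under stated assumptions: `unwind` is an illustrative helper rather than the server's implementation, the department documents are made-up sample data, and values that are not arrays are simply skipped here.

```javascript
// Illustrative emulation of $unwind (not MongoDB's implementation):
// produces one output document per element of doc[field].
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    const values = doc[field];
    // A missing field or an empty array yields no output documents;
    // non-array values are skipped in this sketch for simplicity.
    if (!Array.isArray(values)) continue;
    for (const value of values) {
      // Copy the document, replacing the array with a single element.
      out.push({ ...doc, [field]: value });
    }
  }
  return out;
}

// Made-up sample data shaped like the depts collection on the slide.
const depts = [
  { _id: 1, name: "R&D", employees: [{ lname: "Vural" }, { lname: "Baik" }] },
  { _id: 2, name: "Sales", employees: [{ lname: "Bashian" }] },
];

// Rough equivalent of {$unwind: "$employees"} followed by
// {$match: {"employees.lname": "Vural"}}.
const matches = unwind(depts, "employees").filter(
  (doc) => doc.employees.lname === "Vural"
);
console.log(matches);
```

After unwinding, "employees" holds a single sub-document per output row, which is why the match on "employees.lname" can then select individual employees rather than whole departments.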
Editor's Notes

  • #21: Sharded data across multiple (3) instances, across multiple (3) replicas. Writes isolated to one replica.
  • #39: Demo: Run MongoTemplateTest.java; Clear DB; Run EmployeeLoader.java; Run EmployeeServiceTest.java; Clear DB; Run Short load Employees; Run JavaMugEmployeeTest.java; Clear DB; Run CollectionOpsTest.java
  • #43: Demo: Show Employee service and Employee Repository
  • #44: Demo: Clean Database; Run EmployeeBulkInsertTest.java
  • #45: Demo: Show EmployeeTransformer.java
  • #46: Demo: Show mongo-config-covert.xml
  • #47: Demo: Show EmployeeProperties.java, Employee.java, Customer.java, etc.
  • #48: Demo: Show Customer.java and CustomerAddress.java; Show CustomerLoader.java
  • #49: Demo: Show Customer.java and CustomerAddress.java; Show CustomerLoader.java
  • #50: Demo: Show Customer.java and CustomerAddress.java; Show CustomerLoader.java
  • #52: Demo: Show RepositoryUtils.java, RepositoryUtilsImpl.java, CustomerRepository.java, CustomerRepositoryImpl.java; Run CustomerLoader.java; Run CustomerServiceTest.java; Clear DB
  • #53: Demo: Clean Database; Run Short Loader; Run BasicQueryTest.java
  • #54: Demo: Show BasicQueryTest.java
  • #55: Demo: Clean Database; Run Short Loader; Run BasicQueryTest.java; Run/Show RegExQueryTest.java
  • #58: Demo: Run EmployeeShortLoader.java; Run InNinQueryTest.java; Run SalaryGtQueryTest.java; Run SalaryRangeQueryTest.java; Run TagTest.java
  • #59: Demo: Run EmployeeShortLoader.java; Run InNinQueryTest.java; Run SalaryGtQueryTest.java; Run SalaryRangeQueryTest.java; Run TagTest.java
  • #60: Demo: Run RegExQueryTest.java
  • #61: Demo: Run ExplainAndHint.java
  • #62: Demo: Run EmployeeLoader.java; Run DistinctTest.java; Run EmployeeDeptGroupTest.java or Run EmployeeTitleGroupTest.java; Run MapReduceTest.java (Show MapReduce.groovy)
  • #63: Demo: Run/Show UnwindTest.java
  • #64: Demo: Run/Show UnwindTest.java
  • #65: Demo: Run FileOps.java; Show FileStorageServiceImpl.java; Show mongo-config.xml, beans.xml, main.xml
  • #66: Demo: Run FileOps.java; Show FileStorageServiceImpl.java; Show mongo-config.xml, beans.xml, main.xml
  • #67: Demo: Run FileOps.java; Show FileStorageServiceImpl.java; Show mongo-config.xml, beans.xml, main.xml
  • #68: Demo: Show Employee.java; Show Indexes in MongoVUE
  • #69: Demo: Clean Database; Load Short load; Stop mongod; Start mongod with auth switch; Run FullTextSearchTest.java
  • #70: Demo: Clean Database; Load Short load; Stop mongod; Start mongod with auth switch; Run FullTextSearchTest.java
  • #71: Load/Show locsdb – LocationLoader.java; Show/Run LocationQueriesTest.java; Show LocationRepository.java; Show LocationServiceImpl.java
  • #72: Load/Show locsdb – LocationLoader.java; Show/Run LocationQueriesTest.java; Show LocationRepository.java; Show LocationServiceImpl.java
  • #73: Show EmployeeRepo; Show PageQueryTest; Show target/generated-sources/java
  • #74: Show UpdateTest.java
  • #75: Show mongo-config-auth.xml; Show FullTextSearchTest.java
  • #78: Demo: Stop mongod; Start mongod with no auth; Clean database; Load short load; Run WriteConcernTest.java