MongoDB Schema Design
Posted: 18-Oct-2014
Schema Design
by Alex Litvinok
Basic unit of data – Document..
What is a document?
• BSON Document
• Embedding
• Links across documents
Example
event = {
  _id: ObjectId('47cc67093475061e3d95369d'),
  name: 'MeetUP #2',
  date: ISODate('2012-04-05 19:00:00'),
  where: {
    city: 'Minsk',
    address: 'Nezavisimosti, 186'
  }
}
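A minimal sketch of how the embedded `where` sub-document can be reached with dot notation. It uses a plain JavaScript object in place of the shell document (so no `ObjectId` or `ISODate`), and `getPath` is a hypothetical helper written for this illustration, not a MongoDB function.

```javascript
// Plain-object stand-in for the event document above.
// ObjectId and ISODate exist only in the mongo shell, so a string and a Date are used.
const eventDoc = {
  _id: "47cc67093475061e3d95369d",
  name: "MeetUP #2",
  date: new Date("2012-04-05T19:00:00Z"),
  where: { city: "Minsk", address: "Nezavisimosti, 186" },
};

// getPath is a hypothetical helper that mimics dot notation such as "where.city".
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

console.log(getPath(eventDoc, "where.city")); // Minsk
console.log(getPath(eventDoc, "where.zip"));  // undefined
```

A missing field simply yields `undefined` here; there is no schema that forces every document to carry it.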
RDBMS? @#$.? NoSQL!
Relational DB | Document DB
Database | Database
Table | Collection
Row | Document
Index | Index
Join | Embedding and links
Partition | Shard
Partition key | Shard key
Why?
• Make queries easy and fast
• Facilitate sharding and atomicity
Strategy
• Start with a normalized model
• Embed docs for simplicity and optimization
Normalized? Denormalized?
Product: _id, name, price, desc
Normalized schema
Order = {
_id : orderId,
user : userInfo,
items : [
productId1,
productId2,
productId3
]
}
Product = {
_id: productId,
name : name,
price : price,
desc : description
}
Order: _id, user, items *
* Link to the product collection
Normalized schema
• Normalized documents are a perfectly acceptable way to use MongoDB.
• Normalized documents provide maximum flexibility.
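The normalized layout can be sketched with in-memory stand-ins for the two collections. All names and values below are made up for illustration, and `resolveItems` is a hypothetical helper; the point is only that an order stores product `_id`s and the application resolves them in a second step.

```javascript
// In-memory stand-ins for the product and order collections; all values are made up.
const products = new Map([
  ["p1", { _id: "p1", name: "Keyboard", price: 30, desc: "USB keyboard" }],
  ["p2", { _id: "p2", name: "Mouse", price: 15, desc: "Optical mouse" }],
]);
const order = { _id: "o1", user: "alice", items: ["p1", "p2"] };

// Resolve each stored _id into the product it links to (an extra lookup per order).
function resolveItems(order, products) {
  return order.items.map((id) => products.get(id));
}

// A price change touches one product document; every order sees it on the next read.
products.get("p2").price = 12;
const total = resolveItems(order, products).reduce((sum, p) => sum + p.price, 0);
console.log(total); // 42
```

The flexibility comes at the cost of the extra lookup: the order alone is not enough to compute a total.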
Links across documents
DBRef { $ref : <collname>, $id : <idvalue>[, $db : <dbname>] }
Or simply store the _id of the linked document..
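As a rough illustration of the difference between the two link styles, the sketch below resolves a DBRef-shaped object against in-memory maps. The `databases` structure and the `resolveRef` helper are assumptions made for this example; real drivers handle DBRef resolution in their own ways.

```javascript
// In-memory stand-in: database name -> collection name -> Map of _id -> document.
const databases = {
  shop: {
    products: new Map([["p1", { _id: "p1", name: "Keyboard" }]]),
  },
};

// resolveRef is a hypothetical helper; $db is optional and falls back to the current database.
function resolveRef(ref, currentDb) {
  const db = databases[ref.$db || currentDb];
  const collection = db && db[ref.$ref];
  return collection ? collection.get(ref.$id) || null : null;
}

const viaDbRef = { $ref: "products", $id: "p1", $db: "shop" };
console.log(resolveRef(viaDbRef, "shop").name); // Keyboard

// With a bare _id, the collection name lives in application code, not in the document.
const bareId = "p1";
console.log(databases.shop.products.get(bareId).name); // Keyboard
```

A DBRef carries its target collection with it, while a bare `_id` is smaller but relies on the application knowing where to look.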
Denormalized schema
Order = {
_id : orderId,
user : userInfo,
items : [ {
_id: productId1,
name : name1,
price : price1
}, {
_id: productId2,
name : name2,
price : price2
} ]
}
Order: _id, user, items [ { _id, name, price }, { _id, name, price } ]
Denormalized schema
• Embedded documents are good for fast queries.
• Embedded documents are always available together with the parent document.
• Embedded and nested documents are good for storing complex hierarchies.
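A small sketch of the trade-off behind the points above, using made-up data: the order embeds a copy of each product, so a total needs no second lookup, but a later change to the catalog does not reach the copy. `orderTotal` is a hypothetical helper.

```javascript
// Made-up data: the catalog plus an order that embeds a copy of each product.
const catalog = [
  { _id: "p1", name: "Keyboard", price: 30 },
  { _id: "p2", name: "Mouse", price: 15 },
];
const embeddedOrder = {
  _id: "o1",
  user: "alice",
  items: catalog.map(({ _id, name, price }) => ({ _id, name, price })),
};

// One read is enough: the total needs no second lookup.
const orderTotal = (order) =>
  order.items.reduce((sum, item) => sum + item.price, 0);

// The cost: a later catalog change does not reach the embedded copy.
catalog[1].price = 12;
console.log(orderTotal(embeddedOrder)); // 45
```

For an order this is often the desired behavior, since the price paid should not change after the fact; for other data the stale copy would have to be updated explicitly.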
Embedding documents
{
title : "Contributors",
data: [
{ name: "Grover" },
{ name: "James", surname: "Madison" },
{ surname: "Grant" }
]
}
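Because the embedded documents above do not all carry the same fields, code that reads them has to tolerate missing keys. A minimal sketch, where `displayName` is a hypothetical helper:

```javascript
const contributors = {
  title: "Contributors",
  data: [
    { name: "Grover" },
    { name: "James", surname: "Madison" },
    { surname: "Grant" },
  ],
};

// displayName is a hypothetical helper: join whichever name parts are present.
function displayName(person) {
  return [person.name, person.surname].filter(Boolean).join(" ");
}

console.log(contributors.data.map(displayName));
// [ 'Grover', 'James Madison', 'Grant' ]
```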
..fast queries
Indexes
Basics
> db.collection.ensureIndex({ name:1 });
Indexing on Embedded Fields
> db.collection.ensureIndex({ "location.city":1 })
Compound Keys
> db.collection.ensureIndex({ name:1, age:-1 })
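To make the compound key `{ name:1, age:-1 }` concrete, the sketch below orders a few made-up documents the way such a key describes: by `name` ascending, then by `age` descending within equal names. `compareBySpec` is a hypothetical helper and says nothing about how MongoDB stores the index internally.

```javascript
// Build a comparator from an index-style spec such as { name: 1, age: -1 }.
// compareBySpec is a hypothetical helper, not part of any MongoDB API.
function compareBySpec(spec) {
  const keys = Object.entries(spec); // insertion order is the key order
  return (a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0;
  };
}

const people = [
  { name: "Bob", age: 25 },
  { name: "Ann", age: 30 },
  { name: "Ann", age: 41 },
];
const ordered = [...people].sort(compareBySpec({ name: 1, age: -1 }));
console.log(ordered.map((p) => `${p.name} ${p.age}`));
// [ 'Ann 41', 'Ann 30', 'Bob 25' ]
```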
Also indexes..
The _id Index
• Created automatically, except for capped collections
• This index is special and cannot be dropped
• Enforces uniqueness for its keys
Indexing Array Elements
• Each element of the array gets its own index entry
Compound Keys
• Direction of the index ( 1 for ascending or -1 for descending )
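The "Indexing Array Elements" point can be pictured as a map from each array element to the documents that contain it. This is a toy illustration with made-up data; `buildMultikeyIndex` is a hypothetical helper and not how MongoDB implements its indexes.

```javascript
// Toy map from array element -> _ids of the documents containing it.
// buildMultikeyIndex is a hypothetical helper; field and values are made up.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    // A Set avoids duplicate entries when an element repeats within one array.
    for (const element of new Set(doc[field] || [])) {
      if (!index.has(element)) index.set(element, []);
      index.get(element).push(doc._id);
    }
  }
  return index;
}

const posts = [
  { _id: 1, tags: ["mongodb", "schema"] },
  { _id: 2, tags: ["schema", "index"] },
];
const tagIndex = buildMultikeyIndex(posts, "tags");
console.log(tagIndex.get("schema")); // [ 1, 2 ]
console.log(tagIndex.get("index"));  // [ 2 ]
```

One document with a long array produces many index entries, which is worth keeping in mind when indexing array fields.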
Again indexes...
Create options
sparse, unique, dropDups, background, v…
Geospatial Indexing
> db.places.ensureIndex( { loc : "2d" } )
> db.places.ensureIndex( { loc : "2d" } , { min : -500 , max : 500 } )
> db.places.ensureIndex( { loc : "2d" } , { bits : 26 } )
Analysis and Optimization: Profiler | Explain
Database Profiler
Profiling Level
• 0 - Off
• 1 - log slow operations (by default, >100ms is considered slow)
• 2 - log all operations
> db.setProfilingLevel(2);
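The three levels above amount to a simple decision rule. A minimal sketch, where `shouldProfile` is a hypothetical helper and `slowMs` defaults to the 100 ms threshold mentioned above:

```javascript
// shouldProfile is a hypothetical helper mirroring the three levels listed above.
function shouldProfile(level, millis, slowMs = 100) {
  if (level === 2) return true;            // log all operations
  if (level === 1) return millis > slowMs; // log slow operations only
  return false;                            // level 0: profiling is off
}

console.log(shouldProfile(0, 500)); // false
console.log(shouldProfile(1, 108)); // true
console.log(shouldProfile(1, 20));  // false
console.log(shouldProfile(2, 0));   // true
```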
Database Profiler
Viewing the Data – collection system.profile
> db.system.profile.find()
{ "ts" : "Thu Jan 29 2009 15:19:32 GMT-0500 (EST)" ,
  "info" : "query test.$cmd ntoreturn:1 reslen:66 nscanned:0 query: { profile: 2 } nreturned:1 bytes:50" ,
  "millis" : 0 }
Explain
> db.collection.find( … ).explain()
{ cursor : "BasicCursor",
indexBounds : [ ],
nscanned : 57594,
nscannedObjects : 57594,
nYields : 2 ,
n : 3 ,
millis : 108,
indexOnly : false,
isMultiKey : false,
nChunkSkips : 0
}
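One way to read the output above: `BasicCursor` means no index was used, and 57594 documents were scanned to return 3. The sketch below turns that reading into two numbers. `summarizePlan` is a hypothetical helper, and it assumes the legacy explain format shown here.

```javascript
// Selected fields from the explain() output above, as a plain object.
const plan = {
  cursor: "BasicCursor",
  nscanned: 57594,
  nscannedObjects: 57594,
  n: 3,
  millis: 108,
  indexOnly: false,
};

// summarizePlan is a hypothetical helper for the legacy explain format.
function summarizePlan(plan) {
  return {
    usesIndex: plan.cursor !== "BasicCursor", // BasicCursor: no index was used
    scannedPerResult: plan.n > 0 ? plan.nscanned / plan.n : Infinity,
  };
}

console.log(summarizePlan(plan));
// { usesIndex: false, scannedPerResult: 19198 }
```

A high scanned-per-result ratio on a frequent query is a hint that an index on the queried field may help.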
From theory to action..
Seating plan
{
  _id: ObjectId,
  event_id: ObjectId,
  seats: { A1: 1, A2: 1, A3: 0, … H30: 0 }
}
Seating plan
{
  _id: { event_id: ObjectId, seat: 'C9' },
  updated: new Date(),
  state: 'AVAILABLE'
}
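With one document per seat, reserving a seat becomes a check-and-set on a single document. The sketch below simulates that with an in-memory map. The `RESERVED` state, the string event id, and both helpers are assumptions made for this illustration; in MongoDB the same idea would presumably be a single update whose filter includes the current state.

```javascript
// In-memory stand-in for the seats collection, keyed by the compound _id.
const seats = new Map();
const seatKey = (eventId, seat) => `${eventId}:${seat}`;

// addSeat and reserveSeat are hypothetical helpers; "RESERVED" is a made-up state name.
function addSeat(eventId, seat) {
  seats.set(seatKey(eventId, seat), {
    _id: { event_id: eventId, seat },
    updated: new Date(),
    state: "AVAILABLE",
  });
}

// Reserve only if the seat is still available: a check-and-set on one document.
function reserveSeat(eventId, seat) {
  const doc = seats.get(seatKey(eventId, seat));
  if (!doc || doc.state !== "AVAILABLE") return false;
  doc.state = "RESERVED";
  doc.updated = new Date();
  return true;
}

addSeat("meetup2", "C9");
console.log(reserveSeat("meetup2", "C9")); // true
console.log(reserveSeat("meetup2", "C9")); // false, already taken
```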
Feed reader
• Users
• Feed
• Entries
Feed reader
Storage: users
{
  _id: ObjectId,
  name: 'username',
  feeds: [ ObjectId, ObjectId, … ]
}
Feed reader
Storage: feeds
{
  _id: ObjectId,
  url: 'http://bbc.com/news/feed',
  name: 'BBC News',
  latest: Date('2012-01-10T12:30:13Z'),
  entries: [ {
    latest: Date('2012-01-10T12:30:13Z'),
    title: 'Bomb kills Somali sport officials',
    description: '…',
    …
  } ]
}
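A possible way to keep the feed-level `latest` field in step with the embedded entries, sketched on a plain object. The entry titles and dates below are made up, and `addEntry` is a hypothetical helper.

```javascript
// Made-up feed document in the shape shown above.
const feed = {
  _id: "feed1",
  url: "http://bbc.com/news/feed",
  name: "BBC News",
  latest: new Date("2012-01-10T12:30:13Z"),
  entries: [],
};

// addEntry is a hypothetical helper: embed the entry and keep `latest` in step.
function addEntry(feed, entry) {
  feed.entries.push(entry);
  if (entry.latest > feed.latest) feed.latest = entry.latest;
}

addEntry(feed, { latest: new Date("2012-01-11T08:00:00Z"), title: "Example entry", description: "" });
addEntry(feed, { latest: new Date("2012-01-09T08:00:00Z"), title: "Older entry", description: "" });

console.log(feed.entries.length);       // 2
console.log(feed.latest.toISOString()); // 2012-01-11T08:00:00.000Z
```

Storing `latest` on the feed itself means a reader can check for fresh content without walking the whole entries array.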
Some tips
1. Duplicate data for speed, reference data for integrity
2. Try to fetch data in a single query
3. Design documents to be self-sufficient
4. Override _id when you have your own simple, unique id
5. Don’t always use an index
Conclusion
• Embedded docs are good for fast queries
• Embedded and nested docs are good for storing hierarchies
• Normalized docs are perfectly acceptable
Questions?