Post on 12-May-2015
MongoDB Basic Concepts
Norberto Leite
Senior Solutions Architect, EMEA
norberto@10gen.com
@nleite
Sunday, 21 October 12
Agenda
• Overview
• Replication
• Scalability
• Consistency & Durability
• Flexibility, Developer Experience
http://bit.ly/OT71M4
Your data needs started here...
http://bit.ly/Oxcsis
...but soon you had to be here
Basic Concepts
Horizontally Scalable
{ author : "steve", date : new Date(), text : "About MongoDB...", tags : ["tech", "database"] }
Document Oriented
Application
Fully Consistent
High Performance
Tradeoff: Scale vs Functionality
[Figure: scalability & performance plotted against depth of functionality, with memcached, key/value stores, and RDBMS marked]
Replication
Why do we need replication?
• Failover
• Backups
• Secondary batch jobs
• High availability
Replica Sets
Data Availability across nodes
• Data Protection
  • Multiple copies of the data
  • Spread across Data Centers, AZs
• High Availability
  • Automated Failover
  • Automated Recovery
Replica Sets
[Diagram sequence: the App sends writes to the Primary and reads to the Primary and both Secondaries]
• Asynchronous Replication from the Primary to the Secondaries
• Automatic Election of new Primary
• New primary serves data while the former primary is Recovering
• The recovered node rejoins as a Secondary
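The failover sequence in the diagrams can be sketched as a toy model. This is only an illustration: the class `ToyReplicaSet`, its method names, the node names, and the "promote the first healthy secondary" rule are assumptions made for this sketch, not MongoDB's actual election protocol.

```javascript
// Toy model of replica set failover. NOT MongoDB's election protocol;
// it only mirrors the sequence of states shown in the diagrams.
class ToyReplicaSet {
  constructor(names) {
    // The first member starts as primary; the rest are secondaries.
    this.members = names.map((name, i) => ({
      name,
      state: i === 0 ? "PRIMARY" : "SECONDARY",
    }));
  }

  member(name) {
    const found = this.members.find((m) => m.name === name);
    if (!found) throw new Error(`unknown member: ${name}`);
    return found;
  }

  primary() {
    return this.members.find((m) => m.state === "PRIMARY") || null;
  }

  // Mark a member as down; if it was the primary, hold an election.
  fail(name) {
    const m = this.member(name);
    const wasPrimary = m.state === "PRIMARY";
    m.state = "DOWN";
    if (wasPrimary) this.elect();
  }

  // Hypothetical rule: promote the first healthy secondary, but only
  // when the healthy members still form a majority of the set.
  elect() {
    const healthy = this.members.filter((m) => m.state === "SECONDARY");
    if (healthy.length * 2 > this.members.length) {
      healthy[0].state = "PRIMARY";
    }
  }

  // A failed member comes back in RECOVERING state...
  recover(name) {
    this.member(name).state = "RECOVERING";
  }

  // ...and rejoins as a secondary once it has caught up.
  rejoin(name) {
    this.member(name).state = "SECONDARY";
  }
}

const rs = new ToyReplicaSet(["node1", "node2", "node3"]);
rs.fail("node1");
console.log(rs.primary().name); // node2
rs.recover("node1");
rs.rejoin("node1");
console.log(rs.members.map((m) => m.state).join(", ")); // SECONDARY, PRIMARY, SECONDARY
```

The final state matches the last diagram: the former primary is back as a secondary and the elected node keeps serving writes.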
Scalability
Horizontal Scalability
Sharding
Data Distribution across nodes
• Data location transparent to your code
• Data distribution is automatic
• Data re-distribution is automatic
• Aggregate system resources horizontally
• No code changes
Sharding - Range distribution
shard01 shard02 shard03
sh.shardCollection("test.tweets", {_id: 1} , false)
[Diagram: chunk ranges by shard: shard01 a-i, shard02 j-r, shard03 s-z]
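Range distribution amounts to a lookup from shard key value to the shard that owns the enclosing range. A minimal sketch, assuming lowercase ASCII string keys and the a-i / j-r / s-z layout; the names `chunkRanges` and `shardFor` are made up for this illustration and are not part of any MongoDB API.

```javascript
// Chunk ranges: lower bound inclusive, upper bound exclusive.
// Assumes lowercase ASCII keys; "{" is the character right after "z".
const chunkRanges = [
  { min: "a", max: "j", shard: "shard01" }, // a-i
  { min: "j", max: "s", shard: "shard02" }, // j-r
  { min: "s", max: "{", shard: "shard03" }, // s-z
];

// Return the shard whose range contains the key, or null if none does.
function shardFor(key) {
  const chunk = chunkRanges.find((c) => key >= c.min && key < c.max);
  return chunk ? chunk.shard : null;
}

console.log(shardFor("norberto")); // shard02
console.log(shardFor("steve"));    // shard03
```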
Sharding - Splits
[Diagram, step 1: shard01 a-i; shard02 ja-jz, k-r; shard03 s-z]
[Diagram, step 2: shard01 a-i; shard02 ja-ji, ji-js, js-jw, jz-r; shard03 s-z]
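A chunk split can be pictured as cutting one key range in two at a chosen split point. The sketch below is a toy: it splits on the number of keys at the median key, whereas a real cluster decides on chunk size. The function name `splitChunk`, the `maxKeys` threshold, and the sample keys are assumptions for illustration.

```javascript
// Split a chunk at its median key once it holds more than maxKeys keys.
// Toy rule: a real deployment splits on chunk size, not key count.
function splitChunk(chunk, maxKeys) {
  if (chunk.keys.length <= maxKeys) return [chunk];
  const sorted = [...chunk.keys].sort();
  const mid = Math.floor(sorted.length / 2);
  const splitPoint = sorted[mid];
  return [
    { min: chunk.min, max: splitPoint, keys: sorted.slice(0, mid) },
    { min: splitPoint, max: chunk.max, keys: sorted.slice(mid) },
  ];
}

const jToR = {
  min: "j",
  max: "s",
  keys: ["jane", "jim", "joe", "kim", "lee", "norberto"],
};
const [lower, upper] = splitChunk(jToR, 4);
console.log(lower.min, lower.max, lower.keys); // j kim [ 'jane', 'jim', 'joe' ]
console.log(upper.min, upper.max, upper.keys); // kim s [ 'kim', 'lee', 'norberto' ]
```

Both halves stay on the same shard after a split; only the range boundaries change.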
Sharding - Auto Balancing
[Diagram, step 1: shard01 a-i; shard02 ja-ji, ji-js, js-jw, jz-r; shard03 s-z; chunks js-jw and jz-r are being migrated off shard02]
[Diagram, step 2: chunks a-i, ja-ji, ji-js, js-jw, jz-r, n-z spread across shard01, shard02, shard03]
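Balancing can be sketched as moving chunks from the fullest shard to the emptiest one until the chunk counts are roughly even. This is a simplified model under stated assumptions: the function `balance` and its "difference of at most one chunk" stopping rule are invented for this sketch and do not reproduce the real balancer's thresholds.

```javascript
// Move chunks from the most loaded shard to the least loaded shard
// until no two shards differ by more than one chunk. Toy model only.
function balance(chunksByShard) {
  const result = {};
  for (const [shard, chunks] of Object.entries(chunksByShard)) {
    result[shard] = [...chunks]; // copy so the input is left untouched
  }
  const names = Object.keys(result);
  if (names.length === 0) return result;
  while (true) {
    names.sort((a, b) => result[a].length - result[b].length);
    const least = names[0];
    const most = names[names.length - 1];
    if (result[most].length - result[least].length <= 1) break;
    result[least].push(result[most].pop()); // migrate one chunk
  }
  return result;
}

const before = {
  shard01: ["a-i"],
  shard02: ["ja-ji", "ji-js", "js-jw", "jz-r"],
  shard03: ["s-z"],
};
const after = balance(before);
// Each shard ends up with two chunks.
console.log(Object.values(after).map((chunks) => chunks.length));
```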
Sharding - Routed Query
find({_id: "norberto"})
[Diagram: chunks a-i, ja-ji, ji-js, js-jw, jz-r, n-z across shard01, shard02, shard03; the query on the shard key _id is routed to the single shard that owns the matching range]
Sharding - Scatter Gather
find({email: "norberto@10gen.com"})
[Diagram: same chunk layout; the query does not include the shard key, so it is sent to all shards and the results are merged]
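The difference between a routed query and a scatter-gather query can be shown with a small in-memory model. It assumes a collection sharded on `_id` with the simple a-i / j-r / s-z layout; `toyShards`, `toyRanges`, `toyFind`, and the sample documents other than the one from the slides are hypothetical.

```javascript
// Hypothetical cluster: each shard holds the documents in its _id range.
const toyShards = {
  shard01: [{ _id: "alice", email: "alice@example.com" }],
  shard02: [{ _id: "norberto", email: "norberto@10gen.com" }],
  shard03: [{ _id: "steve", email: "steve@example.com" }],
};
const toyRanges = [
  { min: "a", max: "j", shard: "shard01" },
  { min: "j", max: "s", shard: "shard02" },
  { min: "s", max: "{", shard: "shard03" },
];

// Returns the matching documents plus the shards that were contacted.
function toyFind(query) {
  let targets;
  if (typeof query._id === "string") {
    // Shard key present: route to the one shard that owns the range.
    const owner = toyRanges.find(
      (r) => query._id >= r.min && query._id < r.max
    );
    targets = owner ? [owner.shard] : [];
  } else {
    // No shard key: scatter to every shard, then gather the results.
    targets = Object.keys(toyShards);
  }
  const matches = (doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value);
  const docs = targets.flatMap((name) => toyShards[name].filter(matches));
  return { docs, contacted: targets };
}

console.log(toyFind({ _id: "norberto" }).contacted);
// [ 'shard02' ]
console.log(toyFind({ email: "norberto@10gen.com" }).contacted);
// [ 'shard01', 'shard02', 'shard03' ]
```

Both queries return the same document; the routed one touches one shard, the scatter-gather one touches all three.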
Sharding - Caching
[Diagram: a single node, shard01, holds all chunks (a-i, j-r, n-z): 300 GB data, 96 GB memory, 3:1 data/memory ratio]
Aggregate Horizontal Resources
[Diagram: 300 GB of data split across shard01 (a-i), shard02 (j-r), shard03 (n-z); each shard holds 100 GB of data with 96 GB memory, 1:1 data/memory ratio]
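The ratios on these two slides follow from simple arithmetic, assuming the data is spread evenly and every shard has the same amount of memory. The helper name `dataToMemoryRatio` is made up for this sketch.

```javascript
// Data per shard and data/memory ratio for an evenly sharded data set.
function dataToMemoryRatio(totalDataGB, memPerShardGB, shardCount) {
  const dataPerShardGB = totalDataGB / shardCount;
  return { dataPerShardGB, ratio: dataPerShardGB / memPerShardGB };
}

const single = dataToMemoryRatio(300, 96, 1);
const sharded = dataToMemoryRatio(300, 96, 3);
console.log(single.dataPerShardGB, single.ratio.toFixed(1));   // 300 3.1
console.log(sharded.dataPerShardGB, sharded.ratio.toFixed(1)); // 100 1.0
```

The slides round these to 3:1 and 1:1: with three shards nearly all of each shard's data can sit in memory.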
Consistency & Durability
Two choices for consistency
• Eventual consistency
  • Allow updates when a system has been partitioned
  • Resolve conflicts later
  • Example: CouchDB, Cassandra
• Immediate consistency
  • Limit the application of updates to a single master node for a given slice of data
  • Another node can take over after a failure is detected
  • Avoids the possibility of conflicts
  • Example: MongoDB
Durability
• For how long is my data available?
• When do I know that my data is safe?
• Where?
• MongoDB style
  • Fire and Forget
  • Get Last Error
  • Journal Sync
  • Replica Safe
Fire and Forget
[Diagram: the Driver sends a write to the Primary, which applies it in memory; no acknowledgement is returned to the Driver]
Get Last Error
[Diagram: the Driver sends a write followed by getLastError; the Primary applies the write in memory and then answers getLastError]
Journal Sync
[Diagram: the Driver sends a write followed by getLastError with j:true; the Primary applies the write in memory, writes it to the journal, and then answers]
Replicas Safe
[Diagram: the Driver sends a write followed by getLastError with w:2; the Primary applies the write in memory, replicates it to a Secondary, and then answers]
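The four durability levels differ only in what the driver waits for before it treats a write as done. One way to summarize them is as write concern documents. The keys `fireAndForget`, `getLastError`, `journalSync`, and `replicaSafe` and the helper `waitsFor` are labels invented for this sketch; the `w` and `j` fields are the ones shown on the slides.

```javascript
// Durability levels from the slides, expressed as write concern documents.
const WRITE_CONCERNS = {
  fireAndForget: { w: 0 },        // no acknowledgement requested
  getLastError: { w: 1 },         // primary acknowledges the write
  journalSync: { w: 1, j: true }, // primary has written to its journal
  replicaSafe: { w: 2 },          // primary plus one secondary have it
};

// List what the driver waits for at a given level.
function waitsFor(level) {
  const wc = WRITE_CONCERNS[level];
  if (!wc) throw new Error(`unknown durability level: ${level}`);
  const waits = [];
  if (wc.w >= 1) waits.push("primary applied the write in memory");
  if (wc.j) waits.push("primary wrote the write to its journal");
  if (wc.w >= 2) waits.push(`${wc.w} members hold the write`);
  return waits.length > 0 ? waits : ["nothing (unacknowledged)"];
}

for (const level of Object.keys(WRITE_CONCERNS)) {
  console.log(level, JSON.stringify(WRITE_CONCERNS[level]), waitsFor(level));
}
```

In current shells and drivers a document like these is typically passed as the `writeConcern` option of a write operation rather than through a separate getLastError call.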
Flexibility
What MongoDB solves
• Cost-effectively operationalize abundant data (clickstreams, logs, tweets, ...)
• Relaxed transactional semantics enable easy scale out
• Auto Sharding for scale down and scale up
• Applications store complex data that is easier to model as documents
• Schemaless DB enables faster development cycles
Agility
Flexibility
Cost
Challenges for Databases
✓ Build a database for scaleout
  • Run on clusters of 100s of commodity machines
• … that enables agile development
• … and is usable for a broad variety of applications
Data Model
• Why JSON?
  • Provides a simple, well understood encapsulation of data
  • Maps simply to the object in your OO language
  • Linking & Embedding to describe relationships
JSON
place1 = {
  name : "10gen HQ",
  address : "578 Broadway 7th Floor",
  city : "New York",
  zip : "10011",
  tags : [ "business", "tech" ]
}
Schema Design: Relational Database
Schema Design: MongoDB
[Diagram: embedding and linking]
Schemas in MongoDB
Design documents that simply map to your application
post = {author: "Hergé", date: new Date(), text: "Destination Moon", tags: ["comic", "adventure"]}
> db.posts.save(post)
> db.posts.find( { author: "Hergé" } )
{
  _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Hergé",
  date : ISODate("2011-09-18T09:56:06.298Z"),
  text : "Destination Moon",
  tags : [ "comic", "adventure" ],
  comments : [
    {
      author : "Kyle",
      date : ISODate("2011-09-19T09:56:06.298Z"),
      text : "great book"
    }
  ]
}
Embedding
JSON & Scaleout
• Embedding removes need for
  • Distributed Joins
  • Two Phase commit
• Enables data to be distributed across many nodes without penalty
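Why embedding removes the need for joins can be seen by counting lookups in a small in-memory model. Plain arrays stand in for collections here; `countingStore`, `loadLinked`, and `loadEmbedded` are hypothetical helpers, and the sample data reuses the post from the slides.

```javascript
// Wraps an array "collection" and counts how many lookups hit it.
function countingStore() {
  const store = { lookups: 0 };
  store.findOne = (collection, predicate) => {
    store.lookups += 1;
    return collection.find(predicate);
  };
  store.findAll = (collection, predicate) => {
    store.lookups += 1;
    return collection.filter(predicate);
  };
  return store;
}

// Linked design: comments live in their own collection and point at the post.
const linkedPosts = [{ _id: 1, author: "Hergé", text: "Destination Moon" }];
const linkedComments = [{ postId: 1, author: "Kyle", text: "great book" }];

// Embedded design: comments are stored inside the post document.
const embeddedPosts = [
  {
    _id: 1,
    author: "Hergé",
    text: "Destination Moon",
    comments: [{ author: "Kyle", text: "great book" }],
  },
];

// Linked: one lookup for the post, a second one for its comments.
function loadLinked(store, postId) {
  const post = store.findOne(linkedPosts, (p) => p._id === postId);
  const comments = store.findAll(linkedComments, (c) => c.postId === postId);
  return { ...post, comments };
}

// Embedded: a single lookup returns the post with its comments.
function loadEmbedded(store, postId) {
  return store.findOne(embeddedPosts, (p) => p._id === postId);
}

const linkedStore = countingStore();
loadLinked(linkedStore, 1);
const embeddedStore = countingStore();
loadEmbedded(embeddedStore, 1);
console.log(linkedStore.lookups, embeddedStore.lookups); // 2 1
```

In a sharded cluster the two lookups of the linked design may land on different nodes, which is the distributed join the embedded design avoids.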
http://bit.ly/UmUnsU
http://bit.ly/cnP77L
http://bit.ly/ODoMhh
http://bit.ly/uW2nk