Mongo Scaling

Scaling with MongoDB Alvin Richards [email protected] Sunday, May 1, 2011

description

How to build scalable data persistence tier with MongoDB.

Transcript of Mongo Scaling

Page 1: Mongo Scaling

Scaling with MongoDB

Alvin Richards

[email protected]

Sunday, May 1, 2011

Page 2: Mongo Scaling

Stuff we are going to cover super fast!

• Vertical Scaling
• Horizontal Scaling with MongoDB
  • Schema & Index design
  • Auto Sharding
  • Replication

Page 3: Mongo Scaling

Scaling

• Operations/sec go up
• Storage needs go up
  • Capacity
  • IOPs
• Complexity goes up
  • Caching

Page 4: Mongo Scaling

How do you scale now?

• Optimization & Tuning
  • Schema & Index Design
  • O/S tuning
  • Hardware configuration
• Vertical scaling
  • Hardware is expensive
  • Hard to scale in cloud

[Chart: cost ($$$) against throughput]

Page 5: Mongo Scaling

MongoDB Scaling - Single Node

[Diagram: a single node, node_a1; labels: write, read]

Page 6: Mongo Scaling

Read scaling - add Replicas

[Diagram: nodes node_a1 and node_b1; labels: write, read]

Page 7: Mongo Scaling

Read scaling - add Replicas

[Diagram: nodes node_a1, node_b1, and node_c1; labels: write, read]

Page 8: Mongo Scaling

Write scaling - Sharding

[Diagram: shard1 made up of node_a1, node_b1, and node_c1; labels: write, read]

Page 9: Mongo Scaling

Write scaling - add Shards

[Diagram: shard1 (node_a1, node_b1, node_c1) and shard2 (node_a2, node_b2, node_c2); labels: write, read]

Page 10: Mongo Scaling

Write scaling - add Shards

[Diagram: shard1 (node_a1, node_b1, node_c1), shard2 (node_a2, node_b2, node_c2), and shard3 (node_a3, node_b3, node_c3); labels: write, read]

Page 11: Mongo Scaling

Scaling with MongoDB

• Schema & Index Design
• Sharding
• Replication

Page 12: Mongo Scaling

Schema

• Data model affects performance
  • Embedding versus linking
  • Round trips to the database
  • Disk seek time
  • Size of data to read & write
  • Partial versus full document writes
• Performance problems can be solved by changing the schema

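The embedding versus linking trade-off can be made concrete by counting round trips. The sketch below is plain JavaScript with a hypothetical in-memory store (makeStore and fetch are invented for illustration, not MongoDB calls), and the second comment author is made-up data.

```javascript
// Hypothetical in-memory stand-in for a database: every call to fetch()
// counts as one round trip. None of these names come from MongoDB itself.
function makeStore(collections) {
  let roundTrips = 0;
  return {
    fetch(collection, id) {
      roundTrips += 1;
      return collections[collection][id];
    },
    get roundTrips() { return roundTrips; },
  };
}

// Linked model: the post only stores comment ids.
const linked = makeStore({
  posts: { p1: { title: "My First blog", commentIds: ["c1", "c2"] } },
  comments: { c1: { author: "James" }, c2: { author: "Anna" } },
});
const linkedPost = linked.fetch("posts", "p1");
const linkedAuthors = linkedPost.commentIds.map(
  (id) => linked.fetch("comments", id).author
);

// Embedded model: comments live inside the post document.
const embedded = makeStore({
  posts: {
    p1: {
      title: "My First blog",
      comments: [{ author: "James" }, { author: "Anna" }],
    },
  },
});
const embeddedAuthors = embedded
  .fetch("posts", "p1")
  .comments.map((c) => c.author);

console.log(linked.roundTrips, embedded.roundTrips); // 3 1
```

The linked model needs one fetch for the post plus one per comment; the embedded model reads everything in a single fetch, at the cost of a larger document.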

Page 13: Mongo Scaling

Indexes

• Index common queries
• Do not over index
  • Indexes on (A) and (A,B) overlap: a compound index on (A,B) also serves queries on A alone, so choose one
  • Right-balanced indexes keep working set small

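Why an index on (A) adds little once (A,B) exists can be sketched with a sorted array standing in for the compound index. This is an illustration in plain JavaScript, not MongoDB's index implementation; the documents and helper names are assumptions.

```javascript
// A compound index on (a, b) modeled as entries sorted by a, then b.
// Assumption: docs and field names are made up for illustration.
const docs = [
  { _id: 1, a: 5, b: "x" },
  { _id: 2, a: 7, b: "y" },
  { _id: 3, a: 7, b: "x" },
  { _id: 4, a: 9, b: "z" },
];

const indexAB = docs
  .map((d) => ({ a: d.a, b: d.b, _id: d._id }))
  .sort((p, q) => p.a - q.a || p.b.localeCompare(q.b));

// Binary search for the first entry with entry.a >= a.
function lowerBoundA(index, a) {
  let lo = 0, hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].a < a) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// A query on a alone can walk the (a, b) index from the lower bound.
function findByA(index, a) {
  const ids = [];
  for (let i = lowerBoundA(index, a); i < index.length && index[i].a === a; i++) {
    ids.push(index[i]._id);
  }
  return ids;
}

console.log(findByA(indexAB, 7)); // [ 3, 2 ]
```

Because the entries are ordered by a first, all entries for one value of a are contiguous, so the leading field can be searched without a separate index.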

Page 14: Mongo Scaling

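Query for {a: 7}

[Diagram: an index tree over a row of documents {...}. The top level holds the ranges [-∞, 5), [5, 10), and [10, ∞); the [5, 10) range divides into [5, 7), [7, 9), and [9, 10), next to the [-∞, 5) buckets and [10, ∞) buckets. Two paths are labeled: "With Index" and "Without index - Scan".]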

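The difference between the two paths can be sketched by counting how much work each one does. The code below is plain JavaScript over synthetic data; a sorted array searched by bisection stands in for the tree of ranges and buckets, which is a simplification.

```javascript
// Assumption: 1000 synthetic documents with a = 0..999; not data from the talk.
const docs = Array.from({ length: 1000 }, (_, i) => ({ a: i }));

// Without an index: examine every document.
function scan(collection, a) {
  let examined = 0;
  const hits = [];
  for (const doc of collection) {
    examined += 1;
    if (doc.a === a) hits.push(doc);
  }
  return { hits, examined };
}

// With an index: a sorted array of [key, position] pairs, searched by
// bisection. "examined" counts index entries compared, not documents read.
const index = docs.map((d, pos) => [d.a, pos]).sort((p, q) => p[0] - q[0]);

function indexed(collection, idx, a) {
  let lo = 0, hi = idx.length, examined = 0;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    examined += 1;
    if (idx[mid][0] < a) lo = mid + 1; else hi = mid;
  }
  const hits = [];
  for (; lo < idx.length && idx[lo][0] === a; lo++) {
    hits.push(collection[idx[lo][1]]);
  }
  return { hits, examined };
}

const s = scan(docs, 7);
const x = indexed(docs, index, 7);
console.log(s.examined, x.examined);
```

The scan touches all 1000 documents, while the bisection needs at most about ten comparisons for the same result.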

Page 15: Mongo Scaling

Indexing Embedded Documents & Multikeys

db.posts.save({
    title: "My First blog",
    tags: ["mongodb", "cool"],
    comments: [
        {author: "James", ts: new Date()}
    ]
});

db.posts.ensureIndex({"comments.author": 1})
db.posts.ensureIndex({"tags": 1})

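What a multikey index on tags amounts to can be sketched as one index entry per array element. This is a plain JavaScript illustration with a hypothetical helper (buildMultikeyIndex) and one invented extra post; it is not how the server stores its indexes.

```javascript
// Build a multikey-style index: one entry per array element.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

const posts = [
  { _id: 1, title: "My First blog", tags: ["mongodb", "cool"] },
  { _id: 2, title: "Second post", tags: ["mongodb"] }, // hypothetical extra post
];
const tagIndex = buildMultikeyIndex(posts, "tags");
console.log(tagIndex.get("mongodb")); // [ 1, 2 ]
console.log(tagIndex.get("cool"));    // [ 1 ]
```

A document with two tags appears under both keys, so a query on either tag finds it through the index.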

Page 16: Mongo Scaling

Picking an Index

find({x: 10, y: "foo"})

[Diagram: candidate plans for the query: scan, index on x, index on y; labels: remember (next to index on y), terminate]

Page 17: Mongo Scaling

What is Sharding

• Ad-hoc partitioning
• Consistent hashing
  • Amazon Dynamo
• Range based partitioning
  • Google BigTable
  • Yahoo! PNUTS
  • MongoDB

Page 18: Mongo Scaling

MongoDB Sharding

• Automatic partitioning and management

• Range based

• Convert to sharded system with no downtime

• Almost no functionality lost over single master

• Fully consistent


Page 19: Mongo Scaling

How MongoDB Sharding works

> db.runCommand( { addshard : "shard1" } );
> db.runCommand( { shardCollection : "mydb.blogs", key : { age : 1 } } )

[Diagram: one range from -∞ to +∞]

• Range keys from -∞ to +∞
• Ranges are stored as "chunks"

Page 20: Mongo Scaling

How MongoDB Sharding works

> db.posts.save( {age:40} )

[Diagram: the range -∞ to +∞ split into chunks -∞ to 40 and 41 to +∞]

• Data is inserted
• Ranges are split into more "chunks"

Page 21: Mongo Scaling

How MongoDB Sharding works

> db.posts.save( {age:40} )
> db.posts.save( {age:50} )

[Diagram: the range -∞ to +∞ split into -∞ to 40 and 41 to +∞; the chunk 41 to +∞ split again into 41 to 50 and 51 to +∞]

• More data is inserted
• Ranges are split into more "chunks"

Page 22: Mongo Scaling

How MongoDB Sharding works

> db.posts.save( {age:40} )
> db.posts.save( {age:50} )
> db.posts.save( {age:60} )

[Diagram: successive splits: -∞ to +∞ into -∞ to 40 and 41 to +∞; 41 to +∞ into 41 to 50 and 51 to +∞; 51 to +∞ into 51 to 60 and 61 to +∞]

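Page 23: Mongo Scaling

How MongoDB Sharding works

> db.posts.save( {age:40} )
> db.posts.save( {age:50} )
> db.posts.save( {age:60} )

[Diagram: the intermediate ranges -∞ to +∞, 41 to +∞, and 51 to +∞ fade out, leaving four chunks: -∞ to 40, 41 to 50, 51 to 60, and 61 to +∞]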

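The splitting sequence on these slides can be sketched as a small simulation. It is plain JavaScript, not server code: the split threshold (MAX_KEYS) is an assumption chosen so that splits happen after a few inserts, and the ranges are half-open, so the boundaries differ slightly from the figure.

```javascript
// Assumption: a chunk splits once it holds more than MAX_KEYS keys. Real
// splitting is driven by chunk size; this count is only for illustration.
const MAX_KEYS = 1;

// One chunk covering the whole key space, as a half-open range [min, max).
let chunks = [{ min: -Infinity, max: Infinity, keys: [] }];

function insert(key) {
  const i = chunks.findIndex((c) => c.min <= key && key < c.max);
  const chunk = chunks[i];
  chunk.keys.push(key);
  chunk.keys.sort((a, b) => a - b);
  if (chunk.keys.length > MAX_KEYS) {
    // Split at the middle key: left keeps keys below it, right the rest.
    const splitAt = chunk.keys[Math.floor(chunk.keys.length / 2)];
    const left = {
      min: chunk.min, max: splitAt,
      keys: chunk.keys.filter((k) => k < splitAt),
    };
    const right = {
      min: splitAt, max: chunk.max,
      keys: chunk.keys.filter((k) => k >= splitAt),
    };
    chunks.splice(i, 1, left, right);
  }
}

[40, 50, 60].forEach(insert);
console.log(chunks.map((c) => `[${c.min}, ${c.max})`).join(" "));
// [-Infinity, 50) [50, 60) [60, Infinity)
```

Each split replaces one range with two adjacent ranges, so the chunks always cover the key space from -∞ to +∞ without gaps.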

Page 24: Mongo Scaling

How MongoDB Sharding works

[Diagram: the four chunks (-∞ to 40, 41 to 50, 51 to 60, 61 to +∞) shown on shard1]

Page 25: Mongo Scaling

How MongoDB Sharding works

> db.runCommand( { addshard : "shard2" } );

[Diagram: the four chunks: -∞ to 40, 41 to 50, 51 to 60, 61 to +∞]

Page 26: Mongo Scaling

How MongoDB Sharding works

> db.runCommand( { addshard : "shard2" } );

[Diagram: the four chunks (-∞ to 40, 41 to 50, 51 to 60, 61 to +∞) shown on shard1]

Page 27: Mongo Scaling

How MongoDB Sharding works

> db.runCommand( { addshard : "shard2" } );

[Diagram: the four chunks (-∞ to 40, 41 to 50, 51 to 60, 61 to +∞) shown across shard1 and shard2]

Page 28: Mongo Scaling

How MongoDB Sharding works

> db.runCommand( { addshard : "shard2" } );
> db.runCommand( { addshard : "shard3" } );

[Diagram: the four chunks (-∞ to 40, 41 to 50, 51 to 60, 61 to +∞) shown across shard1, shard2, and shard3]

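How chunks might spread out as shards are added can be sketched with a toy balancer. This is plain JavaScript under stated assumptions: the balancer only evens out chunk counts, and which chunk moves where is arbitrary here, so the placement need not match the figure.

```javascript
// Assumption: the balancer evens out chunk counts; the threshold and the
// choice of which chunk to move are simplified for illustration.
function balance(shards) {
  const moves = [];
  for (;;) {
    const names = Object.keys(shards);
    const most = names.reduce((a, b) =>
      shards[b].length > shards[a].length ? b : a);
    const least = names.reduce((a, b) =>
      shards[b].length < shards[a].length ? b : a);
    // Stop once no shard holds two or more chunks above another.
    if (shards[most].length - shards[least].length <= 1) return moves;
    const chunk = shards[most].pop();
    shards[least].push(chunk);
    moves.push({ chunk, from: most, to: least });
  }
}

const shards = {
  shard1: ["-inf..40", "41..50", "51..60", "61..+inf"],
  shard2: [],
};
balance(shards);
shards.shard3 = []; // adding a third shard triggers another round
balance(shards);

const counts = Object.fromEntries(
  Object.entries(shards).map(([name, list]) => [name, list.length])
);
console.log(counts); // { shard1: 1, shard2: 2, shard3: 1 }
```

Only whole chunks move between shards; the ranges themselves are unchanged by balancing.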

Page 29: Mongo Scaling

Sharding Key Examples

• Good : {server:1}
  • All data for one server is in a single chunk
  • Chunk cannot be split any smaller
• Better : {server:1,time:1}
  ‣ Chunk can be split by millisecond

{
    server : "ny153.example.com" ,
    application : "apache" ,
    time : "2011-01-02T21:21:56.249Z" ,
    level : "ERROR" ,
    msg : "something is broken"
}

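The point about chunk granularity can be checked by counting distinct shard key values. The sketch below is plain JavaScript; the log entries are hypothetical, shaped like the document on the slide.

```javascript
// Hypothetical log entries in the shape shown on the slide.
const logs = [
  { server: "ny153.example.com", time: "2011-01-02T21:21:56.249Z" },
  { server: "ny153.example.com", time: "2011-01-02T21:21:57.101Z" },
  { server: "ny153.example.com", time: "2011-01-02T21:21:58.870Z" },
  { server: "ny154.example.com", time: "2011-01-02T21:21:56.300Z" },
];

// Documents with the same shard key value cannot be separated, so the number
// of distinct key values bounds how many chunks the data can be split into.
function distinctKeyValues(docs, fields) {
  return new Set(docs.map((d) => JSON.stringify(fields.map((f) => d[f])))).size;
}

console.log(distinctKeyValues(logs, ["server"]));         // 2
console.log(distinctKeyValues(logs, ["server", "time"])); // 4
```

With {server:1} the three entries from one server form an indivisible block; adding time to the key gives every entry its own split point.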

Page 30: Mongo Scaling

Sharding Key Examples

• Good : {time : 1}
  ‣ Time is an increasing number
  ‣ All data will be first written to a single shard
  ‣ Data balanced to other shards later
• Better : {server:1,application:1,time:1}
  ‣ More key values to enable writes to all shards

{
    server : "ny153.example.com" ,
    application : "apache" ,
    time : "2011-01-02T21:21:56.249Z" ,
    level : "ERROR" ,
    msg : "something is broken"
}

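The effect of an ever-increasing key on write distribution can be sketched by routing a few writes through fixed chunk ranges. This is plain JavaScript; the chunk boundaries, shard assignments, and server names are invented for illustration, and the compound key is flattened to a string for simplicity.

```javascript
// Route a key to the shard owning the first range whose upper bound exceeds it.
// Assumption: boundaries and shard names below are invented for illustration.
function route(ranges, key) {
  return ranges.find((r) => key < r.max).shard;
}

function countWrites(ranges, keys) {
  const counts = {};
  for (const k of keys) {
    const shard = route(ranges, k);
    counts[shard] = (counts[shard] || 0) + 1;
  }
  return counts;
}

// Six new log entries from three servers, all with current timestamps.
const entries = ["ny151", "ny152", "ny153", "ny151", "ny152", "ny153"].map(
  (server, i) => ({ server, time: `2011-01-02T21:21:5${i}Z` })
);

// Shard key {time: 1}: existing chunks cover older time ranges, so every new
// (larger) timestamp lands in the last chunk.
const byTime = [
  { max: "2011-01-01", shard: "shard1" },
  { max: "2011-01-02", shard: "shard2" },
  { max: "\uffff", shard: "shard3" }, // stands in for +infinity
];
const timeCounts = countWrites(byTime, entries.map((e) => e.time));
console.log(timeCounts); // { shard3: 6 }

// Shard key starting with server: chunk boundaries fall between servers.
const byServer = [
  { max: "ny152", shard: "shard1" },
  { max: "ny153", shard: "shard2" },
  { max: "\uffff", shard: "shard3" },
];
const serverCounts = countWrites(
  byServer,
  entries.map((e) => e.server + "|" + e.time)
);
console.log(serverCounts); // { shard1: 2, shard2: 2, shard3: 2 }
```

With time alone as the key, all six writes hit one shard; leading with server spreads the same writes over all three.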

Page 31: Mongo Scaling

Sharding Features

• Shard data with no downtime
• Automatic balancing as data is written
• Commands routed (switched) to correct node
  • Inserts - must have the Shard Key
  • Updates - must have the Shard Key
  • Queries
    • With Shard Key - routed to nodes
    • Without Shard Key - scatter gather
  • Indexed Queries
    • With Shard Key - routed in order
    • Without Shard Key - distributed sort merge

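The routing rules for queries can be sketched with a toy router. This is plain JavaScript, not the actual routing process: the three shards, their ranges on the shard key age, and the documents are all assumptions for illustration.

```javascript
// Assumption: a toy router over three shards, each owning one range of the
// shard key "age". Names and data are illustrative only.
const shards = [
  { name: "shard1", min: -Infinity, max: 41, docs: [{ age: 40, name: "a" }] },
  { name: "shard2", min: 41, max: 51, docs: [{ age: 50, name: "b" }] },
  { name: "shard3", min: 51, max: Infinity, docs: [{ age: 60, name: "c" }] },
];

function find(query) {
  // With the shard key in the query, only the owning shard is contacted;
  // without it, every shard is asked and the results are merged.
  const targets = "age" in query
    ? shards.filter((s) => s.min <= query.age && query.age < s.max)
    : shards;
  const matches = (doc) =>
    Object.keys(query).every((k) => doc[k] === query[k]);
  return {
    contacted: targets.map((s) => s.name),
    results: targets.flatMap((s) => s.docs.filter(matches)),
  };
}

console.log(find({ age: 50 }).contacted);   // [ 'shard2' ]
console.log(find({ name: "c" }).contacted); // [ 'shard1', 'shard2', 'shard3' ]
```

Both queries return one document, but the query without the shard key has to contact every shard to find it.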

Page 32: Mongo Scaling

MongoDB Replication

• MongoDB replication is like MySQL replication
  • Asynchronous master/slave
• Variations:
  • Master / slave
  • Replica Sets

Page 33: Mongo Scaling

Replica Set features

• A cluster of N servers
• Any (one) node can be primary
• Consensus election of primary
• Automatic failover
• Automatic recovery
• All writes to primary
• Reads can be to primary (default) or a secondary

Page 34: Mongo Scaling

How MongoDB Replication works

[Diagram: Member 1, Member 2, Member 3]

• Set is made up of 2 or more nodes

Page 35: Mongo Scaling

How MongoDB Replication works

[Diagram: Member 1, Member 2 (PRIMARY), Member 3]

• Election establishes the PRIMARY
• Data replication from PRIMARY to SECONDARY

Page 36: Mongo Scaling

How MongoDB Replication works

[Diagram: Member 1, Member 2 (DOWN), Member 3; label: negotiate new master]

• PRIMARY may fail
• Automatic election of new PRIMARY

Page 37: Mongo Scaling

How MongoDB Replication works

[Diagram: Member 1, Member 2 (DOWN), Member 3 (PRIMARY)]

• New PRIMARY elected
• Replica set re-established

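The failover sequence can be sketched as a majority check. The code below is plain JavaScript and deliberately simplified: it is not MongoDB's election protocol, and picking the first available member is an assumption, so the winner differs from the member shown in the figure.

```javascript
// Simplified sketch, not MongoDB's election protocol: a primary is chosen
// only when a strict majority of all configured members is up.
function electPrimary(members) {
  const up = members.filter((m) => m.up);
  if (up.length * 2 <= members.length) return null; // no majority, no primary
  // Assumption: any data-bearing member that is up may win; take the first.
  const candidate = up.find((m) => !m.arbiter);
  return candidate ? candidate.name : null;
}

const members = [
  { name: "Member 1", up: true },
  { name: "Member 2", up: true },
  { name: "Member 3", up: true },
];
console.log(electPrimary(members)); // Member 1
members[0].up = false;              // the primary fails
console.log(electPrimary(members)); // Member 2
members[1].up = false;              // only one of three members is left
console.log(electPrimary(members)); // null
```

Requiring a majority is what keeps two halves of a partitioned set from each electing their own primary.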

Page 38: Mongo Scaling

How MongoDB Replication works

[Diagram: Member 1, Member 2 (RECOVERING), Member 3 (PRIMARY)]

• Automatic recovery

Page 39: Mongo Scaling

How MongoDB Replication works

[Diagram: Member 1, Member 2, Member 3 (PRIMARY)]

• Replica set re-established

Page 40: Mongo Scaling

Using Replicas

slaveOk()
- driver will send read requests to Secondaries
- driver will always send writes to Primary

Java examples
- DB.slaveOk()
- Collection.slaveOk()
- find(q).addOption(Bytes.QUERYOPTION_SLAVEOK);

Page 41: Mongo Scaling

Creating a Replica Set

> cfg = {
    _id : "acme_a",
    members : [
        { _id : 0, host : "sf1.acme.com" },
        { _id : 1, host : "sf2.acme.com" },
        { _id : 2, host : "sf3.acme.com" } ] }
> use admin
> db.runCommand( { replSetInitiate : cfg } )

Page 42: Mongo Scaling

Replica Set Member Types

• Normal {priority:1}
• Passive {priority:0}
  • Cannot be elected as PRIMARY
• Arbiters
  • Can vote in an election
  • Do not hold any data
• Hidden {hidden:true}

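The member types can be combined in one configuration object. The sketch below is plain JavaScript in the shape passed to replSetInitiate; the fourth host name is hypothetical, the hidden member is also given priority 0 because a hidden member is expected to be non-electable, and the electable helper is invented for illustration.

```javascript
// Replica set configuration in the shape passed to replSetInitiate.
// Assumption: the host names extend the acme example and are hypothetical.
const cfg = {
  _id: "acme_a",
  members: [
    { _id: 0, host: "sf1.acme.com", priority: 1 },               // normal
    { _id: 1, host: "sf2.acme.com", priority: 0 },               // passive
    { _id: 2, host: "sf3.acme.com", arbiterOnly: true },         // arbiter
    { _id: 3, host: "sf4.acme.com", priority: 0, hidden: true }, // hidden
  ],
};

// Members that could become PRIMARY: data-bearing with non-zero priority.
// Priority is treated as 1 when omitted.
function electable(config) {
  return config.members
    .filter((m) =>
      !m.arbiterOnly && (m.priority === undefined ? 1 : m.priority) > 0)
    .map((m) => m.host);
}

console.log(electable(cfg)); // [ 'sf1.acme.com' ]
```

In this layout only sf1.acme.com can be elected, so a production set would normally include more than one normal member.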

Page 43: Mongo Scaling

Replication features

• Reads from Primary are always consistent

• Reads from Secondaries are eventually consistent

• Automatic failover if a Primary fails

• Automatic recovery when a node joins the set


Page 44: Mongo Scaling

Summary

• Schema & Index design
  • Simplest way to scale
• Sharding
  • Automatically scale writes
• Replication
  • Automatically scale reads

Page 45: Mongo Scaling

@mongodb

conferences, appearances, and meetups
http://www.10gen.com/events

http://bit.ly/mongoU
Facebook | Twitter | LinkedIn
http://linkd.in/joinmongo

download at mongodb.org

We're Hiring [email protected]