2012 mongo db_bangalore_replication

56
1 Sridhar Nanjundeswaran, 10gen [email protected] @snanjund October 26, 2012

description

 

Transcript of 2012 mongo db_bangalore_replication

Page 1: 2012 mongo db_bangalore_replication

1

Sridhar Nanjundeswaran, 10gen

[email protected]

@snanjund

October 26, 2012

Page 2: 2012 mongo db_bangalore_replication

2

• Introduction to Replica Sets

• Developing with Replica Sets

• Operational Considerations

• Behind the Curtain

Page 3: 2012 mongo db_bangalore_replication

Introduction to Replica Sets

Why?

What is it?

Configuration Options

Page 4: 2012 mongo db_bangalore_replication

4

• How many have faced node failures?

• How many have been woken to do fail overs?

• How many have experienced issues due to n/w latency?

• Different uses for data

– Normal processing

– Simple analytics

Page 5: 2012 mongo db_bangalore_replication

5

Node 1

Node 2

Node 3

Page 6: 2012 mongo db_bangalore_replication

6

Node 1

Node 2

Node 3

RS Init

Page 7: 2012 mongo db_bangalore_replication

7

Node 1 Node 2

Node 3

Heartbeat

Election

Page 8: 2012 mongo db_bangalore_replication

8

Node 1

Secondary

Node 2

Secondary

Node 3

Primary Replication Replication

Heartbeat

Page 9: 2012 mongo db_bangalore_replication

9

Node 1

Secondary

Node 2

Secondary

Node 3

Primary

Heartbeat

Primary Election

Page 10: 2012 mongo db_bangalore_replication

10

Node 1

Secondary

Node 2

Primary

Node 3

Primary

Heartbeat

Replication

Page 11: 2012 mongo db_bangalore_replication

11

Node 1

Secondary

Node 2

Primary

Node 3

Recovery

Heartbeat

Replication

Page 12: 2012 mongo db_bangalore_replication

12

Node 1

Secondary

Node 2

Primary

Node 3

Secondary Replication

Heartbeat

Replication

Page 13: 2012 mongo db_bangalore_replication

13

Node 1

Secondary

Node 2

Arbiterr

Node 3

Primary Replication

Heartbeat

Page 14: 2012 mongo db_bangalore_replication

14

> conf = {

_id : "mySet",

members : [

{_id : 0, host : "A”, priority : 3},

{_id : 1, host : "B", priority : 2},

{_id : 2, host : "C”},

{_id : 3, host : "D", hidden : true},

{_id : 4, host : "E", hidden : true, slaveDelay : 3600}

]

}

> rs.initiate(conf)

Page 15: 2012 mongo db_bangalore_replication

15

> conf = {

_id : "mySet”,

members : [

{_id : 0, host : "A”, priority : 3},

{_id : 1, host : "B", priority : 2},

{_id : 2, host : "C”},

{_id : 3, host : "D", hidden : true},

{_id : 4, host : "E", hidden : true, slaveDelay : 3600}

]

}

> rs.initiate(conf)

Primary DC

Page 16: 2012 mongo db_bangalore_replication

16

> conf = {

_id : "mySet”,

members : [

{_id : 0, host : "A”, priority : 3},

{_id : 1, host : "B", priority : 2},

{_id : 2, host : "C”},

{_id : 3, host : "D", hidden : true},

{_id : 4, host : "E", hidden : true, slaveDelay : 3600}

]

}

> rs.initiate(conf)

Secondary DC

Default priority = 1

Page 17: 2012 mongo db_bangalore_replication

17

> conf = {

_id : "mySet”,

members : [

{_id : 0, host : "A”, priority : 3},

{_id : 1, host : "B", priority : 2},

{_id : 2, host : "C”},

{_id : 3, host : "D", hidden : true},

{_id : 4, host : "E", hidden : true, slaveDelay : 3600}

]

}

> rs.initiate(conf)

Analytics node

Page 18: 2012 mongo db_bangalore_replication

18

> conf = {

_id : "mySet”,

members : [

{_id : 0, host : "A”, priority : 3},

{_id : 1, host : "B", priority : 2},

{_id : 2, host : "C”},

{_id : 3, host : "D", hidden : true},

{_id : 4, host : "E", hidden : true, slaveDelay : 3600}

]

}

> rs.initiate(conf) Backup node

Page 19: 2012 mongo db_bangalore_replication

Developing with Replica Sets

Consistency

Write Preference

Read Preference

Page 20: 2012 mongo db_bangalore_replication

20

Application

Primary

Secondary

v1

Insert

Read

Page 21: 2012 mongo db_bangalore_replication

21

Application 1

Primary

Secondary

v1

Insert

Read

Application 2

Read

Page 22: 2012 mongo db_bangalore_replication

22

Application 1

Primary

Secondary

v1

Insert

Read

Application 2

Does not

exist Read

Page 23: 2012 mongo db_bangalore_replication

23

Application 1

Primary

Secondary

v1

Insert

Read

Application 2

Read from primary is

always consistent

Read

Page 24: 2012 mongo db_bangalore_replication

24

Application 1

Primary

Secondary

v1

v1

Insert

Read

Replicate

Application 2

Read

Page 25: 2012 mongo db_bangalore_replication

25

Application 1

Primary

Secondary

v1

v1

Insert

Read

Replicate

Application 2

Read Read

Page 26: 2012 mongo db_bangalore_replication

26

Application 1

Primary

Secondary

v1

v1

Insert

Read

Replicate

Application 2

Reads V1 Read Read

Page 27: 2012 mongo db_bangalore_replication

27

Application 1

Primary

Secondary

v1

v1

Insert

Read

Replicate

Application 2

Read Read

v2

Update

Page 28: 2012 mongo db_bangalore_replication

28

Application 1

Primary

Secondary

v1

v1

Insert

Read

Replicate

Application 2

Read Read

v2

Update

Read

Page 29: 2012 mongo db_bangalore_replication

29

Application 1

Primary

Secondary

v1

v1

Insert

Read

Replicate

Application 2

Read Read

v2

Update

Read

Read

Page 30: 2012 mongo db_bangalore_replication

30

Application 1

Primary

Secondary

v1

v1

Insert

Read

Replicate

Application 2

Read Read

v2

Update

Read

Read

Page 31: 2012 mongo db_bangalore_replication

31

Application 1

Primary

Secondary

v1

v1

Insert

Read

Replicate

Application 2

Read Read

v2

Update

v2

Replicate

Read

Read

Page 32: 2012 mongo db_bangalore_replication

32

Application 1

Primary

Secondary

v1

v1

Insert

Read

Replicate

Application 2

Read Read

v2

Update

v2

Replicate

Read

Read Read

Page 33: 2012 mongo db_bangalore_replication

33

Application 1

Primary

Secondary

v1

v1

Insert

Read

Replicate

Application 2

Read Read

v2

Update

v2

Replicate

Read

Read Read

Page 34: 2012 mongo db_bangalore_replication

34

• Fire and forget

• Wait for error

• Wait for journal sync

• Wait for replication

Page 35: 2012 mongo db_bangalore_replication

35

Driver Primary

write

apply in memory

Page 36: 2012 mongo db_bangalore_replication

36

Driver Primary

getLastError apply in memory

write

Page 37: 2012 mongo db_bangalore_replication

37

Driver Primary

getLastError apply in memory

write

j:true

Write to journal

Page 38: 2012 mongo db_bangalore_replication

38

Secondary

replicate

Driver Primary

getLastError apply in memory

write

W:majority

Page 39: 2012 mongo db_bangalore_replication

39

• Since 2.0.0

• Control over where data is written to

• Each member can have one or more tags e.g.

– Dc : "ny”

– dc: "ny",ip: "192.168",rack: "row3rk7”

• Replica set defines rules for where data resides

• Rules can change without changing app code

Page 40: 2012 mongo db_bangalore_replication

40

{

_id : "mySet",

members : [

{_id : 0, host : "A", tags : {"dc": "ny"}},

{_id : 1, host : "B", tags : {"dc": "ny"}},

{_id : 2, host : "C", tags : {"dc": "sf"}},

{_id : 3, host : "D", tags : {"dc": "sf"}},

{_id : 4, host : "E", tags : {"dc": "cloud"}}]

settings : {

getLastErrorModes : {

allDCs : {"dc" : 3},

someDCs : {"dc" : 2}} }

}

> db.blogs.insert({...})

> db.runCommand({getLastError : 1, w : "allDCs"})

Page 41: 2012 mongo db_bangalore_replication

41

Driver Primary (sf)

getLastError apply in

memory

write

W:allDCs

Secondary (ny)

replicate

Secondary

(cloud)

replicate

Page 42: 2012 mongo db_bangalore_replication

42

• 5 modes (new in 2.2)

– PRIMARY(only) - Default

– PRIMARYPREFERRED

– SECONDARY (only)

– SECONDARYPREFERRED

– NEAREST

Page 43: 2012 mongo db_bangalore_replication

43

• Custom read preferences

• Control where you read from

– E.g. { "disk": "ssd", "use": "reporting" }

• Use in conjunction with standard read

preferences

– Except primary

Page 44: 2012 mongo db_bangalore_replication

Operational Considerations

Upgrade/Maintenance

Common Deployment Scenarios

Page 45: 2012 mongo db_bangalore_replication

45

• No downtime

• Rolling upgrade/maintenance

– Start with Secondary

– Primary last

Page 46: 2012 mongo db_bangalore_replication

46

• Single datacenter

• Single switch & power

• Points of failure:

– Power

– Network

– Datacenter

– Two node failure

• Automatic recovery of

single node crash

Member 1

Member 2

Member 3

Page 47: 2012 mongo db_bangalore_replication

47

• Multi datacenter

• DR node for safety

• Can’t do multi data

center durable write

safely since only 1

node in distant DC

Member 1

Member 2

Member 3

DC

1

DC

2

Page 48: 2012 mongo db_bangalore_replication

48

• Three data centers

• Can survive full data

center loss

• Can do w= { dc : 2 } to

guarantee write in 2

data centers (with

tags)

Member 1

Member 2

Member 3

Member 4

DC

2

Member 5 -

DR

DC

1

DC

3

Page 49: 2012 mongo db_bangalore_replication

Behind the Curtain

Schema

Oplog

Page 50: 2012 mongo db_bangalore_replication

50

• Local DB (not replicated)

– system.replset

– oplog.rs

• Capped collection

• Idempotent version of operation stored

Page 51: 2012 mongo db_bangalore_replication

51

• Heartbeat every 2 seconds

– Times out in 10 seconds

• Missed heartbeat considered node down

Page 52: 2012 mongo db_bangalore_replication

52

> db.replsettest.insert({_id:1,value:1})

{ "ts" : Timestamp(1350539727000, 1), "h" :

NumberLong("6375186941486301201"), "op" : "i", "ns" :

"test.replsettest", "o" : { "_id" : 1, "value" : 1 } }

> db.replsettest.update({_id:1},{$inc:{value:10}})

{ "ts" : Timestamp(1350539786000, 1), "h" :

NumberLong("5484673652472424968"), "op" : "u", "ns" :

"test.replsettest", "o2" : { "_id" : 1 }, "o" : { "$set" : { "value" :

11 } } }

Page 53: 2012 mongo db_bangalore_replication

53

> db.replsettest.update({},{$set:{name : ”foo”}, false, true})

{ "ts" : Timestamp(1350540395000, 1), "h" : NumberLong("-4727576249368135876"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 2 }, "o" : { "$set" : { "name" : "foo" } } }

{ "ts" : Timestamp(1350540395000, 2), "h" : NumberLong("-7292949613259260138"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 3 }, "o" : { "$set" : { "name" : "foo" } } }

{ "ts" : Timestamp(1350540395000, 3), "h" : NumberLong("-1888768148831990635"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 1 }, "o" : { "$set" : { "name" : "foo" } } }

Page 54: 2012 mongo db_bangalore_replication

54

• Full read preference support in mongos

– Drivers too

• Improved RS lag logging

• rs.syncFrom command

• buildIndexes setting

• replIndexPrefetch setting

Page 55: 2012 mongo db_bangalore_replication

55

• Use replica sets

• Easy to setup

– Try on a single machine

• Check doc page for RS tutorials

– http://docs.mongodb.org/manual/replication/#t

utorials

Page 56: 2012 mongo db_bangalore_replication

56

@mongodb

© Copyright 2010 10gen Inc.

conferences, appearances, and meetups http://www.10gen.com/events

http://bit.ly/mongofb

Facebook | Twitter | LinkedIn http://linkd.in/joinmongo

download at mongodb.org

We’re Hiring ! [email protected] @snanjund