Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Tuesday, February 8, 2011

The best presentation ever. Life changing.

Transcript of Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Page 1: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Page 2: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Big and Fat

Using MongoDB with deep and diverse datasets: A case study

Page 3: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

About me

• My name is Jeremy McAnally

• “Software architect” at Intridea

• Write a lot of books, OSS, etc.

• http://github.com/jm

• http://twitter.com/jm

• http://authoringebooks.com

• http://wickhamhousebrand.com

Page 5: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

New book!

-2 days from today

Page 6: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Preface

The Application™

Page 9: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Disclaimer

We moved to (mostly) SQL.

Page 13: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

YAK SHAVE

Page 14: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Lesson 1

Abstraction is a double-edged sword.

Page 15: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Abstract away!

Talking to all data (no matter the source) the same way will keep you sane.

Page 16: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

users = MySQL::Query.execute("SELECT * FROM users;")

users.each do |u|
  posts = db.collection('posts').find(:user_id => u['id'])
  # [...]
  comments = db.collection('comments').find("$where" => "sum(this.admin_count, this.moderator_count) == 5")
end

Page 17: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

users = User.all

users.each do |u|
  posts = Post.find(:user_id => u.id)
  # [...]
  comments = Comment.where("sum(this.admin_count, this.moderator_count) == 5")
end

Page 18: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

users = User.all

users.each do |u|
  posts = Post.find(:user_id => u.id)
  # [...]
  comments = Comment.with_five_things
end
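
A minimal sketch of what sits behind calls like User.all and Post.find, assuming in-memory arrays in place of the real MySQL and MongoDB backends. SQL_USERS, MONGO_POSTS, and the model internals are made up for illustration; the point is only that callers talk to both sources the same way.

```ruby
# Hypothetical in-memory stand-ins for two different backends.
SQL_USERS = [{ "id" => 1, "name" => "jm" }, { "id" => 2, "name" => "guest" }]
MONGO_POSTS = [
  { "user_id" => 1, "title" => "Blogs rawk." },
  { "user_id" => 1, "title" => "Fo realz" }
]

# Each model hides where its rows or documents actually live.
class User
  attr_reader :id, :name

  def initialize(row)
    @id = row["id"]
    @name = row["name"]
  end

  # A real app would run the SQL query here.
  def self.all
    SQL_USERS.map { |row| new(row) }
  end
end

class Post
  # A real app would query the posts collection here.
  def self.find(conditions)
    MONGO_POSTS.select { |doc| doc["user_id"] == conditions[:user_id] }
  end
end

# Callers use one style no matter the backend.
post_counts = User.all.map { |u| [u.name, Post.find(:user_id => u.id).size] }.to_h
puts post_counts.inspect
```

Swapping a backend then means changing one class method, not every call site.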

Page 19: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

...but wait!

MongoDB has a lot of features that will perform better and be less (and often better) code.

Page 20: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

pharmacists = {}

Patient.all.each do |patient|
  patient.prescriptions.each do |prescription|
    pharmacists[prescription.name] ||= 0
    pharmacists[prescription.name] += 1
  end
end

Page 21: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

SLOW AS CRAP

Page 22: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

map = "function() {
  this.prescriptions.forEach(function(p) {
    emit(p.name, { count : 1 });
  });
}"

reduce = "function(k, v) {
  var number = 0;
  v.forEach(function(value) {
    number += value.count;
  });
  return { count : number };
}"

pharms = @patients.map_reduce(map, reduce)
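
To see what the map and reduce functions compute without a running server, here is a small pure-Ruby simulation. It is only a sketch: the sample patients are hypothetical, and the real work would happen inside MongoDB rather than in application code.

```ruby
# Hypothetical sample patients with embedded prescriptions.
patients = [
  { "name" => "A", "prescriptions" => [{ "name" => "Drug X" }, { "name" => "Drug Y" }] },
  { "name" => "B", "prescriptions" => [{ "name" => "Drug X" }] }
]

# Map step: emit one { count: 1 } value per embedded prescription name.
emitted = Hash.new { |hash, key| hash[key] = [] }
patients.each do |patient|
  patient["prescriptions"].each { |p| emitted[p["name"]] << { "count" => 1 } }
end

# Reduce step: sum the counts emitted for each key.
counts = emitted.map do |name, values|
  [name, { "count" => values.sum { |v| v["count"] } }]
end.to_h

puts counts.inspect
```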

Page 24: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Lesson 2

Schema design matters.

Page 25: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Lesson 2

Schema design matters.

DATA MODEL

Page 26: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Embedding works.

Embedding documents is a smart decision in a lot of cases.

Page 27: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

SELECT * FROM patients WHERE id=212;
SELECT * FROM prescriptions WHERE patient_id=212;
SELECT * FROM appointments WHERE patient_id=212;
SELECT * FROM contacts WHERE patient_id=212;
SELECT * FROM claims WHERE patient_id=212;
...

Page 28: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

{
  "_id" : ObjectId("4d51959614971661303ea716"),
  "title" : "Blogs rawk.",
  "body" : "Fo realz",
  "comments" : [
    {
      "user_name" : "Jeremy",
      "user_id" : 1234,
      "body" : "Yup."
    }
  ]
}
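
A minimal sketch of why embedding saves round trips, using a plain Ruby array as a hypothetical stand-in for the collection (the _id is kept as a string for simplicity). One lookup returns the post and its comments together, where the relational version needs a query per table.

```ruby
# Hypothetical in-memory "collection" holding one embedded document.
posts = [
  {
    "_id" => "4d51959614971661303ea716",
    "title" => "Blogs rawk.",
    "body" => "Fo realz",
    "comments" => [
      { "user_name" => "Jeremy", "user_id" => 1234, "body" => "Yup." }
    ]
  }
]

# One lookup returns the post and its comments together.
post = posts.find { |doc| doc["_id"] == "4d51959614971661303ea716" }
commenters = post["comments"].map { |c| c["user_name"] }

puts "#{post['title']} has comments from: #{commenters.join(', ')}"
```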

Page 29: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

...but watch it.

You can also hit a ton of performance and design issues.

Page 32: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

OUR GIANT DOCUMENT

Mongo’s Pre-Allocated Space

Page 33: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Patient

Pharmacy

“Reference” Pharmacy

Search, listing, etc.

Page 34: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Lesson 3

Don’t go nuts.

Page 35: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Schemaless is fun!

Having schemaless data has its own battery of advantages.

nosql

OH MAN MONGO JUST GOT REAL UP IN HERE

Page 36: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Schemaless Joy

• Transforming data models is a delight

Page 38: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Schemaless Joy

• Transforming data models is a delight

• Formless data isn’t awkward

Page 39: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

{
  "_id" : ObjectId("4d50c6c32472473e54122d29"),
  "name" : "Subject A",
  "2007" : 199,
  "2008" : 2002,
  "2010" : 387
},
{
  "_id" : ObjectId("4d50c6d92472473e54122d2a"),
  "name" : "Subject B",
  "2005" : 8,
  "2008" : 99,
  "2012" : 466
},
{
  "_id" : ObjectId("4d50c6f52472473e54122d2b"),
  "name" : "Subject C",
  "2005" : 100,
  "2009" : 120,
  "2010" : 1201,
  "2012" : 3469
}
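
Because each subject only carries the years it has data for, a query can simply ask whether a field is present. A minimal Ruby sketch of that filter over the same three documents, with plain hashes standing in for the collection (the _id fields are left out for brevity):

```ruby
subjects = [
  { "name" => "Subject A", "2007" => 199, "2008" => 2002, "2010" => 387 },
  { "name" => "Subject B", "2005" => 8, "2008" => 99, "2012" => 466 },
  { "name" => "Subject C", "2005" => 100, "2009" => 120, "2010" => 1201, "2012" => 3469 }
]

# Keep only documents that have a non-nil value for the year 2008,
# the in-memory equivalent of { "2008" => { "$ne" => nil } }.
with_2008 = subjects.reject { |doc| doc["2008"].nil? }
names = with_2008.map { |doc| doc["name"] }

puts names.inspect
```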

Page 40: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

> db.subjects.find({2008: {$ne: null}})
{ "_id" : ObjectId("4d50c6c32472473e54122d29"), "name" : "Subject A", "2007" : 199, "2008" : 2002, "2010" : 387 }
{ "_id" : ObjectId("4d50c6d92472473e54122d2a"), "name" : "Subject B", "2005" : 8, "2008" : 99, "2012" : 466 }

Page 41: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Schemaless Joy

• Transforming data models is a delight

• Formless data isn’t awkward

• Arbitrary embedding is awesome

Page 43: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Schemaless Joy

• Transforming data models is a delight

• Formless data isn’t awkward

• Arbitrary embedding is awesome

• Building to work with schemaless data can lead to some really powerful app concepts

Page 44: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

...but be wary.

Going nuts will create headaches for you.

Page 49: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Schemaless Pain

• Weird app behavior

• Huge, long-running data transformations

• Annoying data transforms for development environments

• Difficult to version data models
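
One common way to soften the versioning problem, sketched here as an assumption rather than anything the talk prescribes, is to stamp each document with a version number and upgrade it lazily on read. The schema_version field, CURRENT_VERSION, and the name-splitting migration are all hypothetical.

```ruby
CURRENT_VERSION = 2

# Hypothetical upgrade steps, keyed by the version they upgrade from.
MIGRATIONS = {
  1 => lambda do |doc|
    # Version 1 stored a single "name"; version 2 splits it in two.
    first, last = doc["name"].to_s.split(" ", 2)
    doc.reject { |key, _| key == "name" }
       .merge("first_name" => first, "last_name" => last)
  end
}

# Bring a document up to the current version without mutating the input.
# Documents with no version stamp are treated as version 1.
def upgrade(doc)
  version = doc.fetch("schema_version", 1)
  while version < CURRENT_VERSION
    doc = MIGRATIONS.fetch(version).call(doc)
    version += 1
  end
  doc.merge("schema_version" => version)
end

old_doc = { "name" => "Jeremy McAnally" }
puts upgrade(old_doc).inspect
```

This trades one huge, long-running transformation for a small cost on each read of a stale document.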

Page 50: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Lesson 4

Dig deep.

Page 51: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

> db.runCommand({"serverStatus" : 1})
{
  "version" : "1.4.3",
  "uptime" : 96,
  "localTime" : "Thu Nov 18 2010 01:49:38 GMT-0500 (EST)",
  "globalLock" : {
    "totalTime" : 96005290,
    "lockTime" : 174040,
    "ratio" : 0.0018128167729090762
  },
  "mem" : {
    "bits" : 64,
    "resident" : 2,
    "virtual" : 2396,
    "supported" : true,
    "mapped" : 0
  },
  "connections" : {
    "current" : 1,
    "available" : 19999
  },
  "extra_info" : {
    "note" : "fields vary by platform"
  },
  "indexCounters" : {
    "btree" : {
      "accesses" : 0,
      "hits" : 0,
      "misses" : 0,
      "resets" : 0,
      "missRatio" : 0
    }
  },
  "backgroundFlushing" : {
    "flushes" : 1,
    "total_ms" : 0,
    "average_ms" : 0,
    "last_ms" : 0,
    "last_finished" : "Thu Nov 18 2010 01:49:02 GMT-0500 (EST)"
  },
  "opcounters" : {
    "insert" : 0,
    "query" : 1,
    "update" : 0,
    "delete" : 0,
    "getmore" : 0,
    "command" : 3
  },
  "asserts" : {
    "regular" : 0,
    "warning" : 0,
    "msg" : 0,
    "user" : 0,
    "rollovers" : 0
  },
  "ok" : 1
}

Page 52: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

"opcounters" : {
  "insert" : 0,
  "query" : 1,
  "update" : 0,
  "delete" : 0,
  "getmore" : 0,
  "command" : 3
}

Page 53: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

"connections" : {
  "current" : 1,
  "available" : 19999
}

Page 54: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

Jeremy-McAnallys-MacBook-Pro:~ jeremymcanally$ mongostat
connected to: 127.0.0.1
insert/s  query/s  update/s  delete/s  getmore/s  command/s  mapped  vsize  res  % locked  % idx miss  conn      time
       0        0         0         0          0          1       0   2396    3         0           0     1  01:53:32
       0        0         0         0          0          1       0   2396    3         0           0     1  01:53:33
       0        0         0         0          0          1       0   2396    3         0           0     1  01:53:34
       0        0         0         0          0          1       0   2396    3         0           0     1  01:53:35
       0        0         0         0          0          1       0   2396    3         0           0     1  01:53:36
       0        0         0         0          0          1       0   2396    3         0           0     1  01:53:37
       0        0         0         0          0          1       0   2396    3         0           0     1  01:53:38
       0        0         0         0          0          1       0   2396    3         0           0     1  01:53:39
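
The per-second columns above are roughly what you get by diffing two serverStatus samples. A minimal sketch of that arithmetic: the first sample uses the opcounters and globalLock numbers from the earlier slides, the second sample is hypothetical, and the helper names are made up. It assumes totalTime and lockTime are in microseconds, which fits the 96-second uptime shown earlier.

```ruby
# First sample: numbers from the serverStatus slide.
# Second sample: hypothetical values taken 10 seconds later.
first = {
  "globalLock" => { "totalTime" => 96_005_290, "lockTime" => 174_040 },
  "opcounters" => { "insert" => 0, "query" => 1, "update" => 0,
                    "delete" => 0, "getmore" => 0, "command" => 3 }
}
second = {
  "globalLock" => { "totalTime" => 106_005_290, "lockTime" => 674_040 },
  "opcounters" => { "insert" => 50, "query" => 21, "update" => 0,
                    "delete" => 0, "getmore" => 0, "command" => 13 }
}

# Lock ratio since startup; this reproduces the "ratio" field on the slide.
lifetime_ratio = first["globalLock"]["lockTime"].to_f / first["globalLock"]["totalTime"]

# opcounters only ever grow, so a rate is the difference over elapsed time.
def ops_per_second(before, after, seconds)
  after.each_with_object({}) do |(op, count), rates|
    rates[op] = (count - before.fetch(op, 0)) / seconds.to_f
  end
end

# Share of the sampling interval spent holding the global lock.
def percent_locked(before, after)
  lock = after["lockTime"] - before["lockTime"]
  total = after["totalTime"] - before["totalTime"]
  (100.0 * lock / total).round(1)
end

rates = ops_per_second(first["opcounters"], second["opcounters"], 10)
locked = percent_locked(first["globalLock"], second["globalLock"])

puts "lifetime lock ratio: #{lifetime_ratio}"
puts "rates: #{rates.inspect}"
puts "% locked over interval: #{locked}"
```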

Page 56: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

db._adminCommand({  diagLogging  :  1  })

Page 57: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

db.currentOp()
{ inprog: [ { "opid" : 35 , "op" : "query" , "ns" : "fundb.parties" ,
              "query" : "{ score : 1.0 }" , "inLock" : 1 }
          ]
}

Page 58: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

> db.oplog.$main.find()
{ "ts" : { "t" : 1290063566000, "i" : 1 }, "op" : "i", "ns" : "ming.foo", "o" : { "_id" : ObjectId("4ce4ceceabb1b65158000001"), "thing" : 2 } }
{ "ts" : { "t" : 1290063569000, "i" : 1 }, "op" : "n", "ns" : "", "o" : { } }
{ "ts" : { "t" : 1290063579000, "i" : 1 }, "op" : "n", "ns" : "", "o" : { } }
{ "ts" : { "t" : 1290063581000, "i" : 1 }, "op" : "i", "ns" : "ming.foo", "o" : { "_id" : ObjectId("4ce4ceddabb1b65158000002"), "thing" : 2 } }
{ "ts" : { "t" : 1290063581000, "i" : 2 }, "op" : "i", "ns" : "ming.foo", "o" : { "_id" : ObjectId("4ce4ceddabb1b65158000003"), "thing" : 2 } }

Page 59: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

{
  "ts" : {
    "t" : 1290063566000,
    "i" : 1
  },
  "op" : "i",
  "ns" : "ming.foo",
  "o" : {
    "_id" : ObjectId("4ce4ceceabb1b65158000001"),
    "field" : 2
  }
}
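
A small sketch of how an entry like this can be read: "op" is a one-letter operation code ("i" for insert, "n" for a no-op), "ns" is the database.collection namespace, and "o" is the document itself. The describe helper is hypothetical, and it assumes ts.t is milliseconds since the Unix epoch, which is what the values on the slide suggest.

```ruby
# One-letter oplog operation codes and their meanings.
OP_NAMES = {
  "i" => "insert", "u" => "update", "d" => "delete",
  "n" => "no-op", "c" => "command"
}

# Build a one-line, human-readable summary of an oplog entry.
def describe(entry)
  # Assumption: ts.t is milliseconds since the Unix epoch.
  time = Time.at(entry["ts"]["t"] / 1000).utc
  op = OP_NAMES.fetch(entry["op"], "unknown")
  ns = entry["ns"].empty? ? "(none)" : entry["ns"]
  "#{time.strftime('%Y-%m-%d %H:%M:%S')} UTC #{op} #{ns}"
end

entry = {
  "ts" => { "t" => 1_290_063_566_000, "i" => 1 },
  "op" => "i",
  "ns" => "ming.foo",
  "o" => { "field" => 2 }
}
summary = describe(entry)
puts summary
```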

Page 60: Big and Fat: Using MongoDB with Deep and Diverse Data Sets (MongoATL version)

That’s all I got.

Questions?
