The emerging world of mongo db csp


Transcript of The emerging world of mongo db csp

Page 1: The emerging world of mongo db   csp

The Emerging World of MongoDB

Page 2: The emerging world of mongo db   csp

Schedule

● Reflection

● What does NoSQL mean?

● We're thinking in changes

● What is MongoDB like?

● Let's get started with MongoDB

● What's going wrong?

● Document structure

● Basic operations: CRUD

● Index, explain, hint

● Data model: be quiet, the answer

● Sharding: scaling

Page 3: The emerging world of mongo db   csp

{"Name" : "Carlos Sánchez Pérez", "Love" : "Web Dev", "Title" : "Rough Boy", "Twitter" : "@carlossanchezp", "Blog" : "carlossanchezperez.wordpress.com", "Job" : "ASPgems", "Github" : "carlossanchezp"}

Page 4: The emerging world of mongo db   csp

REFLECTION: Through all this time one thing has stayed constant: relational databases store the data, and that decision is almost always implied.

Page 5: The emerging world of mongo db   csp

What does NoSQL mean?

Page 6: The emerging world of mongo db   csp

Because the true spirit of “NoSQL” does not consist in the way data is queried. It consists in the way data is stored.

NoSQL is all about data storage.

“NoSQL” should be called “SQL with alternative storage models”

Page 7: The emerging world of mongo db   csp

We are thinking about changes...

Wait let me show you

Page 8: The emerging world of mongo db   csp

● How will we add new machines?

● Are there any single points of failure?

● Do the writes scale as well?

● How much administration will the system require?

● If it's open source, is there a healthy community?

● How much time and effort would we have to expend to deploy and integrate it?

● Does it use technology which we know we can work with?

Page 9: The emerging world of mongo db   csp
Page 10: The emerging world of mongo db   csp

What is MongoDB like?

Page 11: The emerging world of mongo db   csp

MongoDB is a powerful, flexible, and scalable general-purpose database

Page 12: The emerging world of mongo db   csp

Ease of Use

MongoDB is a document-oriented database, not a relational one.

One of the reasons for moving away from the relational model is to make scaling out easier.

Page 13: The emerging world of mongo db   csp

Ease of Use

A document-oriented database replaces the concept of a “row” with a more flexible model: the “document”.

By allowing embedded documents and arrays, the document-oriented approach makes it possible to represent complex hierarchical relationships with a single record.

Page 14: The emerging world of mongo db   csp

Ease of Use

Without a fixed schema, adding or removing fields as needed becomes easier.

This makes development faster as developers can quickly iterate.

It is also easier to experiment.

Developers can try a lot of models for the data and then choose the best one.

Page 15: The emerging world of mongo db   csp

Easy Scaling

Data set sizes for applications are growing at an incredible pace.

As the amount of data that developers need to store grows, developers face a difficult decision:

how should they scale their databases?

Scaling a database comes down to the choice between:

● Scaling up: getting a bigger machine.

● Scaling out: partitioning data across more machines.

Page 16: The emerging world of mongo db   csp

Let’s Get Started

Page 17: The emerging world of mongo db   csp

Let's see some of the basic concepts of MongoDB:

• A document is the basic unit of data for MongoDB and is equivalent to a row in a Relational Database.

• A collection can be thought of as a table with a dynamic schema.

• A single instance of MongoDB can host multiple independent databases, each of which can have its own collections.

• Every document has a special key, "_id", that is unique within a collection.

• MongoDB comes with an awesome JavaScript shell, which is useful for administration and data manipulation.

Page 18: The emerging world of mongo db   csp

Schemaless dynamics

MongoDB is "schemaless", but that doesn't mean you don't need to think about designing your schema!

MongoDB's schema is dynamic, meaning there is no ALTER TABLE and no schema migration.

Page 19: The emerging world of mongo db   csp

At the core of MongoDB is the document: an ordered set of keys with associated values.

In JavaScript, for example, documents are represented as objects:

{"name" : "Hello, MongoDB world!"}

{"name" : "Hello, MongoDB world!", "foo" : 3}

{"name" : "Hello, MongoDB world!", "foo" : 3, "fruit" : ["pear", "apple"]}

Page 20: The emerging world of mongo db   csp

let's start working

Page 21: The emerging world of mongo db   csp

What's going wrong?

Page 22: The emerging world of mongo db   csp

The keys in a document are strings. Any UTF-8 character is allowed in a key, with a few notable exceptions:

• Keys must not contain the character \0 (the null character). This character is used to signify the end of a key.

• The . and $ characters have some special properties and should be used only in certain circumstances, but in general, they should be considered reserved.
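The key rules above can be sketched as a small check in plain JavaScript. This is only an illustration: `checkKey` is a hypothetical helper written for this slide, not a MongoDB function, and it simply treats every "." and "$" as reserved.

```javascript
// Hypothetical helper (not part of MongoDB): applies the key rules from this slide.
function checkKey(key) {
  if (typeof key !== "string") {
    return "keys must be strings";
  }
  if (key.includes("\0")) {
    return "contains the null character";
  }
  if (/[.$]/.test(key)) {
    // Assumption: treat every "." and "$" as reserved, the conservative reading.
    return "uses a reserved character (. or $)";
  }
  return "ok";
}

console.log(checkKey("title"));         // ok
console.log(checkKey("author.name"));   // uses a reserved character (. or $)
console.log(checkKey("$set"));          // uses a reserved character (. or $)
```

Running a check like this in the application, before the document reaches the database, keeps the error close to the code that built the key.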

Page 23: The emerging world of mongo db   csp

SQL Terms/Concepts | MongoDB Terms/Concepts
database | database
table | collection
row | document or BSON document
column | field
index | index
foreign key, joins | embedded documents and linking
primary key | automatically set to the _id field

Page 24: The emerging world of mongo db   csp

MongoDB is type-sensitive and case-sensitive.

For example, these documents are distinct:

{"foo" : "3"}

{"foo" : 3}

{"Foo" : 3}

A document cannot contain duplicate keys. For example, this is not a legal document:

{"name" : "Hello, world!", "name" : "Hello, MongoDB!"}
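The same distinctions can be seen in plain JavaScript, without a database. This is only a sketch of how the language treats these values and keys, not of what the MongoDB server does internally.

```javascript
// Three objects that differ only in value type or key case.
const asString = { foo: "3" };
const asNumber = { foo: 3 };
const capitalized = { Foo: 3 };

// Type-sensitive: the string "3" is not the number 3.
const sameValue = asString.foo === asNumber.foo;

// Case-sensitive: "foo" and "Foo" are different keys.
const sameKey = Object.keys(asNumber)[0] === Object.keys(capitalized)[0];

// Duplicate keys: when JavaScript parses this text, only the last "name" survives.
const parsed = JSON.parse('{"name" : "Hello, world!", "name" : "Hello, MongoDB!"}');

console.log(sameValue);    // false
console.log(sameKey);      // false
console.log(parsed.name);  // Hello, MongoDB!
```

The last line is the reason duplicate keys are dangerous: the first value disappears silently, with no error.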

Page 25: The emerging world of mongo db   csp

WAIT!!

NO JOIN

Page 26: The emerging world of mongo db   csp

Document Structure

Page 27: The emerging world of mongo db   csp

References

Page 28: The emerging world of mongo db   csp

Embedded Data

Page 29: The emerging world of mongo db   csp

Data

DATABASE: use db_name

COLLECTION: db.blog methods_mongodb

DOCUMENT: {…..}

SUBDOCUMENT: {...}

FIELD: name: type

ARRAY: [….]

DOCUMENT with an array of subdocuments: {…..[{...}]......}

Page 30: The emerging world of mongo db   csp

DOCUMENT

{ _id: ObjectId('4bd9e8e17cefd644108961bb'),
  title: 'My own Adventures in Databases',
  url: 'http://example.com/exampledatabases.txt',
  author: 'csp',
  vote_count: 5,
  tags: ['databases', 'mongodb', 'indexing'],
  image: {
    url: 'http://example.com/db.jpg',
    caption: '',
    type: 'jpg',
    size: 75381,
    data: "Binary"
  },
  comments: [
    { user: 'abc', text: 'Nice article!' },
    { user: 'jkl', text: 'Another related article is at http://example.com/db/mydb.txt' }
  ]
}

● _id: the ID set by default, or your own (for example _id: 1)

● tags: Array

● image: SubDoc

● comments: Array + SubDoc

Page 31: The emerging world of mongo db   csp

Collections

A collection is a group of documents. If a document is the MongoDB analog of a row in a relational database, then a collection can be thought of as the analog to a table.

Collection (like a SQL table):

{"title" : "First", "edge" : 34}

{"title" : "First", "edge" : 34}

{"title" : "First", "edge" : 34}

Each document is like a SQL row.

Page 32: The emerging world of mongo db   csp

Dynamic Schemas

Collections have dynamic schemas. This means that the documents within a single collection can have any number of different “shapes.”

For example, both of these documents could be stored in a single collection:

{"greeting" : "Hello, mongoDB world!"}

{"foo" : 23}

Page 33: The emerging world of mongo db   csp

Subcollections

One convention for organizing collections is to use namespaced subcollections separated by the “.” character.

For example, a blog application might have a collection named blog.posts and a separate collection named blog.authors.

Page 34: The emerging world of mongo db   csp

Basic Operations CRUD

Page 35: The emerging world of mongo db   csp

Basic Operations

CREATE: The insert function adds a document to a collection.

> post = {"title" : "My first post in my blog",
... "content" : "Here's my blog post.",
... "date" : new Date()}

> db.blog.insert(post)

> db.blog.find()
{
    "_id" : ObjectId("5037ee4a1084eb3ffeef7228"),
    "title" : "My first post in my blog",
    "content" : "Here's my blog post.",
    "date" : ISODate("2013-10-05T16:13:42.181Z")
}

Page 36: The emerging world of mongo db   csp
Page 37: The emerging world of mongo db   csp

this.insertEntry = function (title, body, tags, author, callback) {
    "use strict";
    console.log("inserting blog entry" + title + body);

    // fix up the permalink to not include whitespace
    var permalink = title.replace( /\s/g, '_' );
    permalink = permalink.replace( /\W/g, '' );

    // Build a new post
    var post = {"title": title,
                "author": author,
                "body": body,
                "permalink": permalink,
                "tags": tags,
                "comments": [],
                "date": new Date()}

    // now insert the post
    posts.insert(post, function (err, post) {
        "use strict";

        if (!err) {
            console.log("Inserted new post");
            console.dir("Successfully inserted: " + JSON.stringify(post));
            return callback(null, post);
        }

        return callback(err, null);
    });
}
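To see what the two replace calls in insertEntry do to a title, here is the same permalink logic on its own. The function name `makePermalink` is made up for this sketch; the two regular expressions are the ones from the slide.

```javascript
// Same two steps as in insertEntry: whitespace becomes "_", then every
// character that is not a letter, digit, or underscore is dropped.
function makePermalink(title) {
  var permalink = title.replace(/\s/g, "_");
  return permalink.replace(/\W/g, "");
}

console.log(makePermalink("My first post in my blog!"));  // My_first_post_in_my_blog
console.log(makePermalink("Carlos Sánchez"));              // Carlos_Snchez
```

Note the second result: \W only keeps ASCII letters, digits, and "_", so accented characters are removed. Two different titles can also end up with the same permalink, so a real blog would probably need a uniqueness check.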

Page 38: The emerging world of mongo db   csp

Basic Operations

FIND: find and findOne can be used to query a collection.

> db.blog.findOne()
{
    "_id" : ObjectId("5037ee4a1084eb3ffeef7228"),
    "title" : "My first post in my blog",
    "content" : "Here's my blog post.",
    "date" : ISODate("2012-08-24T21:12:09.982Z")
}

Page 39: The emerging world of mongo db   csp
Page 40: The emerging world of mongo db   csp

this.getPostsByTag = function(tag, num, callback) {
    "use strict";
    posts.find({ tags : tag }).sort('date', -1).limit(num).toArray(function(err, items) {
        "use strict";

        if (err) return callback(err, null);

        console.log("Found " + items.length + " posts");

        callback(err, items);
    });
}

this.getPostByPermalink = function(permalink, callback) {
    "use strict";
    posts.findOne({'permalink': permalink}, function(err, post) {
        "use strict";

        if (err) return callback(err, null);

        callback(err, post);
    });
}

Page 41: The emerging world of mongo db   csp

Basic Operations

UPDATE: If we would like to modify our post, we can use update. update takes (at least) two parameters: the first is the criteria to find which document to update, and the second is the new document.

> post.comments = []

> db.blog.update({title : "My first post in my blog"}, post)

> db.blog.find()
{
    "_id" : ObjectId("5037ee4a1084eb3ffeef7228"),
    "title" : "My first post in my blog",
    "content" : "Here's my blog post.",
    "date" : ISODate("2013-10-05T16:13:42.181Z"),
    "comments" : [ ]
}

Page 42: The emerging world of mongo db   csp

this.addComment = function(permalink, name, email, body, callback) {
    "use strict";

    var comment = {'author': name, 'body': body}

    if (email != "") {
        comment['email'] = email
    }

    posts.update({'permalink': permalink},
                 { $push: { "comments": comment } },
                 {safe:true},
                 function (err, comment) {
        "use strict";

        if (!err) {
            console.log("Inserted new comment");
            console.log(comment);
            return callback(null, comment);
        }

        return callback(err, null);
    });
}
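The two update styles seen so far (passing a whole new document, and passing a $push operator) behave very differently. The sketch below imitates that difference on a plain in-memory object. `applyUpdate` is a hypothetical function written for this illustration; it is not the driver's API and it only understands $push.

```javascript
// Hypothetical in-memory imitation of two update styles (not the real driver).
function applyUpdate(doc, update) {
  const operators = Object.keys(update).filter((key) => key.startsWith("$"));

  if (operators.length === 0) {
    // No operators: the update IS the new document. Only _id is kept.
    return { ...update, _id: doc._id };
  }

  for (const op of operators) {
    if (op !== "$push") throw new Error("this sketch only supports $push, got " + op);
  }

  // Operator update: copy the document and change only the named fields.
  const result = JSON.parse(JSON.stringify(doc));
  for (const [field, value] of Object.entries(update.$push)) {
    if (result[field] === undefined) result[field] = [];
    if (!Array.isArray(result[field])) throw new Error(field + " is not an array");
    result[field].push(value);
  }
  return result;
}

const original = { _id: 1, title: "My first post in my blog", content: "Here's my blog post." };

// Replacement: "content" is gone because the new document does not have it.
const replaced = applyUpdate(original, { title: "My first post in my blog", comments: [] });

// $push: everything else stays, one comment is appended.
const pushed = applyUpdate(replaced, { $push: { comments: { author: "abc", body: "Nice article!" } } });

console.log(replaced);
console.log(pushed.comments.length);  // 1
```

This is why addComment uses $push: a comment can be added without reading and rewriting the whole post.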

Page 43: The emerging world of mongo db   csp

Basic Operations

DELETE: remove permanently deletes documents from the database. Called with no parameters, it removes all documents from a collection. It can also take a document specifying criteria for removal.

> db.blog.remove({title : "My first post in my blog"})

> db.blog.find()
{
    "_id" : ObjectId("5037ee4a1084eb3ffeef7228"),
    "title" : "My second post in my blog",
    "content" : "Here's my second blog post.",
    "date" : ISODate("2013-10-05T16:13:42.181Z"),
    "comments" : [ ]
}

Page 44: The emerging world of mongo db   csp

Comparison

Name | Description
$gt | Matches values that are greater than the value specified in the query.
$gte | Matches values that are equal to or greater than the value specified in the query.
$in | Matches any of the values that exist in an array specified in the query.
$lt | Matches values that are less than the value specified in the query.
$lte | Matches values that are less than or equal to the value specified in the query.
$ne | Matches all values that are not equal to the value specified in the query.
$nin | Matches values that do not exist in an array specified in the query.

Logical

Name | Description
$or | Joins query clauses with a logical OR; returns all documents that match the conditions of either clause.
$and | Joins query clauses with a logical AND; returns all documents that match the conditions of both clauses.
$not | Inverts the effect of a query expression and returns documents that do not match the query expression.
$nor | Joins query clauses with a logical NOR; returns all documents that fail to match both clauses.

Page 45: The emerging world of mongo db   csp

Examples

db.scores.find( { score : { $gt : 50, $lt : 60 } } );

db.scores.find( { $or : [ { score : { $lt : 50 } }, { score : { $gt : 90 } } ] } );

db.users.find( { name : { $regex : "q" }, email : { $exists : true } } );

db.users.find( { friends : { $all : [ "Joe" , "Bob" ] }, favorites : { $in : [ "running" , "pickles" ] } } );
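To get a feeling for how these operators read, here is a tiny matcher over plain JavaScript objects. It is a sketch under big assumptions: `matches` and `comparators` are names invented here, only a handful of operators are covered, and real MongoDB handles many more cases (arrays, missing fields, $regex, $exists, and so on).

```javascript
// Invented for this sketch: a few comparison operators as plain functions.
const comparators = {
  $gt: (value, arg) => value > arg,
  $gte: (value, arg) => value >= arg,
  $lt: (value, arg) => value < arg,
  $lte: (value, arg) => value <= arg,
  $ne: (value, arg) => value !== arg,
  $in: (value, list) => list.includes(value),
  $nin: (value, list) => !list.includes(value),
};

// Returns true when the document satisfies every clause of the query.
function matches(doc, query) {
  return Object.entries(query).every(([key, condition]) => {
    if (key === "$or") return condition.some((q) => matches(doc, q));
    if (key === "$and") return condition.every((q) => matches(doc, q));
    if (key === "$nor") return !condition.some((q) => matches(doc, q));

    const value = doc[key];
    if (condition !== null && typeof condition === "object" && !Array.isArray(condition)) {
      // { $gt: 50, $lt: 60 }: every operator in the sub-document must hold.
      return Object.entries(condition).every(([op, arg]) => {
        const test = comparators[op];
        if (!test) throw new Error("unsupported operator in this sketch: " + op);
        return test(value, arg);
      });
    }
    return value === condition;  // plain equality, e.g. { author: "csp" }
  });
}

const scores = [{ score: 45 }, { score: 55 }, { score: 95 }];

const between = scores.filter((d) => matches(d, { score: { $gt: 50, $lt: 60 } }));
const extremes = scores.filter((d) =>
  matches(d, { $or: [{ score: { $lt: 50 } }, { score: { $gt: 90 } }] })
);

console.log(between);   // [ { score: 55 } ]
console.log(extremes);  // [ { score: 45 }, { score: 95 } ]
```

One thing to remember when writing range queries: a JavaScript object literal that repeats a key keeps only the last one, so both bounds of a range on the same field belong in one sub-document.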

Page 46: The emerging world of mongo db   csp

Index Explain Hint

Page 47: The emerging world of mongo db   csp

> for (i=0; i<1000000; i++) {
... db.users.insert(
... {
... "i" : i,
... "username" : "user"+i,
... "age" : Math.floor(Math.random()*120),
... "created" : new Date()
... }
... );
... }

> db.users.count()
1000000

> db.users.find()
{ "_id" : ObjectId("526403c77c1042777e4dd7f1"), "i" : 0, "username" : "user0", "age" : 80, "created" : ISODate("2013-10-20T16:24:39.780Z") }
{ "_id" : ObjectId("526403c77c1042777e4dd7f2"), "i" : 1, "username" : "user1", "age" : 62, "created" : ISODate("2013-10-20T16:24:39.826Z") }
{ "_id" : ObjectId("526403c77c1042777e4dd7f3"), "i" : 2, "username" : "user2", "age" : 5, "created" : ISODate("2013-10-20T16:24:39.826Z") }
{ "_id" : ObjectId("526403c77c1042777e4dd7f4"), "i" : 3, "username" : "user3", "age" : 69, "created" : ISODate("2013-10-20T16:24:39.826Z") }
{ "_id" : ObjectId("526403c77c1042777e4dd7f5"), "i" : 4, "username" : "user4", "age" : 93, "created" : ISODate("2013-10-20T16:24:39.826Z") }

Page 48: The emerging world of mongo db   csp

> db.users.find({username: "user999999"}).explain()
{
    "cursor" : "BasicCursor",
    "isMultiKey" : false,
    "n" : 1,
    "nscannedObjects" : 1000000,
    "nscanned" : 1000000,
    "nscannedObjectsAllPlans" : 1000000,
    "nscannedAllPlans" : 1000000,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 1,
    "nChunkSkips" : 0,
    "millis" : 392,
    "indexBounds" : {

    },
    "server" : "desarrollo:27017"
}

In other words:

A query could return 5 documents (n), scan 9 documents from the index (nscanned), and then read 5 full documents from the collection (nscannedObjects).

Page 49: The emerging world of mongo db   csp

The results of explain() describe the details of how MongoDB executes the query.

Some of the relevant fields are:

cursor: A result of BasicCursor indicates a non-indexed query. If we had used an indexed query, the cursor would have a type of BtreeCursor.

nscanned and nscannedObjects: The difference between these two similar fields is subtle but important. nscannedObjects is the total number of documents scanned by the query. nscanned is the number of documents and index entries scanned. Depending on the query, it's possible for nscanned to be greater than nscannedObjects.

n: The number of matching objects.

millis: Query execution duration.
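Why these numbers matter can be shown with a toy experiment in plain JavaScript. This is only a sketch: the array stands in for the collection, a Map stands in for the index (MongoDB uses a B-tree, not a hash map), and the returned `n` and `nscanned` fields just borrow the names from explain().

```javascript
// A small stand-in for the users collection (100,000 documents instead of 1,000,000).
const users = [];
for (let i = 0; i < 100000; i++) {
  users.push({ i: i, username: "user" + i });
}

// No index: every document has to be looked at.
function findByScan(username) {
  let scanned = 0;
  const found = [];
  for (const doc of users) {
    scanned++;
    if (doc.username === username) found.push(doc);
  }
  return { n: found.length, nscanned: scanned };
}

// Stand-in for an index on username: key -> position in the array.
// Assumption: a Map is used for simplicity; a real index is an ordered B-tree.
const usernameIndex = new Map();
users.forEach((doc, position) => usernameIndex.set(doc.username, position));

function findByIndex(username) {
  const position = usernameIndex.get(username);
  const found = position === undefined ? [] : [users[position]];
  return { n: found.length, nscanned: found.length };
}

console.log(findByScan("user99999"));   // { n: 1, nscanned: 100000 }
console.log(findByIndex("user99999"));  // { n: 1, nscanned: 1 }
```

Both calls return the same single document, but the scan touches the whole collection to find it. That gap is what a BasicCursor versus an indexed cursor looks like in explain().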

Page 50: The emerging world of mongo db   csp

> db.users.ensureIndex({"username" : 1})

> db.users.find({username: "user101"}).limit(1).explain()
{
    "cursor" : "BtreeCursor username_1",
    "isMultiKey" : false,
    "n" : 1,
    "nscannedObjects" : 1,
    "nscanned" : 1,
    "nscannedObjectsAllPlans" : 1,
    "nscannedAllPlans" : 1,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 40,
    "indexBounds" : {
        "username" : [
            [
                "user101",
                "user101"
            ]
        ]
    },
    "server" : "desarrollo:27017"
}

Page 51: The emerging world of mongo db   csp

> db.users.ensureIndex({"age" : 1, "username" : 1})

> db.users.find({"age" : {"$gte" : 41, "$lte" : 60}}).
... sort({"username" : 1}).
... limit(1000).
... hint({"age" : 1, "username" : 1})

> db.users.find({"age" : {"$gte" : 41, "$lte" : 60}}).
... sort({"username" : 1}).
... limit(1000).
... hint({"username" : 1, "age" : 1})

Page 52: The emerging world of mongo db   csp

Data Model

Page 53: The emerging world of mongo db   csp

1) Embedded or linked

2) 1:1

3) 1:many

4) 1:few

5) many:many

6) few:few

Where are the answers?

So gimme just a minute and I'll tell you why

Page 54: The emerging world of mongo db   csp

Because any document can be put into any collection, I wonder:

“Why do we need separate collections at all?”

with no need for separate schemas for different kinds of documents,

why should we use more than one collection?

Page 55: The emerging world of mongo db   csp

POST

{ _id:1, title:____, body:___, author:___, date:___ }

COMMENTS

{ _id:1, post_id:____, author:___, author_email:_, order:___ }

TAGS

{ _id:___, tag:____, post_id: 1 }

1) Embedded: 16 MB limit

2) Living without constraints: in MongoDB it depends on you

3) No JOINs

Page 56: The emerging world of mongo db   csp

Model One-to-One Relationships with Embedded Documents

{ _id: "csp", name: "Carlos Sánchez Pérez"}

{ patron_id: "csp", street: "123 Aravaca", city: "Madrid", Number: "25 3º A", zip: 12345}

If the address data is frequently retrieved with the name information, then with referencing, your application needs to issue multiple queries to resolve the reference.

2 collections

1) Frequency of access: think about memory, because all the information is loaded.

2) Growth in the size of the items: is the data written separately or embedded?

3) > 16 MB

4) Atomicity of data

Page 57: The emerging world of mongo db   csp

The better data model would be to embed the address data in the patron data:

{ _id: "csp", name: "Carlos Sánchez Pérez", address: { street: "123 Aravaca", city: "Madrid", Number: "25 3º A", zip: 12345 }}


{ _id: "csp1", name: "Carlos1 Sánchez1", address: { street: "1 Aravaca", city: "Madrid", Number: "95 3º A", zip: 12345 }}

{ _id: "csp", name: "Carlos SN", address: { street: "777 Aravaca", city: "Madrid", Number: "45 3º A", zip: 12345 }}

{ _id: "csp2", name: "Carlos SP3", address: { street: "666 Aravaca", city: "Madrid", Number: "75 3º A", zip: 12345 }}
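The cost of referencing versus embedding can be counted with a small in-memory sketch. Everything here is a stand-in: the arrays play the role of collections, and `findOne` is a local helper that counts lookups, not the MongoDB method of the same name.

```javascript
// Counts how many separate lookups the application has to issue.
const stats = { lookups: 0 };

// Local stand-in for a query against one collection.
function findOne(collection, predicate) {
  stats.lookups++;
  return collection.find(predicate) || null;
}

// Referenced model: two collections.
const patrons = [{ _id: "csp", name: "Carlos Sánchez Pérez" }];
const addresses = [{ patron_id: "csp", street: "123 Aravaca", city: "Madrid", zip: 12345 }];

function loadWithReference(id) {
  const patron = findOne(patrons, (p) => p._id === id);
  const address = findOne(addresses, (a) => a.patron_id === id);
  return { ...patron, address: address };
}

// Embedded model: one collection, the address lives inside the patron.
const patronsEmbedded = [
  { _id: "csp", name: "Carlos Sánchez Pérez",
    address: { street: "123 Aravaca", city: "Madrid", zip: 12345 } },
];

function loadEmbedded(id) {
  return findOne(patronsEmbedded, (p) => p._id === id);
}

stats.lookups = 0;
const referenced = loadWithReference("csp");
const lookupsWithReference = stats.lookups;

stats.lookups = 0;
const embedded = loadEmbedded("csp");
const lookupsEmbedded = stats.lookups;

console.log(lookupsWithReference, lookupsEmbedded);  // 2 1
```

Both paths end up with the same name and street, but the referenced model needs two round trips where the embedded model needs one.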

Page 58: The emerging world of mongo db   csp

Model One-to-Many Relationships with Embedded Documents

{ _id: "csp", name: "Carlos Sánchez Pérez"}

{ patron_id: "csp", street: "123 Aravaca", city: "Madrid", Number: "25 3º A", zip: 12345}

{ patron_id: "csp", street: "456 Aravaca", city: "Madrid", Number: "55 1º B", zip: 12345}

3 Collections

1) Frequency of access

2) Growing size

3) > 16 MB or MiB

4) Atomicity of data

Page 59: The emerging world of mongo db   csp

Model One-to-Many Relationships with Embedded Documents

{ _id: "csp",
  name: "Carlos Sánchez Pérez",
  addresses: [
    { street: "123 Aravaca", city: "Madrid", Number: "25 3º A", zip: 12345 },
    { street: "456 Aravaca", city: "Madrid", Number: "55 1º B", zip: 12345 }
  ]
}

Page 60: The emerging world of mongo db   csp

Model One-to-Many Relationships with Document References

People

{ _id: "csp", name: "Carlos Sánchez Pérez", City: "MD", ... }

City

{ _id: "MD", Name: ..., ... }

If you are thinking of embedding: it is not a good solution in this case. Why?

Redundant information.

Page 61: The emerging world of mongo db   csp

Model One-to-Many Relationships with Document References

post

{ title: "My first title",
  author: "Carlos Sánchez Pérez",
  date: "19/08/2013",
  comments: [ { name: "Antonio López", comment: "my comment" }, { .... } ],
  tags: ["tag1", "tag2", "tag3"]
}

author

{ _id: "Carlos Sánchez Pérez", password: "", …….. }

BLOG: one-to-few

Embedding is a good solution in this case.

Page 62: The emerging world of mongo db   csp

Model Many-to-Many Relationships

Books and Authors

Books

{ _id: 12,
  title: "MongoDB: My Definitive easy Guide",
  author: [32],
  ...
}

Authors

{ _id: 32,
  author: "Peter Garden",
  books: [12, 34, 5, 6, 78, 65, 99],
  ...
}

MODEL: few-to-few

Embedding books is not a good solution in this case.
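With references on both sides, the application resolves the links itself. A minimal sketch with in-memory arrays follows; the helper names `authorsOfBook` and `booksOfAuthor` are made up here, and the data is the Books/Authors example above with the elided fields left out.

```javascript
const books = [
  { _id: 12, title: "MongoDB: My Definitive easy Guide", author: [32] },
];

const authors = [
  { _id: 32, author: "Peter Garden", books: [12, 34, 5, 6, 78, 65, 99] },
];

// Book -> authors: follow the ids stored in the book.
function authorsOfBook(bookId) {
  const book = books.find((b) => b._id === bookId);
  if (!book) return [];
  return authors.filter((a) => book.author.includes(a._id)).map((a) => a.author);
}

// Author -> books: follow the ids stored in the author.
// Ids that are not in this small sample simply resolve to nothing.
function booksOfAuthor(authorId) {
  const author = authors.find((a) => a._id === authorId);
  if (!author) return [];
  return books.filter((b) => author.books.includes(b._id)).map((b) => b.title);
}

console.log(authorsOfBook(12));  // [ 'Peter Garden' ]
console.log(booksOfAuthor(32));  // [ 'MongoDB: My Definitive easy Guide' ]
```

Each direction costs a second lookup, but no book or author is stored twice. Nothing enforces that both sides agree, so keeping the two id lists in sync is the application's job.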

Page 63: The emerging world of mongo db   csp

Benefits of Embedding

Performance: better reads. Be careful if the document changes a lot: writes become slow.

Page 64: The emerging world of mongo db   csp

Sharding, horizontal scaling

Page 65: The emerging world of mongo db   csp

Sharding

Vertical scaling adds more CPU and storage resources to increase capacity. Scaling by adding capacity has limitations: high performance systems with large numbers of CPUs and large amounts of RAM are disproportionately more expensive than smaller systems. Additionally, cloud-based providers may only allow users to provision smaller instances. As a result there is a practical maximum capability for vertical scaling.

Sharding, or horizontal scaling, by contrast, divides the data set and distributes the data over multiple servers, or shards. Each shard is an independent database, and collectively, the shards make up a single logical database.

Page 66: The emerging world of mongo db   csp

[Diagram: APP, mongos (router), config server, and shards S1 (r1, r2, r3), s2, s3, s4, s5]

Page 67: The emerging world of mongo db   csp

user0 ....... user99999

COLLECTIONS

FIRST: when a collection is first sharded, it can be thought of as a single chunk from the smallest value of the shard key to the largest.

Sharding splits the collection into many chunks based on shard key ranges:

$minKey to user100

user100 to user300

user300 to user600

user600 to user900

user900 to user1200

user1200 to user1500

user1500 to $maxKey
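Routing by chunk range can be sketched in a few lines of plain JavaScript. This is an illustration under assumptions, not how mongos is implemented: the shard key is taken to be the numeric part of the username (so the ranges compare as numbers), the chunk boundaries are the ones listed above, and the shard names and their assignment to chunks are made up.

```javascript
// Each chunk covers [min, max): lower bound included, upper bound excluded.
// -Infinity and Infinity stand in for $minKey and $maxKey.
// Assumption: shard names and the chunk-to-shard assignment are invented.
const chunks = [
  { min: -Infinity, max: 100, shard: "s1" },
  { min: 100, max: 300, shard: "s2" },
  { min: 300, max: 600, shard: "s3" },
  { min: 600, max: 900, shard: "s4" },
  { min: 900, max: 1200, shard: "s5" },
  { min: 1200, max: 1500, shard: "s1" },
  { min: 1500, max: Infinity, shard: "s2" },
];

// What a router has to answer: which chunk (and so which shard) owns this key?
function findChunk(key) {
  return chunks.find((chunk) => key >= chunk.min && key < chunk.max);
}

console.log(findChunk(42).shard);    // s1
console.log(findChunk(100).shard);   // s2
console.log(findChunk(5000).shard);  // s2
```

Because the ranges cover everything from the smallest possible key to the largest, every key lands in exactly one chunk. A query on the shard key can go to one shard, while a query on any other field has to ask all of them.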

Page 68: The emerging world of mongo db   csp

at the end......

Page 69: The emerging world of mongo db   csp

Any questions?

….. I'll shoot it to you straight and look you in the eye. So gimme just a minute and I'll tell you why

I'm a rough boy.

Page 70: The emerging world of mongo db   csp

That's all folks!!

And thanks a lot for your attention

I'm coming soon..........