MongoDB dessi-codemotion

Post on 15-Jan-2015

4.520 views 2 download

Tags:

description

Codemotion 2010 MongoDB slides, examples with scala casbah salat

Transcript of MongoDB dessi-codemotion

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Speaker

Software architect/developer Pronetics/Sourcesense

Founder Spring Italian User Group

Chairman JugSardegna

Committer/Contributor OpenNMS - MongoDB

Author Spring 2.5 Aspect Oriented Programming

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

MongoDB Philosophy

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Some MongoDB production deployments

http://www.mongodb.org/display/DOCS/Production+Deployments

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Main Features● Document Oriented

- Documents (objects) map nicely to programming language data types

- Dynamically-typed (schemaless) for easy schema evolution

- No joins and no transactions for high performance

and easy scalability

● High Performance

- No joins and no transactions makes reads and writes fast

- Indexes with indexing into embedded documents and arrays

- Optional asynchronous writes

● Rich Query Language

● Easy scalability

- "slaveOK" reads are distributed over replicated servers

- Automatic sharding (auto-partitioning of data across servers)

- Reads and writes are distributed over shards

- No joins and no transactions make distributed queries easy and fast

● High Availability

- Replicated servers with automatic master failover

● Indexing

● Stored JavaScript

● Fixed-size collection

● File storage

● MapReduce

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

No Sql Injection

Mongo is invulnerable to injection attacks, no code execution

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Document as Basic unit of Data in BSON (Binary JSON) format{ "name" : "MongoDB",

"info" : { "storage" : "Binary JSON (BSON)",

"full index" : "true",

"scale" : "Autosharding",

"query" : "Rich document-base queries",

"replication" : "Replica sets",

"atomic modifiers" : "Fast in place update",

"binary content" : "GridFS",

"batch operation" : "Map/Reduce”,

"js server side" : ”true”

}

"greeting": {"international" : "Hello, world!", "italy" :"Ciao Mondo !" }

}, "_id” : "024x6f279578a64bb0666945"

Document: an Ordered set of keys with associated values

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Grouping

SQL Table contains Rows

MONGO Collection and subcollections contains Documents

* Document Limit: Larger than 4 Mb, the entire text of War and Peace is 3.14Mb

Collection are are created dynamically and automatically grow in size to fit additional data

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

A single instance of MongoDB can host

multiple independent databases, each of

which can have its own collections

and permissions.

Photo from http://www.aibento.net/

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

No Join's cost

● SQL SELECT * FROM posts

INNER JOIN posts_tags ON posts.id = posts_tags.post_id

INNER JOIN tags ON posts_tags.tag_id == tags.id

WHERE tags.text = 'politics' AND posts.vote_count > 10;

● MONGO db.posts.find({'tags': 'politics', 'vote_count': {'$gt': 10}});

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Collections Schema free

Documents within a single collection can have any number of different "shapes”

In theory, each document in a collection

can have a completely different structure;

in practice, a collection's documents

will be relatively uniform.

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Driver

C, C#, C++, Clojure, D, Delphi, Erlang,

Factor, Fantom, F#, Go, Groovy,

Haskell, Java, Javascript, Lua, Nodejs,

ObjectiveC, Perl, PHP, Python, R,

Ruby, Scala, Scheme (PLT), Smalltalk http://www.mongodb.org/display/DOCS/Drivers

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Common operationsFastest : Fire and Forget

Command with response : getLastError

Examples with

Server side Javascript via mongo shell

Java via MongoDB Official 10gen Driver

Scala via Casbah Official 10gen scala driver + Salat serializer

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Inserting

//mongo shell

db.dcComicsCollection.insert({"name" : "bruce", "surname" : "wayne", "alias" : "batman"})

//Java

Map fields = batman.toMap()

dcComicsCollection.insert(BasicDBObjectBuilder.start(fields).get()) //java driver

//scala

dcComicsCollection.insert(grater[BlogPost].asDBObject(post))

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Removing

//mongo shell

db.dcComicsCollection.delete({"alias" : "superman"})

//java

Map query = batman.getMap()

DBObject obj = BasicDBObjectBuilder.start(fields).get()

dcComicsColl.remove(obj);

//scala

dcComicsCollection.remove(grater[BlogPost].asDBObject(post)

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Updating (The schema can be changed dinamically)//mongo shell

var hero = db.dcComicsColl.findOne({"alias" : "batman"});

hero.gadget = {"car" : "batmobile”};

db.dcComicsColl.update({”aias” : ”batman"}, hero, true);

//java

DbObject query = BasicDBObjectBuilder.start().add("surname",”wayne”).get();

DbObject hero = BasicDBObjectBuilder.start().add("gadget",”batmobile”).get();

dcComicsCollection.update(query, hero, false, true);

//scala

val query = MongoDBObject("name" -> "bruce")

val hero = MongoDBObject("gadget" -> "batmobile")

dcComicsCollection.update(query, hero, true, false)

* The blue value it's the upsert, update or insert if not present

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Querying//mongo shell

db.dcComicsColl.find({"alias":"batman"}) // all field of the doc

db.dcComicsColl.find({"alias":"batman"},{”surname":1})//surname desc

//java

DBObject query = BasicDBObjectBuilder.start().add("alias", "batman").get();

DBObject out = BasicDBObjectBuilder.start().add("surname", "1").get()

DBCursor cursor = dcComicsColl.find(query, out);

//scala

val query = MongoDBObject("alias" -> "batman")

val obj = dcComicsColl.findOne(query)

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Modifier

Partial updates$set (set a key, or add if not present)

$inc (with numbers)

$push (add to the end of an array)

$ne

$addToSet

$each

$pop (remove from the end)

$pull (remove element that match criteria)

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Query Criteria

Conditionals$lt $lte $gt $gte

start = new Date("01/01/20111")

db.mycollection.find({"registered" : {"$lt" : start}})

db.dcComicsColl.find({

"surname":{"$ne":"parker"},

"name":{"$ne":"peter"}

})

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Query Criteria

Conditionals

$in $or, $nin, $not <, <=, >, >=, $ne, $mod, $all, $size, $exists

db.attendees.find({”external_num" : {"$nin" : [1356, 525, 874]}})

db.attendees.find({"$or" : [{"ticket_no" : 513}, {"name" : foo}]})

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Query Regex

Perl Compatible Regular Expression //mongo shell

db.mycollection.find({"name" : /foo?/i})

//java

String pattern = new StringBuilder().append(character).append("*").toString();

DbObject query =BasicDBObjectBuilder.start()

.add("surname", java.util.regex.Pattern.compile(pattern)).get()

List<DBObject> objects = coll.find(query).toArray();

//scala

coll.find(MongoDBObject("surname" -> ".*yne$".r))) {...

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Query on array and grouping

Array inside a document $all $size $slice

Grouping count distinct group finalize $key

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Javascript as part of a query

db.mycollection.find({

"$where" : function (){

for (var current in this) {

….

}

}

})

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Indexing

db.mycollection.ensureIndex({"name" : 1, "info" : 1 }

Geospatial indexing db.mycollection.ensureIndex({"gps" : "2d"})

db.star.trek.ensureIndex( {"light-years" : "2d"}, {"min”:-10, "max”:10})

Yes it's a Foursquare feature :)

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

GridFS

GridFS is a specification for storing large files such video, photos, blob in MongoDB .

GridFS uses two collections to store data:● files contains the object metadata ● chunks contains the binary chunks with some additional accounting

information

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Map Reducemap = function() {

for (var key in this) {

emit(key, {count : 1});

};

}

reduce = function(key, emits) {

total = 0;

for (var i in emits) {

total += emits[i].count;

}

return {"count" : total};

}

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Scaling

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Scaling

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Shard

Sharding = break up collections into smaller chunks

Splitting data and storing different portions of the data on different machines, also

know as partitioning

The chunks can be distributed across shards so that each shard is responsible for a subset of the total data set.

A shard is a container that holds a subset of a collection’s data. A shard is either a single mongod server (for development/testing) or a replica set (for production). Thus, even if there are many servers in a shard, there is only one master, and all of the servers contain the same data.

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Mongos

The client don't know what shard has what data, or even that data is broken up on different shards. In front of the shard run routing process called mongos.

This router know when the data are located and the client can see a normal mongod, like a noshard environment.

This is the router process and comes with all MongoDB distributions. It basically just routes requests and aggregates responses. It doesn’t store any data or config- uration information. (Although it does cache information from the config servers.)

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Config server

Config servers store the configuration of the cluster: which data is on which shard. Because mongos doesn’t store anything permanently, it needs somewhere to get the shard configuration. It syncs this data from the config servers.

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Replication

Replica set, clustering with automatic failover

Master is elected by the cluster and may change to another node if the current master goes down.

This election process will be initiated by any node that cannot reach the primary.

The highest-priority most-up-to-date server will become the new primary.

The replication is asynchronous

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Scala Webapp Demo

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Admin Interface

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Mongo Shell

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Mac Client MongoHub

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Books

http://www.mongodb.org/display/DOCS/Books

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

References

● MongoDB : http://www.mongodb.org/● Casbah: https://github.com/mongodb/casbah● Salat: https://github.com/novus/salat/● Slides scala code example: https://github.com/desmax74/codemotion-2010

Massimiliano Dessì – desmax74@yahoo.it – Jug Sardegna

Thanks !

Massimiliano Dessì

http://jroller.com/desmax

http://twitter.com/desmax74

http://www.linkedin.com/in/desmax74

http://wiki.java.net/bin/view/People/MassimilianoDessi

http://www.jugsardegna.org/vqwiki/jsp/Wiki?MassimilianoDessi