Talk MongoDB - Amil

{ name : 'Marcelo Cenerino', company : 'Amil', date : '2013-10-30T08:30:00.000Z' }

Description

Overview of the main MongoDB features.

Transcript of Talk MongoDB - Amil

Page 1: Talk MongoDB - Amil

{ name : 'Marcelo Cenerino', company : 'Amil', date : '2013-10-30T08:30:00.000Z' }

Page 2: Talk MongoDB - Amil

What is MongoDB?

Page 3: Talk MongoDB - Amil

Here’s a definition:

MongoDB (from humongous) is an open-source, high-performance, scalable, general purpose database. It is used by organizations of all sizes to power online applications where low latency and high availability are critical requirements of the system.

Page 4: Talk MongoDB - Amil

CAP theorem (consistency, availability, partition tolerance)

“You can have at most two of these properties for any shared-data system.” Dr. Eric A. Brewer, 2000

Page 5: Talk MongoDB - Amil

Main characteristics

• Document based
• Schemaless
• Open source (on GitHub)
• High performance
• Horizontally scalable
• Full featured

Page 6: Talk MongoDB - Amil

Customers

eBay, Ericsson, EA, SAP, Telefonica, Code School, Abril...

Page 7: Talk MongoDB - Amil

MongoDB vs. RDBMS

Page 8: Talk MongoDB - Amil

RDBMS: data is structured as tables

id   nome    idade   genero
1    João    25      Masculino
2    Maria   30      Feminino
3    Pedro   40      Masculino
...  ...     ...     ...

Page 9: Talk MongoDB - Amil

Document oriented???

Page 10: Talk MongoDB - Amil

Document oriented

MongoDB stores data as documents in a binary representation called BSON (Binary JSON).

{
    _id : 1,
    nome : 'João',
    idade : 25,
    genero : 'Masculino'
}

Maximum document size: 16 MB

Page 11: Talk MongoDB - Amil

RDBMS vs. MongoDB

RDBMS       MongoDB
Table       Collection
Row         Document
Column      Field
Index       Index
Joins       Embedded doc.
FK          Reference
Partition   Shard

Page 12: Talk MongoDB - Amil

Transaction Model

MongoDB guarantees atomic updates to data at the document level.

Page 13: Talk MongoDB - Amil

Relational schema design

Page 14: Talk MongoDB - Amil

Relational schema design

• Large ERD diagrams

• Create table statements

• ORM to map tables to objects

• Tables just to join tables together

• Lots of revision and alter table statements until we get it just right

Page 15: Talk MongoDB - Amil

In a MongoDB-based app, we start building the app and let the schema evolve.

Page 16: Talk MongoDB - Amil

Mongo “schema” design

User: name, email
Article: title, date, text, author, Comment[], Tag[]
Comment: author, date, text
Tag: value
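One way to read the diagram is as a single article document with its comments and tags embedded. The following is a minimal sketch in plain JavaScript: only the field names come from the slide, while the sample values and the choice to store the author as a reference to a user's _id are assumptions made for illustration.

```javascript
// Hypothetical sample data; only the field names come from the slide.
const user = { _id: 1, name: 'marcelo', email: 'marcelo@example.com' };

const article = {
  title: 'Getting started with MongoDB',
  date: new Date('2013-10-30T08:30:00.000Z'),
  text: 'Article body...',
  author: user._id, // reference to a User (the "FK" equivalent)
  comments: [       // Comment[] embedded in the article
    { author: 'maria', date: new Date('2013-10-31T10:00:00.000Z'), text: 'Nice talk!' }
  ],
  tags: [{ value: 'mongodb' }, { value: 'nosql' }] // Tag[] embedded as well
};

// Everything needed to render the article is read from one document, no joins.
const tagValues = article.tags.map(t => t.value);
console.log(article.comments.length, tagValues.join(','));
```

Embedding suits data that is always read together with its parent; a reference, as assumed here for the author, suits data that is shared across many documents.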

Page 17: Talk MongoDB - Amil

Getting started with MongoDB

Page 18: Talk MongoDB - Amil
Page 19: Talk MongoDB - Amil

> mongod

> mongo

Page 20: Talk MongoDB - Amil

Basic CRUD operations

Page 21: Talk MongoDB - Amil

Inserting a document

> user = {name : 'marcelo', age : 29, gender : 'Male'}
> db.users.insert(user)

• No collection creation needed!

Page 22: Talk MongoDB - Amil

Querying a document

> db.users.findOne()
{ "_id" : ObjectId("5269d66271de67aa7c3c41b4"), "name" : "marcelo", "age" : 29, "gender" : "male" }

• _id is the primary key in MongoDB
• Automatically indexed
• Automatically created as an ObjectId if not provided
• Any unique immutable value could be used
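An automatically generated ObjectId is a 12-byte value whose first 4 bytes hold the creation time in seconds since the Unix epoch. As a small illustration, the sketch below decodes that timestamp from the hex string shown in the shell output; the helper name objectIdToDate is hypothetical and is not part of any MongoDB API.

```javascript
// Extract the creation time embedded in an ObjectId's first 4 bytes.
// objectIdToDate is a hypothetical helper, not a MongoDB function.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('expected a 24-character hex string');
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // seconds since the Unix epoch
  return new Date(seconds * 1000);
}

const created = objectIdToDate('5269d66271de67aa7c3c41b4');
console.log(created.toISOString());
```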

Page 23: Talk MongoDB - Amil

Querying a document

> db.users.find({name : 'maria', age : {$gt : 25}})
{ "_id" : ObjectId("526f1af1dac0a62cdc152a96"), "name" : "maria", "age" : 26 }

Page 24: Talk MongoDB - Amil

Operators

Group        Operators
Comparison   $gt, $gte, $in, $lt, $lte, $ne, $nin
Logical      $or, $and, $not, $nor
Element      $exists, $type
Evaluation   $mod, $regex, $where
Geospatial   $geoWithin, $geoIntersects, $near, $nearSphere
Array        $all, $elemMatch, $size
Projection   $, $elemMatch, $slice
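To make the filter semantics concrete, here is a toy matcher for a few of the comparison operators, written in plain JavaScript over an in-memory array. It is a sketch of the idea only, not how MongoDB evaluates queries; the matches function and the third sample user are assumptions for illustration.

```javascript
// Toy matcher for a few comparison operators; not MongoDB's implementation.
const comparators = {
  $gt: (a, b) => a > b,
  $gte: (a, b) => a >= b,
  $lt: (a, b) => a < b,
  $lte: (a, b) => a <= b,
  $ne: (a, b) => a !== b,
  $in: (a, list) => list.includes(a),
  $nin: (a, list) => !list.includes(a)
};

function matches(doc, filter) {
  // Every field condition in the filter must hold (an implicit AND).
  return Object.entries(filter).every(([field, condition]) => {
    const value = doc[field];
    const isOperatorObject =
      condition !== null && typeof condition === 'object' && !Array.isArray(condition);
    if (!isOperatorObject) {
      return value === condition; // plain equality, e.g. {name : 'maria'}
    }
    return Object.entries(condition).every(([op, operand]) => {
      if (!(op in comparators)) throw new Error('unsupported operator: ' + op);
      return comparators[op](value, operand);
    });
  });
}

// Hypothetical sample data.
const users = [
  { name: 'maria', age: 26 },
  { name: 'marcelo', age: 29 },
  { name: 'pedro', age: 20 }
];

const found = users.filter(u => matches(u, { name: 'maria', age: { $gt: 25 } }));
console.log(found);
```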

Page 25: Talk MongoDB - Amil

Updating a document

> db.users.find({age : {$gt : 25}}, {_id : 0})
{ "name" : "maria", "age" : 26 }
{ "name" : "marcelo", "age" : 29 }
> db.users.update({age : {$gt : 25}}, {$set : {roles : ['admin', 'dev', 'operator']}})

Page 26: Talk MongoDB - Amil

Removing a document

> db.users.remove({name : 'maria'})
> db.users.find().pretty()
{
    "_id" : ObjectId("526f1cb3dac0a62cdc152a98"),
    "age" : 29,
    "name" : "marcelo",
    "roles" : [
        "admin",
        "dev",
        "operator"
    ]
}

Page 27: Talk MongoDB - Amil

Indexing

Page 28: Talk MongoDB - Amil

Querying a large collection without index

> db.estabelecimentos.count()
307929
> db.estabelecimentos.find({'localizacao.cidade' : 'PIACATU'}).explain()
{
    "cursor" : "BasicCursor",
    "isMultiKey" : false,
    "n" : 5,
    "nscannedObjects" : 307929,
    "nscanned" : 307929,
    "nscannedObjectsAllPlans" : 307929,
    "nscannedAllPlans" : 307929,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 1,
    "nChunkSkips" : 0,
    "millis" : 311,
    "indexBounds" : {

    },
    "server" : "Cenerino-PC:27017"
}

Page 29: Talk MongoDB - Amil

Showing collection’s indexes

> db.estabelecimentos.getIndexes()
[
    {
        "v" : 1,
        "key" : { "_id" : 1 },
        "ns" : "mapa-servicos.estabelecimentos",
        "name" : "_id_"
    }
]

Page 30: Talk MongoDB - Amil

Creating an index

> // creating an index
> db.estabelecimentos.ensureIndex({'localizacao.cidade' : 1})
> db.estabelecimentos.getIndexes()
[
    {
        "v" : 1,
        "key" : { "_id" : 1 },
        "ns" : "mapa-servicos.estabelecimentos",
        "name" : "_id_"
    },
    {
        "v" : 1,
        "key" : { "localizacao.cidade" : 1 },
        "ns" : "mapa-servicos.estabelecimentos",
        "name" : "localizacao.cidade_1"
    }
]

Page 31: Talk MongoDB - Amil

Same query, now using index

> db.estabelecimentos.find({'localizacao.cidade' : 'PIACATU'}).explain()
{
    "cursor" : "BtreeCursor localizacao.cidade_1",
    "isMultiKey" : false,
    "n" : 5,
    "nscannedObjects" : 5,
    "nscanned" : 5,
    "nscannedObjectsAllPlans" : 5,
    "nscannedAllPlans" : 5,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "localizacao.cidade" : [ [ "PIACATU", "PIACATU" ] ]
    },
    "server" : "Cenerino-PC:27017"
}
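The drop in nscanned from the full collection to just the matching entries can be imitated in a few lines. The sketch below compares a full scan with a lookup through a sorted array searched by binary search, which stands in for the B-tree; the data set, field name and function names are hypothetical, and a real index is considerably more involved.

```javascript
// Hypothetical data set: 1000 documents spread over 200 city names.
const docs = [];
for (let i = 0; i < 1000; i++) {
  docs.push({ _id: i, cidade: 'CITY_' + String(i % 200).padStart(3, '0') });
}

// Without an index: every document has to be examined.
function collectionScan(city) {
  let scanned = 0;
  const result = [];
  for (const doc of docs) {
    scanned++;
    if (doc.cidade === city) result.push(doc);
  }
  return { n: result.length, nscanned: scanned };
}

// A simple stand-in for a B-tree: (key, position) entries sorted by key.
const index = docs
  .map((doc, pos) => ({ key: doc.cidade, pos }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

function indexScan(city) {
  // Binary search for the first entry whose key is >= city.
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key < city) lo = mid + 1;
    else hi = mid;
  }
  // Walk only the matching index entries and fetch their documents.
  let scanned = 0;
  const result = [];
  while (lo < index.length && index[lo].key === city) {
    result.push(docs[index[lo].pos]);
    scanned++;
    lo++;
  }
  return { n: result.length, nscanned: scanned };
}

console.log(collectionScan('CITY_042')); // examines all 1000 documents
console.log(indexScan('CITY_042'));      // examines only the matching entries
```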

Page 32: Talk MongoDB - Amil

Geospatial index

> db.estabelecimentos.ensureIndex({'localizacao.coordenadas' : '2dsphere'})
> db.estabelecimentos.getIndexes()
[
    {
        "v" : 1,
        "key" : { "_id" : 1 },
        "ns" : "mapa-servicos.estabelecimentos",
        "name" : "_id_"
    },
    {
        "v" : 1,
        "key" : { "localizacao.cidade" : 1 },
        "ns" : "mapa-servicos.estabelecimentos",
        "name" : "localizacao.cidade_1"
    },
    {
        "v" : 1,
        "key" : { "localizacao.coordenadas" : "2dsphere" },
        "ns" : "mapa-servicos.estabelecimentos",
        "name" : "localizacao.coordenadas_2dsphere"
    }
]

Page 33: Talk MongoDB - Amil

Geospatial index

> lng = -46.80208830000004
> lat = -23.515985699999998
> distance = 30 / 6378.137 // 30 km converted to radians (Earth radius in km)
> db.estabelecimentos.find({ "localizacao.coordenadas" : { "$nearSphere" : [lng , lat] , "$maxDistance" : distance}}).limit(50)

http://mapa-servicos-publicos.herokuapp.com/
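Dividing 30 by 6378.137 turns a distance in kilometres into an angle in radians on a sphere with the Earth's equatorial radius. The sketch below shows that conversion together with a haversine check of whether a point falls inside the radius. It is a rough illustration of the geometry rather than the server's algorithm, and the candidate point is a hypothetical location.

```javascript
const EARTH_RADIUS_KM = 6378.137; // same constant used on the slide

function kmToRadians(km) {
  return km / EARTH_RADIUS_KM;
}

// Great-circle angle between two [lng, lat] points (haversine formula), in radians.
function angularDistance([lng1, lat1], [lng2, lat2]) {
  const toRad = deg => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * Math.asin(Math.min(1, Math.sqrt(a)));
}

const center = [-46.80208830000004, -23.515985699999998]; // from the slide
const candidate = [-46.6333, -23.5505];                   // hypothetical point
const maxDistance = kmToRadians(30);

const angle = angularDistance(center, candidate);
console.log((angle * EARTH_RADIUS_KM).toFixed(1) + ' km', angle <= maxDistance);
```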

Page 34: Talk MongoDB - Amil

Aggregation Framework

Page 35: Talk MongoDB - Amil

Aggregation Framework

Pipeline Operators: $project, $match, $limit, $skip, $unwind, $group, $sort, $geoNear

Page 36: Talk MongoDB - Amil

Aggregation Framework

> db.estabelecimentos.aggregate([{$match : {'localizacao.uf' : 'SP'}}, {$group : {_id : '$localizacao.cidade', qtd : {$sum : 1}}}, {$sort : {qtd : -1}}, {$limit : 3}])
{
    "result" : [
        { "_id" : "SAO PAULO", "qtd" : 6930 },
        { "_id" : "CAMPINAS", "qtd" : 881 },
        { "_id" : "GUARULHOS", "qtd" : 666 }
    ],
    "ok" : 1
}
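Each stage of the pipeline takes the output of the previous one. The same four stages can be mimicked over an in-memory array, as in this sketch; the sample documents and their counts are made up for illustration, and the code only mirrors the meaning of the stages, not how the server executes them.

```javascript
// Hypothetical sample documents shaped like the collection on the slide.
const counts = { 'SAO PAULO': 4, CAMPINAS: 3, GUARULHOS: 2, SANTOS: 1 };
const estabelecimentos = [{ localizacao: { uf: 'RJ', cidade: 'NITEROI' } }];
for (const [cidade, n] of Object.entries(counts)) {
  for (let i = 0; i < n; i++) {
    estabelecimentos.push({ localizacao: { uf: 'SP', cidade } });
  }
}

// $match : keep only documents from SP
const matched = estabelecimentos.filter(d => d.localizacao.uf === 'SP');

// $group : one bucket per city; qtd : {$sum : 1} counts its documents
const groups = new Map();
for (const d of matched) {
  const key = d.localizacao.cidade;
  groups.set(key, (groups.get(key) || 0) + 1);
}

// $sort (qtd descending) followed by $limit : 3
const result = [...groups.entries()]
  .map(([_id, qtd]) => ({ _id, qtd }))
  .sort((a, b) => b.qtd - a.qtd)
  .slice(0, 3);

console.log(result);
```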

Page 37: Talk MongoDB - Amil

Mongo Driver for Java

Page 38: Talk MongoDB - Amil

Spring Data MongoDB

http://projects.spring.io/spring-data-mongodb/

Page 39: Talk MongoDB - Amil

Replica Sets

Page 40: Talk MongoDB - Amil

Replica Sets

• A replica set is a group of mongod instances that host the same data set

• Replication provides redundancy and increases data availability

• The primary accepts all write operations from clients (only one primary allowed)

• Replication can be used to increase read capacity

• Asynchronous replication

• Automatic failover

Page 41: Talk MongoDB - Amil

Replica Sets

Page 42: Talk MongoDB - Amil

Replica Sets

Page 43: Talk MongoDB - Amil

Sharding

Page 44: Talk MongoDB - Amil

Sharding

Issues of scaling:

• High query rates can exhaust the CPU capacity of the server

• Larger data sets exceed the storage capacity of a single machine

• Working set sizes larger than the system’s RAM stress the I/O capacity of disk drives

Vertical scaling vs. sharding

Page 45: Talk MongoDB - Amil

Sharding

• Sharding is the process of storing data across multiple machines

• Each shard is an independent database, and collectively, the shards make up a single logical database

Horizontally Scalable

Page 46: Talk MongoDB - Amil

Sharded clusters

Page 47: Talk MongoDB - Amil

Range Based Sharding

• Supports more efficient range queries

• Can result in an uneven distribution of data

• Monotonically increasing keys should be avoided

Page 48: Talk MongoDB - Amil

Hash Based Sharding

Ensures an even distribution of data at the expense of efficient range queries
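The trade-off between the two strategies shows up clearly with monotonically increasing keys. In the sketch below, new keys all land on the last shard under range-based placement, while a hash spreads them out. The shard count, range boundaries and multiplicative hash are assumptions chosen for illustration; the hash is a stand-in and not the function MongoDB uses.

```javascript
const SHARDS = 4;

// Range based: each shard owns a contiguous key range (hypothetical boundaries).
const upperBounds = [1000, 2000, 3000, Infinity];
function rangeShard(key) {
  return upperBounds.findIndex(bound => key < bound);
}

// Hash based: a multiplicative hash stands in for the real hashing function.
function hashShard(key) {
  const hashed = Math.imul(key, 2654435761) >>> 0; // unsigned 32-bit hash
  return Math.floor((hashed / 2 ** 32) * SHARDS);  // map it onto 0..SHARDS-1
}

function distribution(keys, shardOf) {
  const perShard = new Array(SHARDS).fill(0);
  for (const key of keys) perShard[shardOf(key)]++;
  return perShard;
}

// Monotonically increasing keys, e.g. a counter that has already passed 3000.
const newKeys = Array.from({ length: 1000 }, (_, i) => 3000 + i);

console.log('range:', distribution(newKeys, rangeShard)); // all on the last shard
console.log('hash: ', distribution(newKeys, hashShard));  // spread across shards
```

With hashing, neighbouring keys end up on different shards, which is why a range query then has to consult several shards instead of one.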

Page 49: Talk MongoDB - Amil

https://education.mongodb.com/

Page 50: Talk MongoDB - Amil

Books

Page 51: Talk MongoDB - Amil

db.audience.find({'question' : true})

Page 52: Talk MongoDB - Amil

Thanks everyone!

Hope you’ve enjoyed it.