MongoDB: What, why, when. Solutions Architect, MongoDB Inc. Massimo Brignoli #mongodb.

Post on 18-Jan-2016


MongoDB: What, why, when.

Solutions Architect, MongoDB Inc.

Massimo Brignoli

#mongodb

Who Am I?

• Solutions Architect/Evangelist in MongoDB Inc.

• 24 years of experience in databases and software development

• Former MySQL employee

• Previous life: web, web, web

Innovation

Understanding Big Data – It’s Not Very “Big”

from Big Data Executive Summary – 50+ top executives from Government and F500 firms

64% - Ingest diverse, new data in real-time

15% - More than 100TB of data

20% - Less than 100TB (average of all? <20TB)

“I have not failed. I've just found 10,000 ways that won't work.” ― Thomas A. Edison

Back in 1970…Cars Were Great!

Lots of Great Innovations Since 1970

Would you use these technologies for your business today?

Including the Relational Database

Which computers was the relational model designed for?

So Were Computers!

And Storage!

RDBMS Makes Development Hard

Relational Database

Object Relational Mapping

Application

Code XML Config DB Schema

And Even Harder To Iterate

New Table

New Table

New Column

Name, Pet, Phone, Email

New Column

3 months later…

RDBMS

From Complexity to Simplicity

MongoDB

{
  _id: ObjectId("4c4ba5e5e8aabf3"),
  employee_name: "Dunham, Justin",
  department: "Marketing",
  title: "Product Manager, Web",
  report_up: "Neray, Graham",
  pay_band: "C",
  benefits: [
    { type: "Health", plan: "PPO Plus" },
    { type: "Dental", plan: "Standard" }
  ]
}
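To make the contrast with the relational layout concrete, here is a minimal Python sketch. It holds the employee document above as a plain dictionary and, for comparison, splits it into two row sets the way a normalized schema might. The two-table layout, the `employee_id` column, and the helper name `normalize` are assumptions made for illustration; they do not come from the talk.

```python
# Employee record from the slide, held as one nested document.
employee = {
    "_id": "4c4ba5e5e8aabf3",  # kept as a plain string in this sketch
    "employee_name": "Dunham, Justin",
    "department": "Marketing",
    "title": "Product Manager, Web",
    "report_up": "Neray, Graham",
    "pay_band": "C",
    "benefits": [
        {"type": "Health", "plan": "PPO Plus"},
        {"type": "Dental", "plan": "Standard"},
    ],
}


def normalize(doc):
    """Split one document into (employee_row, benefit_rows), as a
    hypothetical two-table relational schema would store it."""
    employee_row = {k: v for k, v in doc.items() if k != "benefits"}
    benefit_rows = [
        {"employee_id": doc["_id"], **benefit} for benefit in doc["benefits"]
    ]
    return employee_row, benefit_rows


employee_row, benefit_rows = normalize(employee)

# Document model: the benefits arrive together with the employee.
plans_from_document = [b["plan"] for b in employee["benefits"]]

# Relational model: the same answer needs a join on employee_id.
plans_from_join = [
    row["plan"]
    for row in benefit_rows
    if row["employee_id"] == employee_row["_id"]
]
```

Both lists hold the same plans; the difference is that the document version needs no second lookup.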

MongoDB

The leading NoSQL database

Document Database

Open-Source

General Purpose

7,000,000+ MongoDB Downloads

150,000+ Online Education Registrants

25,000+ MongoDB User Group Members

25,000+ MongoDB Days Attendees

20,000+ MongoDB Management Service (MMS) Users

Global Community

To provide the best database for how we build and run apps today

MongoDB Vision

Build
– New and complex data
– Flexible
– New languages
– Faster development

Run
– Big Data scalability
– Real-time
– Commodity hardware
– Cloud

Enterprise Big Data Stack

EDW, Hadoop

Management & Monitoring

Security & Auditing

RDBMS

CRM, ERP, Collaboration, Mobile, BI

OS & Virtualization, Compute, Storage, Network

RDBMS

Applications

Infrastructure

Data Management

Online Data Offline Data

Agile

MongoDB Overview

Scalable

Operational Database Landscape

Document Data Model

Relational MongoDB

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}

Document Model Benefits

• Agility and flexibility
  – Data models can evolve easily
  – Companies can adapt to changes quickly

• Intuitive, natural data representation
  – Developers are more productive
  – Many types of applications are a good fit

• Reduces the need for joins, disk seeks
  – Programming is simpler
  – Performance can be delivered at scale
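The benefits listed above can be illustrated with a small, self-contained Python sketch. It uses an in-memory list as a stand-in for a collection, so it shows the idea rather than the MongoDB API. The second person, the `find` helper, and the `car_models` helper are hypothetical and exist only for this example.

```python
# A toy in-memory "collection": documents need not share the same fields.
people = [
    {"first_name": "Paul", "surname": "Miller", "city": "London"},
]

# Later the application starts recording cars. Only new or updated
# documents carry the field; no collection-wide schema change is needed.
people.append(
    {
        "first_name": "Anna",  # hypothetical second person
        "surname": "Rossi",
        "city": "Milan",
        "cars": [{"model": "Bentley", "year": 1973}],
    }
)


def find(collection, **criteria):
    """Return documents whose top-level fields equal every criterion."""
    return [
        doc
        for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


def car_models(doc):
    """Read the embedded array directly; older documents simply have none."""
    return [car["model"] for car in doc.get("cars", [])]


londoners = find(people, city="London")
models_per_person = {p["first_name"]: car_models(p) for p in people}
```

The older document keeps working unchanged, and the embedded `cars` array is read without any join.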

Developers are more productive

Automatic Sharding

• Three types of sharding: hash-based, range-based, tag-aware

• Increase or decrease capacity as you go

• Automatic balancing
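The three sharding styles named above can be compared with a toy Python model. This is a sketch of the general idea only, not MongoDB's implementation: the shard names, split points, tag map, and the use of MD5 with a modulo step are all assumptions chosen for illustration.

```python
import bisect
import hashlib

SHARDS = ["shard-a", "shard-b", "shard-c"]  # hypothetical shard names


def hash_shard(key):
    """Hash-based: hash the shard key, then map the hash onto a shard.
    MD5 and the modulo step are stand-ins chosen for this sketch."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]


# Range-based: each shard owns a contiguous range of key values.
# Each entry is the exclusive upper bound of one of the first two ranges.
RANGE_BOUNDS = [1000, 2000]  # hypothetical split points


def range_shard(key):
    """Range-based: find the range that contains the key."""
    return SHARDS[bisect.bisect_right(RANGE_BOUNDS, key)]


# Tag-aware: documents carrying a tag are pinned to a designated shard.
TAG_TO_SHARD = {"EU": "shard-a", "US": "shard-b", "APAC": "shard-c"}  # hypothetical


def tag_shard(doc):
    """Tag-aware: route by the document's region tag."""
    return TAG_TO_SHARD[doc["region"]]
```

In this model, range-based placement keeps neighbouring keys on the same shard, which suits range queries, while hash-based placement scatters them, which tends to spread write load more evenly.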

Query Routing

• Multiple query optimization models

• Each sharding option appropriate for different apps
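As a rough illustration of query routing, the sketch below models a router in front of a range-sharded collection. A query that includes the shard key can be sent to one shard, while a query without it has to be sent to every shard. The shard key name `customer_id`, the shard names, and the split points are hypothetical.

```python
import bisect

SHARDS = ["shard-a", "shard-b", "shard-c"]  # hypothetical shard names
RANGE_BOUNDS = [1000, 2000]                 # hypothetical split points
SHARD_KEY = "customer_id"                   # hypothetical shard key field


def route(query):
    """Return the shards a router would contact for an equality query
    against a range-sharded collection."""
    if SHARD_KEY in query:
        # Targeted: the shard key pins the query to a single shard.
        index = bisect.bisect_right(RANGE_BOUNDS, query[SHARD_KEY])
        return [SHARDS[index]]
    # Scatter-gather: without the shard key, every shard is asked.
    return list(SHARDS)


targeted = route({"customer_id": 1500})
broadcast = route({"city": "London"})
```

This is why the choice of shard key matters per application: queries that carry the key stay on one shard, and the rest fan out.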

High Availability – Ensure application availability during many types of failures

Disaster Recovery – Address the RTO and RPO goals for business continuity

Maintenance – Perform upgrades and other maintenance operations with no application downtime

Availability Considerations

Replica Sets

• Replica Set – two or more copies

• “Self-healing” shard

• Addresses many concerns:

- High Availability

- Disaster Recovery

- Maintenance

Single Data Center

• Automated failover

• Tolerates server failures

• Tolerates rack failures

• Number of replicas defines failure tolerance
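The last point can be worked through with a few lines of Python. The sketch assumes every member of the replica set votes and that electing a primary requires a strict majority of members; the function names are made up for this example.

```python
def majority(members):
    """Votes needed to elect a primary: more than half of the members."""
    return members // 2 + 1


def fault_tolerance(members):
    """Members that can be lost while a majority can still be formed."""
    return members - majority(members)


for n in (3, 4, 5, 7):
    print(n, majority(n), fault_tolerance(n))
# prints:
# 3 2 1
# 4 3 1
# 5 3 2
# 7 4 3
```

Note that going from three to four members does not raise the tolerance, which is why odd member counts are the usual choice.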

Primary – A, Secondary – A, Secondary – A

Primary – B, Secondary – B, Secondary – B

Primary – C, Secondary – C, Secondary – C

Active/Standby Data Center

• Tolerates server and rack failure

• Standby data center

Data Center - West: Primary – A, Primary – B, Primary – C, Secondary – A, Secondary – B, Secondary – C

Data Center - East: Secondary – A, Secondary – B, Secondary – C

Active/Active Data Center

• Tolerates server, rack, data center failures, network partitions

Data Center - West

Primary – A Primary – B Primary – C

Secondary – A

Secondary – B

Secondary – C

Data Center - East

Secondary – A

Secondary – B

Secondary – C

Secondary – B

Secondary – C

Secondary – A

Data Center - Central

Arbiter – A Arbiter – B Arbiter – C

Global Data Distribution

[Diagram: one Primary and seven Secondaries, each labeled "Real-time"]

Read Global/Write Local

Primary:NYC

Secondary:NYC

Primary:LON

Primary:SYD

Secondary:LON

Secondary:NYC

Secondary:SYD

Secondary:LON

Secondary:SYD

Common Use Cases

High Volume Data Feeds

Machine Generated Data
• More machine forms, sensors & data
• Variably structured

Securities Data
• High frequency trading
• Daily closing price

Social Media / General Public
• Multiple data sources
• Each changes their format consistently
• Student Scores, ISP logs

Operational Intelligence

Ad Targeting
• Large volume of users
• Very strict latency requirements
• Sentiment Analysis

Real time dashboards
• Expose data to millions of customers
• Reports on large volumes of data
• Reports that update in real time

Social Media Monitoring
• Join the conversation
• Catered Games
• Customized Surveys

Metadata

Product Catalogue
• Diverse product portfolio
• Complex querying and filtering
• Multi-faceted product attributes

Data analysis
• Data mining
• Call records
• Insurance Claims

Biometric
• Retina Scans
• Fingerprints

Content Management

News Site
• Comments and user generated content
• Personalization of content and layout

Multi-device rendering
• Generate layout on the fly
• No need to cache static pages

Sharing
• Store large objects
• Simpler modeling of metadata

Questions?

Thanks!

@massimobrignoli

Massimo Brignoli

#MongoDB

Solutions Architect, MongoDB Inc.

massimo@mongodb.com