NO SQL Databases, Big Data and the cloud

Manu Cohen-Yashar The Cloud, Big Data and NoSQL


Summary on NOSQL databases.


Page 1: NO SQL Databases, Big Data and the cloud

Manu Cohen-Yashar

The Cloud, Big Data and NoSQL

Page 2: NO SQL Databases, Big Data and the cloud

Agenda

Data boom

Problems with RDBMS

No SQL

Big Data

What’s next

Page 4: NO SQL Databases, Big Data and the cloud

Understand NO SQL

Types of databases

Primary usage

Data model

Pros and Cons

Page 5: NO SQL Databases, Big Data and the cloud

Lots of Data

Data doubles every 18 months

Pictures

Web sites

Emails

Sensors

Geo Information

Financial Information

Science

Art

. . . (Infinite list)

Page 6: NO SQL Databases, Big Data and the cloud

No Limits

With the cloud it is now possible to provision a cluster of any size and run computations at any scale.

Whoever makes sense of all the available data will rule the world.

The conclusion:

Use the cloud to analyze data at large scale.

Page 7: NO SQL Databases, Big Data and the cloud

Let's talk about data

When we think of data we think of …

Page 8: NO SQL Databases, Big Data and the cloud

Data has many forms

Yet data comes in many forms and shapes

Graphs

Documents

Time Series

Blobs

Geo

Sensors

Unstructured

Structured

Web

Page 9: NO SQL Databases, Big Data and the cloud

Problems with RDBMS

Does not scale very well

Sharding

Replication

Models data according to the relational model

Is this the best model for all data types?

Complex and Expensive

Requires a DBA

Expensive to buy

Oracle

SQL

Page 10: NO SQL Databases, Big Data and the cloud

Non-Relational

Not all types of data fit well into the relational world.

Not all data use cases fit well into the ACID convention.

The relational model does not scale very well

Difficult to distribute

Difficult to replicate

Page 11: NO SQL Databases, Big Data and the cloud

The CAP Theorem

RDBMS

Replicated NoSQL

Sharded NoSQL

During a network partition, a distributed system must choose either Consistency or Availability.
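The choice stated above can be sketched with a toy model. This is only an illustration, assuming a hypothetical two-replica store (`TwoNodeStore`, `Replica` and the `"CP"`/`"AP"` modes are invented names, not any real database API): in CP mode a write during a partition is refused, in AP mode it is accepted and the replicas diverge.

```python
class Replica:
    def __init__(self):
        self.data = {}


class TwoNodeStore:
    """Hypothetical two-replica store, used only to illustrate the CAP choice."""

    def __init__(self, mode):
        assert mode in ("CP", "AP")
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # True while the replicas cannot reach each other

    def write(self, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse the write rather than let replicas diverge.
                raise RuntimeError("unavailable during partition")
            # Availability first: accept the write on the reachable replica only.
            self.a.data[key] = value
            return
        # No partition: both replicas receive the write.
        self.a.data[key] = value
        self.b.data[key] = value

    def consistent(self):
        return self.a.data == self.b.data


cp = TwoNodeStore("CP")
ap = TwoNodeStore("AP")
for store in (cp, ap):
    store.write("x", 1)       # healthy network: both replicas agree
    store.partitioned = True  # simulate a network partition

try:
    cp.write("x", 2)
    cp_available = True
except RuntimeError:
    cp_available = False

ap.write("x", 2)  # accepted, but replica b still holds the old value
```

After the partition, the CP store stays consistent but rejects the write, while the AP store stays available but its replicas no longer agree.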

Page 12: NO SQL Databases, Big Data and the cloud

NO SQL

Large family of databases

No Schema

No relations enforced

Designed for high scale and distribution

Types of NO SQL DB

Key Value

Wide Columns

Documents

Graph

Page 13: NO SQL Databases, Big Data and the cloud

Motivation for NO SQL

Large Scale and Distribution

Simplicity

Low cost

Good fit with the data model

Volume, Velocity and Variety

Page 14: NO SQL Databases, Big Data and the cloud

What Is No Schema

Some data is structured, and some is not.

No SQL databases do not ENFORCE a schema the way RDBMS systems do.

You can leverage data structure by creating indexes and smart queries.
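A minimal sketch of the idea on this slide, using plain Python dictionaries rather than any real database: records with different fields live side by side, and an index is built only over the records that have the field. The record contents and the `build_index` helper are hypothetical.

```python
from collections import defaultdict

# Records in the same collection do not have to share the same fields.
records = [
    {"id": 1, "name": "Alice", "city": "Haifa"},
    {"id": 2, "name": "Bob"},  # no "city" field, and nothing rejects it
    {"id": 3, "title": "Sensor A", "city": "Haifa", "readings": [20.5, 21.0]},
]


def build_index(items, field):
    """Map each value of `field` to the ids of the records that contain it."""
    index = defaultdict(list)
    for item in items:
        if field in item:  # records without the field are simply skipped
            index[item[field]].append(item["id"])
    return dict(index)


city_index = build_index(records, "city")
```

Here `city_index["Haifa"]` lists records 1 and 3, even though those two records otherwise have different shapes.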

Page 15: NO SQL Databases, Big Data and the cloud

Types of NO SQL Databases

Key values

Wide column

Document

Graph

Page 16: NO SQL Databases, Big Data and the cloud

Key values

Data is organized as key-value pairs

Query by key and values

Simple indexes (by partition key)

Examples

Azure Table Storage

Amazon DynamoDB

Key1 Key2 Value1 Value2 Value3 Value4 Value5

Israel 1234 1 2 3

France 2345 4 5 8
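The table above can be modeled as a dictionary keyed by the pair (Key1, Key2). The sketch below is an in-memory illustration, not the DynamoDB or Azure Table Storage API; the helper names are hypothetical, and which value column each number belongs to is an assumption, since the slide layout does not survive in the transcript.

```python
# (partition key, row key) -> attribute dictionary
table = {}


def put(partition_key, row_key, **values):
    table[(partition_key, row_key)] = dict(values)


def get(partition_key, row_key):
    """Point lookup by the full key; returns None when the key is absent."""
    return table.get((partition_key, row_key))


def query_partition(partition_key):
    """All rows that share a partition key, keyed by their row key."""
    return {rk: v for (pk, rk), v in table.items() if pk == partition_key}


# Column assignment is assumed: rows may populate different value columns.
put("Israel", 1234, Value1=1, Value2=2, Value3=3)
put("France", 2345, Value2=4, Value4=5, Value5=8)
```

A lookup such as `get("Israel", 1234)` touches a single entry, which is what keeps key-value stores simple to partition across servers.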

Page 17: NO SQL Databases, Big Data and the cloud

Demo

DynamoDB and Azure Tables

Page 18: NO SQL Databases, Big Data and the cloud

Wide column / Column Families

Data is organized as keys mapped to groups of values

Store data by column

A column family is how the data is stored on disk

Query by key / key range only

No indexes (on some databases)

Examples

Google Bigtable

Cassandra

HBase
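A rough sketch of "query by key / key range only", assuming a hypothetical in-memory `ColumnFamilyStore` that keeps row keys sorted. It does not reflect the actual storage engine of Bigtable, Cassandra or HBase; it only shows why key-range scans are cheap when rows are ordered by key.

```python
import bisect


class ColumnFamilyStore:
    """Hypothetical single column family: row key -> {column name: value}."""

    def __init__(self):
        self._keys = []  # row keys, kept in sorted order
        self._rows = {}  # row key -> columns of that row

    def put(self, row_key, column, value):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        return self._rows.get(row_key, {})

    def scan(self, start, stop):
        """Rows with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]


store = ColumnFamilyStore()
store.put("user:003", "name", "Carol")
store.put("user:001", "name", "Ann")
store.put("user:002", "name", "Bob")
store.put("user:001", "email", "ann@example.com")  # rows may have different columns

first_two = store.scan("user:001", "user:003")
```

There is no way to ask this store for "all rows where name is Bob" without scanning everything, which mirrors the "no indexes" limitation mentioned on the slide.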

Page 19: NO SQL Databases, Big Data and the cloud

Example – Cassandra Data Model

Column

Key value

Super Column

Collection of columns

Column Family

Dictionary of columns

Super Column Family

Dictionary of Column Families
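The nesting described on this slide can be pictured with nested dictionaries. This is only a mental model following the slide's wording, not Cassandra's API; the row keys, column names and values are made up.

```python
# Column: one name -> value pair (a single dictionary entry below).
# Column family: a dictionary of columns, kept per row key.
users = {
    "row1": {"name": "Ann", "email": "ann@example.com"},
    "row2": {"name": "Bob"},  # rows need not have the same columns
}

# Super column: a named collection of columns ("home", "work").
# Super column family: one more dictionary level around those collections.
addresses = {
    "row1": {
        "home": {"city": "Haifa", "zip": "12345"},
        "work": {"city": "Tel Aviv"},
    },
}

# Reading a value means walking the levels: row key -> super column -> column.
home_city = addresses["row1"]["home"]["city"]
```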

Page 20: NO SQL Databases, Big Data and the cloud

Demo

Cassandra

Page 21: NO SQL Databases, Big Data and the cloud

Document Database

Data is organized as key-document pairs

Query by key and document content

Use indexes

Examples

MongoDB

RavenDB

CouchDB / Couchbase
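To illustrate the three access paths listed on this slide (by key, by document content, and through an index), here is a small in-memory sketch. It is not the MongoDB, RavenDB or CouchDB API; the documents and the `find` and `build_index` helpers are hypothetical.

```python
# key -> document; documents are nested and may differ in shape
documents = {
    "order:1": {"customer": "Ann", "items": [{"sku": "A1", "qty": 2}], "status": "open"},
    "order:2": {"customer": "Bob", "items": [{"sku": "B7", "qty": 1}], "status": "shipped"},
    "order:3": {"customer": "Ann", "items": [], "status": "shipped"},
}


def find(predicate):
    """Query by content: full scan, returns matching keys in sorted order."""
    return sorted(key for key, doc in documents.items() if predicate(doc))


def build_index(field):
    """Secondary index: field value -> keys of the documents holding it."""
    index = {}
    for key, doc in documents.items():
        index.setdefault(doc.get(field), []).append(key)
    return index


by_key = documents["order:2"]                        # query by key
shipped = find(lambda d: d["status"] == "shipped")   # query by content
by_customer = build_index("customer")                # index lookup avoids the scan
```

The index trades extra storage and write work for faster reads, which is the usual reason document databases let you declare indexes on selected fields.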

Page 22: NO SQL Databases, Big Data and the cloud

Demo

Page 23: NO SQL Databases, Big Data and the cloud

Graph databases

Data is organized as elements and relations.

Query by relations

Supports complex graph-theoretic calculations

Examples

Neo4j

Stardog (used for the semantic web)
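A small sketch of "query by relations", assuming a hypothetical edge list of (source, relation, target) tuples. It is plain Python, not the query language of Neo4j or Stardog; it shows a breadth-first search that follows one relation type to find the shortest chain between two elements.

```python
from collections import deque

# (source, relation, target); names and relations are made up for illustration
edges = [
    ("Ann", "KNOWS", "Bob"),
    ("Bob", "KNOWS", "Carol"),
    ("Carol", "KNOWS", "Dan"),
    ("Ann", "WORKS_AT", "Acme"),
]


def neighbors(node, relation):
    return [dst for src, rel, dst in edges if src == node and rel == relation]


def shortest_path(start, goal, relation):
    """Breadth-first search along edges of one relation type."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1], relation):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal not reachable through this relation


chain = shortest_path("Ann", "Dan", "KNOWS")
```

In a relational database the same question needs one self-join per hop, which is why relation-heavy queries are the typical use case for graph databases.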

Page 24: NO SQL Databases, Big Data and the cloud

RDF and OWL

Triple

Subject - Predicate - Object

Defines facts

RDF (Resource Description Framework)

Defines some extra structure to triples.

Example: "rdf:type" is used to say that things are of certain types.

Schema: Defines some classes which represent the concepts of subjects, objects, predicates, etc.

Enables making statements about classes of things and types of relationships.

OWL (Web Ontology Language)

Adds semantics to the schema.

Expressed in triples.

Example: If "A isMarriedTo B", then this implies "B isMarriedTo A".
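The marriage example can be sketched with triples held in a Python set. This is an illustration, not an RDF library or a real reasoner: the `ex:` names are hypothetical, and `infer_symmetric` implements only the one rule from the slide, using `owl:SymmetricProperty` to mark the predicate as symmetric.

```python
# Each fact is a (subject, predicate, object) triple.
triples = {
    ("ex:Alice", "rdf:type", "ex:Person"),
    ("ex:Bob", "rdf:type", "ex:Person"),
    ("ex:Alice", "ex:isMarriedTo", "ex:Bob"),
    # The semantics are themselves expressed as a triple:
    ("ex:isMarriedTo", "rdf:type", "owl:SymmetricProperty"),
}


def infer_symmetric(facts):
    """Add (o, p, s) for every (s, p, o) whose predicate is declared symmetric."""
    symmetric = {
        s for s, p, o in facts
        if p == "rdf:type" and o == "owl:SymmetricProperty"
    }
    inferred = {(o, p, s) for s, p, o in facts if p in symmetric}
    return facts | inferred


closed = infer_symmetric(triples)
```

The triple stating that Bob is married to Alice was never asserted; it appears in `closed` only because the schema says the predicate is symmetric.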

Page 25: NO SQL Databases, Big Data and the cloud

Demo

Page 27: NO SQL Databases, Big Data and the cloud

There is no single NO SQL solution for all use cases

Important

There are more than 150 possible offerings…

Page 28: NO SQL Databases, Big Data and the cloud

Replication and Sharding

No SQL databases can span a large cluster

Replication

Copy the data to multiple servers

Usually each data element is copied 3 times

One master, two slaves

Result: High Availability

Sharding

Split the data between servers

Horizontal partitioning of the data

Result: Horizontal scale

Replication and Sharding can be done together
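Combining the two can be sketched as follows. This is a simplified placement scheme under stated assumptions (a fixed, hypothetical list of six servers, hash-modulo sharding, and replicas placed on the next servers in the list); real databases use more elaborate schemes and handle servers joining and leaving.

```python
import hashlib

SERVERS = ["s0", "s1", "s2", "s3", "s4", "s5"]  # hypothetical cluster
REPLICAS = 3  # one master plus two slaves, as described above


def shard_of(key):
    """Sharding: hash the key to pick the server that owns it."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % len(SERVERS)


def placement(key):
    """Replication: the owning server first, then the next servers in the list."""
    first = shard_of(key)
    return [SERVERS[(first + i) % len(SERVERS)] for i in range(REPLICAS)]


master, *slaves = placement("user:42")
```

Sharding spreads different keys over different servers (horizontal scale), while each key still lives on three distinct servers (high availability).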

Page 29: NO SQL Databases, Big Data and the cloud

The Cloud and NO SQL

All Cloud Providers have NO SQL solutions

Azure Tables

Google Bigtable

Amazon DynamoDB

NO SQL Databases are deployed on a cluster

There are a large number of cloud hosting offerings for NoSQL clusters

MongoHQ (MongoDB)

Cassandra on Google Compute Engine

Many more

Page 30: NO SQL Databases, Big Data and the cloud

Example – Mongo in Azure

Page 32: NO SQL Databases, Big Data and the cloud

Check your schema

Be open to using NO-SQL data stores

Identify your use case and find the right database for you

Create a simple POC

Page 33: NO SQL Databases, Big Data and the cloud

Questions