NoSQL on Microsoft Azure: An introductiondownload.microsoft.com/download/5/9/E/59E3DFD2-ABD... ·...

Page 4:

The world today

Data is as critical as ever
- It’s what the people who pay us care most about

Data is much more plentiful
- Storage costs are lower
- There are bigger data sources:
  - Web-scale applications
  - Internet of Things (IoT)
  - More

New data technologies abound
- NoSQL
- Big data analytics
- Search

Our field was originally called data processing

This isn’t the post-SQL era, but it is the SQL+ era

Page 5:

Illustrating the intersection

[Diagram labels: On-Premises, Cloud, SQL, SQL+]

Page 6:

- NoSQL is varied.
- NoSQL is buzzy and growing.
- It’s a crowded space.
- There are many on-premises options, but significantly fewer fully managed services.
- The audience can differ from the RDBMS audience.
- You can store any data in any store.
- Some stores are better suited for certain types of data.

Page 8:

[Azure services diagram. Section labels: Platform Services, Security & Management, Infrastructure Services. Services shown: Web Apps, Mobile Apps, API Management, API Apps, Logic Apps, Notification Hubs, Content Delivery Network (CDN), Media Services, HDInsight, Machine Learning, Stream Analytics, Data Factory, Event Hubs, Mobile Engagement, Active Directory, Multi-Factor Authentication, Automation, Portal, Key Vault, BizTalk Services, Hybrid Connections, Service Bus, Storage Queues, Store/Marketplace, Hybrid Operations, Backup, StorSimple, Site Recovery, Import/Export, SQL Database, DocumentDB, Redis Cache, Search, Tables, SQL Data Warehouse, Azure AD Connect Health, AD Privileged Identity Management, Operational Insights, Cloud Services, Batch, Remote App, Service Fabric, Visual Studio, Application Insights, Azure SDK, Team Project, VM Image Gallery & VM Depot, Azure Data Lake Store, Azure Data Lake Analytics]

Page 9:

A summary

| | Operational Data | Analytical Data |
|---|---|---|
| NoSQL Technologies | Key/Value Store (Tables, Riak, …); Document Store (DocumentDB, MongoDB, …); Column Family Store (HBase, Cassandra, …) | Big Data Analytics (HDInsight, Hadoop, Azure Data Lake) |
| SQL Technologies | Relational Database (SQL Database, SQL Server, Oracle, MySQL, …) | Relational Analytics (SQL Server, Oracle, MySQL, …) |

Legend: managed service provided by Azure; software that can run in Azure virtual machines

Page 10:

A relational data service

[Diagram: an application sends SQL queries to SQL Database, which stores data in tables. The example table has columns (name: type) ID: int, Name: char, LastUse: date, Country: char, Age: int. ID is the primary key, with values 1, 3, 2, 7.]

Page 11:

Sharding

SQL Database Elastic Scale now supports sharding

[Diagram: a single database holding the rows Adam, Andrew, Anusha, Bertrand, Bill, Carl, Catherine, Cynthia is split into a sharded database made up of Shard 1, Shard 2, and Shard 3]

Atomic transactions typically span only a single shard

Page 14:

| | SQL Database |
|---|---|
| Category | Relational |
| Storage Abstractions | Tables, rows, columns |
| Transaction Support | All rows and tables in a database |
| Secondary Indexes | Yes |
| Pricing | Units of throughput |
| Stored Procedures/Triggers | Written in T-SQL |
| Query Language | SQL |
| Maximum Database Size | 1 TB; with Elastic Scale, 100s of TBs |

Page 15:

To scale for lots of users and lots of data
- Pros: NoSQL technologies can offer more scalability than relational databases
- Cons: Often lose some benefits of relational databases, e.g., database-wide transactions

To work better with different data formats, e.g., JSON
- Pros: Avoiding object/relational mapping makes code easier to write
- Cons: Limited BI tools; persistent data designed for a single application is harder to share

To work with data in a more flexible way
- Pros: NoSQL technologies don’t have fixed schemas
- Cons: Fixed schemas help prevent errors

Page 16:

A document store

[Diagram: an application sends requests to DocumentDB, which holds collections of JSON documents]

Document 1
{ "name": "John", "country": "Canada", "age": 43, "lastUse": "March 4, 2014" }

Document 2
{ "name": "Eva", "country": "Germany", "age": 25 }

Document 3
{ "name": "Lou", "country": "Australia", "age": 51, "firstUse": "May 8, 2013" }

Document 4
{ "docCount": 3, "last": "May 1, 2014" }

{…}

Page 17:

Ways to work with data

RESTful access methods
- For Create/Read/Update/Delete (CRUD) operations

DocumentDB SQL
- A query language with SQL-derived syntax
- Example:

SELECT c.age
FROM customers c
WHERE c.name = "Lou"

Executing logic in the database
- Stored procedures
- Triggers
- User-defined functions (UDFs), which allow extending DocumentDB SQL
- All written in JavaScript
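To make the semantics of the example query concrete, here is a minimal sketch in plain JavaScript that applies the same filter and projection to an in-memory array. The array only stands in for a collection and reuses the sample documents from the document store slide; `selectAgeByName` is a hypothetical helper, not part of DocumentDB or its SDKs.

```javascript
// In-memory stand-in for a "customers" collection (sample documents from the slides).
const customers = [
  { name: "John", country: "Canada", age: 43, lastUse: "March 4, 2014" },
  { name: "Eva", country: "Germany", age: 25 },
  { name: "Lou", country: "Australia", age: 51, firstUse: "May 8, 2013" },
  { docCount: 3, last: "May 1, 2014" } // documents need not share a schema
];

// Hypothetical helper mirroring:
//   SELECT c.age FROM customers c WHERE c.name = "Lou"
function selectAgeByName(collection, name) {
  return collection
    .filter((c) => c.name === name)  // WHERE c.name = <name>
    .map((c) => ({ age: c.age }));   // SELECT c.age
}

console.log(selectAgeByName(customers, "Lou"));
```

Documents without a `name` property, such as the fourth one, simply fail the filter; no schema check is involved.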

Page 18:

With Node.js

[Diagram: a JavaScript application in a web browser and native apps on a phone or tablet exchange JSON with Node.js JavaScript server code running in Web Apps on Microsoft Azure. The server code sends JSON requests to DocumentDB, which stores JSON documents in a collection.]

Page 19:

Sharding and transactions

[Diagram: a database contains several collections, each holding JSON documents]

- The unit of sharding is a collection
- Atomic transactions can span only a single collection

Page 20:

Replication and consistency

[Diagram: within a database, Shard A has one primary replica and two secondary replicas]

- Replication can improve performance and availability
- A write to the primary replica takes time to propagate to the secondaries
- What does a reader see?
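The question of what a reader sees can be illustrated with a toy model. The sketch below is a hypothetical in-memory simulation, not how DocumentDB is implemented: `ReplicatedShard` and its methods are invented for illustration, and propagation is triggered by hand rather than by a network.

```javascript
// Hypothetical model of one shard with a primary and several secondary replicas.
class ReplicatedShard {
  constructor(secondaryCount) {
    this.primary = new Map();
    this.secondaries = Array.from({ length: secondaryCount }, () => new Map());
    this.pending = []; // writes accepted by the primary but not yet propagated
  }

  // A write lands on the primary first.
  write(key, value) {
    this.primary.set(key, value);
    this.pending.push([key, value]);
  }

  // Simulates the delayed propagation step to every secondary.
  propagate() {
    for (const [key, value] of this.pending) {
      for (const replica of this.secondaries) replica.set(key, value);
    }
    this.pending = [];
  }

  readFromPrimary(key) { return this.primary.get(key); }
  readFromSecondary(index, key) { return this.secondaries[index].get(key); }
}

const shard = new ReplicatedShard(2);
shard.write("age", 43);
console.log(shard.readFromSecondary(0, "age")); // not yet propagated
shard.propagate();
console.log(shard.readFromSecondary(0, "age")); // now visible
```

Reading from a secondary before `propagate()` runs returns old (here, missing) data, which is the gap that the consistency options trade against read and write speed.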

Page 21:

Consistency options

[Table: the consistency levels Strong, Bounded Staleness, Session (the default), and Eventual compared on four columns: Readers might see old data; Readers might see out-of-order updates; Speed of reads; Speed of writes. Cell values shown: No; Yes; Yes, but only within a specified interval; Yes, but only for writes by other clients; Slowest; Moderately slow; Moderately fast; Fastest.]

Page 22:

| | SQL Database | DocumentDB |
|---|---|---|
| Category | Relational | Document store |
| Storage Abstractions | Tables, rows, columns | Collections, documents |
| Transaction Support | All rows and tables in a database | All documents in the same collection |
| Secondary Indexes | Yes | Yes |
| Pricing | Units of throughput | Units of throughput |
| Stored Procedures/Triggers | Written in T-SQL | Written in JavaScript |
| Query Language | SQL | Extended subset of SQL |
| Maximum Database Size | 1 TB | 100s of TBs |

Page 23:

A key/value store

[Diagram: an application accesses Azure Tables. A table is divided into partitions (A and B). Each entity is identified by a partition key and a row key and holds properties, each with a name, a type, and data. Example entities (partition key, row key):
- A, 1: Name (String), LastUse (Date), Country (String), Age (int)
- A, 2: Name (String), Country (String), Age (int)
- B, 1: Name (String), FirstUse (Date), Country (String), Age (int)
- B, 2: Count (int), Last (Date)]

Page 24:

Sharding and transactions

[Diagram: a table split into Partition A, Partition B, and Partition C; entities are labeled by partition key and row key: A 1, A 2, A 3, B 1, B 2, B 3, C 1, C 2, C 3]

- The unit of sharding is a partition
- Partitions are replicated; reads and writes provide strong consistency
- Atomic transactions can span only a single partition

Page 25:

| | SQL Database | Tables | DocumentDB |
|---|---|---|---|
| Category | Relational | Key/value store | Document store |
| Storage Abstractions | Tables, rows, columns | Tables, partitions, entities | Collections, documents |
| Transaction Support | All rows and tables in a database | All entities in the same partition | All documents in the same collection |
| Secondary Indexes | Yes | No | Yes |
| Pricing | Units of throughput | GBs of storage | Units of throughput |
| Stored Procedures/Triggers | Written in T-SQL | None | Written in JavaScript |
| Query Language | SQL | Subset of OData queries | Extended subset of SQL |
| Maximum Database Size | 1 TB | 100s of TBs | 100s of TBs |

Page 26:

A column family store

[Diagram: an application accesses HDInsight HBase. Rows in a table are identified by a row key (1 to 6). Each column is identified by a column key made up of a family (User, Usage) and a qualifier (Name, Country, Age, LastUse, FirstUse). Cells hold data, optionally with time-stamped versions (for example, v2 of LastUse).]

HDInsight supports Phoenix for SQL queries on HBase

Page 27:

Sharding and transactions

[Diagram: a table split into Region A, Region B, and Region C]

- The unit of sharding is a region
- HBase automatically shards a table; users don’t see regions
- Regions are replicated; reads and writes provide strong consistency
- Atomic transactions can span only a single row

Page 28:

| | SQL Database | Tables | DocumentDB | HDInsight HBase |
|---|---|---|---|---|
| Category | Relational | Key/value store | Document store | Column family store |
| Storage Abstractions | Tables, rows, columns | Tables, partitions, entities | Collections, documents | Tables, rows, columns, cells, column families |
| Transaction Support | All rows and tables in a database | All entities in the same partition | All documents in the same collection | All cells in the same row |
| Secondary Indexes | Yes | No | Yes | No |
| Pricing | Units of throughput | GBs of storage | Units of throughput | GBs of storage plus VMs per hour |
| Stored Procedures/Triggers | Written in T-SQL | None | Written in JavaScript | Written in Java |
| Query Language | SQL | Subset of OData queries | Extended subset of SQL | SQL subset w/ Phoenix |
| Maximum Database Size | 1 TB | 100s of TBs | 100s of TBs | 100s of TBs |

Page 29:

Azure DocumentDB

Page 30:

What is a document database?

Ideally suited to this kind of document:

{
  "id": "13244_user",
  "firstName": "John",
  "lastName": "Smith",
  "age": 25,
  "employmentHistory": [
    {
      "company": "Contoso Inc",
      "start": {"date": "Thu, 02 Apr 2015 20:54:45 GMT", "epoch": 1428008086},
      "position": "CEO"
    },
    {
      "start": {"date": "Thu, 02 Apr 2012 20:54:45 GMT", "epoch": 1428008086},
      "end": {"date": "Thu, 01 Apr 2015 20:54:45 GMT", "epoch": 1428008086},
      "position": "GM"
    }
  ],
  "address": {
    "streetAddress": "21 2nd Str",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021"
  },
  "children": [
    {"name": "Megan", "age": 10},
    {"name": "Bruce", "age": 7},
    {"name": "Angus", "sports": ["football", "basketball", "hockey"]}
  ],
  "mobileNumber": "212 555-1234"
}

Page 31:

What is a document database?

Not ideal, but it can work -

{

"id": "13244_post",

"text": "Lorizzle ghetto dolor tellivizzle boofron, stuff pimpin' elizzle. Nullam sapizzle

velizzle, my shizz tellivizzle, suscipizzle funky fresh, shizzle my nizzle crocodizzle

vizzle, arcu. Pellentesque eget tortizzle. Sizzle erizzle. Mammasay mammasa mamma oo sa

break it down dolor own yo' things fo shizzle mah nizzle fo rizzle, mah home g-dizzle

sure. Maurizzle pellentesque dawg ghetto turpizzle. Shiz izzle my shizz. Pellentesque

eleifend rhoncizzle nisi. In its fo rizzle owned ma nizzle dictumst. Sizzle gangsta.

Curabitur tellizzle urna, pretizzle go to hizzle, mattizzle izzle, eleifend vitae,

tellivizzle. Dawg shizzlin dizzle. Integer semper velit sizzle stuff.

Boofron mofo auctizzle ma nizzle. Pot a elizzle ut nibh pretium tincidunt. Maecenizzle

things erat. Own yo' in lacizzle sed maurizzle elementizzle tristique. I'm in the

shizzle yippiyo sizzle daahng dawg eros ultricizzle . In velit tortor, ultricizzle

ghetto, hendrerizzle fo shizzle mah nizzle fo rizzle, mah home g-dizzle, adipiscing

crunk, boom shackalack. Etizzle velit doggy, hizzle consequizzle, pharetra get down

get down, dictizzle sed, shut the shizzle up. Fo shizzle neque. Fo lorizzle. Bling

bling vitae pizzle ut libero commodo gizzle. Fusce izzle augue eu yo mamma dang.

Phasellizzle break it down fo nizzle erat. Suspendisse shizzlin dizzle owned,

sollicitudin sizzle, mah nizzle izzle, commodo nec, justo. Donizzle fizzle

porttitizzle ligula. Nunc feugizzle, tellus tellivizzle ornare tempor, sapizzle break

it down tincidunt gangster, eget dapibus daahng dawg enizzle izzle that's the shizzle.

Stuff quizzle leo, imperdizzle izzle, fo shizzle my nizzle izzle, semper izzle,

sapien. Ut boofron magna vizzle ghetto. I'm in the shizzle ante bling bling,

suscipizzle vitae, yo mamma stuff, rutrizzle pizzle, velizzle.

Mauris da bomb go to zzle. Sizzle mammasay mammasa mamma oo sa magna own yo' amet risus

congue. Boofron mofo auctizzle ma nizzle. Pot a elizzle ut nibh pretium tincidunt.

things erat. Own yo' in lacizzle sed maurizzle elementizzle tristique. I'm in the

shizzle yippiyo sizzle daahng dawg eros ultricizzle . In velit tortor, ultricizzle

ghetto, hendrerizzle fo shizzle mah nizzle fo rizzle, mah home g-dizzle, adipiscing

crunk, boom shackalack. Etizzle velit doggy, hizzle consequizzle, pharetra get down

get down, dictizzle sed, shut the shizzle up. Fo shizzle neque. Fo lorizzle. Bling "

}

Page 32:

What is a document database?

Definitely NOT this kind of document!

Page 33:

I

developers

Developer Appeal

Page 34:

Azure DocumentDB

Fully-managed, highly-scalable, NoSQL document database service
- Query over schema-free JSON
- Multi-document transactions
- Tunable, high performance
- Fully managed and designed for massive scale

Page 35:

DocumentDB is particularly suited for web and mobile applications
- Catalog data
- Preferences and state (user preference data)
- Log data and device sensor data

Page 36:

Reliable and predictable performance

Fast & predictable

Tunable consistency

Elastic scale

Page 37:

Rapid development
- Build with familiar tools: SQL, REST, JSON, JavaScript
- Easy to start
- Fully-managed

Page 38:

Part of the Azure ecosystem
- Azure Search
- Hadoop
- Web, Logic & Mobile Apps*
- Stream Analysis*
- Machine Learning*
- PowerBI*

Page 39:

DocumentDB – Lightning Round

Page 43:

* collection != table of homogenous entities

collection ~ a data partition

Page 44:

{
  "id": "123",
  "name": "joe",
  "age": 30,
  "address": {
    "street": "some st"
  }
}


Page 46:

JSON

JSON values
- Self-describable, self-contained values
- Are trivially serialized to/from text

DocumentDB makes a deep commitment to JSON for storage, indexing, query, and JavaScript execution

{
  "locations": [
    {"country": "Germany", "city": "Berlin"},
    {"country": "France", "city": "Paris"}
  ],
  "headquarter": "Belgium",
  "exports": [{"city": "Moscow"}, {"city": "Athens"}]
}

A JSON document, as a tree

[Tree diagram: the root has children Locations, Headquarter (Belgium), and Exports. Locations has array elements 0 (Country: Germany, City: Berlin) and 1 (Country: France, City: Paris). Exports has array elements 0 (City: Moscow) and 1 (City: Athens).]
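Viewing a JSON document as a tree means every value can be addressed by the path from the root to its leaf. The sketch below walks the sample document and lists those paths. It is only an illustration of the tree view; `leafPaths` is a hypothetical helper, and the slash-separated path format is an assumption chosen for readability, not DocumentDB's internal index format.

```javascript
// Walks a JSON value and returns every leaf as a [path, value] pair.
// Array elements are addressed by their position (0, 1, ...).
function leafPaths(value, prefix = "") {
  if (value === null || typeof value !== "object") {
    return [[prefix || "/", value]];
  }
  const entries = Array.isArray(value)
    ? value.map((item, i) => [String(i), item])
    : Object.entries(value);
  return entries.flatMap(([key, child]) => leafPaths(child, prefix + "/" + key));
}

// The sample document from the slide.
const company = {
  locations: [
    { country: "Germany", city: "Berlin" },
    { country: "France", city: "Paris" }
  ],
  headquarter: "Belgium",
  exports: [{ city: "Moscow" }, { city: "Athens" }]
};

for (const [path, value] of leafPaths(company)) {
  console.log(path, "=", value);
}
```

Each printed line corresponds to one root-to-leaf walk in the tree, for example the path through `locations`, element `0`, and `country` ending at `Germany`.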

Page 47:

Indexing modes
- Consistent: the default mode; the index is updated synchronously on writes
- Lazy: useful for bulk ingestion scenarios

Indexing policies
- Automatic: the default
- Manual: can choose to index documents via RequestOptions; can read non-indexed documents via selflink

Set indexing mode

// Create a collection whose index is updated lazily.
var collection = new DocumentCollection { Id = "lazyCollection" };
collection.IndexingPolicy.IndexingMode = IndexingMode.Lazy;
client.CreateDocumentCollectionAsync(databaseLink, collection);

Set indexing policy

// Create a collection with automatic indexing turned off.
var collection = new DocumentCollection { Id = "manualCollection" };
collection.IndexingPolicy.Automatic = false;
client.CreateDocumentCollectionAsync(databaseLink, collection);

Page 48:

Setting paths, types, and precision

var collection = new DocumentCollection { Id = "Orders" };

// Exclude everything under "metaData" from the index.
collection.IndexingPolicy.ExcludedPaths.Add("/\"metaData\"/*");

// Hash index on the root path.
collection.IndexingPolicy.IncludedPaths.Add(new IndexingPath
{
    IndexType = IndexType.Hash,
    Path = "/"
});

// Range index on shippedTimestamp with numeric precision 7.
collection.IndexingPolicy.IncludedPaths.Add(new IndexingPath
{
    IndexType = IndexType.Range,
    Path = @"/""shippedTimestamp""/?",
    NumericPrecision = 7
});

client.CreateDocumentCollectionAsync(databaseLink, collection);

Index paths: include and/or exclude paths
Index types: Hash, Range, Geospatial
Index precision: string precision, numeric precision

Page 51:

JavaScript transactions

Transactionally process multiple documents with defined stored procedures and triggers
- JavaScript as the language
- Executed in an implicit transaction
- Performed with ACID guarantees
- Triggers invoked as pre- or post-operations
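The implicit-transaction idea can be sketched as follows. `bulkInsert` is written in the style of a DocumentDB stored procedure, using the server-side calls `getContext()`, `getCollection()`, `createDocument()`, and `getResponse().setBody()`. Everything else is an assumption made so the sketch runs locally: `runInMockTransaction` is a hypothetical in-memory stand-in for the service, its self link is a placeholder, and the `id` check is invented validation that exists only to force a failure.

```javascript
// Stored-procedure-style function: inserts every document, or throws to abort.
function bulkInsert(docs) {
  var collection = getContext().getCollection();
  docs.forEach(function (doc) {
    // Hypothetical validation, used here only to trigger a rollback.
    if (!doc.id) throw new Error("Every document needs an id");
    collection.createDocument(collection.getSelfLink(), doc, function (err) {
      if (err) throw err;
    });
  });
  getContext().getResponse().setBody(docs.length);
}

// Hypothetical local stand-in for the server (NOT the real service):
// runs the procedure against an array and undoes all writes if it throws.
function runInMockTransaction(store, sproc, args) {
  var snapshot = store.slice(); // copy, so a failure can be rolled back
  var body;
  globalThis.getContext = function () {
    return {
      getCollection: function () {
        return {
          getSelfLink: function () { return "dbs/mock/colls/mock"; },
          createDocument: function (link, doc, callback) {
            store.push(doc);
            callback(null, doc);
            return true;
          }
        };
      },
      getResponse: function () {
        return { setBody: function (b) { body = b; } };
      }
    };
  };
  try {
    sproc.apply(null, args);
    return { committed: true, body: body };
  } catch (e) {
    store.length = 0; // roll back every write made so far
    snapshot.forEach(function (d) { store.push(d); });
    return { committed: false, error: e.message };
  }
}

var store = [];
console.log(runInMockTransaction(store, bulkInsert, [[{ id: "1" }, { id: "2" }]]));
console.log(runInMockTransaction(store, bulkInsert, [[{ id: "3" }, { name: "no id" }]]));
console.log(store.length);
```

In the second call the first document is written before the second one fails, yet the store ends up unchanged: either all of a procedure's writes are kept or none are.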

Page 52:

Query over heterogeneous documents without defining schema or managing indexes
- Query arbitrary paths, properties and values without specifying secondary indexes or indexing hints
- Execute queries with consistent results
- Supported SQL features: predicates, iterations (arrays), sub-queries, logical operators, UDFs, intra-document JOINs, JSON transforms
- In general, more predicates result in a larger request charge.
- Additional predicates can help if they result in narrowing the overall result set.

LINQ Query

from book in client.CreateDocumentQuery<Book>(collectionSelfLink)
where book.Title == "War and Peace"
select book;

from book in client.CreateDocumentQuery<Book>(collectionSelfLink)
where book.Author.Name == "Leo Tolstoy"
select book.Author;

SQL Query Grammar

-- Nested lookup against index
SELECT B.Author
FROM Books B
WHERE B.Author.Name = "Leo Tolstoy"

-- Transformation, Filters, Array access
SELECT { Name: B.Title, Author: B.Author.Name }
FROM Books B
WHERE B.Price > 10 AND B.Language[0] = "English"

-- Joins, User Defined Functions (UDF)
SELECT udf.CalculateRegionalTax(B.Price, "USA", "WA")
FROM Books B
JOIN L IN B.Languages
WHERE L.Language = "Russian"
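An intra-document JOIN pairs each document with the elements of one of its own arrays, rather than joining two collections. The sketch below emulates that behavior in plain JavaScript. `joinWithin` and the sample `books` array are hypothetical, and the projection returns the title instead of calling a UDF so the example stays self-contained.

```javascript
// Hypothetical sample data shaped like the Books documents in the queries.
const books = [
  { Title: "War and Peace", Price: 12, Languages: [{ Language: "Russian" }, { Language: "English" }] },
  { Title: "Hamlet", Price: 8, Languages: [{ Language: "English" }] },
  { Title: "Untranslated draft", Price: 5 } // no Languages array: contributes no rows
];

// Emulates: SELECT <select> FROM Books B JOIN L IN B.<array> WHERE <where>
// Each document is paired only with elements of its own array.
function joinWithin(documents, pickArray, where, select) {
  const rows = [];
  for (const doc of documents) {
    for (const element of pickArray(doc) || []) {
      if (where(doc, element)) rows.push(select(doc, element));
    }
  }
  return rows;
}

const russianTitles = joinWithin(
  books,
  (B) => B.Languages,
  (B, L) => L.Language === "Russian",
  (B) => ({ Title: B.Title })
);
console.log(russianTitles);
```

A document with several matching array elements would produce one row per match, which is the usual reason a JOIN query returns more rows than there are documents.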

Page 53:

function tax(doc) {
    // Determine the tax factor based on country of headquarters.
    var factor =
        doc.headquarters == "USA" ? 0.35 :
        doc.headquarters == "Germany" ? 0.3 :
        doc.headquarters == "Russia" ? 0.2 :
        0;

    if (factor == 0) {
        throw new Error("Unsupported country: " + doc.headquarters);
    }

    return doc.income * factor;
}

// Execute query with UDF
client.CreateDocumentQuery<dynamic>(colSelfLink, "SELECT r.name AS company, udf.Tax(r) AS tax FROM root r WHERE r.type='Company'");

The complexity of a query impacts the request units consumed for an operation:
- Use of user-defined functions (UDFs)
- Have at least one filter

Page 59:

{
  "id": "1",
  "firstName": "Thomas",
  "lastName": "Andersen",
  "addresses": [
    { "line1": "100 Some Street", "line2": "Unit 1", "city": "Seattle", "state": "WA", "zip": 98012 }
  ],
  "contactDetails": [
    {"email": "[email protected]"},
    {"phone": "+1 555 555-5555", "extension": 5555}
  ]
}

Try to model your entity as a self-contained document

Generally, use embedded data models when:
- There are "contains" relationships between entities
- There are one-to-few relationships between entities
- The embedded data changes infrequently
- The embedded data won’t grow without bound
- The embedded data is integral to the data in the document

Embedding typically provides better read performance

Page 60:

In general, use normalized data models when:
- Write performance is a priority
- Representing one-to-many relationships
- Representing many-to-many relationships
- Related data changes frequently

{ "id": "xyz", "username": "user xyz" }

{ "id": "address_xyz", "userid": "xyz", "address": { … } }

{ "id": "contact_xyz", "userid": "xyz", "email": "[email protected]", "phone": "555 5555" }

Normalizing typically provides better write performance
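The cost of normalizing shows up at read time: the application has to fetch the related documents and stitch them together itself. The sketch below illustrates that client-side assembly over an in-memory array. `assembleUser` is a hypothetical helper, and the field values are placeholders in the shape of the documents on the slide.

```javascript
// Normalized documents in the shape shown on the slide (values are placeholders).
const normalizedDocs = [
  { id: "xyz", username: "user xyz" },
  { id: "address_xyz", userid: "xyz", address: { city: "Seattle" } },
  { id: "contact_xyz", userid: "xyz", phone: "555 5555" }
];

// Hypothetical helper: rebuilds one self-contained view of a user
// from the user document plus every document that references it.
function assembleUser(allDocs, userId) {
  const user = allDocs.find((d) => d.id === userId);
  if (!user) return null;
  const result = Object.assign({}, user);
  for (const doc of allDocs.filter((d) => d.userid === userId)) {
    const { id, userid, ...payload } = doc; // drop the linking fields
    Object.assign(result, payload);
  }
  return result;
}

console.log(assembleUser(normalizedDocs, "xyz"));
```

With an embedded model the same view would be a single document read. With the normalized model, an update to the contact details touches only one small document, which is where the write-performance benefit comes from.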

Page 61:

No magic bullet

Think about how your data is going to be written and read, and model accordingly

{
  "id": "1",
  "firstName": "Thomas",
  "lastName": "Andersen",
  "countOfBooks": 3,
  "books": [1, 2, 3],
  "images": [
    {"thumbnail": "http://....png"},
    {"profile": "http://....png"}
  ]
}

{
  "id": 1,
  "name": "DocumentDB 101",
  "authors": [
    {"id": 1, "name": "Thomas Andersen", "thumbnail": "http://....png"},
    {"id": 2, "name": "William Wakefield", "thumbnail": "http://....png"}
  ]
}

Page 62:

• Map properties to JSON types

• Prefer smaller documents (<16KB) for smaller footprint, less IO, lower RU charges.

• Maximum size is 512KB – be aware of unbounded arrays leading to document bloat

• Store metadata on attachments, reference binary data/free text as external links

• Prefer sparse properties – skip rather than explicit null

• Prefer fullname = "Azure DocumentDB" to firstName = "Azure" AND lastName = "DocumentDB"

Page 63:

Indexing policy

• Specify range indexing on paths which use range queries (like timestamps)

• Use higher index precision (6/7) for range indexes and for dense hash indexes

• Use lazy indexing to handle bulk ingestion scenarios

• Exclude paths not required for querying

Querying

• Optimize for queries with small result sets for scalability

• Limit use of scans (no range index, NOT, UDFs in WHERE)

• Use page size (MaxItemCount) and continuation tokens

• For large result sets, use a larger page size (1000)
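The paging advice can be sketched as a simple loop: request a page of at most a given size, then pass the returned continuation token back until none is returned. The code below is a hypothetical simulation; `makePagedSource`, `fetchPage`, and the token format are invented stand-ins for a real query endpoint, and only the idea of a page size and a continuation token comes from the slide.

```javascript
// Hypothetical paged source standing in for a query endpoint.
// The continuation token here is just the next start offset as a string.
function makePagedSource(items) {
  return function fetchPage(continuation, maxItemCount) {
    const start = continuation ? Number(continuation) : 0;
    const end = Math.min(start + maxItemCount, items.length);
    return {
      items: items.slice(start, end),
      continuation: end < items.length ? String(end) : null
    };
  };
}

// Reads every page, following continuation tokens until none is returned.
function readAll(fetchPage, maxItemCount) {
  const all = [];
  let continuation = null;
  let pages = 0;
  do {
    const page = fetchPage(continuation, maxItemCount);
    all.push(...page.items);
    continuation = page.continuation;
    pages += 1;
  } while (continuation !== null);
  return { items: all, pages: pages };
}

const source = makePagedSource(Array.from({ length: 2500 }, (_, i) => i));
console.log(readAll(source, 1000).pages); // round trips with a page size of 1000
console.log(readAll(source, 100).pages);  // round trips with a page size of 100
```

The larger page size reads the same 2,500 items in far fewer round trips, which is the reasoning behind the last bullet.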

Page 64:

Why Partition?

• Data size: a single collection (currently*) holds 10 GB

• Throughput: 3 performance tiers with a max of 2,500 RU/sec

* not a commitment that this will be lifted in future; it might
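Those two limits give a rough lower bound on how many collections a workload needs. The sketch below is simple arithmetic under a stated assumption: data and requests are spread evenly across collections. The 10 GB and 2,500 RU/sec figures are the ones quoted on the slide, and `collectionsNeeded` is a hypothetical helper.

```javascript
const MAX_COLLECTION_GB = 10;   // per-collection storage limit quoted on the slide
const MAX_COLLECTION_RU = 2500; // per-collection throughput limit quoted on the slide

// Lower bound on the number of collections, assuming an even spread
// of both data and request load across them.
function collectionsNeeded(dataSizeGb, requestUnitsPerSec) {
  const forStorage = Math.ceil(dataSizeGb / MAX_COLLECTION_GB);
  const forThroughput = Math.ceil(requestUnitsPerSec / MAX_COLLECTION_RU);
  return Math.max(1, forStorage, forThroughput);
}

console.log(collectionsNeeded(45, 6000));  // storage-bound workload
console.log(collectionsNeeded(5, 10000));  // throughput-bound workload
```

Whichever limit is hit first decides the count. A skewed partition key would need more collections than this bound suggests.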

Page 65: NoSQL on Microsoft Azure: An introductiondownload.microsoft.com/download/5/9/E/59E3DFD2-ABD... · Learning Stream Analytics Data Factory Event Hubs Mobile Engagement Active Directory
Page 66: NoSQL on Microsoft Azure: An introductiondownload.microsoft.com/download/5/9/E/59E3DFD2-ABD... · Learning Stream Analytics Data Factory Event Hubs Mobile Engagement Active Directory
Page 67: NoSQL on Microsoft Azure: An introductiondownload.microsoft.com/download/5/9/E/59E3DFD2-ABD... · Learning Stream Analytics Data Factory Event Hubs Mobile Engagement Active Directory

Tenant        Partition Id
Customer      1
Big Customer  2
Another       3


{
  "record": "1",
  "created": { "date": "6/1/2014", "epoch": 1401662986 }
},
{
  "record": "3",
  "created": { "date": "9/23/2014", "epoch": 1411512586 }
},
{
  "record": "123",
  "created": { "date": "8/17/2013", "epoch": 1376779786 }
}

SELECT * FROM root r WHERE r.created.epoch BETWEEN 1376779786 AND 1401662986

{
  "record": "1",
  "created": { "date": "6/1/2014", "epoch": 1401662986 }
},
{
  "record": "3",
  "created": { "date": "9/23/2014", "epoch": 1411512586 }
},
{
  "record": "43233",
  "created": { "epoch": 1411512586 }
},
{
  "record": "1123",
  "created": { "date": "8/17/2013", "epoch": 1376779786 }
},
{
  "record": "43234",
  "created": { "epoch": 1376779786 }
}

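A range query over partitioned data has to be sent to every collection that may hold matching documents, and the results merged. The sketch below shows that fan-out in memory. The documents are adapted from the samples above and split across two hypothetical collections; the filter reads the epoch value from created.epoch, where the sample documents store it.

```python
collection_a = [
    {"record": "1", "created": {"date": "6/1/2014", "epoch": 1401662986}},
    {"record": "3", "created": {"date": "9/23/2014", "epoch": 1411512586}},
    {"record": "123", "created": {"date": "8/17/2013", "epoch": 1376779786}},
]
collection_b = [
    {"record": "43233", "created": {"epoch": 1411512586}},
    {"record": "1123", "created": {"date": "8/17/2013", "epoch": 1376779786}},
    {"record": "43234", "created": {"epoch": 1376779786}},
]


def epoch_between(collection, low, high):
    """In-memory equivalent of WHERE created.epoch BETWEEN low AND high."""
    return [d for d in collection if low <= d["created"]["epoch"] <= high]


def fan_out(collections, low, high):
    """Run the same range filter on every collection and merge the results."""
    matches = []
    for collection in collections:
        matches.extend(epoch_between(collection, low, high))
    # Stable sort keeps per-collection order among equal timestamps.
    return sorted(matches, key=lambda d: d["created"]["epoch"])


hits = fan_out([collection_a, collection_b], 1376779786, 1401662986)
print([d["record"] for d in hits])
```

BETWEEN is inclusive at both ends, so the documents sitting exactly on either bound are returned.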

Hash sharding

• Examples: Profile data (user ID, app ID), (user ID), Device and vehicle data (device/vin ID), Catalog data (item ID)

• Pros: balanced, stateless

• Cons: reshuffling is hard

Range sharding

• Examples: Operational data (timestamp), (timestamp, event ID)

• Pros: easy sliding window, range queries

• Cons: stateful

Lookup sharding

• Examples: SaaS/multitenant service (tenant ID), Metadata store (type ID)

• Pros: simple, easy to reshuffle, can span accounts

• Cons: stateful, works only on discrete keys
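The three strategies above can be compared with a minimal sketch of one resolver per strategy. Everything here is illustrative: the collection names, range boundaries, and tenant map are hypothetical, and none of this is the SDK's own resolver code.

```python
import bisect
import hashlib

COLLECTIONS = ["coll0", "coll1", "coll2"]  # hypothetical collection names


def hash_resolve(key):
    """Hash sharding: a stable digest of the key, modulo the collection count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return COLLECTIONS[int.from_bytes(digest[:4], "big") % len(COLLECTIONS)]


# Range sharding needs state: the boundaries between collections.
# Hypothetical boundaries at 2014-01-01 and 2015-01-01 UTC (Unix epoch seconds).
RANGE_BOUNDARIES = [1388534400, 1420070400]


def range_resolve(epoch):
    """Range sharding: pick the collection whose range contains the timestamp."""
    return COLLECTIONS[bisect.bisect_right(RANGE_BOUNDARIES, epoch)]


# Lookup sharding needs state too: an explicit map from discrete key to target.
TENANT_MAP = {"Customer": "coll0", "Big Customer": "coll1", "Another": "coll2"}


def lookup_resolve(tenant):
    """Lookup sharding: consult the map; moving a tenant is a map update."""
    return TENANT_MAP[tenant]


print(hash_resolve("user-42"), range_resolve(1401662986), lookup_resolve("Big Customer"))
```

The hash resolver carries no state beyond the collection list, which is why it is balanced but hard to reshuffle: changing the list length remaps most keys. The range and lookup resolvers are stateful, but changing a boundary or a map entry moves only the affected keys.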


PartitionResolvers in the SDK

[Diagram labels: App Tier; DocumentDB Client SDK. The diagram repeats the two sets of sample documents from the range-query example above.]


PartitionResolvers in the SDK

http://aka.ms/documentdb-partitioning


• Catalog & Product Data

• User Data & Preferences

• Events & Logging

• Geospatial Data

• Data Exchange


Challenge

• Provide a personalized web experience for millions of users

• Strict requirements on performance

Solution

• Schema-free for multiple verticals

• High performance

• Rich querying experience


Challenge

• Track deployments of storage for customers

• Priority on project completion time

Solution

• Dynamic schemas for productivity

• No hardware management – fully managed


Challenge

• Create the destination for customers to discover, learn, and share

• Catalog of all Gallery items’ metadata

Solution

• Schema-free data store

• Needed flexible indexing options


https://azure.microsoft.com/zh-tw/documentation/articles/documentdb-limits/


Table Storage


Azure Tables

• NoSQL key-value store

• Flexible schema

• Enables rapid development

• Scales from a few records to millions of records

• Strong consistency model

Usage examples:

• Events & Metrics

• Address book

• Device information

• Server status

Table: UserStatus

PK  RK      Name          Status     EmpNo
A   AliceC  Alice Clarke  Available  1223

PK  RK      Name          Status     WebSite     Email
A   AlexP   Alex Pen      Busy       http://...  alexp@...

PK  RK      Name          Status
B   BobS    Bob Stevens   Offline

PK  RK      Name          Status     Photo       Email
B   BillP   Bill Peters   Available  http://...  alexp@...
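The flexible-schema idea in the UserStatus table can be mimicked with a small in-memory sketch: entities are addressed by (PK, RK), and each one carries its own set of properties. The table dictionary and helper functions are hypothetical and do not use the storage SDK.

```python
# In-memory stand-in for a table: (PartitionKey, RowKey) -> entity properties.
table = {}


def insert_entity(partition_key, row_key, **properties):
    """Insert an entity; the (PK, RK) pair must be unique within the table."""
    key = (partition_key, row_key)
    if key in table:
        raise KeyError(f"entity {key} already exists")
    table[key] = dict(properties)


def query_partition(partition_key):
    """Return every entity in one partition, keyed by RowKey."""
    return {rk: props for (pk, rk), props in table.items() if pk == partition_key}


# Entities in the same table do not need the same properties.
insert_entity("A", "AliceC", Name="Alice Clarke", Status="Available", EmpNo=1223)
insert_entity("A", "AlexP", Name="Alex Pen", Status="Busy",
              WebSite="http://...", Email="alexp@...")
insert_entity("B", "BobS", Name="Bob Stevens", Status="Offline")

print(query_partition("A"))
```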


http://bit.do/cqrs-pattern

http://bit.do/event-sourcing-pattern


Audit Events for Bob Tabor

Audit Events for Richard Boughton

Audit Events for Dan Star

Partition Key: User Identifier (Bob Tabor or 12345, Richard Boughton or 12346, etc.)

Row Key: Something that uniquely identifies a given entity (audit event) for that user, e.g., a GUID
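A minimal sketch of this key design: the user identifier becomes the partition key and a freshly generated GUID becomes the row key. The function name and the Action and OccurredAt properties are hypothetical additions for illustration.

```python
import uuid
from datetime import datetime, timezone


def make_audit_event(user_id, action):
    """Build an audit-event entity keyed by user (partition) and GUID (row)."""
    return {
        "PartitionKey": str(user_id),      # all events for one user share a partition
        "RowKey": str(uuid.uuid4()),       # unique per event within that partition
        "Action": action,                  # hypothetical payload property
        "OccurredAt": datetime.now(timezone.utc).isoformat(),
    }


first = make_audit_event(12345, "login")
second = make_audit_event(12345, "password-change")
print(first["PartitionKey"], first["RowKey"] != second["RowKey"])
```

Because every event for a user lands in the same partition, reading one user's audit trail is a single-partition query. A random GUID row key gives uniqueness but no time ordering within the partition.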


[Diagram: Azure services overview, grouped into Platform Services, Security & Management, and Infrastructure Services. Services shown: Web Apps, Mobile Apps, API Management, API Apps, Logic Apps, Notification Hubs, Content Delivery Network (CDN), Media Services, HDInsight, Machine Learning, Stream Analytics, Data Factory, Event Hubs, Mobile Engagement, Active Directory, Multi-Factor Authentication, Automation, Portal, Key Vault, BizTalk Services, Hybrid Connections, Service Bus, Storage Queues, Store/Marketplace, Hybrid Operations, Backup, StorSimple, Site Recovery, Import/Export, SQL Database, DocumentDB, Redis Cache, Search, Tables, SQL Data Warehouse, Azure AD Connect Health, AD Privileged Identity Management, Operational Insights, Cloud Services, Batch, Remote App, Service Fabric, Visual Studio, Application Insights, Azure SDK, Team Project, VM Image Gallery & VM Depot, Azure Data Lake Store, Azure Data Lake Analytics.]


Azure Data Lake


• Analytics: HDInsight (“managed clusters”), Azure Data Lake Analytics

• Storage: Azure Data Lake Storage


Azure Data Lake

• Analytics: Analytics Service (U-SQL) and HDInsight (managed Hadoop clusters), on YARN

• Store: accessed through WebHDFS



Azure Data Lake as part of Cortana Analytics Suite

DATA: Business apps, Custom apps, Sensors and devices

INTELLIGENCE (suite components):

• Information Management: Azure Data Factory, Azure Data Catalog, Azure Event Hub

• Big Data Stores: Azure SQL Data Warehouse, Azure Data Lake Store

• Machine Learning and Analytics: Azure Machine Learning, Azure Stream Analytics, Azure HDInsight (Hadoop), Azure Data Lake Analytics

• Dashboards and Visualizations: Power BI

• Personal Digital Assistant: Cortana

• Perceptual Intelligence: face, vision, speech, text

• Business Scenarios: recommendations, customer churn, forecasting, etc.

ACTION: People, Automated Systems


Azure Data Lake: Store & managed clusters

Hadoop Cluster

HDFS/WebHDFS API

Azure Data Lake store

Azure Data Lake managed clusters
