Microsoft Database Options

Data Options in the Cloud Joe Shirey [email protected]


Joe Shirey's deck delivered at the 2009Q4 Microsoft Architect Council events

Transcript of Microsoft Database Options

Page 1: Microsoft Database Options

Data Options in the Cloud

Joe Shirey [email protected]

Page 2: Microsoft Database Options

Cloud Development

[Diagram: applications span on-premises, web, and third-party clouds (web applications, LOB applications, composite applications). Platform building blocks: compute, storage, management, relational data, connectivity, access control. Developer experience: use existing skills and tools.]

Page 3: Microsoft Database Options

Storage in the Cloud

[Diagram: applications span on-premises, web, and third-party clouds (web applications, LOB applications, composite applications). Platform building blocks: compute, relational data, storage, management, connectivity, access control. Developer experience: use existing skills and tools.]

Page 4: Microsoft Database Options

Windows Azure Storage

• The goal is to allow users and services to:
  – Access their data anywhere, at any time
  – Store data for any length of time
  – Scale to store any amount of data
  – Be confident that the data will not be lost
  – Pay only for what they use/store

Page 5: Microsoft Database Options

Windows Azure Storage

• Storage
  – Durable
  – Scalable (capacity and throughput)
  – Highly available
• Rich storage concepts
  – Large user data items: blobs
  – Service state: tables
  – Service communication: queues
• Simple and familiar programming interfaces
  – REST (HTTP and HTTPS)
  – .NET accessible

Page 6: Microsoft Database Options

Fundamental data abstractions

• Blobs – provide a simple interface for storing named files along with metadata for the file

• Tables – provide structured storage. A table is a set of entities, each of which contains a set of properties

• Queues – provide reliable storage and delivery of messages for an application

Page 7: Microsoft Database Options

Blob Storage Concepts

Account → Container → Blob

sally
  pictures
    IMG001.JPG
    IMG002.JPG
  movies
    MOV1.AVI

Page 8: Microsoft Database Options

Blob Features and Functions

• Store large objects (up to 200GB)
• Associate metadata with blob
  – Metadata is <name, value> pairs, up to 8KB per blob
  – Set/get with or separate from blob data bits
• Standard REST Interface
  – PutBlob
    • Inserts a new blob or overwrites the existing blob
  – GetBlob
    • Get whole blob or a specific range
  – DeleteBlob
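The three blob operations above can be sketched with a small in-memory model. This is only an illustration of the semantics on the slide (put inserts or overwrites, get returns the whole blob or a range, metadata is capped at 8 KB); `BlobContainer` and its method names are hypothetical, it makes no REST calls, and the way metadata size is counted here is an assumption.

```python
class BlobContainer:
    """In-memory stand-in for one container; not the real storage service."""

    MAX_METADATA_BYTES = 8 * 1024  # slide: metadata up to 8 KB per blob

    def __init__(self):
        self._blobs = {}  # blob name -> (data bytes, metadata dict)

    def put_blob(self, name, data, metadata=None):
        """Insert a new blob or overwrite the existing one (PutBlob)."""
        metadata = dict(metadata or {})
        # Assumption: size is the sum of UTF-8 encoded names and values.
        size = sum(len(k.encode()) + len(v.encode()) for k, v in metadata.items())
        if size > self.MAX_METADATA_BYTES:
            raise ValueError("metadata exceeds 8 KB")
        self._blobs[name] = (bytes(data), metadata)

    def get_blob(self, name, start=0, end=None):
        """Return the whole blob, or the byte range [start, end) (GetBlob)."""
        data, _ = self._blobs[name]
        return data[start:end]

    def get_metadata(self, name):
        """Read metadata separately from the blob data bits."""
        return dict(self._blobs[name][1])

    def delete_blob(self, name):
        """Remove the blob (DeleteBlob)."""
        del self._blobs[name]
```

For example, `put_blob("IMG001.JPG", data)` followed by `get_blob("IMG001.JPG", 2, 5)` returns three bytes of the stored image without fetching the rest.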

Page 9: Microsoft Database Options

Windows Azure Tables

• Provides structured storage
  – Massively scalable tables
    • Billions of entities (rows) and TBs of data
    • Automatically scales across servers as traffic grows
  – Highly available
    • Anywhere at anytime access to your data
  – Durable
    • Data is replicated at least 3 times
• Familiar and easy to use programming interfaces
  – ADO.NET Data Services – .NET 3.5 SP1
    • .NET classes and LINQ
    • REST – with any platform or language

Page 10: Microsoft Database Options

Table Storage Concepts

Account → Table → Entity

sally
  users
    Name =… Email = …
    Name =… Email = …
  photo index
    Photo ID =… Date =…
    Photo ID =… Date =…

Page 11: Microsoft Database Options

Table Data Model

• Table
  – A storage account can create many tables
  – Table name is scoped by account
• Data is stored in tables
  – A table is a set of entities (rows)
  – An entity is a set of properties (columns)
• Entity
  – Two “key” properties that together are the unique ID of the entity in the table
    • PartitionKey – enables scalability
    • RowKey – uniquely identifies the entity within the partition
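One way to picture the data model above is a dictionary keyed by the (PartitionKey, RowKey) pair. The sketch below is an in-memory illustration only; `SketchTable` is a hypothetical name and does not correspond to any class in the .NET client library.

```python
class SketchTable:
    """In-memory stand-in for one table; not the real storage service."""

    def __init__(self):
        self._entities = {}  # (PartitionKey, RowKey) -> entity dict

    def insert(self, entity):
        # The two key properties together form the entity's unique ID.
        key = (entity["PartitionKey"], entity["RowKey"])
        if key in self._entities:
            raise KeyError(f"entity already exists: {key}")
        self._entities[key] = dict(entity)

    def get(self, partition_key, row_key):
        return self._entities[(partition_key, row_key)]
```

The same RowKey can appear in two different partitions, because uniqueness is only required within a partition.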

Page 12: Microsoft Database Options

Partition Key and Partitions

• Every table has a partition key
  – It is the first property (column) of your table
  – Used to group entities in the table into partitions
• A table partition
  – All entities in a table with the same partition key value
• Partition key is exposed in the programming model
  – Allows the application to control the granularity of the partitions and enable scalability

Page 13: Microsoft Database Options

Partition Example

• Table partition – all entities in table with same partition key value
• Application controls granularity of partition

Partition Key (Document Name) | Row Key (Version) | Property 3 (Modification Time) | … | Property N (Description)
Example Doc | V1.0 | 8/2/2007 | … | Committed version
Example Doc | V2.0.1 | 9/28/2007 | … | Alice’s working version
FAQ Doc | V1.0 | 5/2/2007 | … | Committed version
FAQ Doc | V1.0.1 | 7/6/2007 | … | Alice’s working version
FAQ Doc | V1.0.2 | 8/1/2007 | … | Sally’s working version

Partition 1: the two Example Doc rows
Partition 2: the three FAQ Doc rows
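The grouping in this example can be reproduced in a few lines: entities that share a partition key value land in the same partition. This is a sketch using the five rows from the slide; the function name `group_by_partition` is hypothetical.

```python
from collections import defaultdict

# (partition key, row key, modification time, description) from the slide
ROWS = [
    ("Example Doc", "V1.0", "8/2/2007", "Committed version"),
    ("Example Doc", "V2.0.1", "9/28/2007", "Alice's working version"),
    ("FAQ Doc", "V1.0", "5/2/2007", "Committed version"),
    ("FAQ Doc", "V1.0.1", "7/6/2007", "Alice's working version"),
    ("FAQ Doc", "V1.0.2", "8/1/2007", "Sally's working version"),
]


def group_by_partition(rows):
    """Map each partition key value to the row keys stored under it."""
    partitions = defaultdict(list)
    for partition_key, row_key, *_ in rows:
        partitions[partition_key].append(row_key)
    return dict(partitions)
```

Choosing the document name as the partition key is what makes each document's versions one partition; a different key choice would give a different granularity.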

Page 14: Microsoft Database Options

Purpose of the Partition Key

• Entity locality
  – Entities in the same partition will be stored together
    • Efficient querying and cache locality
• Entity group transactions
  – Atomically perform multiple insert/update/delete over entities in same partition in a single transaction
• Table scalability
  – We monitor the usage patterns of partitions
  – Automatically load balance partitions
    • Each partition can be served by a different storage node
    • Scale to meet the traffic needs of your table
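The entity group transaction rule can be sketched as an all-or-nothing batch that refuses to span partitions. This is an in-memory illustration under stated assumptions, not the service's implementation: `apply_batch` and the `(action, entity)` tuple shape are hypothetical, and the table is modeled as a plain dict keyed by (PartitionKey, RowKey).

```python
def apply_batch(table, operations):
    """Apply (action, entity) pairs atomically within one partition.

    `table` maps (PartitionKey, RowKey) -> entity dict.
    """
    partitions = {entity["PartitionKey"] for _, entity in operations}
    if len(partitions) > 1:
        raise ValueError("batch spans more than one partition")

    staged = dict(table)  # work on a copy so a failure leaves `table` untouched
    for action, entity in operations:
        key = (entity["PartitionKey"], entity["RowKey"])
        if action == "insert":
            if key in staged:
                raise KeyError(f"entity already exists: {key}")
            staged[key] = entity
        elif action == "update":
            if key not in staged:
                raise KeyError(f"no such entity: {key}")
            staged[key] = entity
        elif action == "delete":
            del staged[key]  # raises KeyError if the entity is missing
        else:
            raise ValueError(f"unknown action: {action}")

    # Every operation succeeded, so publish the staged state in one step.
    table.clear()
    table.update(staged)
```

If any operation in the batch fails, the original table is left exactly as it was.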

Page 15: Microsoft Database Options

Choosing a Partition Key

• Granularity of entity group transactions
  – Make the partition key only as big as you need it for entity group transactions
• Spread out load across partitions
  – More partitions – makes it easier to automatically balance load
• Currently have one primary index
  – Important to use a partition key that is common in your queries
  – If partition key is part of query
    • Fast access to retrieve entities within a single partition
  – If partition key is not specified in a query
    • Then every partition has to be scanned
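The cost difference between the two query shapes can be sketched by counting how many partitions a query has to touch. This is an in-memory model, not the service's query engine; `query` and its return shape are hypothetical.

```python
def query(partitions, predicate, partition_key=None):
    """Return (matching entities, number of partitions scanned).

    `partitions` maps a partition key value to a list of entity dicts.
    """
    if partition_key is not None:
        # Partition key given: only that one partition is read.
        candidates = {partition_key: partitions.get(partition_key, [])}
    else:
        # No partition key: every partition has to be scanned.
        candidates = partitions
    results = [
        entity
        for entities in candidates.values()
        for entity in entities
        if predicate(entity)
    ]
    return results, len(candidates)
```

With two partitions, a query that names its partition key scans one of them, while the same predicate without a partition key scans both.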

Page 16: Microsoft Database Options

Table Entities and Properties

• Each entity can have up to 255 properties
• Mandatory properties for every entity in table
  – Partition key
  – Row key
• All entities have a system maintained version
• No fixed schema for rest of properties
  – Each property is stored as a <name, typed value> pair
    • No schema stored for a table
  – 2 entities within the same table can have different properties
  – Properties can be the standard .NET types
    • String, binary, bool, DateTime, GUID, int, int64, and double
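The entity rules above translate into a simple validation check. This is a sketch only: `validate_entity` is a hypothetical helper, the mapping from the listed .NET types to Python types is an assumption, and counting the two key properties toward the 255 limit is also an assumption.

```python
import uuid
from datetime import datetime

# Assumed Python equivalents of String, binary, bool, DateTime, GUID,
# int/int64, and double.
ALLOWED_TYPES = (str, bytes, bool, datetime, uuid.UUID, int, float)
MAX_PROPERTIES = 255


def validate_entity(entity):
    """Check one entity dict against the rules on the slide."""
    for key in ("PartitionKey", "RowKey"):
        if key not in entity:
            raise ValueError(f"missing mandatory property: {key}")
    if len(entity) > MAX_PROPERTIES:
        raise ValueError(f"more than {MAX_PROPERTIES} properties")
    for name, value in entity.items():
        if not isinstance(value, ALLOWED_TYPES):
            raise TypeError(f"unsupported type for property {name!r}")
    return entity
```

Because no schema is stored for the table, two entities with entirely different property sets both pass, as long as each carries its partition key and row key.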

Page 17: Microsoft Database Options

Table Programming Model

• Provide familiar and easy to use interfaces
  – Leverage your .NET expertise
• Table entities are accessed as objects via ADO.NET Data Services – .NET 3.5 SP1
  – LINQ – Language Integrated Query
  – RESTful access to table and entities
• Insert/update/delete entities over the table
• Query over tables
  – Get back a list of structured entities

Page 18: Microsoft Database Options

Web + Worker Role Pattern

• Web role
  – Web farm that handles requests from the internet
  – Pushes work items onto a storage queue
• Worker role
  – Processes work items off the storage queue

[Diagram: public internet → load balancer → web role (n instances) → queue (Q) → worker role (m instances); both roles use cloud storage (tables, blobs, queues).]

Page 19: Microsoft Database Options

Windows Azure Queues

• Provide reliable message delivery
  – Simple, asynchronous work dispatch
  – Programming semantics ensure that a message can be processed at least once
• Queues are highly available, durable and performance efficient
• Access is provided via REST

Page 20: Microsoft Database Options

Queue Storage Concepts

Account → Queue → Message

sally
  thumbnail jobs
    128x128, http://…
    256x256, http://…
  photo processing jobs
    http://…
    http://…

Page 21: Microsoft Database Options

Account, Queues and Messages

• An account can create many queues
  – Queue name is scoped by the account
• A queue contains messages
  – No limit on number of messages stored in a queue
  – A message is stored for at most a week in a queue

http://<Account>.queue.core.windows.net/<QueueName>

• Messages
  – Message size <= 8 KB
  – To store larger data, store data in blob/entity storage, and the blob/entity name in the message
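The advice for data larger than a message can hold can be sketched as follows: small payloads travel in the message itself, large ones are written to blob storage and only the blob name is enqueued. The queue is modeled as a list and blob storage as a dict; `enqueue_payload`, `resolve`, and the `("inline" | "blob", body)` message shape are hypothetical.

```python
MAX_MESSAGE_BYTES = 8 * 1024  # slide: message size <= 8 KB


def enqueue_payload(queue, blobs, blob_name, payload):
    """Enqueue `payload` directly if it fits, else store it as a blob."""
    if len(payload) <= MAX_MESSAGE_BYTES:
        queue.append(("inline", payload))
    else:
        blobs[blob_name] = payload           # large data goes to blob storage
        queue.append(("blob", blob_name))    # the message carries only the name


def resolve(message, blobs):
    """Return the full payload for a dequeued message."""
    kind, body = message
    return body if kind == "inline" else blobs[body]
```

The consumer calls `resolve` on every message, so it does not need to know in advance which path the producer took.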

Page 22: Microsoft Database Options

Queue Programming API

• Queues
  – Create/delete/clear queues
  – Inspect queue length
• Messages
  – Enqueue(queueName, message)
  – Dequeue(queueName, invisibility time T)
    • Returns the message with a messageID
    • Makes the message invisible for time T
  – Delete(queueName, messageID)
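The dequeue semantics above (a message is hidden for time T rather than removed, and only disappears when explicitly deleted) can be sketched with an in-memory queue. `SketchQueue` is a hypothetical class, not the storage client API; the clock is passed in as a callable so the invisibility timeout can be exercised without waiting.

```python
import itertools


class SketchQueue:
    """In-memory stand-in for one queue; not the real storage service."""

    def __init__(self, clock):
        self._clock = clock            # callable returning the time in seconds
        self._messages = {}            # message_id -> [body, visible_at]
        self._ids = itertools.count(1)

    def enqueue(self, body):
        self._messages[next(self._ids)] = [body, self._clock()]

    def dequeue(self, invisibility_time):
        """Return (message_id, body) for the first visible message, or None."""
        now = self._clock()
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + invisibility_time  # hide it, do not remove it
                return message_id, entry[0]
        return None

    def delete(self, message_id):
        """Remove a message once it has been processed."""
        self._messages.pop(message_id, None)

    def length(self):
        return len(self._messages)
```

If a worker dequeues a message and crashes before calling `delete`, the message becomes visible again after T and another worker picks it up, which is where the at-least-once behavior comes from.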

Page 23: Microsoft Database Options

Queue Best Practices

• Make message processing idempotent
  – Need to deal with failures
• No fixed order for dequeued messages
  – Invisible messages result in out-of-order processing
• Use the queue length to scale your workers
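Two of these practices lend themselves to a short sketch. Tracking processed message IDs is one possible way to make a handler idempotent (the slide does not prescribe a technique), and the scaling rule below is a simple proportional heuristic. `process_once`, `workers_needed`, and all default values are hypothetical.

```python
import math


def process_once(message_id, body, done_ids, handler):
    """Run `handler` at most once per message ID, even on redelivery.

    Returns True if the handler ran, False if the message was a repeat.
    """
    if message_id in done_ids:
        return False
    handler(body)
    done_ids.add(message_id)  # record only after the handler succeeded
    return True


def workers_needed(queue_length, messages_per_worker, min_workers=1, max_workers=20):
    """Pick a worker count proportional to the backlog, within bounds."""
    wanted = math.ceil(queue_length / messages_per_worker)
    return max(min_workers, min(max_workers, wanted))
```

In a real deployment `done_ids` would need to live in durable storage shared by all workers; an in-process set is used here only to keep the example self-contained.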

Page 24: Microsoft Database Options

Demo: Windows Azure Storage

Page 25: Microsoft Database Options

SQL Azure

• SQL Server Data Services false start
• Familiar relational model
• Uses existing APIs & tools
• Friction free provisioning and reduced management
• Built for the cloud with availability and scale

Clear feedback: “I want a SQL database in the cloud”

Page 26: Microsoft Database Options

Service Provisioning Model

• Each account has zero or more servers
  – Azure wide, provisioned in a common portal
  – Billing instrument
• Each server has one or more databases
  – Contains metadata about the databases and usage
  – Unit of authentication
  – Unit of geo-location
  – Generated DNS based name
• Each database has standard SQL objects
  – Unit of consistency
  – Unit of multi-tenancy
  – Contains users, tables, views, indices, etc.
  – Most granular unit of billing
• SKUs
  – Web edition – 1 GB
  – Business edition – 10 GB

[Diagram: account → server → database]

Page 27: Microsoft Database Options

Architecture

• Shared infrastructure at SQL database and below
  – Request routing, security and isolation
• Scalable HA technology provides the glue
  – Automatic replication and failover
• Provisioning, metering and billing infrastructure

[Diagram: Machines 4, 5, and 6 each run a SQL instance hosting a SQL DB that contains UserDB1, UserDB2, UserDB3, and UserDB4. Surrounding layers: "Scalability and Availability: Fabric, Failover, Replication, and Load balancing" and "SDS Provisioning (databases, accounts, roles, …), Metering, and Billing".]

Page 28: Microsoft Database Options

Sample of SQL compatibility

In Scope for V1
• Tables, indexes and views
• Stored Procedures
• Triggers
• Constraints
• Table variables, session temp tables (#t)
• …..

Out of Scope for V1
• Distributed Transactions
• Distributed Query
• CLR
• Service Broker
• Spatial
• Physical server or catalog DDL and views

Page 29: Microsoft Database Options

Logical vs. Physical Administration

• Customer can focus on logical administration
  – Schema creation and management
  – Query optimization
  – Security management (logins, users, roles)
• Service handles physical management
  – Automatically replicated with HA “out of the box”
  – Transparent failover in case of failure
  – Load balancing of data to ensure SLA

DBA role places more focus on logical management

Page 30: Microsoft Database Options

Programming Model

• Small data sets
  – Use a single database
  – Same model as on-premises SQL Server
• Large data sets
  – Partition data across many databases
  – Use parallel fan-out queries to fetch the data
  – Application code must be partition aware
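The large-data-set model can be sketched as the same query sent to every partition in parallel, with the results merged in application code. To keep the example runnable anywhere, in-memory SQLite databases stand in for the SQL Azure databases; that substitution, the `orders` schema, and the names `make_shard`, `shard_for`, and `fan_out` are all assumptions made for illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_shard(rows):
    """Create one stand-in database holding a slice of the data."""
    # check_same_thread=False lets a pool thread use this connection;
    # each connection is still used by only one thread at a time.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def shard_for(customer, shard_count):
    """Partition-aware routing: pick the database that owns a customer."""
    return sum(customer.encode()) % shard_count


def fan_out(shards, sql, params=()):
    """Run the same query on every shard in parallel and merge the rows."""
    def run(conn):
        return conn.execute(sql, params).fetchall()

    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        per_shard = list(pool.map(run, shards))
    return [row for rows in per_shard for row in rows]
```

Single-customer lookups would go through `shard_for` to reach one database, while cross-customer queries use `fan_out`; any ordering or aggregation across shards has to be done by the application after the merge.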

Page 31: Microsoft Database Options

Feature Set Comparison

Feature | SQL Server | SQL Azure
Authentication | SQL & Windows | Only SQL Authentication
Schema | No limitation | All tables must have PK
USE | Supported | Not supported
Replication/Log Shipping/DB Mirroring | Supported | Not supported
SQL Agent | Supported | Not supported – no jobs… etc

TSQL: Some commands fully supported, some partial and some unsupported.
Server Level options: Most server level metadata/options are not available.
Providers: Only ADO.NET is supported at this time.
Connection Limitations: To provide fair usage experience to all tenants, certain connections may be throttled due to excessive resource usage.

Page 32: Microsoft Database Options

Management Tooling

• SQL Server 2008 R2 Management Studio
• SQLCMD
• Third party/community add-ons

Page 33: Microsoft Database Options

Data Migration

• BCP
• SSIS
• SQL Azure Migration Wizard

Page 34: Microsoft Database Options

Customer Learnings from the TAP Program

• Use SQL Azure to store metadata and blob storage to store large files

• Handling highly elastic load patterns in a cost-effective way is an industry challenge

• Combination of different SKUs

Page 35: Microsoft Database Options

Sync Framework

• MSDN - http://msdn.microsoft.com/en-us/sync/default.aspx

[Diagram: SQL Azure and a local SQL DB on the local computer, connected by sync processes.]

Page 36: Microsoft Database Options

Demo: SQL Azure Storage and Sync

Page 37: Microsoft Database Options

Pricing

Windows Azure Storage
• Bandwidth: $0.10 in / $0.15 out / GB
• $0.15/GB stored/month

SQL Azure Storage
• Bandwidth: $0.10 in / $0.15 out / GB
• Web edition (1GB): $9.99/month
• Business edition (10GB): $99.99/month
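The rates on this slide can be turned into a rough monthly estimate. The sketch below uses only the prices listed above and ignores any charge not shown on the slide; the function names are hypothetical, and treating SQL Azure billing as a flat per-database monthly fee is an assumption.

```python
STORAGE_PER_GB_MONTH = 0.15
BANDWIDTH_IN_PER_GB = 0.10
BANDWIDTH_OUT_PER_GB = 0.15
SQL_EDITION_PER_MONTH = {"web": 9.99, "business": 99.99}


def bandwidth_cost(gb_in, gb_out):
    """Data transfer cost, charged the same way for both services."""
    return gb_in * BANDWIDTH_IN_PER_GB + gb_out * BANDWIDTH_OUT_PER_GB


def storage_monthly_cost(stored_gb, gb_in, gb_out):
    """Windows Azure Storage: per-GB storage plus bandwidth."""
    return stored_gb * STORAGE_PER_GB_MONTH + bandwidth_cost(gb_in, gb_out)


def sql_azure_monthly_cost(edition, database_count, gb_in, gb_out):
    """SQL Azure: flat fee per database (assumed) plus bandwidth."""
    return database_count * SQL_EDITION_PER_MONTH[edition] + bandwidth_cost(gb_in, gb_out)
```

For instance, 100 GB stored with 10 GB in and 20 GB out comes to 15.00 + 1.00 + 3.00 = 19.00 dollars per month at these rates.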

Page 38: Microsoft Database Options

Data Options in the Cloud

[email protected]
http://www.joeshirey.com