Introduction to Amazon Redshift

Introducing Amazon Redshift David Pearson Business Development Manager http://aws.amazon.com/resources/databaseservices/webinars

description

An introduction to Amazon Redshift.

Transcript of Introduction to Amazon Redshift

Page 1: Introduction to Amazon Redshift

Introducing Amazon Redshift

David Pearson Business Development Manager

http://aws.amazon.com/resources/databaseservices/webinars

Page 2: Introduction to Amazon Redshift

What is AWS?

Compute Storage

AWS Global Infrastructure

Database

Application Services

Deployment & Administration

Networking

Page 3: Introduction to Amazon Redshift

Amazon DynamoDB: Fast, Predictable, Highly-Scalable NoSQL Data Store

Amazon RDS: Managed Relational Database Service for MySQL, Oracle and SQL Server

Amazon ElastiCache: In-Memory Caching Service

Amazon Redshift: Fast, Powerful, Fully Managed, Petabyte-Scale Data Warehouse Service

Compute Storage

AWS Global Infrastructure

Database

Application Services

Deployment & Administration

Networking

AWS Database Services

Scalable High Performance Application Storage in the Cloud

Page 4: Introduction to Amazon Redshift

Why Data Warehousing?

No upfront costs, pay as you go

Really fast performance at a really low price

Open and flexible with support for popular tools

Easy to provision and scale up massively

Page 5: Introduction to Amazon Redshift

Amazon Redshift

data warehouse service

petabyte-scale fast and fully managed

Page 6: Introduction to Amazon Redshift

Objectives: design and build a petabyte-scale data warehouse service

Amazon Redshift

A Whole Lot Simpler

A Lot Cheaper

A Lot Faster

Page 7: Introduction to Amazon Redshift

Redshift Dramatically Reduces I/O

• Direct-attached storage
• Large data block sizes
• Columnar storage
• Data compression
• Zone maps
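To make the "zone maps" bullet concrete, here is a minimal Python sketch of the general idea: keep the minimum and maximum value of each storage block and skip any block whose range cannot contain the value being searched for. The `Block` class, the block size of 4, and the sample ages are assumptions made for illustration; this is not a description of Redshift's internal implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """One storage block of a column plus its zone-map entry (min and max)."""
    values: List[int]
    min_value: int
    max_value: int


def build_blocks(column: List[int], block_size: int) -> List[Block]:
    """Split a column into fixed-size blocks and record min/max for each."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def scan_equal(blocks: List[Block], target: int) -> Tuple[int, int]:
    """Count rows equal to target; return (matches, blocks actually read)."""
    matches = 0
    blocks_read = 0
    for block in blocks:
        # Zone-map check: skip blocks whose range cannot contain the target.
        if target < block.min_value or target > block.max_value:
            continue
        blocks_read += 1
        matches += sum(1 for value in block.values if value == target)
    return matches, blocks_read


# Hypothetical sample data, sorted so that block ranges do not overlap.
ages = sorted([20, 25, 40, 22, 31, 38, 45, 52, 60, 27, 33, 41])
blocks = build_blocks(ages, block_size=4)
matches, blocks_read = scan_equal(blocks, target=40)
print(f"{matches} match(es), read {blocks_read} of {len(blocks)} blocks")
```

In this toy data set only one of the three blocks has a range that includes 40, so the other two are never read.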

Id   Age  State
123  20   CA
345  25   WA
678  40   FL

Row storage Column storage
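The row versus column layout on this slide can be sketched in Python using the slide's own three-row table. The function names are hypothetical, and counting "values read" is a simplified stand-in for disk I/O, assumed here only to show why a query that needs one column touches less data in a column store.

```python
# The three rows shown on the slide.
rows = [
    {"Id": 123, "Age": 20, "State": "CA"},
    {"Id": 345, "Age": 25, "State": "WA"},
    {"Id": 678, "Age": 40, "State": "FL"},
]

# Row storage: all values of one row sit next to each other.
row_layout = [value for row in rows for value in row.values()]

# Column storage: all values of one column sit next to each other.
column_layout = {name: [row[name] for row in rows] for name in rows[0]}


def average_age_row_store(layout, width=3, age_index=1):
    """Average Age from the row layout; every value of every row is read."""
    values_read = len(layout)
    ages = layout[age_index::width]
    return sum(ages) / len(ages), values_read


def average_age_column_store(columns):
    """Average Age from the column layout; only the Age column is read."""
    ages = columns["Age"]
    return sum(ages) / len(ages), len(ages)


avg_row, read_row = average_age_row_store(row_layout)
avg_col, read_col = average_age_column_store(column_layout)
print(f"row store:    avg={avg_row:.2f}, values read={read_row}")
print(f"column store: avg={avg_col:.2f}, values read={read_col}")
```

Both layouts give the same average, but the row layout reads all 9 stored values while the column layout reads only the 3 ages.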

Page 8: Introduction to Amazon Redshift

Redshift Runs on Optimized Hardware

HS1.8XL: 128GB RAM, 16 Cores, 24 Spindles, 16TB Storage, 2GB/sec scan rate

HS1.XL: 16GB RAM, 2 Cores, 3 Spindles, 2TB Storage

• Optimized for I/O intensive workloads
• High disk density
• Runs in HPC - fast network
• HS1.8XL available on Amazon EC2

Page 9: Introduction to Amazon Redshift

Redshift Runs on Optimized Hardware

HS1.8XL: 128GB RAM, 16 Cores, 24 Spindles, 16TB Storage, 2GB/sec scan rate

HS1.XL: 16GB RAM, 2 Cores, 3 Spindles, 2TB Storage

Start Small

1 x XL = 2TB

Grow Big

100 x 8XL = 1.6PB
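The "Start Small" and "Grow Big" figures follow directly from the per-node storage listed on this slide. A small Python sketch of the arithmetic, assuming decimal units (1 PB = 1,000 TB); the helper name is hypothetical:

```python
# Per-node storage in TB, as listed on the slide.
NODE_STORAGE_TB = {"HS1.XL": 2, "HS1.8XL": 16}


def cluster_capacity_tb(node_type: str, node_count: int) -> int:
    """Total raw storage of a cluster in TB."""
    return NODE_STORAGE_TB[node_type] * node_count


small_tb = cluster_capacity_tb("HS1.XL", 1)     # 1 x XL
large_tb = cluster_capacity_tb("HS1.8XL", 100)  # 100 x 8XL
large_pb = large_tb / 1000                      # assumes 1 PB = 1,000 TB

print(f"1 x XL    = {small_tb} TB")
print(f"100 x 8XL = {large_tb} TB = {large_pb} PB")
```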

Page 10: Introduction to Amazon Redshift

Load Query Resize Backup Restore

Redshift Parallelizes and Distributes Everything

Compute Node 16TB

10 GigE (HPC)

Ingestion Backup Restore

SQL Clients / BI Tools

Amazon S3

Client VPC

Compute Node 16TB

Compute Node 16TB

Leader Node
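The leader node and compute node arrangement in this diagram can be illustrated with a toy Python model: the leader splits the rows across compute nodes, each node computes a partial result in parallel, and the leader combines the partials. The round-robin distribution, the thread pool, and all function names are assumptions chosen for the sketch, not Redshift's actual mechanics.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, node_count):
    """Spread rows across compute nodes round-robin (an assumed strategy)."""
    return [rows[i::node_count] for i in range(node_count)]


def compute_node_partial(slice_rows):
    """Work done on one compute node: a partial sum and row count."""
    return sum(slice_rows), len(slice_rows)


def leader_average(rows, node_count=3):
    """Leader node: fan the work out, then combine the partial results."""
    slices = distribute(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(compute_node_partial, slices))
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count


ages = [20, 25, 40, 30, 35, 50]  # hypothetical sample data
print(distribute(ages, 3))
print(leader_average(ages))
```

The combined result equals the average computed on a single machine, because sums and counts can be merged safely across nodes.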

Page 11: Introduction to Amazon Redshift

data volume

Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011

IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares

data available for analysis

data generated

Gap

Page 12: Introduction to Amazon Redshift

Redshift is Priced to Analyze All Your Data

$0.85 per hour for on-demand (2TB)

$999 per TB per year (3-yr reservation)
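To compare the two prices on this slide on the same basis, the on-demand rate can be converted to a cost per TB per year. The sketch below assumes a single 2TB node running continuously for a 365-day year; those assumptions are mine, while the prices are the ones quoted on the slide.

```python
HOURS_PER_YEAR = 24 * 365      # assumes continuous use for a 365-day year

ON_DEMAND_PER_HOUR = 0.85      # USD per hour, from the slide
ON_DEMAND_NODE_TB = 2          # storage of the on-demand node, from the slide
RESERVED_PER_TB_YEAR = 999     # USD per TB per year (3-yr reservation)

on_demand_node_year = ON_DEMAND_PER_HOUR * HOURS_PER_YEAR
on_demand_tb_year = on_demand_node_year / ON_DEMAND_NODE_TB
ratio = on_demand_tb_year / RESERVED_PER_TB_YEAR

print(f"On-demand node per year:   ${on_demand_node_year:,.0f}")
print(f"On-demand per TB per year: ${on_demand_tb_year:,.0f}")
print(f"On-demand vs. reserved:    {ratio:.1f}x")
```

Under these assumptions, on-demand works out to roughly $3,723 per TB per year, a little under four times the reserved figure.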

Page 13: Introduction to Amazon Redshift

Working with Redshift

Page 14: Introduction to Amazon Redshift

differentiated effort increases the uniqueness of an application

Page 15: Introduction to Amazon Redshift

Redshift Simplifies Provisioning

• Create a cluster in minutes

• Automatically patch your OS and data warehouse software

• Scale up to 1.6PB with a few clicks and no downtime

Page 16: Introduction to Amazon Redshift
Page 17: Introduction to Amazon Redshift
Page 18: Introduction to Amazon Redshift
Page 19: Introduction to Amazon Redshift

Integrate Redshift with remote data centers

Page 20: Introduction to Amazon Redshift
Page 21: Introduction to Amazon Redshift
Page 22: Introduction to Amazon Redshift

Compute Node 2TB

Compute Node 2TB

Compute Node 2TB

Compute Node 2TB

Leader Node

Compute Node 2TB

Compute Node 2TB

Leader Node

Amazon S3

SQL Clients / BI Tools

1. Cluster placed in read-only mode
2. New cluster provisioned
3. Data copied across (MPP)

Page 23: Introduction to Amazon Redshift

1. Cluster placed in read-only mode
2. New cluster provisioned
3. Data copied across (MPP)
4. DNS switched to new cluster (read-write)
5. Source cluster is de-provisioned

Compute Node 2TB

Compute Node 2TB

Compute Node 2TB

Compute Node 2TB

Leader Node

Compute Node 2TB

Compute Node 2TB

Leader Node

Amazon S3

SQL Clients / BI Tools
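The five resize steps on this slide can be walked through with a small in-memory simulation in Python. The `Cluster` class, the `dns` dictionary, and the naming of the new cluster are hypothetical stand-ins; the sketch only mirrors the order of the steps and does not call any AWS API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Cluster:
    """Toy stand-in for a cluster: a name, a node count, a mode, and data."""
    name: str
    node_count: int
    mode: str = "read-write"
    rows: List[int] = field(default_factory=list)


def resize(source: Cluster, new_node_count: int,
           dns: Dict[str, str]) -> Tuple[Cluster, List[str]]:
    """Follow the slide's five steps and return the new cluster and a log."""
    log = []

    source.mode = "read-only"
    log.append("1. Cluster placed in read-only mode")

    target = Cluster(f"{source.name}-resized", new_node_count, mode="read-only")
    log.append("2. New cluster provisioned")

    target.rows = list(source.rows)  # the real copy runs in parallel (MPP)
    log.append("3. Data copied across")

    dns["endpoint"] = target.name
    target.mode = "read-write"
    log.append("4. DNS switched to new cluster (read-write)")

    source.mode = "de-provisioned"
    log.append("5. Source cluster is de-provisioned")
    return target, log


dns = {"endpoint": "warehouse"}
old = Cluster("warehouse", node_count=2, rows=[1, 2, 3])
new, log = resize(old, new_node_count=4, dns=dns)
print("\n".join(log))
print(dns["endpoint"], new.node_count, new.mode)
```

Because clients connect through the endpoint name, they keep read access to the source during the copy and are pointed at the new cluster only after the data is in place.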

Page 24: Introduction to Amazon Redshift

Integrates With Existing BI Tools

Amazon Redshift

JDBC/ODBC

Page 25: Introduction to Amazon Redshift

Amazon Redshift

Live Demonstration

Jeremy Winters

Lead Architect and Database Warehouse Designer

Page 26: Introduction to Amazon Redshift

Getting Started

Page 27: Introduction to Amazon Redshift

Reporting Warehouse

• Accelerated operational reporting
• Support for short-time use cases
• Data compression, index redundancy

RDBMS Redshift

OLTP ERP Reporting and BI

Page 28: Introduction to Amazon Redshift

Data Integration Partners*

On-Premises Integration

RDBMS Redshift

OLTP ERP Reporting and BI

* as of 3/14/2013

Page 29: Introduction to Amazon Redshift

Live Archive for (Structured) Big Data

• Direct integration with copy command
• High velocity data ages into Redshift
• Low cost, high scale option for new apps

DynamoDB Redshift

OLTP Web Apps Reporting and BI
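The "direct integration with copy command" bullet refers to Redshift's COPY statement, which can read from a DynamoDB table. As a rough sketch, the Python helper below only builds such a statement as a string; it does not connect to anything. The table names and the role ARN (with a placeholder account ID) are hypothetical, and the choice of `IAM_ROLE` for authorization and a `READRATIO` of 50 are assumptions for the example rather than details from the slides.

```python
def dynamodb_copy_statement(redshift_table: str, dynamodb_table: str,
                            iam_role_arn: str, read_ratio: int = 50) -> str:
    """Build a COPY statement that loads a DynamoDB table into Redshift.

    READRATIO is the percentage of the DynamoDB table's provisioned read
    throughput the load may use. Inputs are interpolated as-is, so this
    illustrative helper must not be fed untrusted values.
    """
    if read_ratio < 1:
        raise ValueError("read_ratio must be a positive percentage")
    return (
        f"COPY {redshift_table} "
        f"FROM 'dynamodb://{dynamodb_table}' "
        f"IAM_ROLE '{iam_role_arn}' "
        f"READRATIO {read_ratio};"
    )


# Hypothetical names; the account ID in the ARN is a placeholder.
statement = dynamodb_copy_statement(
    redshift_table="events",
    dynamodb_table="Events",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRole",
)
print(statement)
```

The resulting statement would be run from any SQL client connected to the cluster.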

Page 30: Introduction to Amazon Redshift

Cloud ETL for Big Data

• Maintain online SQL access to historical logs
• Transformation and enrichment with EMR
• Longer history ensures better insight

Redshift Reporting and BI Elastic MapReduce

S3

Page 31: Introduction to Amazon Redshift

Redshift

Fast

Low Cost
• less than $1 / hour to get started
• less than $1K / TB to run Redshift for a year

Easy To Get Started

Please visit: http://aws.amazon.com/redshift/

“up to 50 times faster than our current OLAP solution”

“exponential gains in performance”

Page 32: Introduction to Amazon Redshift

Questions?

http://aws.amazon.com/resources/databaseservices/webinars