Data Warehousing with Amazon Redshift


Transcript of Data Warehousing with Amazon Redshift

Page 1: Data Warehousing with Amazon Redshift

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Tony Gibbs, Big Data Solutions Architect

Data Warehousing with Amazon Redshift

Page 2: Data Warehousing with Amazon Redshift

Deep Dive Agenda

• Amazon Redshift history and development
• Cluster architecture
• Concepts and terminology
• Node components
• Storage deep dive
• Design considerations
• Parallelism deep dive
• Open Q&A

Page 3: Data Warehousing with Amazon Redshift

Amazon Redshift History & Development

Page 4: Data Warehousing with Amazon Redshift

[Diagram: Amazon Redshift combines a columnar, MPP engine with AWS services: AWS IAM, Amazon VPC, Amazon SWF, Amazon S3, AWS KMS, Amazon Route 53, Amazon CloudWatch, and Amazon EC2.]

Page 5: Data Warehousing with Amazon Redshift

February 2013 to October 2016: more than 100 significant patches and more than 130 significant features.

Page 6: Data Warehousing with Amazon Redshift

Amazon Redshift Cluster Architecture

Page 7: Data Warehousing with Amazon Redshift

Amazon Redshift Cluster Architecture

Massively parallel, shared nothing

Leader node
• SQL endpoint
• Stores metadata
• Coordinates parallel SQL processing

Compute nodes
• Local, columnar storage
• Execute queries in parallel
• Load, backup, restore

[Architecture diagram: SQL clients/BI tools connect to the leader node via JDBC/ODBC. The leader node communicates with the compute nodes (each 16 cores, 128 GB RAM, 16 TB disk) over a 10 GigE (HPC) network. Ingestion, backup, and restore flow between the compute nodes and S3 / EMR / DynamoDB / SSH.]

Page 8: Data Warehousing with Amazon Redshift

Compute & Leader Node Components

Page 9: Data Warehousing with Amazon Redshift

[Diagram: leader node plus compute nodes, each compute node with 16 cores, 128 GB RAM, and 16 TB of disk.]

Page 10: Data Warehousing with Amazon Redshift

[The leader node in the cluster diagram is highlighted.]

Leader node components:
• Parser & Rewriter
• Planner & Optimizer
• Code Generator
   – Input: optimized plan
   – Output: >= 1 C++ functions
   – Compiler
• Task Scheduler
• WLM
   – Admission
   – Scheduling
• PostgreSQL Catalog Tables

Page 11: Data Warehousing with Amazon Redshift

[The compute nodes in the cluster diagram are highlighted.]

Compute node components:
• Query execution processes
• Backup & restore processes
• Replication processes
• Local storage
   – Disks
   – Slices
   – Tables
      – Columns
      – Blocks
      – Superblocks

Page 12: Data Warehousing with Amazon Redshift

(The compute node component slide from page 11 is repeated, highlighting the local storage hierarchy: disks, slices, tables, columns, blocks, superblocks.)

Page 13: Data Warehousing with Amazon Redshift

Concepts and Terminology

Page 14: Data Warehousing with Amazon Redshift

Designed for I/O Reduction
• Columnar storage
• Data compression
• Zone maps

   aid   loc   dt
   1     SFO   2016-09-01
   2     JFK   2016-09-14
   3     SFO   2017-04-01
   4     JFK   2017-05-14

CREATE TABLE loft_deep_dive (
    aid INT      --audience_id
   ,loc CHAR(3)  --location
   ,dt  DATE     --date
);

• Accessing dt with row storage:
   – Need to read everything
   – Unnecessary I/O

Page 15: Data Warehousing with Amazon Redshift

Designed for I/O Reduction
• Columnar storage
• Data compression
• Zone maps

   aid   loc   dt
   1     SFO   2016-09-01
   2     JFK   2016-09-14
   3     SFO   2017-04-01
   4     JFK   2017-05-14

CREATE TABLE loft_deep_dive (
    aid INT      --audience_id
   ,loc CHAR(3)  --location
   ,dt  DATE     --date
);

• Accessing dt with columnar storage:
   – Only scan blocks for the relevant column

Page 16: Data Warehousing with Amazon Redshift

Designed for I/O Reduction
• Columnar storage
• Data compression
• Zone maps

   aid   loc   dt
   1     SFO   2016-09-01
   2     JFK   2016-09-14
   3     SFO   2017-04-01
   4     JFK   2017-05-14

CREATE TABLE loft_deep_dive (
    aid INT     ENCODE LZO
   ,loc CHAR(3) ENCODE BYTEDICT
   ,dt  DATE    ENCODE RUNLENGTH
);

• Columns grow and shrink independently
• Effective compression ratios due to like data
• Reduces storage requirements
• Reduces I/O
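As a quick check of what ended up in place, the encoding chosen for each column can be read back from the PG_TABLE_DEF catalog view. A minimal sketch, reusing the example table above (the table must be on your search_path to appear in PG_TABLE_DEF):

   SELECT "column", type, encoding
   FROM pg_table_def
   WHERE tablename = 'loft_deep_dive';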

Page 17: Data Warehousing with Amazon Redshift

Designed for I/O Reduction
• Columnar storage
• Data compression
• Zone maps

   aid   loc   dt
   1     SFO   2016-09-01
   2     JFK   2016-09-14
   3     SFO   2017-04-01
   4     JFK   2017-05-14

CREATE TABLE loft_deep_dive (
    aid INT      --audience_id
   ,loc CHAR(3)  --location
   ,dt  DATE     --date
);

• In-memory block metadata
• Contains per-block MIN and MAX values
• Effectively prunes blocks which cannot contain data for a given query
• Eliminates unnecessary I/O

Page 18: Data Warehousing with Amazon Redshift

Zone Maps

SELECT COUNT(*) FROM LOGS WHERE DATE = '09-JUNE-2013'

Unsorted table (zone maps per block):
   MIN: 01-JUNE-2013   MAX: 20-JUNE-2013
   MIN: 08-JUNE-2013   MAX: 30-JUNE-2013
   MIN: 12-JUNE-2013   MAX: 20-JUNE-2013
   MIN: 02-JUNE-2013   MAX: 25-JUNE-2013

Sorted by date (zone maps per block):
   MIN: 01-JUNE-2013   MAX: 06-JUNE-2013
   MIN: 07-JUNE-2013   MAX: 12-JUNE-2013
   MIN: 13-JUNE-2013   MAX: 18-JUNE-2013
   MIN: 19-JUNE-2013   MAX: 24-JUNE-2013

In the unsorted table every block's range overlaps 09-JUNE-2013, so all blocks must be scanned; in the table sorted by date, only the first block can contain that date.

Page 19: Data Warehousing with Amazon Redshift

Terminology and Concepts: Data Sorting

• Goals:
   – Physically order rows of table data based on certain column(s)
   – Optimize effectiveness of zone maps
   – Enable MERGE JOIN operations
• Impact:
   – Enables range-restricted scans (rrscans) to prune blocks by leveraging zone maps
   – Overall reduction in block I/O
• Achieved with the table property SORTKEY defined over one or more columns (see the DDL sketch below)
• Optimal SORTKEY is dependent on:
   – Query patterns
   – Data profile
   – Business requirements
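A minimal DDL sketch of a SORTKEY, reusing the earlier example table; the choice of a compound sort key on dt, then loc, is an illustrative assumption for a workload that filters mainly on date:

   CREATE TABLE loft_deep_dive_sorted (
       aid INT      --audience_id
      ,loc CHAR(3)  --location
      ,dt  DATE     --date
   )
   COMPOUND SORTKEY (dt, loc);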

Page 20: Data Warehousing with Amazon Redshift

Terminology and Concepts: Slices

A slice can be thought of as a "virtual compute node"
• Unit of data partitioning
• Parallel query processing

Facts about slices:
• Each compute node has either 2, 16, or 32 slices
• Table rows are distributed to slices
• A slice processes only its own data

[Storage hierarchy diagram: compute node → disks → slices → columns → blocks]
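To see how many slices a given cluster actually exposes, the STV_SLICES system view can be queried; a minimal sketch (output depends on node type and node count):

   SELECT node, COUNT(*) AS slices
   FROM stv_slices
   GROUP BY node
   ORDER BY node;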

Page 21: Data Warehousing with Amazon Redshift

Data Distribution

• Distribution style is a table property which dictates how that table's data is distributed throughout the cluster:
   – KEY: value is hashed; the same value goes to the same location (slice)
   – ALL: full table data goes to the first slice of every node
   – EVEN: round robin
• Goals:
   – Distribute data evenly for parallel processing
   – Minimize data movement during query processing

[Diagram: under KEY, values keyA–keyD hash to specific slices across node 1 (slices 1–2) and node 2 (slices 3–4); under ALL, a full copy of the table lands on the first slice of each node; under EVEN, rows are distributed round robin across all four slices.]

Page 22: Data Warehousing with Amazon Redshift

Data Distribution: Example

CREATE TABLE loft_deep_dive (
    aid INT      --audience_id
   ,loc CHAR(3)  --location
   ,dt  DATE     --date
)
DISTSTYLE (EVEN|KEY|ALL);

[Diagram: two compute nodes, CN1 with slices 0–1 and CN2 with slices 2–3. Table loft_deep_dive has three user columns (aid, loc, dt) and three system columns (ins, del, row) on each slice.]

Page 23: Data Warehousing with Amazon Redshift

Data Distribution: EVEN Example

CREATE TABLE loft_deep_dive (
    aid INT      --audience_id
   ,loc CHAR(3)  --location
   ,dt  DATE     --date
)
DISTSTYLE EVEN;

INSERT INTO loft_deep_dive VALUES
 (1, 'SFO', '2016-09-01')
,(2, 'JFK', '2016-09-14')
,(3, 'SFO', '2017-04-01')
,(4, 'JFK', '2017-05-14');

[Diagram: the four rows are round-robined across slices 0–3 on CN1 and CN2, one row per slice.]

(3 user columns + 3 system columns) x (4 slices) = 24 blocks (24 MB)

Page 24: Data Warehousing with Amazon Redshift

Data Distribution: KEY Example #1

CREATE TABLE loft_deep_dive (
    aid INT      --audience_id
   ,loc CHAR(3)  --location
   ,dt  DATE     --date
)
DISTSTYLE KEY DISTKEY (loc);

INSERT INTO loft_deep_dive VALUES
 (1, 'SFO', '2016-09-01')
,(2, 'JFK', '2016-09-14')
,(3, 'SFO', '2017-04-01')
,(4, 'JFK', '2017-05-14');

[Diagram: both 'SFO' rows hash to one slice and both 'JFK' rows to another; only 2 of the 4 slices hold data, 2 rows each.]

(3 user columns + 3 system columns) x (2 slices) = 12 blocks (12 MB)

Page 25: Data Warehousing with Amazon Redshift

Data Distribution: KEY Example #2

CREATE TABLE loft_deep_dive (
    aid INT      --audience_id
   ,loc CHAR(3)  --location
   ,dt  DATE     --date
)
DISTSTYLE KEY DISTKEY (aid);

INSERT INTO loft_deep_dive VALUES
 (1, 'SFO', '2016-09-01')
,(2, 'JFK', '2016-09-14')
,(3, 'SFO', '2017-04-01')
,(4, 'JFK', '2017-05-14');

[Diagram: the four distinct aid values hash across slices 0–3, one row per slice.]

(3 user columns + 3 system columns) x (4 slices) = 24 blocks (24 MB)

Page 26: Data Warehousing with Amazon Redshift

Data Distribution: ALL Example

CREATE TABLE loft_deep_dive (
    aid INT      --audience_id
   ,loc CHAR(3)  --location
   ,dt  DATE     --date
)
DISTSTYLE ALL;

INSERT INTO loft_deep_dive VALUES
 (1, 'SFO', '2016-09-01')
,(2, 'JFK', '2016-09-14')
,(3, 'SFO', '2017-04-01')
,(4, 'JFK', '2017-05-14');

[Diagram: a full copy of all four rows lands on the first slice of each node (slice 0 on CN1 and slice 2 on CN2); the remaining slices hold no data.]

(3 user columns + 3 system columns) x (2 slices) = 12 blocks (12 MB)

Page 27: Data Warehousing with Amazon Redshift

Terminology and Concepts: Data Distribution

KEY
• The key creates an even distribution of data
• Joins are performed between large fact/dimension tables
• Optimizes merge joins and GROUP BY

ALL
• Small and medium size dimension tables (< 2–3M rows)

EVEN
• When a key cannot produce an even distribution

(A sketch of a collocated fact/dimension pair follows below.)
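A minimal sketch of these guidelines, with hypothetical table and column names: a small dimension distributed ALL so every node has a local copy, and a large fact table distributed on the join column so matching rows are collocated:

   CREATE TABLE dim_location (     --hypothetical small dimension
       loc  CHAR(3)
      ,city VARCHAR(64)
   )
   DISTSTYLE ALL;

   CREATE TABLE fact_audience (    --hypothetical large fact table
       aid INT
      ,loc CHAR(3)
      ,dt  DATE
   )
   DISTSTYLE KEY DISTKEY (loc)
   COMPOUND SORTKEY (dt);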

Page 28: Data Warehousing with Amazon Redshift

Storage Deep Dive

Page 29: Data Warehousing with Amazon Redshift

Storage Deep Dive: Disks

Amazon Redshift utilizes locally attached storage devices
• Compute nodes have 2.5–3x the advertised storage capacity
• 1, 3, 8, or 24 disks depending on node type

Each disk is split into two partitions:
• Local data storage, accessed by the local compute node
• Mirrored data, accessed by a remote compute node

Partitions are raw devices
• Local storage devices are ephemeral in nature
• Tolerant to multiple disk failures on a single node

Page 30: Data Warehousing with Amazon Redshift

Storage Deep Dive: Blocks

Column data is persisted to 1 MB immutable blocks

Each block contains in-memory metadata:
• Zone maps (MIN/MAX values)
• Location of previous/next block

Blocks are individually compressed with 1 of 10 encodings

A full block contains between 16 and 8.4 million values
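As a rough way to see these per-column blocks, the block count per column of a table can be read from the STV_BLOCKLIST system view; a minimal sketch, reusing the earlier example table:

   SELECT b.col, COUNT(*) AS blocks
   FROM stv_blocklist b
   JOIN stv_tbl_perm p
     ON b.tbl = p.id AND b.slice = p.slice
   WHERE p.name = 'loft_deep_dive'
   GROUP BY b.col
   ORDER BY b.col;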

Page 31: Data Warehousing with Amazon Redshift

Storage Deep Dive: Columns

Column: logical structure accessible via SQL
• Physical structure is a doubly linked list of blocks
• These block chains exist on each slice for each column
• All sorted & unsorted block chains compose a column

Column properties include:
• Distribution key
• Sort key
• Compression encoding

Columns shrink and grow independently, one block at a time
Three system columns per table per slice for MVCC

Page 32: Data Warehousing with Amazon Redshift

Block Properties: Design Considerations

• Small writes:
   – Batch processing system, optimized for processing massive amounts of data
   – 1 MB size + immutable blocks means that blocks are cloned on write so as not to introduce fragmentation
   – A small write (~1–10 rows) has a similar cost to a larger write (~100K rows)
• UPDATE and DELETE:
   – Immutable blocks mean that rows are only logically deleted on UPDATE or DELETE
   – Must VACUUM or deep copy to remove ghost rows from the table (see the sketch below)
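A minimal sketch of the two cleanup options just mentioned, reusing the example table (the staging table name is hypothetical):

   -- Option 1: vacuum in place to reclaim ghost rows and re-sort.
   VACUUM FULL loft_deep_dive;
   ANALYZE loft_deep_dive;

   -- Option 2: deep copy into a fresh table, then swap it in.
   CREATE TABLE loft_deep_dive_new (LIKE loft_deep_dive);
   INSERT INTO loft_deep_dive_new SELECT * FROM loft_deep_dive;
   DROP TABLE loft_deep_dive;
   ALTER TABLE loft_deep_dive_new RENAME TO loft_deep_dive;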

Page 33: Data Warehousing with Amazon Redshift

Column Properties: Design Considerations

• Compression:
   – COPY automatically analyzes and compresses data when loading into empty tables
   – ANALYZE COMPRESSION checks existing tables and proposes optimal compression algorithms for each column
   – Changing column encoding requires a table rebuild
• DISTKEY and SORTKEY significantly influence performance (orders of magnitude)
• Distribution keys:
   – A poor DISTKEY can introduce data skew and an unbalanced workload
   – A query completes only as fast as the slowest slice completes
• Sort keys:
   – A SORTKEY is only as effective as the data profile allows it to be
   – Selectivity needs to be considered

(A sketch of the compression and skew checks follows below.)
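A minimal sketch of those checks, reusing the example table; the COMPROWS sample size and the skew/unsorted thresholds are illustrative assumptions:

   -- Propose encodings for an existing, already-loaded table.
   ANALYZE COMPRESSION loft_deep_dive COMPROWS 100000;

   -- Look for distribution skew and unsorted data via SVV_TABLE_INFO.
   SELECT "table", diststyle, skew_rows, unsorted
   FROM svv_table_info
   WHERE skew_rows > 4 OR unsorted > 20
   ORDER BY skew_rows DESC;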

Page 34: Data Warehousing with Amazon Redshift

Parallelism Deep Dive

Page 35: Data Warehousing with Amazon Redshift

Storage Deep Dive: Slices

Each compute node has either 2, 16, or 32 slices

A slice can be thought of as a "virtual compute node"
• Unit of data partitioning
• Parallel query processing

Facts about slices:
• Table rows are distributed to slices
• A slice processes only its own data
• Within a compute node, all slices read from and write to all disks

Page 36: Data Warehousing with Amazon Redshift

[The leader node component diagram is repeated, with one addition to the list from page 10:]
• Amazon Redshift System Tables (STV)
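The STV tables expose transient cluster state held on the leader node. A minimal sketch showing in-flight queries via STV_RECENTS:

   SELECT user_name, status, duration, query
   FROM stv_recents
   WHERE status = 'Running';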


Page 38: Data Warehousing with Amazon Redshift

Query Execution Terminology

Step: An individual operation needed during query execution. Steps need to be combined to allow compute nodes to perform a join. Examples: scan, sort, hash, aggregate (aggr).

Segment: A combination of several steps that can be done by a single process. The smallest compilation unit executable by a slice. Segments within a stream run in parallel.

Stream: A collection of combined segments which output to the next stream or to the SQL client.
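These three levels can be inspected for any completed query in the SVL_QUERY_SUMMARY system view; a minimal sketch (replace 12345 with a real query ID, for example one taken from STL_QUERY):

   SELECT stm, seg, step, label, rows, bytes
   FROM svl_query_summary
   WHERE query = 12345
   ORDER BY stm, seg, step;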

Page 39: Data Warehousing with Amazon Redshift

Visualizing Streams, Segments, and Steps

Stream 0: Segment 0 (steps 0–2), Segment 1 (steps 0–4), Segment 2 (steps 0–3), Segment 3 (steps 0–5)
Stream 1: Segment 4 (steps 0–3), Segment 5 (steps 0–2), Segment 6 (steps 0–4)
Stream 2: Segment 7 (steps 0–1), Segment 8 (steps 0–1)

[Time axis runs left to right.]

Page 40: Data Warehousing with Amazon Redshift

Query Lifecycle

1. The client submits the query over JDBC/ODBC.
2. Leader node: Parser → Query Planner → Code Generator (generates code for all segments of one stream; explain plans are produced here).
3. Compute nodes: receive the compiled code, run it, and return results to the leader node.
4. The leader node performs the final computations and returns results to the client.

Segments in a stream are executed concurrently. Each step in a segment is executed serially.

Page 41: Data Warehousing with Amazon Redshift

Query Execution Deep Dive: Leader Node

1. The leader node receives the query and parses the SQL.
2. The parser produces a logical representation of the original query.
3. This query tree is input into the query optimizer (volt).
4. Volt rewrites the query to maximize its efficiency. Sometimes a single query will be rewritten as several dependent statements in the background.
5. The rewritten query is sent to the planner, which generates one or more query plans and chooses the execution plan with the best estimated performance.
6. The query plan is sent to the execution engine, where it is translated into steps, segments, and streams.
7. This translated plan is sent to the code generator, which generates a C++ function for each segment.
8. The generated C++ is compiled with gcc into a .o file and distributed to the compute nodes.
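To see the plan the leader node settles on before code generation, EXPLAIN can be run on any query; a minimal sketch reusing the earlier example table:

   EXPLAIN
   SELECT loc, COUNT(*)
   FROM loft_deep_dive
   WHERE dt >= '2017-01-01'
   GROUP BY loc;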

Page 42: Data Warehousing with Amazon Redshift

Query Execution Deep Dive: Compute Nodes

• Slices execute query segments in parallel
• Executable segments are created for one stream at a time, in sequence
• When the compute nodes are done, they return the query results to the leader node for final processing
• The leader node merges data into a single result set and addresses any needed sorting or aggregation
• The leader node then returns the results to the client

Page 43: Data Warehousing with Amazon Redshift

(The streams/segments/steps visualization from page 39 is repeated.)

Page 44: Data Warehousing with Amazon Redshift

Query Execution

[Diagram: the stream/segment/step plan from page 39 is shown executing in parallel on slices 0–3; each slice runs the same segments, and the time axis runs left to right.]

Page 45: Data Warehousing with Amazon Redshift

Parallelism considerations with Redshift slices

DS2.8XL compute node (16 slices, numbered 0–15)

Ingestion throughput: each slice's query processors can load one file at a time:
• Streaming decompression
• Parse
• Distribute
• Write

Loading a single file realizes only partial node usage, as just 6.25% of the slices (1 of 16) are active.

Page 46: Data Warehousing with Amazon Redshift

Design considerations for Redshift slices

• Use at least as many input files as there are slices in the cluster
• With 16 input files, all slices are working, so you maximize throughput (see the COPY sketch below)
• COPY continues to scale linearly as you add nodes

[Diagram: 16 input files keep all 16 slices (0–15) of a DS2.8XL compute node busy.]
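A minimal COPY sketch for the split-file approach; the S3 prefix and IAM role ARN are hypothetical, and the files under the prefix are assumed to be pipe-delimited and gzip-compressed:

   COPY loft_deep_dive
   FROM 's3://my-bucket/loft_deep_dive/part-'
   IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
   DELIMITER '|'
   GZIP;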

Page 47: Data Warehousing with Amazon Redshift

Open source tools

https://github.com/awslabs/amazon-redshift-utils
https://github.com/awslabs/amazon-redshift-monitoring
https://github.com/awslabs/amazon-redshift-udfs

Admin scripts: collection of utilities for running diagnostics on your cluster
Admin views: collection of utilities for managing your cluster, generating schema DDL, etc.
ColumnEncodingUtility: gives you the ability to apply optimal column encoding to an established schema with data already loaded