Transcript of Amazon Aurora: A deep-dive into what’s new

Page 1

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Amazon Aurora: A deep-dive into what’s new

Sailesh Krishnamurthy

Amazon Web Services

Percona Live

April, 2017

Page 2

Agenda

Catching up – 18 months since GA
  • What is Aurora?
  • Who is using it?
  • Recent announcements

Aurora performance deep-dive
  • Designing for performance
  • Aurora performance enhancements

Aurora availability architecture
  • Designing for availability
  • Recent availability enhancements

Page 3

Amazon Aurora

Speed and availability of high-end commercial databases

Simplicity and cost-effectiveness of open source databases

Drop-in compatibility with MySQL and PostgreSQL

Simple pay as you go pricing

Delivered as a managed service

Reimagining Databases for the Cloud

Page 4

Re-imagining the relational database

1. Scale-out, distributed, multi-tenant design
2. Service-oriented architecture leveraging AWS services
3. Fully managed service – automate administrative tasks

Page 5

Scale-out, distributed, multi-tenant architecture

[Diagram: one master and three replicas (each with SQL, transactions, and caching layers) in Availability Zones 1–3, over a shared storage volume striped across storage nodes with SSDs]

▪ Purpose-built log-structured distributed storage system designed for databases
▪ Storage volume is striped across hundreds of storage nodes distributed over 3 different availability zones
▪ Six copies of data, two copies in each availability zone, to protect against AZ+1 failures
▪ Plan to apply the same principles to other layers of the stack


Page 6

Leveraging cloud ecosystem

Lambda – Invoke Lambda events from stored procedures/triggers.

S3 – Load data from S3, store snapshots and backups in S3.

IAM – Use IAM roles to manage database access control.

CloudWatch – Upload system metrics and audit logs to CloudWatch.
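For illustration, a minimal sketch of the S3 and Lambda integrations using Aurora MySQL's SQL extensions (LOAD DATA FROM S3 and the mysql.lambda_async procedure); the bucket, table, trigger, and function names are hypothetical, and both statements assume an IAM role with the needed permissions is attached to the cluster:

  -- Load a CSV file from S3 directly into a table.
  LOAD DATA FROM S3 's3://my-example-bucket/orders.csv'
  INTO TABLE orders
  FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';

  -- Fire an AWS Lambda function asynchronously from a trigger
  -- (assumes orders has an integer id column).
  DELIMITER ;;
  CREATE TRIGGER orders_after_insert AFTER INSERT ON orders
  FOR EACH ROW
  BEGIN
    CALL mysql.lambda_async(
      'arn:aws:lambda:us-east-1:123456789012:function:notify-order',
      CONCAT('{"order_id": ', NEW.id, '}'));
  END ;;
  DELIMITER ;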

Page 7

Automate administrative tasks

You: schema design, query construction, query optimization

AWS: automatic fail-over, backup & recovery, isolation & security, industry compliance, push-button scaling, automated patching, advanced monitoring, routine maintenance

Takes care of your time-consuming database management tasks, freeing you to focus on your applications and business

Page 8

Aurora customer adoption

Aurora is used by:

2/3 of top 100 AWS customers

8 of top 10 gaming customers

Fastest growing service in AWS history

Page 9

Who is moving to Aurora and why?

Customers using commercial engines:
▪ One tenth of the cost; no licenses
▪ Integration with cloud ecosystem
▪ Comparable performance and availability
▪ Migration tooling and services

Customers using MySQL engines:
▪ Higher performance – up to 5x
▪ Better availability and durability
▪ Reduces cost – up to 60%
▪ Easy migration; no application change

Page 10

Recent announcements

• Performance enhancements:

Fast DDL, fast index build, spatial indexing, hot row contention

• Availability features:

Zero-downtime patching, database cloning (Q2), database backtrack (Q2)

• Eco-system integration:

Load from S3, IAM integration (Q2), select into S3 (Q2), log upload to CloudWatch Logs & S3 (Q2)

• Cost reduction:

t2.small – cuts cost of entry by half – you can run Aurora for $1 / day

• Growing footprint:

Launched in LHR, YUL, CMH, and SFO – now available in all 3AZ regions

Page 11

MySQL compatibility

MySQL 5.6 / InnoDB compatible
▪ No application compatibility issues reported in the last 18 months
▪ MySQL ISV applications run pretty much as is

Working on 5.7 compatibility
▪ Running slower than expected – hope to make it available within Q2
▪ Backported 81 fixes from different MySQL releases

[Partner logos: business intelligence, data integration, query and monitoring tools]

“We ran our compatibility test suites against Amazon Aurora and everything just worked.”
– Dan Jewett, VP, Product Management at Tableau

Page 12

Designing for Performance

Page 13

Aurora Performance Tenets

DO LESS WORK
▪ Do fewer IOs
▪ Minimize network packets
▪ Cache prior results
▪ Offload the database engine

BE MORE EFFICIENT
▪ Process asynchronously
▪ Reduce latency path
▪ Use lock-free data structures
▪ Batch operations together

DATABASES ARE ALL ABOUT I/O
NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND
HIGH-THROUGHPUT PROCESSING IS ALL ABOUT CONTEXT SWITCHES

Page 14

IO traffic in MySQL

[Diagram: MySQL with a replica – primary and standby instances in AZ 1 and AZ 2, each writing to EBS and an EBS mirror, with backups to Amazon S3. Types of write: binlog, data, double-write, log, FRM files.]

IO FLOW
Issue write to EBS – EBS issues to mirror, ack when both done
Stage write to standby instance using synchronous replication
Issue write to EBS on standby instance

OBSERVATIONS
Steps 1, 3, 4 are sequential and synchronous
This amplifies both latency and jitter
Many types of writes for each user operation
Have to write data blocks twice to avoid torn writes

PERFORMANCE
780K transactions
7,388K I/Os per million txns (excludes mirroring, standby)
Average 7.4 I/Os per transaction

30 minute SysBench write-only workload, 100GB dataset, RDS Multi-AZ, 30K PIOPS

Page 15

IO traffic in Aurora (Database Node)

[Diagram: Aurora primary and replica instances across AZ 1–3 issuing asynchronous 4/6-quorum distributed writes to shared storage, with backups to Amazon S3. Type of write: redo log only.]

IO FLOW
Boxcar redo log records – fully ordered by LSN
Shuffle to appropriate segments – partially ordered
Boxcar to storage nodes and issue writes

OBSERVATIONS
Only write redo log records; all steps asynchronous
No data block writes (checkpoint, cache replacement)
6X more log writes, but 9X less network traffic
Tolerant of network and storage outlier latency

PERFORMANCE
27,378K transactions – 35X more
950K I/Os per 1M txns (6X amplification) – 7.7X less

Page 16

IO traffic in Aurora (Storage Node)

[Diagram: storage node receiving log records from the primary instance into an incoming queue, with an update queue, hot log, data blocks, point-in-time snapshots to S3 backup, GC, scrub, coalesce, and sort/group stages, plus peer-to-peer gossip with peer storage nodes]

IO FLOW
1. Receive record and add to in-memory queue
2. Persist record and ACK
3. Organize records and identify gaps in log
4. Gossip with peers to fill in holes
5. Coalesce log records into new data block versions
6. Periodically stage log and new block versions to S3
7. Periodically garbage collect old versions
8. Periodically validate CRC codes on blocks

OBSERVATIONS
All steps are asynchronous
Only steps 1 and 2 are in foreground latency path
Input queue is 46X less than MySQL (unamplified, per node)
Favor latency-sensitive operations
Use disk space to buffer against spikes in activity

Page 17

IO traffic in Aurora Replicas

MYSQL READ SCALING
[Diagram: MySQL master (30% read / 70% write) replicating to a MySQL replica (30% new reads / 70% write) via single-threaded binlog apply, each with its own data volume]
Logical: ship SQL statements to the replica
Write workload similar on both instances
Independent storage
Can result in data drift between master and replica

AMAZON AURORA READ SCALING
[Diagram: Aurora master (30% read / 70% write) and Aurora replica (100% new reads) sharing multi-AZ storage; page-cache updates on the replica come from the redo stream]
Physical: ship redo from master to replica
Replica shares storage; no writes performed
Cached pages have redo applied
Advance read view when all commits seen

Page 18

Real-life data – read replica latency

“In MySQL, we saw replica lag spike to almost 12 minutes, which is almost absurd from an application’s perspective. With Aurora, the maximum read replica lag across 4 replicas never exceeded 20 ms.”

Page 19

Asynchronous group commits

[Diagram: transactions T1–Tn, each a sequence of reads, writes, and a commit; pending commits queue in LSN order and become durable as the head-node durable LSN advances over time (group commit)]

TRADITIONAL APPROACH
Maintain a buffer of log records to write out to disk
Issue write when buffer full or time out waiting for writes
First writer has latency penalty when write rate is low

AMAZON AURORA
Request I/O with first write, fill buffer till write picked up
Individual write durable when 4 of 6 storage nodes ACK
Advance DB durable point up to earliest pending ACK

Page 20

Adaptive thread pool

MYSQL THREAD MODEL
Standard MySQL – one thread per connection
Doesn’t scale with connection count
MySQL EE – connections assigned to thread group
Requires careful stall threshold tuning

AURORA THREAD MODEL
Re-entrant connections multiplexed to active threads
Kernel-space epoll() inserts into latch-free event queue
Dynamically sized thread pool
Gracefully handles 5000+ concurrent client sessions on r3.8xlarge

[Diagram: client connections feed a latch-free task queue via epoll(); a pool of worker threads services the queue]

Page 21

Aurora lock management

[Diagram: lock chains of scan, delete, and insert requests under the MySQL lock manager vs. the Aurora lock manager]

▪ Same locking semantics as MySQL
▪ Concurrent access to lock chains
▪ Multiple scanners allowed in an individual lock chain
▪ Lock-free deadlock detection

Needed to support many concurrent sessions, high update throughput

Page 22

Scaling with instance sizes

Aurora scales with instance size for both reads and writes.

[Charts: write performance and read performance vs. instance size for Aurora, MySQL 5.6, and MySQL 5.7]

Page 23

Real-life data – gaming workload: Aurora vs. RDS MySQL (r3.4xlarge, Multi-AZ)

Aurora 3X faster on r3.4xlarge

Page 24

Recent Performance Enhancements

Page 25

Faster index build (Aug 2016)

▪ MySQL 5.6 inserts entries top down into the new btree, causing splits and lots of logging.
▪ Aurora builds the leaf blocks and then the branches of the tree (similar to MySQL 5.7).
  • No splits during the build.
  • Each page touched only once.
  • One log record per page.

2-4X better than MySQL 5.6 or MySQL 5.7

[Chart: index build time in hours (0–12) for r3.large on a 10GB dataset, r3.8xlarge on a 10GB dataset, and r3.8xlarge on a 100GB dataset – RDS MySQL 5.6, RDS MySQL 5.7, Aurora 2016]
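For reference, the statements that exercise this build path are ordinary secondary-index builds; the table and column names below are hypothetical:

  -- Secondary index builds in Aurora construct leaf pages first, then branch pages.
  ALTER TABLE orders ADD INDEX idx_customer_created (customer_id, created_at);

  -- CREATE INDEX takes the same path.
  CREATE INDEX idx_status ON orders (status);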

Page 26

Hot row contention (Nov 2016)

[Diagram: lock chains of scan, delete, and insert requests under the MySQL and Aurora lock managers]

Highly contended workloads had high memory and CPU usage

▪ 1.9 (Nov) – lock compression (bitmap for hot locks)
▪ 1.9 – replace spinlocks with blocking futex – up to 12x reduction in CPU, 3x improvement in throughput
▪ December – use dynamic programming to release locks: from O(totalLocks * waitLocks) to O(totalLocks)

Throughput on Percona TPC-C 100 improved 29x (from 1,452 txns/min to 42,181 txns/min)
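A minimal sketch of the kind of hot-row workload these changes target (schema and values are hypothetical) – many sessions repeatedly updating the same row in short transactions:

  -- Run concurrently from many sessions; every transaction contends on the same row lock.
  START TRANSACTION;
  UPDATE inventory
     SET quantity = quantity - 1
   WHERE item_id = 42;   -- the hot row
  COMMIT;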

Page 27

Hot row contention

Percona TPC-C – 10GB
                     MySQL 5.6   MySQL 5.7    Aurora   Improvement
500 connections          6,093      25,289    73,955   2.92x
5000 connections         1,671       2,592    42,181   16.3x

Percona TPC-C – 100GB
                     MySQL 5.6   MySQL 5.7    Aurora   Improvement
500 connections          3,231      11,868    70,663   5.95x
5000 connections         5,575      13,005    30,221   2.32x

* Numbers are in tpmC, measured using release 1.10 on an r3.8xlarge; MySQL numbers using RDS and EBS with 30K PIOPS

Page 28

Insert performance (Dec 2015)

▪ Accelerates batch inserts sorted by primary key – works by caching the cursor position in an index traversal.
▪ Dynamically turns itself on or off based on data pattern.
▪ Avoids contention in acquiring latches while navigating down the tree.
▪ Bi-directional; works across all insert statements:
  • LOAD INFILE, INSERT INTO SELECT, INSERT INTO REPLACE, and multi-value inserts.

[Diagram: B-tree with root, index, and leaf records R0–R8]

MySQL: traverses the B-tree starting from the root for every insert
Aurora: inserts avoid the full index traversal by using the cached cursor position
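For illustration, the statement shapes this optimization targets (table name and values are hypothetical) – batches whose rows arrive in primary-key order:

  -- Multi-value insert with rows pre-sorted by primary key.
  INSERT INTO events (id, payload) VALUES
    (1001, 'a'), (1002, 'b'), (1003, 'c'), (1004, 'd');

  -- Bulk paths such as LOAD DATA INFILE and INSERT INTO ... SELECT benefit in the same way.
  LOAD DATA INFILE '/tmp/events.csv'
  INTO TABLE events
  FIELDS TERMINATED BY ',';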

Page 29

Spatial Indexes in Aurora (Dec 2016)

R-Tree (used in MySQL 5.7) – challenges:
▪ Keeping it efficient while balanced
▪ Rectangles should not overlap or cover empty space
▪ Degenerates over time
▪ Re-indexing is expensive

Z-index (dimensionally ordered space-filling curve, used in Aurora):
▪ Uses a regular B-Tree for storing and indexing
▪ Removes sensitivity to resolution parameter
▪ Adapts to granularity of actual data without user declaration
▪ E.g., GeoWave (National Geospatial-Intelligence Agency)
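A minimal sketch of using a spatial index in Aurora (table, names, and coordinates are hypothetical; the ST_Equals lookup mirrors the benchmark on the next slide):

  CREATE TABLE places (
    id   INT PRIMARY KEY,
    name VARCHAR(100),
    loc  GEOMETRY NOT NULL,
    SPATIAL INDEX (loc)      -- backed by the Z-index described above
  );

  INSERT INTO places VALUES
    (1, 'HQ',  ST_GeomFromText('POINT(47.6062 -122.3321)')),
    (2, 'Lab', ST_GeomFromText('POINT(37.7749 -122.4194)'));

  SELECT id, name
    FROM places
   WHERE ST_Equals(loc, ST_GeomFromText('POINT(47.6062 -122.3321)'));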

Page 30

Spatial Index Benchmarks – Sysbench, points and polygons

[Chart: select-only (reads/sec) and write-only (writes/sec) throughput, Aurora vs. MySQL 5.7]

* r3.8xlarge using Sysbench on <1GB dataset
* Write-only: 4000 clients, select-only: 2000 clients, ST_EQUALS

Page 31

Cached read performance (Dec 2016)

Catalog concurrency: improved data dictionary synchronization and cache eviction.

NUMA-aware scheduler: the Aurora scheduler is now NUMA aware; helps scale on multi-socket instances.

Read views: Aurora now uses a latch-free concurrent read-view algorithm to construct read views.

25% throughput gain

[Chart: thousands of read requests/sec for MySQL 5.6, MySQL 5.7, Aurora 2015, Aurora 2016]

* r3.8xlarge instance, <1GB dataset using Sysbench

Page 32

Non-cached read performance (Dec 2016)

Smart scheduler: the Aurora scheduler now dynamically assigns threads between IO-heavy and CPU-heavy workloads.

Smart selector: Aurora reduces read latency by selecting the copy of data on the storage node with the best performance.

Logical read ahead (LRA): we avoid read IO waits by prefetching pages based on their order in the btree.

10% throughput gain

[Chart: thousands of requests/sec for MySQL 5.6, MySQL 5.7, Aurora 2015, Aurora 2016]

* r3.8xlarge instance, 1TB dataset using Sysbench

Page 33

High Performance Auditing (Jan 2017)

MariaDB server_audit plugin: each event (connect, query, DDL, DML, DCL) is formatted into an event string and written to the audit file through a single serialized path.

Aurora native audit support: events are formatted into event strings in parallel, pushed onto a latch-free queue, and drained to the audit file by multiple writers. We can sustain over 500K events/sec.

Sysbench select-only workload on an 8xlarge instance:
            MySQL 5.7   Aurora   Improvement
Audit Off        95K      615K   6.47x
Audit On         33K      525K   15.9x
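Advanced auditing is switched on through DB cluster parameters rather than SQL (the server_audit_logging and server_audit_events parameter names are as documented for Aurora's advanced auditing; treat the exact names as an assumption here). From a client session the effective settings typically surface as global variables:

  -- Inspect the audit-related settings applied through the cluster parameter group.
  SHOW GLOBAL VARIABLES LIKE 'server_audit%';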

Page 34

Fast DDL: Aurora vs. MySQL (Mar 2017)

MySQL:
▪ Full table copy; rebuilds all indexes – can take hours or days to complete.
▪ Needs temporary space for DML operations
▪ DDL operation impacts DML throughput

Amazon Aurora:
▪ Add an entry to the metadata table and use schema versioning to decode the page.
▪ Modify-on-write primitive upgrades the block to the latest schema when it is modified.

[Diagram: B-tree with index root and leaf pages; metadata table of (table name, operation, column name, timestamp), e.g. Table 1 / add-col / column-abc / t1, Table 2 / add-col / column-qpr / t2, Table 3 / add-col / column-xyz / t3]

Currently supports a NULLable column added at the end; other add column, drop/reorder, and data type modifications are planned.
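As an illustration, the operation that currently takes the fast path is adding a nullable column, without a default, at the end of a table (table and column names are hypothetical):

  -- Near-instant in Aurora regardless of table size; MySQL 5.6/5.7 rebuild the table.
  ALTER TABLE orders ADD COLUMN notes VARCHAR(255) NULL;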

Page 35

Fast DDL performance

On r3.large:
              Aurora       MySQL 5.6    MySQL 5.7
10GB table    0.27 sec      3,960 sec    1,600 sec
50GB table    0.25 sec     23,400 sec    5,040 sec
100GB table   0.26 sec     53,460 sec    9,720 sec

On r3.8xlarge:
              Aurora       MySQL 5.6    MySQL 5.7
10GB table    0.06 sec        900 sec    1,080 sec
50GB table    0.08 sec      4,680 sec    5,040 sec
100GB table   0.15 sec     14,400 sec    9,720 sec

Page 36

Fast DDL Roadmap

Q1 2017:
✓ Add nullable column at the end of the table, without default values

Q2 2017:
• Other ADD COLUMN operations
  • Nullable and not null columns
  • With and without default value
  • At the end, or in any position

2H 2017+:
• Mark columns nullable
• Reorder columns
• Increase the size of the column in the same type family, such as int to bigint
• Drop column
• Mark columns not-nullable

Page 37

Designing for Availability

Page 38

6-way replicated storage

Six copies across three Availability Zones
4 out of 6 write quorum; 3 out of 6 read quorum
Peer-to-peer replication for repairs
Volume striped across hundreds of storage nodes

[Diagram: a database instance (SQL, transactions, caching) over six storage copies spread across AZ 1–3 – read and write availability in one failure scenario, read availability in a larger one]

Survives catastrophic failures
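A quick check of the quorum arithmetic behind these numbers (this derivation is implied rather than spelled out on the slide):

  V = 6, \quad V_w = 4, \quad V_r = 3
  V_r + V_w = 3 + 4 = 7 > 6  \Rightarrow  every read quorum overlaps every write quorum
  2V_w = 8 > 6               \Rightarrow  any two write quorums intersect
  6 - 2 = 4 \ge V_w          \Rightarrow  writes still succeed after losing a full AZ (2 copies)
  6 - 3 = 3 \ge V_r          \Rightarrow  reads still succeed after an AZ plus one more node (AZ+1)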

Page 39

Storage Durability

Storage volume automatically grows up to 64 TB

Quorum system for read/write; latency tolerant

Peer to peer gossip replication to fill in holes

Continuous backup to S3 (built for 11 9s durability)

Continuous monitoring of nodes and disks for repair

10GB segments as unit of repair or hotspot rebalance

Quorum membership changes do not stall writes


Page 40

Continuous Backup and Point-in-Time Restore

[Diagram: per-segment snapshot points and log record streams over time for segments 1–3, with a recovery point]

• Take periodic snapshot of each segment in parallel; stream the redo logs to Amazon S3

• Backup happens continuously without performance or availability impact

• At restore, retrieve the appropriate segment snapshots and log streams to storage nodes

• Apply log streams to segment snapshots in parallel and asynchronously

Page 41

Instant Crash Recovery

Traditional databases
▪ Have to replay logs since the last checkpoint
▪ Typically 5 minutes between checkpoints
▪ Single-threaded in MySQL; requires a large number of disk accesses
[Diagram: checkpointed data plus redo log – a crash at T0 requires re-application of the redo log since the last checkpoint]

Amazon Aurora
▪ Underlying storage replays redo records on demand as part of a disk read
▪ Parallel, distributed, asynchronous
▪ No replay for startup
[Diagram: a crash at T0 results in redo logs being applied to each segment on demand, in parallel, asynchronously]

Page 42

Survivable Caches

▪ We moved the cache out of the database process
▪ Cache remains warm in the event of a database restart
▪ Lets you resume fully loaded operations much faster
▪ Instant crash recovery + survivable cache = quick and easy recovery from DB failures

[Diagram: SQL, transactions, and caching layers – the caching process is outside the DB process and remains warm across a database restart]

Page 43

Up to 15 promotable read replicas

► Up to 15 promotable read replicas across multiple availability zones
► Redo-log-based replication leads to low replica lag – typically < 10 ms
► Reader endpoint with load balancing; customer-specifiable failover order

[Diagram: writer endpoint to the master, reader endpoint load-balancing across read replicas, all over a shared distributed storage volume]

Page 44

Faster Fail-Over

[Timeline: app running → DB failure → failure detection → DNS propagation → recovery → app running]

MySQL: 15-20 sec
Aurora with MariaDB driver: 3-20 sec

Page 45

Database fail-over time

[Histograms of fail-over time distribution, bucketed by duration]

0 - 5s – 30% of fail-overs
5 - 10s – 40% of fail-overs
10 - 20s – 25% of fail-overs
20 - 30s – 5% of fail-overs

Page 46

Recent and upcoming availability enhancements

Page 47

Availability is about more than HW failures

You also incur availability disruptions when you:

1. Patch your database software
2. Modify your database schema
3. Perform large-scale database reorganizations
4. Recover from DBA errors that require database restores

Page 48

Zero downtime patching (Dec 2016)

Before ZDP: user sessions terminate during patching while the old DB engine is replaced with the new DB engine on top of the storage service.

With ZDP: networking state and application state are parked while the old DB engine is swapped for the new one on top of the storage service, so user sessions remain active through patching.

Page 49

Zero downtime patching – current constraints

We have to go to our current patching model when we can’t park connections:

• Long running queries
• Open transactions
• Bin-log enabled
• Parameter changes pending
• Temporary tables open
• Locked tables
• SSL connections open
• Read replica instances

We are working on addressing the above.

Page 50

Database cloning (planned for May 2017)

Create a copy of a database without duplicate storage costs:
• Creation of a clone is nearly instantaneous – we don’t copy data
• Data copy happens only on write – when original and cloned volume data differ

Typical use cases:
• Clone a production DB to run tests
• Reorganize a database
• Save a point-in-time snapshot for analysis without impacting the production system

[Diagram: a production database and its clones serving production applications, dev/test applications, and benchmarks]

Page 51

How does it work?

Both the source database and the cloned database reference the same pages on the shared distributed storage system.

[Diagram: source database and cloned database both pointing at pages 1–4 on the shared distributed storage system]

Page 52

How does it work? (contd.)

As the databases diverge, new pages are added appropriately to each database while both continue to reference the pages they still have in common.

[Diagram: after writes, the source and cloned databases each reference their own versions of modified pages, while unmodified pages remain shared on the distributed storage system]

Page 53

What is database backtrack? (planned for June 2017)

Backtrack quickly brings the database to a point in time without requiring a restore from backups.

• Backtrack from an unintentional DML or DDL operation
• Backtrack is not destructive. You can backtrack multiple times to find the right point in time.

[Timeline: rewind from t2 back to t1, then later from t4 back to t3; the bypassed log ranges become invisible]

Page 54

How does it work?

[Diagram: per-segment snapshots and log records over time for the storage segments, with a rewind point]

▪ Take periodic snapshots within each segment in parallel and store them locally
▪ At rewind time, each segment picks the previous local snapshot and applies the log stream to it to produce the desired state of the DB

Page 55

How does it work? (contd.)

Log records within the log stream are made visible or invisible based on the branch within the LSN tree, to provide a consistent view of the DB.

▪ The first rewind, performed at t2 to bring the DB back to t1, makes the log records written between t1 and t2 invisible
▪ The second rewind, performed at t4 to bring the DB back to t3, makes the log records written between t3 and t4 invisible as well

[Timeline: LSN tree branching at t1 and t3, with the bypassed log ranges marked invisible]

Page 56

Removing Blockers

Page 57

My databases need to meet certifications

▪ Amazon Aurora gives each database instance IP firewall protection
▪ Aurora offers transparent encryption at rest and SSL protection for data in transit
▪ Amazon VPC lets you isolate and control network configuration and connect securely to your IT infrastructure
▪ AWS Identity and Access Management provides resource-level permission controls

Page 58

T2 RI discounts

Up to 34% with a 1-year RI
Up to 57% with a 3-year RI

Run Aurora for $1/day (we just introduced db.t2.small)

                 vCPU   Mem (GiB)   Hourly Price
db.t2.small         1       2         $0.041
db.t2.medium        2       4         $0.082
db.r3.large         2      15.25      $0.29
db.r3.xlarge        4      30.5       $0.58
db.r3.2xlarge       8      61         $1.16
db.r3.4xlarge      16     122         $2.32
db.r3.8xlarge      32     244         $4.64

* Prices are for the Virginia region

Page 59

Near-term Roadmap (3-6 month plan)

Items span Q4 2016 through Q2/Q3 2017, grouped by category:

Performance: spatial indexing • R4 instance support • parallel data load from S3 • out-of-cache read improvements • asynchronous key-based pre-fetch

Availability: reader cluster endpoint and load balancing • zero downtime patching • cross-region snapshot copy • cross-region replication for encrypted databases • Fast DDL v1 • copy-on-write database cloning • backtrack • Fast DDL v2 • zero downtime restart

Security: HIPAA/BAA compliance • enhanced auditing • cross-account and cross-region encrypted snapshot sharing • IAM integration and Active Directory integration • encrypted RDS MySQL-Aurora replication

Ecosystem integration: Lambda integration with Aurora • load tables from S3 • load from S3 (manifest option) • load compressed files from S3 • log upload to S3 and CloudWatch • select into S3 • integration with AWS Performance Insights

Cost-effective: T2.medium instance type • T2.small instance type

MySQL-compatible: RDS MySQL to Aurora replication (in region) • MySQL 5.7 compatibility

Regions: US West (N. California) • EU (Frankfurt)

Page 60

Thank You!