S3Guard: What's in your consistency model?

Mingliang Liu @liuml07, Steve Loughran @steveloughran

December 2016

© Hortonworks Inc. 2011 – 2016. All Rights Reserved

Page 2: S3Guard: What's in your consistency model?

Steve Loughran, Hadoop committer & PMC, ASF member

Mingliang Liu, Apache Hadoop committer

Chris Nauroth, Hadoop committer & PMC, ASF member

Rajesh Balamohan, Tez committer & PMC

Page 3: S3Guard: What's in your consistency model?

S3A: Hadoop File System for S3 (EMR: use Amazon's s3://)

Page 4: S3Guard: What's in your consistency model?

Storage Use Evolution

Goal: Evolution towards cloud storage as the primary Data Lake

[Diagram: three arrangements of an Application with HDFS, labeled in turn: Input, Output, Backup, Restore; Input, Output, Copy; Input, Output, tmp]

Page 5: S3Guard: What's in your consistency model?

Hadoop File System - One Interface Fits All

org.apache.hadoop.fs.FileSystem

hdfs, s3a, wasb, adl, swift, gs

Page 6: S3Guard: What's in your consistency model?

A FileSystem: Directories, Files, Data

[Diagram: directory tree /work with subdirectories pending (part-00, part-01) and complete (part-01); files point to data blocks labeled 00 and 01]

rename("/work/pending/part-01", "/work/complete")
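
As a rough illustration of why rename is cheap in a real file system, the sketch below models directories as nested maps and moves an entry by re-linking it. The `Directory` class and the `resolve` and `rename` helpers are hypothetical names invented for this example; they are not Hadoop APIs.

```python
class Directory:
    """A directory maps child names to sub-directories or to file data."""
    def __init__(self):
        self.children = {}

def resolve(root, path):
    """Walk from the root to the directory named by an absolute path."""
    node = root
    for name in [part for part in path.split("/") if part]:
        node = node.children[name]
    return node

def rename(root, source, dest_dir):
    """Move one entry into dest_dir by re-linking it; no data is copied."""
    parent_path, _, name = source.rpartition("/")
    parent = resolve(root, parent_path)
    target = resolve(root, dest_dir)
    target.children[name] = parent.children.pop(name)

# Build the tree from the slide: /work/pending and /work/complete.
root = Directory()
work = root.children["work"] = Directory()
pending = work.children["pending"] = Directory()
complete = work.children["complete"] = Directory()
pending.children["part-00"] = bytearray(b"00" * 3)
pending.children["part-01"] = bytearray(b"01" * 3)
data_before = pending.children["part-01"]

rename(root, "/work/pending/part-01", "/work/complete")
```

After the call, `complete.children["part-01"]` is the same object as `data_before`: only the directory entries changed.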

Page 7: S3Guard: What's in your consistency model?

S3A: An Object Store Pretending to Be a FileSystem

Cloud object stores are designed for
– Scale
– Cost
– Geographic distribution
– Availability

Cloud apps are written specifically to deal with cloud storage semantics and limitations

Hadoop apps should work on cloud storage transparently
– S3A partially adheres to the FileSystem specification

Page 8: S3Guard: What's in your consistency model?

hash(name) -> blob

[Diagram: copies of data blocks 00 and 01 spread across servers s01, s02, s03, s04]

hash("/work/pending/part-00") -> ["s01", "s02", "s04"]

hash("/work/pending/part-01") -> ["s02", "s03", "s04"]
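
The placement idea on this slide can be sketched as below. This is not S3's actual algorithm, which the slides do not describe; it is a minimal illustration that ranks servers by a hash of the full object name, so that the name alone decides where the blob lives and no directory structure is involved. The function name `place` and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib

SERVERS = ["s01", "s02", "s03", "s04"]

def place(key, servers=SERVERS, replicas=3):
    """Pick `replicas` servers for an object from a hash of its full name."""
    def score(server):
        # Hash the server name together with the whole key; the "/" characters
        # in the key carry no structural meaning here.
        return hashlib.sha256(f"{server}:{key}".encode("utf-8")).hexdigest()
    return sorted(sorted(servers, key=score)[:replicas])

part_00 = place("/work/pending/part-00")
part_01 = place("/work/pending/part-01")
```

Two names under the same "directory" may land on different server sets, which is why operations on a whole directory have no single place to run.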

Page 9: S3Guard: What's in your consistency model?

What Is The Problem?

Performance
– separated from compute
– cloud storage not designed for file-like access patterns

Limitations in APIs
– delete(path, recursive=true)
– rename(source, dest)

Eventual consistency
– Create Consistency
– Update
– Delete
– Listing
  • take time to list created objects
  • lag in changed metadata about existing objects
  • lag in observing deleted objects

Page 10: S3Guard: What's in your consistency model?

rename(): A Series of Operations on The Client

[Diagram: copies of data blocks 00 and 01 spread across servers s01, s02, s03, s04]

hash("/work/pending/part-00") -> ["s01", "s02", "s04"]

hash("/work/pending/part-01") -> ["s02", "s03", "s04"]

copy("/work/pending/part-01", "/work/complete/part-01")

delete("/work/pending/part-01")
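
A minimal sketch of what a client-side rename amounts to on a flat key/value store: list the keys under the source, copy each one, then delete each one. The `ObjectStore` class and its request log are hypothetical stand-ins for illustration, not the S3A implementation.

```python
class ObjectStore:
    """In-memory stand-in for an object store: a flat map from key to data."""
    def __init__(self):
        self.objects = {}
        self.requests = []  # log of requests issued, to show the cost

    def put(self, key, data):
        self.requests.append(("PUT", key))
        self.objects[key] = data

    def copy(self, source, dest):
        self.requests.append(("COPY", source, dest))
        self.objects[dest] = self.objects[source]

    def delete(self, key):
        self.requests.append(("DELETE", key))
        self.objects.pop(key, None)

    def list(self, prefix):
        self.requests.append(("LIST", prefix))
        return sorted(k for k in self.objects if k.startswith(prefix))

def rename(store, source, dest):
    """Emulate rename of a file or 'directory' as copy + delete per object."""
    if source in store.objects:
        keys = [source]
    else:
        keys = store.list(source.rstrip("/") + "/")
    for key in keys:
        store.copy(key, dest + key[len(source):])
    for key in keys:
        store.delete(key)
    return len(keys)

store = ObjectStore()
store.put("/work/pending/part-00", b"00")
store.put("/work/pending/part-01", b"01")
store.requests.clear()
moved = rename(store, "/work/pending", "/work/complete")
```

Here two objects cost five requests (one list, two copies, two deletes), and a failure part-way through leaves data under both names: the operation is neither atomic nor constant-time.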

Page 11: S3Guard: What's in your consistency model?

Eventual Consistency From FileSystem’s View

When listing "a directory"
– Newly created files may not yet be visible, deleted ones still present

After updating an object
– Opening and reading the object may still return the previous data

After deleting an object
– Opening the object may succeed, returning the data

While reading an object
– If object is updated or deleted during the process

Page 12: S3Guard: What's in your consistency model?

Eventually Consistent – Seeing Deleted Data

[Diagram: copies of data blocks 00 and 01 spread across servers s01, s02, s03, s04]

DELETE /work/pending/part-00 -> 200

HEAD /work/pending/part-00 -> 200

GET /work/pending/part-00 -> 200
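
The effect on this slide can be reproduced with a toy model, assuming a store that applies each change to one replica immediately and to the others later. The class and method names below are hypothetical, and the explicit `propagate()` call stands in for whatever background replication a real store performs.

```python
class EventuallyConsistentStore:
    """Toy store: one replica is updated at once, the rest catch up later."""
    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # (replica, key, data) changes not yet applied

    def _apply(self, replica, key, data):
        if data is None:
            replica.pop(key, None)   # None marks a delete
        else:
            replica[key] = data

    def _write(self, key, data):
        self._apply(self.replicas[0], key, data)
        for replica in self.replicas[1:]:
            self.pending.append((replica, key, data))

    def put(self, key, data):
        self._write(key, data)

    def delete(self, key):
        self._write(key, None)

    def get(self, key, replica_index):
        """Read from one replica; returns (status, data)."""
        data = self.replicas[replica_index].get(key)
        return (200, data) if data is not None else (404, None)

    def propagate(self):
        """Apply all queued changes, as replication eventually would."""
        for replica, key, data in self.pending:
            self._apply(replica, key, data)
        self.pending.clear()

store = EventuallyConsistentStore()
store.put("/work/pending/part-00", b"00")
store.propagate()
store.delete("/work/pending/part-00")
stale = store.get("/work/pending/part-00", replica_index=2)
fresh = store.get("/work/pending/part-00", replica_index=0)
store.propagate()
settled = store.get("/work/pending/part-00", replica_index=2)
```

Until replication catches up, a read served by a lagging replica still returns the deleted data with a 200 status.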

Page 13: S3Guard: What's in your consistency model?

S3Guard: Fast, Consistent S3 Metadata (EMR: use Amazon's EMRFS)

Page 14: S3Guard: What's in your consistency model?

S3Guard: Fast, Consistent S3 Metadata

Inspired by the Apache-licensed S3mper project from Netflix

Using DynamoDB as the consistent metadata store

Mutating file system operations
– Update both S3 and DynamoDB

Read operations
– Return results to callers as sourced from S3
– First check their results against the metadata in DynamoDB
– S3A waits and rechecks both S3 and DynamoDB until they agree

Goals
– Provide consistent list and get status operations on S3 objects written with S3Guard enabled
  • listStatus() after put and delete
  • getFileStatus() after put and delete
– Provide tools to manage associated metadata and caching policies
– Configurable error handling when inconsistency is detected
– Performance improvements that impact real workloads
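
The read and write paths listed on this slide can be sketched roughly as follows. This is an illustration of the bullets only, not the S3Guard code: `LaggingStore` is a hypothetical stand-in for S3 whose listings miss a new key for a fixed number of calls, a plain Python set stands in for the DynamoDB table, and the retry loop omits the waiting a real client would do between attempts.

```python
class LaggingStore:
    """Stand-in for S3: a new key is missing from the next `lag` listings."""
    def __init__(self, lag=2):
        self.objects = {}
        self.visible_after = {}
        self.lag = lag
        self.list_calls = 0

    def put(self, key, data):
        self.objects[key] = data
        self.visible_after[key] = self.list_calls + self.lag

    def list(self, prefix):
        self.list_calls += 1
        return {k for k in self.objects
                if k.startswith(prefix)
                and self.list_calls > self.visible_after[k]}

class GuardedStore:
    """Writes update both stores; listings are checked against the metadata."""
    def __init__(self, store, max_attempts=5):
        self.store = store
        self.metadata = set()  # stand-in for the consistent metadata table
        self.max_attempts = max_attempts

    def put(self, key, data):
        self.store.put(key, data)
        self.metadata.add(key)

    def list(self, prefix):
        expected = {k for k in self.metadata if k.startswith(prefix)}
        for attempt in range(1, self.max_attempts + 1):
            listing = self.store.list(prefix)
            if listing == expected:
                return sorted(listing), attempt
        raise RuntimeError("listing still inconsistent with metadata store")

s3 = LaggingStore(lag=2)
guarded = GuardedStore(s3)
guarded.put("/work/pending/part-00", b"00")
raw = s3.list("/work/")                     # unguarded listing misses the file
names, attempts = guarded.list("/work/")    # guarded listing rechecks
```

The unguarded listing returns nothing, while the guarded one rechecks until the store's listing agrees with the metadata, and raises once `max_attempts` is exhausted, which is one possible form of the configurable error handling named in the goals.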

Page 15: S3Guard: What's in your consistency model?

DynamoDB As The Consistent Metadata Store

[Diagram: copies of data blocks 00 and 01 spread across servers s01, s02, s03, s04, alongside a metadata store]

DELETE part-00 -> 200

HEAD part-00 -> 200

HEAD part-00 -> 404

PUT part-00 -> 200

Page 16: S3Guard: What's in your consistency model?

Demo

Page 17: S3Guard: What's in your consistency model?

https://issues.apache.org/jira/browse/HADOOP-13345

Page 18: S3Guard: What's in your consistency model?

Questions?

Page 19: S3Guard: What's in your consistency model?

Backup Slides

Page 20: S3Guard: What's in your consistency model?

REST APIs

[Diagram: copies of data blocks 00 and 01 spread across servers s01, s02, s03, s04]

HEAD /work/complete/part-01

PUT /work/complete/part-01
x-amz-copy-source: /work/pending/part-01

DELETE /work/pending/part-01

PUT /work/pending/part-01
... DATA ...

GET /work/pending/part-01
Range: bytes=1-8192

GET /?prefix=/work&delimiter=/
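
The last request above is how a directory listing is emulated on a flat key space. The sketch below mimics the prefix and delimiter grouping over an in-memory list of keys; `list_objects` is a hypothetical helper written for this example, and the keys omit the leading slash, an assumption made here so the prefix can end with the delimiter.

```python
def list_objects(keys, prefix="", delimiter="/"):
    """Group keys under a prefix into direct objects and common prefixes."""
    contents, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter) if delimiter else -1
        if cut >= 0:
            # Everything up to the first delimiter after the prefix is rolled
            # up into one entry, which plays the role of a sub-directory.
            common_prefixes.add(prefix + rest[:cut + len(delimiter)])
        else:
            contents.append(key)
    return contents, sorted(common_prefixes)

keys = [
    "work/pending/part-00",
    "work/pending/part-01",
    "work/complete/part-01",
    "work/_SUCCESS",
]
files, subdirs = list_objects(keys, prefix="work/", delimiter="/")
pending_files, pending_subdirs = list_objects(keys, prefix="work/pending/")
```

The "directories" exist only as shared key prefixes computed at listing time; nothing is stored for them.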