STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for...

46
1 STAIR Codes: A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee The Chinese University of Hong Kong Coding: From Practice to Theory, February 2015 This is a joint work with Dr. Mingqiang Li.

Transcript of STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for...

Page 1: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

1

STAIR Codes:

A General Family of Erasure Codes

for Tolerating Device and Sector Failures

in Practical Storage Systems

Patrick P. C. Lee

The Chinese University of Hong Kong

Coding: From Practice to Theory, February 2015

This is a joint work with Dr. Mingqiang Li.

Page 2: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Device and Sector Failures

Hierarchy of failures in disk arrays:

• Device failure: data loss in an entire device

• Sector failure (latent sector error): data loss in a sector

2

Annual sector failure rate

[Bairavasundaram et al.,

SIGMETRICS ’07]

Annual disk failure rate

[Pinheiro et al., FAST’07]

Burstiness of sector failures

[Schroeder et al., FAST ’10]

× %

(c) Sector failure bursts can

be long (> 5) (b) Sector failures can be more

frequent than disk failures

(a) Annual disk failure rate:

1~10%

Page 3: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Erasure Coding

(N,K) systematic MDS codes

• Encode K data symbols to create N-K parity symbols

• Distribute a stripe of the N symbols across disks

• Any K out of N symbols can recover original K data

symbols

Symbols are mapped to sectors

RAID is one specific implementation of erasure

coding

3

Page 4: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Mixed Failure Scenario

Consider a worst-case failure scenario with

• m=1 entirely failed device, and

• m′=2 partially failed devices with 1 and 3 sector failures

4

Question: How can we efficiently tolerate such a

mixed failure scenario via erasure coding?

Page 5: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

RAID

Overkill to use 2 parity devices to tolerate

m′=2 partially failed devices

• Device-level tolerance only 5

5 data devices

3 parity devices to tolerate

• m=1 entirely failed device

• m′=2 partially failed devices

Page 6: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Intra-Device Redundancy (IDR)

Still overkill to add parity sectors per data device

6

3 parity sectors per data device

to tolerate a sector failure burst

of length 3

m=1 parity

device

[Dholakia et al., TOS 2008]

Page 7: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Sector-Disk (SD) Codes

Simultaneously tolerate

• m entirely failed devices

• s failed sectors (per stripe) in partially failed devices

Construction currently limited to s ≤ 3

How to tolerate our mixed failure scenario?

• m=1 entirely failed device, and

• m′=2 partially failed devices with 1 and 3 sector failures

7

[Plank et al., FAST ’13, TOS’14]

s parity sectors

m parity devices

Page 8: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Sector-Disk (SD) Codes

Such an SD code is unavailable

8

s=4 global parity sectors to

tolerate any 4 sector failures

m=1 parity

device

[Plank et al., FAST ’13, TOS’14]

Page 9: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Our Work

Construct a general, space-efficient family of

erasure codes to tolerate both device and

sector failures

a) General: without any restriction on

• size of a storage array,

• number of tolerable device failures, or

• number of tolerable sector failures

b) Space-efficient:

• Use parity sectors to tolerate sector failures (like SD codes)

9

STAIR

Codes

[FAST’14]

[ACM Trans. Storage]

Page 10: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Failure Scenarios

RAID reconstruction performance preserved for

m disk failures

Fault tolerance for the worst-case m disk failures

and a “coverage” of sector failures

10

Page 11: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Key Properties of STAIR Codes

Sector failure coverage vector e

• Defines a pattern of how sector failures occur, rather than how

many sector failures would occur

Code structure based on two encoding phases

• Each phase builds on an MDS code

Two encoding methods: upstairs and downstairs encoding

• Maintain regularity of data placement

• Reuse computed parity results in encoding

• Provide complementary performance gains

11

Page 12: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Sector Failure Coverage Vector

e = (e0, e1, e2, …, em′-1 )

• Bounds # of partially failed devices m′

• Bounds # of sector failures per device el (0 ≤ l ≤ m′ -1)

• ∑ el = s

• Rationale: sector failures come in small bursts

Can define small m′ and reasonable size el for

bursts

12

Page 13: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Sector Failure Coverage Vector

Set e=(1, 3):

• At most 2 devices (aside entirely failed devices) have sector failures

• One device has at most 3 sector failures, and

• Another one has at most 1 sector failure

13

Page 14: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Examples of e

e = (1)

• PMDS and SD codes with s = 1

e = (r)

• (n, n – m – 1) codes (r = number of rows of a stripe)

e = (ɛ, ɛ, …, ɛ)

• Note: m’ = n - m

• Intra-Disk Redundancy code with ɛ parity symbols per

column

14

Page 15: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Parity Layout

15

e=(1, 3) global parity sectors

Q: How to generate e=(1, 3) global parity sectors and

m=1 parity device?

m=1 parity

device

A: Use two MDS codes Crow and Ccol

Page 16: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Two Encoding Phases

16

Phase 1

Phase 2

m=1 parity

device

e=(1, 3) global parity sectors

Crow: data parity devices +

intermediate parities

Ccol: intermediate parities

global parity sectors

Q: How to keep the global parity sectors inside a stripe?

Page 17: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Two Encoding Phases

17

m=1 parity

device

e=(1, 3) outside

global parity sectors

e=(1, 3) inside

global parity sectors

A: set outside global parity sectors as zeroes;

reconstruct inside global parity sectors

Phase 1

Phase 2

Page 18: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Augmented Rows

18

How do we compute inside parity sectors?

• Form a canonical stripe

Encode each column with Ccol to form augmented rows

• Generate virtual parities in augmented rows

Each augmented row is a codeword of Crow

Page 19: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Idea: Generate parities in upstairs direction

19

Can be generalized as upstairs decoding

for recovering failures

Page 20: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

20 Crow: (10,7) code Ccol: (7,4) code

Page 21: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

21 Crow: (10,7) code Ccol: (7,4) code

Page 22: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

22 Crow: (10,7) code Ccol: (7,4) code

Page 23: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

23 Crow: (10,7) code Ccol: (7,4) code

Page 24: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

24 Crow: (10,7) code Ccol: (7,4) code

Page 25: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

25 Crow: (10,7) code Ccol: (7,4) code

Page 26: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

26 Crow: (10,7) code Ccol: (7,4) code

Page 27: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

27 Crow: (10,7) code Ccol: (7,4) code

Page 28: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

28 Crow: (10,7) code Ccol: (7,4) code

Page 29: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

29 Crow: (10,7) code Ccol: (7,4) code

Page 30: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

30 Crow: (10,7) code Ccol: (7,4) code

Page 31: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

31 Crow: (10,7) code Ccol: (7,4) code

Page 32: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

32 Crow: (10,7) code Ccol: (7,4) code

Page 33: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Upstairs Encoding

Detailed steps:

33 Crow: (10,7) code Ccol: (7,4) code

Notes: parity computations reuse

previously computed parities

Page 34: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Downstairs Encoding

34

Cannot be generalized for decoding

Another idea: Generate parities in downstairs direction

Page 35: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Downstairs Encoding

35

Detailed steps:

Crow: (10,7) code Ccol: (7,4) code

Page 36: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Downstairs Encoding

36

Detailed steps:

Crow: (10,7) code Ccol: (7,4) code

Page 37: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Downstairs Encoding

37

Detailed steps:

Crow: (10,7) code Ccol: (7,4) code

Page 38: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Downstairs Encoding

38

Detailed steps:

Crow: (10,7) code Ccol: (7,4) code

Page 39: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Downstairs Encoding

39

Detailed steps:

Crow: (10,7) code Ccol: (7,4) code

Page 40: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Crow: (10,7) code Ccol: (7,4) code

Downstairs Encoding

40

Detailed steps:

Like upstairs encoding, parity computations reuse

previously computed parities

Page 41: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Choosing Encoding Methods

The two methods are

complementary

Intuition:

• Choose upstairs

encoding for large m′

• Choose downstairs

encoding for small m′

Analysis details in the

paper

41

e=(1, 3) with m′=2 m′=2

Upstairs

Downstairs

Page 42: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Storage Space Saving

STAIR codes save devices over RAID

• s = # of tolerable sector failures

• m′ = # of partially failed devices

• r = chunk size

42 As r increases, # of devices saved ~ m′

Page 43: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Encoding Speed

43

Encoding speed of STAIR codes is on order of 1000MB/s

STAIR codes improve encoding speed of SD codes by

~100%, due to parity reuse

Similar results for decoding

n = number of devices

r=16 (sectors per chunk)

Page 44: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Update Cost

44

n=16 (devices) and r=16 (sectors per chunk)

Higher update penalty, due to global parity sectors

Good for systems with rare updates (e.g., backup) or

many full-stripe writes (e.g., SSDs)

(Update penalty: average # of updated parity sectors for updating a data sector)

[Plank et al., FAST ’13, TOS’14]

Page 45: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Conclusions

STAIR codes: a general family of erasure codes

for tolerating a hierarchy of failures in a space-

efficient manner

Complementary upstairs encoding and

downstairs encoding

Open source STAIR Coding Library (in C):

• http://ansrlab.cse.cuhk.edu.hk/software/stair

45

Page 46: STAIR Codes: A General Family of Erasure Codes for ... · A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Patrick P. C. Lee

Thank you!

Contact:

• Patrick P. C. Lee

http://www.cse.cuhk.edu.hk/~pclee

46