ECEN 5623 RT Embedded...
Scalable Enterprise File systems
Three Types of Media Storage
– Direct Attached Storage – e.g. SATA (Serial ATA)
– Network Attached Storage – e.g. NFS
– Storage Area Networks – e.g. SAS (Serial Attached SCSI), Fibre Channel
Flash / RAM based SSD Still 10x++ More Costly than
Spinning Media
– Predictions for Demise of HDDs and RAID?
– Cost is the Driver (E.g. < $0.01 / GB tape, < $0.10 / GB HDD,
$1.00 / GB SSD)
Fast Storage is Either SSD, RAID or Hybrid
Sam Siewert 2
Multiple Disk Drives
Disk Drives Fail – Like a Light-bulb
– MTBF of 100’s of Thousands of Hours [3 to 5 Years at Duty
Cycle]
– Difficult to Determine When Failure Might Occur
– The Larger the Population, the More Often Failures will be Seen
Disk Drives Have Low Random-Access Rates [100 to 200 I/Os per Second]
Idea – Write to them in Parallel and Mirror Data to
Protect Against HDD Failures (Erasures)
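The population effect can be made concrete with a rough worked example. This is a minimal sketch that assumes a constant failure rate (an exponential model, which the slide does not state); the function name and the 500,000-hour MTBF are illustrative choices, not values from the course material.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours of continuous operation


def expected_failures_per_year(num_drives, mtbf_hours):
    """Expected drive failures per year, assuming a constant failure rate."""
    return num_drives * HOURS_PER_YEAR / mtbf_hours


if __name__ == "__main__":
    # Assumed MTBF of 500,000 hours, for illustration only.
    for n in (1, 100, 1000):
        rate = expected_failures_per_year(n, 500_000)
        print(f"{n:5d} drives: {rate:.2f} expected failures/year")
```

Under these assumptions a single drive rarely fails in a given year, while a population of a thousand drives sees failures routinely, which is the motivation for mirroring.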
RAID-10
A1 A1 A2 A2 A3 A3
A4 A4 A5 A5 A6 A6
A7 A7 A8 A8 A9 A9
A10 A10 A11 A11 A12 A12
RAID-1 Mirror RAID-1 Mirror RAID-1 Mirror
RAID-0 Striping Over RAID-1 Mirrors
A1,A2,A3, … A12
RAID Operates on LBAs/Sectors (Sometimes Files)
SAN/DAS – RAID
NAS – Filesystem on top of RAID
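The RAID-10 layout in the diagram (blocks A1 to A12 striped round-robin over three mirror pairs) can be sketched as a simple address mapping. This is a minimal illustration; the function name `raid10_locate` and the 0-based indexing are assumptions made for the example, not part of any RAID implementation.

```python
def raid10_locate(block_index, num_mirrors=3):
    """Map a 0-based logical block index to (mirror_set, row).

    Every block is stored on both drives of the mirror set it maps to,
    so one logical write becomes two physical writes to that set.
    """
    mirror_set = block_index % num_mirrors   # RAID-0 striping across sets
    row = block_index // num_mirrors         # position within each drive
    return mirror_set, row


if __name__ == "__main__":
    for i in range(12):
        mirror_set, row = raid10_locate(i)
        print(f"A{i + 1}: mirror set {mirror_set}, row {row}")
```

With three mirror sets, A1 to A3 fill the first row and A4 starts the second row on the first mirror set, matching the diagram.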
RAID-10, RAID-50, RAID-60
– RAID-10: Stripe Over Mirror Sets
– RAID-50: Stripe Over RAID-5 XOR Parity Sets
– RAID-60: Stripe Over RAID-6 Reed-Solomon or Double-Parity Encoded Sets
EVENODD
Row-Diagonal Parity
Minimum Density Codes (Liberation)
Reed-Solomon Codes – Generalized Erasure Codes
Cauchy Reed-Solomon, LDPC (Low-Density Parity-Check) Codes, WEAVER/HoVer
MDS (Maximum Distance Separable) – For each Parity Device, Another Level of Fault Tolerance is Provided
– Larger Drives (Multi-terabyte), Larger Arrays (100’s of drives), and Cost Reduction are Driving RAID6 and Higher Levels
RAID5,6 XOR Parity Encoding
MDS Encoding, Can Achieve High Storage Efficiency
with N+1: N/(N+1) and N+2: N/(N+2)
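The two efficiency formulas above are instances of R = n/(n+m) for n data devices and m coding devices. The sketch below tabulates them for a few set sizes; the function name `storage_efficiency` is a hypothetical helper chosen for this example.

```python
def storage_efficiency(n_data, m_coding):
    """Fraction of raw capacity holding user data: n / (n + m)."""
    return n_data / (n_data + m_coding)


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16, 20):
        raid5 = storage_efficiency(n, 1)  # N+1: single XOR parity device
        raid6 = storage_efficiency(n, 2)  # N+2: P and Q coding devices
        print(f"{n:2d} data devices: RAID5 {raid5:.1%}, RAID6 {raid6:.1%}")
```

Efficiency rises toward 100% as the number of data devices grows, with RAID6 always below RAID5 for the same number of data devices.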
[Figure: Storage Efficiency (0% to 100%) vs. Number of Data Devices (1 to 20) for 1 XOR or 2 P,Q Encoded Devices; curves for RAID5 and RAID6]
RAID-50
RAID-5 Set 1
A1 B1 C1 D1 P(ABCD)
E1 F1 G1 H1 P(EFGH)
I1 J1 P(IJKL) K1 L1
M1 P(MNOP) N1 P1 O1
P(QRST) Q1 R1 S1 T1
RAID-5 Set 2
A2 B2 C2 D2 P(ABCD)
E2 F2 G2 H2 P(EFGH)
I2 J2 P(IJKL) K2 L2
M2 P(MNOP) N2 P2 O2
P(QRST) Q2 R2 S2 T2
RAID-0 Striping Over RAID-5 Sets
A1,B1,C1,D1,A2,B2,C2,D2,E1,F1,G1,H1,…,Q2,R2,S2,T2
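The RAID-0 ordering over the two RAID-5 sets (four data strips to set 1, the next four to set 2, then back to set 1) can be reproduced with a small mapping. This is a sketch under the assumption of two sets with four data strips per stripe, as in the diagram; `strip_label` is a hypothetical helper, and parity placement is deliberately left out.

```python
DATA_STRIPS_PER_STRIPE = 4  # e.g. A, B, C, D plus one parity strip
NUM_SETS = 2                # two RAID-5 sets under the RAID-0 layer


def strip_label(k):
    """Return the diagram label (e.g. 'A1', 'E2') of the k-th logical strip."""
    group = k // DATA_STRIPS_PER_STRIPE   # which full stripe, in write order
    raid5_set = group % NUM_SETS          # alternate between the sets
    row = group // NUM_SETS               # stripe row within the chosen set
    col = k % DATA_STRIPS_PER_STRIPE      # data strip within the stripe
    letter = chr(ord("A") + row * DATA_STRIPS_PER_STRIPE + col)
    return f"{letter}{raid5_set + 1}"


if __name__ == "__main__":
    print(",".join(strip_label(k) for k in range(40)))
```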
RAID-60 (Reed-Solomon Encoding)
RAID-6 Set 1
Disk1 Disk2 Disk3 Disk4 Disk5 Disk6
A1 B1 C1 D1 P(ABCD) Q(ABCD)
E1 F1 G1 P(EFGH) Q(EFGH) H1
I1 J1 P(IJKL) Q(IJKL) K1 L1
M1 P(MNOP) Q(MNOP) N1 O1 P1
P(QRST) Q(QRST) Q1 R1 S1 T1
RAID-6 Set 2
Disk1 Disk2 Disk3 Disk4 Disk5 Disk6
A2 B2 C2 D2 P(ABCD) Q(ABCD)
E2 F2 G2 P(EFGH) Q(EFGH) H2
I2 J2 P(IJKL) Q(IJKL) K2 L2
M2 P(MNOP) Q(MNOP) N2 O2 P2
P(QRST) Q(QRST) Q2 R2 S2 T2
RAID-0 Striping Over RAID-6 Sets
A1,B1,C1,D1,A2,B2,C2,D2,E1,F1,G1,H1,…,Q2,R2,S2,T2
Comparison of Erasure Codes (ECs)
Data Devices = n
Coding Devices = m
Total = m+n
Storage Efficiency: R=n/(n+m)
– RAID1 2-Way, R=1/(1+1)=50%, MDS=1, Reads 2x Speed-up, 1x Write
– RAID1 3-Way, R=1/(1+2)=33%, MDS=2, 3x Read, 1x Write
– RAID10 with 10 sets, R=10/(10+10)=50%, MDS=1, 20x Read, 10x Write
– RAID5 with 3+1 set, R=3/(3+1)=75%, MDS=1, 3x Read (Parity Check?), RMW Penalty, Striding Issues
– RAID6 with 5+2 set, R=5/(5+2)=71%, MDS=2, 5x Read, Reed-Solomon Encode on Write and RMW Penalty
– Beyond RAID6?
Cauchy Reed-Solomon Scales, but Encode, Decode Complexity High
Low-Density Parity-Check Codes, Simpler, but not MDS
Read-Modify-Write Penalty
Any Update that is Less than the Full RAID5 or RAID6 Set Requires:
1. Read Old Data and Parity – 2 Reads (3 for RAID6: Data, P and Q)
2. Compute New Parity (From Old & New Data)
3. Write New Parity and New Data – 2 Writes (3 for RAID6)
Only Way to Remove Penalty is a Write-Back Cache to Coalesce Updates and Perform Full-Set Writes Always
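The three read-modify-write steps can be sketched for single XOR parity (RAID5). This is a minimal in-memory illustration, not a driver: strips are Python `bytes` objects rather than disk sectors, and the helper names `xor_bytes`, `full_parity`, and `rmw_new_parity` are assumptions made for the example.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length strips."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(strips):
    """Parity for a full-stripe write: XOR of every data strip."""
    parity = bytes(len(strips[0]))  # all-zero strip to start
    for strip in strips:
        parity = xor_bytes(parity, strip)
    return parity


def rmw_new_parity(old_data, old_parity, new_data):
    """Parity after a partial update, using only old data and old parity."""
    # XOR with old_data removes its contribution; XOR with new_data adds the new one.
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00", b"\x0f\xf0"]
    parity = full_parity(stripe)            # state on disk before the update
    new_a = b"\xaa\x55"
    updated = rmw_new_parity(stripe[0], parity, new_a)
    # The shortcut matches recomputing parity over the whole updated stripe.
    print(updated == full_parity([new_a] + stripe[1:]))
```

The shortcut touches only the updated strip and the parity strip, which is why the penalty is two reads and two writes rather than a read of the whole set.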
RAID-5 Set
A1 B1 C1 D1 P(ABCD)
E1 F1 G1 H1 P(EFGH)
I1 J1 P(IJKL) K1 L1
M1 P(MNOP) N1 P1 O1
P(QRST) Q1 R1 S1 T1
Write A1: P(ABCD)new = A1new xor A1 xor P(ABCD)
A1 B1 C1 D1 P(ABCD)
0 0 0 0 0
0 0 0 1 1
0 0 1 0 1
0 0 1 1 0
0 1 0 0 1
0 1 0 1 0
0 1 1 0 0
…
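The truth table is the bitwise XOR of the four data strips, and the same operation recovers a single lost strip (an erasure). The sketch below reproduces a few table rows and then rebuilds a missing strip; `xor_strips` and `rebuild_missing` are hypothetical helper names, and the byte strings are arbitrary sample data.

```python
from functools import reduce


def xor_strips(strips):
    """XOR equal-length byte strips together, position by position."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*strips))


def rebuild_missing(surviving_strips, parity):
    """Recover the one missing data strip from the survivors and the parity."""
    return xor_strips(list(surviving_strips) + [parity])


if __name__ == "__main__":
    # One bit per strip, as in the truth table: A1 B1 C1 D1 P(ABCD).
    for bits in [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 0, 1, 1)]:
        parity_bit = xor_strips([bytes([b]) for b in bits])[0]
        print(*bits, parity_bit)

    a, b, c, d = b"RAID", b"5set", b"test", b"data"
    p = xor_strips([a, b, c, d])
    # Simulate losing strip C and rebuilding it from A, B, D and P.
    print(rebuild_missing([a, b, d], p))
```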
Hands-On Coding Exercise(s)
Examples-RAID-Unit-Test, stripetest.c
[Figure: Strips A, B, C, D and their parity strip XOR[A,B,C,D]]
[siewerts@localhost Examples-RAID-Unit-Test]$ ./stripetest Baby-Musk-Ox.ppm Baby-Musk-Ox.ppm.replicated
read full stripe
…
hit end of file
FINISHED
[siewerts@localhost Examples-RAID-Unit-Test]$
[siewerts@localhost Examples-RAID-Unit-Test]$ diff Baby-Musk-Ox.ppm Baby-Musk-Ox.ppm.replicated