
Conserving Energy in RAID Systems with Conventional Disks

Dong Li, Jun Wang
Dept. of Computer Science & Engineering, University of Nebraska-Lincoln

Peter Varman
Dept. of Electrical and Computer Engineering, Rice University


References

[1] S. Gurumurthi, A. Sivasubramaniam, M. Kandemir, and H. Franke, “DRPM: dynamic speed control for power management in server class disks,” ISCA’03.

[2] D. Colarelli and D. Grunwald, “Massive arrays of idle disks for storage archives,” in Proceedings of Supercomputing 2002.

[3] E. Pinheiro and R. Bianchini, “Energy conservation techniques for disk array-based servers,” in Proceedings of the 18th International Conference on Supercomputing, 2004.

[4] E. Varki, A. Merchant, J. Z. Xu, and X. Z. Qiu, “Issues and challenges in the performance analysis of real disk arrays,” IEEE Transactions on Parallel and Distributed Systems, 2004.

[5] D. Li and J. Wang, “EERAID: Energy-efficient redundant and inexpensive disk array,” in Proceedings of 11th ACM SIGOPS European Workshop, 2004.

[6] D. Li, H. Cai, X. Yao, and J. Wang, “Exploiting redundancy to construct energy-efficient, high-performance RAIDs,” Tech. Rep. TR-05-07-04, Computer Science and Engineering Department, University of Nebraska-Lincoln, 2005.


Outline

• Introduction
• Motivation
• eRAID Design
• Evaluation
• Leveraging eRAID
• Conclusions


Introduction

• Energy-efficient storage systems, total cost of ownership (TCO), …
• Short request inter-arrival time
• Long disk state switch time of conventional disks
• Current solutions: multi-speed disks [1]
• Create long idle periods for conventional disks
  • unbalance workloads
  • Two approaches
    • Relocating data: MAID [2], PDC [3]
    • Redirecting requests: EERAID [5]


Motivation

Major limitations of the state of the art
• few workable solutions for conventional disk based systems
• single performance measurement
• no differentiation of workload time criticality

Three observations
• redundant information in RAID systems
• spare service capacity
• queueing model


eRAID Design

Main idea
• spin down, partially or entirely, mirror disks to standby
• read, write

Features
• soft solution: no hardware change
• considers two performance metrics

Research issue
• maximize energy saving without violating predefined performance degradation limits for both throughput and response time
• assume workloads change little between two consecutive time windows


Solving for Performance Degradation

Our approach: use queueing models to make predictions

1. model RAID-1 system and get performance measures

2. examine how the input parameters are changed

3. get new performance measures with changed input parameters

4. compare these two results

Four workloads: synchronous read (SR), asynchronous read (AR), synchronous write (SW) and asynchronous write (AW)

Real system: HP SureStore E Disk Array FC60


Read Load Models


Read Load Performance Computing

The possible changes of input parameters:
• disk access probability
• disk service time: negligible

Synchronous read load:
• Mean Value Analysis (MVA) technique
• eRAID: double the access probabilities of the corresponding primary disks

Asynchronous read load:
• no throughput degradation for stable systems
• eRAID: double the workloads of the corresponding primary disks
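The synchronous read case can be illustrated with a small exact MVA computation. This is a minimal sketch, not the model used in the talk: it assumes a single-class closed network in which each disk is a FCFS queue with the same mean service time, and the function name `mva`, the process count, the think time, and the service time are all hypothetical values chosen for illustration. Spinning down a mirror disk is modeled by setting its access probability to zero and doubling that of its primary.

```python
from typing import List, Tuple


def mva(visit_probs: List[float], service_time: float,
        think_time: float, n_procs: int) -> Tuple[float, float]:
    """Exact single-class MVA for a closed network of FCFS disk queues.

    Returns (throughput, mean response time) with n_procs processes.
    """
    queue_len = [0.0] * len(visit_probs)
    throughput, response = 0.0, 0.0
    for n in range(1, n_procs + 1):
        # Arrival theorem: a request arriving at a disk sees the mean
        # queue length of the same network with n - 1 processes.
        residence = [p * service_time * (1.0 + q)
                     for p, q in zip(visit_probs, queue_len)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue_len = [throughput * r for r in residence]
    return throughput, response


# Hypothetical 8-disk RAID-1: disks 0-3 are primaries, disks 4-7 their mirrors.
base_probs = [1 / 8] * 8
# Mirrors 4 and 5 in standby: their primaries (0 and 1) take double the reads.
eraid_probs = [2 / 8, 2 / 8, 1 / 8, 1 / 8, 0.0, 0.0, 1 / 8, 1 / 8]

x_base, r_base = mva(base_probs, service_time=0.005, think_time=0.01, n_procs=16)
x_eraid, r_eraid = mva(eraid_probs, service_time=0.005, think_time=0.01, n_procs=16)

throughput_degradation = (x_base - x_eraid) / x_base
response_degradation = (r_eraid - r_base) / r_base
print(f"throughput degradation: {throughput_degradation:.1%}")
print(f"response time degradation: {response_degradation:.1%}")
```

The two printed ratios are the kind of quantities that would be checked against the predefined throughput and response time limits.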


Write Load Model


Controller cache: write-back policy

FC60: two-threshold write-back policy (destage_threshold, max_dirty)

Disk array: M/M/1/K queueing model [4]
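For reference, the standard steady-state measures of an M/M/1/K queue can be computed directly from its state probabilities. The sketch below uses textbook formulas only; the function name `mm1k`, the numeric rates, and the choice of K are hypothetical, and how K relates to the cache thresholds is not specified here.

```python
from typing import Dict


def mm1k(arrival_rate: float, service_rate: float, capacity: int) -> Dict[str, float]:
    """Steady-state measures of an M/M/1/K queue (capacity K includes the job in service)."""
    rho = arrival_rate / service_rate
    # Unnormalized state probabilities rho**n for n = 0..K (also valid when rho == 1).
    weights = [rho ** n for n in range(capacity + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]
    p_block = probs[-1]                          # arrivals that find the queue full
    throughput = arrival_rate * (1.0 - p_block)  # accepted (effective) arrival rate
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    response = mean_in_system / throughput       # Little's law on accepted jobs
    return {"p_block": p_block, "throughput": throughput,
            "mean_in_system": mean_in_system, "response": response}


# Hypothetical numbers: same dirty-data arrival rate, lower service rate
# after some mirror disks have been spun down.
before = mm1k(arrival_rate=80.0, service_rate=400.0, capacity=64)
after = mm1k(arrival_rate=80.0, service_rate=300.0, capacity=64)
print(f"response time: {before['response']:.5f} s -> {after['response']:.5f} s")
```

Comparing the measures before and after a change in service rate or queue length follows the same four-step procedure used for the read load.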


Write Load Performance Computing

Dirty data arrival rate $\lambda_d$
• SW load: $\lambda_d = \lambda \times \text{cache\_miss\_rate}$, where $\lambda$ is the max throughput with infinite cache size
• AW load: $\lambda_d = \lambda \times \text{cache\_miss\_rate}$, where $\lambda$ is independent of the system

The possible changes of input parameters:
• service rate: N/2 => (N-2i)/2
• maximum queue length
• cache miss rate: unnoticeable


Solving for Energy Saving

N-disk RAID1; time window length T; request number R; mean service time t

$\rho_1 = \frac{t_1 R_1}{N T}$, $\rho_2 = \frac{t_2 R_2}{N T}$

• Asyn. load: $\rho_2 = \rho_1$
• Sync. load: $\rho_2 < \rho_1$

$E_{base} = E_{active} + E_{idle}$ (N disks)

$E_{eRAID} = E_{active} + E_{idle} + E_{standby} + E_{switch}$ ((N-i) disks active or idle, i disks in standby)

$S_E = \frac{E_{base} - E_{eRAID}}{E_{base}}$
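The energy bookkeeping can be sketched numerically. This is an illustrative model under stated assumptions, not the exact expression from the slide: a powered disk draws active power while busy and idle power otherwise, a standby disk draws standby power for the whole window, each spun-down disk pays one fixed switching energy, and the busy fraction is taken as t·R/(N·T). The function names and all power and energy figures are hypothetical placeholders.

```python
def base_energy(n_disks: int, window: float, busy_frac: float,
                p_active: float, p_idle: float) -> float:
    """Energy (J) of N always-on disks over one time window."""
    busy_time = busy_frac * n_disks * window
    idle_time = n_disks * window - busy_time
    return p_active * busy_time + p_idle * idle_time


def eraid_energy(n_disks: int, n_standby: int, window: float, busy_frac: float,
                 p_active: float, p_idle: float, p_standby: float,
                 e_switch: float) -> float:
    """Energy (J) when n_standby mirror disks sit in standby for the window.

    Assumes the same total busy time is served by the remaining disks and
    ignores the duration of the spin-down/spin-up transitions.
    """
    busy_time = busy_frac * n_disks * window
    idle_time = (n_disks - n_standby) * window - busy_time
    return (p_active * busy_time + p_idle * idle_time
            + p_standby * n_standby * window + e_switch * n_standby)


# Placeholder values for illustration only (watts, joules, seconds).
params = dict(p_active=13.5, p_idle=10.2)
e_base = base_energy(8, 3600.0, 0.2, **params)
e_eraid = eraid_energy(8, 2, 3600.0, 0.2, p_standby=2.5, e_switch=148.0, **params)
saving = (e_base - e_eraid) / e_base
print(f"energy saving: {saving:.1%}")
```

With no disks in standby the two energies coincide, so the saving is zero; the saving grows with the number of standby disks as long as the window is long enough to amortize the switching energy.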


Control Algorithm

• Time-window
• Solve multi-constraint problem: select LFU disks
• Conservative control
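One way to picture the per-window decision is the sketch below. It is an assumption-laden illustration rather than the algorithm from the talk: `predict` stands in for the queueing-model predictions and is a hypothetical callback returning the predicted response time degradation, throughput degradation, and energy saving for a given number of standby mirrors; the disk names and access counts are made up.

```python
from typing import Callable, Dict, List, Tuple

Prediction = Tuple[float, float, float]  # (response degradation, throughput degradation, energy saving)


def pick_mirrors_to_spin_down(access_counts: Dict[str, int],
                              predict: Callable[[int], Prediction],
                              limit_t: float, limit_x: float) -> List[str]:
    """Choose which mirror disks to put in standby for the next time window.

    access_counts: accesses per mirror disk observed in the previous window.
    Returns the least frequently used mirrors, as many as the limits allow.
    """
    best_count, best_saving = 0, 0.0
    for count in range(1, len(access_counts) + 1):
        d_t, d_x, saving = predict(count)
        # Keep only candidates that respect both degradation limits.
        if d_t <= limit_t and d_x <= limit_x and saving > best_saving:
            best_count, best_saving = count, saving
    # Least frequently used first; ties broken by name for determinism.
    lfu_order = sorted(access_counts, key=lambda disk: (access_counts[disk], disk))
    return lfu_order[:best_count]


def toy_predict(count: int) -> Prediction:
    # Made-up linear predictor used only to exercise the selection logic.
    return 0.04 * count, 0.005 * count, 0.05 * count


chosen = pick_mirrors_to_spin_down({"m0": 50, "m1": 5, "m2": 20, "m3": 1},
                                   toy_predict, limit_t=0.10, limit_x=0.03)
print(chosen)
```

If no candidate satisfies both limits, the function returns an empty list and all disks stay on, which is one simple reading of conservative control.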


Evaluation

• Disk power model: IBM Ultrastar 36Z15
• Simulator: augmented Disksim
• Traces: Cello99 and TPC-C20
• 8-disk RAID1 system
• Two scenarios


Preliminary Results


CASE I: LimitT 10% & LimitX 3%

Load   Overall DT   Overall DX   Overall SE
AR     7.5%         0.0%         10.2%
SR     7.1%         1.5%         11.8%
AW     4.3%         0.0%         13.3%
SW     0.0%         0.0%         0.0%

CASE II: LimitT 50% & LimitX 6%

Load   Overall DT   Overall DX   Overall SE
AR     29.5%        0.0%         30.0%
SR     25.9%        4.6%         27.7%
AW     14.3%        0.0%         23.5%
SW     41.4%        1.6%         7.4%


Leveraging eRAID

Associate a load threshold f (1/2 < f < 1) with each disk
• when the primary disk load exceeds f, spin up a mirror disk to share the load
• conventional mirrored layout: spin up one mirror disk
• our new layout: spin up less than one mirror disk

Lay out the files of one primary disk across a set of mirror disks


An example: N=10 and f=2/3


Conclusions

• An energy saving policy, eRAID, for conventional disk based RAID-1 systems
• Up to 30% energy saving without violating predefined performance constraints
• A new data layout scheme for further energy saving
• Limitations
  • circumscribed by the accuracy of queueing models
  • approximated input parameters, e.g. process number and mean process delay
  • conservative control


Thank you!

Questions?


Creating Disk Idle Period in RAID-5: An Example

• 4-disk RAID 5 system
• A parity group containing data stripes 1, 2, 3 and parity stripe p, which are saved on disks 1, 2, 3 and 4 respectively

There is a read request for stripe 1. To service such a read, we could either read stripe 1 from disk 1, or read stripe 2, 3 and p, then calculate stripe 1 on the fly by an XOR calculation.

More details can be found in our technical report[6]
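The on-the-fly XOR reconstruction in this example can be shown in a few lines. This is a minimal sketch with made-up stripe contents; `xor_blocks` is a hypothetical helper, not part of any RAID implementation.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("need at least one block, all of the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Made-up contents of data stripes 1, 2, 3 on disks 1, 2, 3.
stripe1, stripe2, stripe3 = b"AAAA", b"BBBB", b"CCCC"
# Parity stripe p on disk 4 is the XOR of the three data stripes.
parity = xor_blocks(stripe1, stripe2, stripe3)

# Serve a read for stripe 1 without touching disk 1.
rebuilt = xor_blocks(stripe2, stripe3, parity)
assert rebuilt == stripe1
```

Because the read is served from disks 2, 3 and 4, disk 1 sees no request and its idle period is extended.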
