
Cluster Computing, DOI 10.1007/s10586-008-0054-y


Providing platform heterogeneity-awareness for data center power management

Ripal Nathuji · Canturk Isci · Eugene Gorbatov · Karsten Schwan

Received: 29 February 2008 / Accepted: 20 March 2008
© Springer Science+Business Media, LLC 2008

Abstract Power management is becoming an increasingly critical component of modern enterprise computing environments. The traditional drive for higher performance has influenced trends towards consolidation and higher densities, artifacts enabled by virtualization and new small form factor server blades. The resulting effect has been increased power and cooling requirements in data centers, which elevate ownership costs and put more pressure on rack and enclosure densities. To address these issues, we exploit a fundamental characteristic of data centers: "platform heterogeneity". This heterogeneity stems from the architectural and management-capability variations of the underlying platforms. We define an intelligent heterogeneity-aware load management (HALM) system that leverages heterogeneity characteristics to provide two data center level benefits: (i) power efficient allocations of workloads to the best fitting platforms and (ii) improved overall performance in a power constrained environment. Our infrastructure relies upon platform and workload descriptors as well as a novel analytical prediction layer that accurately predicts workload power/performance across different platform architectures and power management capabilities. Our allocation scheme achieves, on average, 20% improvements in power efficiency for representative heterogeneous data center configurations, and up to 18% reductions in performance degradation when power budgeting must be performed. These results highlight the significant potential of heterogeneity-aware management.

R. Nathuji (✉) · K. Schwan
Georgia Institute of Technology, Atlanta, GA 30032, USA
e-mail: [email protected]

K. Schwan
e-mail: [email protected]

C. Isci
VMware, Palo Alto, CA 93404, USA
e-mail: [email protected]

E. Gorbatov
Intel Corporation, Hillsboro, OR 97124, USA
e-mail: [email protected]

Keywords Power management · Distributed resource management · Heterogeneous systems

1 Introduction

Power management has become a critical component of modern computing systems, pervading both mobile and enterprise environments. Power consumption is a particularly significant issue in data centers, stimulating a variety of research for server systems [2]. Increased performance requirements in data centers have resulted in elevated densities enabled via consolidation and reduced server form factors. This has in turn created challenges in provisioning the necessary power and cooling capacities. For example, current data centers allocate nearly 60 Amps per rack, a limit that is likely to become prohibitive for future high density rack configurations such as blade servers, even if the accompanying cooling issues can be solved [24]. In addition, a 30,000 square foot data center with a power consumption of 10 MW requires a cooling system which costs $2–$5 million [21]. In such a system, the cost of running the air conditioning equipment alone can reach $4–$8 million a year [24]. Coupled with the elevated electricity costs from high performance servers, these effects can substantially affect the operating costs of a data center. Overall, these trends in power/cooling delivery and cost highlight the need for power and thermal management support in data centers.


Previous work on server management has focused on managing heat during thermal events [21] or utilizing platform power management support, such as processor frequency scaling, for power budgeting [9, 18, 24]. In this paper, we approach the problem of managing data centers from a different perspective by considering how to intelligently allocate workloads amongst heterogeneous platforms in a manner that (i) improves data center power-efficiency while preserving/satisfying workload performance requirements, and (ii) meets data-center-level power budgets with minimal impact on workload performance. Typically, data centers statically allocate platform resources to applications based upon peak load characteristics in order to maintain isolation and provide performance guarantees. With the continuing growth in capabilities of virtualization solutions (e.g., Xen [1] and VMware [25]), the necessity of such offline provisioning is removed. Indeed, by allowing for flexible and dynamic migration of workloads across physical resources [6], the use of virtualization in future data centers enables a new avenue of management and optimization. Our approach begins to leverage some of these capabilities to enhance power efficiency by taking advantage of the ability to assign virtualized applications to varying sets of underlying hardware platforms based upon performance needs and power consumption characteristics.

Throughout their lifetimes, data centers continually upgrade servers due to failures, capacity increases, and migrations to new form factors [12]. Over time, this leads to data centers comprised of a range of heterogeneous platforms with differences in component technologies; power, performance and thermal characteristics; and power management capabilities. When provisioning resources to workloads in these heterogeneous environments, power efficiency can vary significantly based on the particular allocation. For example, by assigning a memory bound workload to a platform that can perform dynamic voltage and frequency scaling (DVFS), run-time power consumption can be reduced with minimal impact to performance [19]. We propose a novel heterogeneity-aware load management (HALM) architecture to achieve this power-friendly behavior in data centers.

Allocating power and cooling resources is another significant challenge in the modern data center. Though clearly beneficial for transient power delivery and cooling issues, power budgeting solutions can also be useful in the provisioning of these resources. Traditionally, power and cooling have been allocated based on the nameplate rating of the system power supply or its maximum output power. However, a fully utilized server with a typical configuration will see its electrical load at between 60%–75% of the nameplate rating with most enterprise workloads. Therefore, providing power and cooling capacity based on these worst case assumptions results in either over-allocation of power and cooling capacity or underutilization of server rack space, leading to increased capital costs and underutilized data centers. Allocating power and cooling capacity based on the average workload behavior within a server and across a data center allows significantly increased densities, but requires dynamic protection mechanisms that can limit server power consumption when demand temporarily exceeds available capacity. These mechanisms have been recently proposed in the literature and explored in the industry [8]. While very effective in limiting power and protecting the infrastructure, they may result in nontrivial degradation of peak performance, especially when the power constraint is too prohibitive. In this paper we illustrate how HALM can lessen the performance impact of data center power budgeting strategies.

Intelligent mapping of applications to underlying platforms is dependent upon the availability of relevant information about workloads and hardware resources. As part of HALM, we extend the use of workload and platform descriptors for this purpose, which are then used by a predictor component that estimates the achievable performance and power savings across the different platforms in the data center. These predictions are finally used by an allocation layer that maps workloads to a specific type of platform. This overall infrastructure is evaluated using data center configurations consisting of variations upon four distinct platforms. In summary, the main contributions of our HALM system are: (i) a platform heterogeneity-aware power management infrastructure that improves data center power efficiency under workload performance constraints and limited data center power budgets; (ii) an allocation infrastructure that uses workload and platform descriptors to perform mappings of hardware to virtualized workloads; and (iii) an intelligent load shedding policy to dynamically meet transient changes in power consumption limits. Evaluations of our system performed on state-of-the-art platforms, including Intel® Core™ microarchitecture based hardware, demonstrate the benefits of exploiting platform heterogeneity for power management.

2 Motivation

2.1 Data center composition and exploiting heterogeneity

Data center deployments are inherently heterogeneous. Upgrade cycles and replacement of failed components and systems contribute to this heterogeneity. In addition, new processor and memory architectures appear every few years, and reliability requirements are becoming ever more stringent. The effect of these trends is reflected by a recent survey of data center managers that found that 90% of the facilities are expected to upgrade their compute and storage infrastructure in the next two years. Figure 1(a) shows a distribution of different systems in a representative enterprise data center.


Fig. 1 Data center heterogeneity and management benefits

As the figure shows, the data center contains nine different generations of systems that have either (1) different processor architectures, cores and frequencies; (2) varying memory capacity and interconnect speeds; or (3) different I/O capabilities. While all systems support the same software stack, they have very different and often asymmetric performance and power characteristics.

Traditionally, the non-uniformity of systems in a data center has been characterized by different levels of performance and power consumption. However, recently, another dimension has been added to this heterogeneity because server platforms are beginning to offer rich thermal and power management capabilities. Processors support DVFS and aggressive sleep states to conserve CPU power. New memory power management implementations allow different DRAM devices to go to lower power states when inactive, and enable bandwidth throttling for thermal protection. Server power supplies exhibit different conversion efficiencies under different loads, directly impacting the overall power efficiency of the system. Since power efficiency has become an important thrust in enterprise systems, we expect component and platform vendors to continue introducing new power and thermal management capabilities into their products, including I/O and system buses, chipsets, and network and disk interfaces, making future platforms even more heterogeneous.

Previous work has proposed different approaches for energy-efficient workload allocation in clusters, but none have accounted for system level power management and thermal characteristics. Therefore, the workload allocations proposed by previous approaches will yield less than ideal results since they are completely unaware of power and thermal management effects on system performance and power consumption. To illustrate this phenomenon, we experimentally compare two dual processor systems, A and B, running two different workloads, as shown in Fig. 1(b). The differences between the two systems are in the power supply unit (PSU) and processor power management capabilities. System A has a less efficient power supply at light load and has processors with limited power management support. System B, on the other hand, has a high efficiency power supply across all loads and processors that support a rich set of power management capabilities. We measure power consumption on these platforms using two different synthetic workloads: one with full utilization (W1) and one with a very low level of utilization (W2). W1 consumes about the same amount of power on both platforms. However, allocating the low-utilization W2 to system A leads to very power inefficient execution. Since A does not support power management and has low PSU efficiency at light load, its total system power is more than 50 W higher than that of system B. Thus, while both systems meet the performance demand of both workloads, heterogeneity-aware resource allocation can decrease total power by more than 10%, translating into millions of dollars in savings for large data centers. As this example shows, a full knowledge of system power and supported power management features is required to efficiently allocate workloads. Our HALM system is designed to provide such functionality.

2.2 Benefits of heterogeneity-aware management

To further motivate the need for and benefits of heterogeneity-aware management in data centers, we perform two opportunity studies. The first study considers the possible benefits of allocating workloads by matching system capabilities and workload execution characteristics to reduce a data center's power profile while also meeting workload performance demands. We analyze an example of running a set of workloads in a data center configuration with four unique types of platforms described later in the paper, each with different power/performance characteristics. The set of workloads includes ten computational benchmarks (swim, bzip2, mesa, gcc, mcf, art, applu, vortex, sixtrack, and lucas from SPEC CPU2000) and one transaction-oriented workload (SPECjbb2005). We generate all subsets of four from these eleven benchmarks and compare three allocation policies for each of the subsets in Fig. 2(a). The 'worst case' allocation distributes the benchmarks across platforms to maximize power consumption, 'random' allocates workloads to platforms randomly, and 'optimal' distributes the workloads to minimize power consumption. For each workload, we allocate as many systems of a given type as necessary to meet workload throughput requirements.


Fig. 2 Opportunity analysis of heterogeneity-aware management

Subsets that have benchmarks with more homogeneous behavior, i.e. similar processor and memory usage behavior, appear on the left side of the graph, while subsets with more heterogeneous benchmarks appear on the right. As can be seen from the figure, subsets of workloads with more heterogeneous behavior can substantially benefit from heterogeneity-aware resource allocation. Averaging across all subsets, the optimal policy can reduce total power by 18% when compared to random allocation and by 34% over worst-case allocation, without compromising workload performance.

The second opportunity study considers how the aggregate throughput of a set of workloads varies within a given power budget based upon allocations. In particular, we assume that we have one of each of our four unique platforms and again generate subsets of four workloads from a set of SPEC CPU2000 benchmarks. For each subset, we calculate the minimum, average, and best case throughput across all permutations of possible allocations of the four workloads onto the four platforms. Figure 2(b) provides the results, where each scenario is normalized by the minimum throughput value to provide fair comparisons. We find that on average, the best case allocation provides a 23% improvement in performance over the random allocation, and a 48% improvement compared to the worst-case. These results highlight the relationship between allocation decisions and performance when a power budget must be imposed.

Summarizing, HALM addresses the power benefits of heterogeneity-aware allocation for two cases: (1) when there is no power budget and (2) when such a budget must be imposed temporarily due to power delivery or cooling constraints or as part of a power provisioning strategy [8].

3 Scalable enterprise and data center management

Our previous discussions have motivated the need to augment the behavior of data centers to improve manageability by leveraging the heterogeneity in platform capabilities. HALM extends this support with its heterogeneity-aware workload allocation infrastructure that utilizes the flexibility of rapidly developing virtualization technologies. Virtualization attempts to provide capabilities and abstractions that significantly impact the landscape of enterprise management. For example, there is active work to ensure performance isolation benefits, where it will be possible to run multiple virtual machines (VMs) within a given physical platform without interference among applications [15]. Currently, VMs can coexist on a platform with negligible performance interference as long as resources are not overcommitted. Approaches that allow for resource pools and reservations as well as dynamic resource sharing and reclamation can aid in providing isolation even when systems are over-provisioned. Secondly, by encapsulating application state within well defined virtual machines, migration of workloads among resources can be performed easily and efficiently. A more powerful contribution of virtualization, however, is the ability to combine multiple resources across physical boundaries to create virtual platforms for applications, providing a scalable enterprise environment. HALM assumes the existence of this flexible and powerful virtualization support.

The usage pattern of data centers is becoming increasingly service-oriented, where applications and workloads may be submitted dynamically by subscribers/clients. When managing these types of applications, certain management actions, such as allocation decisions, happen at a coarse granularity with finer adjustments being made at runtime to address transient issues such as reduced power budgets. One can imagine how such a data center might be managed with the typically used assignment approaches. At some infrequent interval, the pool of applications and the service level agreements (SLAs) that specify their required performance, in metrics such as throughput or response time, are compiled. Applications are then assigned to platforms using a simple load balancing scheme based upon utilization or queue lengths, possibly even accounting for differences in the performance of the systems [26], so that SLAs are met.


Fig. 3 HALM architecture

When load must be reduced to address power budgeting requirements, load might be shed from workloads in a similarly random or round robin fashion. This approach clearly leaves room for improvement, since it does not consider power or platform differences in any way. HALM addresses this weakness by performing heterogeneity-aware allocations as well as intelligent load shedding.

The HALM architecture can be organized into three major components: (1) platform/workload descriptors, (2) a power/performance predictor, and (3) an allocator, as shown in Fig. 3(a). We use platform and workload descriptors to provide our workload allocator with the differences amongst workloads and platforms. These descriptor inputs are utilized by the predictor to determine: (1) the relative performance of workloads on different types of platforms, and (2) the power savings achievable from platform power management mechanisms. Coupled with coarse platform power consumption information (obtained via online power monitoring), the allocator then performs the assignments of workloads to the available resources.

The purpose of platform descriptors is to convey information regarding the hardware and power management capabilities of a machine. A platform descriptor is made up of multiple modules, representing different system components, as shown in Fig. 3(b). Each module specifies the type of component to which it refers, such as processor, memory subsystem, or power supply. Within each of these modules, various component parameters are defined. For example, a module describing the processor component may have attributes like its microarchitecture family, frequency, and available management support. Workload descriptors are also structured in modules, headed with attribute declarations. Within each module, a list of values for that attribute is provided. As workload attributes often vary with the platform on which a workload executes, our descriptor design allows multiple attribute definitions, where each definition is predicated with component parameter values that correlate back to platform descriptors. Figure 3(b) illustrates the structure of the resulting workload descriptor. We further explain the meaning of the MPI (memory accesses per instruction) and CPI_CORE (core cycles per instruction) attributes in subsequent sections.
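To make the descriptor design concrete, the following is a minimal sketch, not the authors' actual format, of how platform and workload descriptors might be encoded as nested Python dictionaries; all field names and values are illustrative assumptions rather than the paper's specification:

# Hypothetical platform descriptor: one module per system component.
platform_descriptor = {
    "processor": {"family": "Core", "frequency_ghz": 3.0, "dvfs": True},
    "memory":    {"type": "FBD DDR2", "bus_mhz": 533, "capacity_gb": 8},
    "psu":       {"efficiency_at_light_load": 0.85},
}

# Hypothetical workload descriptor: each attribute module lists values
# predicated on platform parameters, so the predictor can pick the value
# that matches a target platform's descriptor.
workload_descriptor = {
    "MPI": {            # memory accesses per instruction, predicated on LLC size
        "llc_2mb": 0.012,
        "llc_4mb": 0.007,
    },
    "CPI_CORE": {       # core cycles per instruction, predicated on microarchitecture
        "Core": 0.9,
        "NetBurst": 1.6,
    },
}

The lookup-by-platform-parameter structure mirrors the predicated attribute definitions described above.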

Platform descriptor information can be provided in a variety of ways. It can be made readily available using platform support such as ACPI [13], and possibly also with some administrative input. To provide the required workload descriptors, we profile workloads on a minimal set of orthogonal platforms, with mutually exclusive component types. We then use an analytical prediction approach to project workload characteristics on all available platforms. As we discuss in Sect. 5, this approach provides accurate predictions that scale with increased amounts of heterogeneity.

4 Methodology

4.1 Platform hardware

Our hardware setup consists of four types of rack mounted server platforms summarized in Fig. 4(a), where LLC denotes last-level cache size. All four types of platforms contain standard components and typical configurations that entered production cycles. In our experiments, Linux was installed on all systems for measurement of various attributes (e.g. CPI, MPI, etc.) as well as performance. We validated that the performance results matched those with Xen using a subset of workloads and platforms, but performed the majority of our experiments in a non-virtualized environment to have better access to the performance counters used to measure other workload attributes.

In this paper, the platform names are based on their processor code names. All four platforms are dual-processor systems. Woodcrest, Sossaman, and Dempsey are CMP dual-core processors, and Irwindale is a 2-way SMT processor supporting Hyper-Threading Technology. All platforms have 8 GB of memory. Woodcrest and Dempsey support Fully Buffered DIMM (FBD) memory with a 533 MHz DDR2 bus, while Sossaman and Irwindale support unregistered DDR2 400 MHz memory.


Fig. 4 Experimental platforms

Table 1 Levels of heterogeneity in our experimental platforms

Woodcrest and Dempsey have dual FSB architectures with two branches to memory and two channels per branch.

All four types of systems are heterogeneous in the sense that each has a unique combination of processor architecture and memory subsystem. If we assume that Intel Core microarchitecture/Pentium® M and NetBurst constitute two types of processors, and LLC-4 MB/FSB-1066/FBD-533 and LLC-2 MB/FSB-800/DDR2-400 constitute two types of memory, all four platforms can be mapped as having unique processor/memory architecture combinations. Note that all four platforms also have vastly different power and performance characteristics. For example, the Intel Core microarchitecture is superior to NetBurst both in terms of performance and power efficiency. FBD based memory, on the other hand, provides higher throughput in our systems at the expense of elevated power consumption due to the increased DDR2 bus speed and the power requirements of the Advanced Memory Buffer (AMB) on the buffered DIMMs. The four platforms occupy separate quadrants of a heterogeneity space with dimensions of microarchitecture heterogeneity and memory subsystem heterogeneity, as shown in Fig. 4(b). We refer to this initial level of heterogeneity as "across-platform heterogeneity". However, in addition to this, all these server platforms also support chip-level DVFS.


This leads to a second degree of heterogeneity, where one type of platform can have instances in a data center that are configured to operate at different frequencies. We refer to this as "within-platform heterogeneity". As process variations increasingly result in the binning of produced chips into different operating points, this within-platform heterogeneity becomes an inherent property of the general data center landscape. Finally, many of these platforms may incorporate some processor dynamic power management (DPM) techniques that adaptively alter platform behavior at runtime. This creates a third source of heterogeneity, "DPM-capability heterogeneity", where platforms with built-in DPM hooks exhibit different power/performance characteristics from the ones with no DPM capabilities. In Table 1, we show how these three levels of heterogeneity quickly escalate the number of distinct platform configurations in a data center scenario.

All experimental power measurements are performed using the Extech 380801 power analyzer. Power is measured at the wall and represents the total AC power consumption of the entire system. The power numbers presented in this paper are obtained by averaging the instantaneous system power consumption over the entire run of each workload. Our assumption is that infrastructure support for monitoring power consumption will be utilized to obtain this type of workload specific power characteristics online, instead of relying on parameterized models. For example, all power supplies which adhere to the latest power supply monitoring interface (PSMI) specification support out-of-band current/voltage sampling, allowing for the kind of per-platform AC power monitoring reflected by our actual power measurements.

4.2 Application model

When power managing computing environments, improvements can be attained with a variety of approaches. In this work, we consider two scenarios. The first assumes a lack of budgeting constraints, concentrating on a workload allocation that reduces power consumption while maintaining baseline application performance. In other words, we maximize the performance per watt while holding performance constant. The second addresses power budgeting by performing load shedding to reduce power consumption while minimizing the performance impact to workloads. We consider application performance in terms of throughput, or the rate at which transaction operations are performed. Therefore, it is not the execution time of each transaction that defines performance, but the rate at which multiple transactions can be sustained. This type of model is representative of applications such as transaction based web services or payroll systems.

The goal of HALM is to evaluate the power-efficiency tradeoffs of assigning a workload to a variety of platforms. Since the performance capabilities of each platform are different, the execution time to perform a single operable unit, or atomic transaction, varies across them. As previously mentioned, virtualization technologies can help to extend the physical resources dedicated to applications when necessary to maintain performance by increasing the number of platforms used to execute transactions. In particular, transactions can be distributed amongst nodes until the desired throughput is reached.

For our analysis, we consider applications that mimic the high performance computational applications common to data center environments and also heavily exercise the power hungry components of server platforms, the processor and memory. Two aspects of these workloads are captured in our experimental analysis. First, these workloads are inherently transactional, such as the previous financial payroll example or the processing of risk analysis models across different inputs common to investment banking. Second, with the ability to incorporate large amounts of memory into platforms at relatively low costs, these applications often execute mostly from memory, with little or no I/O being performed. Though I/O such as network use can play a significant role in multi-tier enterprise applications, we leave consideration of such characteristics to future work. To realize our application model, while also providing deterministic and repeatable behavior for our experimentation, we utilize benchmarks from the SPEC CPU2000 suite as representative examples of transaction instances. SPEC benchmarks allow for the isolation of processor and memory components, while also generating different memory loads. Indeed, many SPEC benchmarks exhibit significant measured memory bandwidth of 5–8 GB/sec on our systems. In order to provide an unbiased workload set, we include all SPEC benchmarks in our experiments. For each application, we specify an SLA in terms of a required transaction processing rate, equal to the throughput achievable on the Woodcrest platform.

5 Workload behavior estimation

The power/performance predictor component of our HALM framework can be implemented in multiple ways. For example, one can profile a set of microbenchmarks on all platform configurations and develop statistical mapping functions across these configurations. However, as the platform types and heterogeneity increase, the overhead of such approaches can be prohibitive. Instead, we develop a predictor that relies on the architectural platform properties and adjusts its predictions based on the heterogeneity specifications. We refer to this model as the "Blocking Factor (BF) Model". The BF model simply decomposes execution cycles into CPU cycles and memory cycles.


Fig. 5 Performance prediction results

CPU cycles represent the execution with a perfect last-level cache (LLC), while memory cycles capture the finite cache effects. This model is similar to the "overlap model" described by Chou et al. [5]. With the BF model, the CPI (cycles per instruction) of a workload can be represented as in (1). Here CPI_CORE represents the CPI with a perfect LLC. This term is independent from the underlying memory subsystem. CPI_MEM accounts for the additional cycles spent for memory accesses with a finite-sized cache:

CPI = CPI_CORE + CPI_MEM.   (1)

The CPI_MEM term can be expanded into architecture and workload specific characteristics. Based on this, the CPI of a platform at a specific frequency f_1 can be expressed as in (2). Here, MPI is the memory accesses per instruction, which is dependent on the workload and the LLC size; L is the average memory latency, which varies based upon the memory subsystem specifications; and BF is the blocking factor that accounts for the overlapping concurrent execution during memory accesses, which is a characteristic of the workload:

CPI(f_1) = CPI_CORE(f_1) + MPI · L(f_1) · BF(f_1).   (2)
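As a hypothetical numeric illustration of (2), with values assumed purely for exposition and not measured in this paper: taking CPI_CORE(f_1) = 1.0, MPI = 0.01 accesses per instruction, L(f_1) = 200 cycles, and BF(f_1) = 0.6, the predicted CPI would be 1.0 + 0.01 · 200 · 0.6 = 2.2, i.e. more than half of the cycles per instruction would be attributable to the memory subsystem.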

Using variants of (2), performance prediction can be performed relatively easily for within-platform heterogeneity, as well as across-platform heterogeneity. For within-platform heterogeneity, the frequency-dependent components of (2) are scaled with frequency to predict workload performance at a different frequency setting. The top chart in Fig. 5 provides results for an example of this type of prediction with an orthogonal platform (Sossaman). The figure contains the actual measured performance for our workloads together with the predicted performance.

In the latter case of across-platform heterogeneity, the natural decoupling of the microarchitectural and memory subsystem differences in the BF model enables us to estimate application performance on a platform lying on a different corner of the memory and microarchitecture heterogeneity space. Among our four experimental platforms, two "orthogonal platforms", which span two opposite corners of the platform heterogeneity quadrants in Fig. 4(b), can be used to predict performance on a third "derived platform". The lower chart in Fig. 5 shows the prediction results for the Woodcrest platform, whose performance is "derived" using the CPI_CORE and CPI_MEM characteristics of the orthogonal platforms (Sossaman and Dempsey, respectively). Overall, for the orthogonal platforms, the BF model can very accurately predict performance with an average prediction error of 2%. For the derived platforms, our predictor can track actual execution times very well, though with an increased average prediction error of 20%. In the following sections, we show that this performance prediction methodology provides sufficient accuracy to represent workload behavior and allows HALM to achieve close to optimal allocations. Further details of this prediction methodology can be found in our previous work [22].
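The two prediction modes can be summarized in a minimal sketch; this is one plausible reading of the BF model's use, with function and parameter names assumed here rather than taken from the paper's implementation:

def predict_cpi(cpi_core, mpi, latency_cycles, bf):
    # Equation (2): CPI = CPI_CORE + MPI * L * BF.
    return cpi_core + mpi * latency_cycles * bf

def predict_cpi_at_frequency(cpi_core, mpi, latency_ns, bf, freq_ghz):
    # Within-platform prediction: memory latency in nanoseconds is roughly
    # frequency-invariant, so its contribution measured in core cycles grows
    # with core frequency. (BF is treated as frequency-independent here.)
    return predict_cpi(cpi_core, mpi, latency_ns * freq_ghz, bf)

def derive_cpi_across_platforms(cpi_core_from_platform_a, cpi_mem_from_platform_b):
    # Across-platform prediction: combine the core component profiled on one
    # orthogonal platform with the memory component profiled on the other.
    return cpi_core_from_platform_a + cpi_mem_from_platform_b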

The final heterogeneity type supported by our predictor is DPM-capability heterogeneity. For this, we consider a platform that enables DVFS during memory bound execution regions of an application. We implement this functionality as part of OS power management, based on prior work [14]. To incorporate DPM awareness, we extend the predictor component to estimate the potential power savings that can be attained when executing a workload on a DPM-enabled platform. Experimental results show that there is a strong correlation between the MPI of a workload and its power saving potential. Therefore, we utilize the MPI attribute in the workload descriptors to predict the power saving potentials of workloads on DPM-enabled platforms. Figure 6 shows that our MPI based prediction approach effectively captures the power saving potentials of different workloads and successfully differentiates applications that can benefit significantly from being allocated to a DPM-enabled machine.
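The exact mapping from MPI to expected DPM savings is not spelled out beyond this correlation; a minimal sketch of one plausible realization, with an entirely assumed threshold and coefficient, might look as follows:

# Illustrative only: the threshold and linear coefficient are assumptions,
# not values reported in the paper.
DPM_MPI_THRESHOLD = 0.005     # workloads at or above this MPI count as memory bound

def estimated_dpm_savings_watts(mpi, watts_per_unit_mpi=2000.0):
    # Assume a simple monotonic (here linear) relation between MPI and the
    # power savings achievable by applying DVFS in memory bound regions.
    return max(0.0, watts_per_unit_mpi * mpi)

def benefits_from_dpm(mpi):
    return mpi >= DPM_MPI_THRESHOLD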


Fig. 6 Power saving predictions for DPM-enabled platforms

As we describe in Sect. 6.1, we use this predictor to choose workloads that should be assigned to the DPM-enabled platforms.

6 Management policies

6.1 HALM allocation policy

After processing the workload and platform descriptors, and utilizing our BF model for performance prediction, the next step is to perform allocation of resources to a set of applications in a data center. Evaluations are based on a greedy policy for allocating workloads. In particular, with each application i, we associate a cost metric for executing on each platform type k. Workloads are then ordered into a scheduling queue based on their maximum cost metric across all platform types. The allocator then performs application assignment based on this queue, where applications with higher worst-case costs have priority. The platform type chosen for an application is a function of this cost metric across the available platforms as well as the estimated DPM benefits. As the cost metric for our policy, we define N_{i,k}, the number of platforms of type k required to execute a workload i. This value is clearly a function of both the performance capabilities of the platform and the SLA requirement of the workload. N_{i,k} can be analytically defined, given the transaction based application model utilized in our work. For each application i, the service level agreement (SLA) specifies that X_i transactions should be performed every Y_i time units. If t_{i,k} is the execution time of a transaction of application i on platform k, the resulting number of platforms required to achieve the SLA can be expressed with (3):

N_{i,k} = ⌈ X_i / ⌊ Y_i / t_{i,k} ⌋ ⌉.   (3)

The t_{i,k} values are provided by the performance predictor. It should be noted that there is a discretization in N_{i,k}, which is due to the fact that individual atomic transactions cannot be parallelized across multiple platforms. N_{i,k} is therefore better able to handle errors due to the inherent discretization performed, making it a strong choice as a cost metric (other possible metrics are discussed and defined in our previous work [22]). Given the use of N_{i,k} as our cost metric, our allocation approach first determines the platform types for which (1) there are enough available systems to allocate the workload and (2) the cost metric is minimized. We then use DPM savings to determine whether a more power efficient platform alternative should be used among those with the same cost value. In other words, if there are multiple platform types for which an application has the same N_{i,k} value, we utilize a DPM specific threshold to decide whether or not it should be scheduled to a DPM-enabled platform type. As we demonstrate in the following section, this threshold based approach can be effective in identifying workloads that can take advantage of DPM capabilities.
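The following Python sketch illustrates one way this greedy policy could be realized; it is a simplified reading of the description above rather than the authors' implementation, and the SLA fields, capacity bookkeeping, and DPM-threshold handling are assumptions for exposition:

import math

def num_platforms_needed(x_i, y_i, t_ik):
    # Equation (3): N_{i,k} = ceil(X_i / floor(Y_i / t_{i,k})).
    per_platform = math.floor(y_i / t_ik)
    return math.ceil(x_i / per_platform)

def greedy_allocate(workloads, platform_types, predict_time, dpm_savings, dpm_threshold):
    # workloads: list of dicts with 'id', 'X', 'Y' (SLA: X transactions every Y time units).
    # platform_types: dict mapping type -> {'available': int, 'dpm': bool}.
    # predict_time(w, k): predicted transaction time t_{i,k} from the BF model.
    # dpm_savings(w): predicted DPM power savings (e.g. derived from the MPI attribute).
    cost = {(w['id'], k): num_platforms_needed(w['X'], w['Y'], predict_time(w, k))
            for w in workloads for k in platform_types}
    # Order workloads by their worst-case cost across platform types, hardest first.
    queue = sorted(workloads,
                   key=lambda w: max(cost[(w['id'], k)] for k in platform_types),
                   reverse=True)
    assignment = {}
    for w in queue:
        # Platform types with enough free systems for this workload.
        feasible = [k for k in platform_types
                    if platform_types[k]['available'] >= cost[(w['id'], k)]]
        if not feasible:
            continue  # workload cannot be placed
        best = min(cost[(w['id'], k)] for k in feasible)
        ties = [k for k in feasible if cost[(w['id'], k)] == best]
        dpm_ties = [k for k in ties if platform_types[k]['dpm']]
        non_dpm_ties = [k for k in ties if not platform_types[k]['dpm']]
        # Break ties toward a DPM-enabled type only if the predicted savings
        # exceed the (assumed) threshold.
        if dpm_ties and dpm_savings(w) >= dpm_threshold:
            chosen = dpm_ties[0]
        else:
            chosen = (non_dpm_ties or ties)[0]
        platform_types[chosen]['available'] -= cost[(w['id'], chosen)]
        assignment[w['id']] = chosen
    return assignment

For instance, under these assumptions an SLA of X_i = 1000 transactions every Y_i = 60 s on a platform with t_{i,k} = 0.5 s per transaction yields ⌊60/0.5⌋ = 120 transactions per platform per interval, so N_{i,k} = ⌈1000/120⌉ = 9 platforms of that type.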

6.2 HALM power budgeting policy

In order to address transient power delivery or cooling issues, it is sometimes necessary to temporarily reduce power consumption in a data center. To provide this mechanism, we develop a load shedding policy based upon an existing workload allocation scheme. The goal of the policy is to reduce the amount of resources provided to applications in order to meet a power budget while still allowing all workloads to make some progress. In other words, application performance may be degraded compared to prescribed SLAs, but all applications achieve some fraction of their SLA.

Our power budgeting policy is, again, a greedy approach. For all applications with resources that can be shed, i.e. applications that utilize more than one platform, we define a power-efficiency metric as the throughput per Watt that is being provided by each resource. Afterwards, the resources with minimal power efficiency are shed until the power budget is met. As our experimental results demonstrate, this simple metric allows for improved performance when power budgeting must be performed, as well as better fairness across workloads in terms of the performance degradation experienced.
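A minimal sketch of this shedding loop, under the assumption that each allocated node reports its throughput contribution and power draw (the data structures and names below are illustrative, not the paper's code):

def shed_until_within_budget(nodes, workload_node_count, total_power, budget):
    # nodes: list of dicts with 'workload', 'throughput', and 'power' per allocated node.
    # workload_node_count: dict mapping workload -> number of nodes currently assigned.
    # Sheds the least power-efficient sheddable nodes (lowest throughput per Watt)
    # until the aggregate power fits within the budget.
    def sheddable(n):
        # Only workloads that keep at least one node afterwards may lose a node.
        return workload_node_count[n['workload']] > 1

    candidates = sorted((n for n in nodes if sheddable(n)),
                        key=lambda n: n['throughput'] / n['power'])
    shed = []
    for n in candidates:
        if total_power <= budget:
            break
        if not sheddable(n):
            continue  # earlier shedding may have left this workload with a single node
        total_power -= n['power']            # node is placed into a low power state
        workload_node_count[n['workload']] -= 1
        shed.append(n)
    return shed, total_power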

7 Experimental evaluation

7.1 Increasing power efficiency

In order to evaluate our heterogeneity-aware allocation approach, we perform power and performance measurements of our SPEC based representative transactional workloads across each type of platform. To scale these results to the number of platforms present in data centers, this measured data is extrapolated analytically using a data center allocation simulator which combines real power and performance data, prediction output, and allocation policy definitions to calculate power efficiency in various data center configurations. In the simulator, we provide the output of the predictor as input to the allocation policy. We always assume that the platforms which are profiled are the 2 GHz Sossaman platform and the 3.7 GHz Dempsey system. Since we assume the workload attributes are profiled accurately on these systems, for fairness we also assume that for these two platforms performance data is obtained via profiling as well and is therefore known perfectly. We then consider three different scenarios: (1) all other platform performance information is known perfectly (oracle); (2) our BF model is used to predict performance for the remainder of the platforms as described in Sect. 5 (BF model); and (3) incorporating a simple statistical regression approach (Stat. Est.). For this regression method, we profile a subset of applications across all platforms to obtain linear performance prediction models parameterized by variables that can be obtained by profiling a workload on the 2 GHz Sossaman and 3.7 GHz Dempsey systems (CPI, MPI, etc.). The regression models can then be used to predict the performance of any application. The baseline allocation scheme we compare against is a random one, since it closely approximates the common round-robin or utilization based approaches.

The efficiency improvements achievable in a data center are also dependent upon the system's current mix of applications. To obtain our results, we randomly pick applications and allocate them using the random approach until no more workloads can be scheduled. Using the resulting set of workloads, we then evaluate power consumption when using our prediction and allocation policies, and compare them against the random allocation result. This is repeated a hundred times for each of our data points.
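Read literally, this evaluation procedure amounts to a small Monte Carlo loop; a sketch of one plausible structure (the callables, their signatures, and the notion of a power model are assumptions, not the simulator's actual interface) is:

import random

def average_power_savings(app_pool, capacity, random_allocate, policy_allocate, power_of, trials=100):
    # Draw a random workload mix that fills the data center under random allocation,
    # then compare total power under the random allocation and under the evaluated policy.
    savings = []
    for _ in range(trials):
        workloads, rand_alloc = [], {}
        while True:
            candidate = random.choice(app_pool)
            placement = random_allocate(rand_alloc, candidate, capacity)
            if placement is None:          # no more workloads can be scheduled
                break
            workloads.append(candidate)
            rand_alloc.update(placement)
        policy_alloc = policy_allocate(workloads, capacity)
        baseline, managed = power_of(rand_alloc), power_of(policy_alloc)
        savings.append((baseline - managed) / baseline)
    return sum(savings) / len(savings)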

We first look at the benefits achieved with HALM in data center configurations with across-platform and within-platform heterogeneity but no DPM support. In particular, we include the four base platforms, Woodcrest, Sossaman, Dempsey, and Irwindale, as well as the frequency variations of the platforms. We create data center configurations with equal numbers of each type of system. Trends are consistent across various data center sizes, so for brevity, we include here only results with 1000 platforms of each type. The resulting data center has 13 types of platforms, and power consumption varies with allocation as shown in Fig. 7(a). The first interesting observation is that platform heterogeneity allows us to achieve improved benefits over a simple random approach. Indeed, we see improvements of 22% with perfect knowledge and 21% using our BF based prediction compared to a random allocation policy. We also observe a significant difference between the statistical and analytical prediction schemes. The regression approach is unable to scale in terms of accuracy with increased heterogeneity, whereas the BF approach achieves close to optimal power savings.

In order to evaluate how well our allocator can exploit DPM support, we extend the thirteen platform type configuration with an additional Woodcrest 3 GHz platform which provides DPM support. We again find that our BF prediction method can provide improved aggregate savings across all machines over the statistical approach, as shown in Fig. 7(b). To more closely determine our ability to exploit DPM mechanisms, we also evaluate the power consumption of the thousand DPM-enabled platforms (all of which are active). We find that our BF model based allocation is able to improve the power efficiency of these platforms by 3.3%. This illustrates the potential of HALM to provide additional benefits when platforms vary in the power management they support.

7.2 Maximizing performance under power budgets

As a second benefit of HALM, we evaluate the ability to perform load shedding when power budgeting must be performed. In particular, our goal is to maximize the performance obtained in the data center while observing power budgets.

Fig. 7 HALM power improvements


Fig. 8 Power budgeting results

For the purposes of this paper, we consider transient power budgeting where workloads are not (re-)migrated, but instead, based upon an existing allocation, resources are temporarily withheld from applications and placed into low power states. For comparison, we consider three such initial allocations from the previous section: one based upon a random allocation, one based upon perfect oracle performance information, and finally an allocation based upon prediction with our BF approach. For each allocation, we consider two load shedding policies, one which randomly selects an application and sheds resources (i.e. a single compute node) if possible, and our greedy "smart" policy described in Sect. 6.2.

Figure 8(a) provides the aggregate performance acrossall workloads in the data center for different power budgets.The performance results are normalized with respect to themaximum achievable performance at the maximum uncon-strained power budget (100%). The figure shows how theperformance of different allocation policies decrease withdecreasing power budgets. The range of applied power bud-gets is limited to 50% as there is no possible allocation thatmeets the budget below this point. The six scenarios con-sidered are random shedding based upon a random allo-cation (rand-rand), our load shedding policy based upon arandom allocation (smart-rand), similarly the two sheddingapproaches based upon oracle based allocation (rand-oracleand smart-oracle), and finally, shedding based upon a BFprediction based allocation (rand-BF and smart-BF). We seemultiple interesting trends. First, the intrinsic benefits of theheterogeneity-aware allocations towards budgeting are ap-parent in the figure by the fact that performance does notbegin to reduce until lower power budgets compared to arandom allocation scheme. We also see that given a particu-lar allocation, our shedding policy provides benefits in per-formance across the set of power budgets. Moreover, again,we find that our BF prediction model behaves very close toan oracle based allocation when our load shedding policy isused. Overall, we see benefits of up to 18% in performancedegradation compared to a random load shedding policybased upon a random allocation. Figure 8(b) evaluates theperformance degradations for the six approaches by also

Figure 8(b) evaluates the performance degradation of the six approaches while also taking the fairness of load shedding into account. Here, we report the aggregate data center performance as the harmonic mean of the individual workload throughputs, a commonly used metric for evaluating fairness [20]. We see from the figure that a random allocation always exhibits poor performance regardless of the load shedding policy used. A heterogeneity-aware allocation, on the other hand, provides improved fairness, particularly when combined with our load shedding policy.
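Writing $T_1, \ldots, T_n$ for the individual workload throughputs (our notation), the harmonic mean reported in Fig. 8(b) is
\[ \bar{T}_{\mathrm{H}} \;=\; \frac{n}{\sum_{i=1}^{n} 1/T_{i}}, \]
which, unlike the aggregate sum used in Fig. 8(a), collapses toward zero if any single workload is starved, and therefore rewards policies that shed load evenly.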

8 Related work

HALM builds upon existing work and extends the state of the art in power management research. A variety of mechanisms exist to provide power and thermal management support within a single platform. Brooks and Martonosi proposed mechanisms for the enforcement of thermal thresholds on the processor [3], focusing on single-platform characteristics as opposed to the data center level management achieved with HALM. Processor frequency and voltage scaling based upon memory access behavior has been shown to provide power savings with minimal impact on applications. Resulting solutions include hardware-based approaches [19] and OS-level techniques that set processor modes based on predicted application behavior [14]. HALM is designed to manage load while being aware of such underlying power management occurring in server platforms. Power budgeting of SMP systems with a performance loss minimization objective has also been implemented via CPU throttling [16]. Other budgeting solutions extend platform support for fine-grain server power limiting [18]. The power budgeting achieved with HALM is based upon the use of resource allocation to reduce power consumption across multiple systems, as opposed to throttling the performance of individual components.

At the data center level, incorporating temperature-awareness into workload placement has been proposed by Moore et al. [21], along with emulation environments for studying the thermal implications of power management [11]. HALM can use these thermal-aware strategies to perform power budgeting based upon data center temperature characteristics.

Chase et al. discuss how to reduce power consumption in data centers by turning servers on and off based on demand [4]. Utilizing this type of cluster reconfiguration in conjunction with DVFS [7] and the use of spare servers [23] has been investigated as well. As opposed to these approaches, HALM attempts to reduce power consumption by intelligently managing workloads across heterogeneous servers. Enforcing power budgets within data centers by allocating power in a non-uniform manner across nodes has been shown to be an effective management technique [9]. Techniques for enforcing power budgets at blade-enclosure granularity have also been discussed [24]. HALM budgets aggregate power consumption via resource allocation without assigning per-server power budgets as in these previous approaches.

Heterogeneity has been considered to some degree in prior work, including the evaluation of heterogeneous multi-core architectures with different core complexities [17]. In comparison, HALM considers platform-level heterogeneity as opposed to processor asymmetry. In cluster environments, a scheduling approach for power control has been proposed for processors with varying fixed frequencies and voltages [10]. HALM supports heterogeneity across additional dimensions, such as power management capabilities and memory. A power-efficient web server with intelligent request distribution in heterogeneous clusters is another example of leveraging heterogeneity in enterprise systems [12]. HALM goes beyond these methods by considering not just the differences in platforms' performance capabilities, but also in their power management capabilities.

9 Conclusions and future work

Power management in data center environments has become an important area of research, in part because power delivery and cooling limitations are quickly becoming a bottleneck in provisioning the performance required by increasingly demanding applications. This paper makes use of the management flexibility afforded by virtualization solutions to develop a heterogeneity-aware load management (HALM) system. HALM improves power management capabilities by exploiting the natural heterogeneity of platforms in data centers, including differences in the dynamic power management support that may be available. We introduce a three-phase approach to mapping workloads to underlying resources to improve power efficiency, consisting of structured platform and workload descriptors, a prediction component that estimates the performance and power characteristics of candidate workload-to-platform mappings, and finally an allocator that uses policies and prediction results to make placement decisions.

We also evaluate a load shedding policy based upon the resulting allocations to improve performance when power budgets must be enforced.
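As a rough illustration of this three-phase flow, the sketch below shows only its overall shape; the descriptor objects, prediction model, and allocation policy are hypothetical stand-ins, not HALM's actual interfaces.

def allocate(workload_descs, platform_descs, predict, policy):
    # Phase 2: predict power/performance for each workload-to-platform
    # mapping from the Phase 1 descriptors.
    estimates = {(w.name, p.name): predict(w, p)
                 for w in workload_descs
                 for p in platform_descs}
    # Phase 3: the allocator applies a policy to the predictions to choose
    # a placement for every workload.
    return policy(estimates, workload_descs, platform_descs)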

Our results underscore two major conclusions. First, we show that by intelligently accounting for the varying power management capabilities of platforms, the power savings these systems can obtain through their management mechanisms are substantially improved compared to other assignment models. Using representative data center configurations ranging from older P4-based platforms to Intel Core microarchitecture based systems, we find that our allocation architecture improves power efficiency by 20% on average. In addition, our results show that by performing intelligent load shedding when power budgets must be observed, HALM reduces performance degradation by up to 18%.

In this paper, we present the beginning of our investigation into exploiting platform heterogeneity and emerging virtualization support to improve the power characteristics of enterprise computing environments. As future work, we plan to integrate the management tradeoffs and lessons learned from this work into virtualization-layer management applications. This includes the consideration of distributed virtualized workloads, such as tiered web services, where different components, including heterogeneous I/O devices, may be appropriate for each layer. The results presented in this paper support the potential of this area of research for power-managing heterogeneous computing systems.

References

1. Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A.: Xen and the art of virtualization. In: Proceedings of the ACM Symposium on Operating Systems Principles (SOSP), 2003

2. Bianchini, R., Rajamony, R.: Power and energy management for server systems. IEEE Comput. 37(11), 68–76 (2004)

3. Brooks, D., Martonosi, M.: Dynamic thermal management for high-performance microprocessors. In: Proceedings of the 7th International Symposium on High-Performance Computer Architecture (HPCA), January 2001

4. Chase, J., Anderson, D., Thakar, P., Vahdat, A., Doyle, R.: Managing energy and server resources in hosting centers. In: Proceedings of the 18th Symposium on Operating Systems Principles (SOSP), October 2001

5. Chou, Y., Fahs, B., Abraham, S.: Microarchitecture optimizations for exploiting memory-level parallelism. In: Proceedings of the International Symposium on Computer Architecture (ISCA), June 2004

6. Clark, C., Fraser, K., Hand, S., Hansen, J.G., Jul, E., Limpach, C., Pratt, I., Warfield, A.: Live migration of virtual machines. In: Proceedings of the 2nd ACM/USENIX Symposium on Networked Systems Design and Implementation (NSDI), May 2005

7. Elnozahy, E.N., Kistler, M., Rajamony, R.: Energy-efficient server clusters. In: Proceedings of the Workshop on Power-Aware Computing Systems, February 2002

8. Fan, X., Weber, W.-D., Barroso, L.: Power provisioning for a warehouse-sized computer. In: Proceedings of the International Symposium on Computer Architecture (ISCA), June 2007

9. Femal, M., Freeh, V.: Boosting data center performance through non-uniform power allocation. In: Proceedings of the Second International Conference on Autonomic Computing (ICAC), 2005

10. Ghiasi, S., Keller, T., Rawson, F.: Scheduling for heterogeneous processors in server systems. In: Proceedings of the International Conference on Computing Frontiers, 2005

11. Heath, T., Centeno, A.P., George, P., Ramos, L., Jaluria, Y., Bianchini, R.: Mercury and Freon: Temperature emulation and management in server systems. In: Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 2006

12. Heath, T., Diniz, B., Carrera, E.V., Meira, W. Jr., Bianchini, R.: Energy conservation in heterogeneous server clusters. In: Proceedings of the 10th Symposium on Principles and Practice of Parallel Programming (PPoPP), 2005

13. Hewlett-Packard, Intel, Microsoft, Phoenix, and Toshiba: Advanced configuration and power interface specification. http://www.acpi.info (2004)

14. Isci, C., Contreras, G., Martonosi, M.: Live, runtime phase monitoring and prediction on real systems with application to dynamic power management. In: Proceedings of the 39th International Symposium on Microarchitecture (MICRO-39), December 2006

15. Koh, Y., Knauerhase, R., Brett, P., Bowman, M., Wen, Z., Pu, C.: An analysis of performance interference effects in virtual environments. In: Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2007

16. Kotla, R., Ghiasi, S., Keller, T., Rawson, F.: Scheduling processor voltage and frequency in server and cluster systems. In: Proceedings of the Workshop on High-Performance, Power-Aware Computing (HP-PAC), 2005

17. Kumar, R., Tullsen, D., Ranganathan, P., Jouppi, N., Farkas, K.: Single-ISA heterogeneous multi-core architectures for multithreaded workload performance. In: Proceedings of the International Symposium on Computer Architecture (ISCA), June 2004

18. Lefurgy, C., Wang, X., Ware, M.: Server-level power control. In: Proceedings of the IEEE International Conference on Autonomic Computing (ICAC), June 2007

19. Li, H., Cher, C., Vijaykumar, T., Roy, K.: VSV: L2-miss-driven variable supply-voltage scaling for low power. In: Proceedings of the IEEE International Symposium on Microarchitecture (MICRO-36), 2003

20. Luo, K., Gummaraju, J., Franklin, M.: Balancing throughput and fairness in SMT processors. In: Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), November 2001

21. Moore, J., Chase, J., Ranganathan, P., Sharma, R.: Making scheduling cool: Temperature-aware workload placement in data centers. In: Proceedings of USENIX '05, June 2005

22. Nathuji, R., Isci, C., Gorbatov, E.: Exploiting platform heterogeneity for power efficient data centers. In: Proceedings of the IEEE International Conference on Autonomic Computing (ICAC), June 2007

23. Rajamani, K., Lefurgy, C.: On evaluating request-distribution schemes for saving energy in server clusters. In: Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2003

24. Ranganathan, P., Leech, P., Irwin, D., Chase, J.: Ensemble-level power management for dense blade servers. In: Proceedings of the International Symposium on Computer Architecture (ISCA), 2006

25. Sugerman, J., Venkitachalam, G., Lim, B.-H.: Virtualizing I/O devices on VMware Workstation's hosted virtual machine monitor. In: Proceedings of the USENIX Annual Technical Conference, 2001

26. Zhang, W.: Linux virtual server for scalable network services. In: Ottawa Linux Symposium, 2000

Ripal Nathuji is a Ph.D. candidate in the School of Electrical and Computer Engineering at the Georgia Institute of Technology. He previously received his M.S. in Computer Engineering from Texas A&M University and his B.S. in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology. His current research focuses on system-level power management of computing systems, with applications to virtualized enterprise server environments.

Canturk Isci is a senior member of technical staff at VMware. His research interests include power-aware computing systems and workload-adaptive dynamic management. He has a Ph.D. and an M.A. in electrical engineering from Princeton University, an M.S. in VLSI system design from the University of Westminster, London, UK, and a B.S. in electrical engineering from Bilkent University, Ankara, Turkey.

Eugene Gorbatov is a researcher at Intel's Energy Efficient Systems lab. His current research focuses on platform power management, particularly the interaction of different platform components and system software.

Karsten Schwan is a professor in the College of Computing at the Georgia Institute of Technology, and is also the Director of the Center for Experimental Research in Computer Systems (CERCS). He obtained his M.S. and Ph.D. degrees from Carnegie-Mellon University in Pittsburgh, Pennsylvania, where he began his research in high performance computing, addressing operating and programming systems support for the Cm* multiprocessor. His current work ranges from topics in operating and communication systems, to middleware, to parallel and distributed applications, focusing on information-intensive distributed applications in the enterprise domain and in the high performance domain.