Slide 1 Cluster-on-Demand (COD) Justin Moore Duke University.

Slide 1

Cluster-on-Demand (COD)

Justin Moore

Duke University

Slide 2

How Big Is It?

500? 5000? 25,000?

Clusters are growing

Clusters are expensive

– Power, A/C, Management …

How to manage {heat, power, failures}?

How to keep everything organized?

How to divide resources?

Slide 3

How Do You Use It?

We’ve got good middleware

– Batch queues, Internet Services, research apps …

But customers are very picky

– “Linux!” “FreeBSD!” “Windows!” “Minix!” “Minix??”

– “I only need it for 30 minutes!!”

Customers != administrators

– Contributing to the problem, not the solution

How to share and manage our clusters?

“Can’t we all just get along??”

Slide 4

COD: The More the Merrier

Automated framework for resource management

Owners define policies, customers define configs

COD creates, configures dynamic virtual clusters

– Isolated, secure collection of nodes

– Backed by network storage

– Automatic configuration: fast and OS-agnostic

Middleware negotiates allocations with COD

– Virtual Cluster Manager: COD-aware layer
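
The negotiation step above can be pictured with a minimal Python sketch. The slides do not describe COD's interfaces, so every name here (OwnerPolicy, ClusterConfig, CodManager, negotiate) and the single per-customer node cap are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class OwnerPolicy:
    """Hypothetical owner-side policy: a cap on nodes per customer."""
    max_nodes_per_customer: int


@dataclass
class ClusterConfig:
    """Hypothetical customer-side config for one virtual cluster."""
    customer: str
    os_image: str
    nodes_wanted: int


@dataclass
class CodManager:
    """Hypothetical manager that tracks free nodes and virtual clusters."""
    policy: OwnerPolicy
    free_nodes: list
    clusters: dict = field(default_factory=dict)

    def negotiate(self, config):
        """Grant as many nodes as the policy and the free pool allow."""
        held = len(self.clusters.get(config.customer, []))
        allowed = max(0, self.policy.max_nodes_per_customer - held)
        grant = min(config.nodes_wanted, allowed, len(self.free_nodes))
        granted = [self.free_nodes.pop() for _ in range(grant)]
        self.clusters.setdefault(config.customer, []).extend(granted)
        return granted


manager = CodManager(OwnerPolicy(max_nodes_per_customer=3),
                     free_nodes=["n1", "n2", "n3", "n4", "n5"])
offer = manager.negotiate(ClusterConfig("sge", "linux", nodes_wanted=4))
print(len(offer), "nodes granted;", len(manager.free_nodes), "left free")
```

In this sketch the middleware asks for four nodes and receives a smaller offer because of the owner's cap; a real negotiation could involve several rounds, which is not modeled here.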

Slide 5

Dynamic Virtual Clusters

[Diagram labels: COD Manager, DB, Reserve pool (off-power), SGE Virtual Cluster, Ninja Virtual Cluster, Node reallocation]

Example: CNN on 9/11
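
Node reallocation between virtual clusters can be sketched roughly as follows, in Python. The order used here (draw from the powered-off reserve pool first, then from a donor cluster, always leaving the donor one node) is an assumption for illustration, as are the function name reallocate and the node names; the slide only shows that nodes move between the reserve pool and the virtual clusters.

```python
def reallocate(clusters, reserve, target, needed, donor):
    """Move up to `needed` nodes into `target`: reserve pool first, then donor."""
    moved = []
    # Assumed first choice: power on nodes from the reserve pool.
    while len(moved) < needed and reserve:
        moved.append(reserve.pop())
    # Assumed fallback: take nodes from the donor, leaving it at least one.
    while len(moved) < needed and len(clusters[donor]) > 1:
        moved.append(clusters[donor].pop())
    clusters[target].extend(moved)
    return moved


clusters = {"sge": ["n1", "n2", "n3"], "ninja": ["n4"]}
reserve = ["n5", "n6"]

# A load surge on the Internet-service cluster asks for three more nodes.
moved = reallocate(clusters, reserve, target="ninja", needed=3, donor="sge")
print(moved, clusters)
```

Here the surge is absorbed by two reserve nodes plus one node taken from the batch cluster.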

Slide 6

Those Wonderful Toys

Leverage open standards and open source

– DHCP, NFS, NIS, XML

– Only constraint is that Linux must support hardware

– PXELinux-based installer, RHAT/Debian tools

Currently testing working COD prototype

– Core of policy-based scheduling engine: CSP-solver

– Framework of node requests + allocation negotiation

– OS- and filesystem-agnostic installer

– Testbed to examine policies and microbenchmarks
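
The slides say only that a CSP solver sits at the core of the scheduling engine. As a generic illustration of constraint-satisfaction scheduling, not COD's actual solver, here is a small backtracking search in Python. The variables (one node per request), the memory constraint, and the names solve, requests, and nodes are all assumptions.

```python
def solve(requests, nodes, assignment=None):
    """Backtracking search: give each request a distinct node that satisfies it.

    requests: list of (name, min_mem_gb) pairs; nodes: dict of node -> mem_gb.
    Returns {request name: node}, or None if no valid assignment exists.
    """
    assignment = assignment or {}
    if len(assignment) == len(requests):
        return assignment
    name, min_mem = requests[len(assignment)]
    for node, mem in nodes.items():
        # Constraints: enough memory, and no node serves two requests.
        if mem >= min_mem and node not in assignment.values():
            result = solve(requests, nodes, {**assignment, name: node})
            if result is not None:
                return result
    return None  # dead end: the caller tries its next candidate node


requests = [("web", 2), ("db", 8)]
nodes = {"n1": 8, "n2": 4}
print(solve(requests, nodes))
```

The first attempt gives the large node to "web", which leaves nothing suitable for "db", so the search backtracks and swaps them. Owner policies would enter such a solver as additional constraints.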

Slide 7

COD: Size Doesn’t Matter

Enable management scalability for hosting centers

– Hierarchical policy-driven mechanisms

– Empower owners and customers

Details and paper at

http://www.cs.duke.edu/~justin/cod/

Slide 8

Questions?