Cto forum nirav_kapadia_2006_03_31_2006

Grids, utility computing and a perspective on the future of IT infrastructure Washington Area CTO Forum March 31, 2006 Nirav Kapadia [email protected]

Page 2: Cto forum nirav_kapadia_2006_03_31_2006

© Nirav Kapadia 2

Outline

Characterizing computing grids
  - Grids as intended versus what we see today
  - Common types of grids today

Putting computing grids to work
  - Types of problems addressed by today’s grids
  - Operational considerations in deploying a grid

A perspective on the future of IT infrastructure
  - Cost pressures and technology commoditization
  - Grid and utility computing: the technology enablers

Page 3: Cto forum nirav_kapadia_2006_03_31_2006

Grids came about from a need for large scale, collaborative computing

Scale is measured in terms of users, nodes, organizations, geography, and heterogeneity
  - A grid in the strict sense of the word involves a large number of heterogeneous, shared resources

Collaboration is measured in terms of resource sharing and interoperability
  - A key characteristic is the ability to manage across organizational boundaries

Page 4: Cto forum nirav_kapadia_2006_03_31_2006

Systems for large scale, collaborative computing must meet key criteria

Group A
  - Scalable with users and resources
  - Support for heterogeneity

Group B
  - Support for interoperability
  - Scalable with geographical distances

Group C
  - Fully distributed (federated) architecture
  - Ability to compartmentalize along organizational boundaries

[Figure side labels: "Strict definition of computing grid" and "Broad definition of computing grid"]

Page 5: Cto forum nirav_kapadia_2006_03_31_2006

Many commercial grid solutions only meet the broad definition of a grid

Cluster management systems
  - Typically harness clusters of dedicated servers
  - Examples include Platform LSF, Sun Grid Engine

CPU-scavenging “master-slave” applications
  - Typically take advantage of idle desktop cycles
  - Examples include SETI@Home, distributed.net

Page 6: Cto forum nirav_kapadia_2006_03_31_2006

Many commercial grid solutions only meet the broad definition of a grid

Application-specific, custom-built grids
  - Typically built around a key business function
  - Examples include Acxiom, Oracle offerings

Page 7: Cto forum nirav_kapadia_2006_03_31_2006

Today, solutions that meet the strict definition of a grid have to be “built”

Grid solutions based on the Globus toolkit
  - Several vendors have Globus-based offerings
  - Univa Corp is commercializing Globus

Other grid solutions in academia and research
  - Most are custom-built and target a specific problem
  - Typically not appropriate for commercial use (today)

Page 8: Cto forum nirav_kapadia_2006_03_31_2006

Key takeaways

A grid is a distributed computing system that enables large scale, collaborative computing
  - Scalable across a large number of diverse and geographically dispersed resources

Many commercial “grid solutions” of today do not meet the strict definition of a grid
  - Limited ability to manage policies and resources across administrative boundaries

Page 9: Cto forum nirav_kapadia_2006_03_31_2006

Outline

Characterizing computing grids
  - Grids as intended versus what we see today
  - Common types of grids today

Putting computing grids to work
  - Types of problems addressed by today’s grids
  - Operational considerations in deploying a grid

A perspective on the future of IT infrastructure
  - Cost pressures and technology commoditization
  - Grid and utility computing: the technology enablers

Page 10: Cto forum nirav_kapadia_2006_03_31_2006

Even today’s grids can benefit users with large scale computing needs

High throughput computing (HTC)
  - Many independent (non-communicating) tasks
  - Large problems that break up into manageable, independent tasks

High performance computing (HPC)
  - Large problem that is not decomposable into manageable, independent tasks
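
The high throughput pattern above can be sketched in Python. This is a minimal illustration, not a grid scheduler: the function names (`split_range`, `count_primes`, `run_high_throughput`) and the prime-counting workload are assumptions made for the example, and a local thread pool stands in for the grid nodes that would each receive one task.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def split_range(start, stop, chunk_size):
    """Break [start, stop) into independent, manageable sub-ranges."""
    return [(lo, min(lo + chunk_size, stop))
            for lo in range(start, stop, chunk_size)]


def count_primes(bounds):
    """One independent task: count primes in [lo, hi). Needs no other task."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def run_high_throughput(tasks, worker, max_workers=4):
    """Run every task independently and collect the results in task order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, tasks))


# Ten non-communicating tasks; the only shared step is the final sum.
tasks = split_range(0, 1000, chunk_size=100)
total = sum(run_high_throughput(tasks, count_primes))
print(total)  # 168 primes below 1000
```

Because no task reads another task's output, the chunks can run in any order and on any node. An HPC problem lacks this property: its tasks would have to exchange data while they run.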

Page 11: Cto forum nirav_kapadia_2006_03_31_2006

High throughput computing is common in business environments

Large, legacy applications are best served by cluster management systems
  - Compute-intensive apps are preferable, but a mix of compute- and data-intensive apps is manageable

Customizable apps that work on small slices of data work well with CPU-scavenging grids
  - Apps must be compute-intensive and preferably run within a sandbox

Page 12: Cto forum nirav_kapadia_2006_03_31_2006

High performance computing is seen more in targeted environments

Applications involving multiple, communicating tasks typically require custom-designed grid environments
  - Examples include the Oracle grid offering and some test beds built with Globus
  - Other examples include distributed computing platforms such as PVM and MPI

Page 13: Cto forum nirav_kapadia_2006_03_31_2006

So… you’re ready to deploy a grid computing environment…

As with any other technology, there are several operational considerations…
  - Resources on the grid – dedicated or shared?
  - Access management – who needs access to what?
  - Data management – how does data get to the grid?
  - Security model employed by the grid

Page 14: Cto forum nirav_kapadia_2006_03_31_2006

Resources on the grid – should they be dedicated or shared?

Cluster Mgmt Systems
  - Cluster management systems work best with dedicated resources
  - Condor – from the U of Wisconsin – is a notable exception, but not commercially available

CPU-scavenging grids
  - As the name implies, resources are shared – and typically involve desktops
  - A custom screen saver is the most common vehicle for running the grid application
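
The CPU-scavenging arrangement can be sketched as a coordinator that hands small work units to desktops when they are idle and takes a unit back if the desktop becomes busy. This is a hypothetical sketch, not the protocol of SETI@Home or distributed.net: the `Master` class, its method names, and the summing workload are all assumptions made for illustration.

```python
from collections import deque


class Master:
    """Hands out small, independent work units and collects results."""

    def __init__(self, units):
        self.pending = deque(units)   # (unit_id, payload) not yet handed out
        self.in_flight = {}           # unit_id -> payload, awaiting a result
        self.results = {}             # unit_id -> result

    def next_unit(self):
        """Called by an idle desktop; returns (unit_id, payload) or None."""
        if not self.pending:
            return None
        unit_id, payload = self.pending.popleft()
        self.in_flight[unit_id] = payload
        return unit_id, payload

    def submit(self, unit_id, result):
        """Record a result; ignore unknown or duplicate submissions."""
        if unit_id in self.in_flight:
            del self.in_flight[unit_id]
            self.results[unit_id] = result

    def reclaim(self, unit_id):
        """Re-queue a unit whose desktop went busy or offline."""
        if unit_id in self.in_flight:
            self.pending.append((unit_id, self.in_flight.pop(unit_id)))

    def done(self):
        return not self.pending and not self.in_flight


def worker_step(master, compute):
    """One scavenging cycle: fetch a unit, compute it, report back."""
    unit = master.next_unit()
    if unit is None:
        return False
    unit_id, payload = unit
    master.submit(unit_id, compute(payload))
    return True


master = Master([(i, list(range(i * 10, (i + 1) * 10))) for i in range(5)])
lost = master.next_unit()   # a desktop takes unit 0, then its user returns
master.reclaim(lost[0])     # the unit goes back in the queue for another desktop
while worker_step(master, sum):
    pass
```

Keeping each unit small limits the work lost when a desktop owner returns, which is one reason this style suits compute-intensive apps that work on small slices of data.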

Page 15: Cto forum nirav_kapadia_2006_03_31_2006

Access management – who needs (gets) access to what?

Cluster Mgmt Systems
  - Option #1: jobs run in a guest account
      - Shared access across jobs
  - Option #2: accounts for everyone on all machines
      - Homogeneous uid pool highly recommended
      - Logins typically disabled

CPU-scavenging grids
  - Option #1: jobs run with user’s privileges
      - If downloaded by user
  - Option #2: jobs run in guest account
      - If set up by administrator
  - No direct remote user access to desktop
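
For the "accounts for everyone on all machines" option, the recommendation of a homogeneous uid pool can be checked mechanically. The sketch below is an assumption-laden illustration: the helper names, host names, and account data are hypothetical, and in practice the per-host mappings would come from each machine's account database rather than a literal dictionary.

```python
def find_uid_conflicts(accounts_by_host):
    """Return {user: {host: uid}} for users whose uid differs across hosts."""
    seen = {}
    for host, accounts in accounts_by_host.items():
        for user, uid in accounts.items():
            seen.setdefault(user, {})[host] = uid
    return {user: hosts for user, hosts in seen.items()
            if len(set(hosts.values())) > 1}


def find_missing_accounts(accounts_by_host):
    """Return {user: [hosts lacking that account]}."""
    all_users = set()
    for accounts in accounts_by_host.values():
        all_users.update(accounts)
    missing = {}
    for host, accounts in accounts_by_host.items():
        for user in sorted(all_users - set(accounts)):
            missing.setdefault(user, []).append(host)
    return missing


# Hypothetical cluster: bob has two different uids and no account on node03.
cluster = {
    "node01": {"alice": 1001, "bob": 1002},
    "node02": {"alice": 1001, "bob": 2002},
    "node03": {"alice": 1001},
}
conflicts = find_uid_conflicts(cluster)
missing = find_missing_accounts(cluster)
```

A mismatch matters most with shared file systems, where file ownership is recorded by uid: a job running as uid 2002 on one node would not own files the same user created as uid 1002 on another.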

Page 16: Cto forum nirav_kapadia_2006_03_31_2006

Data management – how does data get to the apps?

Cluster Mgmt Systems
  - Transfer user-specified files via ftp, scp, etc.
  - File staging for large data
  - On-demand file transfer (system call traps)
  - Shared file systems

CPU-scavenging grids
  - Data embedded within application or retrieved via HTTP/Java call-backs
  - Limited data, typically no files
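
File staging, one of the cluster-side options listed above, follows a stage-in, run, stage-out sequence. The sketch below shows that sequence under stated assumptions: `run_with_staging` and `word_count` are hypothetical names, a temporary directory stands in for scratch space on a compute node, and local copies stand in for transfers over ftp or scp.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_staging(inputs, job, output_names, dest_dir):
    """Stage inputs into a scratch dir, run the job there, stage outputs back."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    staged = []
    with tempfile.TemporaryDirectory() as scratch_name:
        scratch = Path(scratch_name)
        for src in inputs:                       # stage in
            shutil.copy(src, scratch / Path(src).name)
        job(scratch)                             # run against local copies only
        for name in output_names:                # stage out
            shutil.copy(scratch / name, dest / name)
            staged.append(dest / name)
    return staged                                # scratch space is gone by now


def word_count(workdir):
    """Example job: reads input.txt and writes count.txt in its working dir."""
    text = (workdir / "input.txt").read_text()
    (workdir / "count.txt").write_text(str(len(text.split())))


with tempfile.TemporaryDirectory() as home:
    src = Path(home) / "input.txt"
    src.write_text("grids move data to the computation")
    outputs = run_with_staging([src], word_count, ["count.txt"],
                               Path(home) / "results")
    result = outputs[0].read_text()
```

The job sees only its scratch directory, so the user must name every input and output up front. On-demand transfer and shared file systems relax that requirement at the cost of a tighter coupling between the compute node and the user's storage.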

Page 17: Cto forum nirav_kapadia_2006_03_31_2006

Security model – user accountability is key today

[Figure. Axis labels: Access management (capability control); Opportunities for subversion.
 Side label: Basic system and kernel safeguards.
 Column labels: Run Time Environment; Application Executable; Application Generation; Application Users.
 Category labels: Unchanged Binaries; Object Code Modifications; Source Code Modifications; Custom Applications.
 Systems shown: Ideal Grid; Unix; LSF, PBS, SGE; Globus; Condor; Java, PCCs; distributed.net, SETI@Home, etc.]

Page 18: Cto forum nirav_kapadia_2006_03_31_2006

Key takeaways

Today’s commercially available grid solutions primarily target high throughput computing
  - Cluster management systems and CPU-scavenging grids are the most common

Carefully consider the policy implications of grids in terms of access and data management
  - More of a concern for grids that span subnets or firewalls

Page 19: Cto forum nirav_kapadia_2006_03_31_2006

Outline

Characterizing computing grids
  - Grids as intended versus what we see today
  - Common types of grids today

Putting computing grids to work
  - Types of problems addressed by today’s grids
  - Operational considerations in deploying a grid

A perspective on the future of IT infrastructure
  - Cost pressures and technology commoditization
  - Grid and utility computing: the technology enablers

Page 20: Cto forum nirav_kapadia_2006_03_31_2006

Even as grids take hold, the IT landscape is changing rapidly…

Technology is rapidly being commoditized

Businesses are more willing and able to shop for IT services

In-house IT infrastructure is increasingly seen as complex and rigid

© Harvard Business Review

Page 21: Cto forum nirav_kapadia_2006_03_31_2006

IT infrastructure is already a commodity from a business view

Outsourcing is pervasive, and standards-based, open systems are increasingly common
  - Cost pressures will continue driving businesses to streamline IT infrastructure

More often than not, customized in-house IT systems stand out for their cost and complexity
  - Common off-the-shelf solutions provide more value in the absence of direct competitive advantage

Page 22: Cto forum nirav_kapadia_2006_03_31_2006

In time, economics will drive IT infrastructure out of the enterprise

The technology enablers for this paradigm exist today, but are still nascent
  - (True) grids offer a way to manage computing resources across organizational boundaries
  - Utility computing solutions bring together grids, data center automation, and virtualization

Page 23: Cto forum nirav_kapadia_2006_03_31_2006

The technology implications of these changes are enormous

Computing infrastructure needs to become transparent to end users
  - Users only interact with applications and data

Policy management needs to be decoupled from system management
  - Cannot assume users can be held accountable

Components of computing systems need to be less tightly coupled
  - CPU, OS, data, apps may all be in different, remote locations
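
Loose coupling of this kind means each component is looked up separately and bound together only for the duration of a run. The sketch below illustrates that idea and nothing more: the `Environment` class, the `assemble` function, the catalog layout, and the `example.org` locations are all hypothetical and do not describe any particular utility computing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Where each piece of one run lives; all four may be remote."""
    app: str
    os_image: str
    data: str
    cpu: str


def assemble(app_name, dataset, catalogs):
    """Resolve each component independently, then bind them for one run."""
    app = catalogs["apps"][app_name]
    # Pick the first CPU host currently marked as free.
    cpu = next(host for host, free in catalogs["cpus"].items() if free)
    return Environment(
        app=app["location"],
        os_image=catalogs["os_images"][app["needs_os"]],
        data=catalogs["data"][dataset],
        cpu=cpu,
    )


# Hypothetical catalogs, each of which could be run by a different organization.
catalogs = {
    "apps": {"simulator": {"location": "apps.example.org/simulator",
                           "needs_os": "linux"}},
    "os_images": {"linux": "images.example.org/linux"},
    "data": {"run42": "vault.example.org/run42"},
    "cpus": {"farm-a.example.org": False, "farm-b.example.org": True},
}
env = assemble("simulator", "run42", catalogs)
```

The user names only an application and a data set; which machine, which OS image, and where the bits are stored are resolved behind the scenes, which is what transparency to end users requires.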

Page 24: Cto forum nirav_kapadia_2006_03_31_2006

A utility computing test bed at Purdue showcases this paradigm

Operating since 1995; now a joint development effort between Purdue and U of Florida
  - By 2001, allowed 3,000+ users from 30 countries to run ~100 applications in a utility environment

Extensively validated: ~400,000 runs (by 2001); highly peaked usage profile

Powers online simulations in the nanoHUB.org portal for the nanotechnology community

Page 25: Cto forum nirav_kapadia_2006_03_31_2006

nanoHUB.org – remote access to simulators and compute power

[Figure labels: Cluster; TeraGrid; Condor-G; Globus; Internet; nanoHUB infrastructure; nanoHUB.org Web site; Physical Machine; Virtual Machine; NMI Cluster; Remote desktop (VNC)]

Real users and real usage: >10,687 users

Slide courtesy of Gerhard Klimeck, Network for Computational Nanotechnology

Page 26: Cto forum nirav_kapadia_2006_03_31_2006

Inside nanoHUB.org

[Figure labels: Application Repositories; Data Vaults; CPU Farms; OS Repositories; Web Portal; PUNCH Virtual Machine; Utility Services; Local Services; Custom computing environment assembled in real time]

Page 27: Cto forum nirav_kapadia_2006_03_31_2006

In conclusion…

Today’s commercially available grids provide a valuable but narrow service
  - More efficient computing in a closed environment; limited support for cross-organizational sharing

In time, grid and utility computing technologies will move IT infrastructure out of the enterprise
  - Virtualization and data center automation products are visible precursors

Page 28: Cto forum nirav_kapadia_2006_03_31_2006

Questions? Comments?
Email: [email protected]