Computing Outside The Box

Ian Foster, Computation Institute, Argonne National Lab & University of Chicago

description

The past decade has seen increasingly ambitious and successful methods for outsourcing computing. Approaches such as utility computing, on-demand computing, grid computing, software as a service, and cloud computing all seek to free computer applications from the limiting confines of a single computer. Software that thus runs "outside the box" can be more powerful (think Google, TeraGrid), dynamic (think Animoto, caBIG), and collaborative (think Facebook, myExperiment). It can also be cheaper, due to economies of scale in hardware and software. The combination of new functionality and new economics inspires new applications, reduces barriers to entry for application providers, and in general disrupts the computing ecosystem. I discuss the new applications that outside-the-box computing enables, in both business and science, and the hardware and software architectures that make these new applications possible.

Transcript of Computing Outside The Box

Page 1: Computing Outside The Box

1

Ian Foster, Computation Institute

Argonne National Lab & University of Chicago

Page 2: Computing Outside The Box

3

1890

Page 3: Computing Outside The Box

4

1953

Page 4: Computing Outside The Box

5

“Computation may someday be organized as a public utility … The computing utility could become the basis for a new and important industry.”

John McCarthy (1961)

Page 5: Computing Outside The Box

6

Page 6: Computing Outside The Box

7

Page 7: Computing Outside The Box

8

Page 8: Computing Outside The Box

9

Page 9: Computing Outside The Box

10

Page 10: Computing Outside The Box

11

I-WAY, 1995

Page 11: Computing Outside The Box

12

The grid, 1998

“Dependable, consistent, pervasive access to resources”

Dependable: Performance and functionality guarantees

Consistent: Uniform interfaces to a wide variety of resources

Pervasive: Ability to “plug in” from anywhere

Page 12: Computing Outside The Box

13

Application

Infrastructure

Page 13: Computing Outside The Box

14

Application

Infrastructure

Service oriented infrastructure

Page 14: Computing Outside The Box

15

Layered grid architecture

Application

Fabric (“Controlling things locally”): access to, & control of, resources

Connectivity (“Talking to things”): communication (Internet protocols) & security

Resource (“Sharing single resources”): negotiating access, controlling use

Collective (“Managing multiple resources”): ubiquitous infrastructure services

User (“Specialized services”): user- or application-specific distributed services

Internet Protocol Architecture: Application, Transport, Internet, Link

Initially custom … later Web Services

Page 15: Computing Outside The Box

16

Page 16: Computing Outside The Box

17

www.opensciencegrid.org

Page 17: Computing Outside The Box

18

www.opensciencegrid.org

Page 18: Computing Outside The Box

19

Bennett Bertenthal et al., www.sidgrid.org

Page 19: Computing Outside The Box

20

Brian Tieman

Page 20: Computing Outside The Box

21

Simplified example workflows

Genome sequence analysis

Physics data analysis

Sloan Digital Sky Survey

www.opensciencegrid.org

Page 21: Computing Outside The Box

22

“Sine” workload, 2M tasks, 10MB:10ms ratio, 100 nodes, GCC policy, 50GB caches/node

Ioan Raicu

Page 22: Computing Outside The Box

23

Same scenario, but with dynamic resource provisioning

Page 23: Computing Outside The Box

24

Data diffusion sine-wave workload: Summary

GPFS: 5.70 hrs, ~8Gb/s, 1138 CPU hrs

DD+SRP: 1.80 hrs, ~25Gb/s, 361 CPU hrs

DD+DRP: 1.86 hrs, ~24Gb/s, 253 CPU hrs
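The summary figures can be compared directly. The short Python sketch below takes the wall-clock and CPU hours quoted on the slide and derives the speedup and CPU-hour saving of each data diffusion (DD) run relative to the GPFS baseline. The function and dictionary names are illustrative only; the numbers are the slide's.

```python
# Figures copied from the slide: (wall-clock hours, CPU hours).
RESULTS = {
    "GPFS": (5.70, 1138),
    "DD+SRP": (1.80, 361),
    "DD+DRP": (1.86, 253),
}


def compare(baseline: str = "GPFS") -> dict:
    """Return speedup and fractional CPU-hour saving of each run vs. the baseline."""
    base_hours, base_cpu = RESULTS[baseline]
    table = {}
    for name, (hours, cpu) in RESULTS.items():
        table[name] = {
            "speedup": base_hours / hours,      # >1 means faster than baseline
            "cpu_saving": 1 - cpu / base_cpu,   # fraction of CPU hours avoided
        }
    return table


if __name__ == "__main__":
    for name, row in compare().items():
        print(f"{name}: {row['speedup']:.2f}x, {row['cpu_saving']:.0%} fewer CPU hours")
```

On these numbers both data diffusion runs finish roughly three times faster than GPFS, and the DD+DRP run gives up a little wall-clock time relative to DD+SRP while using noticeably fewer CPU hours.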

Page 24: Computing Outside The Box

25

Application

Infrastructure

Service oriented infrastructure

Page 25: Computing Outside The Box

26

Application

Service oriented applications

Infrastructure

Service oriented infrastructure

Page 26: Computing Outside The Box

27

Page 27: Computing Outside The Box

28

Creating Services in 2008: Introduce and gRAVI

Introduce: Define service, Create skeleton, Discover types, Add operations, Configure security

gRAVI (Grid Remote Application Virtualization Infrastructure): Wrap executables

Appln Service

Create

Index service

Store

Repository Service

Advertise

Discover

Invoke; get results

Introduce

Container

Transfer GAR

Deploy

Ohio State University and Argonne/U.Chicago

Globus

Page 28: Computing Outside The Box

29

As of Oct 19, 2008:

122 participants

105 services: 70 data, 35 analytical

Page 29: Computing Outside The Box

30

Microarray clustering using Taverna

1. Query and retrieve microarray data from a caArray data service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/CaArrayScrub

2. Normalize microarray data using GenePattern analytical service: node255.broad.mit.edu:6060/wsrf/services/cagrid/PreprocessDatasetMAGEService

3. Hierarchical clustering using geWorkbench analytical service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/HierarchicalClusteringMage

Workflow in/output

caGrid services

“Shim” services

others

Wei Tan

Page 30: Computing Outside The Box

31

Birmingham

The Globus-based LIGO data grid

Replicating >1 Terabyte/day to 8 sites

>100 million replicas so far

MTBF = 1 month

LIGO Gravitational Wave Observatory

Cardiff

AEI/Golm

Page 31: Computing Outside The Box

32

Pull “missing” files to a storage system

List of required files

GridFTP

Local Replica Catalog

Replica Location Index

Data Replication Service

Reliable File Transfer Service

Local Replica Catalog

GridFTP

Data replication service

“Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System,” Chervenak et al., 2005

Replica Location Index

Data Movement

Data Location

Data Replication
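The "pull missing files" step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Data Replication Service itself: the local replica catalog is modelled as a set, the replica location index as a dictionary from file name to candidate sites, and the transfer as a caller-supplied function standing in for a reliable file transfer over GridFTP. All names here are hypothetical.

```python
from typing import Callable, Dict, List, Set


def pull_missing_files(
    required: List[str],
    local_catalog: Set[str],
    location_index: Dict[str, List[str]],
    transfer: Callable[[str, str], bool],
) -> Dict[str, List[str]]:
    """Fetch every required file that the local catalog does not yet list.

    Returns the names that were fetched and the names that could not be.
    """
    fetched: List[str] = []
    failed: List[str] = []
    for name in required:
        if name in local_catalog:
            continue  # already held locally, nothing to do
        # Ask the index which sites hold a replica and try each in turn.
        for site in location_index.get(name, []):
            if transfer(site, name):
                local_catalog.add(name)  # register the new local replica
                fetched.append(name)
                break
        else:
            failed.append(name)  # no site listed, or every transfer failed
    return {"fetched": fetched, "failed": failed}


if __name__ == "__main__":
    catalog = {"a.dat"}
    index = {"b.dat": ["site1", "site2"], "c.dat": []}
    # Stand-in transfer that only succeeds when pulling from site2.
    outcome = pull_missing_files(
        ["a.dat", "b.dat", "c.dat"], catalog, index,
        transfer=lambda site, name: site == "site2",
    )
    print(outcome, sorted(catalog))
```

The sketch keeps the three concerns named on the slide apart: data location (the index lookup), data movement (the transfer callback), and data replication (the loop that decides what to pull and registers the result).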

Page 32: Computing Outside The Box

33

Why not leverage dynamic deployment capabilities?

Physical machine: Procure hardware

Hypervisor/OS: Deploy hypervisor/OS

VM, VM: Deploy virtual machine

JVM: Deploy container

DRS, GridFTP, LRC: Deploy service

State exposed & accessed uniformly at all levels

Provisioning, management, and monitoring at all levels

VO Services

GridFTP

Page 33: Computing Outside The Box

34

Maybe we need to specialize further …

User

Service Provider

“Provide access to data D at S1, S2, S3 with performance P”

Resource Provider

“Provide storage with performance P1, network with P2, …”

D

S1

S2

S3

D

S1

S2

S3

Replica catalog, user-level multicast, …

D

S1

S2

S3

Page 34: Computing Outside The Box

35

Infrastructure

Applications

Page 35: Computing Outside The Box

36

Energy

Progress of adoption

Page 36: Computing Outside The Box

37

Page 37: Computing Outside The Box

38

US$3

Page 38: Computing Outside The Box

39

Credit: Werner Vogels

Page 39: Computing Outside The Box

40

Credit: Werner Vogels

Page 40: Computing Outside The Box

41

Animoto EC2 image usage

[Chart: Day 1 to Day 8; scale 0 to 4000]

Page 41: Computing Outside The Box

42

Software

Platform

Infrastructure

Salesforce.com, Google, Animoto, …, …, …, caBIG, TG gateways

Page 42: Computing Outside The Box

43

Software

Platform

Infrastructure

Salesforce.com, Google, Animoto, …, …, …, caBIG, TG gateways

Amazon, GoGrid, Sun, Microsoft, …

Page 43: Computing Outside The Box

44

Software

Platform

Infrastructure

Salesforce.com, Google, Animoto, …, …, …, caBIG, TG gateways

Amazon, GoGrid, Sun, Microsoft, …

Amazon, Google, Microsoft, …

Page 44: Computing Outside The Box

45

Dynamo: Amazon’s highly available key-value store (DeCandia et al., SOSP’07)

Simple query model

Weak consistency, no isolation

Stringent SLAs (e.g., 300ms for 99.9% of requests; peak 500 requests/sec)

Incremental scalability

Symmetry, decentralization, heterogeneity
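An SLA phrased as "300ms for 99.9% of requests" is a statement about the tail of the latency distribution, not its average. The Python sketch below shows one way such a check could be written; the function name, the per-mille parameter, and the sample data are illustrative assumptions, not part of Dynamo.

```python
from typing import Sequence


def meets_sla(
    latencies_ms: Sequence[float],
    limit_ms: float = 300.0,
    per_mille: int = 999,
) -> bool:
    """True if at least per_mille/1000 of the requests finished within limit_ms.

    Integer arithmetic is used for the comparison so that 99.9% is exact.
    """
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    within = sum(1 for x in latencies_ms if x <= limit_ms)
    return within * 1000 >= per_mille * len(latencies_ms)


if __name__ == "__main__":
    ok = [10.0] * 999 + [900.0]       # one slow request in a thousand
    bad = [10.0] * 998 + [900.0] * 2  # two slow requests in a thousand
    print(meets_sla(ok), meets_sla(bad))
    print(sum(bad) / len(bad))        # the mean hides the slow tail
```

In the second sample the mean latency is 11.78 ms, far below 300 ms, yet the check fails because more than 0.1% of requests exceed the limit.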

Page 45: Computing Outside The Box

Technologies used in Dynamo

Problem | Technique | Advantage

Partitioning | Consistent hashing | Incremental scalability

High availability for writes | Vector clocks with reconciliation during reads | Version size is decoupled from update rates

Handling temporary failures | Sloppy quorum and hinted handoff | Provides high availability and durability guarantee when some of the replicas are not available

Recovering from permanent failures | Anti-entropy using Merkle trees | Synchronizes divergent replicas in the background

Membership and failure detection | Gossip-based membership protocol and failure detection | Preserves symmetry and avoids having a centralized registry for storing membership and node liveness information
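The first row, consistent hashing for partitioning, can be illustrated with a small Python sketch. This is a generic hash ring with virtual nodes, written under my own assumptions (MD5 positions, 64 virtual nodes per node, hypothetical class and method names); it is not Dynamo's code and it omits replication to successor nodes.

```python
import bisect
import hashlib
from typing import Dict, List


def _position(key: str) -> int:
    """Map a string to a position on the ring (here: its MD5 digest as an integer)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring; each node owns several virtual points."""

    def __init__(self, vnodes: int = 64) -> None:
        self.vnodes = vnodes
        self._points: List[int] = []      # sorted positions on the ring
        self._owner: Dict[int, str] = {}  # position -> node name

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            point = _position(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        """Return the node owning the first point clockwise from the key."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _position(key)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = HashRing()
    for name in ("node-a", "node-b", "node-c"):
        ring.add_node(name)
    keys = [f"key-{i}" for i in range(1000)]
    before = {k: ring.lookup(k) for k in keys}
    ring.add_node("node-d")
    moved = [k for k in keys if ring.lookup(k) != before[k]]
    print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

The property behind "incremental scalability" is visible in the demo: when a node joins, the only keys that change owner are those that move to the new node, and every other key stays where it was.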

Page 46: Computing Outside The Box

47

Application

Service oriented applications

Infrastructure

Service oriented infrastructure

Page 47: Computing Outside The Box

48

Energy Internet

The Shape of Grids to Come?

Page 48: Computing Outside The Box

49

Killer apps for COTB?

Biomedical informatics/Evidence-based medicine

Human responses to global climate disruption

Page 49: Computing Outside The Box

50

My servers

Chicago

Chicago

handle.net

BIRN

Chicago

IaaS provider

Chicago

BIRN

Chicago

Using IaaS in biomedical informatics

Page 50: Computing Outside The Box

51

“The computer revolution hasn’t happened yet.”

Alan Kay, 1997

Page 51: Computing Outside The Box

52

Time

Connectivity (on log scale)

Science Enterprise Consumer

“When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances”

(George Gilder, 2001)

Grid Cloud ????

Page 52: Computing Outside The Box

Computation Institute

www.ci.uchicago.edu

Thank you!