Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science...

34
Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer Science The University of Chicago http://www.mcs.anl.gov/~foster Plenary Talk, I2 Virtual Conference, October 2, 2001

Transcript of Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science...

Page 1: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

Grid TechnologiesEnabling Collaborative Science

Ian Foster

Mathematics and Computer Science Division

Argonne National Laboratory

and

Department of Computer Science

The University of Chicago

http://www.mcs.anl.gov/~foster

Plenary Talk, I2 Virtual Conference, October 2, 2001

Page 2: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Issues I Will Address

Problem statement: What are Grids? Architecture and technologies Projects and futures

Page 3: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

The Grid Problem

Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations

Page 4: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Elements of the Problem

Resource sharing– Computers, storage, sensors, networks, …

– Sharing always conditional: issues of trust, policy, negotiation, payment, …

Coordinated problem solving– Beyond client-server: distributed data analysis,

computation, collaboration, … Dynamic, multi-institutional virtual orgs

– Community overlays on classic org structures

– Large or small, static or dynamic

Page 5: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Grid Communities & Applications:Data Grids for High Energy Physics

Tier2 Centre ~1 TIPS

Online System

Offline Processor Farm

~20 TIPS

CERN Computer Centre

FermiLab ~4 TIPSFrance Regional Centre

Italy Regional Centre

Germany Regional Centre

InstituteInstituteInstituteInstitute ~0.25TIPS

Physicist workstations

~100 MBytes/sec

~100 MBytes/sec

~622 Mbits/sec

~1 MBytes/sec

There is a “bunch crossing” every 25 nsecs.

There are 100 “triggers” per second

Each triggered event is ~1 MByte in size

Physicists work on analysis “channels”.

Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server

Physics data cache

~PBytes/sec

~622 Mbits/sec or Air Freight (deprecated)

Tier2 Centre ~1 TIPS

Tier2 Centre ~1 TIPS

Tier2 Centre ~1 TIPS

Caltech ~1 TIPS

~622 Mbits/sec

Tier 0Tier 0

Tier 1Tier 1

Tier 2Tier 2

Tier 4Tier 4

1 TIPS is approximately 25,000

SpecInt95 equivalents

Image courtesy Harvey Newman, Caltech

Page 6: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Grid Communities and Applications:Network for Earthquake Eng. Simulation

NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other

On-demand access to experiments, data streams, computing, archives, collaboration

NEESgrid: Argonne, Michigan, NCSA, UIUC, USC

Page 7: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Why Discuss Architecture?

Descriptive– Provide a common vocabulary for use when

describing Grid systems Guidance

– Identify key areas in which services are required

Prescriptive– Define standard “Intergrid” protocols and

APIs to facilitate creation of interoperable Grid systems and portable applications

Page 8: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

What Sorts of Standards? Need for interoperability when different groups

want to share resources– E.g., IP lets me talk to your computer, but how do

we establish & maintain sharing?

– How do I discover, authenticate, authorize, describe what I want to do, etc., etc.?

Need for shared infrastructure services to avoid repeated development, installation, e.g.– One port/service for remote access to computing,

not one per tool/application

– X.509 enables sharing of Certificate Authorities

Page 9: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

So, in Defining Grid Architecture, We Must Address …

Development of Grid protocols & services– Protocol-mediated access to remote resources

– New services: e.g., resource brokering

– “On the Grid” = speak Intergrid protocols

– Mostly (extensions to) existing protocols Development of Grid APIs & SDKs

– Facilitate application development by supplying higher-level abstractions

The model is the Internet and Web

Page 10: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

The Role of Grid Services(aka Middleware) and Tools

Remotemonitor

Remoteaccess

Informationservices

Faultdetection

. . .Resourcemgmt

CollaborationTools

Data MgmtTools

Distributedsimulation

. . .

net

Page 11: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Layered Grid Architecture(By Analogy to Internet Architecture)

Application

Fabric“Controlling things locally”: Access to, & control of, resources

Connectivity“Talking to things”: communication (Internet protocols) & security

Resource“Sharing single resources”: negotiating access, controlling use

Collective“Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services

InternetTransport

Application

Link

Inte

rnet P

roto

col

Arch

itectu

re

Page 12: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Where Are We With Architecture?

No “official” standards exist But:

– Globus Toolkit has emerged as the de facto standard for several important Connectivity, Resource, and Collective protocols

– GGF has an architecture working group

– Technical specifications are being developed for architecture elements: e.g., security, data, resource management, information

– Internet drafts submitted in security area

Page 13: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Grid Services Architecture:Connectivity Layer Protocols & Services

Communication– Internet protocols: IP, DNS, routing, etc.

Security: Grid Security Infrastructure (GSI)– Uniform authentication & authorization

mechanisms in multi-institutional setting

– Single sign-on, delegation, identity mapping

– Public key technology, SSL, X.509, GSS-API (several Internet drafts document extensions)

– Supporting infrastructure: Certificate Authorities, key management, etc.

Page 14: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Site A(Kerberos)

Site B (Unix)

Site C(Kerberos)

Computer

User

Single sign-on via “grid-id”& generation of proxy cred.

Or: retrieval of proxy cred.from online repository

User ProxyProxy

credential

Computer

Storagesystem

Communication*

GSI-enabledFTP server

AuthorizeMap to local idAccess file

Remote fileaccess request*

GSI-enabledGRAM server

GSI-enabledGRAM server

Remote processcreation requests*

* With mutual authentication

Process

Kerberosticket

Restrictedproxy

Process

Restrictedproxy

Local id Local id

AuthorizeMap to local idCreate processGenerate credentials

Ditto

GSI in Action: “Create Processes at A and B that Communicate & Access Files at C”

Page 15: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Grid Services Architecture (3):Resource Layer Protocols & Services

Resource management: GRAM– Remote allocation, reservation, monitoring,

control of [compute] resources Data access: GridFTP

– High-performance data access & transport Information: MDS (GRRP, GRIP)

– Access to structure & state information & others emerging: catalog access, code

repository access, accounting, … All integrated with GSI

Page 16: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

GRAM Resource Management Protocol

Grid Resource Allocation & Management– Allocation, monitoring, control of computations

Simple HTTP-based RPC– Job request: Returns opaque, transferable “job

contact” string for access to job

– Job cancel, Job status, Job signal

– Event notification (callbacks) for state changes Protocol/server address robustness (exactly once

execution), authentication, authorization Servers for most schedulers; C and Java APIs

Page 17: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Resource Management Futures:GRAM-2 (planned for late 2001)

Advance reservations– As prototyped in GARA in previous 2 years

Multiple resource types– Manage anything: storage, networks, etc., etc.

Recoverable requests, timeout, etc.– Build on early work with Condor group

Use of SOAP (RPC using HTTP + XML)– First step towards Web Services

Policy evaluation points for restricted proxies

Karl Czajkowski, Steve Tuecke, others

Page 18: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Data Access & Transfer

GridFTP: extended version of popular FTP protocol for Grid data access and transfer

Secure, efficient, reliable, flexible, extensible, parallel, concurrent, e.g.:– Third-party data transfers, partial file transfers

– Parallelism, striping (e.g., on PVFS)

– Reliable, recoverable data transfers Reference implementations

– Existing clients and servers: wuftpd, nicftp

– Flexible, extensible libraries

Page 19: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Grid Services Architecture (4):Collective Layer Protocols & Services Index servers aka metadirectory services

– Custom views on dynamic resource collections assembled by a community

Resource brokers (e.g., Condor Matchmaker)– Resource discovery and allocation

Replica management and replica selection– Optimize aggregate data access performance

Co-reservation and co-allocation services– End-to-end performance

Etc., etc.

Page 20: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

The Grid Information Problem

Large numbers of distributed “sensors” with different properties

Need for different “views” of this information, depending on community membership, security constraints, intended purpose, sensor type

Page 21: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

The Globus Toolkit Solution: MDS-2

Registration & enquiry protocols, information models, query languages– Provides standard interfaces to sensors

– Supports different “directory” structures supporting various discovery/access strategies

Karl Czajkowski, Steve Fitzgerald, others

Page 22: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

GriPhyN/PPDGData Grid Architecture

Application

Planner

Executor

Catalog Services

Info Services

Policy/Security

Monitoring

Repl. Mgmt.

Reliable TransferService

Compute Resource Storage Resource

DAG

DAG

DAGMAN, Kangaroo

GRAM GridFTP; GRAM; SRM

GSI, CAS

MDS, NWS

MCAT; GriPhyN catalogs

GDMP

MDS

Globus

= initial solution is operational

Ewa Deelman, Mike Wilde, others www.griphyn.org

Page 23: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Selected Major Grid ProjectsName URL &

SponsorsFocus

Access Grid www.mcs.anl.gov/FL/accessgrid; DOE, NSF

Create & deploy group collaboration systems using commodity technologies

BlueGrid IBM Grid testbed linking IBM laboratories

DISCOM www.cs.sandia.gov/discomDOE Defense Programs

Create operational Grid providing access to resources at three U.S. DOE weapons laboratories

DOE Science Grid

sciencegrid.org

DOE Office of Science

Create operational Grid providing access to resources & applications at U.S. DOE science laboratories & partner universities

Earth System Grid (ESG)

earthsystemgrid.orgDOE Office of Science

Delivery and analysis of large climate model datasets for the climate research community

European Union (EU) DataGrid

eu-datagrid.org

European Union

Create & apply an operational grid for applications in high energy physics, environmental science, bioinformatics

g

g

g

g

g

g

New

New

Page 24: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Selected Major Grid ProjectsName URL/

SponsorFocus

EuroGrid, Grid Interoperability (GRIP)

eurogrid.org

European Union

Create technologies for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus

Fusion Collaboratory

fusiongrid.org

DOE Off. Science

Create a national computational collaboratory for fusion research

Globus Project globus.org

DARPA, DOE, NSF, NASA, Msoft

Research on Grid technologies; development and support of Globus Toolkit; application and deployment

GridLab gridlab.org

European Union

Grid technologies and applications

GridPP gridpp.ac.uk

U.K. eScience

Create & apply an operational grid within the U.K. for particle physics research

Grid Research Integration Dev. & Support Center

grids-center.org

NSF

Integration, deployment, support of the NSF Middleware Infrastructure for research & education

g

g

g

g

g

g

New

New

New

New

New

Page 25: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Selected Major Grid ProjectsName URL/Sponsor Focus

Grid Application Dev. Software

hipersoft.rice.edu/grads; NSF

Research into program development technologies for Grid applications

Grid Physics Network

griphyn.org

NSF

Technology R&D for data analysis in physics expts: ATLAS, CMS, LIGO, SDSS

Information Power Grid

ipg.nasa.gov

NASA

Create and apply a production Grid for aerosciences and other NASA missions

International Virtual Data Grid Laboratory

ivdgl.org

NSF

Create international Data Grid to enable large-scale experimentation on Grid technologies & applications

Network for Earthquake Eng. Simulation Grid

neesgrid.org

NSF

Create and apply a production Grid for earthquake engineering

Particle Physics Data Grid

ppdg.net

DOE Science

Create and apply production Grids for data analysis in high energy and nuclear physics experiments

g

g

g

g

g

New

New

g

Page 26: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Selected Major Grid ProjectsName URL/Sponsor Focus

TeraGrid teragrid.org

NSF

U.S. science infrastructure linking four major resource sites at 40 Gb/s

UK Grid Support Center

grid-support.ac.uk

U.K. eScience

Support center for Grid projects within the U.K.

Unicore BMBFT Technologies for remote access to supercomputers

g

g

New

New

Also many technology R&D projects: e.g., Condor, NetSolve, Ninf, NWS

See also www.gridforum.org

Page 27: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

The 13.6 TF TeraGrid:Computing at 40 Gb/s

26

24

8

4 HPSS

5

HPSS

HPSS UniTree

External Networks

External Networks

External Networks

External Networks

Site Resources Site Resources

Site ResourcesSite ResourcesNCSA/PACI8 TF240 TB

SDSC4.1 TF225 TB

Caltech Argonne

TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne www.teragrid.org

Page 28: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

iVDGL:International Virtual Data Grid Laboratory

Tier0/1 facility

Tier2 facility

10 Gbps link

2.5 Gbps link

622 Mbps link

Other link

Tier3 facility

U.S. PIs: Avery, Foster, Gardner, Newman, Szalay www.ivdgl.org

Page 29: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

NSF GRIDS Center

Grid Research, Integration, Deployment, & Support (GRIDS) Center

Develop, deploy, support– Middleware infrastructure for national-scale

collaborative science and engineering

– Integration platform for experimental middleware technologies

UC, USC/ISI, UW, NCSA, SDSC Partner with Internet-2, SURA, Educause in

NSF Middleware Initiative

www.grids-center.org www.nsf-middleware-org

GRIDS

Page 30: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

And What’s This Got To Do With … CORBA?

– Grid-enabled CORBA underway Java, Jini, Jxta?

– Java CoG Kit. Jini, Jxta: future uncertain Web Services, .NET, J2EE?

– Major Globus focus (GRAM-2: SOAP, WSDL)

– Workflow/choreography services

– Q: What can Grid offer to Web services? Next revolutionary technology of the month?

– They’ll need Grid technologies too

Page 31: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

The Future:All Software is Network-Centric

We don’t build or buy “computers” anymore, we borrow or lease required resources– When I walk into a room, need to solve a

problem, need to communicate A “computer” is a dynamically, often

collaboratively constructed collection of processors, data sources, sensors, networks– Similar observations apply for software

Page 32: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

And Thus …

Reduced barriers to access mean that we do much more computing, and more interesting computing, than today => Many more components (& services); massive parallelism

All resources are owned by others => Sharing (for fun or profit) is fundamental; trust, policy, negotiation, payment

All computing is performed on unfamiliar systems => Dynamic behaviors, discovery, adaptivity, failure

Page 33: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Acknowledgments

Globus R&D is joint with numerous people– Carl Kesselman, Co-PI

– Steve Tuecke, principal architect at ANL

– Others to be acknowledged GriPhyN R&D is joint with numerous people

– Paul Avery, Co-PI; Newman, Lazzarini, Szalay

– Mike Wilde, project coordinator

– Carl Kesselman, Miron Livny CS leads

– ATLAS, CMS, LIGO, SDSS participants; others Support: DOE, DARPA, NSF, NASA, Microsoft

Page 34: Grid Technologies Enabling Collaborative Science Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.

[email protected] ARGONNE CHICAGO

Summary

The Grid problem: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations

Grid architecture: Emphasize protocol and service definition to enable interoperability and resource sharing

Globus Toolkit a source of protocol and API definitions, reference implementations

See: globus.org, griphyn.org, gridforum.org, grids-center.org, nsf-middleware.org