The Anatomy of the Grid Enabling Scalable Virtual Organizations

The Anatomy of the Grid: Enabling Scalable Virtual Organizations. Acknowledgement to: Ian Foster, Mathematics and Computer Science Division, Argonne National Laboratory. John DYER, TERENA, [email protected]


Transcript of The Anatomy of the Grid Enabling Scalable Virtual Organizations


The Anatomy of the Grid: Enabling Scalable Virtual Organizations

Acknowledgement to: Ian Foster, Mathematics and Computer Science Division, Argonne National Laboratory

John DYER, TERENA

[email protected]


Grids are “hot” … but what are they really about?


Presentation Agenda
• Problem statement
• Architecture
• Globus Toolkit
• Futures


The Grid Problem

Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations


Elements of the Problem
• Resource sharing
  • Computers, storage, sensors, networks, …
  • Sharing always conditional: issues of trust, policy, negotiation, payment, … (cost vs. performance)
• Coordinated problem solving
  • Beyond client-server: distributed data analysis, computation, collaboration, …
• Dynamic, multi-institutional virtual orgs
  • Community overlays on classic org structures
  • Large or small, static or dynamic


Computational Astrophysics
• Solved EEs for gravitational waves
• Tightly coupled, communications required
• Must communicate 30 MB/step between machines

[Figure: two coupled machines and the links between them]
• SDSC IBM SP: 1024 procs; 5 x 12 x 17 = 1020
• NCSA Origin Array: 256+128+128; 5 x 12 x (4+2+2) = 480
• Gig-E: 100 MB/sec
• OC-12 line (but only 2.5 MB/sec)
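The numbers on this slide can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the 2.5 MB/sec figure is the throughput actually achieved over the OC-12 line and that the full 30 MB per step must cross that link. The helper name `transfer_seconds` is not from the presentation.

```python
# Figures quoted on the slide; everything else here is an assumption for illustration.
STEP_PAYLOAD_MB = 30.0        # data exchanged between machines per step
GIGE_MB_PER_S = 100.0         # Gig-E rate quoted on the slide
WAN_MB_PER_S = 2.5            # rate quoted for the OC-12 line


def transfer_seconds(payload_mb: float, rate_mb_per_s: float) -> float:
    """Time to move payload_mb at a sustained rate, ignoring latency."""
    return payload_mb / rate_mb_per_s


# Processor counts implied by the decompositions shown on the slide.
sdsc_procs = 5 * 12 * 17
ncsa_procs = 5 * 12 * (4 + 2 + 2)

gige_time = transfer_seconds(STEP_PAYLOAD_MB, GIGE_MB_PER_S)
wan_time = transfer_seconds(STEP_PAYLOAD_MB, WAN_MB_PER_S)

print(f"SDSC procs used: {sdsc_procs}, NCSA procs used: {ncsa_procs}")
print(f"Per-step transfer at Gig-E rate: {gige_time:.1f} s")
print(f"Per-step transfer at 2.5 MB/sec: {wan_time:.1f} s")
```

Under these assumptions the wide-area link costs roughly 40 times more per step than the Gig-E rate would, which is why the slide stresses that the problem is tightly coupled.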


Data Grids for High Energy Physics

[Figure: tiered Data Grid for high energy physics. Image courtesy Harvey Newman, Caltech]
• Sites: Online System; Offline Processor Farm (~20 TIPS); CERN Computer Centre; FermiLab (~4 TIPS); France, Italy, and Germany Regional Centres; Caltech (~1 TIPS); four Tier2 Centres (~1 TIPS each); Institutes (~0.25 TIPS); Physics data cache; Physicist workstations
• Tier labels: Tier 0, Tier 1, Tier 2, Tier 4
• Link rates: ~PBytes/sec; ~100 MBytes/sec (two links); ~622 Mbits/sec (three links, one marked “or Air Freight (deprecated)”); ~1 MBytes/sec

Notes on the figure:
• There is a “bunch crossing” every 25 nsecs
• There are 100 “triggers” per second
• Each triggered event is ~1 MByte in size
• Physicists work on analysis “channels”
• Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server
• 1 TIPS is approximately 25,000 SpecInt95 equivalents
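The trigger figures explain where the ~100 MBytes/sec rate comes from. The following is a back-of-the-envelope sketch using only the numbers on the slide. The daily volume assumes continuous running and decimal units (1 TB = 10^6 MB), which the slide does not state.

```python
# Numbers quoted on the slide.
BUNCH_CROSSING_INTERVAL_S = 25e-9   # one bunch crossing every 25 ns
TRIGGERS_PER_SECOND = 100           # events kept per second
EVENT_SIZE_MB = 1.0                 # approximate size of one triggered event

# Crossings per second, rounded to avoid floating-point noise.
crossings_per_second = round(1 / BUNCH_CROSSING_INTERVAL_S)

# Only a tiny fraction of crossings is kept.
crossings_per_trigger = crossings_per_second // TRIGGERS_PER_SECOND

# Sustained data rate of the triggered stream.
triggered_mb_per_s = TRIGGERS_PER_SECOND * EVENT_SIZE_MB

# Assumption: continuous running, decimal terabytes.
SECONDS_PER_DAY = 24 * 60 * 60
daily_volume_tb = triggered_mb_per_s * SECONDS_PER_DAY / 1e6

print(f"{crossings_per_second:,} crossings/s, 1 in {crossings_per_trigger:,} kept")
print(f"Triggered stream: {triggered_mb_per_s:.0f} MB/s, about {daily_volume_tb:.2f} TB/day")
```

The 100 MB/s result matches the ~100 MBytes/sec label in the figure.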


Network for Earthquake Engineering Simulation

• NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other

• On-demand access to experiments, data streams, computing, archives, collaboration

NEESgrid: Argonne, Michigan, NCSA, UIUC, USC


Grid Applications: Mathematicians

• Community = an informal collaboration of mathematicians and computer scientists

• Condor-G delivers 3.46E8 CPU seconds in 7 days (600E3 seconds real-time)

• peak 1009 processors in U.S. and Italy (8 sites)

MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
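The Condor-G figures imply an average level of parallelism, which can be worked out directly. This is a rough sketch using the slide's numbers. Treating the quoted CPU seconds as fully useful work is an assumption.

```python
# Numbers quoted on the slide.
CPU_SECONDS = 3.46e8          # total CPU time delivered
WALL_SECONDS = 600e3          # "600E3 seconds real-time"
PEAK_PROCESSORS = 1009

# Sanity check: seven days expressed in seconds.
seven_days_s = 7 * 24 * 60 * 60

# Average number of processors busy over the run.
average_processors = CPU_SECONDS / WALL_SECONDS

# Average as a fraction of the reported peak.
fraction_of_peak = average_processors / PEAK_PROCESSORS

print(f"7 days = {seven_days_s} s (slide rounds to 600E3)")
print(f"Average processors in use: {average_processors:.0f}")
print(f"Average / peak: {fraction_of_peak:.0%}")
```

So the run kept a little under 600 processors busy on average, against a peak of 1009.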


Grid Architecture

Isn’t it just the Next Generation Internet, so why bother!


Why Discuss Architecture?
• Descriptive
  • Provide a common vocabulary for use when describing Grid systems
• Guidance
  • Identify key areas in which services are required - FRAMEWORK
• Prescriptive
  • Define standards
  • But in the existing standards framework
  • GGF working with IETF, Internet2 etc.


What Sorts of Standards?
• Need for interoperability when different groups want to share resources
  • E.g., IP lets me talk to your computer, but how do we establish & maintain sharing?
  • How do I discover, authenticate, authorize, describe what I want to do, etc., etc.?
• Need for shared infrastructure services to avoid repeated development, installation, e.g.
  • One port/service for remote access to computing, not one per tool/application
  • X.509 enables sharing of Certificate Authorities
• MIDDLEWARE!


In Defining Grid Architecture, We Must Address . . .
• Development of Grid protocols & services
  • Protocol-mediated access to remote resources
  • New services: e.g., resource brokering
  • Mostly (extensions to) existing protocols
• Development of Grid APIs & SDKs
  • Facilitate application development by supplying higher-level abstractions
• The model is the Internet and Web


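The Role of Grid Services (Middleware) and Tools

[Figure: Grid services (middleware) and tools; labels listed below]
• Tools: Collaboration Tools, Data Mgmt Tools, Distributed simulation, . . .
• Services: Information services, Resource mgmt, Fault detection, . . .
• net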


GRID Architecture Status
• No “official” standards exist
• But:
  • Globus Toolkit has emerged as the de facto standard for several important Connectivity, Resource, and Collective protocols
  • GGF has an architecture working group
  • Technical specifications are being developed for architecture elements: e.g., security, data, resource management, information
  • Internet drafts submitted in security area


Layered Grid Architecture

[Figure: Grid layers shown beside the Internet Protocol Architecture]

Grid layers, top to bottom:
• Application: DOES THE SCIENCE /….
• Collective: JOB MANAGEMENT (Directory, Discovery, Monitoring)
• Resource: NEGOTIATION & CONTROL (Sharing resources, controlling)
• Connectivity: COMMS & AUTHENTICATION (Single Sign On, Trust . . .)
• Fabric: ALL PHYSICAL RESOURCES (Net, CPUs, Storage, Sensors)

Internet Protocol Architecture layers, top to bottom:
• Application
• Transport
• Internet
• Link
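The layering can be written down as a small data structure, which makes the "each layer builds on the ones beneath it" idea explicit. This is a minimal sketch: the layer names and role summaries come from the slide, while `GridLayer`, `ROLE`, and `layers_below` are hypothetical names introduced here for illustration.

```python
from enum import IntEnum
from typing import List


class GridLayer(IntEnum):
    """Grid layers from the slide, numbered from the bottom up."""
    FABRIC = 1
    CONNECTIVITY = 2
    RESOURCE = 3
    COLLECTIVE = 4
    APPLICATION = 5


# One-line role per layer, paraphrasing the slide's captions.
ROLE = {
    GridLayer.FABRIC: "all physical resources: net, CPUs, storage, sensors",
    GridLayer.CONNECTIVITY: "comms & authentication: single sign-on, trust",
    GridLayer.RESOURCE: "negotiation & control of individual resources",
    GridLayer.COLLECTIVE: "job management: directory, discovery, monitoring",
    GridLayer.APPLICATION: "does the science",
}


def layers_below(layer: GridLayer) -> List[GridLayer]:
    """Return the layers a given layer sits on top of, bottom first."""
    return [candidate for candidate in GridLayer if candidate < layer]


for layer in reversed(list(GridLayer)):
    print(f"{layer.name:<12} {ROLE[layer]}")

below_collective = [layer.name for layer in layers_below(GridLayer.COLLECTIVE)]
print("Collective builds on:", below_collective)
```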


Toolkits & Components
• CONDOR - Harnessing the processing capacity of idle workstations
  www.cs.wisc.edu/condor/
• LEGION - developing an object-oriented framework for grid applications
  www.cs.virginia.edu/~legion
• Globus Toolkit SDK - APIs
  www.globus.org/


Architecture: Fabric Layer
• Just what you would expect: the diverse mix of resources that may be shared
  • Individual computers, Condor pools, file systems, archives, metadata catalogs, networks, sensors, etc., etc.
• Few constraints on low-level technology: connectivity and resource level protocols
• Globus toolkit provides a few selected components (e.g., bandwidth broker)


Architecture: Connectivity
• Communication
  • Internet protocols: IP, DNS, routing, etc.
• Security: Grid Security Infrastructure (GSI)
  • Uniform authentication & authorization mechanisms in multi-institutional setting
  • Single sign-on, delegation, identity mapping
  • Public key technology, SSL, X.509, GSS-API (several Internet drafts document extensions)
  • Supporting infrastructure: Certificate Authorities, key management, etc.


GSI Futures
• Scalability in numbers of users & resources
  • Credential management
  • Online credential repositories
  • Account management
• Authorization
  • Policy languages
  • Community authorization
• Protection against compromised resources
  • Restricted delegation, smartcards


Architecture: Resources
• Resource management: Remote allocation, reservation, monitoring, control of [compute] resources - GRAM (access & management)
• Data access: GridFTP
  • High-performance data access & transport
• Information:
  • GRIP (cf. LDAP)
  • GRRP – Registration Protocol
  • Access to structure & state information
• & others emerging: catalog access, code repository access, accounting, …
• All integrated with GSI


GRAM Resource Management Protocol
• Grid Resource Allocation & Management
  • Allocation, monitoring, control of computations
• Simple HTTP-based RPC
  • Job request: Returns opaque, transferable “job contact” string for access to job
  • Job cancel, Job status, Job signal
  • Event notification (callbacks) for state changes
• Protocol/server address robustness (exactly once execution), authentication, authorization
• Servers for most schedulers; C and Java APIs
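The job-contact and callback ideas can be illustrated with a toy, in-memory model. This is not the Globus GRAM API and makes no network calls. The class `ToyJobManager`, its method names, the `toy-job:` prefix, and the state names are all hypothetical, chosen only to mirror the operations listed on the slide (request, status, cancel, callbacks on state change).

```python
import uuid
from typing import Callable, Dict, List, Optional

Callback = Callable[[str, str], None]   # called with (job_contact, new_state)
TERMINAL_STATES = {"DONE", "CANCELLED"}  # illustrative state names


class ToyJobManager:
    """In-memory stand-in for a job service; not the Globus GRAM API."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}
        self._command: Dict[str, str] = {}
        self._callbacks: Dict[str, List[Callback]] = {}

    def submit(self, command: str, callback: Optional[Callback] = None) -> str:
        # The contact is opaque: callers store it or hand it on, never parse it.
        contact = "toy-job:" + uuid.uuid4().hex
        self._state[contact] = "PENDING"
        self._command[contact] = command
        self._callbacks[contact] = [callback] if callback else []
        return contact

    def status(self, contact: str) -> str:
        return self._state[contact]

    def _move(self, contact: str, new_state: str) -> None:
        if self._state[contact] in TERMINAL_STATES:
            return  # finished jobs do not change state again
        self._state[contact] = new_state
        for notify in self._callbacks[contact]:
            notify(contact, new_state)

    def start(self, contact: str) -> None:
        self._move(contact, "ACTIVE")

    def finish(self, contact: str) -> None:
        self._move(contact, "DONE")

    def cancel(self, contact: str) -> None:
        self._move(contact, "CANCELLED")


events: List[str] = []
manager = ToyJobManager()
contact = manager.submit("/bin/hostname", lambda _contact, state: events.append(state))
manager.start(contact)
manager.cancel(contact)
manager.finish(contact)   # ignored: the job was already cancelled
print(events, manager.status(contact))
```

Because the contact is just a string, any party holding it could query or cancel the job, which is the sense in which the slide calls it "transferable".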


Data Access & Transfer
• GridFTP: extended version of popular FTP protocol for Grid data access and transfer
• Secure, efficient, reliable, flexible, extensible, parallel, concurrent, e.g.:
  • Third-party data transfers, partial file transfers
  • Parallelism, striping (e.g., on PVFS)
  • Reliable, recoverable data transfers
• Reference implementations
  • Existing clients and servers: wuftpd, nicftp
  • Flexible, extensible libraries
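Partial file transfers and parallelism combine naturally: a file can be cut into byte ranges and each range moved on its own stream. The sketch below shows only the range arithmetic. It does not speak GridFTP, and `split_ranges` is a hypothetical helper, not part of any Globus library.

```python
from typing import List, Tuple


def split_ranges(file_size: int, streams: int) -> List[Tuple[int, int]]:
    """Split file_size bytes into up to `streams` contiguous (offset, length) ranges.

    Earlier ranges get one extra byte when the size does not divide evenly.
    Zero-length ranges are dropped, so tiny files yield fewer ranges.
    """
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams must be >= 1")
    base, extra = divmod(file_size, streams)
    ranges: List[Tuple[int, int]] = []
    offset = 0
    for index in range(streams):
        length = base + (1 if index < extra else 0)
        if length == 0:
            continue
        ranges.append((offset, length))
        offset += length
    return ranges


example = split_ranges(10, 3)
print(example)
```

Each (offset, length) pair describes one partial transfer; a failed range can be retried on its own, which is one way to read the slide's "reliable, recoverable" bullet.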


Architecture: Collective
• Bringing the underlying resources together to provide the requested services
• Resource brokers (e.g., Condor Matchmaker)
  • Resource discovery and allocation
• Replica management and replica selection
  • Optimize aggregate data access performance
• Co-reservation and co-allocation services
  • End-to-end performance
• Etc., etc.
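A resource broker, at its simplest, filters advertised resources against a request and ranks what is left. The following is a deliberately small sketch of that idea. It is not how the Condor Matchmaker works internally. The function `match`, the dictionary keys, and the site names are all made up for illustration.

```python
from typing import Dict, List, Optional


def match(request: Dict[str, int], offers: List[Dict]) -> Optional[str]:
    """Return the name of the eligible offer with the most free CPUs, or None."""
    eligible = [
        offer for offer in offers
        if offer["free_cpus"] >= request["cpus"]
        and offer["free_disk_gb"] >= request["disk_gb"]
    ]
    if not eligible:
        return None
    best = max(eligible, key=lambda offer: offer["free_cpus"])
    return best["name"]


# Hypothetical resource advertisements.
offers = [
    {"name": "site-a", "free_cpus": 64, "free_disk_gb": 500},
    {"name": "site-b", "free_cpus": 256, "free_disk_gb": 50},
    {"name": "site-c", "free_cpus": 128, "free_disk_gb": 2000},
]

chosen = match({"cpus": 100, "disk_gb": 100}, offers)
unmatched = match({"cpus": 512, "disk_gb": 1}, offers)
print(chosen, unmatched)
```

Here site-a has too few CPUs and site-b too little disk, so only site-c qualifies. A real broker would also have to respect each owner's policy, which is the "sharing always conditional" point made earlier in the talk.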


Globus Toolkit Solution
• Registration & enquiry protocols, information models, query languages
  • Provides standard interfaces to sensors
  • Supports different “directory” structures supporting various discovery/access strategies

Karl Czajkowski, Steve Fitzgerald, others


Grid Futures


Major Grid Projects

Name | URL & Sponsors | Focus
Access Grid | www.mcs.anl.gov/FL/accessgrid; DOE, NSF | Create & deploy group collaboration systems using commodity technologies
BlueGrid | IBM | Grid testbed linking IBM laboratories
DISCOM | www.cs.sandia.gov/discom; DOE Defense Programs | Create operational Grid providing access to resources at three U.S. DOE weapons laboratories
DOE Science Grid | sciencegrid.org; DOE Office of Science | Create operational Grid providing access to resources & applications at U.S. DOE science laboratories & partner universities
Earth System Grid (ESG) | earthsystemgrid.org; DOE Office of Science | Delivery and analysis of large climate model datasets for the climate research community
European Union (EU) DataGrid | eu-datagrid.org; European Union | Create & apply an operational grid for applications in high energy physics, environmental science, bioinformatics


Major Grid Projects

Name | URL/Sponsor | Focus
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org; European Union | Create technologies for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus
Fusion Collaboratory | fusiongrid.org; DOE Off. Science | Create a national computational collaboratory for fusion research
Globus Project | globus.org; DARPA, DOE, NSF, NASA, Msoft | Research on Grid technologies; development and support of Globus Toolkit; application and deployment
GridLab | gridlab.org; European Union | Grid technologies and applications
GridPP | gridpp.ac.uk; U.K. eScience | Create & apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center | grids-center.org; NSF | Integration, deployment, support of the NSF Middleware Infrastructure for research & education


Major Grid Projects

Name | URL/Sponsor | Focus
Grid Application Dev. Software | hipersoft.rice.edu/grads; NSF | Research into program development technologies for Grid applications
Grid Physics Network | griphyn.org; NSF | Technology R&D for data analysis in physics expts: ATLAS, CMS, LIGO, SDSS
Information Power Grid | ipg.nasa.gov; NASA | Create and apply a production Grid for aerosciences and other NASA missions
International Virtual Data Grid Laboratory | ivdgl.org; NSF | Create international Data Grid to enable large-scale experimentation on Grid technologies & applications
Network for Earthquake Eng. Simulation Grid | neesgrid.org; NSF | Create and apply a production Grid for earthquake engineering
Particle Physics Data Grid | ppdg.net; DOE Science | Create and apply production Grids for data analysis in high energy and nuclear physics experiments


Major Grid Projects

Name | URL/Sponsor | Focus
TeraGrid | teragrid.org; NSF | U.S. science infrastructure linking four major resource sites at 40 Gb/s
UK Grid Support Center | grid-support.ac.uk; U.K. eScience | Support center for Grid projects within the U.K.
Unicore | BMBFT | Technologies for remote access to supercomputers

Also many technology R&D projects: e.g., Condor, NetSolve, Ninf, NWS

See also www.gridforum.org


The 13.6 TF TeraGrid: Computing at 40 Gb/s

[Figure: four sites, each with Site Resources and External Networks; storage labels HPSS (three) and UniTree]
• NCSA/PACI: 8 TF, 240 TB
• SDSC: 4.1 TF, 225 TB
• Caltech
• Argonne

TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne www.teragrid.org
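To give the headline numbers some scale, the sketch below works out how long it would take to move one site's storage across a 40 Gb/s link and how much of the 13.6 TF is not accounted for by the two sites with listed figures. It assumes decimal units (1 TB = 10^12 bytes), a fully utilized link, and that 13.6 TF is the sum over the four sites. None of these is stated on the slide.

```python
# Figures quoted on the slide.
TOTAL_TF = 13.6
SITE_TF = {"NCSA/PACI": 8.0, "SDSC": 4.1}
SITE_STORAGE_TB = {"NCSA/PACI": 240, "SDSC": 225}
BACKBONE_GBPS = 40


def transfer_hours(terabytes: float, gigabits_per_second: float) -> float:
    """Hours to move `terabytes` at a sustained line rate (decimal units)."""
    bits = terabytes * 1e12 * 8
    seconds = bits / (gigabits_per_second * 1e9)
    return seconds / 3600


# Compute capacity not attributed to NCSA/PACI or SDSC on the slide.
remaining_tf = round(TOTAL_TF - sum(SITE_TF.values()), 1)

ncsa_hours = transfer_hours(SITE_STORAGE_TB["NCSA/PACI"], BACKBONE_GBPS)

print(f"TF not listed per site: {remaining_tf}")
print(f"Moving 240 TB at 40 Gb/s: about {ncsa_hours:.1f} hours")
```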


International Virtual Data Grid Lab

[Figure: iVDGL facilities and links. Legend: Tier0/1 facility, Tier2 facility, Tier3 facility; 10 Gbps link, 2.5 Gbps link, 622 Mbps link, other link]

U.S. PIs: Avery, Foster, Gardner, Newman, Szalay www.ivdgl.org


Problem Evolution
• Past-present: O(10^2) high-end systems; Mb/s networks; centralized (or entirely local) control
  • I-WAY (1995): 17 sites, week-long; 155 Mb/s
  • GUSTO (1998): 80 sites, long-term experiment
  • NASA IPG, NSF NTG: O(10) sites, production
• Present: O(10^4-10^6) data systems, computers; Gb/s networks; scaling, decentralized control
  • Scalable resource discovery; restricted delegation; community policy; GriPhyN Data Grid: 100s of sites, O(10^4) computers; complex policies
• Future: O(10^6-10^9) data, sensors, computers; Tb/s networks; highly flexible policy, control


The Future
• We don’t build or buy “computers” anymore, we borrow or lease required resources
  • When I walk into a room, need to solve a problem, need to communicate
• A “computer” is a dynamically, often collaboratively constructed collection of processors, data sources, sensors, networks
  • Similar observations apply for software


And Thus …
• Reduced barriers to access mean that we do much more computing, and more interesting computing, than today => Many more components (& services); massive parallelism
• All resources are owned by others => Sharing (for fun or profit) is fundamental; trust, policy, negotiation, payment
• All computing is performed on unfamiliar systems => Dynamic behaviors, discovery, adaptivity, failure


The Global Grid Forum
• Merger of (US) GridForum & EuroGRID
• Cooperative Forum of Working Groups
• Open to all who show up
• Meets every four months
• Alternates between US and Europe
  • GGF1 – Amsterdam, NL
  • GGF2 – Washington, US
  • GGF3 – Frascati, IT

http://www.gridforum.org


Global Grid Forum History

[Figure: timeline with year markers 1998, 1999, 2000, 2001, 2002; events as listed]
• GF BOF (Orlando)
• GF1 (San Jose, NASA Ames)
• GF2 (Chicago, Northwestern)
• eGrid and GF BOFs (Portland)
• GF3 (San Diego, SDSC)
• eGrid1 (Poznan, PSNC)
• GF4 (Redmond, Microsoft)
• eGrid2 (Munich, Europar)
• GF5 (Boston, Sun)
• Global GF BOF (Dallas)
• Asia-Pacific GF Planning (Yokohama)
• GGF-1 (Amsterdam, WTCW)
• GGF-2 (Washington, DC, DOD-MSRC)
• GGF3 (Rome, INFN), 7-10 October 2001
• GGF4 (Toronto, NRC), 17-20 February 2002
• GGF5 (Edinburgh), 21-24 July 2002, jointly with HPDC (24-26 July)


GGF Areas: Working Groups and Research Groups
• Grid Information Services: Grid Object Specification; Grid Notification Framework; Metacomputing Directory Services; Relational Database Information Services
• Scheduling and Resource Management: Advanced Reservation; Scheduling Dictionary; Scheduler Attributes
• Security: Grid Security Infrastructure; Grid Certificate Policy
• Performance: Performance
• Architectures: JINI; Grid Protocol Architecture
• Data: GridFTP; Replica
• Applications, Programming Models, and User Environments: APPS, GUS, GCE, APM


Summary
• The Grid problem: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
• Grid architecture: Emphasize protocol and service definition to enable interoperability and resource sharing
• Globus Toolkit: a source of protocol and API definitions, reference implementations

• See: globus.org, griphyn.org, gridforum.org