The Anatomy of the Grid Enabling Scalable Virtual Organizations Ian Foster Mathematics & Computer...
-
date post
22-Dec-2015 -
Category
Documents
-
view
212 -
download
0
Transcript of The Anatomy of the Grid Enabling Scalable Virtual Organizations Ian Foster Mathematics & Computer...
The Anatomy of the GridEnabling Scalable Virtual Organizations
Ian Foster
Mathematics & Computer Science Division
Argonne National Laboratory
and
Dept. of Computer Science
The University of Chicago
http://www.mcs.anl.gov/~foster
2nd US-Hungarian Workshop on Cluster and Grid Computing, February 6, 2002
David S. Angulo
Dept. of Computer Science
The University of Chicago
and
Mathematics & Computer Science Division
Argonne National Laboratory
http://www.cs.uchicago.edu/~dangulo
[email protected] University of Chicago
Partial Acknowledgements
Globus ToolkitTM
– R&D involves> many fine scientists & engineers at ANL/UofC, USC/ISI, and
elsewhere (see www.globus.org)
– Led by> Ian Foster @ Argonne/UofC> Carl Kesselman @ USC/ISI
Open Grid Services Architecture work performed by– Ian Foster, Globus Co-PI @ Argonne/UofC– Carl Kesselman, Globus Co-PI @ USC/ISI– Steve Tuecke, Globus Toolkit Architect @ANL– Jeff Nick, Steve Graham, Jeff Frey @ IBM
Strong collaborations with many outstanding EU, UK, US Grid projects
Support from DOE, NASA, NSF, Microsoft, IBM
[email protected] University of Chicago
Grid Computing
[email protected] University of Chicago
The Grid Problem
Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
[email protected] University of Chicago
Why Grids? A biochemist exploits 10,000 computers to screen
100,000 compounds in an hour 1,000 physicists worldwide pool resources for
petaflop analyses of petabytes of data Civil engineers collaborate to design, execute, &
analyze shake table experiments Climate scientists visualize, annotate, & analyze
terabyte simulation datasets A home user invokes architectural design
functions at an application service provider– An application service provider purchases cycles
from compute cycle providers
[email protected] University of Chicago
Elements of the Problem
Resource sharing– Computers, storage, sensors, networks, …
– Sharing always conditional: issues of trust, policy, payment, …
Coordinated problem solving– Beyond client-server: distributed data analysis,
computation, … Dynamic, multi-institutional virtual orgs
– Community overlays on classic org structures
– Large or small, static or dynamic
[email protected] University of Chicago
Grids: Why Now?
Moore’s law improvements in computing produce highly functional end systems
The Internet and burgeoning wired and wireless provide universal connectivity
Network exponentials produce dramatic changes in geometry and geography
[email protected] University of Chicago
Grids: Why Now?
Moore’s law improvements in computing produce highly functional endsystems
The Internet and burgeoning wired and wireless provide universal connectivity
Network exponentials produce dramatic changes in geometry and geography– 9-month doubling: double Moore’s law!
– 1986-2001: x340,000; 2001-2010: x4000?
[email protected] University of Chicago
A Little History
Early 90s– Gigabit testbeds, metacomputing
Mid to late 90s– Early experiments (e.g., I-WAY), software projects
(e.g., Globus), application experiments 2002
– Major application communities emerging– Major infrastructure deployments are underway– Rich technology base has been constructed– Global Grid Forum: >1000 people on mailing lists,
192 orgs at last meeting, 28 countries
[email protected] University of Chicago
The Grid World: Current Status
Dozens of major Grid projects in scientific & technical computing/research & education– Deployment, application, technology
Considerable consensus on key concepts and technologies– Globus Toolkit™ has emerged as de facto
standard for major protocols & services Global Grid Forum has emerged as a
significant force– And first “Grid” proposals at IETF
[email protected] University of Chicago
Selected Major Grid ProjectsName URL &
SponsorsFocus
Access Grid www.mcs.anl.gov/FL/accessgrid; DOE, NSF
Create & deploy group collaboration systems using commodity technologies
BlueGrid IBM Grid testbed linking IBM laboratories
DISCOM www.cs.sandia.gov/discomDOE Defense Programs
Create operational Grid providing access to resources at three U.S. DOE weapons laboratories
DOE Science Grid
sciencegrid.org
DOE Office of Science
Create operational Grid providing access to resources & applications at U.S. DOE science laboratories & partner universities
Earth System Grid (ESG)
earthsystemgrid.orgDOE Office of Science
Delivery and analysis of large climate model datasets for the climate research community
European Union (EU) DataGrid
eu-datagrid.org
European Union
Create & apply an operational grid for applications in high energy physics, environmental science, bioinformatics
g
g
g
g
g
g
New
New
[email protected] University of Chicago
Selected Major Grid ProjectsName URL/
SponsorFocus
EuroGrid, Grid Interoperability (GRIP)
eurogrid.org
European Union
Create technologies for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus
Fusion Collaboratory
fusiongrid.org
DOE Off. Science
Create a national computational collaboratory for fusion research
Globus Project globus.org
DARPA, DOE, NSF, NASA, Msoft
Research on Grid technologies; development and support of Globus Toolkit; application and deployment
GridLab gridlab.org
European Union
Grid technologies and applications
GridPP gridpp.ac.uk
U.K. eScience
Create & apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center
grids-center.org
NSF
Integration, deployment, support of the NSF Middleware Infrastructure for research & education
g
g
g
g
g
g
New
New
New
New
New
[email protected] University of Chicago
Selected Major Grid ProjectsName URL/Sponsor Focus
Grid Application Dev. Software
hipersoft.rice.edu/grads; NSF
Research into program development technologies for Grid applications
Grid Physics Network
griphyn.org
NSF
Technology R&D for data analysis in physics expts: ATLAS, CMS, LIGO, SDSS
Information Power Grid
ipg.nasa.gov
NASA
Create and apply a production Grid for aerosciences and other NASA missions
International Virtual Data Grid Laboratory
ivdgl.org
NSF
Create international Data Grid to enable large-scale experimentation on Grid technologies & applications
Network for Earthquake Eng. Simulation Grid
neesgrid.org
NSF
Create and apply a production Grid for earthquake engineering
Particle Physics Data Grid
ppdg.net
DOE Science
Create and apply production Grids for data analysis in high energy and nuclear physics experiments
g
g
g
g
g
New
New
g
[email protected] University of Chicago
Selected Major Grid ProjectsName URL/Sponsor Focus
TeraGrid teragrid.org
NSF
U.S. science infrastructure linking four major resource sites at 40 Gb/s
UK eScience Grid grid-support.ac.uk
U.K. eScience
Support center for Grid projects within the U.K.
Unicore BMBFT Technologies for remote access to supercomputers
g
g
New
New
Also many technology R&D projects: e.g., Condor, NetSolve, Ninf, NWS
See also www.gridforum.org
[email protected] University of Chicago
Grid Communities & Applications:Data Grids for High Energy Physics
Tier2 Centre ~1 TIPS
Online System
Offline Processor Farm
~20 TIPS
CERN Computer Centre
FermiLab ~4 TIPSFrance Regional Centre
Italy Regional Centre
Germany Regional Centre
InstituteInstituteInstituteInstitute ~0.25TIPS
Physicist workstations
~100 MBytes/sec
~100 MBytes/sec
~622 Mbits/sec
~1 MBytes/sec
There is a “bunch crossing” every 25 nsecs.
There are 100 “triggers” per second
Each triggered event is ~1 MByte in size
Physicists work on analysis “channels”.
Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server
Physics data cache
~PBytes/sec
~622 Mbits/sec or Air Freight (deprecated)
Tier2 Centre ~1 TIPS
Tier2 Centre ~1 TIPS
Tier2 Centre ~1 TIPS
Caltech ~1 TIPS
~622 Mbits/sec
Tier 0Tier 0
Tier 1Tier 1
Tier 2Tier 2
Tier 4Tier 4
1 TIPS is approximately 25,000
SpecInt95 equivalents
www.griphyn.org www.ppdg.net www.eu-datagrid.org
[email protected] University of Chicago
Grid Communities and Applications:Mathematicians Solve NUG30
Community=an informal collaboration of mathematicians and computer scientists
Condor-G delivers 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)
Solves NUG30 quadratic assignment problem
14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
www.mcs.anl.gov/metaneos: Argonne, Iowa, NWU, Wisconsin
[email protected] University of Chicago
Grid Communities and Applications:Network for Earthquake Eng. Simulation
NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC www.neesgrid.org
[email protected] University of Chicago
The 13.6 TF TeraGrid:Computing at 40 Gb/s
26
24
8
4 HPSS
5
HPSS
HPSS UniTree
External Networks
External Networks
External Networks
External Networks
Site Resources Site Resources
Site ResourcesSite ResourcesNCSA/PACI8 TF240 TB
SDSC4.1 TF225 TB
Caltech Argonne
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne www.teragrid.org
[email protected] University of Chicago
Intl. Virtual Data Grid Lab.
Tier0/1 facility
Tier2 facility
10+ Gbps link
2.5 Gbps link
622 Mbps link
Other link
Tier3 facility
www.ivdgl.org
[email protected] University of Chicago
Access Grid
Collaborative work among large groups
~50 sites worldwide Use Grid services for
discovery, security www.scglobal.org
Ambient mic(tabletop)
Presentermic
Presentercamera
Audience camera
Access Grid: Argonne, others www.accessgrid.org
[email protected] University of Chicago
Grid Architecture & Globus Toolkit™ The question:
– What is needed for resource sharing & coordinated problem solving in dynamic virtual organizations (VOs)?
The answer:– Major issues identified: membership, resource
discovery & access, …, …
– Grid architecture captures core elements, emphasizing pre-eminent role of protocols
– Globus Toolkit™ has emerged as de facto standard for major protocols & services
[email protected] University of Chicago
The Critical Role of Protocols Need for interoperability when different groups
want to share resources– E.g., IP lets me talk to your computer, but how do
we establish & maintain sharing?
– How do I discover, authenticate, authorize, describe what I want to do, etc., etc.?
Need for shared infrastructure services to avoid repeated development, installation, e.g.– One port/service for remote access to computing,
not one per tool/application
– X.509 enables sharing of Certificate Authorities
[email protected] University of Chicago
Grid Architecture
Application
Fabric“Controlling things locally”: Access to, & control of, resources
Connectivity“Talking to things”: communication (Internet protocols) & security
Resource“Sharing single resources”: negotiating access, controlling use
Collective“Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
InternetTransport
Application
Link
Inte
rnet P
roto
col
Arch
itectu
re
For more info: www.globus.org/research/papers/anatomy.pdf
[email protected] University of Chicago
Globus Project and Toolkit
Globus Project™ – R&D project at ANL, U.Chicago, USC/ISI
– Emphasis on identifying and defining core protocols and services
– O(40) researchers & developers Globus Toolkit™
– A major product of the Globus Project
– Open source software: reference implementation of core protocols & services
– Growing open source developer community
[email protected] University of Chicago
Globus & Architecture (1):Fabric Layer
Diverse resources that may be shared– Computers, clusters, Condor pools, file
systems, archives, metadata catalogs, networks, sensors, etc., etc.
Speak connectivity, resource protocols– The neck of the protocol hourglass
May implement standard behaviors– Reservation, pre-emption, virtualization
– Grid operation can have profound implications for resource behavior
Gridresource
Registration, enquiry,
management, access
protocol(s)
[email protected] University of Chicago
Globus & Architecture (2):Connectivity Layer Protocols & Services
Communication– Internet protocols: IP, DNS, routing, etc.
Security: Grid Security Infrastructure (GSI)– Uniform authentication & authorization
mechanisms in multi-institutional setting
– Single sign-on, delegation, identity mapping
– Public key technology, SSL, X.509, GSS-API (several Internet drafts document extensions)
– Supporting infrastructure: Certificate Authorities, key management, etc.
[email protected] University of Chicago
Site A(Kerberos)
Site B (Unix)
Site C(Kerberos)
Computer
User
Single sign-on via “grid-id”& generation of proxy cred.
Or: retrieval of proxy cred.from online repository
User ProxyProxy
credential
Computer
Storagesystem
Communication*
GSI-enabledFTP server
AuthorizeMap to local idAccess file
Remote fileaccess request*
GSI-enabledGRAM server
GSI-enabledGRAM server
Remote processcreation requests*
* With mutual authentication
Process
Kerberosticket
Restrictedproxy
Process
Restrictedproxy
Local id Local id
AuthorizeMap to local idCreate processGenerate credentials
Ditto
GSI in Action: “Create Processes at A and B that Communicate & Access Files at C”
[email protected] University of Chicago
Globus & Architecture (3):Resource Layer Protocols & Services
Resource management: GRAM– Remote allocation, reservation, monitoring,
control of [compute] resources Data access: GridFTP
– High-performance data access & transport Information: MDS (GRRP, GRIP)
– Access to structure & state information & others emerging: database access, code
repository access, accounting, … All integrated with GSI
[email protected] University of Chicago
GRAM ResourceManagement Protocol
Grid Resource Allocation & Management– Allocation, monitoring, control of computations
– Secure remote access to diverse schedulers Current evolution
– Immediate and advance reservation
– Multiple resource types: manage anything
– Recoverable requests, timeout, etc.
– Evolve to Web Services
– Policy evaluation points for restricted proxies
Karl Czajkowski, Steve Tuecke, others
[email protected] University of Chicago
Data Access & Transfer
GridFTP: extended version of popular FTP protocol for Grid data access and transfer
Secure, efficient, reliable, flexible, extensible, parallel, concurrent, e.g.:– Third-party data transfers, partial file transfers
– Parallelism, striping (e.g., on PVFS)
– Reliable, recoverable data transfers Reference implementations
– Existing clients and servers: wuftpd, nicftp
– Flexible, extensible libraries
Bill Allcock, Joe Bester, John Bresnahan, Steve Tuecke, others
[email protected] University of Chicago
Grid Services Architecture (4):Collective Layer Protocols & Services
Community membership & policy– E.g., Community Authorization Service
Index/metadirectory/ brokering services– E.g., Globus GIIS, Condor Matchmaker
Replica management and replica selection– Optimize aggregate data access performance
Co-reservation and co-allocation services– End-to-end performance
Middle tier services– MyProxy credential repository, portal services
[email protected] University of Chicago
Data Grids
Grid infrastructures, tools, and applications focused on enabling distributed access to, & analysis of, large amounts of data
A specialization and extension of standard Grid technologies
Current application domains include high energy & nuclear physics, climate data analysis, astronomy, bioinformatics
[email protected] University of Chicago
Grid Physics Network (GriPhyN) Enabling R&D for advanced data grid systems,
focusing in particular on Virtual Data concept
Virtual Data ToolsRequest Planning and
Scheduling ToolsRequest Execution Management Tools
Transforms
Distributed resources(code, storage,computers, and network)
Resource Management
Services
Resource Management
Services
Security and Policy
Services
Security and Policy
Services
Other Grid Services
Other Grid Services
Interactive User Tools
Production Team
Individual Investigator Other Users
Raw data source
ATLASCMSLIGOSDSS
Paul Avery, Ian Foster, Co-PIs www.griphyn.org
[email protected] University of Chicago
Future Directions Initial exploration (1996-1999; Globus 1.0)
– Extensive appln experiments; core protocols Data Grids (1999-??; Globus 2.0+)
– Large-scale data management and analysis Open Grid Services Architecture (2001-??, Globus
3.0)– Integration w/ Web services, hosting envs.
– Integration with databases
– Integrated set of higher-level services Scalable systems (2003-??)
– Sensors, wireless, ubiquitous computing
[email protected] University of Chicago
Summary The Grid problem: Resource sharing & coordinated
problem solving in dynamic, multi-institutional virtual organizations
Grid architecture: Protocol, service definition for interoperability & resource sharing
Globus Toolkit™ a source of protocol and API definitions—and reference implementations– And many projects applying Grid concepts (& Globus
technologies) to important problems Timely to start applying technologies to industrial
problems, within & outside STC
[email protected] University of Chicago
For More Information The Globus Project™
– www.globus.org Global Grid Forum
– www.gridforum.org Grid architecture
– www.globus.org/research/papers/anatomy.pdf
Open Grid Services Architecture (soon)– www.globus.org/research/
papers/ogsa.pdf– www.globus.org/research/
papers/gsspec.pdf