GRIDS Center
Grid Research Integration Development & Support
http://www.grids-center.org
Copyright Thomas Garritano, 2002. This work is the intellectual property of the author. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright appears on the reproduced materials and notice is given that the
copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.
Chicago - NCSA – SDSC - USC/ISI - Wisconsin
Part of the NSF Middleware Initiative (NMI) www.grids-center.org
GRIDS
GRIDS, part of the NSF Middleware Initiative (NMI)
• The Information Sciences Institute (ISI) at the University of Southern California (Carl Kesselman)
• The University of Chicago (Ian Foster)
• The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign (Randy Butler)
• The University of California at San Diego (Phil Papadopoulos)
• The University of Wisconsin at Madison (Miron Livny)
Enabling Seamless Collaboration
GRIDS will help distributed communities pursue common goals
• Scientific research
• Engineering design
• Education
• Artistic creation
Focus is on the enabling mechanisms required for collaboration
Resource sharing as a fundamental concept
Grid Computing Rationale
The need for flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources
See “The Anatomy of the Grid: Enabling Scalable Virtual Organizations” by Foster, Kesselman, Tuecke at http://www.globus.org (in the “Publications” section)
The need for communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals while assuming the absence of:
• central location
• central control
• omniscience
• existing trust relationships
Elements of Grid Computing
Resource sharing
• Computers, storage, sensors, networks
• Sharing is always conditional, based on issues of trust, policy, negotiation, payment, etc.
Coordinated problem solving
• Beyond client-server: distributed data analysis, computation, collaboration, etc.
Dynamic, multi-institutional virtual organizations
• Community overlays on classic org structures
• Large or small, static or dynamic
Resource-Sharing Mechanisms
• Should address security and policy concerns of resource owners and users
• Should be flexible and interoperable enough to deal with many resource types and sharing modes
• Should scale to large numbers of resources, participants, and/or program components
• Should operate efficiently when dealing with large amounts of data & computational power
Grid Applications
Science portals
• Help scientists overcome steep learning curves of installing and using new software
• Solve advanced problems by invoking sophisticated packages remotely from Web browsers or “thin clients”
• Portals are currently being developed in biology, fusion, computational chemistry, and other disciplines
Distributed computing
• High-speed workstations and networks can yoke together an organization's PCs to form a substantial computational resource
• E.g., U.S. and Italian mathematicians pooled resources for one week, aggregating 42,000 CPU-days to solve “NUG30”
Mathematicians Solve NUG30
Looking for the solution to the NUG30 quadratic assignment problem
An informal collaboration of mathematicians and computer scientists
Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)
14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
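The Condor-G figures above can be converted into more familiar units. The following is a minimal arithmetic sketch in Python that uses only the numbers quoted on the slide (3.46E8 CPU seconds, 7 days, peak 1009 processors); the variable names are illustrative. It works out to roughly 4,000 CPU-days, or about 11 CPU-years, which is about a tenth of the 42,000 CPU-days quoted for the same run on the distributed computing slide. The slides do not say how the two figures relate.

```python
# Figures quoted on the slide
cpu_seconds = 3.46e8       # total CPU time delivered by Condor-G
wall_days = 7              # duration of the run
peak_processors = 1009     # peak number of processors in use

SECONDS_PER_DAY = 86_400

# Convert total CPU time into CPU-days and CPU-years
cpu_days = cpu_seconds / SECONDS_PER_DAY
cpu_years = cpu_days / 365

# Average number of processors kept busy over the whole run
avg_processors = cpu_days / wall_days

print(f"CPU-days:  {cpu_days:,.0f}")
print(f"CPU-years: {cpu_years:.1f}")
print(f"Average busy processors over {wall_days} days: {avg_processors:.0f}")
print(f"Average as a share of peak: {avg_processors / peak_processors:.0%}")
```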
Home Computers Evaluate AIDS Drugs
Community =
• 1000s of home computer users
• Philanthropic computing vendor (Entropia)
• Research group (Scripps)
Common goal = advance AIDS research
More Grid Applications
Large-scale data analysis
• Science increasingly relies on large datasets that benefit from distributed computing and storage
• E.g., the Large Hadron Collider at CERN will generate many petabytes of data from high-energy physics experiments, with single-site storage impractical for technical and political reasons
Computer-in-the-loop instrumentation
• Data from telescopes, synchrotrons, and electron microscopes are traditionally archived for batch processing
• Grids are permitting quasi-real-time analysis that enhances the instruments’ capabilities
• E.g., with sophisticated “on-demand” software, astronomers may be able to use automated detection techniques to zoom in on solar flares as they occur
Data Grids for High Energy Physics
[Figure: tiered data grid for LHC physics. Image courtesy Harvey Newman, Caltech]
Components shown in the figure:
• Online System
• Offline Processor Farm (~20 TIPS)
• CERN Computer Centre
• Regional centres: FermiLab (~4 TIPS), France, Italy, Germany
• Tier2 Centres (~1 TIPS each), including Caltech (~1 TIPS)
• Institutes (~0.25 TIPS), with a physics data cache
• Physicist workstations
Tier labels shown in the figure: Tier 0, Tier 1, Tier 2, Tier 4
Link rates shown in the figure: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec (or Air Freight, deprecated), ~1 MBytes/sec
There is a “bunch crossing” every 25 nsecs.
There are 100 “triggers” per second.
Each triggered event is ~1 MByte in size.
Physicists work on analysis “channels.”
Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
1 TIPS is approximately 25,000 SpecInt95 equivalents.
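The trigger figures on this slide explain where the ~100 MBytes/sec rate comes from. The following is a minimal arithmetic sketch in Python using only the quoted numbers (25 nsec bunch crossings, 100 triggers per second, ~1 MByte per event, ~622 Mbits/sec links). The daily volume assumes the trigger runs continuously for 24 hours, which is an assumption made here for illustration and is not stated on the slide.

```python
# Figures quoted on the slide
bunch_crossing_interval_s = 25e-9   # one bunch crossing every 25 ns
triggers_per_s = 100                # events kept per second
event_size_bytes = 1e6              # ~1 MByte per triggered event
link_bits_per_s = 622e6             # ~622 Mbits/sec wide-area link

# How many crossings happen per second, and what fraction is kept
crossings_per_s = 1 / bunch_crossing_interval_s
kept_fraction = triggers_per_s / crossings_per_s

# Data rate leaving the trigger: events/sec times bytes/event
rate_bytes_per_s = triggers_per_s * event_size_bytes

# ASSUMPTION (not from the slide): the trigger runs continuously all day
tb_per_day = rate_bytes_per_s * 86_400 / 1e12

# Capacity of a ~622 Mbits/sec link expressed in MBytes/sec
link_mbytes_per_s = link_bits_per_s / 8 / 1e6

print(f"Bunch crossings per second: {crossings_per_s:.2e}")
print(f"Fraction of crossings kept: {kept_fraction:.1e}")
print(f"Trigger output rate: {rate_bytes_per_s / 1e6:.0f} MBytes/sec")
print(f"Volume if run continuously: {tb_per_day:.2f} TBytes/day")
print(f"622 Mbits/sec link capacity: {link_mbytes_per_s:.2f} MBytes/sec")
```

The product of 100 triggers per second and ~1 MByte per event matches the ~100 MBytes/sec rate in the figure, while a single ~622 Mbits/sec link carries somewhat less than that.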
Online Access to Scientific Instruments
[Figure: Advanced Photon Source workflow: real-time collection, tomographic reconstruction, archival storage, wide-area dissemination, desktop & VR clients with shared controls]
DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago
Still More Grid Applications
Collaborative work
• Researchers often want to aggregate not only data and computing power, but also human expertise
• Grids enable collaborative problem formulation and data analysis
• E.g., an astrophysicist who has performed a large, multi-terabyte simulation could let colleagues around the world simultaneously visualize the results, permitting real-time group discussion
• E.g., civil engineers collaborate to design, execute, & analyze shake table experiments
iVDGL: International Virtual Data Grid Laboratory
[Figure: map of iVDGL sites and links. Legend: Tier0/1 facility, Tier2 facility, Tier3 facility, 10 Gbps link, 2.5 Gbps link, 622 Mbps link, other link]
U.S. PIs: Avery, Foster, Gardner, Newman, Szalay www.ivdgl.org
Network for Earthquake Engineering Simulation
NEESgrid: US national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, and each other
On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
The 13.6 TF TeraGrid: Computing at 40 Gb/s
[Figure: four sites (NCSA/PACI, SDSC, Caltech, Argonne), each with site resources and connections to external networks; storage systems shown include HPSS and UniTree]
• NCSA/PACI: 8 TF, 240 TB
• SDSC: 4.1 TF, 225 TB
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne www.teragrid.org
Grids and Industry
Grid computing has much in common with major industrial thrusts
Business-to-business, Peer-to-peer, Application Service Providers, Storage Service Providers, Distributed Computing, Internet Computing, etc.
Outsourcing increases decentralization of resources
Sharing issues are not adequately addressed by existing technologies
Complicated requirements: “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q”
Companies like IBM, Platform Computing and Microsoft are getting substantively involved with the open-source Grid community (e.g., web services and Grid services)
eBusiness Grids
• Engineers at a multinational company collaborate on the design of a new product
• A multidisciplinary analysis in aerospace couples code and data in four companies
• An insurance company mines data from partner hospitals for fraud detection
• An application service provider offloads excess load to a compute cycle provider
• An enterprise configures internal & external resources to support eBusiness workload
Grid Computing: Why Now?
• Moore’s law improvements in computing produce highly functional endsystems
• The Internet and burgeoning wired and wireless networks provide universal connectivity
• Changing modes of problem solving emphasize teamwork, computation
• Network exponentials produce dramatic changes in geometry and geography
Network Exponentials
Network vs. computer performance
• Computer speed doubles every 18 months
• Network speed doubles every 9 months
• Difference = order of magnitude per 5 years
1986 to 2000
• Computers: x 500
• Networks: x 340,000
2001 to 2010
• Computers: x 60
• Networks: x 4000
Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan-2001) by Cleo Vilett, source Vinod Khosla, Kleiner, Caufield and Perkins.
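The doubling periods above imply the growth factors directly. The following is a minimal Python sketch, assuming pure exponential growth with the 18-month and 9-month doubling times quoted on the slide; the function name `growth_factor` is illustrative. Under that assumption the 2001 to 2010 figures come out at 64 and 4096, close to the slide's rounded x 60 and x 4000, and the five-year gap is about a factor of ten. For 1986 to 2000 the same formula gives somewhat larger values than the slide's x 500 and x 340,000, so those figures presumably rest on measured data or different rounding.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Growth over `years` for a quantity that doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


COMPUTER_DOUBLING_MONTHS = 18   # from the slide
NETWORK_DOUBLING_MONTHS = 9     # from the slide

# Periods discussed on the slide, expressed as a length in years
periods = [("5 years", 5), ("1986 to 2000", 14), ("2001 to 2010", 9)]

for label, years in periods:
    computers = growth_factor(years, COMPUTER_DOUBLING_MONTHS)
    networks = growth_factor(years, NETWORK_DOUBLING_MONTHS)
    print(
        f"{label:>12}: computers x {computers:,.0f}, "
        f"networks x {networks:,.0f}, "
        f"ratio x {networks / computers:,.0f}"
    )
```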
GRIDS and the NSF Middleware Initiative
GRIDS is one of two NMI teams; the other is EDIT
NMI seeks standard components and mechanisms
• Authentication, authorization, policy
• Resource discovery and directory
• Remote access of computers, data, instruments
Also seeks:
• Integration with end-user tools (conferencing, data analysis, data sharing, distributed computing, etc.)
• Integration with campus infrastructures
• Integration with commercial technologies
GRIDS Deliverables for NMI Release 1.0
On May 7, NMI Release 1.0 will be issued (see www.nsf-middleware.org), including deliverables from the GRIDS and EDIT teams
GRIDS software in NMI-R1 will include new versions of:
• Globus Toolkit™
• Condor-G
• Network Weather Service
The package also includes KX.509
The Globus Toolkit™
The de facto standard for Grid computing
• A modular “bag of technologies” addressing key technical problems facing Grid tools, services and applications
• Made available under liberal open source license
• Simplifies collaboration across virtual organizations
Authentication
• Grid Security Infrastructure (GSI)
Scheduling
• Globus Resource Allocation Manager (GRAM)
• Dynamically Updated Request Online Coallocator (DUROC)
File transfer
• Global Access to Secondary Storage (GASS)
• GridFTP
Resource description
• Monitoring and Discovery Service (MDS)
Condor-G
High performance computing (HPC) is often measured in operations per second; with high throughput computing (HTC), Condor permits increased processing capacity over longer periods of time
• CPU cycles/day (week, month, year?) under non-ideal circumstances
• “How many times can I run simulation X in a month using all available machines?”
The Condor Project develops, deploys, and evaluates mechanisms and policies for HTC on large collections of distributed systems
NMI-R1 will include Condor-G, an enhanced version of the core Condor software optimized to work with Globus Toolkit™ for managing Grid jobs
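The HTC question quoted above (“How many times can I run simulation X in a month?”) is a capacity estimate rather than a peak-speed figure. The following is a minimal Python sketch of that estimate. It does not use Condor or Condor-G; the `MachinePool` class, the pool sizes, the availability hours, and the 50 CPU-hours per run are all hypothetical values chosen for illustration, and the calculation ignores scheduling overhead and jobs that are interrupted partway.

```python
from dataclasses import dataclass


@dataclass
class MachinePool:
    """A group of similar machines (hypothetical illustration, not a Condor type)."""
    name: str
    machines: int
    available_hours_per_day: float  # hours per day each machine is free for jobs


def runs_per_month(pools: list[MachinePool], cpu_hours_per_run: float, days: int = 30) -> int:
    """Estimate completed runs from total harvested CPU-hours over `days`."""
    total_cpu_hours = sum(
        pool.machines * pool.available_hours_per_day * days for pool in pools
    )
    return int(total_cpu_hours // cpu_hours_per_run)


# Hypothetical example: desktops that are idle overnight plus a dedicated cluster
pools = [
    MachinePool("desktops", machines=200, available_hours_per_day=14.0),
    MachinePool("cluster", machines=64, available_hours_per_day=24.0),
]

estimate = runs_per_month(pools, cpu_hours_per_run=50.0)
print(f"Estimated runs of simulation X per month: {estimate}")
```

The point of the sketch is that throughput depends on how many machine-hours can be harvested over the month, not on how fast any single machine is.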
Network Weather Service
From UC Santa Barbara, NWS monitors and dynamically forecasts performance of network and computational resources
Uses a distributed set of performance sensors (network monitors, CPU monitors, etc.) for instantaneous readings
Numerical models’ ability to predict conditions is analogous to weather forecasting – hence the name
For use with the Globus Toolkit and Condor, allowing dynamic schedulers to provide statistical Quality-of-Service readings
NWS forecasts end-to-end TCP/IP performance (bandwidth and latency), available CPU percentage and available non-paged memory
NWS automatically identifies the best forecasting technique for any given resource
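The slide does not describe how NWS picks its forecasting technique. One plausible approach, sketched below in Python as an assumption rather than as the NWS implementation, is to replay the measurement history through several simple forecasters, add up each one's absolute error, and use the one with the lowest total for the next prediction. The forecaster set, the function `best_forecast`, and the bandwidth readings are all hypothetical.

```python
from statistics import mean, median

# Candidate forecasters (hypothetical set): each maps a measurement
# history to a prediction of the next value.
FORECASTERS = {
    "last_value": lambda history: history[-1],
    "running_mean": lambda history: mean(history),
    "median_of_last_5": lambda history: median(history[-5:]),
}


def best_forecast(measurements: list[float]) -> tuple[str, float]:
    """Return the forecaster with the lowest cumulative error, and its next prediction."""
    errors = {name: 0.0 for name in FORECASTERS}

    # Replay the history: at each step, every forecaster predicts the next
    # measurement from the values seen so far, and its absolute error is added up.
    for i in range(1, len(measurements)):
        history = measurements[:i]
        actual = measurements[i]
        for name, forecaster in FORECASTERS.items():
            errors[name] += abs(forecaster(history) - actual)

    winner = min(errors, key=errors.get)
    return winner, FORECASTERS[winner](measurements)


# Hypothetical bandwidth readings in Mbits/sec, with one transient dip
bandwidth_mbps = [90.0, 92.0, 91.0, 30.0, 93.0, 92.0, 91.0, 94.0]

name, prediction = best_forecast(bandwidth_mbps)
print(f"Best forecaster so far: {name}")
print(f"Predicted next bandwidth: {prediction:.1f} Mbits/sec")
```

On a series with a single outlier like this one, a median-based forecaster tends to win because it is not thrown off by the dip; a steadily trending series would favour a different technique, which is why choosing per resource is useful.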
KX.509 for Converting Kerberos Certificates to PKI
Stand-alone client program from the University of Michigan
For a Kerberos-authenticated user, KX.509 acquires a short-term X.509 certificate that can be used by PKI applications
Stores the certificate in the local user's Kerberos ticket file
• Systems that already have a mechanism for removing unused Kerberos credentials may also automatically remove the X.509 credentials
Web browser may then load a library (PKCS11) to use these credentials for https
The client reads X.509 credentials from the user’s Kerberos cache and converts them to PEM, the format used by the Globus Toolkit
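The final conversion step can be illustrated in general terms: a PEM certificate is the base64 encoding of the binary (DER) certificate, wrapped at 64 characters per line between BEGIN and END marker lines. The following is a minimal Python sketch of that encoding only. It is not the KX.509 client, it does not read a Kerberos cache, and the function name `der_to_pem` and the placeholder bytes are assumptions for illustration.

```python
import base64


def der_to_pem(der_bytes: bytes) -> str:
    """Wrap binary (DER) certificate bytes in PEM text armor."""
    b64 = base64.b64encode(der_bytes).decode("ascii")
    # PEM bodies are conventionally wrapped at 64 characters per line
    body_lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return (
        "-----BEGIN CERTIFICATE-----\n"
        + "\n".join(body_lines)
        + "\n-----END CERTIFICATE-----\n"
    )


# Placeholder bytes standing in for a DER certificate.
# This is NOT a valid certificate; it only exercises the encoding.
placeholder_der = bytes(range(100))

pem = der_to_pem(placeholder_der)
print(pem)
```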
GRIDS Integration Issues
Ten NMI testbed sites will be early adopters, seeking integration of enterprise and Grid computing
• Eight sites to be announced soon by SURA
• Two further sites: Caltech and USC
Via NMI partnerships, GRIDS will help identify points of intersection and divergence between Grid and enterprise computing
• Directory services
• Authorization, authentication and security
Emphasis is on open standards and architectures as the route to successful collaboration