Introduction to the Grid: technologies and projects Oxana Smirnova Lund University October 28, 2003,...
Introduction to the Grid: technologies and projects
Oxana Smirnova, Lund University
October 28, 2003, Košice
2003-10-28 [email protected] 2
Outlook
• Information Technology developments
• Grid solutions
• High Energy Physics challenges
• Development and deployment projects
IT progress: some facts
Network vs. computer performance:
• Computer speed doubles every 18 months
• Network speed doubles every 9 months
1986 to 2000:
• Computers: 500 times faster
• Networks: 340000 times faster
2001 to 2010 (projected):
• Computers: 60 times faster
• Networks: 4000 times faster
Slide adapted from the Globus Alliance
Bottom line: CPUs are fast enough; networks are very fast – gotta make use of it!
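The doubling times quoted on this slide can be turned into speed-up factors with a one-line formula. The sketch below is only an idealized check, assuming constant doubling periods over the whole interval; the function names are illustrative and not taken from the slides. For 2001 to 2010 it gives factors of 64 and 4096, close to the projected 60 and 4000; for 1986 to 2000 it gives the same order of magnitude as the quoted 500 and 340000.

```python
def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """Speed-up after elapsed_months if performance doubles every doubling_months."""
    return 2.0 ** (elapsed_months / doubling_months)


# Doubling times quoted on the slide.
COMPUTER_DOUBLING_MONTHS = 18
NETWORK_DOUBLING_MONTHS = 9


def compare(start_year: int, end_year: int) -> tuple[float, float]:
    """Return (computer speed-up, network speed-up) between two years."""
    months = (end_year - start_year) * 12
    return (
        growth_factor(months, COMPUTER_DOUBLING_MONTHS),
        growth_factor(months, NETWORK_DOUBLING_MONTHS),
    )


if __name__ == "__main__":
    for start, end in [(1986, 2000), (2001, 2010)]:
        cpu, net = compare(start, end)
        print(f"{start}-{end}: computers x{cpu:,.0f}, networks x{net:,.0f}")
```

Because the network doubling time is half the computer doubling time, the network factor is always the square of the computer factor under this model.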
The Grid Paradigm
• Distributed supercomputer, based on commodity PCs and fast WAN
• Access to a great variety of resources with a single pass: a certificate
• A possibility to manage distributed data in a synchronous manner (e.g., LHC data analysis)
• A new commodity
[Diagram labels: Supercomputer, Workstation, PC Farm, The Grid]
[Diagram labels: Drainage, Water, Electricity, Internet, Grid, Radio/TV]
Wider scope: a Grid System
A Grid system is a collection of distributed resources connected by a network
Examples of distributed resources:
• Desktop
• Handheld hosts
• Devices with embedded processing resources, such as digital cameras and phones
• Tera-scale supercomputers
Slide adapted from A.Grimshaw
Characteristics of a generic Grid system
• Numerous resources
• Ownership by mutually distrustful organizations & individuals
• Potentially faulty resources
• Different security requirements & policies required
• Resources are heterogeneous
• Geographically separated
• Different resource management policies
• Connected by heterogeneous, multi-level networks
Slide adapted from A.Grimshaw
Grid paradigm is overloaded
• Desktop Cycle Aggregation: desktop only. Examples: United Devices, Entropia, Data Synapse
• Cluster & Departmental “Grids”: single owner, platform, domain, file system and location. Examples: SUN SGE, Platform LSF, PBS
• Enterprise “Grids”: single enterprise; multiple owners, platforms, domains, file systems, locations, and security policies. Examples: SUN SGE EE, Platform Multicluster
• Global Grids: multiple enterprises, owners, platforms, domains, file systems, locations, and security policies. Examples: Legion, Avaki, Globus
Graph borrowed from A.Grimshaw
WARNING! Not everything that has “G” in the name is Grid! (SGE, Oracle 10g, Condor-G, etc.)
Globus: the toolkit provider
• The first and only provider of a Grid toolkit (libraries and API)
• An academic research project in the USA and now Europe
• Free software, open code
• Supports Grid testbeds since the late 90’s
Grid features:
• Heterogeneous
• Non-interactive
• Single logon
• Optimized file transfer protocol
• Information schema
To do:
• Global resource management
• Data management
• User management, accounting
The Globus Toolkit v2 in One Slide
• Grid protocols (GSI, GRAM, …) enable resource sharing within virtual organizations; toolkit provides reference implementation ( = Globus Toolkit services)
• Protocols (and APIs) enable other tools and services for membership, discovery, data management, workflow, …
[Diagram labels:]
• GSI (Grid Security Infrastructure): User; authenticate & create proxy credential; Proxy
• GRAM (Grid Resource Allocation & Management): reliable remote invocation; Gatekeeper (factory); create process; User process #1; User process #2 with Proxy #2; register; Reporter (registry + discovery)
• MDS-2 (Monitoring and Discovery Service): soft state registration; enquiry; GIIS: Grid Information Index Server (discovery)
• Other service (e.g. GridFTP): other GSI-authenticated remote service requests
Slide adapted from the Globus Alliance
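The "soft state registration; enquiry" idea in the diagram can be illustrated with a toy index in which every registration expires unless it is refreshed. This is only a sketch of the general soft-state technique, not the MDS-2 protocol or any Globus API; the class `SoftStateRegistry`, its methods and the resource names are hypothetical.

```python
import time


class SoftStateRegistry:
    """Toy index (hypothetical, not MDS-2): entries expire unless refreshed."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable clock, handy for testing
        self._entries = {}           # name -> (attributes, expiry time)

    def register(self, name, attributes):
        """Create a registration, or refresh an existing one."""
        self._entries[name] = (dict(attributes), self._clock() + self._ttl)

    def _purge(self):
        """Drop every entry whose lifetime has run out."""
        now = self._clock()
        expired = [n for n, (_, exp) in self._entries.items() if exp <= now]
        for name in expired:
            del self._entries[name]

    def enquire(self, **required):
        """Return sorted names of live entries matching all given attributes."""
        self._purge()
        return sorted(
            name
            for name, (attrs, _) in self._entries.items()
            if all(attrs.get(k) == v for k, v in required.items())
        )


if __name__ == "__main__":
    now = [0.0]                                   # fake clock for the demo
    registry = SoftStateRegistry(ttl_seconds=30, clock=lambda: now[0])
    registry.register("cluster-a", {"os": "linux"})
    registry.register("cluster-b", {"os": "linux"})
    now[0] = 20
    registry.register("cluster-a", {"os": "linux"})   # refresh only cluster-a
    now[0] = 40
    print(registry.enquire(os="linux"))           # cluster-b has silently expired
```

The point of the design is that a resource which crashes or disconnects needs no explicit deregistration: it simply stops refreshing and drops out of the index.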
Globus-Based Grid Tools & Applications
• Data Grids: distributed management of large quantities of data (physics, astronomy, engineering)
• High-throughput computing: coordinated use of many computers
• Collaborative environments: authentication, resource discovery, and resource access
• Portals: thin client access to remote resources & services
• And combinations of the above
Slide adapted from the Globus Alliance
Some architectural thoughts
[Diagram labels: several User Interfaces, Information Servers, Workload managers and Storage units; one Data location server]
Who needs Grid: High Energy Physics challenges
• Data-intensive tasks
  • Large datasets, large files
  • Lengthy processing times
  • Large memory consumption
  • High throughput is necessary
• Very distributed user base
  • Distributed computing resources of modest size
  • Produced and processed data are hence distributed, too
  • Issues of coordination, synchronization and authorization are outstanding
• HEP is by no means unique in its demands, but HEP users are the first, they are many, and they badly need the Grid
Experiment-Grid interaction
[Diagram labels, Experiment side: Task, Input DB, Output DB, MSS, Paper]
[Diagram labels, Grid side: Job Description, Information System, Resource Broker, Resources (CPU, Disk), Monitoring & control, Replica Location]
HEP-related Grid projects
European projects
US projects
Many national, regional Grid projects: GridPP (UK), INFN-grid (I), NorduGrid, Dutch Grid, …
The Virtual Data Toolkit (VDT)
The DataGRID Toolkit
Slide adapted from Les Robertson
Related Grid projects
Other Grid-related projects do not develop Open Source-like (i.e., free) software/middleware, as of today. Most notably:
• Legion/Avaki: Globus competitor, widely used by businesses
• Entropia: like SETI@Home
• IBM, Platform: Globus-based
• Sun Grid Engine EE: enterprise Grids
[Timeline chart, 2000 to 2010. Rows: LCG; EDG, EGEE; GriPhyN, PPDG, VDT; CROSSGRID; DataTAG; NorduGrid; Globus (GT2, GT3, OGSA)]
What Grid can do today
• Simplest Grid: users access distributed resources using a single certificate
• More complex Grid: users’ tasks are distributed between different resources by a broker
• Even more complex Grid: not only tasks, but massive amounts of data are also distributed and managed (not quite there yet, only prototypes)
[Diagram labels: Broker(s), SE, MSS]
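The "more complex Grid" case, where a broker distributes users' tasks between resources, can be sketched as a simple matchmaking loop. This is a minimal illustration under stated assumptions, not the algorithm of any actual Grid broker: the `Resource` and `Job` fields, the "most free CPUs wins" rule and the site names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    free_cpus: int
    free_disk_gb: float


@dataclass
class Job:
    name: str
    cpus: int
    disk_gb: float


def broker(jobs, resources):
    """Place each job on the eligible resource with the most free CPUs.

    Returns {job name: resource name}, with None for jobs that fit nowhere.
    The resources' free capacity is reduced in place as jobs are assigned.
    """
    placement = {}
    for job in jobs:
        eligible = [
            r for r in resources
            if r.free_cpus >= job.cpus and r.free_disk_gb >= job.disk_gb
        ]
        if not eligible:
            placement[job.name] = None
            continue
        best = max(eligible, key=lambda r: r.free_cpus)
        best.free_cpus -= job.cpus
        best.free_disk_gb -= job.disk_gb
        placement[job.name] = best.name
    return placement


if __name__ == "__main__":
    sites = [Resource("lund", 4, 100), Resource("kosice", 2, 500)]
    tasks = [Job("sim", 2, 50), Job("reco", 2, 400), Job("big", 8, 10)]
    print(broker(tasks, sites))
```

Managing the data as well as the tasks would additionally require the broker to know where input files are stored, which is the part the slide describes as existing only in prototypes.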
What is missing
• Common policies, or ways of mutually respecting such
• Grid accounting systems and Grid economy
• Serious security solutions; role-based access control
• Full-blown distributed data management systems
• Tools and methods for system-wide applications environment deployment
• STANDARDS!
The Grid or many Grids?
• Globus Toolkit 2 is a basis for a great many Grid solutions
  • Which use some common tools and utilities: GSI, GridFTP
  • But they also differ a lot, architecturally and technologically
  • There are several non-interoperable GT2-based Grid systems!
• No satisfactory ready-made solutions, so developers invent their own
  • Being financed from different sources, developers and users are not always encouraged to adopt a rival project’s solution
  • Instead of “How should I use Grid?”, users ask “Which Grid should I use?”
• Grid standards body: Global Grid Forum (GGF)
  • Heavily oriented towards commercial implementations
  • No effective standards since 2001
• Meanwhile, Globus introduced the “Open Grid Services Architecture” (OGSA)
  • Globus Toolkit 3 is released
  • Not yet used by any of the development projects
  • Perhaps the first set of standards endorsed by GGF
The emergence of Open Grid standards
[Chart. Vertical axis: functionality, standardization. Horizontal axis: 1990, 1995, 2000, 2005, 2010. Labels: Custom solutions; Globus Toolkit (de facto standard, single implementation); Open Grid Services Arch (real standards, multiple implementations); Managed shared virtual systems; Internet standards; Web services, etc.; Computer science research]
Slide adapted from the Globus Alliance
Open Grid Services Architecture
• Standard interfaces & behaviors for distributed system management
• Service orientation: Grid Services, in analogy to Web Services
  • Web services: persistent
  • Grid services: transient (issues: e.g., how are they discovered?)
  • Extending WSDL to GSDL (work with W3C)
• Standard service specifications
  • Resource management
  • Data management
  • Workflow
  • Security
  • etc.
• Paves the road towards interoperability and true modularity of Grid structures
Conclusion
• HEP community stirred a world-wide Grid interest
  • Next big thing after the dot-com?..
• Despite a slow start and much hype, some real work is under way
  • Rather, the next big thing after the WWW!
• Still, no complete solution exists
  • Data management? Accounting? Security? Standardization?
• With courage and patience, we should go Grid