Introduction to the Grid: technologies and projects

Oxana Smirnova
Lund University
October 28, 2003, Košice


Outlook

• Information Technology developments
• Grid solutions
• High Energy Physics challenges
• Development and deployment projects


IT progress: some facts

• Network vs. computer performance:
  • Computer speed doubles every 18 months
  • Network speed doubles every 9 months
• 1986 to 2000:
  • Computers: 500 times faster
  • Networks: 340000 times faster
• 2001 to 2010 (projected):
  • Computers: 60 times faster
  • Networks: 4000 times faster

Slide adapted from the Globus Alliance

Bottom line: CPUs are fast enough; networks are very fast – gotta make use of it!
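The quoted speed-up factors can be checked against the doubling periods with a few lines of arithmetic. The sketch below is only an illustration: the function name and the choice to count 2001 to 2010 as 9 years and 1986 to 2000 as 14 years are assumptions, not part of the slide.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Speed-up after `months` if performance doubles every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)


# Assumption: 2001 to 2010 is counted as 9 years (108 months).
projected_months = 9 * 12
cpu_projected = growth_factor(projected_months, 18)  # 2**6
net_projected = growth_factor(projected_months, 9)   # 2**12

# Assumption: 1986 to 2000 is counted as 14 years (168 months).
past_months = 14 * 12
cpu_past = growth_factor(past_months, 18)
net_past = growth_factor(past_months, 9)

print(f"2001-2010: computers x{cpu_projected:.0f}, networks x{net_projected:.0f}")
print(f"1986-2000: computers x{cpu_past:.0f}, networks x{net_past:.0f}")
```

Under these assumptions the projected factors come out at 64 and 4096, close to the quoted 60 and 4000. The 1986 to 2000 factors land in the same order of magnitude as the quoted 500 and 340000, which suggests the slide's numbers are rounded or based on measured data rather than on the pure doubling rule.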


The Grid Paradigm

• Distributed supercomputer, based on commodity PCs and fast WAN
• Access to a great variety of resources with a single pass: a certificate
• The ability to manage distributed data in a synchronous manner (e.g., LHC data analysis)
• A new commodity

[Figure labels: Supercomputer, Workstation, PC Farm, The Grid]

[Figure labels: Drainage, Water, Electricity, Internet, Grid, Radio/TV]


Wider scope: a Grid System

A Grid system is a collection of distributed resources connected by a network

Examples of Distributed Resources:
• Desktop
• Handheld hosts
• Devices with embedded processing resources such as digital cameras and phones
• Tera-scale supercomputers

Slide adapted from A.Grimshaw


Characteristics of a generic Grid system

• Numerous Resources
• Ownership by Mutually Distrustful Organizations & Individuals
• Potentially Faulty Resources
• Different Security Requirements & Policies Required
• Resources are Heterogeneous
• Geographically Separated
• Different Resource Management Policies
• Connected by Heterogeneous, Multi-Level Networks

Slide adapted from A.Grimshaw


Grid paradigm is overloaded

• Desktop Cycle Aggregation
  • Desktop only
  • United Devices, Entropia, Data Synapse
• Cluster & Departmental “Grids”
  • Single owner, platform, domain, file system and location
  • SUN SGE, Platform LSF, PBS
• Enterprise “Grids”
  • Single enterprise; multiple owners, platforms, domains, file systems, locations, and security policies
  • SUN SGE EE, Platform Multicluster
• Global Grids
  • Multiple enterprises, owners, platforms, domains, file systems, locations, and security policies
  • Legion, Avaki, Globus

Graph borrowed from A.Grimshaw

WARNING! Not everything that has “G” in the name is Grid! (SGE, Oracle 10g, Condor-G, etc.)


Globus: the toolkit provider

• The first and only provider of a Grid toolkit (libraries and API)
• An academic research project in the USA and now Europe
• Free software, open code
• Has supported Grid testbeds since the late 90’s

Grid features:

• Heterogeneous
• Non-interactive
• Single logon
• Optimized file transfer protocol
• Information schema

To do:

• Global resource management
• Data management
• User management, accounting


The Globus Toolkit v2 in One Slide

• Grid protocols (GSI, GRAM, …) enable resource sharing within virtual organizations; the toolkit provides a reference implementation (= Globus Toolkit services)
• Protocols (and APIs) enable other tools and services for membership, discovery, data management, workflow, …

[Figure: Globus Toolkit v2 components and their interactions]

• GSI (Grid Security Infrastructure): the User authenticates and creates a proxy credential (Proxy)
• GRAM (Grid Resource Allocation & Management): reliable remote invocation of the Gatekeeper (factory), which creates User process #1 and User process #2 (with Proxy #2)
• MDS-2 (Monitoring and Discovery Service): processes register with the Reporter (registry + discovery); soft state registration and enquiry involve the GIIS: Grid Information Index Server (discovery)
• Other GSI-authenticated remote service requests go to other services (e.g. GridFTP)

Slide adapted from the Globus Alliance
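The "soft state registration; enquiry" label on the MDS-2 part of the diagram refers to a general pattern: a registration is kept only for a limited time and disappears unless the registering service refreshes it. The sketch below is a toy illustration of that pattern, not Globus code; the class `SoftStateRegistry`, the `FakeClock` helper, the time-to-live value and the host names are all hypothetical.

```python
import time
from typing import Callable, Dict, List


class SoftStateRegistry:
    """Toy registry whose entries expire unless they are re-registered."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._expiry: Dict[str, float] = {}  # service name -> expiry time

    def register(self, service: str) -> None:
        """Add or refresh a registration; it lapses after ttl_seconds."""
        self._expiry[service] = self.clock() + self.ttl_seconds

    def enquire(self) -> List[str]:
        """Return the services whose registration is still valid."""
        now = self.clock()
        # Drop stale entries: nobody has to deregister a crashed service.
        self._expiry = {s: t for s, t in self._expiry.items() if t > now}
        return sorted(self._expiry)


class FakeClock:
    """Manually advanced clock so the example does not need to sleep."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
registry = SoftStateRegistry(ttl_seconds=30, clock=clock)
registry.register("gridftp.example.org")
registry.register("gatekeeper.example.org")

clock.now = 20
registry.register("gatekeeper.example.org")  # refresh: now valid until t=50

clock.now = 40
alive = registry.enquire()  # the GridFTP entry lapsed at t=30
print(alive)
```

The appeal of this design for a Grid of potentially faulty resources is that a failed service simply stops refreshing and drops out of the index without any explicit clean-up.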


Globus-Based Grid Tools & Applications

• Data Grids
  • Distributed management of large quantities of data: physics, astronomy, engineering
• High-throughput computing
  • Coordinated use of many computers
• Collaborative environments
  • Authentication, resource discovery, and resource access
• Portals
  • Thin client access to remote resources & services
• And combinations of the above

Slide adapted from the Globus Alliance


Some architectural thoughts

[Figure: architecture sketch with several User Interfaces, Workload managers, Information Servers, a Data location server and Storage elements]


Who needs Grid: High Energy Physics challenges

• Data-intensive tasks
  • Large datasets, large files
  • Lengthy processing times
  • Large memory consumption
  • High throughput is necessary
• Very distributed user base
  • Distributed computing resources of modest size
  • Produced and processed data are hence distributed, too
  • Issues of coordination, synchronization and authorization are outstanding
• HEP is by no means unique in its demands, but they are first, they are many, and they badly need it


Experiment-Grid interaction

[Figure: interaction between the Experiment and the Grid. Labels: Task, Input DB, Output DB, MSS, Paper, Job Description, Information System, Resource Broker, Resources (CPU, Disk), Monitoring & control, Replica Location]
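The role of the Resource Broker in this picture can be sketched as a matchmaking step: it takes a Job Description, looks at what the Information System reports about the Resources, and picks a suitable one. The following is a minimal sketch under stated assumptions, not the algorithm of any real broker; the classes, field names, ranking rule and site names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Resource:
    """What a hypothetical information system might publish about one site."""
    name: str
    free_cpus: int
    free_disk_gb: int
    has_input_replica: bool  # e.g. as reported by a replica location service


@dataclass(frozen=True)
class JobDescription:
    """Minimal job requirements (hypothetical fields)."""
    cpus: int
    disk_gb: int


def select_resource(job: JobDescription,
                    resources: List[Resource]) -> Optional[Resource]:
    """Pick a resource that satisfies the job, or None if nothing fits."""
    candidates = [
        r for r in resources
        if r.free_cpus >= job.cpus and r.free_disk_gb >= job.disk_gb
    ]
    if not candidates:
        return None
    # Assumed ranking: prefer sites that already hold the input data,
    # then the site with the most free CPUs.
    return max(candidates, key=lambda r: (r.has_input_replica, r.free_cpus))


resources = [
    Resource("site-a", free_cpus=64, free_disk_gb=500, has_input_replica=False),
    Resource("site-b", free_cpus=16, free_disk_gb=200, has_input_replica=True),
    Resource("site-c", free_cpus=2, free_disk_gb=1000, has_input_replica=True),
]
job = JobDescription(cpus=8, disk_gb=100)
chosen = select_resource(job, resources)
print(chosen.name if chosen else "no matching resource")
```

Ranking data locality ahead of free CPUs is one possible choice for data-intensive jobs, where moving large input files can cost more than waiting for a smaller site.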


HEP-related Grid projects

European projects

US projects

Many national, regional Grid projects: GridPP (UK), INFN-grid (I), NorduGrid, Dutch Grid, …

The Virtual Data Toolkit (VDT)

The DataGRID Toolkit

Slide adapted from Les Robertson


Related Grid projects

• Other Grid-related projects do not develop Open Source-like (i.e., free) software/middleware, as of today
• Most notably:
  • Legion/Avaki: Globus competitor, widely used by businesses
  • Entropia: like SETI@Home
  • IBM, Platform: Globus-based
  • Sun Grid Engine EE: enterprise Grids

[Figure: project timeline, 2000 to 2010. Rows: LCG; EDG, EGEE; GriPhyN, PPDG, VDT; CROSSGRID; DataTAG; NorduGrid; Globus GT2, GT3, OGSA]


What Grid can do today

• Simplest Grid: users access distributed resources using a single certificate
• More complex Grid: users’ tasks are distributed between different resources by a broker
• Even more complex Grid: not only tasks, but massive amounts of data are also distributed and managed (not quite there yet, only prototypes)

[Figure labels: Broker(s), SE, MSS; the remaining labels are unreadable in the source]


What is missing

• Common policies, or ways of mutually respecting such
• Grid accounting systems and Grid economy
• Serious security solutions; role-based access control
• Full-blown distributed data management systems
• Tools and methods for system-wide applications environment deployment
• STANDARDS!


The Grid or many Grids?

• Globus Toolkit 2 is a basis for a great many Grid solutions
  • Which use some common tools and utilities: GSI, GridFTP
  • But they also differ a lot, architecturally and technologically
  • There are several non-interoperable GT2-based Grid systems!
• No satisfactory ready-made solutions, so developers invent their own
  • Being financed from different sources, developers and users are not always encouraged to adopt a rival project’s solution
  • Instead of “How should I use Grid?”, users ask “Which Grid should I use?”
• Grid standards body: Global Grid Forum (GGF)
  • Heavily oriented towards commercial implementations
  • No effective standards since 2001
• Meanwhile, Globus introduced the “Open Grid Services Architecture” (OGSA)
  • Globus Toolkit 3 is released
  • Not yet used by any of the development projects
  • Perhaps the first set of standards endorsed by GGF


The emergence of Open Grid standards

[Figure: functionality and standardization versus time, 1990 to 2010. Stages: Custom solutions; Globus Toolkit (de facto standard, single implementation); Open Grid Services Arch (real standards, multiple implementations); Managed shared virtual systems. Other labels: Computer science research; Internet standards; Web services, etc.]

Slide adapted from the Globus Alliance


Open Grid Services Architecture

• Standard interfaces & behaviors for distributed system management
• Service orientation: Grid Services, in analogy to Web Services
  • Web services: persistent
  • Grid services: transient (issues: e.g., how are they discovered?)
  • Extending WSDL to GSDL (work with W3C)
• Standard service specifications
  • Resource management
  • Data management
  • Workflow
  • Security
  • etc.
• Paves the road towards interoperability and true modularity of Grid structures


Conclusion

• HEP community stirred a world-wide Grid interest
  • Next big thing after the dot-com?..
• Despite a slow start and much hype, some real work is under way
  • Rather, the next big thing after the WWW!
• Still, no complete solution exists
  • Data management? Accounting? Security? Standardization?
• With courage and patience, we should go Grid