1
Observations on Architecture, Protocols, Services, APIs, SDKs, and the Role of the Grid Forum
Ian Foster, Carl Kesselman, Steven Tuecke
2
Why this Talk?
Considerable progress over GF1-5 in interest in, and understanding of, Grid concepts
Seems timely to attempt to
– Define the scope of the problem that we are tackling
– Define a common vocabulary for describing components and activities
3
Overview
1. The Grid problem: controlled resource sharing in multi-institutional settings
2. Definition, role, and importance of protocols, services, SDKs, and APIs
3. A categorization of protocols, services, SDKs, and APIs in the Grid environment
4
The Grid Problem
Grid language has been driven by genesis from metacomputing, but…
In practice, the Grid is about resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations
Focus on how to enable, maintain, and control the sharing of resources to achieve a common goal
5
Universal Nature of the Grid Problem
“Sharing” fundamental in many settings
– Application Service Providers, Storage Service Providers, etc.; Peer-to-peer computing; Distributed computing; Business to business; …
Sharing issues not adequately addressed by existing technologies
– Sharing at a deep level, across broad ranges of resources and in a general way
– E.g., user provides ASP with controlled access to their data on an SSP: how?
Grid community has unique experience
6
Some Important Definitions
Resource
Network protocol
Network enabled service
Application Programmer Interface (API)
Software Development Kit (SDK)
Not discussed, but important: policies
7
Resource
Entity that is to be shared
– Includes computers, storage, data, software
Does not have to be physical entity
– Condor pool, distributed file system, …
Defined in terms of interfaces, not devices
– E.g. LSF defines compute resource
– Open/close/read/write defines access to a distributed file system, e.g. NFS, AFS, DFS
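A minimal sketch of "defined in terms of interfaces, not devices", assuming a hypothetical four-call interface named `FileAccess` and an in-memory stand-in named `InMemoryFiles` (neither comes from the slides). Client code that depends only on the interface cannot tell what backs the resource.

```python
from typing import Protocol


class FileAccess(Protocol):
    """Hypothetical interface: anything offering these four calls is a file resource."""

    def open(self, path: str) -> int: ...
    def read(self, handle: int) -> bytes: ...
    def write(self, handle: int, data: bytes) -> None: ...
    def close(self, handle: int) -> None: ...


class InMemoryFiles:
    """A non-physical resource: a dict standing in for a distributed file system."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}
        self._handles: dict[int, str] = {}
        self._next_handle = 0

    def open(self, path: str) -> int:
        self._files.setdefault(path, b"")
        handle = self._next_handle
        self._next_handle += 1
        self._handles[handle] = path
        return handle

    def read(self, handle: int) -> bytes:
        return self._files[self._handles[handle]]

    def write(self, handle: int, data: bytes) -> None:
        self._files[self._handles[handle]] += data

    def close(self, handle: int) -> None:
        del self._handles[handle]


def store_and_read_back(resource: FileAccess, path: str, data: bytes) -> bytes:
    """Client code uses only the interface, never the backing implementation."""
    handle = resource.open(path)
    resource.write(handle, data)
    result = resource.read(handle)
    resource.close(handle)
    return result
```

Any other class with the same four methods could be passed to `store_and_read_back` unchanged.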
8
Network Protocol
A formal description of message formats and a set of rules for message exchange
– Rules may define sequence of message exchanges
– Protocol may define state-change in endpoint, e.g. state change
Good protocols designed to do one thing
– Protocols can be layered
Examples of protocols
– IP, TCP, TLS, HTTP, Kerberos
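The three ingredients named above (message format, exchange rules, endpoint state change) can be sketched with a toy protocol. The frame layout, the message types `HELLO`/`DATA`/`BYE`, and the `Endpoint` class are all invented for illustration and do not correspond to any real protocol.

```python
import struct

# Hypothetical message format: 1-byte type, 2-byte big-endian length, then payload.
HEADER = struct.Struct("!BH")
HELLO, DATA, BYE = 1, 2, 3


def encode(msg_type: int, payload: bytes = b"") -> bytes:
    """Build one frame according to the message format."""
    return HEADER.pack(msg_type, len(payload)) + payload


def decode(frame: bytes) -> tuple[int, bytes]:
    """Parse one frame, rejecting frames shorter than their declared length."""
    msg_type, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return msg_type, payload


class Endpoint:
    """Enforces the exchange rules: HELLO first, then any number of DATA, then BYE."""

    def __init__(self) -> None:
        self.state = "idle"
        self.received: list[bytes] = []

    def handle(self, frame: bytes) -> None:
        msg_type, payload = decode(frame)
        if self.state == "idle" and msg_type == HELLO:
            self.state = "open"
        elif self.state == "open" and msg_type == DATA:
            self.received.append(payload)  # state change in the endpoint
        elif self.state == "open" and msg_type == BYE:
            self.state = "closed"
        else:
            raise ValueError(f"message type {msg_type} not allowed in state {self.state}")
```

Sending `DATA` before `HELLO` raises `ValueError`: the sequencing rule is part of the protocol, not just the byte layout.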
9
Network Enabled Services
Implementation of a protocol that defines a set of capabilities
– Protocol defines interaction with service
– All services require protocols
– Not all protocols are used to provide services (e.g. IP, TLS)
Examples: FTP and Web servers
[Figure: protocol stacks. Web Server: IP, TCP, TLS, and HTTP protocols. FTP Server: IP, TCP, FTP, and Telnet protocols.]
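A rough sketch of protocol layering for the Web server stack, assuming the usual ordering with HTTP on top and IP at the bottom. The bracketed "headers" are toy markers, not real header formats, and `wrap` and `send_down` are hypothetical helper names.

```python
def wrap(layer: str, payload: bytes) -> bytes:
    """Prefix the payload with a toy header naming the layer."""
    return f"[{layer}]".encode("ascii") + payload


def send_down(stack: list[str], payload: bytes) -> bytes:
    """Pass a payload down a stack listed top layer first.

    Each layer wraps what it receives from the layer above, so the
    lowest layer's header ends up outermost.
    """
    for layer in stack:
        payload = wrap(layer, payload)
    return payload


# Assumed ordering for the Web server example: HTTP over TLS over TCP over IP.
WEB_STACK = ["HTTP", "TLS", "TCP", "IP"]

frame = send_down(WEB_STACK, b"GET /")
```

Only the top layer here corresponds to a service (the Web server); the layers beneath it are protocols it relies on.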
10
Application Programmer Interface
A specification for a set of routines to facilitate application development
– Refers to definition, not implementation, e.g. there are many implementations of MPI
Spec often language-specific (or IDL)
– Routine name, number, order and type of arguments; mapping to language constructs
– Behavior or function of routine
Examples
– GSS API, MPI
11
Software Development Kit
A particular instantiation of an API
SDK consists of libraries and tools
– Provides implementation of API specification
Can have multiple SDKs for an API
Examples of SDKs
– MPICH, Motif Widgets
12
Multiple APIs but a Single Protocol
Example: TCP/IP
Multiple APIs: BSD sockets, Winsock, System V streams, …
Different programs use different APIs
Interoperability: programs using different APIs can exchange information
[Figure: two Applications, one using the WinSock API and one using the Berkeley Sockets API, communicating over the TCP/IP Protocol: Reliable byte streams]
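A small sketch of the same idea using Python's standard `socket` module: one side uses the low-level send/recv calls, the other a file-like API, and both exchange data over the same byte stream. A connected local socket pair stands in for a TCP connection here (an assumption made so the sketch runs without a network); `recv_line` and `exchange` are hypothetical helper names.

```python
import socket


def recv_line(sock: socket.socket) -> bytes:
    """Read from a socket with the low-level API until a newline arrives."""
    buf = b""
    while not buf.endswith(b"\n"):
        chunk = sock.recv(1024)
        if not chunk:
            break
        buf += chunk
    return buf


def exchange() -> tuple[bytes, bytes]:
    """Return (what program B saw, what program A saw)."""
    left, right = socket.socketpair()
    try:
        # Program A: low-level send/recv API.
        left.sendall(b"ping\n")

        # Program B: file-like API layered on the same byte stream.
        stream = right.makefile("rwb")
        seen_by_b = stream.readline()
        stream.write(b"pong\n")
        stream.flush()
        stream.close()

        seen_by_a = recv_line(left)
        return seen_by_b, seen_by_a
    finally:
        left.close()
        right.close()
```

The two programs never agree on an API; they interoperate because they agree on what travels over the connection.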
13
Single API, but Multiple Protocols
E.g., Message Passing Interface
MPI provides portability: any correct program compiles & runs on a platform
Does not provide interoperability: all processes must link against same SDK
– E.g., MPICH and LAM versions of MPI
[Figure: two Applications, each using the MPI API. One links the LAM SDK and speaks the LAM protocol over TCP/IP; the other links the MPICH-P4 SDK and speaks the MPICH-P4 protocol over TCP/IP. Different message formats, exchange sequences, etc.]
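The portability-without-interoperability point can be sketched with one abstract API and two implementations that use different wire formats. `MessageAPI`, `JsonSDK`, and `BinarySDK` are invented names; they are not MPI, LAM, or MPICH, only an analogy.

```python
import json
import struct
from abc import ABC, abstractmethod


class MessageAPI(ABC):
    """Hypothetical API: the application sees only these two routines."""

    @abstractmethod
    def pack(self, rank: int, text: str) -> bytes: ...

    @abstractmethod
    def unpack(self, wire: bytes) -> tuple[int, str]: ...


class JsonSDK(MessageAPI):
    """One implementation: JSON text on the wire."""

    def pack(self, rank, text):
        return json.dumps({"rank": rank, "text": text}).encode("utf-8")

    def unpack(self, wire):
        obj = json.loads(wire.decode("utf-8"))
        return obj["rank"], obj["text"]


class BinarySDK(MessageAPI):
    """Another implementation: 4-byte big-endian rank followed by UTF-8 text."""

    def pack(self, rank, text):
        return struct.pack("!I", rank) + text.encode("utf-8")

    def unpack(self, wire):
        (rank,) = struct.unpack_from("!I", wire)
        return rank, wire[4:].decode("utf-8")


def round_trip(sdk: MessageAPI) -> tuple[int, str]:
    """Application code: identical whichever SDK it is linked against."""
    return sdk.unpack(sdk.pack(3, "hello"))


def interoperates(sender: MessageAPI, receiver: MessageAPI) -> bool:
    """True only if the receiver recovers exactly what the sender packed."""
    try:
        return receiver.unpack(sender.pack(3, "hello")) == (3, "hello")
    except (ValueError, KeyError, TypeError):
        return False
```

`round_trip` works with either SDK (portability), while `interoperates(JsonSDK(), BinarySDK())` is false because the two sides disagree on message format.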
14
Back to Grids: The Programming & Systems Problems
The programming problem
– Making it easy to develop sophisticated applications
– Requires programming environments: APIs, SDKs, tools
The systems problem
– Facilitating coordinated use of diverse resources; sharing infrastructure
– Requires systems: protocols and services
“Standards” can help in both cases: but in different ways
15
I.e., Standard APIs and Protocols are Both Important: For Different Reasons
Standard APIs/SDKs are important
– They enable application portability
– But w/o standard protocols, interoperability is hard (every SDK speaks every protocol?)
Standard protocols are important
– Enable cross-site interoperability
– Enable shared infrastructure
– But w/o standard APIs/SDKs, application portability is hard (different platforms access protocols in different ways)
16
Grid Architecture
We now proceed to analyze Grid systems with respect to sharing
Identify key areas where protocols, services, APIs, and SDKs can occur
Result is a layered protocol architecture
We assert this can be useful as a means of describing and structuring Grid Forum activities
17
Layered Grid Protocol Architecture (By Analogy to Internet Architecture)
Application
User “Specialized services”: user- or appln-specific distributed services
Collective “Managing multiple resources”: ubiquitous infrastructure services
Resource “Sharing single resources”: negotiating access, controlling use
Connectivity “Talking to things”: communication (Internet protocols) & security
Fabric “Controlling things locally”: Access to, & control of, resources
[Figure: the Grid layers are drawn alongside the Internet Protocol Architecture layers: Application, Transport, Internet, Link]
18
Protocols, Services, and Interfaces Occur at Each Level
[Figure: layered stack]
Applications
Languages/Frameworks
User Service APIs and SDKs; User Services; User Service Protocols
Collective Service APIs and SDKs; Collective Services; Collective Service Protocols
Resource APIs and SDKs; Resource Services; Resource Service Protocols
Connectivity APIs; Connectivity Protocols
Local Access APIs and Protocols
Fabric Layer
19
Example: User Portal
Appln: Web Portal
User: Source code discovery, application configuration
Collective: Brokering, co-allocation, certificate authorities
Resource: Access to data, access to computers, access to network performance data
Connect: Communication, service discovery (DNS), authentication, authorization, delegation
Fabric: Storage systems, schedulers
[Figure: an SDK and API speak an Access Protocol to a Compute Resource; an SDK and API speak a Lookup Protocol to a Source Code Repository]
20
Example: High-Throughput Computing System
Appln: High Throughput Computing System
User: Dynamic checkpoint, job management, failover, staging
Collective: Brokering, certificate authorities
Resource: Access to data, access to computers, access to network performance data
Connect: Communication, service discovery (DNS), authentication, authorization, delegation
Fabric: Storage systems, schedulers
[Figure: an SDK and API speak an Access Protocol to a Compute Resource; an SDK and API speak a C-point Protocol to a Checkpoint Repository]
21
Important Points
We build on Internet protocols
One or many protocols?
– No one “right” protocol for any one function
– But: interoperability requires that we define and commit to core “Intergrid” protocols
– Definition: “A resource is Grid-enabled if it speaks Intergrid protocols”
One or many APIs and SDKs?
– Many APIs, SDKs, programming models can target Intergrid protocols
– But: code sharing requires standards
22
Summary
Grids are about [large-scale] sharing
– Hence require standard protocols to enable interoperability and shared infrastructure
– As well as, of course, standard APIs and SDKs to enable portability & code sharing
Well defined protocol architecture is essential to understanding & progress
– Provides a framework for figuring out where the pieces fit
24
Additional Slides
26
“Standards Enable Sharing” -- But of What, and How?
Of code & abstractions
– Via SDKs, APIs
– E.g.: MPI
Of infrastructure services
– Via protocols, policies
– E.g.: GIS, CA
Of resources
– Via protocols, policies
– E.g., TCP/IP, TLS
[Figure: three panels. App 1 and App 2 each on an SDK; App 1 and App 2 sharing CA and GIS services; Apps at Site 1 and Site 2 connected via TCP/IP and TLS.]
27
Aspects of the Programming Problem
Need for abstractions and models to add to speed/robustness/etc. of development
– E.g., OO abstractions, MPI for messaging
Need for code sharing to allow reuse of code components developed by others
– E.g., MPI allows reuse of message passing
Need for tool sharing to allow reuse of tools developed by others
– E.g., standard debuggers
28
Aspects of the Systems Problem
Need for interoperability when different groups want to share resources
– Diverse components, policies, mechanisms
– E.g., standard notions of identity, means of communication, resource descriptions
Need for shared infrastructure services to avoid repeated development, installation
– E.g., one port/service for remote access to computing, not one per tool/application
– E.g., Certificate Authorities: expensive to run