Grid Computing - UPM
2
Overview
Introduction
Architecture
Security
Information Systems
Data Management
Workload Management
References
3
Introduction
At the beginning of the 20th century, getting electricity meant living near an electric generator. Nowadays, large generators supply numerous clients through the electric power grid.
As regards information, the World Wide Web allows us to share information anywhere in the world.
New challenges:
– Complex problems require analysing large amounts of data
– Researchers are located in geographically separate places
4
Introduction
Grid Computing applies the philosophy of electricity and information sharing to another kind of resource: heterogeneous and geographically separated computing resources.
The Grid provides sharing of:
– Computational resources
– Storage elements
– Specific applications
Thus, the Grid is based on:
– Internet protocols
– Ideas from parallel and distributed computing
5
Introduction
Grid technology is an important part of several research areas because it provides computational and storage support to applications that need great computational capacity and analyse large amounts of data.
6
Grid
A grid can be defined as: “coordinated resources that are not subject to a centralized control ... using standard, open and general-purpose protocols and interfaces ... deliver non-trivial qualities of services ...” [Ian Foster]
Extending the definition of a grid:
– A special form of distributed computing
– Heterogeneous resources
– Geographically distributed computational and storage resources
– Resources are usually connected by wide area networks (WAN)
– Servers, supercomputers, clusters, … are grid resources
7
Usual Grids
Grid Computing is used in applications that:
– Have a community of distributed users
– Need great computational power
– Need great storage capacity
Above all, it is used in research areas that face complex problems and store large amounts of data:
– High Energy Physics
– Earth Observation
– Bio-medicine
9
Grid architecture
Application
High-level middleware: EDG, CrossGrid
Low-level middleware: Globus, UNICORE, Legion
Operating systems: Unix, Linux, Windows
Hardware
10
Grid architecture
Grid protocol layers, from top to bottom:
– Application
– Collective: coordination of several resources
– Resource: resource sharing and access negotiation
– Connectivity: communication by means of Internet protocols, and security
– Fabric: local resource access and control
Internet protocol layers, shown alongside for comparison:
– Application
– Transport
– Network
– Link
11
Elements
Resource providers
– Publish the availability of their resources by means of information systems
– Define their own security policies
Broker
– Registers and categorizes the published services, providing collective search
Requesters
– Use brokering services to find and use resources
12
Example
[Figure: communication among resource providers, broker and requester]
13
Basic pillars
15
Need for security
Distributed resources
No centralized control
Different resource providers
Each resource provider uses different security policies
16
Security in Grid
Generic Security Services (GSS)
– Authentication, delegation, integrity and confidentiality
– Public Key Infrastructure (PKI) with X.509 certificates
– Kerberos
– Secure Socket Layer (SSL)
Grid Security Infrastructure (GSI)
– Delegation
– Single Sign-On: proxy certificates
17
Certificate request
A user requests a certificate from a Certification Authority (CA)
The CA checks the user's identity
Then the CA signs the request, creating a certificate, and returns it to the user
– Certificates can be revoked: Certificate Revocation List (CRL)
The purpose of the certificates is described in the certificate policy (CP)
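The request, sign and revoke steps above can be sketched as a toy model. This is only an illustration of the workflow: an HMAC stands in for the CA's public-key signature, and the names `ToyCA`, `Certificate`, `issue`, `revoke` and `is_valid` are hypothetical, not part of any real PKI library.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Certificate:
    serial: int
    subject: str
    signature: str  # hex digest produced by the CA


@dataclass
class ToyCA:
    """Toy Certification Authority (assumption: HMAC replaces a real signature)."""
    secret: bytes
    next_serial: int = 1
    crl: set = field(default_factory=set)  # revoked serial numbers (the CRL)

    def _sign(self, serial: int, subject: str) -> str:
        message = f"{serial}:{subject}".encode()
        return hmac.new(self.secret, message, hashlib.sha256).hexdigest()

    def issue(self, subject: str, identity_checked: bool) -> Certificate:
        # The CA signs the request only after checking the user's identity.
        if not identity_checked:
            raise ValueError("identity not verified")
        serial = self.next_serial
        self.next_serial += 1
        return Certificate(serial, subject, self._sign(serial, subject))

    def revoke(self, cert: Certificate) -> None:
        # Revocation adds the serial number to the CRL.
        self.crl.add(cert.serial)

    def is_valid(self, cert: Certificate) -> bool:
        # Valid means: the signature matches and the serial is not on the CRL.
        expected = self._sign(cert.serial, cert.subject)
        return hmac.compare_digest(expected, cert.signature) and cert.serial not in self.crl


ca = ToyCA(secret=b"demo-only-secret")
alice = ca.issue("CN=Alice", identity_checked=True)
print(ca.is_valid(alice))  # valid right after issuance
ca.revoke(alice)
print(ca.is_valid(alice))  # no longer valid once it is on the CRL
```

In a real deployment the relying party would verify the signature with the CA's public key and fetch the CRL separately; here both checks are collapsed into `is_valid` for brevity.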
19
Information Systems
Provide information on:
– The Grid itself: the user may query the status and performance of the Grid
– Grid applications
Register and monitor resources
Standardization is required for interoperation among different grid projects:
– Globus: MDS (Monitoring and Discovery Service)
– European Data Grid: R-GMA (Relational Grid Monitoring Architecture)
– UNICORE: Incarnation Database (IDB)
20
Performance
Traditional performance measures:
– Speed
– Throughput
– Bandwidth
In Grid environments, it is necessary to measure:
– Allocation of resources to processes
– QoS
– Availability
22
Data Grid
A set of storage resources and data retrieval components that allows applications to access data by means of special software mechanisms
Data grid problems:
– Data location
– Replication
– I/O performance
23
Data Transfer
GridFTP: a protocol for secure data transfer in a grid environment
– Extends the FTP protocol
– Uses the Grid Security Infrastructure (GSI)
– Several storage systems provide GridFTP interfaces: Castor, EDG’s SRM
Reliable File Transfer (RFT): a Grid Service that provides interfaces to manage and monitor file transfers by using GridFTP servers
24
Data replication
Due to the complexity of a grid environment, keeping replicas of files can be advisable
Replicas need to be identified and located
Replica Location Service (RLS): a Grid Service for registering data replicas and discovering them later
– Mappings between logical and physical identifiers
– Database for metadata
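The mapping between logical and physical identifiers can be illustrated with a small in-memory catalogue. This is a sketch under stated assumptions, not the RLS interface: the class `ReplicaCatalogue`, its methods, and the `lfn:` and `gsiftp://` names in the usage are all hypothetical.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Hypothetical in-memory stand-in for a replica location service."""

    def __init__(self):
        self._replicas = defaultdict(set)  # logical name -> physical locations
        self._metadata = {}                # logical name -> metadata dict

    def register(self, logical_name, physical_url, **metadata):
        # One logical file may have many physical replicas.
        self._replicas[logical_name].add(physical_url)
        self._metadata.setdefault(logical_name, {}).update(metadata)

    def locate(self, logical_name):
        # Return every known replica; an unknown name yields an empty list.
        return sorted(self._replicas.get(logical_name, ()))

    def metadata(self, logical_name):
        return dict(self._metadata.get(logical_name, {}))


catalogue = ReplicaCatalogue()
catalogue.register("lfn:run42.dat", "gsiftp://se1.example.org/data/run42.dat", size_mb=120)
catalogue.register("lfn:run42.dat", "gsiftp://se2.example.org/store/run42.dat")
print(catalogue.locate("lfn:run42.dat"))
```

A client that only knows the logical name can then pick whichever replica is closest or least loaded.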
26
Resource management system
Resource Management includes the efficient use of computing and storage resources
– Processor time
– Memory
– Storage
– Network
User-transparent
Interacts with the rest of the Grid components
27
Job Execution
A job can be any kind of executable that requires CPU or storage
Resource manager:
– Resource brokering: find suitable resources
– Matchmaking: assign a job to a resource that satisfies the job requirements
– Job execution: execute the jobs and retrieve the output; error management
Job execution requires finding the right Computing Element
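Brokering and matchmaking can be sketched as a filter followed by a selection. The data model below (`ComputingElement`, `Job`, `free_cpus`, `software`) and the "most free CPUs" ranking rule are assumptions made for illustration; a real resource broker would query the information system and apply its own policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputingElement:
    name: str
    free_cpus: int
    software: frozenset


@dataclass(frozen=True)
class Job:
    executable: str
    min_cpus: int = 1
    required_software: frozenset = frozenset()


def find_candidates(job, elements):
    """Resource brokering: keep the elements that satisfy every job requirement."""
    return [
        ce for ce in elements
        if ce.free_cpus >= job.min_cpus and job.required_software <= ce.software
    ]


def matchmake(job, elements):
    """Matchmaking: pick one candidate (here, the one with most free CPUs) or None."""
    candidates = find_candidates(job, elements)
    return max(candidates, key=lambda ce: ce.free_cpus, default=None)


elements = [
    ComputingElement("ce-a", free_cpus=4, software=frozenset({"mpi"})),
    ComputingElement("ce-b", free_cpus=16, software=frozenset()),
    ComputingElement("ce-c", free_cpus=8, software=frozenset({"mpi"})),
]
job = Job("simulate", min_cpus=6, required_software=frozenset({"mpi"}))
chosen = matchmake(job, elements)
print(chosen.name if chosen else "no suitable Computing Element")
```

Returning `None` when nothing matches leaves error management (retry, queue, or reject) to the caller.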
28
Job submission
[Figure: job submission. Components: UI, Workload Manager (with RB storage), Replica Catalogue, Information Service, Computing Element, Storage Element. Arrow labels: Job, In/Output Sandbox files, Data Localization, Status, SE status, CE status, “Grid enabled” data transfers/accesses]
29
Intensive jobs
Used in parallel and distributed environments
– Parallel machines
– Clusters
A Grid is understood as a set of clusters or parallel machines
Possibility of executing MPI jobs
– MPICH-G2
– LAM-MPI 7.0.4
Matchmaking
– The resource broker must select nodes that have MPI installed and at least n CPUs
30
Job queue managers
Condor-G: Condor high-throughput computing project
– http://www.cs.wisc.edu
Portable Batch System (PBS)
Sun Grid Engine (SGE)
32
References
“The Grid: Blueprint for a New Computing Infrastructure”. I. Foster and C. Kesselman. Morgan Kaufmann. 1998.
“The Anatomy of the Grid: Enabling Scalable Virtual Organizations”. I. Foster, C. Kesselman and S. Tuecke. International Journal of Supercomputer Applications. 2001.
“The Globus Alliance”. http://www.globus.org
Post-XML Grids
34
Outline
Web services
– SOAP
– WSDL
– UDDI
Grid Services & OGSI
WS-RF
35
Web evolution
Past: Web of documents
– Static pages
– Web as a huge repository of information
– Technologies: HTTP + HTML
Present: Web of applications
– Pages dynamically generated by Web applications
– Applications export their interface to users through the Web
– Environment for commercial transactions (business to consumer, B2C)
– Technologies: CGI, ASP, PHP, JSP, servlets, ...
Future (and present): Web of services (functions/methods)
– “Libraries” offer services to programs (not to users)
– Web as a huge services API (Web of components)
– “Added value” enterprises (business to business, B2B)
– Distributed systems over the Internet
Web Service: RPC on the Web using XML
36
Web applications: Common scenario
Picture extracted from “Understanding Web Services”: http://www7.software.ibm.com/vad.nsf/Data/Document4362
37
Web Services: Common scenario
Picture extracted from “Understanding Web Services”: http://www7.software.ibm.com/vad.nsf/Data/Document4362
38
Web service
A module that exports a set of functions (methods) to applications through the Web, providing independence from hardware/software platforms
Similar to RPC or RMI, but integrated in the Web
Standardization managed by W3C:
– http://www.w3.org/2002/ws/
Questions:
– Transport protocol → HTTP
– Representation format → XML
– Communication protocol → SOAP
– IDL (Interface Definition Language) → WSDL
– Binding → UDDI
39
Transport protocol: HTTP
POST is used for the RPC request and answer
– Universally available
– It passes through firewalls
Request:
POST /~ssoo/consultaBD.cgi HTTP/1.0
Content-length: 76
.....................
DNI=87654321&MAT=980000&Asignatura=sod&Curso=2002&Convocatoria=Jun&Tipo=acta
Answer:
HTTP/1.1 200 OK
Content-Type: text/html; charset=iso-8859-1
.....................
<HTML>
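As a rough illustration of how the POST request above is put together, the sketch below form-encodes the same fields with the Python standard library and derives the Content-Length from the body. The `Content-Type` header is an assumption (the slide elides the remaining headers), and nothing is sent over the network.

```python
from urllib.parse import parse_qs, urlencode

# Form fields taken from the request body shown in the slide.
fields = {
    "DNI": "87654321",
    "MAT": "980000",
    "Asignatura": "sod",
    "Curso": "2002",
    "Convocatoria": "Jun",
    "Tipo": "acta",
}

# urlencode joins the pairs as key=value&key=value, escaping where needed.
body = urlencode(fields)

# Assemble the request text; Content-Type is an assumed header, not from the slide.
request = (
    "POST /~ssoo/consultaBD.cgi HTTP/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    f"{body}"
)
print(request)

# The server side would decode the body back into the same fields.
decoded = {key: values[0] for key, values in parse_qs(body).items()}
```

The computed body length is 76 bytes, which matches the Content-length header in the slide.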
40
Representation format: XML
RPC information coded in XML
– Flexible and powerful
– XML Schema allows data types to be defined precisely
– E.g., float GetLastTradePrice(string symbol);
Request:
<GetLastTradePrice>
  <symbol>DIS</symbol>
</GetLastTradePrice>
Answer:
<GetLastTradePriceResponse>
  <Price>34.5</Price>
</GetLastTradePriceResponse>
Schema:
<element name="GetLastTradePrice">
  <complexType><all>
    <element name="symbol" type="string"/>
  </all></complexType>
</element>
<element name="GetLastTradePriceResponse">
  <complexType><all>
    <element name="Price" type="float"/>
  </all></complexType>
</element>
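A minimal sketch of the same encoding in Python, using only `xml.etree.ElementTree`: one helper serializes the call `GetLastTradePrice(symbol)` and another decodes the answer into a float. The helper names `encode_request` and `decode_response` are hypothetical, and no schema validation is performed.

```python
import xml.etree.ElementTree as ET


def encode_request(symbol: str) -> str:
    """Serialize the call GetLastTradePrice(symbol) as an XML string."""
    root = ET.Element("GetLastTradePrice")
    ET.SubElement(root, "symbol").text = symbol
    return ET.tostring(root, encoding="unicode")


def decode_response(xml_text: str) -> float:
    """Extract the float result from a GetLastTradePriceResponse document."""
    root = ET.fromstring(xml_text)
    if root.tag != "GetLastTradePriceResponse":
        raise ValueError(f"unexpected element: {root.tag}")
    price = root.findtext("Price")
    if price is None:
        raise ValueError("missing Price element")
    return float(price)


print(encode_request("DIS"))
print(decode_response(
    "<GetLastTradePriceResponse><Price>34.5</Price></GetLastTradePriceResponse>"
))
```

The conversion to `float` mirrors the `type="float"` declaration in the schema; a validating parser would enforce it before the application sees the value.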
41
Communication protocol: SOAP
Simple Object Access Protocol (Candidate Recommendation)
SOAP = HTTP + XML
– It specifies how to send XML messages over HTTP
– It defines the message container (in XML)
– General protocol (not only for RPC)
Message container structure:
– Envelope: Header [optional] + Body
– Header: complementary information (e.g., in RPC, the transaction ID)
– Body: the original message
42
SOAP and RPC
Request:
POST /StockQuote HTTP/1.1
......................
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="http://example.com/stockquote.xsd">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Answer:
HTTP/1.1 200 OK
...............
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePriceResponse xmlns:m="http://example.com/stockquote.xsd">
      <Price>34.5</Price>
    </m:GetLastTradePriceResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
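The envelope structure of the request and answer above can be reproduced with the Python standard library. This is a sketch only: it builds and parses the XML but does not perform the HTTP exchange, and the function names `build_request` and `parse_response` are hypothetical. The namespace URIs are the ones used in the slide.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
SERVICE_NS = "http://example.com/stockquote.xsd"

# Ask ElementTree to serialize with the same prefixes as the slide.
ET.register_namespace("SOAP-ENV", SOAP_ENV)
ET.register_namespace("m", SERVICE_NS)


def build_request(symbol: str) -> str:
    """Wrap the RPC call in an Envelope/Body container."""
    envelope = ET.Element(
        f"{{{SOAP_ENV}}}Envelope",
        {f"{{{SOAP_ENV}}}encodingStyle": SOAP_ENC},
    )
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}GetLastTradePrice")
    ET.SubElement(call, "symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")


def parse_response(xml_text: str) -> float:
    """Unwrap the Body and return the price carried by the answer."""
    namespaces = {"soap": SOAP_ENV, "m": SERVICE_NS}
    envelope = ET.fromstring(xml_text)
    price = envelope.find("soap:Body/m:GetLastTradePriceResponse/Price", namespaces)
    if price is None:
        raise ValueError("no GetLastTradePriceResponse/Price in SOAP body")
    return float(price.text)


print(build_request("DIS"))
```

The optional Header element is omitted here because the example carries no complementary information such as a transaction ID.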
43
Service Definition: WSDL
Web Services Description Language
XML-based IDL for Web Services
A WSDL document describes the Web service:
– Data types (XML Schema)
– Exported functions and request/answer messages
– Protocols: usually SOAP over HTTP
– Service address → URL with server and “component”, e.g., http://www.stockquoteserver.com/StockQuote
Usually, it is generated automatically from the service code
44
UDDI
Universal Description, Discovery, and Integration
Distributed registry of web services offered by enterprises
It is accessed as a web service
Query by using different criteria:
– Activity, kind of service, geographical location
45
Web service registration
Picture extracted from “Understanding Web Services”: http://www7.software.ibm.com/vad.nsf/Data/Document4362
46
Information of a UDDI Registry
White pages: listing of organizations (contact information) and of the services provided by those organizations
Yellow pages: classifications of companies and Web Services according to taxonomies
Green pages: describe how a Web service can be invoked (pointers to service description documents). Usually stored outside the registry.
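The three kinds of registry information can be pictured as fields of one record with lookups over them. The sketch below is a toy in-memory registry, not the UDDI API: `ServiceEntry`, `ToyRegistry`, their methods, and the example organization and URL are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceEntry:
    organization: str      # white pages: who provides the service
    contact: str           # white pages: contact information
    categories: frozenset  # yellow pages: taxonomy classifications
    description_url: str   # green pages: pointer to the service description


class ToyRegistry:
    """Toy stand-in for a service registry queried by different criteria."""

    def __init__(self):
        self._entries = []

    def publish(self, entry: ServiceEntry) -> None:
        self._entries.append(entry)

    def find_by_organization(self, name: str) -> list:
        return [e for e in self._entries if e.organization == name]

    def find_by_category(self, category: str) -> list:
        return [e for e in self._entries if category in e.categories]


registry = ToyRegistry()
registry.publish(ServiceEntry(
    organization="Example Quotes Ltd",
    contact="info@quotes.example.org",
    categories=frozenset({"finance", "stock-quotes"}),
    description_url="http://quotes.example.org/StockQuote.wsdl",
))
for entry in registry.find_by_category("finance"):
    print(entry.organization, entry.description_url)
```

The registry stores only a pointer to the description document, which reflects the note that green-page details usually live outside the registry.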
Grid Services
OGSI
48
Generation Game
[Figure: increased functionality and standardization over time (adapted from Ian Foster GGF7 Plenary)]
Stages along the time axis:
– Custom solutions
– Globus Toolkit, Condor, Unicore: de facto standards (GridFTP, GSI) on top of X.509, LDAP, FTP, …
– Open Grid Services Architecture: Web services, app-specific services
Earlier generation:
– Computationally intensive
– File access/transfer
– Bag of various heterogeneous protocols & toolkits
– Monolithic design
– Recognised internet, ignored Web
– Academic teams
Later generation:
– Data and knowledge intensive
– Open services-based architecture
– Builds on Web services
– GGF + OASIS + W3C
– Multiple implementations
– Global Grid Forum
– Industry participation
49
Grid Services
Grid services were first introduced in “The Physiology of the Grid: An Open Grid Service Architecture for Distributed Systems Integration” by Foster et al.
“Grid Service: a Web service that provides a set of well-defined interfaces and that follows specific conventions”.
“The interfaces address discovery, dynamic service creation, lifetime management, notification, and manageability; the conventions address naming and upgradeability”.
50
Grid Services
The Physiology paper and its sister paper, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, were the first papers to discuss using Web services to build Grids.
They described an architecture built on special types of Web services, Grid services.
There is now an OGSA working group at the Global Grid Forum (GGF) trying to tie the various grid standards coming out of GGF into a coherent whole.
51
OGSA
Defined by the Global Grid Forum
Open Grid Services Architecture
– Grid Computing + Web Services
– Concepts of both technologies
52
OGSA
What does it provide?
– Distributed services among distributed, dynamic and heterogeneous virtual organizations (VOs)
To whom?
– Grid communities
– Web Services communities
53
OGSI
OGSI (Open Grid Services Infrastructure)
Formal and technical specification of what a Grid Service is.
Detailed specification of how Grid Services work.
54
Globus (GT3), OGSA and OGSI
Source: The Globus Toolkit 3 Programmer's Tutorial. Borja Sotomayor. http://www.casa-sotomayor.net/gt3-tutorial
55
Grid Services
A Web service with many extensions that make it suitable for grid-based applications
Main improvements:
– Stateful and potentially transient services
– ServiceData
– Notifications
– portType extension
– Lifecycle management
– GSH & GSR
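Two of the improvements listed above, service data and notifications, can be pictured as named state plus an observer mechanism. The sketch below is a conceptual illustration in plain Python, not OGSI or Globus Toolkit code; the class `NotifyingService` and its method names are hypothetical.

```python
class NotifyingService:
    """Hypothetical stateful service: named service data plus change notifications."""

    def __init__(self):
        self._service_data = {}  # service data name -> current value
        self._subscribers = {}   # service data name -> list of callbacks

    def subscribe(self, name, callback):
        # A client registers interest in one service data element.
        self._subscribers.setdefault(name, []).append(callback)

    def set_service_data(self, name, value):
        # Updating the state notifies every subscriber of that element.
        self._service_data[name] = value
        for callback in self._subscribers.get(name, []):
            callback(name, value)

    def find_service_data(self, name):
        # Clients can also query the state directly.
        return self._service_data.get(name)


service = NotifyingService()
service.subscribe("load", lambda name, value: print(f"{name} changed to {value}"))
service.set_service_data("load", 0.75)
```

Because the state outlives a single call, the service is stateful in the sense of the first bullet; lifecycle management would add creation and destruction of such instances.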
56
Writing a Grid Service
Define the service’s interface
– GWSDL
Implement the service
– Java
Define the deployment parameters
– WSDD
Compile everything and generate the GAR file
– Ant
Deploy the service
– Ant
Source: The Globus Toolkit 3 Programmer's Tutorial. Borja Sotomayor. http://www.casa-sotomayor.net/gt3-tutorial
57
Evolution of Grid Standards
OGSI drawbacks:
– Long and dense specification
– It does not work well with current Web Services tools
– Too object-oriented
WSRF & GT4
– http://www.globus.org/wsrf
– WSRF presented in January 2004
– GT4 is the current release
However, WSRF and OGSI are conceptually the same thing.
58
OGSI references
The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration – http://www.globus.org/research/papers/ogsa.pdf
The Anatomy of the Grid: Enabling Scalable Virtual Organizations – http://www.globus.org/research/papers/anatomy.pdf
Final OGSI Specification V1.0 – https://forge.gridforum.org/projects/ogsi-wg/document/Final_OGSI_Specification_V1.0/en/1
OGSI V1.0 Primer – https://forge.gridforum.org/projects/ogsi-wg/document/draft-ggf-ogsi-gridserviceprimer/en/1
From Open Grid Services Infrastructure to WS-Resource Framework: Refactoring and Extension – http://www.globus.org/wsrf/specs/ogsi_to_wsrf_1.0.pdf
A Grid Application Framework based on Web Services Specifications and Practices – http://www.neresc.ac.uk/ws-gaf/documents.html
GGF – http://www.ggf.org/
WS-RF: WS-Resources Framework
60
Grid and Web Services
[Figure: convergence of Grid and Web Services. Pre-XML: GT2. Post-XML: GT3 (OGSI), then GT4 (WS-RF)]
61
WS-RF: Web Services Resource Framework
WS-RF has effectively replaced OGSI since January 2004.
It addresses the issues with OGSI.
It doesn’t use inheritance; instead, portTypes are composed.
Simply a re-factoring of OGSI?
Instead of Grid service instances we have WS-Resources.
62
WS-Resource Counter
[Figure: a client calls createResource on a Web service, which creates Counter resources (counterID=1, counterID=2); the client then invokes add and Destroy on a specific resource through its WS-Addressing EPR (adapted from Marc McKeown Slides)]
63
“Implied” Resource Pattern
The WS-Resource definition codifies the relationship between Web services and stateful resources in terms of the implied resource pattern
– A set of conventions on Web services technologies that allow the state of a resource to be defined and associated with the description of a Web service interface.
WS-Addressing standardizes the endpoint reference construct used to represent the address of a Web service deployed at a given network endpoint.
– The client uses a WS-Addressing Endpoint Reference (EPR) to communicate with the WS-Resource.
– The EPR holds an identifier for the WS-Resource.
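The counter example and the implied resource pattern can be sketched in a few lines. This is a conceptual model only, not GT4 or WSRF code: `EndpointReference`, `CounterService`, and the address used below are hypothetical. The point it illustrates is that the service itself keeps no per-client state; each call names its resource through the EPR.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointReference:
    address: str      # network address of the Web service
    resource_id: int  # identifier of the WS-Resource behind that address


class CounterService:
    """Stateless front end; all state lives in the resources it manages."""

    def __init__(self, address: str):
        self.address = address
        self._resources = {}           # resource_id -> counter value
        self._ids = itertools.count(1)

    def create_resource(self) -> EndpointReference:
        resource_id = next(self._ids)
        self._resources[resource_id] = 0
        return EndpointReference(self.address, resource_id)

    def add(self, epr: EndpointReference, amount: int) -> int:
        # The resource is implied by the EPR, not passed as an explicit argument
        # of the operation itself; a destroyed resource raises KeyError.
        self._resources[epr.resource_id] += amount
        return self._resources[epr.resource_id]

    def destroy(self, epr: EndpointReference) -> None:
        del self._resources[epr.resource_id]


service = CounterService("http://counter.example.org/CounterService")
first = service.create_resource()
second = service.create_resource()
service.add(first, 5)
service.add(second, 2)
print(service.add(first, 1), service.add(second, 1))
service.destroy(first)
```

Each EPR behaves like a handle to an independent counter, which is the role Grid service instances played in OGSI.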
64
OGSI vs WS-RF
OGSI | WS-Resource Framework
GSR | WS-Addressing Endpoint Reference
GSH | WS-Addressing Endpoint Reference & WS-RenewableReferences
HandleResolver portType | WS-RenewableReferences
Service Data definition | Resource properties definition
GridService portType service data access | WS-ResourceProperties
GridService portType lifetime management | WS-ResourceLifetime
Notification portTypes | WS-Notification
Factory portType | (no counterpart listed)
ServiceGroup portTypes | WS-ServiceGroup
GWSDL | WSDL
Base fault type | WS-BaseFault
65
WSRF Implementations
Globus GT4 supports WSRF
WSRF.NET from University of Virginia
Python implementation from Lawrence Berkeley Laboratory
Java implementation from University of Indiana
Perl implementation from University of Manchester
66
WSRF References
Modeling Stateful Resources with Web Services – http://www.globus.org/wsrf
The WS-Resource Framework – http://www.globus.org/wsrf
From Open Grid Services Infrastructure to WS-Resource Framework: Refactoring and Extension – http://www.globus.org/wsrf
WSRF OASIS working group – http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsrf
WS-Notification OASIS working group – http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsn