
Page 1: Management and usage of large scale infrastructuresGeorges.Da-Costa/cours/liban/cours_02_grille.pdf · 5 Reduce the complexity Standardization bodies Use of open protocols High level

Management and usage of large scale infrastructures

[email protected]


Grid Computing and clouds

● Ian Foster on Grids : “Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”.

● Clouds: sharing of resources to achieve coherence and economies of scale, similar to a utility (like the electricity grid) over a network

Should be easy to use and could be easy to manage


User point of view


Several goals

● Low latency
– How long do I have to wait for my job to complete?

● High throughput
– How many jobs can I finish in a given timeframe?

● Low cost
– How much does it cost me?

● Low complexity
– How much time do I have to spend managing my jobs?
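
The first two goals can be made concrete with a small calculation. The following is only a sketch, assuming each finished job is recorded with a submission time and a completion time; the class and field names are invented for the illustration and do not belong to any grid or cloud API.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: names are assumptions, not part of any grid or cloud API. */
class JobMetrics {
    /** One finished job, with times in seconds since an arbitrary origin. */
    static final class Job {
        final double submitTime;
        final double endTime;

        Job(double submitTime, double endTime) {
            this.submitTime = submitTime;
            this.endTime = endTime;
        }
    }

    /** Latency seen by the user: mean time between submission and completion. */
    static double meanLatency(List<Job> jobs) {
        double total = 0.0;
        for (Job j : jobs) {
            total += j.endTime - j.submitTime;
        }
        return jobs.isEmpty() ? 0.0 : total / jobs.size();
    }

    /** Throughput: jobs completed per second inside the window [start, end). */
    static double throughput(List<Job> jobs, double start, double end) {
        int finished = 0;
        for (Job j : jobs) {
            if (j.endTime >= start && j.endTime < end) {
                finished++;
            }
        }
        return finished / (end - start);
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(new Job(0, 10), new Job(0, 30), new Job(5, 25));
        System.out.println("mean latency (s): " + meanLatency(jobs));
        System.out.println("throughput (jobs/s): " + throughput(jobs, 0, 40));
    }
}
```

The two numbers pull in different directions: packing more jobs into the window raises throughput, while each individual job may wait longer.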


Reduce the complexity

● Standardization bodies
● Use of open protocols
● High level QoS
● Abstraction levels
– Grid
  ● Everything is a resource
– Cloud
  ● PaaS, IaaS, SaaS

Services !!!


Frontends

● First contact
– Web site

● Dedicated
– Eclipse plugin
  ● Developers only
– Command line
  ● Experts only


Why make it simple?

● Framework for building submission websites


Cloud version : same complexity


Simple Job Submission

● Submit job to a GRAM service
– default factory EPR
– generate job RSL to default localhost

● Command example:

% globusrun-ws -submit -c /bin/touch touched_it
Submitting job...Done.
Job ID: uuid:002a6ab8-6036-11d9-bae6-0002a5ad41e5
Termination time: 01/07/2005 22:55 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.


Simple Job Submission : Cloud version


Security is complex

● In clouds
– Isolation and no sharing
– Delegated to other layers

● In grids
– Virtual organization
– Cooperation between sites
– Trust mechanisms


Grid Security Infrastructure (GSI)

● Based on certificates
● Several CAs (Certificate Authorities)
● Trust relations are inherited from the CA
● Communications are based on SSL
● Coarse grained
– Not suited to reading a few bytes in a file


Grid Security Infrastructure (GSI)


Timing and methodology

● Clouds
– Everything by hand, you get what you pay for
  ● PaaS / SaaS / IaaS
– Deployment/Development depends on what you buy

● Grids
– Standardized (everything is a resource)
– Can do everything, so everything is a pain


Example of Grid data communication

● Globus WSRF : Web Service Resource Framework

● Data access is a service


Provider point of view


Job flow in grids. Question: how many decisions?


Basic useful services

● VO Management Service: resource allocation to each Virtual Organization
● Resource Discovery and Management Service
● Job Management Service
● And much more: security (authentication, authorisation), data management…

● All services interact: for example, the Job Management Service needs Resource Discovery

● Interfaces to services need standardization. Example: JobSubmissionService has a submitJob() method
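
To illustrate why standardized interfaces matter, here is a minimal sketch in which job management depends only on a discovery interface. Only the names JobSubmissionService and submitJob() come from the slide; every other name, the parameters and the placement policy are assumptions made for the example, not an OGSA or Globus API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical interface: hosts that currently offer at least the requested CPUs. */
interface ResourceDiscoveryService {
    List<String> findHosts(int cpus);
}

/** Hypothetical signature; only the method name appears on the slide. */
interface JobSubmissionService {
    String submitJob(String command, int cpus);
}

/** In-memory discovery backed by a fixed table (host name -> free CPUs). */
class StaticDiscovery implements ResourceDiscoveryService {
    private final Map<String, Integer> freeCpus;

    StaticDiscovery(Map<String, Integer> freeCpus) {
        this.freeCpus = freeCpus;
    }

    @Override
    public List<String> findHosts(int cpus) {
        List<String> hosts = new ArrayList<>();
        for (Map.Entry<String, Integer> e : freeCpus.entrySet()) {
            if (e.getValue() >= cpus) {
                hosts.add(e.getKey());
            }
        }
        Collections.sort(hosts); // deterministic order for the example
        return hosts;
    }
}

/** Job management talks to discovery through the interface only. */
class SimpleJobManager implements JobSubmissionService {
    private final ResourceDiscoveryService discovery;
    final Map<String, String> placement = new HashMap<>(); // job id -> host

    SimpleJobManager(ResourceDiscoveryService discovery) {
        this.discovery = discovery;
    }

    @Override
    public String submitJob(String command, int cpus) {
        List<String> hosts = discovery.findHosts(cpus);
        if (hosts.isEmpty()) {
            throw new IllegalStateException("no host offers " + cpus + " CPUs");
        }
        String jobId = "uuid:" + UUID.randomUUID();
        placement.put(jobId, hosts.get(0)); // naive policy: first matching host
        return jobId;
    }
}

class GridServicesDemo {
    public static void main(String[] args) {
        Map<String, Integer> table = new HashMap<>();
        table.put("node-a", 2);
        table.put("node-b", 8);
        SimpleJobManager manager = new SimpleJobManager(new StaticDiscovery(table));
        String id = manager.submitJob("/bin/touch touched_it", 4);
        System.out.println(id + " runs on " + manager.placement.get(id));
    }
}
```

Because SimpleJobManager only sees ResourceDiscoveryService, another site could plug in its own discovery implementation without touching the job manager.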


Base infrastructure to implement the OGSA architecture?

OGSA: Open Grid Services Architecture

● The method invocation should also be standardized. CORBA? RMI? RPC? No: Web Services!!

● But we need stateful Web Services!

● WSRF: Web Services Resource Framework


The Web services WSDL/SOAP/HTTP pancake

In theory: extensible and generic.
In reality: complex and monolithic.


A closer look at Web service invocations

You don't have to program the stubs or the SOAP requests/responses.
Just like CORBA and RMI.


From stateless to stateful WS

Using the concept of resources


WS-Resources

Web Service + Resource = WS-Resource
To address these, we need an endpoint reference that specifies the resource.

Think how simple DNS or RmiRegistry are... Nope.


Specification, WSRF and more

● WS-ResourceProperties: defined in the WSDL interface
● WS-ResourceLifetime: manages the lifecycle of the WS-Resources
● WS-ServiceGroup: groups services or WS-Resources together
– allows finding the services of the group that match a particular property
– also allows addressing all services of the group through one entry point
● WS-BaseFaults: for fault reporting
● WS-Notification: producer/consumer model
● WS-Addressing: to address the WS-Resources


Grid middleware provides WS-R

Grid middleware IS WS-R


Writing a WSRF Web/Grid Service

Only five steps!

1. Define the service's interface. This is done with WSDL.
2. Implement the service. This is done with Java.
3. Define the deployment parameters. This is done with WSDD and JNDI.
4. Compile everything and generate a GAR file. This is done with Ant.
5. Deploy the service. This is also done with a GT4 tool.


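An example service interface

public interface Math {
    public void add(int a);        // adds a to the stored value
    public void subtract(int a);   // subtracts a from the stored value
    public int getValueRP();       // returns the Value resource property
}

In Java or IDL, the description is simple…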


WSDL service description

<?xml version="1.0" encoding="UTF-8"?>
<definitions name="MathService"
    targetNamespace="http://www.globus.org/namespaces/examples/core/MathService_instance"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://www.globus.org/namespaces/examples/core/MathService_instance"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:wsrp="http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceProperties-1.2-draft-01.xsd"
    xmlns:wsrpw="http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceProperties-1.2-draft-01.wsdl"
    xmlns:wsdlpp="http://www.globus.org/namespaces/2004/10/WSDLPreprocessor"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">

<wsdl:import
    namespace="http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceProperties-1.2-draft-01.wsdl"
    location="../../wsrf/properties/WS-ResourceProperties.wsdl" />

<!--==== P O R T T Y P E ==========-->
<portType name="MathPortType"
    wsdlpp:extends="wsrpw:GetResourceProperty"
    wsrp:ResourceProperties="tns:MathResourceProperties">
    <operation name="add">
        <input message="tns:AddInputMessage"/>
        <output message="tns:AddOutputMessage"/>
    </operation>
    <operation name="subtract">
        <input message="tns:SubtractInputMessage"/>
        <output message="tns:SubtractOutputMessage"/>
    </operation>
    <operation name="getValueRP">
        <input message="tns:GetValueRPInputMessage"/>
        <output message="tns:GetValueRPOutputMessage"/>
    </operation>
</portType>

<!--====== M E S S A G E S ======-->
<message name="AddInputMessage">
    <part name="parameters" element="tns:add"/>
</message>
<message name="AddOutputMessage">
    <part name="parameters" element="tns:addResponse"/>
</message>
<message name="SubtractInputMessage">
    <part name="parameters" element="tns:subtract"/>
</message>
<message name="SubtractOutputMessage">
    <part name="parameters" element="tns:subtractResponse"/>
</message>
<message name="GetValueRPInputMessage">
    <part name="parameters" element="tns:getValueRP"/>
</message>
<message name="GetValueRPOutputMessage">
    <part name="parameters" element="tns:getValueRPResponse"/>
</message>

<!--=== T Y P E S ========-->
<types>
<xsd:schema targetNamespace="http://www.globus.org/namespaces/examples/core/MathService_instance"
    xmlns:tns="http://www.globus.org/namespaces/examples/core/MathService_instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">

    <!-- REQUESTS AND RESPONSES -->
    <xsd:element name="add" type="xsd:int"/>
    <xsd:element name="addResponse">
        <xsd:complexType/>
    </xsd:element>
    <xsd:element name="subtract" type="xsd:int"/>
    <xsd:element name="subtractResponse">
        <xsd:complexType/>
    </xsd:element>
    <xsd:element name="getValueRP">
        <xsd:complexType/>
    </xsd:element>
    <xsd:element name="getValueRPResponse" type="xsd:int"/>

    <!-- RESOURCE PROPERTIES -->
    <xsd:element name="Value" type="xsd:int"/>
    <xsd:element name="LastOp" type="xsd:string"/>
    <xsd:element name="MathResourceProperties">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element ref="tns:Value" minOccurs="1" maxOccurs="1"/>
                <xsd:element ref="tns:LastOp" minOccurs="1" maxOccurs="1"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>

</xsd:schema>
</types>

</definitions>
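
For comparison with the WSDL, the state it describes (the Value and LastOp resource properties) fits in a few lines of plain Java. This is a sketch only: it does not use the GT4 classes, the MathResourceHome registry and its string keys are hypothetical stand-ins for what an endpoint reference would carry, the LastOp strings are arbitrary, and the interface is renamed MathPortType here simply to avoid clashing with java.lang.Math.

```java
import java.util.HashMap;
import java.util.Map;

/** Same three operations as the slide's Math interface. */
interface MathPortType {
    void add(int a);
    void subtract(int a);
    int getValueRP();
}

/** State of one resource: the two properties declared in the schema. */
class MathResource implements MathPortType {
    private int value = 0;           // "Value" resource property
    private String lastOp = "NONE";  // "LastOp" resource property (strings are illustrative)

    @Override
    public void add(int a) {
        value += a;
        lastOp = "ADDITION";
    }

    @Override
    public void subtract(int a) {
        value -= a;
        lastOp = "SUBTRACTION";
    }

    @Override
    public int getValueRP() {
        return value;
    }

    String getLastOp() {
        return lastOp;
    }
}

/** Hypothetical registry: resource key -> resource state. */
class MathResourceHome {
    private final Map<String, MathResource> resources = new HashMap<>();
    private int nextKey = 0;

    /** Creates a new resource and returns the key that identifies it. */
    String create() {
        String key = "math-" + (nextKey++);
        resources.put(key, new MathResource());
        return key;
    }

    /** Finds the resource a request is addressed to. */
    MathResource find(String key) {
        MathResource r = resources.get(key);
        if (r == null) {
            throw new IllegalArgumentException("unknown resource key: " + key);
        }
        return r;
    }
}

class MathDemo {
    public static void main(String[] args) {
        MathResourceHome home = new MathResourceHome();
        String first = home.create();
        String second = home.create();
        home.find(first).add(10);
        home.find(first).subtract(3);
        home.find(second).add(1);
        // Each key addresses its own state: the service logic is shared, the state is not.
        System.out.println(first + " -> " + home.find(first).getValueRP());
        System.out.println(second + " -> " + home.find(second).getValueRP());
    }
}
```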


From stateless to stateful WS

Using the concept of resources


If you are still alive, you still have to

● Actually write the code
● Configure the deployment with WSDD and JNDI
● Compile everything with the right libraries
● Generate a GAR file (Grid Archive)
● Deploy into a container

● And it was a simple stateless service!

Most people just run code and forget about services


Behind the scenes: how does it work?


Grids : Globus GRAM 4, everything is specified

[Figure: GRAM 4 architecture. Components shown: client; GRAM services, Delegation, RFT and GridFTP on the compute element and service host(s); sudo, GRAM adapter, local scheduler and user job on the compute element; GridFTP on the remote storage element(s). Labelled flows: job submit, delegate, xfer request, local job control, FTP control, FTP data.]


Clouds : OpenStack : somewhat specified

OpenStack : Communication and meta-data


Structure

● Monitoring
● Analyze
● Decision
● Implementation

MAPE-K loop

Conceptual view: in practice there are several cooperating decisions
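
The four steps can be read as one iteration of a control loop around shared knowledge (the K of MAPE-K, whose usual phase names are Monitor, Analyze, Plan, Execute). The sketch below is a toy autoscaler written for illustration: the thresholds, the three-sample window and all names are assumptions, not taken from any real manager.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative MAPE-K skeleton; thresholds and names are assumptions. */
class MapeKLoop {
    /** Knowledge shared by all phases. */
    static final class Knowledge {
        final List<Double> loadHistory = new ArrayList<>();
        int instances = 1;
    }

    enum Action { SCALE_UP, SCALE_DOWN, NONE }

    /** Monitor: record a raw measurement (average load per instance, 0 to 1). */
    static void monitor(Knowledge k, double load) {
        k.loadHistory.add(load);
    }

    /** Analyze: turn raw data into a metric, here the mean of the last three samples. */
    static double analyze(Knowledge k) {
        int n = k.loadHistory.size();
        int from = Math.max(0, n - 3);
        double sum = 0.0;
        for (int i = from; i < n; i++) {
            sum += k.loadHistory.get(i);
        }
        return n == 0 ? 0.0 : sum / (n - from);
    }

    /** Plan (the "Decision" step): compare the metric with two thresholds. */
    static Action plan(Knowledge k, double meanLoad) {
        if (meanLoad > 0.8) {
            return Action.SCALE_UP;
        }
        if (meanLoad < 0.2 && k.instances > 1) {
            return Action.SCALE_DOWN;
        }
        return Action.NONE;
    }

    /** Execute (the "Implementation" step): apply the decision. */
    static void execute(Knowledge k, Action a) {
        if (a == Action.SCALE_UP) {
            k.instances++;
        } else if (a == Action.SCALE_DOWN) {
            k.instances--;
        }
    }

    /** One full iteration of the loop. */
    static Action step(Knowledge k, double load) {
        monitor(k, load);
        Action a = plan(k, analyze(k));
        execute(k, a);
        return a;
    }

    public static void main(String[] args) {
        Knowledge k = new Knowledge();
        double[] samples = {0.9, 0.95, 0.9, 0.1, 0.1, 0.1};
        for (double s : samples) {
            System.out.println(step(k, s) + " -> " + k.instances + " instance(s)");
        }
    }
}
```

In a real infrastructure several such loops run at once (user side, provider side, per site), which is why the decisions have to cooperate.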


Monitoring

● Grid: integrated monitoring
– Ganglia
– NWS, Network Weather Service (adds prediction)
– Nagios

● Cloud
– Provider: integrated
– User: no access to provider data
  ● If you want something, deploy it


Monitoring example : Ganglia

● Goal: High performance

– Small messages to reduce network impact

– Hierarchical structure with aggregation nodes

– Scalability (a few thousand nodes)

● Several components

– XDR for portable non-intrusive communication

– RRDtool for data storage and manipulation

– XML for data format

● Open Source


Analyze

Metrics

Computed using raw data from monitoring

ex: Energy consumption

● Grid: usually performance

– How many jobs are running

– How many are waiting

– How far away are the deadlines

– Everything is at 100%

– Energy does (not) matter
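
As an example of a metric computed from raw monitoring data, energy can be obtained by integrating power samples over time. The sketch assumes the monitoring system delivers timestamped power readings in watts; the class and method names are made up for the illustration.

```java
/** Illustrative only: turns raw power samples into an energy metric. */
class EnergyMetric {
    /**
     * Integrates power over time with the trapezoidal rule.
     * timesSeconds[i] is the timestamp of powerWatts[i]; the result is in joules.
     */
    static double energyJoules(double[] timesSeconds, double[] powerWatts) {
        if (timesSeconds.length != powerWatts.length) {
            throw new IllegalArgumentException("one power sample per timestamp is required");
        }
        double energy = 0.0;
        for (int i = 1; i < timesSeconds.length; i++) {
            double dt = timesSeconds[i] - timesSeconds[i - 1];
            energy += 0.5 * (powerWatts[i] + powerWatts[i - 1]) * dt;
        }
        return energy;
    }

    /** Converts joules to kilowatt-hours (1 kWh = 3.6e6 J). */
    static double toKilowattHours(double joules) {
        return joules / 3.6e6;
    }

    public static void main(String[] args) {
        double[] t = {0, 60, 120};    // seconds
        double[] p = {100, 200, 100}; // watts
        double e = energyJoules(t, p);
        System.out.println(e + " J = " + toKilowattHours(e) + " kWh");
    }
}
```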


Analyze

Metrics

● Cloud
– Abstract "performance" does not exist: only users (QoS)
– Provider has an infrastructure point of view
  ● Unused resources
  ● Cost (electricity & management)
– Some classical metrics (Question: for whom?)
  ● Performance
  ● Energy
  ● Reliability
  ● Dynamism


Decision

● Grids: already said
– Most important: where and when to run tasks

● Clouds
– User: optimize QoS
  ● Start new instances
  ● Modify resource allocation of current instances
– Provider: save money (and electricity)
  ● Consolidation
  ● Switching servers on/off


Grid example: backfilling

Question: if 5 is longer, can we move 4?
What could be the negative impact?
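
One common variant, EASY backfilling, lets a waiting job jump ahead only if it does not delay the reservation made for the first job in the queue. The following is a simplified sketch of that test, not the scheduler drawn on the slide; the parameter names (shadowTime, extraNodes) are chosen for the illustration and runtimes are user estimates.

```java
/** Simplified EASY-backfilling test; all names are illustrative. */
class Backfilling {
    /**
     * Decides whether a waiting job may start now without delaying the head job.
     *
     * @param freeNow     nodes idle right now
     * @param now         current time
     * @param shadowTime  time at which the head job is reserved to start
     * @param extraNodes  nodes still idle at shadowTime once the head job has started
     * @param jobNodes    nodes requested by the candidate job
     * @param jobEstimate estimated runtime of the candidate job
     */
    static boolean canBackfill(int freeNow, double now, double shadowTime,
                               int extraNodes, int jobNodes, double jobEstimate) {
        if (jobNodes > freeNow) {
            return false; // does not fit in the current hole
        }
        boolean endsBeforeReservation = now + jobEstimate <= shadowTime;
        boolean fitsBesideHeadJob = jobNodes <= extraNodes;
        return endsBeforeReservation || fitsBesideHeadJob;
    }

    public static void main(String[] args) {
        // 10-node cluster: 6 nodes busy until t=100, head job needs 8 nodes,
        // so it is reserved at t=100 and leaves 2 nodes spare.
        System.out.println(canBackfill(4, 0, 100, 2, 3, 50));  // short job
        System.out.println(canBackfill(4, 0, 100, 2, 3, 200)); // long and too wide
        System.out.println(canBackfill(4, 0, 100, 2, 2, 200)); // long but narrow
    }
}
```

The test trusts the runtime estimate: if a backfilled job runs longer than announced, the reserved job can only start on time if the scheduler stops the overrunning job.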


Cloud example: steps for consolidation
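
The placement decision inside consolidation is a bin-packing problem. As a sketch, here is first-fit decreasing, a classic heuristic, used as a stand-in for whatever policy a real provider applies; it assumes a single resource dimension (for instance a CPU share per VM) and identical servers, and it ignores migration cost.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative consolidation step: first-fit decreasing bin packing. */
class Consolidation {
    /**
     * Packs VM loads onto as few servers as the heuristic finds.
     * Returns one list of VM loads per powered-on server; the rest can be switched off.
     */
    static List<List<Integer>> pack(int[] vmLoads, int serverCapacity) {
        int[] sorted = vmLoads.clone();
        Arrays.sort(sorted); // ascending; read backwards below for decreasing order
        List<List<Integer>> servers = new ArrayList<>();
        List<Integer> free = new ArrayList<>(); // remaining capacity per open server
        for (int i = sorted.length - 1; i >= 0; i--) {
            int load = sorted[i];
            if (load > serverCapacity) {
                throw new IllegalArgumentException("VM larger than a server: " + load);
            }
            int target = -1;
            for (int s = 0; s < servers.size(); s++) {
                if (free.get(s) >= load) { // first open server with enough room
                    target = s;
                    break;
                }
            }
            if (target < 0) { // nothing fits: power on one more server
                servers.add(new ArrayList<Integer>());
                free.add(serverCapacity);
                target = servers.size() - 1;
            }
            servers.get(target).add(load);
            free.set(target, free.get(target) - load);
        }
        return servers;
    }

    public static void main(String[] args) {
        int[] loads = {50, 20, 70, 30, 10, 20};
        List<List<Integer>> plan = pack(loads, 100);
        System.out.println(plan.size() + " server(s) needed: " + plan);
    }
}
```

Turning such a plan into reality still requires live migrations and the power-off itself, which is where latency and interruptions come in.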


Limits

● Consolidation
– Real servers don't switch off
– Service interruption (even if only a few ms)
– Isolation

● Scheduling in general
– Fairness
– QoS evaluation
– Multi-metrics for antagonistic objectives
  ● "Performance", Energy, Resilience, Dynamism

Question: How to manage reliability?


Execute

● User: depends on the application
– Reconfiguration
– Data migration (web server, database)
– Scalability of the application

● Provider
– Latency problems
  ● Switching a node on/off: ~ 1 min
– Scale problem
  ● Switching 1000 nodes on/off: power peaks


What about Peer to Peer ?


Control ?

● Several types of Peer to Peer systems
– Corporate
  ● Distributed file system
  ● Work stealing
– Cooperative
  ● Protein folding
  ● Bitcoin


Distributed Hash Table

● Main point of contact: DHT
● Manages meta-data
– File systems
● Manages all data
– Work sharing
● Several libraries
– Kademlia
– Chord
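
The core idea of a Chord-style DHT is that node names and keys are hashed onto the same ring and each key is stored on its successor node. The sketch below keeps the whole ring in one local TreeMap, so it only shows the placement rule; finger tables, routing between peers, joins and failures are left out, and the HashRing class is invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.TreeMap;

/** Chord-style key placement on a hash ring, without routing or failures. */
class HashRing {
    private final TreeMap<Long, String> nodes = new TreeMap<>(); // ring position -> node name

    /** Hashes a string to a non-negative ring position from the first 8 bytes of SHA-1. */
    static long position(String s) {
        try {
            byte[] d = MessageDigest.getInstance("SHA-1")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xffL);
            }
            return h & Long.MAX_VALUE; // clear the sign bit
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-1 is a mandatory JDK algorithm
        }
    }

    void addNode(String name) {
        nodes.put(position(name), name);
    }

    void removeNode(String name) {
        nodes.remove(position(name));
    }

    /** A key lives on its successor: the first node at or after it, wrapping around. */
    String lookup(String key) {
        if (nodes.isEmpty()) {
            throw new IllegalStateException("empty ring");
        }
        Long p = nodes.ceilingKey(position(key));
        return nodes.get(p != null ? p : nodes.firstKey());
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing();
        ring.addNode("node-a");
        ring.addNode("node-b");
        ring.addNode("node-c");
        System.out.println("file.txt is stored on " + ring.lookup("file.txt"));
    }
}
```

When a node leaves, only the keys it owned move to the next node on the ring; every other key keeps its owner, which is what makes the scheme attractive for decentralized systems.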


Comparison with Grids and Clouds

● More specific
– Toward simple data management
  ● Distributed file sharing
– Toward computation on simple data
  ● Protein folding
  ● Bitcoin
  ● Work stealing

● Some good properties
– Limited possibilities but simple to implement
– Decentralized

Question: Decisions?


Hype Cycle for Emerging Technologies, Gartner 2014


Bibliography

● The Grid 2: Blueprint for a New Computing Architecture. Ian Foster, Carl Kesselman

● The Globus Toolkit 4 Programmer's Tutorial. Borja Sotomayor

● A view of cloud computing. Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R., Konwinski, A., ... & Zaharia, M.

● OpenStack: toward an open-source solution for cloud computing. Sefraoui, Omar, Mohammed Aissaoui, and Mohsine Eleuldj

● Peer-to-peer computing. Milojicic, Dejan S., Vana Kalogeraki, Rajan Lukose, Kiran Nagaraja, Jim Pruyne, Bruno Richard, Sami Rollins, and Zhichen Xu