APPLICATION-LEVEL CHECKPOINT-BASED
APPROACH FOR CRASH FAILURE IN
DISTRIBUTED SYSTEMS
Presented By
Moh Moh Khaing
OUTLINE
Abstract
Introduction
Objectives
Background Theory
Proposed System
System flow of proposed system
Two phases of proposed system
Implementation
Conclusion
ABSTRACT
Fault tolerance for computing node failures is an important and
critical issue in distributed and parallel processing systems.
As the number of computing nodes in a network grows concurrently and
dynamically, node failures occur more often.
This system proposes an application-level checkpoint-based fault
tolerance approach for distributed computing.
The proposed system uses coordinated checkpointing techniques
and systematic process logging as a global monitoring mechanism.
The proposed system is implemented on a distributed multiple
sequence alignment (MSA) application using a genetic algorithm
(GA).
DISTRIBUTED MULTIPLE SEQUENCE ALIGNMENT WITH
GENETIC ALGORITHM (MSAGA)
[Figure: The head node divides the input DNA sequences (2…n) among the worker nodes; each worker runs MSA with GA independently; the aligned sequence results are combined at the head node and the final result is displayed.]
SEQUENCE ALIGNMENT EXAMPLE
Input multiple DNA Sequences
>DNAseq1: AAGGAAGGAAGGAAGGAAGGAAGG
>DNAseq2: AAGGAAGGAATGGAAGGAAGGAAGG
>DNAseq3: AAGGAACGGAATGGTAGGAAGGAAGG
Output for aligned DNA Sequences
>DNAseq1: A-AGGA-AGGA-AGGAA-------GG-----AA-GGAAGG
>DNAseq2: ----------------AAGGAAGGAATGGAAGGAAGGAAGG
>DNAseq3: ----------------AAGGAACGGAATGGTAGGAAGGAAGG
NODE FAILURE CONDITION
A node failure condition occurs while the worker node connects to the
head node, accepts the input sequence, and sends the resulting
sequence back to the head node. The failure conditions are:
1. The worker node is denied as soon as it has connected to the head
node, without doing any job.
2. The worker node rejects the input sequence from the head node
after the head node and worker node have connected and the head
node has prepared the input sequence for the worker node.
3. The worker node sends a "No Send" message to the head node
instead of sending the result sequence back.
4. The worker node crashes when it cannot connect to the head node
with the correct address.
5. The worker node crashes when it disconnects from the head node.
COORDINATED CHECKPOINTING
Checkpointing is used as a fault tolerance mechanism in distributed
systems.
A checkpoint is a snapshot of the current state of a process and
assists in monitoring the process.
Coordinated checkpointing takes checkpoints periodically and
saves them in a log file.
This monitoring information is used when a node failure condition
occurs.
If a node failure occurs in distributed computing, another available
node can reconstruct the process state from the information saved
in the failed node's checkpoint.
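As a minimal sketch of this idea (not the authors' implementation; the function names and file name are assumptions), a process can periodically serialize its state, and after a failure a surviving node can reconstruct that state from the saved snapshot:

```python
import pickle
import time

def take_checkpoint(state, path):
    # Serialize the current process state together with a timestamp.
    snapshot = {"time": time.time(), "state": state}
    with open(path, "wb") as f:
        pickle.dump(snapshot, f)

def restore_checkpoint(path):
    # Reconstruct the process state saved by a (possibly failed) node.
    with open(path, "rb") as f:
        snapshot = pickle.load(f)
    return snapshot["state"]

# A surviving node resumes from the failed node's last snapshot.
take_checkpoint({"aligned": 12, "total": 40}, "worker1.ckpt")
state = restore_checkpoint("worker1.ckpt")
```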
SYSTEMATIC PROCESS LOGGING
Systematic Process Logging (SPL) is derived from a
log-based method.
The motivation for SPL is to reduce the amount of computation
that can be lost, which is bounded by the execution time of a
single failed task.
SPL saves the checkpoint information from the coordinated
checkpointing in log file format, with the exact time and the
contents.
Depending on the fault, it decides which node can accept
the job from the failed node using the stored log file.
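A hedged sketch of SPL-style logging (the record layout, function names, and file name are assumptions for illustration): each checkpoint event is appended with its exact time, and after a fault the log can be replayed to recover a worker's last known status:

```python
import json
import time

def log_event(logfile, worker_id, status, detail=""):
    # Append one checkpoint event with its exact time, SPL-style.
    record = {"time": time.strftime("%Y-%m-%d %H:%M:%S"),
              "worker": worker_id, "status": status, "detail": detail}
    with open(logfile, "a") as f:
        f.write(json.dumps(record) + "\n")

def last_status(logfile, worker_id):
    # Replay the log to recover a worker's most recent status after a fault.
    status = None
    with open(logfile) as f:
        for line in f:
            record = json.loads(line)
            if record["worker"] == worker_id:
                status = record["status"]
    return status

log_event("spl.log", "worker2", "Busy")
log_event("spl.log", "worker2", "Crash", "connection lost")
```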
PROPOSED FAULT TOLERANCE SYSTEM
The checkpoint-based fault tolerance approach is implemented at
the application layer without using any operating system support.
In the distributed multiple sequence alignment application, one head
node and one or more worker nodes are connected over a local
area network.
All worker nodes implement MSAGA and align the input
sequences from the head node independently.
The proposed fault tolerance system takes local checkpoints at
the MSA process of each computing worker node itself, and a
global checkpoint of all workers' condition events at the head node.
ARCHITECTURE OF PROPOSED FAULT TOLERANCE
SYSTEM
[Figure: The head node hosts the GRM and GCS; Workers 1–3 each hold an LC and LCS; all nodes are connected over a local area network.]
GRM – Global Resource Monitor
GCS – Global Checkpoint Storage
LCS – Local Checkpoint Storage
LC – Local Checkpoint
SYSTEM FLOW OF PROPOSED SYSTEM
[Figure: System flow — Start → Load Balancing Phase (GRM, HN, GCS) → Checkpointing Phase (WN, HN): coordinated checkpointing with the LC and GRM, then systematic process logging to the GCS and LCS → End.]
HN – Head Node
WN – Worker Node
IMPLEMENTATION OF HEAD NODE
Checkpointing Phase
The global resource monitor (GRM) plays the main role in
both the coordinated checkpointing phase and the systematic process
logging phase.
The GRM takes a global checkpoint of all worker nodes' events
in the coordinated checkpointing phase.
The GCS saves the global checkpoint information in log file
format in the systematic process logging phase.
GLOBAL CHECKPOINT
Global Resource Monitor (GRM)
Begin
1. Take global checkpoints of the current condition of each WN,
with the WN's IP, port, status, and time duration
2. Detect the failure condition of WNs
3. Find the available worker nodes and decide which node
is suitable for continuing the failed WN's jobs
End
TYPES OF CHECKPOINT
No | Checkpoint Name | Checkpoint Content
1  | Available       | Worker node is connected to the head node and waits for jobs from the head node
2  | Denied          | Worker node is disconnected from the head node
3  | Busy            | Worker node is processing the jobs
4  | Receive         | Worker node sends the result to the head node and exits, or sends an error message and exits
5  | Crash           | Worker node sends the crash message to the head node
CHECKPOINT INFORMATION
For each checkpoint, five fields are described:
Worker Type to show the worker number,
IP Address to identify the WN,
Checkpoint Name to show the worker node's condition,
Current Time to show the process's current time,
Time Duration to show the time from each worker's
running state to its accept/receive state, or from its
running state to its reject state.
Worker Type | IP Address | Checkpoint Name | Current Time | Time Duration
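The five fields above could be modeled as a simple record; this is a hypothetical sketch (the field names and types are assumptions, not the authors' code):

```python
from dataclasses import dataclass

@dataclass
class CheckpointRecord:
    worker_type: str       # worker number, e.g. "Worker 1"
    ip_address: str        # address identifying the WN
    checkpoint_name: str   # Available / Denied / Busy / Receive / Crash
    current_time: str      # time the checkpoint was taken
    time_duration: float   # seconds from running to accept/receive or reject

rec = CheckpointRecord("Worker 1", "192.168.1.10", "Available",
                       "10:15:02", 0.0)
```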
AVAILABLE CHECKPOINT OF ALL WORKERS
The GRM takes the checkpoint as Available when all worker nodes are
connected to the head node.
CHECKPOINT CHANGES FROM AVAILABLE
GlobalCheckpoint_Available ( )
Begin
1. IF HN and WNs are connected THEN
     GRM takes checkpoint as Available
   END IF
2. IF checkpoint is Available THEN
     IF WN is continuously connected to HN THEN
       HN selects a sequence and sends it to the WN
       IF WN does not accept the sequence THEN
         GRM takes checkpoint as Crash
         The sequence goes to the crash queue
       ELSE
         GRM takes checkpoint as Busy
         WN runs the MSA application
       END IF
     ELSE
       GRM takes checkpoint as Denied
     END IF
   END IF
End
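The transitions out of the Available state can be condensed into one function; this is an illustrative sketch of the pseudocode above (the function name and boolean parameters are assumptions):

```python
def next_checkpoint(connected, accepted_sequence):
    # From Available: Denied if the connection is lost, Crash if the WN
    # refuses the sequence (which is then re-queued on the crash queue),
    # otherwise Busy while the WN runs the MSA application.
    if not connected:
        return "Denied"
    if not accepted_sequence:
        return "Crash"
    return "Busy"
```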
DETECTING NODE FAILURE BY GRM
BUSY CHECKPOINT OF ALL WORKERS
CHECKPOINT CHANGES FROM BUSY
GlobalCheckpoint_Busy ( )
Begin
1. IF WN accepted the input sequence from HN THEN
     GRM takes checkpoint as Busy
   END IF
2. IF checkpoint is Busy THEN
     IF WN sends an error message to HN THEN
       GRM takes checkpoint as Receive for the error
     ELSE
       GRM takes checkpoint as Receive for the result
     END IF
   END IF
End
RECEIVE CHECKPOINT WITH RESULT
RECEIVE CHECKPOINT WITH NO SEND MESSAGE
GLOBAL CHECKPOINT STORAGE(GCS)
Global_Checkpoint_Storage ( )
Begin
1. GCS stores the current condition of all WNs in the network
   as checkpoints taken by the GRM
2. GCS records the detailed condition of each WN
3. GCS creates a log file for all checkpoints of the nodes
End
GCS LOG FILE
LOAD BALANCING PHASE
GRM_LoadBalancing( )
BEGIN
IF (GRM detects Denied, Crash, or Receive "No Send") THEN
1. It is assumed that this is a worker node failure.
2. The GRM finds an available node using the GCS and decides
   which node is suitable to receive the job.
3. If so, the HN sends the failed node's jobs to that available
   node.
4. Call the Available and Busy algorithms
END IF
END
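A hedged sketch of this reassignment step (the function name, the `gcs_view` dictionary shape, and the worker ids are assumptions): on a Denied, Crash, or "No Send" event, pick an Available worker from the GCS view and hand it the failed worker's job:

```python
FAILURE_STATES = {"Denied", "Crash", "No Send"}

def reassign(gcs_view, failed_worker):
    # gcs_view maps worker id -> latest checkpoint name from the GCS log.
    if gcs_view.get(failed_worker) not in FAILURE_STATES:
        return None  # not a failure condition; nothing to reassign
    for worker, status in gcs_view.items():
        if status == "Available":
            return worker  # this worker takes over the failed node's job
    return None  # no spare worker available

view = {"w1": "Busy", "w2": "Crash", "w3": "Available"}
target = reassign(view, "w2")
```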
LOAD BALANCING ACCORDING TO NODE FAILURE
AS DENIED CHECKPOINT
LOAD BALANCING ACCORDING TO NODE FAILURE
AS CRASH CHECKPOINT
LOAD BALANCING ACCORDING TO NODE FAILURE
AS RECEIVE CHECKPOINT(NO SEND)
IMPLEMENTATION OF WORKER NODE
The worker node executes the DNA sequences to form the aligned
sequence using the MSAGA application.
The worker node takes the local checkpoint at the application level
of MSAGA.
The worker node implements the checkpointing phase in the proposed
fault tolerance system.
The local checkpoint (LC) and the local checkpoint storage (LCS)
play the main role in that phase.
Every worker node makes its local checkpoint and has its own
local checkpoint storage.
The local checkpoint (LC) takes all checkpoints of each worker node.
The local checkpoint storage (LCS) stores the processing state of
one worker.
LOCAL CHECKPOINT
The local checkpoint (LC) is responsible for taking local checkpoints
of the worker process states.
The LC starts taking checkpoints of the worker's processing state
when the worker node (WN) connects to the head node.
The LC's responsibility lasts until all the worker's processes have
finished normally or the worker exits the local area network because
of node failure.
LOCAL CHECKPOINT OF EACH WORKER
LocalCheckpoint( )
BEGIN
1. Record the WN's starting time, ending time, and connection time
2. Record all process states of the MSA for the sequence
END
LOCAL CHECKPOINT STORAGE(LCS)
SPL produces the checkpoint log file and the processing log file for
the local condition of each node.
So all local checkpoint monitoring information is stored in the
local checkpoint storage (LCS).
The LCS is kept separately for each corresponding WN.

LocalCheckpointStorage( )
BEGIN
1. Store the WN's starting time, ending time, and
   connection time
2. Store all process states of the MSA for the sequence
END
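A minimal per-worker storage sketch (the class, method, and file names are assumptions, not the authors' code): each worker appends its start time, MSA process states, and end time to its own file:

```python
import time

class LocalCheckpointStorage:
    def __init__(self, worker_id):
        # One log file per worker, matching "stored by each corresponding WN".
        self.path = f"lcs_{worker_id}.log"

    def record(self, event, detail=""):
        # Append one timestamped local checkpoint entry.
        with open(self.path, "a") as f:
            f.write(f"{time.strftime('%H:%M:%S')} {event} {detail}\n")

lcs = LocalCheckpointStorage("worker1")
lcs.record("start", "connected to head node")
lcs.record("msa_state", "generation 50 of 200")
lcs.record("end", "result sent")
```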
LCS LOG FILE
CONCLUSION
The GRM does not record wrong checkpoints regardless of the number
of worker nodes.
The GRM can distinguish exactly between an old worker node and a
new worker node when a worker node reconnects to the head
node.
While the GRM takes the checkpoint for one worker node, the
remaining workers do not need to stop their operation; therefore,
worker nodes are never blocked.
This approach enables the distributed multiple sequence
alignment processing to continue and obtain the final
result when a node failure occurs within the network.
This system computes the exact time of each worker node and
the whole system's execution time. The system provides a portable
checkpoint feature and does not need any operating system
support.
THANK YOU!!