Multi Protocol Label Switching Recovery Mechanism

Sulalah Qais Mirkar, Dr. Vijay Thakurdas Raisinghani

School of Technology Management and Engineering (MPSTME), Dept. of IT,

NMIMS Deemed-to-be University
sulalah.mirkar@nmims.edu, [email protected]

Abstract-A circuit switched network can be used to send real time data like voice and video from sender to receiver. If a link on the path fails, data cannot be sent along that path. Packet switching, however, is resilient to such failures, since packets are treated individually in the network. If a failure occurs on the network path, packets can be rerouted to avoid the failure and communication is not interrupted. In this paper, we analyze different approaches to traffic rerouting in an MPLS domain that are resilient to failure. We propose a new model, Assured Quality-of-Service (ASQ), an improved mechanism with which traffic can be rerouted in the network as fast as possible. Our approach achieves a faster reroute of traffic on path failure than existing recovery models. Further, our approach requires fewer nodes on the backup path than the recovery models proposed by Makam or Haskin.

Index Terms-Multi Protocol Label Switching, Makam's Model, Recovery Path Mechanism, Control-driven mode, Data-driven mode.

I. INTRODUCTION

Multi Protocol Label Switching (MPLS) [1] is a technology which helps in Traffic Engineering and overcomes certain limitations of IP based networks. In MPLS, packets are forwarded using short labels instead of the entire IP address. Since forwarding is based only on label switching, packet forwarding is faster than in an IP network. Many approaches or models have been proposed in MPLS to move traffic from a faulty active path to a recovery path, such as Makam's Model [2], Haskin's Model [3] and the Fast Reroute one-to-one backup Model [4]. In Makam's Model, the working path is disjoint from the recovery path, which leads to delay in the Fault Indication Signal (FIS) reaching the upstream ingress node. Haskin's Model is also known as the Reverse Backup Model. In this model, if a fault is identified in the working path, traffic is reversed from the faulty node to the ingress node, so there is a high probability that the reversed traffic will change the order of packets. The Fast Reroute backup Model consumes a large amount of resources, such as bandwidth and routers. We observe that there are many challenges in the existing recovery models. Using our approach, we can overcome these challenges.

The entry point in an MPLS network is called the ingress point [5]. This ingress router is called a Label Switching Router (LSR). It labels the incoming packet and forwards it to the next LSR in the MPLS domain. A Label Switched Path (LSP) is set up through the LSRs using a signaling protocol, such as the Resource Reservation Protocol with Traffic Engineering (RSVP-TE) [1][6] or the Constraint-based Routing Label Distribution Protocol (CR-LDP) [1]. The LSP within the MPLS domain decides the route of the packet. A packet's label is removed by the Egress Router (Egress LSR) before the packet is sent to a non-MPLS router.

978-1-4799-3140-8/14/$31.00 2014 IEEE

The working path or active path is the path which is reserved for transmission of traffic. To protect against failure of the working path, a backup path or recovery path is reserved. The recovery path [5] can also be computed and newly set up when a failure is detected.

There are two main categories of recovery mechanism [2], depending on where the recovery is placed: Global repair and Local repair. Global repair protects against any link or node failure on a path or on a segment of the path. In global repair, the point of repair is usually distant from the failure and needs to be notified by an FIS. Its advantage is that all links and nodes on the working path are protected by a single recovery path. Its disadvantage is that the FIS has to be propagated all the way back to the ingress LSR before recovery can start. Local repair protects against a link failure or a neighbor node failure and minimizes the time required for failure propagation. In local repair, the fault indication signal does not have to propagate all the way back to the ingress router before recovery can start. Once the failure is detected, recovery can be started by the nearest LSR that detects the failure. Because of this, it is faster than global repair. Local repair can be set up in two different ways. i) Link Recovery: the goal is to protect a link in the working path from failure. ii) Node Recovery: the goal is to protect a node in the working path from failure. In local protection, only one segment of the working path is protected.

The rest of the paper is organized as follows. Section II focuses on related work in MPLS recovery. In Section III we propose our recovery approach, ASQ, and describe the simulations used to compare it with existing models. In Section IV we analyze the results. Section V concludes the paper.

II. RELATED WORK

There are two different ways of recovery: Recovery by Rerouting and Recovery by Protection Switching [8]. i) Recovery by Rerouting: a recovery path is established on demand after a fault occurs [6][8]. The recovery path can be based upon fault information, network routing policies and network topology information. ii) Recovery by Protection Switching: in protection switching [7], a recovery path is pre-computed and pre-established before a failure occurs on the working path. There are two subtypes of protection switching. The first is 1+1 ("one plus one") protection, wherein resources [2] (bandwidth, buffers and processing capacity) on the recovery path are fully reserved. This model is expensive, and the recovery path is reserved even when no failure occurs. The other type is 1:1 ("one for one") protection, wherein the resources [2] allocated on the recovery path are fully available to low priority traffic except when the recovery path is in use due to a fault on the working path. This model is less expensive than the 1+1 protection switching model.

Many approaches or models have been proposed to move traffic from the faulty active path to the recovery path. In Makam's Model [2], the working path is disjoint from the recovery path. If a fault occurs on any link of the working path, the FIS needs to be sent to the path switch LSR (PSL), or ingress LSR, to transfer traffic from the faulty working path to the recovery path. In Makam's Model, a completely disjoint backup path is established in advance. This saves the time taken for computing the backup path when failures occur. However, since the FIS has to travel from the failed node to the upstream ingress node (PSL), the delay in recovery initiation is large. If the sending rate from the source node is high, then the number of packets dropped is high.

In Haskin's Model [3], if a fault is identified in the current working path then traffic will be reversed from the faulty node to ingress upstream LSR. Reverse traffic works as a Fault Indication Signal. So it reduces the time required for the FIS to reach the ingress LSR. In this model, until the path switch LSR receives the reversed data packets, many packets may already have been sent on the failed working path and if transmission rate is high then number of packets dropped will be high. There is high probability that the reverse traffic will lead to a change in the order of packets, as compared to the original traffic.

In the Fast Reroute one-to-one backup model [4], every node in the working path has its own separate backup, called a detour LSP, which is computed in advance and reserved. Since each LSR has a backup, the delay in switchover is reduced whenever a fault occurs, because the FIS need not be sent to the ingress LSR. This model consumes a large amount of resources, such as bandwidth and routers, to connect with the backup path.

In the above section, we analyzed various recovery models for MPLS. We observed that the models have either long delay in recovery (Makam's Model [2]) or require large number of resources (Fast reroute model [4]) or cause packet reordering (Haskin's model [3]).

We define the problem for our work as follows:

The resources required for the backup path should be as low as possible, and at the same time the recovery model should enable quick switchover with low delay. Based on the above, we propose our recovery model below and discuss its salient features.

III. ASQ MODEL

We propose the ASQ model, in which a set of LSRs/links can be protected by a backup router.

General ASQ model: In the ASQ model, a backup path is set up for the entire working path. The backup path is disjoint from the working path. This backup path is connected with the working path after every n hops. On a working path failure, the nearest LSR which is connected to the backup path takes the switchover decision; the FIS is not sent to the ingress LSR. The connection with the backup path after every n hops ensures redundancy while requiring fewer resources. Further, since the nearest LSR takes the switchover decision, the switchover is faster than in other models. An example implementation is shown in Figure 1.

i) The backup path is set up using the LSRs 2, 6 and 10.

ii) LSR 1 is the path switch LSR, responsible for switching the traffic from the active path to the pre-established backup path; it is connected with LSR 2.

iii) In this case, every alternate LSR on the working path is connected with the recovery path.

iv) Alternate LSR is only used as an example; a set of links (more than two) could be protected by a single backup link.
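The switchover rule of the general ASQ model can be sketched in a few lines of Python. This is a minimal illustration and not the ns2 implementation used in the paper. It assumes that the upstream endpoint of the failed link detects the failure, and that the working path joins the backup path at the ingress and then at every n-th hop; the function names and the placement rule are hypothetical.

```python
def nearest_switch_point(working_path, failed_link, n):
    """Return (switch LSR, FIS hops) for a failure on failed_link.

    working_path: LSR ids from ingress to egress.
    failed_link:  (upstream, downstream) pair of adjacent LSRs.
    n:            the working path is joined to the backup path every n hops,
                  counted from the ingress (an assumed placement).
    """
    upstream = working_path.index(failed_link[0])
    if (upstream + 1 >= len(working_path)
            or working_path[upstream + 1] != failed_link[1]):
        raise ValueError("failed_link is not a link on the working path")
    # Closest LSR at or upstream of the detecting node that joins the backup path.
    switch_index = (upstream // n) * n
    return working_path[switch_index], upstream - switch_index


def fis_hops_to_ingress(working_path, failed_link):
    """Hops the FIS travels when only the ingress LSR can switch traffic."""
    return working_path.index(failed_link[0])


working = [1, 3, 5, 7, 9]
print(nearest_switch_point(working, (7, 9), n=2))  # (5, 1)
print(fis_hops_to_ingress(working, (7, 9)))        # 3
```

Under these assumptions, a failure on link 7-9 is handled one hop upstream at LSR 5, whereas a scheme that must notify the ingress needs three hops. The gap grows with the length of the chain, which is the effect the simulations in this paper measure.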

In Figure 1, the ASQ Model is shown with MPLS nodes 1-9, while nodes 0 and 11 are non-MPLS nodes. In the next section, the ASQ recovery model is compared with Makam's model. This comparison is done with 6-node, 12-node and 20-node chains in control-driven and data-driven mode. Control-driven and data-driven modes are explained in the next section.

Figure 1. ASQ Recovery Model

A. Simulation and Validation of ASQ model

We use ns2 for our simulations. We test our model with 6, 12 and 20 node chains in a working path. We assume a single intermediate link to connect to the backup path, except in the case of the 20-node chain. Fig. 2 is a nam file visualization in ns2 [9]. Nam is the Tcl-based animation tool that is used to visualize ns simulations and real world packet trace data. The nam file contains topology information such as nodes, links, queues and node connectivity. We implemented the ASQ model by editing testsuite-mpls.tcl, available in the Tcl examples in ns2.

2014 International Conference on Signal Propagation and Computer Technology (ICSPCT)

i) In the ASQ recovery model, node 0 is the source node and node 11 is the destination node.

ii) Node 1 is the ingress node or Path Switch LSR (PSL) and node 2 is the egress Path Merge LSR (PML).

iii) The path 1-3-5-7-9 is the working path or primary path, and the path 2-4-6-8-10 is the secondary path or backup path.

The simulation setup, shown in Figure 3, works as follows. Each MPLS node exchanges LDP (Label Distribution Protocol) mapping requests with its neighboring nodes, so each LSR receives the LDP mapping requests of its neighbors. When Node 0 sends an IP packet to Node 9 in the MPLS network, it sends an unlabeled packet (i.e. an IP packet in an Ethernet frame without an MPLS label). Node 1 is the ingress LSR; after verifying the destination IP address and other related information in the packet header, it pushes a label onto the packet and forwards the labeled packet to the output port. The Node 3 LSR receives the labeled packet from the Node 1 LSR. It examines the label and performs a look-up in its forwarding table to find a new label and the output port. Node 3 then swaps the old label with the new label and routes the newly labeled packet to the output port. The other LSRs perform similar tasks. The labeled packet reaches Node 9, the egress LSR. It examines the label and performs a look-up in the forwarding table to find that the packet is to be sent to non-MPLS Node 11. It then removes the label and sends the unlabeled packet to the destination node. When any link fails on the working path (or backup path), the downstream node sends a fault indication signal (FIS) to the nearest LSR connected to the backup path (or working path).
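The push, swap and pop steps described above can be traced with a small Python sketch. The label values and table layout below are made up for illustration; they are not taken from the ns2 scripts, and only the node numbers of the working path 1-3-5-7-9 come from the text.

```python
# Hypothetical label tables for the working path 1-3-5-7-9.
PUSH = {1: (100, 3)}   # ingress LSR -> (label to push, next hop)
SWAP = {               # transit LSR -> {incoming label: (outgoing label, next hop)}
    3: {100: (200, 5)},
    5: {200: (300, 7)},
    7: {300: (400, 9)},
}
POP = {9: {400: 11}}   # egress LSR -> {incoming label: non-MPLS next hop}


def forward(ingress):
    """Trace an unlabeled packet entering the MPLS domain at the ingress LSR."""
    label, node = PUSH[ingress]
    trace = [(ingress, "push", label)]
    while node in SWAP:
        # Transit LSR: look up the incoming label, swap it, forward.
        new_label, next_hop = SWAP[node][label]
        trace.append((node, "swap", new_label))
        label, node = new_label, next_hop
    # Egress LSR: remove the label and hand the IP packet to the non-MPLS node.
    destination = POP[node][label]
    trace.append((node, "pop", None))
    trace.append((destination, "deliver", None))
    return trace


for step in forward(1):
    print(step)
```

Each transit LSR consults only its own table keyed by the incoming label, which is why forwarding does not require a full IP address look-up inside the MPLS domain.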

In the ASQ Model, in the simulations, the center node of backup path is connected to the working path using a link. This will reduce the time required for the fault indication to reach the ingress LSR to notify about the failure in the current working path.

Figure 2. NAM trace file visualization of an ASQ Model with 3 node chain

We do a detailed comparison of Makam's Model and the ASQ Model using different scenarios with 6-node, 12-node and 20-node chains. The comparison considers two different modes of label distribution: control-driven mode and data-driven mode.

1) In the control-driven mode, LDP (Label Distribution Protocol) distributes messages between all MPLS nodes even if no packet transfer is taking place. LSPs are set up by sending a mapping message from each LDP agent to its neighbors. We show below the scenarios for control-driven mode.

Control-driven Mode: 6-node chain, 12-node chain and 20-node chain

a) 6-node chain

i) Link failure between LSR 1 and 3

As shown in Figure 3, there is a link failure in the working path and in the backup path, due to which two different fault indication signals propagate in the network. One FIS will be sent from LSR 1 and 3 to their neighboring LSRs and another will be sent by LSR 8 and 10 to their respective neighbors. Now the LSP between nodes 8 and 10 fails, so data packets will travel through the nodes 1-2-4-6-8-7-9-11-13-15-16.


Figure 3. 6-chain control-driven ASQ Model in 1-3 link failure

ii) Link failure between LSR 7 and 9

As shown in Figure 4, the LSP fails between LSR 7 and 9. LSR 7 is directly connected to the backup path. Therefore the FIS will reach LSR 8 immediately. On receiving the FIS, LSR 8 will send the mapping message to its respective neighbors and data packets will switch over from the working path to the recovery path. So the new path will be 1-3-5-7-8-10-12-14-13.

Figure 4. 6-chain Control-driven ASQ model in 7-9 link failure

b) 12-node chain

i) Link failure between LSR 5 and 7

Similarly, in the 12-node chain shown in Figure 5, where a link failure occurs between LSR 5 and 7, LSR 5 sends the FIS to the ingress LSR.

Figure 5. 12-chain Control-driven ASQ model after 5-7 link failure to backup path

ii) Link failure between LSR 23 and 25

Figure 6 shows the 12-chain ASQ model where a failure occurs between LSR 23 and LSR 25. LSR 23 will send the FIS to its neighboring LSR. Once LSR 13 receives the FIS, it will immediately transfer the traffic from the failed working path to the backup path.


Figure 6. 12-chain Control-Driven ASQ model after 23-25 link failure

c) 20-node chain

i) Link failure between LSR 5 and 7

Figure 7 shows the 20-chain ASQ model where a failure occurs between LSR 5 and LSR 7. LSR 5 will send the FIS to its neighboring LSR. Once LSR 1 receives the FIS, it will immediately transfer the traffic from the failed working path to the backup path.

Figure 7. 20-chain Control-driven ASQ model after 5-7 link failure

ii) Link failure between LSR 35 and 37

Similarly, Figure 8 shows the link failure between LSR 35 and LSR 37. The FIS is sent to LSR 21, which will switch over the traffic from the current working path to the recovery path.

Figure 8. 20-chain Control-driven ASQ model after 35-37 link failure

2) In data-driven mode, the label bindings are created by LDP (Label Distribution Protocol) between MPLS nodes when data packets arrive. Each data packet contains a source and a destination address. A multilayer switch constructs an LSP (label switched path) between nodes once it sees the first data packet in a traffic flow [10]. In this way the LSP is constructed from the source towards the destination.

The comparison between Makam's Model and the ASQ Model is done for the 6-chain, 12-chain and 20-chain in both control-driven and data-driven mode. In the next section, we present the graphs and analyze the results of the simulation.
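The difference between the two label distribution modes can be illustrated with a toy Python model. It is a simplified sketch, not LDP itself: the `Lsr` class, the per-destination binding table and the function names are all hypothetical, and the only point shown is when a binding comes into existence.

```python
class Lsr:
    """Toy LSR that keeps one label binding per destination."""

    def __init__(self, name):
        self.name = name
        self.bindings = {}
        self._next_label = 16  # label values 0-15 are reserved in MPLS

    def bind(self, destination):
        # Allocate a label the first time this destination is seen.
        if destination not in self.bindings:
            self.bindings[destination] = self._next_label
            self._next_label += 1
        return self.bindings[destination]


def control_driven_setup(lsrs, destinations):
    """Install bindings for every destination before any data is sent."""
    for lsr in lsrs:
        for destination in destinations:
            lsr.bind(destination)


def data_driven_forward(lsrs, destination):
    """Create bindings hop by hop only when a data packet arrives."""
    return [lsr.bind(destination) for lsr in lsrs]


control = [Lsr(n) for n in (1, 3, 5)]
control_driven_setup(control, ["node11"])
print([bool(lsr.bindings) for lsr in control])  # bindings exist before any packet

data = [Lsr(n) for n in (1, 3, 5)]
print([bool(lsr.bindings) for lsr in data])     # empty until the first packet
data_driven_forward(data, "node11")
print([bool(lsr.bindings) for lsr in data])     # created by the first packet
```

In this toy model the control-driven chain pays the setup cost up front, while the data-driven chain defers it to the first packet of a flow.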

IV. RESULTS AND ANALYSIS

A. Results

The results show the comparison between the Makam and ASQ Models for the three chain sizes (6, 12 and 20) in control-driven and data-driven mode. The simulation analysis is done for different stages after a failure in the working path. The stages are as follows:

a) Reception of first FIS at ingress router

b) First labeled packet on backup path

c) Labels are removed from the packet

d) Receive data packet at destination

i) Control-Driven Mode

a) Reception of first FIS at ingress router

Figure 9 shows the reception of the fault indication signal at the ingress LSR for the 6-chain, 12-chain and 20-chain. From Fig. 9, we observe that the FIS takes less time to reach the ingress router in the ASQ model than in Makam's model.

[...] Makam's model requires more time to switch from the faulty working path to the predefined backup path. The first FIS takes 28% more time to reach the ingress node, due to which packet transfer is delayed by 15% and packets from the recovery path to the destination are delayed by 51%, as compared to the ASQ model. In the other cases, ASQ also shows better performance than Makam. Table I below summarizes the results comparing the ASQ Model and Makam Model for control-driven mode.

TABLE I. COMPARISON SUMMARY FOR 20-CHAIN, CONTROL-DRIVEN MODE - ASQ VS MAKAM

No. | Metric for comparison | ASQ Model (milliseconds) | Makam Model (milliseconds)
1. | Early notification to ingress node | 0.372 ms | 0.479 ms
2. | Switching time | 0.415 ms | 0.481 ms
3. | Reception of data at destination | 0.554 ms | 0.841 ms

ii) Data Driven mode

Similarly, in the 20-node chain with Makam's Model, the FIS takes 28% more time to reach the ingress router, packet transfer on the recovery path is delayed by 16% and the first data packet takes 33% more time to reach the destination node, as compared to the ASQ model. In the 6 and 12 node chains, ASQ likewise shows better performance. Table II below summarizes the results comparing the ASQ Model and Makam Model for the 20-chain in data-driven mode.

ASQ performs better because it provides a connection from the working path to the backup path at intermediate nodes. This helps in quick switchover to the backup path on any link failure.

TABLE II. COMPARISON SUMMARY FOR 20-CHAIN, DATA-DRIVEN MODE - ASQ VS MAKAM

No. | Metric for comparison | ASQ Model observation (milliseconds) | Makam Model observation (milliseconds)
1. | Early notification to ingress node | 0.372 ms | 0.476 ms
2. | Switching time | 0.412 ms | 0.478 ms
3. | Reception of data at destination | 0.551 ms | 0.735 ms
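The relative delays quoted in the text can be recomputed from the 20-chain values in Tables I and II. The short Python check below is only a sanity calculation; the dictionary layout and function name are illustrative, and the percentage is taken relative to the ASQ value, which is an assumption about how the figures in the text were derived.

```python
# (ASQ ms, Makam ms) pairs for the 20-chain, read from Tables I and II.
TABLES = {
    "control-driven": {
        "early notification": (0.372, 0.479),
        "switching time": (0.415, 0.481),
        "data at destination": (0.554, 0.841),
    },
    "data-driven": {
        "early notification": (0.372, 0.476),
        "switching time": (0.412, 0.478),
        "data at destination": (0.551, 0.735),
    },
}


def extra_time_percent(asq_ms, makam_ms):
    """How much longer Makam's model takes, relative to the ASQ value."""
    return (makam_ms - asq_ms) / asq_ms * 100.0


for mode, rows in TABLES.items():
    for metric, (asq_ms, makam_ms) in rows.items():
        pct = extra_time_percent(asq_ms, makam_ms)
        print(f"{mode:15s} {metric:20s} {pct:5.1f}% more time in Makam")
```

Under this definition the values come out at roughly 28.8%, 15.9% and 51.8% for control-driven mode and 28.0%, 16.0% and 33.4% for data-driven mode, which is in line with the percentages reported in the text.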

V. CONCLUSIONS AND FUTURE WORK

Makam's Model has its own advantages and disadvantages. Its usage depends on the requirements of the network. If the Quality of Service requirement is low, it can be the better option, as it is less expensive than the other recovery models in an MPLS network. The ASQ Model is a better option if Quality of Service is of high priority. However, it will be more expensive than Makam's model, since multiple additional links are required to connect to the backup path. Our simulation results comparing Makam's Model and our ASQ Model show that in a large network with many node connections, the ASQ model shows better performance. This is because, in Makam's model, the time taken by the Fault Indication Signal to reach the Path Switch LSR is higher. This time increases as more nodes and links are added, and therefore the delays in Makam's model grow. In our ASQ Model, this problem has been addressed and the time taken by the Fault Indication Signal to reach the ingress router has been reduced, due to multiple connections between the working and backup paths. This also results in packets reaching their destination faster than in Makam's Model after a fault has occurred on the working path. Further evaluations are needed to compare the ASQ model with other recovery models and to evaluate its behavior under different traffic conditions.

REFERENCES

[1] Changcheng Huang, "Building Reliable MPLS Networks using a path protection Mechanism", Carleton University: IEEE Communications Magazine, pp. 156-162, 2002.

[2] S. Makam, V. Sharma, K. Owens, C. Huang, "Protection/Restoration of MPLS Networks", draft-makam-mpls-protection-00.txt, 1999.

[3] D. Haskin, R. Krishnan, "A Method of setting an alternative Label switched path to handle Fast Reroute", draft-haskin-MPLS-fastreroute-05.txt, 2000.

[4] K. Kompella, G. Swallow, "Detecting MPLS Data Plane Failures", draft-ietf-mpls-lsp-ping-06.txt, July 2004.

[5] Petersson, J. M., "based Recovery Mechanism", University of Oslo Master Thesis, 2005.

[6] S. Yoon, H. Lee, D. Choi, Y. Kim, "An Efficient Recovery Mechanism for MPLS-based Protection LSP", IEEE ICATM-2001, September 2001.

[7] V. Sharma, F. Hellstrand, "Framework for Multi-Protocol Label Switching (MPLS)-based Recovery", IETF, RFC 3469, February 2003.

[8] S. Veni and Dr. G. M. Kadhar Nawaz, "Protection Switching and Rerouting in MPLS", India: IEEE Conference, pp. 216-220, 2010.

[9] VINT project at LBL, Xerox PARC, UCB and USC/ISI, The Network Simulator NS2, http://www.isi.edu/nsnam/ns/ (accessed on 2013).

[10] Chuck Semeria. (1999, Sep 27). Juniper Networks [Online]. Available: http://mirror.unpad.ac.id/orari/library/library-ref-eng/ref-eng-3/network/mpls/200001.pdf
