17830439 Ethernet and MPLS WP
7/28/2019 17830439 Ethernet and MPLS WP
Overview
This paper describes the Ethernet and Multi-Protocol Label Switching (MPLS) tools and procedures used to accomplish Operations, Administration, and Maintenance (OAM). This functionality addresses the fault management aspects of the Fault, Configuration, Accounting, Performance, Security (FCAPS) model as defined by the ITU-T Telecommunication Management Network (TMN), as shown in Figure 1.
Recent enhancements to Ethernet and MPLS have added carrier-class OAM features for monitoring, detecting, verifying, isolating, and repairing faults, with appropriate notifications to network administrators. These enhancements enable network operators to deploy timesaving, automated, self-healing practices, as well as on-demand diagnostics and troubleshooting techniques. The purpose of OAM is to improve revenue growth and profitability for service providers, as outlined in Figure 2.
This white paper describes the OAM features in the context of the objectives above, and the unique benefits of Ciena's solution.
OAM Process Flow
Figure 3 describes the service provider process flow when faults appear in the network, starting with the fault and ending after verification of the repair. Each step must be optimized to protect both the service provider and the subscriber.
Fault Detection
Fault detection includes mechanisms to detect faults at the device control plane or data plane level. Faults must be detected quickly enough to minimize Time to Recover (TTR). However, detection should be based on an observation window large enough to avoid false fault detections. For example, a control plane can become non-responsive for a few microseconds while handling a burst of interrupts. As long as the control plane is restored to a normal state within an acceptable time window, the network element does not experience a software failure. OAM handles a wide range of failure scenarios that vary in nature and location, from a software defect to a backhoe tearing apart a fiber conduit by mistake.
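The trade-off between fast detection and false detections can be illustrated with a small sketch. It assumes a hypothetical detector that polls the control plane and declares a fault only after a configurable number of consecutive missed responses; the class name, the `window` parameter, and the polling model are illustrative assumptions, not part of any Ciena product.

```python
from dataclasses import dataclass


@dataclass
class DebouncedDetector:
    """Declare a fault only after `window` consecutive failed health checks."""
    window: int            # hypothetical observation window, in polling periods
    misses: int = 0
    faulted: bool = False

    def observe(self, responsive: bool) -> bool:
        """Feed one health-check result; return the current fault state."""
        if responsive:
            # Any good sample resets the count and clears the fault.
            self.misses = 0
            self.faulted = False
        else:
            self.misses += 1
            if self.misses >= self.window:
                self.faulted = True
        return self.faulted


def run(detector: DebouncedDetector, samples: list) -> list:
    """Return the fault state after each sample."""
    return [detector.observe(s) for s in samples]


# A one-sample stall is ignored; three consecutive misses raise a fault.
transient = run(DebouncedDetector(window=3), [True, False, True, True])
sustained = run(DebouncedDetector(window=3), [True, False, False, False])
```

A larger `window` suppresses more transient stalls at the cost of a longer detection time, which is the balance described above.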
There are three major categories of failure:
> Link failure
> Service transport failure
> SLA failure
Ethernet and MPLS OAM: Operations, Administration, and Maintenance
White Paper
[Figure 1: the TMN layers (BML, SML, NML, EML, NEL) set against the FCAPS functions (Fault Management, Configuration Management, Accounting Management, Performance Management, Security Management), with the area covered by OAM]
Legend:
> NEL: Network Element Layer (devices)
> EML: Element Management Layer (device-level functions)
> NML: Network Management Layer (topology management)
> SML: Service Management Layer (Service Level Agreements (SLAs))
> BML: Business Management Layer (budgeting and billing)
Figure 1. FCAPS model
Objectives
Protecting revenue by preventing service outages and offering faster service restoration
Maximizing revenue growth by enabling richer service offerings
Reducing operational costs by cutting repair costs and operational overhead
Figure 2. OAM objectives
[Figure 3 flow: Fault -> Fault Detection -> Fault Notification -> Fault Verification -> Fault Isolation -> Repair -> Repair Verification]
Figure 3. OAM process flow
Link Failure
Link failure represents either the complete failure of a link or the performance of a link degrading below an acceptable level. The causes may include an optical transceiver failure at either end of the link, dust or other impurities in the connector, a fiber cut between the elements, or element failure at the other end of the link.
Service Transport Failure
Ethernet services can be transported natively, using Virtual Local Area Networks (VLANs) (IEEE 802.1Q) or stacked VLANs (802.1ad), or MPLS tunnels and MPLS Virtual Circuits (VCs). Each of these transport mechanisms can fail due to software failure, memory corruption, or simple misconfiguration.
Service Level Agreement Failure
The SLA describes the characteristics of the services provided by carriers to their subscribers. Adherence to the SLA can be measured using one or more of the following metrics:
> Frame Delay: delay experienced by the traffic carried by the service
> Frame Delay Variation: variation in that delay
> Frame Loss: percentage of frames passed through the service that were dropped by the network
> Service Availability: percentage of time when the service is available to the subscriber
Monitoring these SLA parameters provides indications of fault or performance issues. The Metro Ethernet Forum (MEF) and the ITU-T are defining standards for performance management of Ethernet services. This white paper focuses on the fault management aspect of SLA failures. SLA failures can be caused by link failures, such as a failing optical transceiver resulting in partial packet loss, or a service transport failure, such as a software failure leading to incorrect forwarding tables.
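As a rough illustration, the four SLA metrics listed above can be computed from per-interval measurements. The function names are hypothetical, and the frame delay variation formula used here (mean absolute difference between consecutive delay samples) is only one possible definition, chosen as an assumption for this sketch.

```python
from statistics import mean


def frame_loss_pct(sent: int, received: int) -> float:
    """Percentage of frames sent into the service that were dropped."""
    if sent == 0:
        return 0.0
    return 100.0 * (sent - received) / sent


def frame_delay_ms(delays_ms: list) -> float:
    """Mean delay over the measurement interval."""
    return mean(delays_ms)


def frame_delay_variation_ms(delays_ms: list) -> float:
    """Assumed definition: mean absolute difference of consecutive delays."""
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return mean(diffs) if diffs else 0.0


def availability_pct(available_s: float, total_s: float) -> float:
    """Percentage of the interval during which the service was available."""
    return 100.0 * available_s / total_s


# Hypothetical samples collected over one measurement interval.
delays = [10.0, 12.0, 11.0, 15.0]
summary = {
    "loss_pct": frame_loss_pct(sent=1000, received=990),
    "delay_ms": frame_delay_ms(delays),
    "delay_variation_ms": frame_delay_variation_ms(delays),
    "availability_pct": availability_pct(available_s=3564, total_s=3600),
}
```

Comparing each value in `summary` against the thresholds written into the SLA would then flag an SLA failure.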
Fault Notification
Once detected by the network element layer, the fault needs to be conveyed to the entities that will work toward repairing the fault. Such entities can require either human or automated servicing, such as the manual replacement of a faulty transceiver, or a Rapid Spanning Tree Protocol (RSTP) reconvergence after a link failure, respectively. In any case, fault notification should be:
> Responsive: the time saved will protect revenue and may avoid penalties.
> Meaningful: a mere "link down" Simple Network Management Protocol (SNMP) trap sent when an optical transceiver fails is insufficient. A trap containing information regarding the faulty transceiver and the reason for the failure reduces troubleshooting cost.
Ciena's Carrier Ethernet Service Delivery (CESD) switches are optimized to enable network reconvergence below 50 ms. These enhancements allow Ethernet service-delivery networks based on Ciena products to support critical, time-sensitive applications with the same SLAs and guarantees of SONET/SDH optical rings. This level of performance is achieved, in part, by providing high-priority, interrupt-based failure detection, shielding services from link-level failures.
Ciena's True Carrier Ethernet™ offerings are the only access/metro edge solutions that enable service providers to deploy any mix of Ethernet and MPLS-based service transports over a common infrastructure. This allows service providers to migrate easily from Ethernet to MPLS access deployments and extend the services and capabilities of an MPLS core network directly to subscribers, with no additional capital investment required.
Ciena, through the early adoption of IEEE 802.1ag Connectivity Fault Management (CFM), provides VLAN-based service transport OAM. The combination of Label Switched Path (LSP) ping, LSP traceroute, Virtual Circuit Connection Verification (VCCV), Bi-directional Forwarding Detection (BFD), and Fast ReRoute (FRR) provides comprehensive MPLS-based service transport OAM.
Ciena's CESD switches offer intelligent classification and queue servicing, which minimizes frame delay and variation. In addition, Ciena provides a unique set of self-healing techniques at the link and service transport layers, to minimize SLA failures relating to frame loss and service availability.
> Concise: sending multiple traps with redundant failure information will obfuscate the real cause of the failure and slow down the fault isolation step.
Fault Verification
After notification, the Network Operation Center (NOC) engineer should verify the fault, and determine whether the condition persists. By the time the "link fail" indication is received, the Ethernet network will have reconverged.
Under most conditions, failover and restoration with Ciena's Carrier Ethernet Service Delivery devices takes less than 50 ms. Fault verification using on-demand OAM techniques eliminates false failure indications. Not verifying the validity of the fault could lead the network operator to try to isolate a failure that does not exist.
Fault Isolation
Fault isolation consists of determining the exact source, location, and nature of the fault, including the specific network element(s) and network layer(s) experiencing the fault. A failure at a low level may impact higher levels and lead to additional failures. For example, a link failure can lead to broken MPLS tunnel connectivity, also impacting all of the MPLS VCs that tunnel carries.
Notification of a low-level failure can be followed or surrounded by higher-level failure notifications. This process makes fault isolation more difficult, time-consuming, and costly. Features such as alarm correlation help minimize the cost of isolating a fault by decreasing the number of fault notification messages.
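Alarm correlation of the kind described above can be sketched as follows. The dependency map, entity names, and functions are hypothetical; the sketch simply groups every alarm under the lowest-level alarmed entity it depends on, so that one link failure yields one entry rather than several. It does not represent the behavior of Ciena's Ethernet Services Manager.

```python
# Hypothetical dependency map: each entity names the lower-level entity
# that carries it (VCs ride on a tunnel, the tunnel rides on a link).
DEPENDS_ON = {
    "vc-a": "tunnel-1",
    "vc-b": "tunnel-1",
    "tunnel-1": "link-3",
}


def root_cause(entity: str, alarmed: set) -> str:
    """Walk down the dependency chain to the lowest alarmed entity."""
    cause = entity
    current = entity
    while current in DEPENDS_ON:
        current = DEPENDS_ON[current]
        if current in alarmed:
            cause = current
    return cause


def correlate(alarms: list) -> dict:
    """Group alarms by root cause so the operator sees one entry per fault."""
    alarmed = set(alarms)
    grouped = {}
    for entity in alarms:
        grouped.setdefault(root_cause(entity, alarmed), []).append(entity)
    return grouped


# Four notifications caused by a single link failure collapse to one group.
report = correlate(["vc-a", "link-3", "vc-b", "tunnel-1"])
```

Here `report` has a single key, `"link-3"`, pointing the operator at the link rather than at the tunnel or the VCs.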
Repair
Depending on the efficiency of the OAM process, repair and preventative maintenance can occur at different stages:
> After the fault impacts the service. Time-to-repair is most critical, as the network operator needs to remedy the problem quickly to restore the service. Ciena's True Carrier Ethernet solutions provide modularity in the network elements, enabling the network operator to change only the failed element, saving time and eliminating impacts to other services. For example, risk of error is eliminated because the failure of a hot-swappable transceiver does not require the replacement and re-cabling of the entire network element.
> Before the fault impacts the service. Redundancy enables proactive maintenance, significantly reducing service outage times. Ciena's modular solution, coupled with redundant links, control modules, power supplies, and fans, allows non-invasive repair of network components, protecting the services the components carry. For example, the failure of a redundant control module will lead only to non-invasive switchover to the standby module.
> Before the fault leads to an element or network failure, such as a performance degradation scenario. By continuously monitoring key metrics relating to element and network health, service providers can schedule maintenance preemptively, thereby using fewer resources.
Repair Verification
After a remedy is enacted, the same on-demand OAM mechanisms used during fault verification confirm that the fault no longer exists. An IP ping, for example, can be used both to verify an IP connectivity fault on the control plane and to confirm that connectivity has been restored.
Ciena provides a comprehensive solution for optimum fault notification, including high-priority generation of SNMP traps with a content focused on failure source. In addition, Ciena's Ethernet Services Manager (ESM) solution offers alarm correlation capabilities enabling network operators to associate alarms to more quickly isolate the cause of the fault.
Ciena offers a complete on-demand OAM solution, enabling the network operator to conduct layer-by-layer fault isolation (link, service transport, and SLA layers). Figure 4 shows the extent of the various OAM mechanisms useful for isolating faults.
[Figure 4: the Service Agreement Layer, Service Transport Layer, and Link Layer shown for Ethernet and MPLS]
Figure 4. Major network fault categories
OAM Protocols
With the addition of comprehensive OAM capabilities, Ethernet and MPLS offer a complete feature set that allows carriers to maximize Ethernet-based service revenue. IEEE, IETF, ITU-T, and MEF now describe mechanisms that report the status of a given end-to-end service, representing a subscriber-centric view of the network, and provide link connectivity information, representing a provider-centric view of the network. Figure 5 offers a high-level view of these mechanisms against the OAM process flow and different failure categories.
IEEE 802.3ah Ethernet in the First Mile (EFM) OAM
EFM OAM, described in Figure 6, provides link-layer mechanisms that complement applications that may reside in higher layers (such as IEEE 802.1ag or MEF Service OAM). EFM OAM, also called link OAM, encompasses a simple protocol that operates across a single link.
Thresholds are configured to monitor signal degradation, such as frame errors. Messages are passed across the link to communicate statistics regarding link health. When a failing link is detected, SNMP communicates this to management stations. In addition, the link may be taken out of service and placed in remote loopback mode for fault isolation. Prior to placing a link in service, EFM OAM may be used to test the performance of the link. Once verified to be operational and error-free, the link is taken out of remote loopback and placed in service. Standby links may be tested continuously prior to being activated by protocols such as IEEE 802.1w RSTP or IEEE 802.1aq Shortest Path Bridging.
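Threshold-based link monitoring can be pictured with a minimal sketch. It assumes a hypothetical monitor that sums errored-frame counts over a sliding window of measurement intervals and signals an event when the sum reaches a configured threshold; the class, the window model, and the parameter names are illustrative and are not taken from the IEEE 802.3ah specification.

```python
from collections import deque


class ErroredFrameMonitor:
    """Signal an event when errored frames in a sliding window reach a threshold."""

    def __init__(self, window: int, threshold: int):
        # Per-interval errored-frame counts; old intervals fall off the left.
        self.samples = deque(maxlen=window)
        self.threshold = threshold

    def add_interval(self, errored_frames: int) -> bool:
        """Record one interval; return True if the threshold is crossed."""
        self.samples.append(errored_frames)
        return sum(self.samples) >= self.threshold


# Hypothetical link: a burst of errors in the fourth interval trips the event,
# which clears once the burst leaves the three-interval window.
monitor = ErroredFrameMonitor(window=3, threshold=5)
events = [monitor.add_interval(n) for n in [1, 1, 1, 4, 0, 0, 0]]
```

In a real deployment, a `True` result would be the point at which the failing link is reported to management stations.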
IEEE 802.1ag Connectivity Fault Management
Building upon IEEE 802.3ah EFM OAM, IEEE 802.1ag CFM specifies capabilities for detecting, isolating, and reporting connectivity faults for VLAN-based service transport networks. CFM operates at both the physical and logical levels, monitoring and troubleshooting faults. For instance, CFM can monitor physical links between adjacent or distant devices. In addition, fault monitoring between two end-points can be configured based on a logical network layer (such as per-VLAN). Key CFM features are shown in Figure 7.
The CFM protocol, often called Ethernet OAM, sends heartbeat-style Continuity Check Messages (CCMs). Failure to receive these messages, in order, in a certain amount of time indicates one or more possible network errors, including path or device failure or network configuration problems. Management stations monitor the status of the reception of CCMs and take appropriate action.
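The CCM mechanism can be sketched as a small receiver-side monitor. The class and its fields are hypothetical; the loss multiplier of 3.5 CCM intervals is the value commonly associated with IEEE 802.1ag and is treated here as an assumption rather than a statement about any particular implementation.

```python
from dataclasses import dataclass

# Assumption: continuity is declared lost after 3.5 CCM intervals of silence.
LOSS_MULTIPLIER = 3.5


@dataclass
class CcmMonitor:
    interval_s: float        # configured CCM transmission interval
    last_rx_s: float = 0.0   # arrival time of the most recent CCM
    last_seq: int = -1       # sequence number of the most recent CCM

    def receive(self, now_s: float, seq: int) -> bool:
        """Record a CCM; return True if it arrived in sequence."""
        in_order = self.last_seq < 0 or seq == self.last_seq + 1
        self.last_rx_s = now_s
        self.last_seq = seq
        return in_order

    def continuity_lost(self, now_s: float) -> bool:
        """True once no CCM has arrived within the loss window."""
        return now_s - self.last_rx_s > LOSS_MULTIPLIER * self.interval_s


# Hypothetical remote end-point sending one CCM per second.
mon = CcmMonitor(interval_s=1.0)
first = mon.receive(0.0, seq=0)
second = mon.receive(1.0, seq=1)
skipped = mon.receive(2.0, seq=3)      # sequence 2 never arrived
still_up = not mon.continuity_lost(5.0)
lost = mon.continuity_lost(5.6)
```

An out-of-sequence CCM and a silent period are reported separately, mirroring the "in order" and "in a certain amount of time" conditions in the paragraph above.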
[Figure 5: matrix of OAM mechanisms (MEF Service OAM, VCCV/BFD, LSP ping, IEEE 802.1ag CFM/ITU-T Y.1731, IEEE 802.3ah EFM OAM, SNMP, MSTP) against the OAM process steps (Fault Detection, Fault Notification, Fault Verification, Fault Isolation, Repair, Repair Verification) for the Service Level Agreement, Service Transport, and Link failure categories]
Figure 5. OAM protocols matrix
IEEE 802.3ah EFM OAM
Features and benefits:
> Auto-discovery: Eliminates the need for operator configuration
> Uni-directional Fault Signaling: Enables the detection of a one-way link failure
> Remote Loopback: Provides on-demand link diagnostics, including bit-error rate approximation
> Link Monitoring: Offers proactive, traffic-based threshold link monitoring
> Critical Events: Supports communication of network element conditions that may cause link failure, including power and temperature
> Layer 2 Variable Retrieval: Allows supplemental link statistics collection, augmenting SNMP
> Organization-specific Extensions: Enables standards development organizations and vendors to expand scope
Figure 6. IEEE 802.3ah EFM OAM
IEEE 802.1ag CFM
Features and benefits:
> Continuity Check: Continuously verifies VLAN connectivity and may indicate network faults or misconfigurations
> Loopback Request (MAC ping): Offers on-demand or proactive indication of VLAN control-plane responsiveness
> Linktrace Request (MAC traceroute): Provides on-demand or proactive VLAN topology information
Figure 7. IEEE 802.1ag CFM
Troubleshooting tools are provided in the form of Media Access Control (MAC) ping (formally known as IEEE 802.1ag Loopback Request) and MAC traceroute (formally known as IEEE 802.1ag Linktrace Request). Network operators may initiate these features, or the features may run automatically as monitoring functions in background processes.
Since CFM was developed after completion of the IEEE 802.1ad Provider Bridges protocol, a second important aspect of the project allows multiple nested Maintenance Domains (MDs) to coexist on the same physical network, each potentially managed by a different administrative organization (service provider or network operator).
ITU-T Y.1731
ITU-T Study Group 13 developed Y.1731 in cooperation with IEEE 802.1ag CFM, further defining VLAN-based service transport OAM functionality. Several additional features offer performance monitoring capabilities. ITU-T Y.1731 and CFM use an identical frame format and share the same operation code (OpCode) space. As a result, these complementary protocols are simpler to deploy in a service provider's network. Figure 8 provides a summary of the features contained in Y.1731.
VLAN-based service transport networks configure certain network elements as Maintenance End Points (MEPs). These MEPs sit at the boundaries of Ethernet domains. Figure 9 shows the span of the different OAM mechanisms offered by Y.1731.
MPLS
MPLS deployed to the customer premises facilitates the interconnection of the access infrastructure with the existing MPLS core network, while increasing the need for MPLS-specific OAM tools. MPLS OAM features and basic MPLS constructs are shown in Figures 10 and 11.
LSP Ping
LSP ping is an in-band, on-demand mechanism to verify the status of an MPLS tunnel. An LSP can fail because of misconfigurations (such as disabled MPLS, mismatched labels, or routing into the wrong tunnel), broken Label Distribution Protocol (LDP) adjacencies, corruption of Forwarding Information Bases (FIBs), or other software/hardware failures. LSP ping sends an echo request to a target Label Switch Router (LSR) using MPLS addressing. To prevent the IP packet from being routed to its destination if the LSP is broken, the destination IP address of the echo request packet is chosen from the 127.0.0.0/8 range. If reached, the destination LSR sends an echo reply back to the originator of the MPLS echo request.
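The role of the 127.0.0.0/8 destination can be shown with a short sketch built on Python's standard `ipaddress` module. The two functions are hypothetical helpers: one checks that an echo request destination falls in the loopback range, and the other models, in a simplified way, a router that receives the packet as plain IP because label switching has broken.

```python
import ipaddress

# Destination range used for MPLS echo requests, as described above.
LOOPBACK_RANGE = ipaddress.ip_network("127.0.0.0/8")


def is_valid_echo_destination(dst: str) -> bool:
    """True if the address lies inside 127.0.0.0/8."""
    return ipaddress.ip_address(dst) in LOOPBACK_RANGE


def ip_forward(dst: str) -> str:
    """Simplified model of a router handling the packet as ordinary IP."""
    if is_valid_echo_destination(dst):
        # Loopback destinations are not forwarded, so a request that has
        # fallen off the LSP cannot reach the target by IP routing.
        return "dropped"
    return "routed"


outcome_loopback = ip_forward("127.0.0.1")
outcome_unicast = ip_forward("192.0.2.10")
```

Because the request can only arrive over the label-switched path, a reply indicates that the LSP itself is working, not merely that an IP route exists.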
ITU-T Y.1731
Features and benefits:
> Alarm Indication Signal (ETH-AIS): Provides fault notification for devices not participating in the VLAN-based Ethernet Continuity Check
> Remote Defect Indication (ETH-RDI): Offers fault indication of the other end of a VLAN-based Ethernet service
> Locked Signal (ETH-LCK): Enables maintenance actions while differentiating and isolating actual fault conditions
> Test Signal (ETH-Test): Allows a one-way, on-demand, in-service or out-of-service VLAN test, such as throughput or frame loss
> Performance Monitoring (ETH-PM): Monitors traffic performance on a point-to-point, end-to-end, VLAN-based Ethernet service
> Frame Loss Measurement (ETH-LM): Collects end-to-end frame loss information to approximate severely errored seconds, which indicate VLAN-based service transport availability
> Frame Delay Measurement (ETH-DM): Provides an on-demand Frame Delay and Frame Delay Variation measurement between two points of the VLAN-based service
Figure 8. ITU-T Y.1731
[Figure 9: CE devices connected via UNIs to Ethernet networks, with MEPs at the domain boundaries and the spans of ETH-PM, ETH-AIS, ETH-RDI, ETH-LCK, ETH-Test, ETH-LM, and ETH-DM]
Figure 9. ITU-T Y.1731 architecture
Ciena offers a solution allowing transport of Ethernet services, either natively or using MPLS encapsulation.
MPLS
Features and benefits:
> Label Switched Path Ping: Offers on-demand connectivity information about MPLS tunnels
> LSP Traceroute: Provides MPLS switching and Maximum Transmission Unit (MTU) configuration information
> Virtual Circuit Connection Verification: Enables proactive connectivity monitoring of MPLS pseudowires
> Bi-directional Forwarding Detection: Allows scalable, proactive data-plane verification of MPLS LSPs
> Fast ReRoute: Provides automated repair of MPLS failures
Figure 10. MPLS OAM
LSP Traceroute
LSP traceroute determines the hop-by-hop path and destination of an LSP. Like LSP ping, traceroute is an in-band, on-demand MPLS OAM utility that uses an MPLS echo request/reply mechanism to detect MTU misconfiguration between LSRs. However, with LSP traceroute, all LSRs along the path, up to and including the destination LSR, reply to the echo request. This technique allows the operator to identify and distinguish LSRs along a path.
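The hop-by-hop behavior can be simulated in a few lines. The sketch assumes that echo requests are sent with an increasing TTL so that each LSR along the path answers in turn, and that LSRs beyond a break never see a request; the function names, the `broken_after` parameter, and the LSR names are hypothetical.

```python
from typing import Optional


def lsp_traceroute(path: list, broken_after: Optional[int] = None) -> list:
    """Simulate echo replies from each LSR on a hypothetical LSP.

    `path` lists the LSRs after the ingress, ending with the destination LSR.
    `broken_after` is the number of hops that still work, or None if intact.
    """
    replies = []
    for ttl, lsr in enumerate(path, start=1):
        if broken_after is not None and ttl > broken_after:
            break  # requests no longer reach LSRs past the break
        replies.append((ttl, lsr))
    return replies


def first_silent_hop(expected: list, replies: list) -> Optional[str]:
    """Return the first expected LSR that did not reply, if any."""
    seen = {lsr for _, lsr in replies}
    for lsr in expected:
        if lsr not in seen:
            return lsr
    return None


PATH = ["P1", "P2", "PE2"]
healthy = lsp_traceroute(PATH)
degraded = lsp_traceroute(PATH, broken_after=1)
suspect = first_silent_hop(PATH, degraded)
```

Comparing the replies against the expected path narrows the fault to the segment just before the first silent LSR.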
Virtual Circuit Connection Verification
Using LSP ping, a service provider can monitor the status of an MPLS tunnel. To diagnose a problem within the tunnel, the service provider needs a mechanism to verify the connectivity of the pseudowires (VCs). VCCV allows proactive monitoring of pseudowires within MPLS tunnels by establishing a control channel associated with each pseudowire.
Bi-directional Forwarding Detection
VCCV requires involvement of the MPLS control plane; as the number of VCs increases, so will the load on the control plane. BFD allows systematic and more scalable detection of MPLS LSP data plane failures, with less involvement from the control plane. As a result, BFD allows faster detection on a larger number of LSPs. BFD relies on a hello packet exchanged by neighbors at negotiated, regular intervals. When a hello packet is not received as expected, the neighbor is declared down.
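A minimal sketch of the hello mechanism follows. It assumes the negotiated interval is the larger of what the sender wants to transmit and what the receiver is willing to accept, and that the neighbor is declared down after a configurable number of intervals with no hello; the class, function, and parameter names are hypothetical and simplified relative to the BFD specification.

```python
from dataclasses import dataclass


def negotiate_interval_ms(local_desired_tx_ms: int, remote_required_rx_ms: int) -> int:
    """Assumed rule: use the slower of the two advertised rates."""
    return max(local_desired_tx_ms, remote_required_rx_ms)


@dataclass
class BfdSession:
    interval_ms: int       # negotiated hello interval
    detect_mult: int       # missed hellos tolerated before declaring down
    last_rx_ms: int = 0
    up: bool = True

    @property
    def detection_time_ms(self) -> int:
        return self.interval_ms * self.detect_mult

    def on_hello(self, now_ms: int) -> None:
        """Record the arrival of a hello packet."""
        self.last_rx_ms = now_ms
        self.up = True

    def poll(self, now_ms: int) -> bool:
        """Return True while the neighbor is still considered up."""
        if now_ms - self.last_rx_ms >= self.detection_time_ms:
            self.up = False
        return self.up


session = BfdSession(interval_ms=negotiate_interval_ms(10, 50), detect_mult=3)
session.on_hello(100)
alive = session.poll(200)     # 100 ms of silence, below the 150 ms limit
down = not session.poll(250)  # 150 ms of silence, neighbor declared down
```

Because the check is a timestamp comparison per session, it can run for many LSPs without involving the control plane in each exchange.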
Fast ReRoute
Fast ReRoute allows automated repair of LSP tunnels to reduce packet loss on LSPs. If there is a link or node failure, an LSP employing Fast ReRoute can redirect MPLS traffic to previously computed and established alternate paths around the failed link or node. The alternate paths are selected during the establishment of a primary LSP under hop-by-hop control. With Fast ReRoute enabled, Resource ReSerVation Protocol-Traffic Engineering (RSVP-TE) establishes local alternate LSPs for each potential point of failure along the primary path.
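The idea of precomputing a local detour for each potential point of failure can be sketched with a breadth-first search. The topology, node names, and functions are hypothetical, and the sketch only covers link protection on an unweighted graph; it is not a model of RSVP-TE signaling.

```python
from collections import deque


def shortest_path(graph, src, dst, banned_link=None):
    """Breadth-first search that ignores `banned_link` in both directions."""
    banned = {banned_link, banned_link[::-1]} if banned_link else set()
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in previous and (node, neighbor) not in banned:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def precompute_backups(graph, primary):
    """Map each primary-path link to a detour around it, computed in advance."""
    backups = {}
    for a, b in zip(primary, primary[1:]):
        backups[(a, b)] = shortest_path(graph, a, b, banned_link=(a, b))
    return backups


# Hypothetical four-node ring with a primary LSP A -> B -> C.
GRAPH = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}
PRIMARY = ["A", "B", "C"]
BACKUPS = precompute_backups(GRAPH, PRIMARY)
```

When link A-B fails, the node upstream of the failure can switch to `BACKUPS[("A", "B")]` immediately, since the detour was established before the failure occurred.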
MEF Service OAM
The MEF is pursuing a complementary set of OAM-related functions operating at the SLA layer. The Phase 1 specification will contain performance monitoring capabilities for point-to-point services reflecting the frame loss ratio, frame delay (latency), and frame delay variation (jitter) characteristics of the service, as shown in Figure 12.
In addition, per-service fault management will be supported for point-to-point, point-to-multipoint, and multipoint services. Fault detection encompasses loss of continuity between management end-points and detection of potential for loops in the service. This fault detection/verification capability is supported proactively or on demand through operator action. MEF Service OAM, often called Service OAM, also provides fault isolation and fault notification.
IP
Ethernet services offer the benefit of low deployment costs by not requiring IP provisioning of each individual data plane element. However, the control plane uses mostly IP-based protocols, such as Telnet, SNMP, or IGMP. In that regard, control plane failures must be detected at the IP level. Two mechanisms have been in use since the advent of IP networking: IP ping, which provides on-demand connectivity verification of the IP control plane, and IP traceroute, which offers routing and delay information for an IP destination.
[Figure 11: an MPLS tunnel carrying two virtual circuits, VC A and VC B]
Figure 11. Basic MPLS constructs
MEF Service OAM
Features and benefits:
> Point-to-point Ethernet Virtual Circuit (EVC) Performance Monitoring (PM), Point-to-multipoint EVC PM, and Multipoint-to-multipoint EVC PM: Provide SLA assurance for different services
> EVC Fault Management: Enables identification and isolation of faults at the SLA layer
Figure 12. MEF Service OAM
IP Ping
IP ping is a basic mechanism that verifies IP connectivity through the network. It verifies that a given IP address exists, is reachable, and can accept ping requests, and calculates the latency between the control planes of two IP network elements.
IP Traceroute
IP traceroute is another OAM tool that records and displays the IP message route between two IP elements. It also calculates the latency between the control planes of each IP element of the route.
Conclusion
Ciena may from time to time make changes to the products or specifications contained herein without notice. 2009 Ciena Corporation. All rights reserved. WP062A4 2.2009
Ciena's Carrier Ethernet Service Delivery solution, described in Figure 13, enables service providers to operate, administrate, and maintain any mix of Ethernet and MPLS-based L2 VPNs effectively. By leveraging this unique OAM capability, service providers can protect current revenue and maximize revenue growth, while reducing operational costs.
Objectives and the Carrier Ethernet Service Delivery Solution
Protects revenue by preventing service outages:
> Sub-50 ms automated network reconvergence
> Robust Quality of Service (QoS) architecture minimizes SLA failures
> Modular architecture enables planned non-invasive repairs
> Redundancy for mission-critical network components
Protects revenue by offering faster service restoration:
> Generates precise failure information more quickly
> Service-aware OAM feature set intelligently traverses each layer as needed
> Complete OAM feature set covers each network layer (link, service transport and SLA)
Maximizes revenue growth by enabling richer service offerings:
> Comprehensive Ethernet and MPLS OAM feature sets
> Intelligent classification
Reduces operational costs by reducing repair costs and reducing operational overhead:
> Advanced alarm correlation simplifies fault isolation
> Hot-swappable solution enables shorter and less expensive repairs
> On-demand OAM techniques eliminate unnecessary investigation of false failure indications
> Modular solution reduces cost of spares
> Proactive monitoring enables cost-effective preemptive maintenance
Figure 13. Ciena's Carrier Ethernet Service Delivery solution