High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

© 2012 Cisco and/or its affiliates. All rights reserved. BRKCRS-3032 Cisco Public Advanced Enterprise Campus Design : Resilient Campus Networks BRKCRS 3032

Campus network design is evolving in response to multiple drivers:
User Expectations: Always ON access to communications
Industry Requirements: Financial, Healthcare, 7x24x365 global access
Technology Requirements: Services, Applications, Communications, e.g. Unified Communications, Video
This document addresses High Availability solutions in the campus environment.

Transcript of High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Page 1: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Advanced Enterprise Campus Design :

Resilient Campus Networks BRKCRS – 3032

Page 2: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Presenter

Rahul Kachalia – CCIE #11732 (R&S and SP)

Technical Marketing Engineer

System Development Unit (SDU)

Page 3: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Design Zone for Borderless Networks www.cisco.com/go/designzone/borderless

Borderless Campus CVD http://www.cisco.com/en/US/docs/solutions/Enterprise/Campus/Borderless_Campus_Network_1.0/Borderless_Campus_1.0_Design_Guide.pdf

Page 4: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032
Page 5: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

What Are Your Uptime Requirements?

Campus network design is evolving in response to multiple drivers:

‒ User Expectations: Always ON access to communications
‒ Industry Requirements: Financial, Healthcare, 7x24x365 global access
‒ Technology Requirements: Services, Applications, Communications, e.g. Unified Communications, Video

Requires a Structured 'and' Resilient Design

[Figure labels: Global Enterprise Availability; Collaboration and Real-Time Communication]

Page 6: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

How Does Downtime Affect Voice?

Availability requirements for UC are more than just five 9's

Also need to consider the subjective impact to real-time communications

[Figure: seconds of data loss (axis 0 to 50) versus impact on voice. Markers: 200 ms, 1 sec, 5-6 sec. Labels: No Impact to Voice, Minimal Impact to Voice, User Hangs Up, Phone Resets*]

* The time for a phone to reset is variable and depends on the signaling protocol (SCCP or SIP) and the state of the call (active, ringing, …)
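To put "five 9's" next to the per-event figures on this slide, the yearly downtime budget for an availability target can be worked out directly. The sketch below is illustrative only; the function name and the 365-day year are assumptions, not part of the original deck.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        print(f"{target}% -> {downtime_minutes_per_year(target):.2f} min/year")
```

Five 9's leaves roughly 5.26 minutes of downtime per year, yet a single outage of a few seconds is already enough to disrupt an active call, which is why the slide treats the percentage alone as insufficient.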

Page 7: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

How Does Downtime Affect Voice or Video?

Network SLAs vary for traditional video conferencing versus TelePresence

Availability requirements for high-definition TelePresence are more stringent than for UC

Metric    TelePresence   TelePresence   TelePresence   Traditional Video
          Target         Threshold 1    Threshold 2    Conferencing
                         (Warning)      (Call Drop)
Latency   150 ms         200 ms         400 ms         400-450 ms
Jitter    10 ms          20 ms          40 ms          30-50 ms
Loss      0.05%          0.10%          0.20%          1%
BW        2.5 - 12.6 Mbps + overhead                   384 or 768 kbps + overhead
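The TelePresence columns of the table above can be read as a simple three-level check. The following is a minimal sketch of that reading; the function name, the state names and the choice of inclusive comparisons are assumptions made for illustration.

```python
# Thresholds copied from the TelePresence columns of the table above:
# (target, threshold 1 / warning, threshold 2 / call drop)
THRESHOLDS = {
    "latency_ms": (150.0, 200.0, 400.0),
    "jitter_ms": (10.0, 20.0, 40.0),
    "loss_pct": (0.05, 0.10, 0.20),
}


def classify(metric: str, value: float) -> str:
    """Classify one measured value against the TelePresence thresholds."""
    target, warning, call_drop = THRESHOLDS[metric]
    if value >= call_drop:      # assumption: thresholds are inclusive
        return "call-drop"
    if value >= warning:
        return "warning"
    if value <= target:
        return "on-target"
    return "above-target"


if __name__ == "__main__":
    print(classify("latency_ms", 170))  # between target and warning
    print(classify("loss_pct", 0.25))   # beyond the call-drop threshold
```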

Page 8: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Design Strategies For Network Survivability

Resiliency Goal: Non-Disruptive Network and Service Availability

Resiliency Strategy: Network Level Resiliency, System Level Resiliency, Operational Level Resiliency

Resiliency Technologies: ECMP, EtherChannel, UDLD, NSF/SSO, Power Redundancy, ISSU, eFSU, GOLD, EEM

Page 9: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

High Availability Campus Design Agenda

Network Level Resiliency

High Availability Design Principles

Simplified and Redundant Campus Design

Campus Routing Best Practices

System Level Resiliency

Integrated Hardware and Software Resiliency

Stateful Switchover and Non-Stop Forwarding

‒ Hitless Switching

Operational Level Resiliency

‒ Single and Multi-Chassis ISSU Upgrade

‒ Hitless NX-OS Software Upgrade

Page 10: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Simple Demand, Complex Design?

Advantage
‒ Highly redundant network design
‒ Redundant system and network paths at mission-critical network points
‒ Protects network availability during a major network fault event

Disadvantage
‒ Becomes complex as it scales
‒ Increases control and management plane
‒ Redundant control plane with redundant topology information

Advantage
‒ Operational simplicity: single control plane between layers
‒ Redundant network paths
‒ Single-chassis system redundancy
‒ Cost-effective solution for small network designs

Disadvantage
‒ Single point-of-failure design
‒ Any major network fault can cause a complete network outage
‒ May not be a very cost-effective design compared with dual systems

Page 11: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Structured and Modular Designs Work Best

Optimize the interaction of the physical redundancy with the network protocols
‒ Provide the necessary amount of redundancy
‒ Pick the right protocol for the requirement
‒ Optimize the tuning of the protocol

The network looks like this so that we can map the protocols onto the physical topology

We want to build networks that look like this

[Figure: campus topology with WAN, Internet and Data Center blocks. Labels: Redundant Switches, Redundant Supervisor, Redundant Links, Layer 2 or Layer 3]

Page 12: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

High Availability Campus Design Agenda

Network Level Resiliency

‒ High Availability Design Principles

‒ Simplified and Redundant Campus Design

‒ Campus Routing Best Practices

System Level Resiliency

‒ Integrated Hardware and Software Resiliency

‒ Stateful Switchover and Non-Stop Forwarding

‒ Hitless Switching

Operational Level Resiliency

‒ Single and Multi-Chassis ISSU Upgrade

‒ Hitless NX-OS Software Upgrade

Page 13: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Access Layer Redundancy with Dual-Sup

Non-stop business communication with redundant supervisors

Distribute multiple uplinks from both supervisors for the following benefits:
‒ Improve network resource utilization
‒ Minimize control-plane disruption
‒ Improve network recovery to sub-second
‒ Maximize network level protection

Protects switching capacity, network topology and forwarding information during supervisor switchover

[Figure: 4500E access switch with uplinks from Sup-1 and Sup-2]

Page 14: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Access Layer Redundancy with Single-Sup

Flexible edge network and bandwidth expansion

Multiple built-in supervisor uplink ports for a high-speed distribution-access block

Plan inter-distribution link capacity to handle large data re-routing

Minimize network congestion with distributed high-speed uplink connections to the aggregation system

[Figure: three 4500E access blocks, labeled 1G Uplink / 10G, 10G Uplink / 10G and 10G Uplink / 10G]

Page 15: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Simplified, Scalable & Reliable Access Network with Cisco StackWise Plus

Control and Mgmt Plane
‒ Single management
‒ Centralized control-plane architecture
‒ NSF capable

Physical Network
‒ Network expansion as it grows
‒ Consolidation of several 1G links to 10G
‒ High-speed stack ring for intra-access traffic

Network Design
‒ Single point-to-point network
‒ Distributed forwarding architecture
‒ Reduces VLANs and subnets

Page 16: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Uplink Redundancy with Cisco StackWise Plus

Build uplinks with two stack-member switches.

Protocol driven network recovery with dual uplinks

Quad distributed uplinks

Increase uplink capacity

Hardware driven network recovery with traditional distribution design

Prevents network topology change and improves network recovery

Dual vs Quad Uplink Design Alternatives

Dist-1 Dist-2

SW1 SW9

Page 17: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

We Will Be Talking About Solutions for Two Distribution Block Models

Traditional Distribution Block Design
‒ Dual Standalone System
‒ Distributed Planes
‒ Protocol-dependent fault detection and recovery

Evolution Network Design
‒ Single Virtual System
‒ Unified control and management plane, distributed forwarding plane
‒ Deterministic network recovery

[Figure: both models serving access Vlan 10, Vlan 20 and Vlan 30]

Page 18: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Traditional Distribution Design

Redundant design with sub-optimal topology and complex operation

Stabilize the network topology with several L2 features:
‒ STP Primary and Backup Root Bridge
‒ Rootguard
‒ Loopguard or Bridge Assurance
‒ STP Edge Protection

Protocol-restricted forwarding topology:
‒ STP FWD/ALT/BLK Port
‒ Single Active FHRP Gateway
‒ Asymmetric forwarding
‒ Unicast Flood

Protocol-dependent network recovery:
‒ PVST/RPVST+
‒ FHRP Tunings

[Figure labels: HSRP Active, STP Root, Rootguard, Loopguard or Bridge Assurance, Bridge Assurance, BPDU Guard or PortFast, Port Security]

Page 19: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Even with Faster Convergence from RPVST+ We Still Have to Wait on FHRP Convergence

GLBP offers load balancing within a VLAN

For Voice, a sub-second Hello timer enables < 1 sec traffic recovery upstream

Sub-second protocol timers must be avoided on an SSO-capable network

[Figure: FHRP Active and FHRP Standby distribution switches]

HSRP Config

interface Vlan4
 ip address 10.120.4.2 255.255.255.0
 standby 1 ip 10.120.4.1
 standby 1 timers msec 250 msec 750
 standby 1 priority 150
 standby 1 preempt
 standby 1 preempt delay minimum 180

GLBP Config

interface Vlan4
 ip address 10.120.4.2 255.255.255.0
 glbp 1 ip 10.120.4.1
 glbp 1 timers msec 250 msec 750
 glbp 1 priority 150
 glbp 1 preempt
 glbp 1 preempt delay minimum 180

VRRP Config

interface Vlan4
 ip address 10.120.4.1 255.255.255.0
 ip helper-address 10.121.0.5
 no ip redirects
 vrrp 1 description Master VRRP
 vrrp 1 ip 10.120.4.1
 vrrp 1 timers advertise msec 250
 vrrp 1 preempt delay minimum 180

Page 20: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

PIM Needs Timer Tuning Too

Multicast recovery depends on PIM DR failure detection in a Layer 2 network

PIM routers exchange the PIM expiration time in the query message:
‒ Default Query-Interval: 30 seconds
‒ Expiration: Query Interval x 3
‒ DR Failure Detection: ~90 seconds

Tune the PIM query interval to sub-second, as with FHRP, for faster multicast convergence

Sub-second protocol timers must be avoided on an SSO-capable network

interface Vlan4
 ip pim sparse-mode
 ip pim query-interval 250 msec

[Figure: PIM DR on the distribution pair]
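The DR failure detection figures on this slide follow from the "Query Interval x 3" rule. A small worked calculation, with a hypothetical helper name, shows the default next to the tuned 250 msec interval:

```python
def pim_dr_detection_seconds(query_interval_s: float, multiplier: int = 3) -> float:
    """DR failure detection time: the expiration is query interval x multiplier."""
    return query_interval_s * multiplier


if __name__ == "__main__":
    print(pim_dr_detection_seconds(30))    # default query interval from the slide
    print(pim_dr_detection_seconds(0.25))  # "ip pim query-interval 250 msec"
```

With the default 30-second interval the neighbor expires after 90 seconds; with a 250 msec interval it expires after 0.75 seconds, the same hello-to-hold ratio as the 250/750 msec FHRP timers shown earlier.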

Page 21: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Sub-second Protocol Timers and NSF/SSO

NSF is intended to provide availability through route convergence avoidance

Fast IGP timers are intended to provide availability through fast route convergence

In an NSF environment the dead timer must be greater than:
SSO recovery + routing protocol restart + time to send first hello

Recommendation: keep protocol timers at their defaults

[Figure: NSF-capable switch and NSF-aware neighbor. Labels: Neighbor Loss, Graceful Restart, NSF Restart, RP Restart, OSPF First Hello, Hello]
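The dead-timer rule above is a plain inequality, so it can be checked for any candidate timer. This is a sketch only: the function name is hypothetical and the recovery times in the example are made-up illustrative values, not measurements from the deck.

```python
def dead_timer_is_nsf_safe(dead_timer_s: float,
                           sso_recovery_s: float,
                           protocol_restart_s: float,
                           first_hello_s: float) -> bool:
    """True if the neighbor's dead timer outlasts an SSO switchover.

    Implements: dead timer > SSO recovery + routing protocol restart
                             + time to send first hello
    """
    return dead_timer_s > sso_recovery_s + protocol_restart_s + first_hello_s


if __name__ == "__main__":
    # Illustrative values only (assumed): 2 s SSO recovery, 3 s protocol
    # restart, 1 s until the first hello is sent.
    print(dead_timer_is_nsf_safe(40.0, 2.0, 3.0, 1.0))  # OSPF default 40 s dead timer
    print(dead_timer_is_nsf_safe(1.0, 2.0, 3.0, 1.0))   # sub-second style tuning
```

Under these assumed values a default dead timer leaves ample margin, while a 1-second dead timer would drop the adjacency before the restarting supervisor can send its first hello, which defeats graceful restart.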

Page 22: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Simplify STP Network Topology with VSS

Multiple parallel Layer 2 network paths build an STP loop network

VSS with MEC builds a single loop-free network that utilizes all available links

Distributed EtherChannel minimizes STP complexities compared to a standalone distribution design

The STP toolkit should still be deployed to safeguard the multilayer network

[Figure labels: STP Root, Rootguard, BPDU Guard or PortFast, Port Security, STP BLK Port, Loop-free L2 EtherChannel]

Page 23: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Simplified, Scalable and Reliable L3 Gateway with VSS

Single logical Layer 3 gateway. Completely eliminates the need to implement FHRP protocols.

Removes FHRP dependencies and increases Layer 3 network scalability.

Hardware-based rapid fault detection and network recovery with default protocol timers.

Deterministic sub-second network convergence in multiple fault conditions.

[Figure: R1, Single IP Gateway]

Page 24: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

EtherChannel Link Convergence: Optimal Fast Traffic Restoration

Catalyst Switch

1. Link failure detection
2. Removal of the Portchannel entry in the software
3. Update of the hardware Portchannel indices
4. Notify the spanning tree and/or routing protocol processes of the path cost change

Layer 2 Forwarding Table

VLAN   MAC   Destination Index
10     AA    Portchannel 1
11     BB    G5/1

Load-Balancing Hash selects the Destination Port

PortChannel 1: G3/1, G3/2, G4/1, G4/2

[Figure: steps 1 to 4 mapped onto the Routing Protocol Process, the Spanning Tree Process and the forwarding hardware]
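The recovery steps on this slide can be sketched in a few lines: the forwarding table keeps pointing at the port channel, and only the member list behind the hash changes when a link fails. This is a simplified model under stated assumptions; the SHA-256 hash and both function names are stand-ins, since the real hash is platform-specific and computed in hardware.

```python
import hashlib


def select_member(members: list[str], src_mac: str, dst_mac: str) -> str:
    """Pick the egress member link for a flow from a hash of its addresses."""
    if not members:
        raise ValueError("port channel has no active member links")
    digest = hashlib.sha256(f"{src_mac}>{dst_mac}".encode()).digest()
    return members[digest[0] % len(members)]


def fail_link(members: list[str], failed: str) -> list[str]:
    """Steps 2 and 3: drop the failed link from the port-channel member list."""
    return [m for m in members if m != failed]


if __name__ == "__main__":
    port_channel_1 = ["G3/1", "G3/2", "G4/1", "G4/2"]
    print(select_member(port_channel_1, "AA", "BB"))
    port_channel_1 = fail_link(port_channel_1, "G3/2")
    # The MAC table entry (VLAN 10, AA -> Portchannel 1) is untouched;
    # flows are simply re-hashed over the three remaining links.
    print(select_member(port_channel_1, "AA", "BB"))
```

Because no forwarding-table entry has to be relearned, recovery depends on link detection and the index update rather than on a protocol timer.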

Page 25: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Multi-Chassis EtherChannel Performs Better In Any Network Design

Network recovery mechanics vary between distribution designs:
‒ Standalone: protocol and timer dependent
‒ VSS: hardware dependent

VSS logical distribution system:
‒ Single P2P STP Topology
‒ Single Layer 3 gateway
‒ Single PIM DR system

Distributed and synchronized forwarding tables: MAC address, ARP cache, IGMP

All links are fully utilized based on EtherChannel load balancing

[Figure: convergence (sec), axis 0 to 1, for L2-FHRP and L2-MEC; series: Upstream, Downstream, Multicast]

Page 26: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

The Best Deployment for Standalone Is Routed Access

Simplified operation with a single control plane: routing protocols

Improved network design: no FHRP, STP, Trunk, VTP etc.

Optimized forwarding topology: Layer 3 ECMP

Improved convergence with fewer protocols

OSPF SPF Tuning

timers throttle spf 10 100 5000
timers throttle lsa all 10 100 5000
timers lsa arrival 80

[Figure: Layer 2 access (HSRP Active, STP Root, Rootguard, Loopguard or Bridge Assurance, BPDU Guard or PortFast, Port Security) compared with Layer 3 access running EIGRP/OSPF]
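The three values in `timers throttle spf 10 100 5000` are the initial delay, the hold time and the maximum wait, all in milliseconds. The sketch below models the commonly documented back-off behavior, where the hold time doubles on each further topology change until it reaches the maximum. It is a simplified model that assumes changes keep arriving within each wait interval, and the function name is hypothetical.

```python
def spf_wait_schedule(start_ms: int, hold_ms: int, max_wait_ms: int,
                      events: int) -> list[int]:
    """Wait (ms) before each successive SPF run during continuous instability."""
    waits = []
    for i in range(events):
        if i == 0:
            wait = start_ms                    # first change: initial delay
        else:
            wait = hold_ms * 2 ** (i - 1)      # later changes: hold time doubles
        waits.append(min(wait, max_wait_ms))   # never exceed the maximum wait
    return waits


if __name__ == "__main__":
    print(spf_wait_schedule(10, 100, 5000, 8))
```

The first change is handled after 10 ms, which gives the fast convergence the slide is after, while a flapping link is throttled toward one SPF run every 5 seconds.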

Page 27: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Simplifies Routed Access

Builds single point-to-point routing peer adjacency with MEC

EtherChannel delivers deterministic network recovery

Minimizes adjusting protocol timers and parameters

EIGRP / OSPF

Single Adjacency

Page 28: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Routed Access Optimized Multicast Operation

Layer 2 access has two multicast routers on the access subnet, so one of them has to discard frames

Routed Access has a single multicast router, which simplifies management of the multicast topology

[Figure: Layer 2 access with a Designated Router (High IP Address) and an IGMP Querier (Low IP address), where the non-DR has to drop all non-RPF traffic; routed access with a single Designated Router & IGMP Querier]

Page 29: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Optimizes Multicast Performance with Routed Access

Single logical L3 path to the RP from access to join the multicast distribution tree

Single OIL/IIL PIM interface in the Multicast Routing Table

Increases multicast bandwidth capacity with all MEC member-links programmed for switching

Transparent to network faults and provides deterministic sub-second multicast data recovery

OIL = Outgoing Interface List, IIL = Incoming Interface List

[Figure labels: Single PIM Join Message, Single OIL]

Page 30: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Routed Access Provides Rapid Convergence with Optimized Traffic Flow and Ease of Mgmt

CEF and protocol based network recovery in Standalone Routed Access Design
‒ EIGRP converges in <200 msec
‒ OSPF with sub-second tuning converges in <200 msec
‒ Multicast with sub-second tuning converges in ~600 msec

EtherChannel hash based network recovery in VSS Routed Access Design
‒ Deterministic sub-second unicast & multicast network convergence

EtherChannel does not require any further protocol tuning

[Figure: convergence (sec), axis 0 to 0.7, for EIGRP-ECMP, EIGRP-MEC, OSPF-ECMP and OSPF-MEC; series: Upstream, Downstream, Multicast]

Page 31: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Diversify Links For Module Redundancy

Distribute multiple connections to a single or logical remote system across different linecard modules when possible.

The recovery mechanism is the same as for a link failure.

Prevents topology changes or forwarding updates and provides intra-chassis sub-second recovery.

Depending on network load, it minimizes network congestion.

[Figure: Intra-Chassis Recovery and Inter-Chassis Recovery]

Page 32: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Best Practice for Module OIR

Module OIR is supported on all modular systems.

Network recovery has a higher impact with module OIR due to:
‒ OIR detection
‒ Hardware Synchronization
‒ Protocol Dependencies
‒ Forwarding Updates

Minimize network impact with the following techniques:
‒ Admin Power Down
‒ Admin Reset

6500 Standalone

6500E(config)#no power enable module <slot-id>

6500 VSS

6500-VSS(config)#no power enable switch <1|2> module <slot-id>

[Figure: convergence (sec), axis 0 to 2.5, for OIR, Power Down and Soft Reset; series: Upstream, Downstream, Multicast]

Page 33: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

High Availability Campus Design Agenda

Network Level Resiliency

‒ High Availability Design Principles

‒ Simplified and Redundant Campus Design

‒ Campus Routing Best Practices

System Level Resiliency

‒ Integrated Hardware and Software Resiliency

‒ Stateful Switchover and Non-Stop Forwarding

‒ Hitless Switching

Operational Level Resiliency

‒ Single and Multi-Chassis ISSU Upgrade

‒ Hitless NX-OS Software Upgrade

Page 34: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Core Layer Routing Design Strategy

Design Campus Core with Simplicity

Optimize Routing Topologies:

Hide Topology – EtherChannel

Hide Reachability – Route Summarization

Filter – Stub, Distribute-list, Route-Maps

High-Performance, Reliable Network Design

Increase Application Performance

Deterministic Network Recovery

Page 35: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Enabled Campus Design

End-to-End VSS Design Option

[Figure: end-to-end VSS campus topology with Data Center, WAN and Internet blocks]

Page 36: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Scalable and Hitless Core Design Alternative with Nexus 7000

Standalone Redundant Core System

High-scale, high-performance system.

Hitless forwarding design with a distributed forwarding architecture that de-couples centralized control and management.

Highly Available: Hitless Forwarding, NSF/SSO, EC, ISSU etc.

[Figure: Nexus 7000 core connecting Data Center, WAN and Internet blocks]

Page 37: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Deploy EtherChannel for a Simplified, Optimized and Reliable Core

6500-VSS: Single Unified Core System
‒ Single point-to-point network per neighbor.
‒ Simplified, optimized and resilient unicast and multicast network design
‒ Highly Available: VSS, Quad-Sup, NSF/SSO, MEC, eFSU etc.

Nexus 7000: Standalone Redundant Core System
‒ Single point-to-point network per neighbor.
‒ EtherChannel ECMP to simplify, optimize and build a resilient network design
‒ Highly Available: Hitless Forwarding, NSF/SSO, EC, ISSU etc.

Page 38: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

EIGRP Is Unique with Multi-Level Summarization Capability

The greatest advantages of EIGRP are gained when the network has a structured addressing plan that allows for use of summarization and stub routers

EIGRP provides the ability to implement multiple tiers of summarization and route filtering

Able to maintain a deterministic convergence time in a very large L3 topology

[Figure: two-tier summarization. IPv4: 10.10.0.0/17 and 10.10.128.0/17 summarized as 10.10.0.0/16. IPv6: 2001:DB8:10::/56 and 2001:DB8:10:128::/56 summarized as 2001:DB8:10::/48]
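The IPv4 summarization in the figure can be verified with Python's standard `ipaddress` module. This is only a sanity check for an addressing plan, and the helper name is hypothetical.

```python
import ipaddress


def summarize(prefixes: list[str]) -> list[str]:
    """Collapse contiguous prefixes into the smallest covering set of summaries."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


if __name__ == "__main__":
    # The two distribution-block prefixes from the figure.
    print(summarize(["10.10.0.0/17", "10.10.128.0/17"]))
```

The two /17 blocks collapse into the single 10.10.0.0/16 shown at the upper tier. Prefixes that are not contiguous stay separate, which is a quick way to spot an addressing plan that cannot be summarized cleanly.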

Page 39: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

EIGRP Convergence Is Improved with Summarization, Filtering and EtherChannel

EIGRP convergence is largely dependent on query paths and response times

Implement EtherChannel to reduce query paths

Minimize the number of queries and the query response time to speed up convergence

Summarize distribution block routes upstream to the core

Configure all L3 access switches as EIGRP stub routers

router eigrp 100
 network 10.0.0.0
 eigrp stub connected
!
interface TenGigabitEthernet 4/1
 ip summary-address eigrp 100 10.120.0.0 255.255.0.0

[Figure: Query and Response paths between distribution and access]

Page 40: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Avoid Default Route Black Hole

Know the source of the default route in the network.

EIGRP advertises a default route if one exists in the routing table.

Maintain network availability in the campus by advertising the following routes to EIGRP stub routers:
‒ Summarized Internal Route
‒ Default-Route to Stub routers

router eigrp 100
 network 10.0.0.0
 distribute-list EIGRP_STUB_Routes out <Port-Channel#>
!
ip access-list standard EIGRP_STUB_Routes
 permit 10.0.0.0
 permit 0.0.0.0
!

[Figure: WAN, Internet and Data Center blocks with 10.1.0.0/16, 10.2.0.0/16, 10.3.0.0/16, 10.4.0.0/16 and 10.5.0.0/16]

Page 41: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

OSPF Area Boundaries Offer Summarization for Improved Scale

Area boundaries provide buffers between fault domains

Keep area 0 for core infrastructure

Do not extend area 0 to the access routers when using Routed Access

[Figure: Area 0 core with WAN, Internet and Data Center; distribution blocks in Area 100, Area 110 and Area 120]

Page 42: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

OSPF Downstream Summarization Is Accomplished with Multiple Area Types

ABR for a regular area forwards:
‒ Summary LSAs (Type 3)
‒ ASBR summary (Type 4)
‒ Specific externals (Type 5)

Stub area ABR forwards:
‒ Summary LSAs (Type 3)
‒ Summary default (0.0.0.0 - ::/0)

A totally stubby area ABR forwards:
‒ Summary default (0.0.0.0 - ::/0)

router ospf 100
 area 120 stub no-summary
 network 10.120.0.0 0.0.255.255 area 120
 network 10.122.0.0 0.0.255.255 area 0

[Figure: OSPF Area 120]

Page 43: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

OSPF Upstream Summarization Helps Minimize LSA Churn in the Core

Summarize routes from the distribution block upstream into the core

Minimize the number of LSAs and routes in the core

Reduce the need for SPF calculations due to internal distribution block changes

router ospf 100
 area 120 stub no-summary
 area 120 range 10.120.0.0 255.255.0.0 cost 10
 network 10.120.0.0 0.0.255.255 area 120
 network 10.122.0.0 0.0.255.255 area 0
!
interface Vlan120
 ip address 10.120.0.1 255.255.255.192
!

[Figure: ABRs originate summary 10.120.0.0/16 & 2001:DB8:10:120::/48]

Page 44: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

OSPF Cost Matters in EtherChannel Designs

Route metrics (bandwidth) are automatically adjusted on an EtherChannel interface

Maximum bandwidth or cost computation differs between OS:
‒ IOS: 10G (default) *
‒ NX-OS: 40G (default) **

A single core-layer member-link failure in an OSPF EC/MEC design may:
‒ Under-utilize network resources
‒ Build an asymmetric forwarding topology
‒ Increase network convergence time

* Adjustable. Recommended to keep default

** Recommended to adjust OSPF auto-cost ref. bw to 10G on Nexus 7000

Nexus 7000

N7K-Core(config-router)#auto-cost reference-bandwidth 10000

[Figure: paths to Summary Net 10.100.0.0/16 with Auto-Cost = 10G and Auto-Cost = 40G; link costs of 1, 3 and 5 are shown]
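The effect of the reference bandwidth on a bundle can be sketched with the standard OSPF cost rule (reference bandwidth divided by interface bandwidth, truncated, minimum 1). The example assumes a 2 x 10G EtherChannel whose bandwidth is the sum of its active members; that topology and the function name are illustrative assumptions, not taken from the slide.

```python
def ospf_cost(reference_bw_mbps: int, interface_bw_mbps: int) -> int:
    """OSPF interface cost: reference / interface bandwidth, truncated, minimum 1."""
    return max(1, reference_bw_mbps // interface_bw_mbps)


if __name__ == "__main__":
    for ref in (40_000, 10_000):            # 40G versus 10G reference bandwidth
        healthy = ospf_cost(ref, 20_000)    # 2 x 10G bundle, both links up
        degraded = ospf_cost(ref, 10_000)   # one member link failed
        print(f"ref={ref} Mbps: healthy cost={healthy}, degraded cost={degraded}")
```

With a 40G reference the cost changes from 2 to 4 when one member fails, so OSPF recomputes and may move traffic off the remaining link. With a 10G reference the cost stays at 1 in both states, which matches the recommendation to set the Nexus 7000 reference bandwidth to 10G.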

Page 45: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Optimize EtherChannel Load Balancing

Load share egress data traffic based on input hash

Optimal load sharing results with:
‒ Bucket-based load-sharing: bundle member-links in powers of 2 (2/4/8)
‒ Multiple variations of input for the hash (L2 to L4)

Recommended algorithm *:
‒ Access: Src/Dst IP
‒ Dist/Core: Src/Dst IP + Src/Dst L4 Ports

* May vary based on your network traffic pattern

[Figure: Access, Dist and Core switches, each annotated with its default and recommended hash]

Default : src-mac             Recommended : src-dst-ip
Default : src-dst-ip vlan     Recommended : src-dst-mixed-ip-port vlan
Default : src-dst-ip          Recommended : src-dst ip-l4port-vlan
Default : src-dst-ip vlan     Recommended : src-dst-mixed-ip-port
Default : src-dst-ip vlan     Recommended : src-dst-mixed-ip-port vlan

Page 46: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Layer 3 Load Balancing Can Be Randomized with a Unique ID Associated with Switch

"Universal ID" concept (also called Unique ID) is used to prevent CEF polarization

Universal ID generated at bootup (32-bit pseudo-random value seeded by router's base IP address)

Universal ID used as input to ECMP hash, introduces variability of hash result at each network layer

Universal ID supported on Catalyst 6500 Sup-32 and Sup-720

Universal ID supported on Catalyst 4500 SupII+10GE, SupV-10GE and Sup6E

Hash using Source IP (SIP), Destination IP (DIP) & Universal ID

Catalyst 4500 Load-Sharing Options

Original            Src IP + Dst IP
Universal*          Src IP + Dst IP + Unique ID
Include Port        Src IP + Dst IP + (Src or Dst Port) + Unique ID

Catalyst 6500 PFC3** Load-Sharing Options

Default*            Src IP + Dst IP + Unique ID
Full                Src IP + Dst IP + Src Port + Dst Port
Full Exclude Port   Src IP + Dst IP + (Src or Dst Port)
Simple              Src IP + Dst IP
Full Simple         Src IP + Dst IP + Src Port + Dst Port

* = default load-sharing mode
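CEF polarization and the effect of a per-switch unique ID can be demonstrated with a toy two-tier model. Everything here is an assumption made for illustration: SHA-256 stands in for the hardware hash, the unique IDs are small integers, and the function names are hypothetical.

```python
import hashlib


def pick_link(src_ip: str, dst_ip: str, n_links: int, unique_id: int = 0) -> int:
    """Choose an ECMP link index from a hash of the flow and the switch's unique ID."""
    key = f"{src_ip}|{dst_ip}|{unique_id}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links


def links_used_downstream(flows, upstream_id: int, downstream_id: int,
                          n_links: int = 2) -> set[int]:
    """Links a downstream switch uses for flows the upstream switch sent on link 0."""
    arriving = [f for f in flows if pick_link(*f, n_links, upstream_id) == 0]
    return {pick_link(*f, n_links, downstream_id) for f in arriving}


FLOWS = [(f"10.1.0.{i}", "10.2.0.1") for i in range(1, 255)]

if __name__ == "__main__":
    # Same hash input at both tiers: the downstream switch repeats the upstream
    # decision, so every arriving flow lands on one link (polarization).
    print(links_used_downstream(FLOWS, upstream_id=0, downstream_id=0))
    # Different unique IDs per switch: the downstream decision is decorrelated.
    print(links_used_downstream(FLOWS, upstream_id=1, downstream_id=2))
```

When both tiers hash the same inputs, the second tier can only ever pick the link the first tier already picked, leaving its other uplink idle. Mixing a per-switch value into the hash breaks that correlation.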

Page 47: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Simple Network Design Delivers Deterministic Network Recovery

Routing protocol independent network convergence

ECMP Prefix-Independent Convergence (PIC) with 6500 (VSS/Standalone) from 12.2(33)SXI2

Hardware-based fault detection and recovery in MEC/EC designs

[Figure: Time for ECMP/MEC Unicast Recovery. Convergence (sec), axis 0 to 3.5, versus number of unicast routes (500, 1000, 5000, 10000, 15000, 20000, 25000) on Core/Distribution Sup720-10GE; series: ECMP (W/o PIC), ECMP (With PIC), MEC]

Page 48: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Core Simplifies Multicast Operation, Improves Performance and Redundancy

Standalone core needs AnyCast MSDP peering for RP redundancy.

VSS-based core simplifies PIM RP redundancy with NSF/SSO/MMLS technologies.

ECMP builds a single multicast forwarding path.

MEC increases multicast forwarding capacity by utilizing all member-links.

[Figure: standalone design with two Core PIM RPs (AnyCast - MSDP), Dist PIM routers, PIM Join and Single OIL; VSS design with Single Logical PIM RP, Single Logical PIM Interface, Single Logical PIM Router, PIM Join, Single Logical OIL and Multiple Multicast Forwarding Paths]

Page 49: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Simplified Multicast Network Design Delivers Deterministic Network Recovery

ECMP multicast recovery depends on mroute scale and could take seconds.

MEC/EC multicast recovery is hardware-based, scale-independent and sub-second.

[Figure: Time for ECMP/MEC Multicast Recovery. Convergence (sec), axis 0 to 6, versus number of multicast routes (100, 500, 1000, 5000) on Core/Distribution Sup720-10GE; series: ECMP, MEC/EC]

Page 50: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

High Availability Campus Design Agenda

Network Level Resiliency

‒ High Availability Design Principles

‒ Simplified and Redundant Campus Design

‒ Campus Routing Best Practices

System Level Resiliency

‒ Integrated Hardware and Software Resiliency

‒ Stateful Switchover and Non-Stop Forwarding

‒ Hitless Switching

Operational Level Resiliency

‒ Single and Multi-Chassis ISSU Upgrade

‒ Hitless NX-OS Software Upgrade

Page 51: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Do I Still Need Dual Supervisors?

Redundant physical paths protect network availability

‒ Converges in sub-second

‒ May not maintain capacity and performance

‒ Increases outage probability during a major node failure

A redundant supervisor module protects network and services availability

‒ Maintains capacity and performance

‒ System remains in service during a major supervisor failure

‒ Hitless to insignificant data loss during switchover

[Diagram: single-supervisor system showing Single Point of Failure, Self Recovery Fail and Reduced Capacity.]

Page 52: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032


Supervisor Redundancy Provides Stateful Switch Over

1:1 Supervisor Redundancy Architecture

Stateful Synchronization

‒ System Variables

‒ Configuration – Running/Startup

‒ Layer 2/3 Protocol State and Topologies

‒ Policies – ACLs, QoS etc.

‒ Linecards Status

The Active Supervisor owns the control plane.

It builds the central and distributed forwarding tables.

Graceful system recovery by protecting hardware and software state machines

Architecture varies between modular systems

Page 53: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

NSF Works with SSO to Keep Neighbors Forwarding During a Supervisor Switchover

Non-Stop Forwarding (NSF) provides graceful restart enhancements to EIGRP, OSPF, IS-IS, BGP and LDP

An NSF-capable router continues to forward packets during an SSO processor recovery

NSF-aware and NSF-capable routers provide transparent routing protocol recovery

‒ Graceful restart extensions enable neighbor recovery without resetting adjacencies

‒ Routing database re-synchronization occurs in the background

[Diagram labels: NSF-Aware, NSF-Capable (two devices); NSF-Aware (one device).]

Page 54: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Cisco vs IETF OSPF NSF Capability

[Diagram: side-by-side message-sequence charts for Cisco NSF and IETF NSF, each between an NSF-capable and an NSF-aware router. Cisco NSF phases: restart event; Fast Hello (2 sec interval, RS bit set, then RS bit clear); out-of-band sync (Database Description, LSA Requests/Update); Hello (RS bit clear). IETF NSF phases: restart event; announce graceful restart (LS Update with Grace LSA, LS ACK with Grace LSA); OSPF discovery (Hello to 224.0.0.5); database exchange (Database Description, LSA Requests/Update).]

Recommendation: when peering with an IETF-capable device, enable IETF NSF capability with the "nsf ietf" command under the routing process.

Page 55: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032


High Availability Campus Design Agenda

Network Level Resiliency

‒ High Availability Design Principles

‒ Simplified and Redundant Campus Design

‒ Campus Routing Best Practices

System Level Resiliency

‒ Integrated Hardware and Software Resiliency

‒ Stateful Switchover and Non-Stop Forwarding

‒ Hitless Switching

Operational Level Resiliency

‒ Single and Multi-Chassis ISSU Upgrade

‒ Hitless NX-OS Software Upgrade

Page 56: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

StackWise+ Provides Stack-Ring Redundancy

Mixed Processing Architecture

‒ Centralized (Master) – CDP, LACP, Layer 3 (ARP/Routing/Multicast) and Management Plane.

‒ Distributed (all stack members) – MAC Learning, STP, QoS, ACL etc.

Distributed Forwarding Architecture

‒ Single Forwarding Table – The Master synchronizes the RIB/FIB with all stack-member switches

‒ Local switching – Within a port ASIC, and between port ASICs through the local switch fabric

1:N Master Switch redundancy in the stack ring, with dynamic re-election after failure

Protects the distributed L2/L3 FIB and gracefully restarts routing adjacencies

[Diagram: stack ring with Master and Distributed FIB, uplinked to a VSL-connected switch pair.]

Page 57: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Designating Master/Slave Switch in Stack-Ring

Any stack member can become Master. Increase the switch priority of the intended Master for a deterministic role.

Master switch failure detection, propagation and re-election can take 2-3 seconds.

The network recovery mechanism differs between designs –

‒ Master Switch with Uplink

‒ Master Switch without Uplink (Recommended)

Master (Priority=15)

Slave (Priority=14)

!Increase Master Switch Priority to 15(highest)

switch 5 priority 15

!Increase Slave Switch Priority to 14(lower than Master)

switch 6 priority 14
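The effect of the priority settings above can be sketched in Python. This is a simplified model rather than the full StackWise Plus election: it assumes only that the member with the highest priority wins and that a tie is broken by the lowest MAC address. The `StackMember` type, the third stack member and all MAC addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StackMember:
    number: int    # stack member number, e.g. 5 in "switch 5 priority 15"
    priority: int  # 1 to 15, higher is preferred
    mac: str       # hypothetical MAC address, used only as a tie-breaker


def elect_master(members):
    """Pick the master: highest priority first, then lowest MAC address."""
    return min(members, key=lambda m: (-m.priority, m.mac))


stack = [
    StackMember(number=5, priority=15, mac="0011.2233.4455"),
    StackMember(number=6, priority=14, mac="0011.2233.0001"),
    StackMember(number=7, priority=1, mac="0011.2233.0000"),
]

master = elect_master(stack)

# If the master fails, the remaining members re-elect among themselves.
survivors = [m for m in stack if m != master]
next_master = elect_master(survivors)
```

With explicit priorities, the outcome of the re-election is known in advance: switch 6 takes over regardless of MAC addresses.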


Page 58: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Multilayer – StackWise Plus Master Switch Recovery Analysis

Stack Design – 1

‒ Master Switch with Uplink

‒ No Slave Switch (same priority)

Stack Design – 2

‒ Master Switch with Uplink

‒ Slave Switch (w/o Uplink) set in stack-ring

Stack Design – 3 (Recommended)

‒ Master Switch w/o Uplink

‒ Slave Switch (w/o Uplink) set in stack-ring

[Chart: Catalyst 3750-X StackWise Plus Master Failure Analysis. Y-axis: Convergence (sec), 0 to 2.5. X-axis: Design - 1, Design - 2, Design - 3. Series: Upstream, Downstream.]

Page 59: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Graceful Routing with StackWise Plus

Routing adjacencies and the L3 FIB are preserved during Master failure.

Graceful routing capability is supported for EIGRP and OSPF.

The network recovery mechanism differs between designs –

‒ Master Switch with Uplink

‒ Master Switch without Uplink (Recommended)

router eigrp 100
 nsf
!
router ospf 100
 nsf

[Diagram: stack ring with Master and Distributed FIB, EIGRP / OSPF uplinks to a VSL-connected switch pair, NSF Recovery.]

Page 60: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Routed Access – StackWise Plus Master Switch Recovery Analysis

Stack Design – 1

‒ Master Switch with Uplink

‒ No Slave Switch (same priority)

Stack Design – 2

‒ Master Switch with Uplink

‒ Slave Switch (w/o Uplink) set in stack-ring

Stack Design – 3 (Recommended)

‒ Master Switch w/o Uplink

‒ Slave Switch (w/o Uplink) set in stack-ring

[Chart: Catalyst 3750-X StackWise Plus Master Failure Analysis – EIGRP Routed Access. Y-axis: Convergence (sec), 0 to 1.4. X-axis: Design - 1, Design - 2, Design - 3. Series: Upstream, Downstream.]

[Chart: Catalyst 3750-X StackWise Plus Master Failure Analysis – OSPF Routed Access. Y-axis: Convergence (sec), 0 to 1.4. X-axis: Design - 1, Design - 2, Design - 3. Series: Upstream, Downstream.]

Page 61: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

4500E SSO Architecture Protects Network Availability and Capacity

1+1 Supervisor Redundancy Architecture

Centralized Processing Architecture

‒ Active Supervisor maintains all three planes

‒ Real-time hardware and software state-machine synchronization from Active to Standby Supervisor

Centralized Forwarding Engine

‒ Switches data traffic between linecard modules

‒ Stub linecards – No local switching

Decouples Control and Forwarding Plane –

‒ Protects Network Capacity during Soft/Admin Forced Switchover

‒ IOS Software Upgrade

[Diagram: Catalyst 4500E with five line cards, Active Sup and Standby Sup in SSO redundancy; Forwarding Engine (FFE/VFE), Shared Memory Fabric (PPE/IPP).]

Page 62: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

MultiLayer – 4500E Supervisor Switchover Analysis

Stateful Layer 2 Protocol Synchronization

‒ STP, MAC Table, IGMP Snooping, PAgP etc.

Protects Network Capacity

‒ Maintains all uplinks, including those on the failed Sup

‒ All linecard modules remain operational

Deterministic <100 msec Convergence

‒ Forwarding engine decouples control and forwarding plane

‒ Sup fabric connectivity remains operational even after failure

[Chart: Y-axis: Convergence (sec), 0 to 0.12. Series: Upstream, Downstream, Multicast.]

[Diagram: 4500E with Active and Standby supervisors, uplinked to a VSL-connected switch pair.]

Page 63: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Routed Access – 4500E Supervisor Switchover Analysis

Stateful Layer 3 Protocol Synchronization

‒ EIGRP, OSPF, ARP etc.

‒ PIM SSO capability not supported.

Deterministic <100 msec Unicast Convergence

‒ Forwarding engine decouples control and forwarding plane

‒ Sup fabric connectivity remains operational even after failure

[Chart: Y-axis: Convergence (sec), 0 to 2.5. Series: Upstream, Downstream, Multicast.]

[Diagram: 4500E with Active and Standby supervisors, EIGRP / OSPF uplinks to a VSL-connected switch pair.]

Page 64: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

6500-E VSS Architecture

[Diagram: a standalone Catalyst 6500-E with intra-chassis SSO redundancy between two supervisors over the internal EOBC, and a VSS pair (VSS-SW1, VSS-SW2) of Catalyst 6500-E chassis with inter-chassis SSO redundancy over the external EOBC (VSL). Each supervisor contains SF, PFC and RP; line cards attach to the internal EOBC.]

SF : Switch Fabric

PFC : Policy Feature Card

RP : Route Processor

EOBC : Ethernet Out-of-Band Channel

Internal EOBC : Internal control channel between the supervisor and linecards within a single chassis

External EOBC : External control channel between the supervisors of the two chassis

Page 65: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Dual-Sup Inter-Chassis Redundancy

VSS Dual-Sup (one supervisor per virtual switch) supports inter-chassis SSO redundancy.

The single in-chassis supervisor takes the SSO Active or Standby role.

Stateful SSO synchronization and redundancy between virtual switches

Single Sup System Design –

‒ A supervisor switchover requires a chassis reset, including all linecard and service modules

‒ Network capacity is reduced until the system returns to operational state

[Diagram: Active and Standby virtual switches before and after a supervisor failure; NSF Recovery, Reduced Capacity.]

Page 66: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Quad-Sup Extends HA Capability

Starting with 12.2(33)SXI4, Sup720-10GE VSS supports two supervisor redundancy modes :

‒ Dual-Sup – One Sup per virtual-switch

‒ Quad-Sup – Two Sups per virtual-switch

Dual-Sup offers a single redundancy option –

‒ Inter-Chassis only. Resetting the Active or Standby supervisor reboots all installed modules

‒ A Sup hardware failure may increase MTTR, reduce network capacity and services availability, and make the network unreliable

Quad-Sup offers two redundancy options –

‒ Inter-Chassis – Same design as Dual-Sup

‒ Intra-Chassis – Allows the virtual switch to return to service, reduces MTTR and stabilizes the network after a major fault

[Diagram: Dual-Sup failure showing Single Point of Failure, Self Recovery Fail and Reduced Capacity; Quad-Sup failure showing New Active Supervisor and NSF Recovery.]

Sup720-10GE Quad-Sup Redundancy

Page 67: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Quad Sup Supports Dual HA Mode

Dual in-chassis supervisors, each in a different redundancy mode –

‒ In-chassis Active Supervisor (ICA) – In SSO Active or Standby mode

‒ In-chassis Standby Supervisor (ICS) – RPR-WARM mode

Stateful SSO synchronization from the SSO Active to the Standby supervisor

System configuration synchronization between ICA and ICS supervisors

The chassis resets when the ICA supervisor resets

[Diagram: SW1 and SW2 joined by VSL. SW1: ICA – SSO Active, ICS – RPR-WARM. SW2: ICA – SSO Standby, ICS – RPR-WARM. Intra-chassis Sup redundancy within each chassis, inter-chassis Sup redundancy between SW1 and SW2.]

Sup720-10GE Quad-Sup Redundancy

Page 68: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Quad Sup RPR-WARM Design

Provides system redundancy during a major ICA failure.

RPR-WARM – Sup in hybrid operational mode :

‒ ICS Supervisor – RPR cold-state with extended capabilities

‒ DFC Linecard – Distributed linecard with all available 1G/10G uplink ports for network connectivity.

The ICS synchronizes the following configuration from the ICA :

‒ Startup-Configuration

‒ VLAN Database

‒ Boot Variable

‒ VSS Virtual-Switch ID

[Diagram: SW1 (ICA – SSO Active, ICS – RPR-WARM) and SW2 (ICA – SSO Standby, ICS – RPR-WARM) joined by VSL.]

6500#show switch virtual redundancy | inc Switch|Current Software
My Switch Id = 1
Peer Switch Id = 2
Switch 1 Slot 5 Processor Information :
Current Software state = ACTIVE
Switch 1 Slot 6 Processor Information :
Current Software state = RPR-Warm
Switch 2 Slot 5 Processor Information :
Current Software state = STANDBY HOT (switchover target)
Switch 2 Slot 6 Processor Information :
Current Software state = RPR-Warm

Sup720-10GE Quad-Sup Redundancy
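The filtered "show switch virtual redundancy" output on this slide is regular enough to check with a script. The following is a minimal sketch, assuming the output has been captured as text; the sample is copied from the slide, while the regular expression and the function names `supervisor_states` and `quad_sup_healthy` are illustrative and not part of any Cisco tool.

```python
import re

# Sample output as shown on the slide (header command line omitted).
SAMPLE = """\
My Switch Id = 1
Peer Switch Id = 2
Switch 1 Slot 5 Processor Information :
Current Software state = ACTIVE
Switch 1 Slot 6 Processor Information :
Current Software state = RPR-Warm
Switch 2 Slot 5 Processor Information :
Current Software state = STANDBY HOT (switchover target)
Switch 2 Slot 6 Processor Information :
Current Software state = RPR-Warm
"""

PATTERN = re.compile(
    r"Switch (\d+) Slot (\d+) Processor Information :\s*"
    r"Current Software state = ([^\n]+)"
)


def supervisor_states(output):
    """Map (switch, slot) to the reported software state."""
    return {
        (int(sw), int(slot)): state.strip()
        for sw, slot, state in PATTERN.findall(output)
    }


def quad_sup_healthy(states):
    """True with one ACTIVE, one STANDBY HOT and two RPR-Warm supervisors."""
    values = list(states.values())
    return (
        sum(v == "ACTIVE" for v in values) == 1
        and sum(v.startswith("STANDBY HOT") for v in values) == 1
        and sum(v == "RPR-Warm" for v in values) == 2
    )


states = supervisor_states(SAMPLE)
```

A check like `quad_sup_healthy(states)` could be run before and after maintenance to confirm that both ICS supervisors returned to RPR-Warm.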

Page 69: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Graceful VSS Quad-Sup Deployment

Software Upgrade

‒ Upgrade the VSS supervisors (Active/Standby) to 12.2(33)SXI4 or later.

‒ Maintain network availability during the software upgrade with enhanced Fast Software Upgrade (eFSU)

Deploy ICS

‒ Install redundant (ICS) supervisors in each virtual-switch chassis.

‒ Boot up the ICS supervisor with the same software version and license as the ICA.

Redesign VSL

‒ Build full-mesh VSL physical paths between the quad supervisor modules.

‒ Bundle the new VSL connections in the VSL EC.

Failure to follow the recommended procedure may destabilize the VSS system and network operation

Sup720-10GE Quad-Sup Redundancy

Page 70: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Installing ICS Supervisor With Mismatched IOS Version

Incompatible IOS software between the ICA and ICS supervisors may force the ICS to fall back to ROMMON mode

An ICS with Quad-Sup software capability may be allowed to boot up with a mismatched IOS version in order to install a common software version

Disabling the IOS version mismatch check has no effect if the ICS boots up without Quad-Sup capability (pre-12.2(33)SXI4)

6500-VSS(config)#no switch virtual in-chassis standby bootup mismatch-check

[Diagram: SW1 ICA – SSO Active 12.2(33)SXI4, SW2 ICA – SSO Standby 12.2(33)SXI4. ICS with 12.2(33)SXI5: RPR-WARM. ICS with 12.2(33)SXI3: ROMMON.]

Sup720-10GE Quad-Sup Redundancy

Page 71: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

ICS Supervisor IOS Upgrade Process

Step – 1 Disable the IOS software version mismatch check from global configuration mode:

‒ 6500-VSS(config)#no switch virtual in-chassis standby bootup mismatch-check

Step – 2 Insert the ICS supervisor module in both chassis. Intra-chassis role negotiation allows the ICS to complete the bootup process in RPR-WARM mode

Step – 3 Copy the ICA-compatible IOS software version to both ICS supervisor modules:

‒ 6500-VSS#copy <image_src_path> sw1-slot6-disk0:<image>

‒ 6500-VSS#copy <image_src_path> sw2-slot6-disk0:<image>

Step – 4 Re-enable the IOS software version mismatch check from global configuration mode. Leaving it disabled may cause the chassis to go into RPR mode on the next switchover.

‒ 6500-VSS(config)#switch virtual in-chassis standby bootup mismatch-check

Step – 5 Force a reset of the ICS supervisor modules. On the next bootup, each ICS module boots with the ICA-compatible IOS software version:

‒ 6500-VSS#hw-module switch 1 ics reset

‒ 6500-VSS#hw-module switch 2 ics reset

Sup720-10GE Quad-Sup Redundancy
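The order of the five steps matters, so it can help to generate the command list once and review it before a change window. The sketch below only assembles the command strings shown on this slide; the function name `ics_upgrade_commands`, the (mode, command) tuple layout and the example image location are hypothetical, and step 2 (inserting the modules) is a manual action that the list does not cover.

```python
def ics_upgrade_commands(image_src, image, ics_slot=6):
    """Return (mode, command) pairs for steps 1, 3, 4 and 5, in order.

    ics_slot defaults to 6 because the slide copies to swN-slot6-disk0:.
    """
    mismatch = "switch virtual in-chassis standby bootup mismatch-check"
    steps = [("config", f"no {mismatch}")]                       # step 1
    steps += [
        ("exec", f"copy {image_src} sw{sw}-slot{ics_slot}-disk0:{image}")
        for sw in (1, 2)                                         # step 3
    ]
    steps.append(("config", mismatch))                           # step 4
    steps += [
        ("exec", f"hw-module switch {sw} ics reset")
        for sw in (1, 2)                                         # step 5
    ]
    return steps


# Hypothetical image location and file name, for illustration only.
plan = ics_upgrade_commands("tftp://192.0.2.10/new-image.bin", "new-image.bin")
```

Keeping the re-enable command (step 4) ahead of the resets (step 5) in the generated plan mirrors the slide's warning about leaving the mismatch check disabled.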

Page 72: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Dual and Quad Sup SSO Analysis

MEC-based network recovery mechanism with VSS in a dual-sup or quad-sup design.

Deterministic sub-second network convergence for unicast and multicast data traffic.

Only an SSO Active failure triggers graceful protocol recovery.

[Chart: 6500-VSS Dual/Quad Sup NSF/SSO Analysis – Unicast Application. Y-axis: Convergence (sec), 0 to 0.3. X-axis: EIGRP - ECMP, EIGRP - MEC, OSPF - ECMP, OSPF - MEC. Series: Upstream, Downstream.]

[Chart: 6500-VSS Dual/Quad Sup NSF/SSO Analysis – Multicast Application. Y-axis: Convergence (sec), 0 to 140. X-axis: ECMP, MEC. Series: Active-IIL, Standby-IIL.]

Sup720-10GE Quad-Sup Redundancy

Page 73: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Dual Sup – VSL Design

Two Cisco recommended designs

Profile 1 – VSL on Supervisor (Sup2T/Sup720-10GE)

‒ Cost-effective solution that leverages both supervisor uplinks.

‒ Continue to use non-VSL-capable linecards for the 10G core connection.

‒ Redundant fibers connect through a common fabric and ASICs, which could leave system stability vulnerable.

‒ Optimal and preset VSL parameters – Load-Balancing, QoS, HA, Traffic-engg, Dual-Active etc.

‒ Restricted to a bundle of 2 x VSL ports, or 20G switching capacity, per virtual-switch node.

Profile 2 – Diversified VSL between Supervisor (Sup2T/Sup720-10GE) and VSL-capable Linecard

‒ Redundant and diversified fibers between the supervisor and next-gen VSL-capable linecards.

‒ Same design as Profile 1, but increases system reliability because the VSL ports are diversified across different fabrics/ASICs.

‒ Optimal and preset VSL parameters – Load-Balancing, QoS, HA, Traffic-engg, Dual-Active etc.

‒ Flexible to scale up to 8 x VSL ports for high-density systems that aggregate uplinks, service modules, single-homed devices etc.

Sup2T and Sup720-10GE Design

Page 74: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Quad Sup – VSL Design

Same Design as Profile 1 – Dual Sup

‒ Flexible to increase VSL capacity

‒ Continue to leverage existing non-VSL 10G linecards for uplink connections

‒ Retains all original VSL benefits

‒ Vulnerable design during any supervisor self-recovery fault incident

Recommended Full-Mesh VSL on Quad-Sup

‒ Highly redundant and cost-effective VSL design.

‒ Increases overall VSL capacity

‒ Maintains 20G VSL capacity during a supervisor failure.

‒ Increases network reliability by minimizing the dual-active probability

[Diagram: SW1 and SW2 joined by VSL, with supervisors Sup-1 to Sup-4, shown once for each design.]

Sup720-10GE Quad-Sup VSL Redundancy
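The capacity claim for the full-mesh design can be checked with simple arithmetic. The sketch below is an illustrative model, not a description of the actual hardware wiring: it assumes 10G per VSL link, that Sup-1 and Sup-3 sit in SW1 and Sup-2 and Sup-4 in SW2, and that a link is usable only while the supervisors at both ends are up. The names `FULL_MESH`, `SAME_PAIR` and `vsl_capacity` are hypothetical.

```python
LINK_GBPS = 10  # assumed bandwidth of one VSL member link

# Full mesh: every SW1 supervisor has one link to every SW2 supervisor.
FULL_MESH = [
    ("Sup-1", "Sup-2"), ("Sup-1", "Sup-4"),
    ("Sup-3", "Sup-2"), ("Sup-3", "Sup-4"),
]

# Dual-Sup style: both VSL links terminate on the same supervisor pair.
SAME_PAIR = [("Sup-1", "Sup-2"), ("Sup-1", "Sup-2")]


def vsl_capacity(links, failed=()):
    """Sum the bandwidth of links whose two end supervisors are both up."""
    return sum(
        LINK_GBPS for a, b in links if a not in failed and b not in failed
    )


full_mesh_normal = vsl_capacity(FULL_MESH)                      # 40
full_mesh_one_down = vsl_capacity(FULL_MESH, failed=("Sup-1",))  # 20
same_pair_one_down = vsl_capacity(SAME_PAIR, failed=("Sup-1",))  # 0
```

Under these assumptions, the full mesh keeps 20G of VSL capacity after any single supervisor failure, whereas terminating both links on one supervisor pair can take the whole VSL down, which is the condition that leads to dual-active.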

Page 75: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Dual-Active Detection Redundancy

Failure of all VSL links forces both virtual switches into the ACTIVE role, a condition known as Dual-Active

A Dual-Active condition confuses neighbor devices and destabilizes the network.

Two detection and recovery mechanisms :

‒ Direct = Dual-Active Fast Hello or BFD

‒ Indirect = Enhanced PAgP (ePAgP)

Use both ePAgP and Fast-Hello for redundancy

The BFD detection mechanism is deprecated starting with 15.0(SY1)

[Diagram: VSS pair with an ePAgP Layer 2 Port-Channel to a Catalyst 2K/3K/4K, an ePAgP Layer 3 Port-Channel, and a direct Fast-Hello link.]

!Enable Enhanced PAgP on trusted L2/L3 Port-Channel interface
6500-VSS(config-vs-domain)#dual-active detection pagp trust channel-group 101
!
!Enable dual-active fast-hello on directly connected interface (copper/fiber)
6500-VSS(config)#interface range Gi1/1/1 , Gi2/1/1
6500-VSS(config-if-range)#dual-active fast-hello

Dual-Sup or Quad-Sup VSL Redundancy
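The reason for running two detection mechanisms can be shown with a small decision function. This is a simplified sketch: it assumes that any one working detection path is enough to detect the condition, after which the previously active chassis enters recovery mode and shuts down its non-VSL interfaces. The function name and the returned labels are hypothetical.

```python
def dual_active_outcome(vsl_up, fast_hello_up, epagp_up):
    """Classify the VSS state after a possible VSL failure."""
    if vsl_up:
        return "normal"
    if fast_hello_up or epagp_up:
        # Detected: the previously active chassis enters recovery mode
        # and shuts down its non-VSL interfaces.
        return "recovery"
    return "undetected dual-active"


# VSL fails together with the Fast-Hello link, but ePAgP still detects it.
outcome = dual_active_outcome(vsl_up=False, fast_hello_up=False, epagp_up=True)
```

Only the case where the VSL and every detection path fail together leaves both chassis active and undetected, which is why the slide recommends ePAgP and Fast-Hello side by side.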

Page 76: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Dual-Active Recovery Analysis

Dual-Active network recovery depends on –

‒ Uplink Network Design – ECMP vs MEC

‒ Routing Protocols – EIGRP vs OSPF

‒ Detection Mechanism – Fast-Hello vs ePAgP

OSPF with ECMP detects the failure faster than ePAgP, which results in slow network convergence

Starting with 12.2(33)SXI3, Dual-Active Fast-Hello performs rapid failure detection and delivers deterministic recovery independent of network design and protocol

[Chart: 6500E VSS – Dual-Active Recovery Analysis – ePAgP. Y-axis: Convergence (sec), 0 to 35. X-axis: EIGRP - ECMP, EIGRP - MEC, OSPF - ECMP, OSPF - MEC. Series: Upstream, Downstream.]

[Chart: 6500E VSS – Dual-Active Recovery Analysis – Fast-Hello. Y-axis: Convergence (sec), 0 to 0.5. X-axis: EIGRP - ECMP, EIGRP - MEC, OSPF - ECMP, OSPF - MEC. Series: Upstream, Downstream.]

Dual-Sup or Quad-Sup VSL Redundancy

Page 77: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032


High Availability Campus Design Agenda

Network Level Resiliency

‒ High Availability Design Principles

‒ Simplified and Redundant Campus Design

‒ Campus Routing Best Practices

System Level Resiliency

‒ Integrated Hardware and Software Resiliency

‒ Stateful Switchover and Non-Stop Forwarding

‒ Hitless Switching

Operational Level Resiliency

‒ Single and Multi-Chassis ISSU Upgrade

‒ Hitless NX-OS Software Upgrade

Page 78: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Nexus 7000 Distributed Architecture

[Diagram: Nexus 7018 with ACTIVE and STANDBY supervisors (each holding URIB, MRIB, FIB) in SSO synchronization, I/O modules with distributed IPFIB/MFIB and local switching, and five fabric modules (Crossbar Fabric ASICs 1 to 5), each providing 46Gbps/slot.]

URIB : Unicast Routing Info Base

MRIB : Multicast Routing Info Base

FIB : Forwarding Info Base

MFIB : Multicast Forwarding Info Base

Page 79: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Hitless Supervisor Redundancy with Nexus 7000

1+1 Supervisor Redundancy architecture

Decouples the centralized control plane from the distributed forwarding plane

Redundant central arbiter

Hitless Supervisor Switchover with –

‒ Distributed I/O Module

‒ Crossbar Fabric Module

[Diagram: Active and Standby supervisors, each with CPU, CMP and CA; the roles swap after a switchover with NSF Recovery.]

Page 80: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Fabric Module Capacity and Redundancy

Nexus 7018

‒ 1 x 23G channel per supervisor slot

‒ 2 x 23G channels per I/O module slot

Per-slot bandwidth with 1 to 5 fabric modules (46Gbps/slot each): 46Gbps, 92Gbps, 138Gbps, 184Gbps, 230Gbps

[Diagram: Nexus 7018 with an 8x10GE I/O Module and five fabric modules (Crossbar Fabric ASICs 1 to 5). Callouts: Insufficient Capacity, Required for 80G/slot, N+1 Redundancy, N+1 Redundancy AND Future Proof.]
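The fabric sizing on this slide is straightforward arithmetic and can be sketched as follows. The 46 Gbps per fabric module and the 80G/slot requirement come from the slide; the function names are hypothetical, and the model assumes per-slot bandwidth scales linearly with the number of installed fabric modules.

```python
import math

FABRIC_GBPS_PER_SLOT = 46  # per fabric module, from the slide


def per_slot_bandwidth(modules):
    """Per-slot fabric bandwidth with the given number of fabric modules."""
    return modules * FABRIC_GBPS_PER_SLOT


def modules_needed(required_gbps, n_plus_1=False):
    """Smallest number of fabric modules that covers the requirement."""
    base = math.ceil(required_gbps / FABRIC_GBPS_PER_SLOT)
    return base + 1 if n_plus_1 else base


bandwidths = [per_slot_bandwidth(n) for n in range(1, 6)]
needed = modules_needed(80)                        # capacity only
needed_redundant = modules_needed(80, n_plus_1=True)  # survives one failure
```

For an 80G/slot I/O module this gives two fabric modules for capacity and three for N+1 redundancy; any further modules add headroom beyond that.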

Page 81: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Nexus 7000 Crossbar Failure May Cause Fabric Congestion

A Crossbar Fabric module failure reduces internal switching capacity and may cause congestion

Supervisor and I/O modules remain operational

No network topology change is triggered.

%XBAR-2-XBAR_INSUFFICIENT_XBAR_BANDWIDTH: Module in slot 1 has insufficient xbar-bandwidth.

[Diagram: modules at 80G/Slot have Symmetric Forwarding Capacity; with one module at 46G/Slot the result is Asymmetric Forwarding Capacity. No Topology Change in either case.]

Page 82: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Hitless Fabric Switching with Nexus 7000

1. Right and Left Ejector - Open

2. Signal Software to start graceful data re-routing

3. Hitless data re-routing

4. Fabric Interface Shutdown

5. Crossbar Fabric Module Power Down

[Diagram: the five numbered steps marked on the fabric module, resulting in a Hitless Fabric Switchover.]

Page 83: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032


High Availability Campus Design Agenda

Network Level Resiliency

‒ High Availability Design Principles

‒ Simplified and Redundant Campus Design

‒ Campus Routing Best Practices

System Level Resiliency

‒ Integrated Hardware and Software Resiliency

‒ Stateful Switchover and Non-Stop Forwarding

‒ Hitless Switching

Operational Level Resiliency

‒ Single and Multi-Chassis ISSU Upgrade

‒ Hitless NX-OS Software Upgrade

Page 84: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

In Service Software Upgrade Allows Upgrade Without Taking Switch Down

In a redundant topology, standard maintenance practice is to shut down devices during an upgrade and let the network converge

ISSU provides the ability to upgrade software in place without having to shut down

Offers significant uptime improvements

[Diagram: Scheduled Maintenance: Half Capacity. ISSU: All Paths and Switches Active During Upgrade.]

Page 85: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

ISSU – Graceful IOS Software Upgrade Cycle

Start: ACTIVE (OLD), STANDBY (OLD)

issu loadversion – Standby Sup reboots with new software version. Result: ACTIVE (OLD), STANDBY (NEW)

issu runversion – SSO switchover and new software becomes effective. Result: STANDBY (OLD), ACTIVE (NEW)

issu acceptversion – Acknowledge successful new software activation (Optional). Result: STANDBY (OLD), ACTIVE (NEW)

issu commitversion – Commit and reboot the STANDBY with new software. Result: STANDBY (NEW), ACTIVE (NEW)

issu abortversion – Return to original version
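The upgrade cycle on this slide behaves like a small state machine, which the following Python sketch models for two supervisors. It is a simplification under stated assumptions: the supervisor names `sup1` and `sup2` are hypothetical, `acceptversion` is treated as a pure acknowledgement, and `abortversion` is modeled as restoring the starting roles and versions, which is coarser than what a real system does.

```python
def issu_cycle(commands):
    """Apply ISSU commands to a two-supervisor system; return the final state."""
    initial = {"sup1": ["ACTIVE", "OLD"], "sup2": ["STANDBY", "OLD"]}
    state = {name: list(v) for name, v in initial.items()}
    for cmd in commands:
        active = next(n for n, (role, _) in state.items() if role == "ACTIVE")
        standby = next(n for n, (role, _) in state.items() if role == "STANDBY")
        if cmd in ("loadversion", "commitversion"):
            # The current standby reboots with the new software version.
            state[standby][1] = "NEW"
        elif cmd == "runversion":
            # SSO switchover: the two supervisors swap roles.
            state[active][0], state[standby][0] = "STANDBY", "ACTIVE"
        elif cmd == "acceptversion":
            pass  # acknowledgement only, no role or version change
        elif cmd == "abortversion":
            # Assumption: modeled as a return to the starting state.
            state = {name: list(v) for name, v in initial.items()}
        else:
            raise ValueError(f"unknown ISSU command: {cmd}")
    return {name: tuple(v) for name, v in state.items()}


final = issu_cycle(["loadversion", "runversion", "acceptversion", "commitversion"])
```

Running the full sequence leaves both supervisors on the new version with the roles swapped, matching the last row of the cycle above.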

Page 86: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

ISSU Software Upgrade Prep List

Save the system configuration to local storage and to a remote server (TFTP/FTP)

Copy the new software (same version/license) to the local storage of the Active and Standby Supervisors and change the boot parameters to the new software version

Verify that NSF capability is enabled under the routing process

Avoid the following major system changes until the software upgrade process completes –

‒ Adding or removing hardware modules

‒ Modifying software configuration

‒ Modifying boot registers

Page 87: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Simplified Catalyst 4500E ISSU Upgrade Process

Manual Upgrade

‒ Supported on all Supervisor Modules

‒ Attentive four-step manual software upgrade process

‒ Opportunity to verify and upgrade new software

‒ Steps: issu loadversion (LV), runversion (RV), acceptversion (AV), commitversion (CV)

Automatic Upgrade

‒ Supported Supervisor Modules – Sup7E starting 3.1.0SG; Sup6E/Sup6L-E starting 15.0.2SG

‒ Single-CLI and automated software upgrade process

‒ Opportunity to schedule the new software upgrade

‒ Steps: issu changeversion (ChV), followed by runversion (RV) and commitversion (CV)

Recommendation : Use both methods for safe and graceful software roll-out in large deployments

Page 88: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Catalyst 4500E Network Recovery with ISSU

Protects network capacity during the entire software upgrade process

Real-time software upgrade with NSF/SSO capability

Completes the entire software upgrade process with <50 msec loss in a Multilayer design

[Chart: 4500E Network Recovery With ISSU Software Upgrade – Multilayer Design. Y-axis: Convergence (sec), 0 to 0.02. X-axis: issu loadversion, issu runversion, issu commitversion. Series: Upstream, Downstream, Multicast.]

[Chart: 4500E Network Recovery with ISSU Software Upgrade – Routed Access Design. Y-axis: Convergence (sec), 0 to 2.5. X-axis: issu loadversion, issu runversion, issu commitversion. Series: Upstream, Downstream, Multicast.]

Page 89: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Inter-Chassis Software Upgrade Process

Starting with 12.2(33)SXI, 6500 VSS supports enhanced Fast Software Upgrade (eFSU)

1 ISSU LoadVersion – Triggers the Standby chassis to reset with the new software version.

2 ISSU RunVersion – Forces an SSO switchover and makes the new software version operational. The new Active starts graceful protocol recovery. The Active switch starts the ISSU roll-back timer after the Standby becomes operational.

3 ISSU AcceptVersion – Stops the roll-back timer.

4 ISSU CommitVersion – Triggers the Standby chassis to reset with the new software version.

[Diagram: SW1 and SW2 joined by VSL, with the Active and Standby roles swapping across steps 1 to 4.]

Dual-Sup eFSU Upgrade Process

Page 90: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

VSS Quad Sup Software Upgrade Process

1 ISSU LoadVersion – Triggers the ICA and ICS Supervisor modules in the Standby chassis to reset with the new software version.

2 ISSU RunVersion – Forces an SSO switchover and makes the new software version operational. The new Active starts graceful protocol recovery. The Active switch starts the ISSU roll-back timer after the Standby becomes operational.

3 ISSU AcceptVersion – Stops the roll-back timer.

4 ISSU CommitVersion – Triggers the ICA and ICS Supervisor modules in the Standby chassis to reset with the new software version.

[Diagram: SW1 and SW2 joined by VSL, with the Active and Standby roles swapping across steps 1 to 4.]

Sup720-10GE Quad-Sup eFSU Upgrade Process

Page 91: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032

Catalyst 6500E VSS Network Recovery with eFSU

Network capacity is reduced until the Standby chassis becomes operational

Network availability is maintained with MEC

The MEC-based recovery mechanism completes the entire software upgrade process with ~1 second of traffic loss

[Chart: 6500E – VSS Dual/Quad Sup Network Recovery with eFSU Software Upgrade. Y-axis: Convergence (sec), 0 to 0.25. X-axis: issu loadversion, issu runversion, issu commitversion. Series: Upstream, Downstream, Multicast.]

Page 92: High Availability in Campus Network 2012-Usa-PDF-BRKCRS-3032


High Availability Campus Design Agenda

Network Level Resiliency

‒ High Availability Design Principles

‒ Simplified and Redundant Campus Design

‒ Campus Routing Best Practices

System Level Resiliency

‒ Integrated Hardware and Software Resiliency

‒ Stateful Switchover and Non-Stop Forwarding

‒ Hitless Switching

Operational Level Resiliency

‒ Single and Multi-Chassis ISSU Upgrade

‒ Hitless NX-OS Software Upgrade


Nexus 7000 NX-OS ISSU Benefits

Simplified – Single CLI command to upgrade (system/kickstart) several distributed hardware components.

Automated – Fully automates the upgrade process in serial order.

Reliable – Runs a compatibility test of the new software against the current hardware inventory and generates an impact report prior to initiating the upgrade.

Hitless – Graceful and non-disruptive procedure; leverages the distributed forwarding architecture to upgrade the entire system with zero packet loss.

[Diagram: Hitless ISSU Upgrade, covering the supervisor components (System, Kickstart, CMP, CMP-BIOS) and the I/O module BIOS]


Nexus 7000 NX-OS ISSU Upgrade Prep List

Save the system configuration locally and back it up to a remote server (TFTP/FTP)

Copy the new software to local storage on both the Active and Standby Supervisors

Run the new software compatibility test and generate a detailed upgrade analysis report

‒ show install all impact system bootflash:/<system-image-name> kickstart bootflash:/<kickstart-image-name>

Avoid the following major system changes until the software upgrade process completes:

- Adding or removing hardware modules

- Modifying the software configuration

- Modifying boot registers
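The compatibility check in the prep list and the later `install all` step take the same two image arguments, so it can help to generate both command lines from one pair of names. The following is a minimal sketch in Python; the function name, the dictionary keys, and the `fs` parameter are hypothetical, and the command syntax is copied from this deck rather than verified against a particular NX-OS release.

```python
def install_commands(system_image, kickstart_image, fs="bootflash:"):
    """Build the impact-check and install command lines for one image pair."""
    for name in (system_image, kickstart_image):
        # Reject empty names and names with whitespace, which would split the CLI line.
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid image name: {name!r}")
    target = f"system {fs}/{system_image} kickstart {fs}/{kickstart_image}"
    return {
        "impact": f"show install all impact {target}",
        "install": f"install all {target}",
    }


if __name__ == "__main__":
    # Placeholder file names; substitute the images copied to local storage.
    cmds = install_commands("system.bin", "kickstart.bin")
    print(cmds["impact"])
    print(cmds["install"])
```

Building both strings from one `target` keeps the impact report and the actual installation pointed at the same images.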



Compatibility check is done:
Module  bootable  Impact          Install-type  Reason
------  --------  --------------  ------------  ------
1       yes       non-disruptive  rolling
2       yes       non-disruptive  rolling
5       yes       non-disruptive  reset
6       yes       non-disruptive  reset

Module  Image      Running-Version(pri:alt)               New-Version         Upg-Required
------  ---------  -------------------------------------  ------------------  ------------
1       lc1n7k     5.0(5)                                 5.1(1a)             yes
1       bios       v1.10.14(04/02/10):v1.10.14(04/02/10)  v1.10.14(04/02/10)  no
2       lc1n7k     5.0(5)                                 5.1(1a)             yes
2       bios       v1.10.14(04/02/10):v1.10.14(04/02/10)  v1.10.14(04/02/10)  no
5       system     5.0(5)                                 5.1(1a)             yes
5       kickstart  5.0(5)                                 5.1(1a)             yes
5       bios       v3.22.0(02/20/10):v3.22.0(02/20/10)    v3.22.0(02/20/10    no
5       cmp        5.0(2)                                 5.1(1)              yes
5       cmp-bios   02.01.05                               02.01.05            no
6       system     5.0(5)                                 5.1(1a)             yes
6       kickstart  5.0(5)                                 5.1(1a)             yes
6       bios       v3.22.0(02/20/10):v3.22.0(02/20/10)    v3.22.0(02/20/10    no
6       cmp        5.0(2)                                 5.1(1)              yes
6       cmp-bios   02.01.05                               02.01.05            no

Do you want to continue with the installation (y/n)? [n] Y
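Before answering the prompt, the impact table is the part worth checking: every module should be bootable and non-disruptive. As a rough illustration, the Python sketch below parses that first table from captured text and reports whether the upgrade looks hitless. The function names are hypothetical, the sample rows are taken from the output on this slide, and the parsing assumes whitespace-separated columns, which may not hold for every release.

```python
SAMPLE_REPORT = """\
Module  bootable  Impact          Install-type  Reason
------  --------  --------------  ------------  ------
1       yes       non-disruptive  rolling
2       yes       non-disruptive  rolling
5       yes       non-disruptive  reset
6       yes       non-disruptive  reset
"""


def parse_impact_rows(report):
    """Extract per-module rows of the impact table from captured CLI text."""
    rows = []
    for line in report.splitlines():
        parts = line.split()
        # Data rows look like: <module> <yes|no> <impact> <install-type> [reason]
        if len(parts) >= 4 and parts[0].isdigit() and parts[1] in ("yes", "no"):
            rows.append(
                {
                    "module": int(parts[0]),
                    "bootable": parts[1] == "yes",
                    "impact": parts[2],
                    "install_type": parts[3],
                }
            )
    return rows


def is_hitless(rows):
    """True only if at least one row was parsed and none is disruptive."""
    return bool(rows) and all(
        row["bootable"] and row["impact"] == "non-disruptive" for row in rows
    )


if __name__ == "__main__":
    rows = parse_impact_rows(SAMPLE_REPORT)
    for row in rows:
        print(row["module"], row["impact"], row["install_type"])
    print("hitless:", is_hitless(rows))
```

Rows from the second table (image versions) are skipped because their second column is an image name rather than `yes` or `no`.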

Nexus 7000 Hitless NX-OS Upgrade Process

[Diagram: Hitless ISSU Upgrade. Active and Standby supervisors (System, Kickstart, CMP, CMP-BIOS) and I/O modules (BIOS), annotated with the step numbers below]

N7K# install all system bootflash:///<system-image-name> kickstart bootflash:///<kickstart-image-name>

1. install all … – Starts the compatibility test and generates the impact report. Based on user input, the ISSU upgrade process proceeds or terminates.

2. Updates the boot variable and resets the Standby supervisor so that it reboots with the new NX-OS software.

3. The Active supervisor resets and performs a hitless SSO switchover, then reboots with the new NX-OS software. This step puts the new NX-OS software into effect.

4. Starts the non-disruptive I/O module upgrade in serial order. Each module's CPU rolls over to the new NX-OS software while the module remains operational during the upgrade.

5. Upgrades the CMP and BIOS on the Active and Standby supervisors.
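The ordering of the five steps on this slide matters more than any single step: supervisors first, then I/O modules one at a time, then CMP and BIOS. A minimal Python sketch of that ordering follows; the function name and step strings are hypothetical, and upgrading modules in ascending slot order is an assumption, since the slide only says "serial order".

```python
def nxos_issu_plan(io_module_slots):
    """Return the upgrade steps in the order described on this slide."""
    plan = [
        "run compatibility test, show impact report, wait for confirmation",
        "update boot variable and reset Standby supervisor with new NX-OS",
        "SSO switchover; former Active supervisor reboots with new NX-OS",
    ]
    # I/O modules are upgraded one at a time, never in parallel.
    # Ascending slot order is an assumption for this sketch.
    for slot in sorted(io_module_slots):
        plan.append(f"upgrade I/O module in slot {slot}")
    plan.append("upgrade CMP and BIOS on Active and Standby supervisors")
    return plan


if __name__ == "__main__":
    for number, step in enumerate(nxos_issu_plan([2, 1]), start=1):
        print(number, step)
```

Because only one I/O module is touched at a time, the remaining modules keep forwarding while each one is upgraded.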


Summary

Simplify and optimize your campus network design with system and network consolidation to maintain application performance even during common network faults.

Leverage hardware-based fault detection for scale-independent and deterministic network recovery.

Build a non-stop communication network with system-level redundancy in all campus layers – Access / Distribution / Core.

Design a mission-critical campus backbone that offers scale flexibility, key foundational services, and uncompromised high availability.

Reduce the maintenance window and upgrade systems while maintaining network availability.



