Approaches to Designing a High-Performance Switch Router

Dr. Vishal Sharma
Principal Consultant, Metanoia, Inc.
Phone: +1 408-955-0910
Email: [email protected]
Web: http://www.metanoia-inc.com

Metanoia, Inc. Critical Systems Thinking™
© Copyright 2002. All Rights Reserved.


I delivered this talk/tutorial to multiple organizations, ranging from semiconductor houses to start-up system vendors to research and academic institutions, back in the 2002 time frame. As the abstract below illustrates, it captures the key principles behind the designs of two of the most popular, landmark switch/routers in our industry -- the Cisco...

Transcript of Approaches to Designing a High-Performance Switch Router


Page 2: Approaches to Designing a High-Performance Switch Router


Classification of Switch Architectures

1st gen.: shared-bus based
  - Bus-based with central memory, centralized processing
2nd gen.: advanced shared-bus based
  - Bus-based with local memory, distributed processing
3rd gen.: interconnection fabric with multiple parallel paths
  - Crossbar or cross-point switch, rings, …
4th gen.: distributed switch
  - Interconnect smaller, ASIC-based 1st, 2nd, or 3rd generation switches in a regular topology
  - Centralized, high-performance switch core, with distributed line cards

Page 3: Approaches to Designing a High-Performance Switch Router


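Switch Architectures: Shared-bus with Central Memory

[Figure: line cards LC1 … LCN (each at rate R, with DMA) attached to a shared backplane bus together with a CPU and central memory; arrows numbered 1-4 mark the packet's bus crossings.]

Without DMA a packet crosses the bus 4 times (2 times with DMA).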

Page 4: Approaches to Designing a High-Performance Switch Router


Switch Architectures: Shared-bus with Central Memory

Blocking if bus bandwidth or CPU processing < 4·N·R (2·N·R with DMA), for N line cards each running at rate R
Delay: function of memory I/O speed and CPU processing
Throughput: upper-bounded by min(bus speed, CPU power)
  - Most commercial Ethernet switching platforms: 1-2 Gb/s backplane
  - The most expensive backplanes today could yield up to 20 Gb/s
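
The blocking condition on this slide can be checked numerically. The following is a minimal sketch, assuming N line cards that each run at rate R; the function names and the sample values (8 cards at 100 Mb/s on a 2 Gb/s bus) are illustrative and not taken from the talk.

```python
def required_bus_bandwidth(n_cards, rate_bps, dma=False):
    """Bus bandwidth needed to avoid blocking in a shared-bus,
    central-memory switch: 4*N*R without DMA, 2*N*R with DMA."""
    crossings = 2 if dma else 4  # bus crossings per packet
    return crossings * n_cards * rate_bps


def is_blocking(bus_bps, n_cards, rate_bps, dma=False):
    """True if the bus is too slow to carry the worst-case load."""
    return bus_bps < required_bus_bandwidth(n_cards, rate_bps, dma)


# Hypothetical system: 8 line cards at 100 Mb/s on a 2 Gb/s bus.
BUS = 2_000_000_000
print(is_blocking(BUS, 8, 100_000_000, dma=False))  # needs 3.2 Gb/s
print(is_blocking(BUS, 8, 100_000_000, dma=True))   # needs 1.6 Gb/s
```

With these assumed numbers the same bus blocks without DMA and does not block with DMA, which is the factor-of-two difference the slide points out.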

Page 5: Approaches to Designing a High-Performance Switch Router


Switch Architectures: Shared-bus with Central Memory

Example: Cisco Catalyst 2820 Ethernet Switch (also 1900 family)
  - 24 10BaseT and 2 100BaseT full-duplex ports (on 2820)
  - 440 Mbps x 2 = 880 Mbps minimum bus throughput required
  - Bus bandwidth: 1 Gb/s
  - CPU: Intel 486 with 1 MB of flash
  - Central memory: 3 MB of RAM
Observations:
  - 10 Mbps ports require 20 Kpps/port for 64B packets; available: 14.8 Kpps per port
  - Required: 880 Kpps aggregate forwarding performance (Ethernet + Fast Ethernet); available: 450 Kpps
  - Performance is CPU limited (not bus-bandwidth limited)
  - Latency: ~70 us
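
The arithmetic behind this example can be reproduced in a few lines. This is a sketch that uses only the figures on the slide; it assumes 64-byte packets with framing overhead ignored, and it applies the slide's rounded 20 Kpps per 10 Mbps when computing the aggregate. The variable names are illustrative.

```python
PACKET_BITS = 64 * 8  # 64-byte packets, framing overhead ignored (assumption)

# Port mix from the slide: 24 x 10BaseT and 2 x 100BaseT.
ports_mbps = [10] * 24 + [100] * 2
aggregate_mbps = sum(ports_mbps)          # 440 Mbps
bus_required_mbps = 2 * aggregate_mbps    # 880 Mbps, as on the slide

# Exact packet rate of a 10 Mbps port; the slide rounds this to 20 Kpps.
pps_per_10mbps = 10e6 / PACKET_BITS       # 19531.25 packets/s

# Aggregate requirement using the slide's rounded 20 Kpps per 10 Mbps.
required_kpps = 20 * aggregate_mbps // 10  # 880 Kpps
available_kpps = 450                       # from the slide
bus_mbps = 1000                            # 1 Gb/s bus, from the slide

# The bus has enough bandwidth, but forwarding capacity falls short.
cpu_limited = bus_mbps >= bus_required_mbps and available_kpps < required_kpps
print(aggregate_mbps, bus_required_mbps, required_kpps, cpu_limited)
```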

Page 6: Approaches to Designing a High-Performance Switch Router


Switch Architectures: Shared-bus, Distributed Memory & Processing

[Figure: line cards LC1 … LCN (each at rate R), each with its own routing/lookup, buffering, and DMA, share a backplane bus with a CPU and memory that provide the full routing function. Path 1 is labeled "Fast Path" and path 2 "Slow Path".]

Page 7: Approaches to Designing a High-Performance Switch Router


Switch Architectures: Shared-bus with Distributed Memory

Blocking if bus bandwidth or CPU processing < 2·N·R (N·R with DMA)
Delay: function of memory I/O speed and CPU processing
Packet forwarding via dedicated engines, one per line card (LC)
  - Allows line-rate forwarding, even with small packets
  - Enables design parameter adjustment based on LC type
Throughput: upper-bounded by min(bus speed, forwarding engine)

Page 8: Approaches to Designing a High-Performance Switch Router


Switch Architectures: Shared-bus with Distributed Memory

Example: 3Com CoreBuilder 5000 Switching System
  - 17 slots/chassis, 24 10BaseT's/slot or 4 100BaseT's/slot (or port)
  - 17 x 24 x 10 Mb/s = 4.08 Gb/s minimum bus throughput required!
  - Bus bandwidth: 2 Gb/s max. (3.9 Mpps @ 64B/packet)
  - CPU + 18 MB DRAM: for address learning, fragmentation, SPT algorithm
  - Packet switching: custom ASIC + 4 MB DRAM per slot, for forwarding and filtering
Observations:
  - Required: 480 Kpps/slot (Ethernet) or 800 Kpps/slot (Fast Ethernet); available: 650 Kpps per switching ASIC
  - Performance here is bus-bandwidth limited (not forwarding limited)
  - Latency: ~45-100 us
  - Jitter: ~5 us
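
The numbers in this example follow from the same kind of arithmetic. The sketch below recomputes them from the slide's figures, assuming 64-byte packets with no framing overhead and the rounded rates of 20 Kpps per 10 Mbps port and 200 Kpps per 100 Mbps port; the variable names are illustrative.

```python
PACKET_BITS = 64 * 8  # 64-byte packets, framing overhead ignored (assumption)

# Chassis fully populated with 10BaseT cards (figures from the slide).
slots, ports_per_slot, port_mbps = 17, 24, 10
required_bus_gbps = slots * ports_per_slot * port_mbps / 1000  # 4.08 Gb/s

bus_gbps = 2.0                                   # maximum bus bandwidth
bus_mpps = bus_gbps * 1e9 / PACKET_BITS / 1e6    # 3.90625, i.e. ~3.9 Mpps

# Per-slot forwarding requirement, using the rounded per-port rates.
eth_kpps_per_slot = 24 * 20    # 480 Kpps (24 x 10BaseT)
fe_kpps_per_slot = 4 * 200     # 800 Kpps (4 x 100BaseT)
asic_kpps = 650                # available per switching ASIC

bus_limited = bus_gbps < required_bus_gbps       # bus is the bottleneck
asic_covers_eth = asic_kpps >= eth_kpps_per_slot
print(required_bus_gbps, bus_mpps, bus_limited, asic_covers_eth)
```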

Page 9: Approaches to Designing a High-Performance Switch Router


Switch Architectures: Inter-connect Fabric with Multiple Parallel Paths

[Figure: line cards 1 … N, each with a MAC, forwarding engine, local memory, and interconnect interface (I/F), attached across a backplane or midplane to a switch interconnect; a CPU with memory provides the full routing function.]

Page 10: Approaches to Designing a High-Performance Switch Router


Switch Architectures: Inter-connect Fabric with Multiple Parallel Paths

Non-blocking (for unicast) if crossbar or shared memory with adequate bandwidth (2·N·R)
Delay: 10s of us (in an unloaded system)
Throughput: full line rate, subject to queueing discipline
  - Provided LC processing & interconnect scheduling keep up
  - Note that this is not always the case!
Applicability: state of the art for many current switches/routers
  - Cisco GSR 12000 family (high-end, core router, 98-99)
  - Ascend GRF (mid-end router, 96-97)
  - Cisco Catalyst 8500 (low-end, enterprise router, 97-98)

Page 11: Approaches to Designing a High-Performance Switch Router


Switch Architectures: Distributed Switch

Interconnect smaller switches, each with the architecture of a 1st, 2nd, or 3rd generation switch.

The smaller switches are usually ASIC based

Connected in a specific topology, such as a hypercube or mesh (more on this ahead)

[Figure: several 1st-, 2nd-, or 3rd-generation switches joined by a distributed interconnect, with a route processor with memory (RP, Mem).]

Page 12: Approaches to Designing a High-Performance Switch Router


Switch Architectures: Distributed Switch

[Figure: line cards 1 … N on each side of a central switch core with a route processor and memory (RP, Mem), linked by electrical or optical connections.]

Centralized, high-performance switch core, with distributed line cards

Switch core and line cards may be in different chassis

Interconnect composed of optical or electronic links

Page 13: Approaches to Designing a High-Performance Switch Router


Functional Map of Processing in a Typical IP Router

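[Figure: functional blocks of a line card. Labels: physical layer (O/E, E/O), input framing, output framing, lookup engine, lookup tables, traffic manager, buffer/state memory (two instances), link scheduler, fabric I/F (to fabric and from fabric), and a uP connected to the route processor. The lookup and traffic-management blocks are grouped under "Packet Processing".]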

Page 14: Approaches to Designing a High-Performance Switch Router


A Canonical Realization of the Functional Map

[Figure: component-level realization of the functional map. Labels: transceivers, input framer, output framer, network processor with co-processor, traffic managers, SDRAM and DRAM (lookup table, buffer memory), buffer/state memory, fabric I/F, switch fabric, and an LCP connected to the route processor. Interfaces shown: SFI-4, SPI-4, PCI, and 3.125 Gb/s SERDES.]

Page 15: Approaches to Designing a High-Performance Switch Router


Juniper M40 and M160: A Comparison

                               M40                       M160
Throughput                     20 Gb/s                   80 Gb/s
Processing @ 64B packets       40 Mpps (1 pkt. proc.)    160 Mpps (4 pkt. procs.)
Back/mid-plane (full duplex)   25.6 Gb/s                 102.4 Gb/s
Data slots                     8 (4 ports/slot)          8 (4 ports/slot)
Data ports (max.)              8 OC-48                   8 OC-192
Power (max.)                   1.7 KW                    3.4 KW
Weight                         280 lb                    370 lb
Size                           Half telco rack           Half telco rack
Dimensions (HxWxD in.)         35x19x23.5                35x19x29
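
The processing figures in the table are close to the throughput divided by the size of a 64-byte packet. The sketch below shows that relationship; it assumes packets are counted at exactly 64 bytes with no framing overhead, and the function name is hypothetical.

```python
def line_rate_mpps(throughput_gbps, packet_bytes=64):
    """Packets per second (in millions) needed to fill a given
    throughput with fixed-size packets, ignoring framing overhead."""
    return throughput_gbps * 1e9 / (packet_bytes * 8) / 1e6


m40_mpps = line_rate_mpps(20)    # 39.0625, quoted as 40 Mpps
m160_mpps = line_rate_mpps(80)   # 156.25, quoted as 160 Mpps
print(m40_mpps, m160_mpps, m160_mpps / m40_mpps)
```

The factor of four between the two platforms matches the table's note of one packet processor on the M40 and four on the M160.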

Page 16: Approaches to Designing a High-Performance Switch Router


Juniper M-Series System Architecture

[Figure: two-part system. Routing Engine (CPU-based): runs the JUNOS router OS (routing & signaling protocols, system management), with routing process, user interface, chassis management, interface management, and the routing table. Forwarding Engine (ASIC-based): labeled "Computer-scale ASIC-based centralized packet processor", with the forwarding table, packet processing, switch fabric, and four line cards carrying packets in and packets out.]

Page 17: Approaches to Designing a High-Performance Switch Router


Juniper M-Series Functional System Operation

[Figure: packet path from input port to output port, with steps numbered 1-10 (including 4a and 4b). Labels: PICs with controller ASICs on FPCs, I/O Manager ASICs, Distributed Buffer Manager ASICs, Internet Processor II ASIC with forwarding table, shared memory (distributed on FPCs), and the backplane or midplane. Flows are labeled "Packets", "64B Blocks", and "Notification".]

Page 18: Approaches to Designing a High-Performance Switch Router


Juniper M-Series Module Organization

[Figure: data-plane and control-plane modules. Labels: FPCs #1 … #8, each with an I/O Manager and PICs #1 … #4 with controllers; M40 backplane (51.2 Gb/s) with 3.2 Gb/s full-duplex links; M160 midplane (204.8 Gb/s) with 12.8 Gb/s full-duplex links and a Packet Director; Switching & Forwarding Module with Distributed Buffer Mgr., Internet Proc. II, forwarding table (FT), and uP; Routing Engine (JUNOS Internet S/W, 128MB); misc. control subsystem; PCI and 100 Mb/s Ethernet connections.]

Page 19: Approaches to Designing a High-Performance Switch Router


Cisco Catalyst 6000 Family: A Comparison

                             6009              6513
Throughput (non-blocking)    32 Gb/s           128 Gb/s
Processing @ 256B packets    15 Mpps           100 Mpps (?)
Back/mid-plane               32 Gb/s (bus)     128 Gb/s (switch) + 32 Gb/s (bus)
Data slots†                  8                 10
Data ports (max.)            128 GbE††         128 GbE
Power (max.)                 ~1.3 KW           > 2.5 KW
Weight                       ~166 lb           240 lb
Size                         >1/3 telco rack   ~Half telco rack
Dimensions (HxWxD in.)       25.2x17.2x18.1    33.3x17.2x18.1

† Only includes usable data slots
†† This maximum port count implies 4x oversubscription (so not non-blocking!)
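
The 4x oversubscription in the second footnote is the ratio of total port capacity to backplane capacity. A minimal sketch, using the 6009's figures from the table (128 GbE ports on a 32 Gb/s bus) and a hypothetical helper name:

```python
def oversubscription(num_ports, port_gbps, backplane_gbps):
    """Ratio of aggregate port capacity to backplane capacity.
    A value above 1.0 means the ports can offer more traffic than
    the backplane can carry, so the system is not non-blocking."""
    return num_ports * port_gbps / backplane_gbps


ratio_6009 = oversubscription(128, 1, 32)
print(ratio_6009)  # 4.0
```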

Page 20: Approaches to Designing a High-Performance Switch Router


Cisco Catalyst Family System Architecture

[Figure: Supervisor Engine containing the MSFC (CPU-based routing engine with routing table), the PFC (CPU-based forwarding engine with forwarding table), and network management (management engine); line cards carry packets in and packets out over a shared bus. Control plane and data plane are marked.]

First Generation of Catalyst: Catalyst 6000

Page 21: Approaches to Designing a High-Performance Switch Router


Cisco Catalyst Family Functional System Operation

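[Figure: Supervisor Engine (network management, MSFC, PFC, fabric arbitration) and two line cards, each with a controller ASIC, ports #1 … #4, and memories labeled 64KB and 448KB, connected by a 32 Gb/s data bus, a results bus, and a control bus. Numbered steps 1-6 trace a packet through the system.]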

Page 22: Approaches to Designing a High-Performance Switch Router


Cisco Catalyst 6500 System Architecture

[Figure: Supervisor Engine containing the MSFC (CPU-based routing engine with routing table), the PFC (ASIC-based forwarding engine with forwarding table), and network management (management engine); line cards carry packets in and packets out over both a bus and a switching fabric, with flows labeled "Headers" and "Data". Control plane and data plane are marked.]

Second Generation of Catalyst: Catalyst 6500

Page 23: Approaches to Designing a High-Performance Switch Router


Cisco Catalyst 6500 Functional System Operation

[Figure: Supervisor Engine (network mgt., MSFC, PFC, fabric arb.) and two line cards, each with ASICs #1 … #4, 512KB of memory, and a fabric I/F, interconnected by the 32 Gb/s data bus, the results bus, the control bus, and a switching fabric with links labeled 8 Gb/s and 16 Gb/s. Numbered steps 1-9 trace a packet through the system.]

Second Generation of Catalyst: Catalyst 6500

Page 24: Approaches to Designing a High-Performance Switch Router


Cisco Catalyst 6500 System Architecture

[Figure: Supervisor Engine with the MSFC routing engine (routing table), a forwarding engine (PFC), and network management (management engine); line cards, each with its own PFC forwarding engine, carry packets in and packets out through the switching fabric. Control plane and data plane are marked.]

Third Generation of Catalyst: Catalyst 6500+

Page 25: Approaches to Designing a High-Performance Switch Router


Cisco Catalyst Family Functional System Operation

[Figure: two line cards, each with ASICs #1 … #4, a DFC, 512KB of memory, and a fabric I/F, connected through the switching fabric; Supervisor Engine with network mgt., MSFC, and fabric arb. Numbered steps 1-9 trace a packet through the system.]

Third Generation of Catalyst: Catalyst 6500+

Page 26: Approaches to Designing a High-Performance Switch Router


Building Very High-Speed Switches from Low-speed Components

Problem: scale this architecture to handle higher link speeds
  - Emulate output queueing
  - Provide some measure of performance, such as bounded delay

[Figure: N x N switch. Input links 1 … N feed input queues organized as virtual output queues VOQ(1,1) … VOQ(N,N); a scheduler controls the switch fabric, which feeds output queues OQ1 … OQN and output links 1 … N.]
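
The virtual-output-queue organization in the figure can be sketched as a small data structure. The slide does not say which scheduling algorithm is used, so the greedy matching below is purely an assumption for illustration; it does not emulate output queueing or bound delay, and the class and method names are hypothetical.

```python
from collections import deque


class VOQSwitch:
    """N x N input-queued switch with one virtual output queue (VOQ)
    per (input, output) pair, so a packet waiting for a busy output
    does not block packets behind it that go elsewhere."""

    def __init__(self, n):
        self.n = n
        self.voq = [[deque() for _ in range(n)] for _ in range(n)]

    def enqueue(self, inp, out, packet):
        self.voq[inp][out].append(packet)

    def schedule(self):
        """Greedy matching (illustrative assumption, not the talk's
        scheduler): each input and each output is used at most once
        per time slot. Returns {input: output}."""
        taken_outputs = set()
        match = {}
        for i in range(self.n):
            for j in range(self.n):
                if j not in taken_outputs and self.voq[i][j]:
                    match[i] = j
                    taken_outputs.add(j)
                    break
        return match

    def step(self):
        """Move one packet per matched pair across the fabric.
        Returns {output: packet} for this time slot."""
        return {j: self.voq[i][j].popleft()
                for i, j in self.schedule().items()}


sw = VOQSwitch(2)
sw.enqueue(0, 0, "a")
sw.enqueue(1, 0, "b")   # contends with "a" for output 0
sw.enqueue(1, 1, "c")   # can still leave in the same slot
first = sw.step()
second = sw.step()
print(first, second)
```

In this run input 1 sends "c" in the first slot even though its packet "b" is waiting for the contended output, which a single FIFO per input would not allow.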

Page 27: Approaches to Designing a High-Performance Switch Router


Building Very High-Speed Switches from Low-speed Components

[Figure: parallel switch system. Input demultiplexers D1 … DN connect inputs 1 … N to parallel switches S1 … Sk, whose outputs are recombined by output multiplexers M1 … MN onto outputs 1 … N, under a global scheduler; a sample path runs from input i (Di) to output j (Mj).]

Mneimneh, Sharma & Siu
  - Operate parallel switch system under control of a global scheduler
  - Requires:
      No speedup in the system
      No reordering at outputs

Iyer, Awadallah & McKeown
  - Operate parallel switches such that they collectively mimic an OQ switch
  - Requires:
      Speedup in the system
      Emulation of shadow OQ switch

Page 28: Approaches to Designing a High-Performance Switch Router


Building Very High-Speed Switches: References

[SAN00] S. Iyer, A. Awadallah, N. McKeown, “Analysis of a packet switch with memories running slower than the line rate,” Proc. IEEE Infocom’00, March 2000.

[Sun00] S. Iyer, “Analysis of a packet switch with memories running slower than the line rate,” MS Thesis, Stanford University, May 2000.

[SuM03] S. Iyer, N. McKeown, “Analysis of the parallel packet switch architecture,” to appear in IEEE/ACM Trans. on Networking, April 2003.

[MSS01] S. Mneimneh, V. Sharma, K. Y. Siu, “On scheduling using parallel input-output queued crossbar switches with no speedup,” Proc. IEEE Workshop on High Performance Switching & Routing (HPSR’01), May 2001.

[MSS02] S. Mneimneh, V. Sharma, K. Y. Siu, “Switching using parallel input-output queued switches with no speedup,” IEEE/ACM Trans. on Networking, vol. 10, no. 5, Oct. 2002.

[Mne02] S. Mneimneh, “Algorithms for high-speed switching and routing,” Ph.D. Thesis, MIT, June 2002.