Post on 13-Jan-2016
1
Router Construction II
Outline
• Network Processors
• Adding Extensions
• Scheduling Cycles
2
Observations
• Emerging commodity components can be used to build IP routers
  – switching fabrics, network processors, …
• Routers are being asked to support a growing array of services
  – firewalls, proxies, p2p nets, overlays, ...
3
Router Architecture
[Figure: a router splits into a control plane (BGP, RSVP, …) and a data plane (IP forwarding)]
4
Software-Based Router
+ Cost
+ Programmability
– Performance (~300 Kpps)
– Robustness
[Figure: data plane (IP) and control plane (BGP, RSVP, …) both run on a PC]
5
Hardware-Based Router
– Cost
– Programmability
+ Performance (25+ Mpps)
+ Robustness
[Figure: data plane (IP) on an ASIC; control plane (BGP, RSVP, …) on a PC]
6
NP-Based Router Architecture
+ Cost ($1500)
+ Programmability
? Performance
? Robustness
[Figure: data plane and control plane exchange packet flows; data plane on an IXP1200, control plane on a PC]
7
In General...
[Figure: a scalable configuration pairs multiple IXPs with multiple Pentiums]
8
Architectural Overview
[Figure: virtual routers map network services onto hardware configurations; packet flows traverse forwarding paths and switching paths]
9
Virtual Router
• Classifiers
• Schedulers
• Forwarders
10
Simple Example
[Figure: a virtual router composing an IP forwarder, a proxy, and an active protocol]
11
Intel IXP
[Figure: IXP1200 chip: a StrongARM core, 6 MicroEngines, scratch memory, and FIFOs on-chip; SRAM and DRAM off-chip; the IX Bus connects the MAC ports and the PCI Bus connects the PC]
12
Processor Hierarchy
[Figure: the processor hierarchy: MicroEngines at the bottom, StrongARM in the middle, Pentium on top]
13
Data Plane Pipeline
[Figure: input contexts copy 64-byte input FIFO slots into DRAM packet buffers; queues and shared state live in SRAM; output contexts copy buffers to the output FIFO slots]
14
Data Plane Processing
INPUT context loop:
    wait_for_data
    copy in_fifo → regs
    Basic_IP_processing
    copy regs → DRAM
    if (last_fragment)
        enqueue → SRAM

OUTPUT context loop:
    if (need_data)
        select_queue
        dequeue ← SRAM
    copy DRAM → out_fifo
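The two context loops can be sketched as runnable Python. This is a model, not the MicroEngine code: the names (`input_context`, the descriptor-as-index scheme) are invented for illustration, and `Basic_IP_processing` is omitted.

```python
from collections import deque

FRAG = 64  # bytes per FIFO slot, as on the slide

def input_context(fragments, dram, queue):
    """Assemble 64-byte fragments into a DRAM buffer; enqueue on the last one."""
    buf = bytearray()
    for data, last in fragments:        # wait_for_data; copy in_fifo -> regs
        buf += data                     # copy regs -> DRAM (IP processing omitted)
        if last:                        # if (last_fragment)
            dram.append(bytes(buf))
            queue.append(len(dram) - 1) # enqueue descriptor -> SRAM
            buf = bytearray()

def output_context(dram, queue, out_fifo):
    """Dequeue descriptors and copy packets from DRAM to the output FIFO."""
    while queue:                        # if (need_data): dequeue <- SRAM
        out_fifo.append(dram[queue.popleft()])  # copy DRAM -> out_fifo

dram, queue, out = [], deque(), []
pkt = [(b"x" * FRAG, False), (b"y" * 36, True)]  # one 100-byte packet, 2 fragments
input_context(pkt, dram, queue)
output_context(dram, queue, out)
```

The descriptor queue decouples the two loops, which is what lets input and output contexts run at different rates.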
15
Pipeline Evaluation
[Figure: forwarding rate (Mpps) vs. number of MicroEngine contexts (0 to 24), input and output measured independently; 100 Mbps Ethernet ≈ 0.142 Mpps]
16
What We Measured
• Static context assignment
  – 16 input / 8 output
• Infinite offered load
• 64-byte (minimum-sized) IP packets
• Three different queuing disciplines
17
Single Protected Queue
• Lock synchronization
• Max 3.47 Mpps
• Contention lower bound 1.67 Mpps
[Figure: three input contexts share one protected queue, drained by one output context into the output FIFO]
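The single-protected-queue discipline can be sketched with Python threads standing in for input contexts. This is illustrative only (thread and packet counts are made up); the point is that every enqueue and dequeue passes through one lock, which is the contention that caps throughput.

```python
import threading
from collections import deque

queue = deque()
lock = threading.Lock()

def input_context(pkts):
    for p in pkts:
        with lock:              # the synchronization point every input shares
            queue.append(p)

# three input contexts, as in the figure
threads = [
    threading.Thread(target=input_context,
                     args=([f"pkt-{i}-{j}" for j in range(1000)],))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

forwarded = []
while queue:                    # the single output context drains the queue
    with lock:
        forwarded.append(queue.popleft())
```

The private-queue variants on the next slides remove input-side contention at the cost of making the output context select among queues.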
18
Multiple Private Queues
• Output must select a queue
• Max 3.29 Mpps
[Figure: each input context has a private queue; one output context selects among them and drains to the output FIFO]
19
Multiple Protected Queues
• Output must select a queue
• Some QoS scheduling (16 priority levels)
• Max 3.29 Mpps
[Figure: multiple protected queues, drained by one output context into the output FIFO]
20
Data Plane Processing

INPUT context loop:
    wait_for_data
    copy in_fifo → regs
    Basic_IP_processing
    copy regs → DRAM
    if (last_fragment)
        enqueue → SRAM

OUTPUT context loop:
    if (need_data)
        select_queue
        dequeue ← SRAM
    copy DRAM → out_fifo
21
Cycles to Waste

INPUT context loop:
    wait_for_data
    copy in_fifo → regs
    Basic_IP_processing
    nop
    nop
    …
    nop
    copy regs → DRAM
    if (last_fragment)
        enqueue → SRAM

OUTPUT context loop:
    if (need_data)
        select_queue
        dequeue ← SRAM
    copy DRAM → out_fifo
22
How Many “NOPs” Possible?
[Figure: forwarding rate (Mpps, 0 to 3.5) vs. extra register/SRAM operations per packet (0/0 to 640/64); a line marks the IXP1200 evaluation board's rate of 1.2 Mpps = 8 × 100 Mbps]
23
Data Plane Extensions
Processing          Memory Ops   Register Ops
Basic IP                 6             32
TCP Splicer              6             45
TCP SYN Monitor          1              5
ACK Monitor              3             15
Port Filter              5             26
Wavelet Dropper          2             28
24
Control and Data Plane
[Figure: a Smart Dropper in the data plane and Layered Video Analysis in the control plane communicate through shared state]
25
What About the StrongARM?
• Shares the memory bus with the MicroEngines
  – must respect the resource budget
• What we do
  – control DMA between the IXP1200 and the Pentium
  – control the MicroEngines
• What might be possible
  – anything within budget
  – exploit instruction and data caches
• We recommend against
  – running Linux
26
Performance
[Figure: the processor hierarchy annotated with measured rates. MicroEngines: 3.47 Mpps with no VRP, or 1.13 Mpps within the VRP budget. Pentium: 310 Kpps at 1510 cycles/packet.]
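The Pentium numbers are internally consistent, which is worth checking: 1510 cycles per packet at 310 Kpps implies the clock rate below (the implied clock is our arithmetic, not a figure from the slide).

```python
cycles_per_packet = 1510        # measured per-packet cost on the Pentium
rate_pps = 310e3                # measured forwarding rate

# rate * cycles/packet = cycles/second the CPU must supply
implied_clock = rate_pps * cycles_per_packet
print(implied_clock / 1e6, "MHz")   # 468.1, i.e. roughly a 450 MHz-class Pentium
```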
27
Pentium
• Runs protocols in the control plane
  – e.g., BGP, OSPF, RSVP
• Runs other router extensions
  – e.g., proxies, active protocols, overlays
• Implementation
  – runs Scout OS + Linux IXP driver
  – CPU scheduler is key
28
Processes...
[Figure: packets flow from the input port through a chain of processes on the Pentium to the output port]
29
Performance
[Figure: aggregate forwarding rate (Kpps, 0 to 350) vs. aggregate offered load (Kpps, 0 to 400); curves for interrupt-driven I/O and for polling with 1, 2, and 3 processes, plus polling with 3 processes without batching]
30
Performance (cont)
[Figure: forwarding rate (Kpps, 0 to 300) on a 450 MHz P-II vs. a Cisco 7200, for: IP --, IP ++, Active IP (native), Active IP (Java), Transparent Proxy, Classic Proxy]
31
Scheduling Mechanism
• Proportional share forms the base
  – each process reserves a cycle rate
  – provides isolation between processes
  – unused capacity fairly distributed
• Eligibility
  – a process receives its share only when its source queue is not empty and its sink queue is not full
• Batching
  – to minimize context-switch overhead
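These pieces can be sketched as a stride-style proportional-share scheduler with the eligibility rule bolted on. Everything here (class names, shares, queue capacities) is illustrative, not the Scout implementation:

```python
class Proc:
    def __init__(self, name, share):
        self.name, self.share = name, share
        self.stride = 1_000_000 // share    # smaller share -> larger stride
        self.passval = 0
        self.src, self.sink = [], []

    def eligible(self, sink_cap):
        # the slide's rule: source queue non-empty and sink queue not full
        return bool(self.src) and len(self.sink) < sink_cap

def schedule(procs, quanta, sink_cap=10_000):
    runs = {p.name: 0 for p in procs}
    for _ in range(quanta):
        ready = [p for p in procs if p.eligible(sink_cap)]
        if not ready:
            break
        p = min(ready, key=lambda q: q.passval)   # lowest pass value runs next
        p.passval += p.stride                     # advance by its stride
        p.sink.append(p.src.pop())                # process one packet
        runs[p.name] += 1
    return runs

a, b = Proc("A", share=2), Proc("B", share=1)
a.src = ["pkt"] * 900
b.src = ["pkt"] * 900
runs = schedule([a, b], 900)
print(runs)   # shares 2:1 -> roughly {'A': 600, 'B': 300}
```

Because ineligible processes are simply skipped, their unused capacity falls to whoever is eligible, which is the fair redistribution the slide describes.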
32
Share Assignment
• QoS Flows
  – assume link rate is given, derive cycle rate
  – conservative rate to input process
  – keep batching level low
• Best Effort Flows
  – may be influenced by admin policy
  – use shares to balance system (avoid livelock)
  – keep batching level high
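The link-rate-to-cycle-rate derivation for a QoS flow is just rate times per-packet cost. A sketch, where the 1510 cycles/packet figure is the deck's measured Pentium cost and the safety factor is an assumed way of making the reservation conservative:

```python
def cycle_reservation(link_pps, cycles_per_packet=1510, safety=1.2):
    """Cycles/second to reserve for a flow at a given packet rate."""
    return link_pps * cycles_per_packet * safety

# a 90 Kpps QoS flow (the rate used in the later experiments):
reserved = cycle_reservation(90e3)
print(reserved / 1e6, "Mcycles/s")   # ~163.1
```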
33
Experiment
[Figure: experimental setup with three flows: A (best effort), B (QoS), and C (QoS); A and C share an output, B is separate]
34
Mixing Best Effort and QoS
[Figure: forwarding rate (Kpps, 0 to 250) vs. Flow A offered load (Kpps, 0 to 140); curves for the aggregate, Flow A (BE), Flow B (QoS, 90 Kpps), and Flow C (QoS, 90 Kpps)]
• Increase offered load from A
35
CPU vs Link
[Figure: forwarding rate (Kpps, 0 to 250) vs. Flow A additional processing delay (µs, 0 to 60); curves for the aggregate, Flow A (BE), Flow B (QoS, 90 Kpps), and Flow C (QoS, 90 Kpps)]
• Fix A at 50 Kpps, increase its processing cost
36
Turn Batching Off
[Figure: forwarding rate (Kpps, 0 to 250) vs. Flow A additional processing delay (µs, 0 to 60); curves for the aggregate, Flow A (BE), Flow B (QoS, 90 Kpps), and Flow C (QoS, 90 Kpps)]
• CPU efficiency: 66.2%
37
Enforce Time Slice
[Figure: forwarding rate (Kpps, 0 to 250) vs. Flow A additional processing delay (µs, 0 to 60); curves for the aggregate, Flow A (BE), Flow B (QoS, 90 Kpps), and Flow C (QoS, 90 Kpps)]
• CPU efficiency: 81.6% (30 µs quantum)
38
Batching Throttle
• Scheduler Granularity: G
  – a flow processes as many packets as possible within G
• Efficiency Index: E, Overhead Threshold: T
  – keep the overhead under T%; then 1 / (1 + T) < E
• Batch Threshold: Bi
  – don't consider Flow i active until it has accumulated at least Bi packets, where Csw / (Bi × Ci) < T
• Delay Threshold: Di
  – consider a flow that has waited Di active
Dynamic Control
• Flow specifies delay requirement D
• Measure context-switch overhead offline
• Record average flow runtime
• Set E based on workload
• Calculate batch level B for the flow
40
Packet Trace
[Figure: per-packet service time (µs, 0 to 5500) across the trace (0 to 55); curves for WFQ and for the batching throttle at T=10% D=1000, T=25% D=1000, T=10% D=500, and T=10% D=300]