Fast Switches
Switch Fabric Architecture Fast Datagram Switches Higher-Layer and Active Processing (From Kwangwoon Univ.)
Introduction
• Path determination and setup
– Centralized control: single point of control
– Distributed control: per-input-port processing
– Self-routing: autonomous control
– Distributed control & self-routing
• Advantage: does not limit scalability
• Disadvantage: global optimization is difficult
• Blocking characteristics
– Strictly nonblocking
– Wide-sense nonblocking: a switching algorithm sets the path
– Rearrangeably nonblocking: existing paths may be rearranged
– Virtually nonblocking: low probability of blocking
• Nonblocking switch fabric principle: avoid blocking by space-division parallelism, internal speedup, and internal pipelined buffering with cut-through
Buffering
• Why buffering?
– If all traffic were uniform, buffering would not be needed.
– If traffic is bursty, however, buffering is needed; otherwise colliding packets would be dropped.
[Figure: two inputs (IN 1, IN 2) contend for one output; colliding packets are delayed in buffers]
Buffering
• Buffer Location
– Unbuffered
• Undesirable for fast packet switches, since there is nowhere to hold contending packets.
• Suitable for optical components, because there is no practical way to queue light.
– Ways of dealing with contention in an optical burst switch:
» Drop the burst and retransmit end-to-end.
» Deflect the burst by scheduling.
» Convert the burst to the electronic domain for queueing.
– Internally buffered
• Increases complexity
– Input or output queued
– Input AND output queued
– Shared buffer switch
• A logical partitioning of physical memory
Buffering
• Buffer location– Unbuffered vs internally buffered
Buffering
• Buffer Location– Input or output buffered switches.
Buffering
• Buffer Location– Combined input/output buffered switch
Buffering
• Buffer Location– Shared buffer switch
Buffering
• Head-of-line blocking
– Input queueing
• Input queueing holds packets until the switch can direct them to the appropriate output.
– Output queueing
• As in a shared-medium network, due to contention from other network nodes for the MAC.
– Speedup (S): the ratio of internal to external data rates
– Internal buffering
– Internal expansion: Clos fabric
• Head-of-line blocking avoidance principle
• Output queueing requires internal speedup, expansion, or buffering. Virtual output queueing requires additional queues and queueing complexity. The two techniques must be traded against one another, and can be used in combination.
Buffering
• Head-of-Line Blocking
– Input versus output queueing
Buffering
• Head-of-line blocking– Clos fabric
Buffering
• Virtual Output Queueing
– This scheme requires that packets be multiplexed and timestamped to determine the arrival order among the queues at each input.
– A scheduling algorithm can then determine which packets to accept so as to form a set of nonconflicting outputs.
– Disadvantage
• Wastes buffer space
– Trade-offs
• Increased memory density makes more queues practical.
• Increased logic density makes more complex scheduling hardware practical.
Buffering
• Virtual Output Queueing
– Head-of-line blocking can be eliminated.
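To make the elimination concrete, here is a minimal Python sketch of virtual output queueing with one greedy matching pass per time slot. The packet names and the input-order-priority matcher are illustrative; real switches use iterative matchers such as iSLIP.

```python
from collections import deque

def voq_schedule(voqs, n_ports):
    """One scheduling slot: voqs[i][j] holds input i's packets for output j.
    A greedy pass matches each input to the first free output it has
    traffic for, so at most one packet leaves each input and enters each
    output per slot."""
    used_outputs = set()
    matches = []
    for i in range(n_ports):
        for j in range(n_ports):
            if j not in used_outputs and voqs[i][j]:
                matches.append((i, j, voqs[i][j].popleft()))
                used_outputs.add(j)
                break
    return matches

# Each input holds packets for outputs 0 and 1. With a single FIFO per
# input, input 1's head packet (for busy output 0) would block the packet
# behind it even though output 1 is idle. With VOQs, both outputs are
# served in the same slot.
voqs = {0: {0: deque(["A"]), 1: deque(["B"])},
        1: {0: deque(["C"]), 1: deque(["D"])}}
print(voq_schedule(voqs, 2))  # [(0, 0, 'A'), (1, 1, 'D')]
```

The greedy pass is the simplest possible arbiter; its point here is only that keeping a queue per output lets an input offer any of its waiting packets to the matcher, not just the head of one FIFO.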
Single-Stage Shared Elements
• Bus Interconnects
[Figure: shared bus of bandwidth w connecting inputs i0–i7 to outputs o0–o7]
Single-Stage Shared Elements
• Bus Interconnects
– Packets must wait in input queues until the bus is free.
– Aggregate throughput: each port's rate r_i ≤ w/n (w: bus bandwidth, n: number of ports).
– Bus speedup is limited by the available electronic technology.
– Multicast is naturally supported.
• Ring Switches
– Throughput can be higher than a bus due to better ring utilization under the MAC protocol and the isolation of electrical effects.
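The bus bandwidth bound above is a one-line calculation; the 40 Gb/s and 16-port figures below are illustrative, not from the slides.

```python
def max_port_rate(bus_bw, n_ports):
    """Shared bus of bandwidth w divided among n ports: each port's
    sustained rate r_i can be at most w / n."""
    return bus_bw / n_ports

# A 40 Gb/s bus shared by 16 ports supports at most 2.5 Gb/s per port.
print(max_port_rate(40, 16))  # 2.5
```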
Single-Stage Shared Elements
• Shared Memory Fabrics
[Figure: shared-memory fabric; inputs I0–I7 are multiplexed into a shared memory, then demultiplexed to outputs o0–o7]
Single-Stage Shared Elements
• Shared Memory Fabrics
– Difficulties
• Memory density is increasing exponentially, but memory access times are not.
• A packet must typically be completely stored in memory before being output.
– Multicast
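The access-time difficulty can be quantified: in the worst case all n ports write and all n ports read within one word time, so the memory needs 2n accesses per word. A sketch with illustrative numbers (port count, line rate, and word width are assumptions, not figures from the slides):

```python
def required_cycle_ns(n_ports, line_rate_gbps, word_bits):
    """Worst-case shared-memory cycle time: one word of word_bits arrives
    per port every word_bits/line_rate nanoseconds, and the memory must
    complete 2*n accesses (n writes + n reads) in that interval."""
    word_time_ns = word_bits / line_rate_gbps   # bits / (Gbit/s) = ns
    return word_time_ns / (2 * n_ports)

# 32 ports at 10 Gb/s with 512-bit words: the memory must cycle every 0.8 ns,
# which is why access time, not density, limits shared-memory fabrics.
print(required_cycle_ns(32, 10, 512))  # 0.8
```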
Single-Stage Space Division Elements
• Basic Switch Element
– Electronic Switch Elements: 2×2 switch element
[Figure: 2×2 switch element states (straight, cross, duplicate); control logic selects paths from inputs i0, i1 through packet buffers with cut-through and an output multiplexor to outputs o0, o1]
Single-Stage Space Division Elements
• Basic Switch Element
– Electronic Switch Elements: 2×2 self-routing switch element
[Figure: as above, but each input first passes a header decoder and a delay element, so the element configures itself from the packet header]
Single-Stage Space Division Elements
• Basic Switch Element
– Optical Switch Elements
[Figure: 2×2 optical switch element; electrodes switch inputs i0, i1 between the cross state and the straight state (voltage applied)]
Single-Stage Space Division Elements
• Crossbar
– Crossbar switch point states
[Figure: switch point at the crossing of input i_i and output column o_j; electronic crosspoints and optical MEMS mirrors support cross, turn, and duplicate states]
Single-Stage Space Division Elements
• Crossbar
– Advantage
• Simplicity and regularity
– Disadvantage
• Scaling complexity: O(n²) crosspoints
• Simple model of the cost in chip area
– A = a_c + n(a_i + a_o) + n²·a_x
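A quick evaluation of this area model shows the n² crosspoint term dominating as the switch grows; the unit-area coefficients below are illustrative.

```python
def crossbar_area(n, a_c, a_i, a_o, a_x):
    """A = a_c + n*(a_i + a_o) + n^2 * a_x: fixed control area, per-port
    input/output driver area, and one crosspoint per input-output pair."""
    return a_c + n * (a_i + a_o) + n * n * a_x

# With unit coefficients, growing from 16 to 64 ports inflates the area
# roughly 14x, driven almost entirely by the n^2 crosspoint term.
print(crossbar_area(16, 10, 1, 1, 1))  # 298
print(crossbar_area(64, 10, 1, 1, 1))  # 4234
```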
Single-Stage Space Division Elements
• Crossbar– Crossbar switch
[Figure: 8×8 crossbar switch connecting inputs I0–I7 to outputs o0–o7]
Multistage Switches
• Tiling Crossbars
– Tile switch elements in a square array.
– Not a cost-effective solution for large switches.
• Multistage Interconnection Networks (MINs)
– Delta switch
• Advantages
– Eliminates central switch control (self-routing).
– Preserves packet sequence, since all cells of a connection take the same path.
• Disadvantage
– Load is not distributed.
– Benes switch
• Dynamically routes packets using additional stages.
– Packets are resequenced using a timestamp inserted into the internal switch header.
– Banyan switch
• Built from shared-memory and crossbar switches.
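Self-routing through a delta or banyan network can be sketched in a few lines: stage k of 2×2 elements examines bit k of the destination address and needs no central controller. The MSB-first bit convention here is one common choice, assumed for illustration.

```python
def self_route(dest, n_stages):
    """Per-stage output choice for a self-routing packet: at stage k the
    2x2 element forwards to its output 0 or output 1 according to bit k
    (most significant first) of the destination address."""
    return [(dest >> (n_stages - 1 - k)) & 1 for k in range(n_stages)]

# Routing tag for destination 0b1010 (output 10) in a 16-port, 4-stage
# network: the packet exits each successive stage on output 1, 0, 1, 0.
print(self_route(0b1010, 4))  # [1, 0, 1, 0]
```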
Multistage Switches
• Multistage Interconnection Networks – Delta switch
[Figure: 16×16 delta switch; stages of 2×2 elements route inputs I0–I15 to outputs o0–o15]
Multistage Switches
• Multistage Interconnection Networks – Benes switch
[Figure: 16×16 Benes switch with stages s0–s5; the extra stages provide alternate paths, and the tag 1010 illustrates self-routing]
Multistage Switches
• Multistage Interconnection Networks – Banyan switch
[Figure: 16×16 banyan switch built from two stages S0, S1 of larger elements connecting I0–I15 to o0–o15]
Multistage Switches
• Multistage Interconnection Networks – Optical Multistage Networks
• Incapable of buffering: requires nonblocking, bufferless interconnection fabrics.
• Crosstalk problem: addressed by dilation techniques.
• Dilated Benes switch
[Figure: dilated Benes element in pass and cross states]
Multistage Switches
• Scaling Speed (parallel switch slices)
[Figure: inputs i0–i(n−1) are striped across parallel fabric slices σ0–σ(m−1); the data path is delayed while fabric control computes the per-port controls c0–c(n−1), and the slices recombine at outputs o0–o(n−1)]
Multicast Support
• Crossbar Switch Multicast
– Service disciplines
• No fanout splitting: all copies are delivered in one slot, so the packet blocks if any of its outputs is busy.
• Fanout splitting: copies may be delivered across multiple slots.
– Goals of the service schedule
• High throughput.
• Some fairness measure is met; in particular, packets should not be starved.
• The scheduling discipline can be implemented at high speed (line rate).
– A variety of scheduling disciplines are possible
• Concentrate the residue among as few inputs as possible.
• Weight based.
Multicast Support
• Crossbar Switch Multicast scheduling
[Figure: five inputs I1–I5 with multicast fanout sets (e.g. {1,3,5}) are served across outputs o1–o5 over successive slots, leaving a residue to be concentrated]
Multicast Support
• Multistage Fabric Multicast
[Figure: 16×16 multistage fabric; copy stages replicate the packet, a translate step assigns output addresses (e.g. 0000, 0100, 1010, 1110), and routing stages deliver the copies to outputs o0–o15]
Review – Fast Packet Switching
• 1980s: link-rate technology improvements.
• Connection-oriented fast packet switching technologies for high-speed networks.
• 1990s: widely deployed.
– e.g. ATM for high-speed backbone networks
• Benefits (5.3)
– Simplified packet processing and forwarding.
– Eliminated the store-and-forward latency.
– Provided QoS guarantees and resource reservation.
Fast Datagram Switches
• Factors that resisted the global deployment of connection-oriented networks:
– The IP-based Internet and the WWW.
– The limitations of shared-medium link protocols were overcome.
• Fast Datagram Switches
– Motivation
• Maintain high performance.
• Support connectionless networks.
– Derivation
• Complexity moves to switch input and output processing.
Fast Packet Switching Architecture
[Figure: input processing (CID table lookup and label swap) feeds the switch fabric, governed by fabric control and by routing and signaling; output processing performs link scheduling onto the output links]
Connection-Oriented vs Connectionless
• Similarity
– At a high level, each switch has the same functional blocks.
– e.g. routing, signaling, management…
• Difference
– Input processing
• Address lookup using a prefix table.
• Packet classification.
– Output processing
• Packet scheduling to meet QoS requirements.
Architecture of Fast Datagram Switching
[Figure: input processors extract headers and hand them to forwarding engines, which perform header processing against prefix tables; packets cross the switch fabric, governed by fabric control and by routing and signaling, to output scheduling on each link]
Packet Processing Rates
• Designing a switch
– Datagram size: min 40 bytes to max 1500 bytes.
– Rule of thumb: design for the average packet size.
• Form of processing
– Sequential processing is sized for the minimum packet size.
– Parallel processing is sized for the average packet size.
• Packet processing rate
The packet processing rate is a key throughput measure of a switch. Packet processing software and shared parallel hardware resources must be able to sustain the average packet processing rate.
Fast Forwarding Lookup
• Review – fast packet switching
– CIDs are used for the fast packet switching lookup.
– Problem: table entry size.
• Fast datagram switching
– Problem: similar to fast packet switching.
– Solutions
• Flat addressing
• Hierarchical addresses
• Software prefix matching
• Hardware matching support
• Source routing
Flat Addressing
[Figure: the destination address adest in the packet header is matched against an <adest, pout> table, by hardware match or software search, to select the output port pout]
Figure 5.50 Address lookup
Software Search
• Lookup time
– Worst case: minimum packet size, worst-case lookup algorithm.
• Memory required
– Trade-off: performance vs. cost.
– The amount of memory it is reasonable to contain in the switch input processing.
• Update time
– Depends on the lookup data structure.
• Techniques
– Tree search: O(log_B N) for N entries, where B is the branching factor.
– Hash function: O(1) when there are no hash collisions.
Content Addressable Memory (CAM)
• Features
– Parallel scan of all entries
– Memory accessed by reference to a key
– Returns the associated data
• Benefit
– Intuitive and fast
• Model
– Each word consists of a <search-field, return-field> pair.
– All words are checked in parallel in a single CAM cycle.
– The return-field portion of the matching word is the output of the CAM read.
– CAMs are specifically designed for network address lookup.
[Figure: a key is associated against stored <key, data> words in parallel, returning the matching word's data]
Hierarchical Addresses
• Hierarchy is exploited to reduce the size of the forwarding tables.
• Forwarding entries can be represented as address prefixes.
• A prefix is the higher-order bit portion of an address that must be matched to lead toward the destination.
• Similar to the PSTN numbering plan.
Software Prefix Matching
[Figure: the address 101 011 01 is looked up in a <prefix, pout, fstate> table holding *, 00*, 001*, 0001*, 0101*, 101*, 10100*, 11*, and 111*; the matching entry supplies pout, and the header's hop count and checksum are updated]
Figure 5.52 IP Prefix matching
Basic Prefix Matching Algorithm
[Figure: binary trie over the same prefixes; the lookup of 101 011 01 descends matching branches and returns the longest marked prefix, 101*]
Figure 5.53 Trie prefix matching
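The trie lookup of Figure 5.53 can be sketched as a binary trie in Python. The prefixes and the 101 011 01 lookup mirror the figure; the nested-dict representation and the port names p1–p8 are illustrative.

```python
def trie_insert(trie, prefix, port):
    """Store next-hop `port` under a bit-string prefix like '101'."""
    node = trie
    for bit in prefix:
        node = node.setdefault(bit, {})
    node["port"] = port

def longest_prefix_match(trie, addr):
    """Walk the trie bit by bit, remembering the last node that carried
    a next hop: that node is the longest matching prefix."""
    node, best = trie, trie.get("port")
    for bit in addr:
        node = node.get(bit)
        if node is None:
            break
        best = node.get("port", best)
    return best

table = {"port": "default"}  # the * entry
for prefix, port in [("00", "p1"), ("001", "p2"), ("0001", "p3"),
                     ("0101", "p4"), ("101", "p5"), ("10100", "p6"),
                     ("11", "p7"), ("111", "p8")]:
    trie_insert(table, prefix, port)

# As in the figure, 101 011 01 matches 101* (10100* fails at the fifth bit).
print(longest_prefix_match(table, "10101101"))  # p5
```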
Hardware Matching Support
• Motivation
– The complexity of software algorithms.
• Hardware techniques for line-rate lookup
– Assisting logic can be embedded in the memory.
• CAMs for variable-length prefixes.
– Translation logic can be provided that assists the location of addresses in conventional memory.
• Multistage lookup.
CAMs for Variable Prefixes
[Figure: a ternary CAM stores <prefix, pout, fstate> entries with don't-care bits (00XXXX, 001XXX, 0001XX, 0101XX, 101XXX, 10100X, 11XXXX, 111XXX); all entries are compared against 101 011 01 in parallel, and a priority mux selects the longest match]
Figure 5.54 Ternary CAM prefix matching
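The parallel compare plus priority mux of Figure 5.54 can be modeled in software; the loop below stands in for the hardware's single-cycle parallel match, and the entries are the figure's prefixes with illustrative port names.

```python
def tcam_lookup(entries, addr):
    """Ternary CAM model: each entry (prefix, data) matches when the
    address begins with the prefix (trailing bits are X don't-cares).
    Hardware checks every entry simultaneously; the priority mux keeps
    the longest, i.e. most specific, match."""
    best_len, best_data = -1, None
    for prefix, data in entries:
        if addr.startswith(prefix) and len(prefix) > best_len:
            best_len, best_data = len(prefix), data
    return best_data

entries = [("", "p0"), ("00", "p1"), ("001", "p2"), ("0001", "p3"),
           ("0101", "p4"), ("101", "p5"), ("10100", "p6"),
           ("11", "p7"), ("111", "p8")]
print(tcam_lookup(entries, "10101101"))  # p5
```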
Multistage Lookup
[Figure: the leading bits of address 101 011 01 index a short prefix table, which yields either pout directly or an index into a block of a long prefix table; the long table resolves the remaining bits to pout]
Figure 5.55 Multistage prefix match
Source Routing
• Eliminates the per-hop address lookup by precomputing the route.
• The entire path is included in the packet header.
[Figure: a label stack p5, p0, p6 in the header steers the packet through nodes 1, 2, and 3; each node pops the top label and forwards on that port]
Figure 5.56 Source routed label stack
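A minimal sketch of the label-stack forwarding in Figure 5.56; the port labels follow the figure, while the dict packet format is an illustrative assumption.

```python
def forward(packet):
    """At each hop, pop the top label off the stack and use it directly
    as the output port -- no address lookup is performed."""
    return packet["labels"].pop(0)

# The source precomputes the whole path: port p5 at the first node,
# p0 at the second, p6 at the third.
pkt = {"labels": ["p5", "p0", "p6"], "payload": "data"}
path = [forward(pkt) for _ in range(3)]
print(path)  # ['p5', 'p0', 'p6']
```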
Packet Classification
• Two other common forms:
– Separation of control packets.
– Separation of packets belonging to different traffic classes.
• General classification includes:
– Classification into a QoS traffic class.
– Policy-based routing.
– Security.
– Higher-layer switching functions.
– Active networking.
Packet Filtering Problem
• Classification occurs before queueing in the node.
• The general problem of classification:
[Figure: rules R0–R5 partition the space of (TOS, source address) values; the TOS and source-address fields of an arriving header select the matching rule]
Packet Classification Implementations
• Hardware classification
– Ternary CAMs can be used to match the rules in parallel.
– Similar to the address lookup.
• Software classification
– Forwarding table lookup (Section 5.1.1).
– "Grid of tries", "tuple space search".
• Preprocessed classifiers
– Preprocess all possible packet fields.
Output Processing and Packet Scheduling (1)
• Reasons to perform output scheduling
– Datagram traffic consists of:
• Guaranteed service classes.
• Best-effort traffic.
– Scheduling must be sufficient to meet delay and bandwidth bounds.
– Fair service among the best-effort flows.
– Congestion control mechanisms do not protect guaranteed service classes from the best-effort traffic.
Output Processing and Packet Scheduling (2)
• Fair queueing
– Packet Fair Queueing (PFQ).
– Weighted Fair Queueing (WFQ).
• Per-flow queueing
– Provides the highest degree of isolation.
– Fine-grained control is possible when per-flow queueing is used.
• Congestion control
– Large queue buildups increase delay, resulting in congestion.
– Discard packets to keep queues from building.
– e.g. RED (Random Early Detection)
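RED's drop decision can be sketched directly from its definition; the thresholds and maximum probability below are illustrative parameters, not values from the slides.

```python
def red_drop_probability(avg_q, min_th, max_th, max_p):
    """Random Early Detection: never drop below min_th, always drop at
    or above max_th, and in between raise the drop probability linearly
    up to max_p so queues are trimmed before they build."""
    if avg_q < min_th:
        return 0.0
    if avg_q >= max_th:
        return 1.0
    return max_p * (avg_q - min_th) / (max_th - min_th)

print(red_drop_probability(10, 20, 80, 0.1))  # 0.0  (below min_th)
print(red_drop_probability(50, 20, 80, 0.1))  # 0.05 (halfway up the ramp)
print(red_drop_probability(90, 20, 80, 0.1))  # 1.0  (above max_th)
```

Note that RED operates on a smoothed (exponentially averaged) queue length, not the instantaneous one; that averaging step is omitted here for brevity.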
Higher-Layer and Active Processing
• Active networking uses general classification techniques.
– First, identify packets for active processing.
– Then execute active applications in the network nodes on the identified packets, connections, or flows to provide the desired service.
• Motivation for active networking
– Open, flexible interfaces allow provisioning of new protocols and services.
• Condition for active networking
– It should not impede the non-active fast path.
Active Network Node Reference Model
[Figure: execution environments (EEs), including a management EE (MEE), run above the NodeOS; a packet filter diverts selected packets to active processing while others follow the normal forwarding path through the switch fabric under switch control]
Figure 5.58 Active network node reference model