switching

63
Switching An Engineering Approach to Computer An Engineering Approach to Computer Networking Networking

description

know about switching

Transcript of switching

  • SwitchingAn Engineering Approach to Computer Networking

    S. Keshav

  • What is it all about?How do we move traffic from one part of the network to another?Connect end-systems to switches, and switches to each otherData arriving to an input port of a switch have to be moved to one or more of the output ports

    S. Keshav

  • Types of switching elementsTelephone switchesswitch samplesDatagram routersswitch datagramsATM switchesswitch ATM cells

    S. Keshav

  • ClassificationPacket vs. circuit switchespackets have headers and samples dont Connectionless vs. connection orientedconnection oriented switches need a call setupsetup is handled in control plane by switch controllerconnectionless switches deal with self-contained datagrams

    S. Keshav

    Connectionless (router)

    Connection-oriented (switching system)

    Packet switch

    Internet router

    ATM switching system

    Circuit switch

    Telephone switching system

  • Other switching element functionsParticipate in routing algorithmsto build routing tablesResolve contention for output trunksschedulingAdmission controlto guarantee resources to certain streamsWell discuss these laterHere we focus on pure data movement

    S. Keshav

  • RequirementsCapacity of switch is the maximum rate at which it can move information, assuming all data paths are simultaneously activePrimary goal: maximize capacitysubject to cost and reliability constraintsCircuit switch must reject call if cant find a path for samples from input to outputgoal: minimize call blockingPacket switch must reject a packet if it cant find a buffer to store it awaiting access to output trunkgoal: minimize packet lossDont reorder packets

    S. Keshav

  • A generic switch

    S. Keshav

  • OutlineCircuit switchingPacket switchingSwitch generationsSwitch fabricsBuffer placementMulticast switches

    S. Keshav

  • Circuit switchingMoving 8-bit samples from an input port to an output portRecall that samples have no headersDestination of sample depends on time at which it arrives at the switch actually, relative order within a frameWell first study something simpler than a switch: a multiplexor

    S. Keshav

  • Multiplexors and demultiplexorsMost trunks time division multiplex voice samplesAt a central office, trunk is demultiplexed and distributed to active circuitsSynchronous multiplexorN input linesOutput runs N times as fast as input

    S. Keshav

  • More on multiplexingDemultiplexorone input line and N outputs that run N times slowersamples are placed in output buffer in round robin orderNeither multiplexor nor demultiplexor needs addressing information (why?)Can cascade multiplexorsneed a standardexample: DS hierarchy in the US and Japan

    S. Keshav

  • Inverse multiplexingTakes a high bit-rate stream and scatters it across multiple trunksAt the other end, combines multiple streamsresequencing to accommodate variation in delaysAllows high-speed virtual links using existing technology

    S. Keshav

  • A circuit switchA switch that can handle N calls has N logical inputs and N logical outputsN up to 200,000In practice, input trunks are multiplexedexample: DS3 trunk carries 672 simultaneous callsMultiplexed trunks carry frames = set of samplesGoal: extract samples from frame, and depending on position in frame, switch to outputeach incoming sample has to get to the right output line and the right slot in the output framedemultiplex, switch, multiplex

    S. Keshav

  • Call blockingCant find a path from input to outputInternal blockingslot in output frame exists, but no pathOutput blockingno slot in output frame is availableOutput blocking is reduced in transit switchesneed to put a sample in one of several slots going to the desired next hop

    S. Keshav

  • Time division switchingKey idea: when demultiplexing, position in frame determines output trunkTime division switching interchanges sample position within a frame: time slot interchange (TSI)

    S. Keshav

  • How large a TSI can we build? Limit is time taken to read and write to memoryFor 120,000 circuitsneed to read and write memory once every 125 microsecondseach operation takes around 0.5 ns => impossible with current technologyNeed to look to other techniques

    S. Keshav

  • Space division switchingEach sample takes a different path thoguh the swithc, depending on its destination

    S. Keshav

  • CrossbarSimplest possible space-division switchCrosspoints can be turned on or offFor multiplexed inputs, need a switching schedule (why?)Internally nonblockingbut need N2 crosspointstime taken to set each crosspoint grows quadraticallyvulnerable to single faults (why?)

    S. Keshav

  • Multistage crossbarIn a crossbar during each switching time only one crosspoint per row or column is activeCan save crosspoints if a crosspoint can attach to more than one input line (why?)This is done in a multistage crossbarNeed to rearrange connections every switching time

    S. Keshav

  • Multistage crossbarCan suffer internal blockingunless sufficient number of second-level stagesNumber of crosspoints < N2Finding a path from input to output requires a depth-first-searchScales better than crossbar, but still not too well120,000 call switch needs ~250 million crosspoints

    S. Keshav

  • Time-space switchingPrecede each input trunk in a crossbar with a TSIDelay samples so that they arrive at the right time for the space division switchs schedule

    S. Keshav

  • Time-space-time (TST) switchingAllowed to flip samples both on input and output trunkGives more flexibility => lowers call blocking probability

    S. Keshav

  • OutlineCircuit switchingPacket switchingSwitch generationsSwitch fabricsBuffer placementMulticast switches

    S. Keshav

  • Packet switchingIn a circuit switch, path of a sample is determined at time of connection establishmentNo need for a sample header--position in frame is enoughIn a packet switch, packets carry a destination fieldNeed to look up destination port on-the-flyDatagramlookup based on entire destination addressCelllookup based on VCIOther than that, very similar

    S. Keshav

  • Repeaters, bridges, routers, and gatewaysRepeaters: at physical levelBridges: at datalink level (based on MAC addresses) (L2)discover attached stations by listeningRouters: at network level (L3)participate in routing protocolsApplication level gateways: at application level (L7)treat entire network as a single hope.g mail gateways and transcodersGain functionality at the expense of forwarding speedfor best performance, push functionality as low as possible

    S. Keshav

  • Port mappersLook up output port based on destination addressEasy for VCI: just use a tableHarder for datagrams:need to find longest prefix matche.g. packet with address 128.32.1.20entries: (128.32.*, 3), (128.32.1.*, 4), (128.32.1.20, 2)A standard solution: trie

    S. Keshav

  • Tries

    Two ways to improve performancecache recently used addresses in a CAMmove common entries up to a higher level (match longer strings)

    S. Keshav

  • Blocking in packet switchesCan have both internal and output blockingInternalno path to outputOutputtrunk unavailableUnlike a circuit switch, cannot predict if packets will block (why?)If packet is blocked, must either buffer or drop it

    S. Keshav

  • Dealing with blockingOverprovisioninginternal links much faster than inputsBuffersat input or outputBackpressureif switch fabric doesnt have buffers, prevent packet from entering until path is availableParallel switch fabricsincreases effective switching capacity

    S. Keshav

  • OutlineCircuit switchingPacket switchingSwitch generationsSwitch fabricsBuffer placementMulticast switches

    S. Keshav

  • Three generations of packet switchesDifferent trade-offs between cost and performanceRepresent evolution in switching capacity, rather than in technologyWith same technology, a later generation switch achieves greater capacity, but at greater costAll three generations are represented in current products

    S. Keshav

  • First generation switch

    Most Ethernet switches and cheap packet routersBottleneck can be CPU, host-adaptor or I/O bus, depending

    S. Keshav

  • ExampleFirst generation router built with 133 MHz PentiumMean packet size 500 bytesInterrupt takes 10 microseconds, word access take 50 nsPer-packet processing time takes 200 instructions = 1.504 sCopy loopregister
  • Second generation switch

    Port mapping intelligence in line cardsATM switch guarantees hit in lookup cacheIpsilon IP switchingassume underlying ATM networkby default, assemble packetsif detect a flow, ask upstream to send on a particular VCI, and install entry in port mapper => implicit signaling

    S. Keshav

  • Third generation switchesBottleneck in second generation switch is the bus (or ring)Third generation switch provides parallel paths (fabric)

    S. Keshav

  • Third generation (contd.)Featuresself-routing fabricoutput buffer is a point of contentionunless we arbitrate access to fabricpotential for unlimited scaling, as long as we can resolve contention for output buffer

    S. Keshav

  • OutlineCircuit switchingPacket switchingSwitch generationsSwitch fabricsBuffer placementMulticast switches

    S. Keshav

  • Switch fabricsTransfer data from input to output, ignoring scheduling and bufferingUsually consist of links and switching elements

    S. Keshav

  • CrossbarSimplest switch fabricthink of it as 2N buses in parallelUsed here for packet routing: crosspoint is left open long enough to transfer a packet from an input to an outputFor fixed-size packets and known arrival pattern, can compute schedule in advanceOtherwise, need to compute a schedule on-the-fly (what does the schedule depend on?)

    S. Keshav

  • Buffered crossbarWhat happens if packets at two inputs both want to go to same output?Can defer one at an input bufferOr, buffer crosspoints

    S. Keshav

  • Broadcast

    Packets are tagged with output port #Each output matches tagsNeed to match N addresses in parallel at each outputUseful only for small switches, or as a stage in a large switch

    S. Keshav

  • Switch fabric elementCan build complicated fabrics from a simple element

    Routing rule: if 0, send packet to upper output, else to lower outputIf both packets to same output, buffer or drop

    S. Keshav

  • Features of fabrics built with switching elementsNxN switch with bxb elements has elements with elements per stageFabric is self routingRecursiveCan be synchronous or asynchronousRegular and suitable for VLSI implementation

    S. Keshav

  • BanyanSimplest self-routing recursive fabric

    (why does it work?)What if two packets both want to go to the same output?output blocking

    S. Keshav

  • BlockingCan avoid with a buffered banyan switchbut this is too expensivehard to achieve zero loss even with buffersInstead, can check if path is available before sending packetthree-phase schemesend requestsinform winnerssend packetsOr, use several banyan fabrics in parallelintentionally misroute and tag one of a colliding pairdivert tagged packets to a second banyan, and so on to k stagesexpensivecan reorder packetsoutput buffers have to run k times faster than input

    S. Keshav

  • SortingCan avoid blocking by choosing order in which packets appear at input portsIf we can present packets at inputs sorted by outputremove duplicates remove gapsprecede banyan with a perfect shuffle stagethen no internal blockingFor example, [X, 010, 010, X, 011, X, X, X] -(sort)-> [010, 011, 011, X, X, X, X, X] -(remove dups)-> [010, 011, X, X, X, X, X, X] -(shuffle)-> [010, X, 011, X, X, X, X, X]Need sort, shuffle, and trap networks

    S. Keshav

  • SortingBuild sorters from merge networksAssume we can merge two sorted listsSort pairwise, merge, recurse

    S. Keshav

  • Merging

    S. Keshav

  • Putting it together- Batcher Banyan

    What about trapped duplicates?recirculate to beginningor run output of trap to multiple banyans (dilation)

    S. Keshav

  • Effect of packet size on switching fabricsA major motivation for small fixed packet size in ATM is ease of building large parallel fabricsIn general, smaller size => more per-packet overhead, but more preemption points/secAt high speeds, overhead dominates!Fixed size packets helps build synchronous switchBut we could fragment at entry and reassemble at exitOr build an asynchronous fabricThus, variable size doesnt hurt too muchMaybe Internet routers can be almost as cost-effective as ATM switches

    S. Keshav

  • OutlineCircuit switchingPacket switchingSwitch generationsSwitch fabricsBuffer placementMulticast switches

    S. Keshav

  • BufferingAll packet switches need buffers to match input rate to service rateor cause heavy packet losesWhere should we place buffers?inputin the fabricoutputshared

    S. Keshav

  • Input buffering (input queueing)

    No speedup in buffers or trunks (unlike output queued switch)Needs arbiterProblem: head of line blockingwith randomly distributed packets, utilization at most 58.6%worse with hot spots

    S. Keshav

  • Dealing with HOL blockingPer-output queues at inputsArbiter must choose one of the input ports for each output portHow to select?Parallel Iterated Matchinginputs tell arbiter which outputs they are interested inoutput selects one of the inputssome inputs may get more than one grant, others may get noneif >1 grant, input picks one at random, and tells outputlosing inputs and outputs try againUsed in DEC Autonet 2 switch

    S. Keshav

  • Output queueing

    Dont suffer from head-of-line blockingBut output buffers need to run much faster than trunk speed (why?)Can reduce some of the cost by using the knockout principleunlikely that all N inputs will have packets for the same outputdrop extra packets, fairly distributing losses among inputs

    S. Keshav

  • Shared memory

    Route only the header to output portBottleneck is time taken to read and write multiported memoryDoesnt scale to large switchesBut can form an element in a multistage switch

    S. Keshav

  • Datapath: clever shared memory design

    Reduces read/write cost by doing wide reads and writes1.2 Gbps switch for $50 parts cost

    S. Keshav

  • Buffered fabricBuffers in each switch elementProsSpeed up is only as much as fan-inHardware backpressure reduces buffer requirementsConscostly (unless using single-chip switches)scheduling is hard

    S. Keshav

  • Hybrid solutionsBuffers at more than one pointBecomes hard to analyze and manageBut common in practice

    S. Keshav

  • OutlineCircuit switchingPacket switchingSwitch generationsSwitch fabricsBuffer placementMulticast switches

    S. Keshav

  • MulticastingUseful to do this in hardwareAssume portmapper knows list of outputs Incoming packet must be copied to these output portsTwo subproblemsgenerating and distributing copesVCI translation for the copies

    S. Keshav

  • Generating and distributing copiesEither implicit or explicitImplicitsuitable for bus-based, ring-based, crossbar, or broadcast switchesmultiple outputs enabled after placing packet on shared busused in Paris and Datapath switchesExplicitneed to copy a packet at switch elementsuse a copy networkplace # of copies in tagelement copies to both outputs and decrements count on one of themcollect copies at outputsBoth schemes increase blocking probability

    S. Keshav

  • Header translationNormally, in-VCI to out-VCI translation can be done either at input or outputWith multicasting, translation easier at output port (why?)Use separate port mapping and translation tablesInput maps a VCI to a set of output portsOutput port swaps VCINeed to do two lookups per packet

    S. Keshav