Routing Algorithm - Faculty

(C) 2010 Daniel J. Sorin from Adve, Falsafi, Hill, Lebeck, Reinhardt, Singh ECE 259 / CPS 221

Transcript of Routing Algorithm - Faculty

Page 1: Routing Algorithm - Faculty

Routing Algorithm

• How do I know where a packet should go?
– Topology does NOT determine routing (e.g., many paths through torus)
• Many routing algorithms exist
1) Arithmetic
2) Source-based
3) Table lookup
4) Adaptive: route based on network state (e.g., contention)

Page 2: Routing Algorithm - Faculty

(1) Arithmetic Routing

• For regular topology, use simple arithmetic to determine route

• E.g., 3D Torus
– Packet header contains signed offset to destination (per dimension)
– At each hop, the switch increments or decrements (+/-) the offset in one dimension to reduce it toward zero
– When x == 0, y == 0, and z == 0, the packet is at the correct processor
• Drawbacks
– Requires ALU in switch
– Must re-compute CRC at each hop (the header changes)

[Figure: cube with nodes labeled (0,0,0) through (1,1,1)]
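The offset-reduction idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the torus has the same radix `k` in every dimension, the switch always reduces the lowest-numbered nonzero dimension first, and the helper names (`signed_offset`, `make_header`, `route_step`) are hypothetical, not taken from the slides.

```python
def signed_offset(src, dst, k):
    """Shortest signed offset from src to dst on a ring of k nodes."""
    d = (dst - src) % k
    return d - k if d > k // 2 else d


def make_header(src, dst, k):
    """Per-dimension signed offsets carried in the packet header."""
    return [signed_offset(s, d, k) for s, d in zip(src, dst)]


def route_step(header):
    """One hop: move the first nonzero offset one step toward zero.

    Returns (dimension, direction, new_header), or None on arrival.
    """
    for dim, off in enumerate(header):
        if off != 0:
            step = 1 if off > 0 else -1
            new_header = list(header)
            new_header[dim] = off - step
            return dim, step, new_header
    return None  # all offsets are zero: packet is at its destination


# Example: route from (0,0,0) to (3,1,0) on a 4-ary 3D torus.
header = make_header((0, 0, 0), (3, 1, 0), k=4)
hops = []
while (result := route_step(header)) is not None:
    dim, step, header = result
    hops.append((dim, step))
```

Because `route_step` rewrites the header on every hop, any checksum that covers the header has to be recomputed each time, which is the CRC drawback listed above.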

Page 3: Routing Algorithm - Faculty

(2) Source Based & (3) Table Lookup Routing

Source Based
• Source specifies output port for each switch in route
• Very simple switches
– No control state
– Strip output port off header
• Myrinet used this
• Can’t be made adaptive

Table Lookup
• Very small header, index into table for output port
• Big tables, must be kept up-to-date
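A minimal sketch contrasting the two schemes, assuming the header is a plain Python list of port numbers for source-based routing and a single destination index for table lookup. The function names and the example table contents are hypothetical.

```python
def source_route_hop(header):
    """Source-based: the switch strips the first port off the header."""
    out_port, rest = header[0], header[1:]
    return out_port, rest


def table_route_hop(table, dest):
    """Table lookup: the header carries only a destination index."""
    return table[dest]


# Source-based: the sender lists one output port per switch on the path.
header = [2, 0, 3]
ports_taken = []
while header:
    port, header = source_route_hop(header)
    ports_taken.append(port)

# Table lookup: each switch holds its own destination -> port table.
switch_table = {0: 1, 1: 1, 2: 3}
port_for_dest_2 = table_route_hop(switch_table, 2)
```

The source-routed switch keeps no state at all, while the table-driven switch needs one entry per destination, which is where the "big tables" cost comes from.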

Page 4: Routing Algorithm - Faculty


Deterministic vs. Adaptive Routing

• Deterministic (static): follows a pre-specified route
– K-ary d-cube: dimension-order routing
» (x1, y1) → (x2, y2)
» First Dx = x2 - x1,
» Then Dy = y2 - y1
– Tree: common ancestor

[Figure: cube with nodes labeled 000 through 111]

• Adaptive: route determined by contention for output port
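Dimension-order routing can be sketched as follows. This is an illustrative version for a 2D mesh without wraparound links (an assumption; a k-ary d-cube with wraparound would also pick the shorter direction around each ring), and `dimension_order_route` is a hypothetical name.

```python
def dimension_order_route(src, dst):
    """Return the nodes visited from src to dst on a 2D mesh,
    correcting x completely before touching y."""
    x, y = src
    x2, y2 = dst
    path = [(x, y)]
    while x != x2:          # first resolve Dx = x2 - x1
        x += 1 if x2 > x else -1
        path.append((x, y))
    while y != y2:          # then resolve Dy = y2 - y1
        y += 1 if y2 > y else -1
        path.append((x, y))
    return path
```

For a given source and destination the function always returns the same path, which is what makes the scheme deterministic.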

Page 5: Routing Algorithm - Faculty

(4) Adaptive Routing

• Essential for fault tolerance
– At least multipath
• Can improve utilization of the network
• Simple deterministic algorithms easily run into bad permutations
• Fully/partially adaptive, minimal/non-minimal
• Can introduce complexity or anomalies
• A little adaptation goes a long way!
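One simple form of adaptation is to choose, among the output ports that still move the packet closer to its destination, the one with the least contention. The sketch below assumes a 2D mesh, uses output queue length as the contention measure, and does nothing about deadlock; the function names and port labels are hypothetical.

```python
def productive_ports(cur, dst):
    """Ports that move a packet closer to dst on a 2D mesh."""
    ports = []
    if dst[0] > cur[0]:
        ports.append("+x")
    elif dst[0] < cur[0]:
        ports.append("-x")
    if dst[1] > cur[1]:
        ports.append("+y")
    elif dst[1] < cur[1]:
        ports.append("-y")
    return ports


def adaptive_choice(cur, dst, queue_len):
    """Pick the productive port with the shortest output queue."""
    ports = productive_ports(cur, dst)
    if not ports:
        return None  # already at the destination
    return min(ports, key=lambda p: queue_len[p])
```

This is a minimal adaptive policy: it never takes a hop away from the destination, so it only has a choice when the packet still needs to move in both dimensions.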

Page 6: Routing Algorithm - Faculty

Hot Potato Routing

• Every cycle, each switch takes each input and routes it to an output
– But not necessarily to the “desired” output
• No switch buffering!
• Possibility of livelock if no precautions taken
– E.g., could grant priority based on age of packet
• Variants also known as “deflection routing” or “mad postman routing”
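A minimal sketch of one switch cycle with age-based priority, assuming the switch has as many output ports as inputs and that a deflected packet simply takes the lowest-numbered free port. Both choices, and the name `hot_potato_cycle`, are illustrative assumptions.

```python
def hot_potato_cycle(packets, num_ports):
    """Assign every packet to a distinct output port in one cycle.

    packets: list of (age, desired_port) tuples, at most num_ports long.
    Returns a list of (age, desired_port, assigned_port), oldest first.
    """
    assert len(packets) <= num_ports
    free = set(range(num_ports))
    assigned = []
    # Oldest packets choose first (the age-based precaution against livelock).
    for age, want in sorted(packets, key=lambda p: -p[0]):
        port = want if want in free else min(free)  # deflect if taken
        free.remove(port)
        assigned.append((age, want, port))
    return assigned
```

Every packet leaves in the same cycle it arrived, so no buffering is needed; the cost is that a young packet may be sent away from its destination.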

Page 7: Routing Algorithm - Faculty

Deadlock

• Necessary conditions to achieve deadlock
– Use more than one resource
– Not willing to release resource in use
– Cycle in order of resource use
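The third condition, a cycle in the order of resource use, can be checked mechanically. The sketch below assumes the dependencies are given as a directed graph mapping each resource (for example, a buffer) to the resources its occupants are waiting for; the representation and the name `has_cycle` are assumptions made for illustration.

```python
def has_cycle(waits_for):
    """Detect a cycle in a directed graph given as {node: [successors]}."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in waits_for.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                return True  # back edge: nxt is on the current DFS path
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in waits_for)
```

For example, `has_cycle({"A": ["B"], "B": ["A"]})` reports a cycle, while a chain such as `{"A": ["B"], "B": ["C"]}` does not.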

Guri Sohi

Page 8: Routing Algorithm - Faculty

Two Causes of Deadlock

[Figure: two panels. Endpoint deadlock: P1 and P2, each with a pending Response, and two queues labeled “full of requests”. Switch deadlock: switch1 and switch2, with Message M1 and Message M2, and two buffers labeled “full of messages”.]

Page 9: Routing Algorithm - Faculty

Avoiding Deadlock

• Simple but wasteful solution: full buffering
– But it’s rare that we ever need full buffering
• More efficient solution: virtual channels (networks)
• Endpoint deadlock solution: virtual networks
– Need a virtual network per type of message
• Switch deadlock solution #1: virtual channels
• Switch deadlock solution #2: deadlock-free routing

Page 10: Routing Algorithm - Faculty

Virtual Channels

• Need some number of virtual channels per virtual network, which depends on network topology and routing scheme
• Not to be confused with “virtual cut-through”
• Add buffers so flits of wormhole packets can be interleaved
• Optional paper by Dally on virtual channels (see course website)
• Upshot: total #virtual channels equals product of #virtual networks times #virtual channels for avoiding switch deadlock
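The product rule can be made concrete with a short sketch. The numbers below (2 virtual networks with 2 virtual channels each) are illustrative, and the flat buffer-indexing scheme is an assumption rather than something the slides prescribe.

```python
def total_virtual_channels(num_virtual_networks, vcs_per_network):
    """Total virtual channels per physical link."""
    return num_virtual_networks * vcs_per_network


def buffer_index(vnet, vc, vcs_per_network):
    """Flat index of the buffer for channel `vc` of virtual network `vnet`."""
    return vnet * vcs_per_network + vc


# Example: request and response networks, 2 virtual channels each.
total = total_virtual_channels(2, 2)
indices = [buffer_index(vnet, vc, 2) for vnet in range(2) for vc in range(2)]
```

Each (virtual network, virtual channel) pair gets its own buffer, so messages of one type never wait behind messages of another type.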

Page 11: Routing Algorithm - Faculty

Deadlock Free Routing

• Up*-Down*
– For spanning trees (superposed on any topology)
– Route up, make one turn, route down
• Turn Model Routing
– Restrict order of turns
» West first
» North last
» Negative first
– Can increase number of hops
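As an illustration of a turn restriction, here is a sketch of minimal west-first routing on a 2D mesh: a packet that needs to go west must finish all of its westward hops before doing anything else, and only then may it choose among the remaining productive directions. The coordinate convention (west is -x, north is +y) and the name `west_first_ports` are assumptions; non-minimal variants, which can add hops, are not shown.

```python
def west_first_ports(cur, dst):
    """Allowed minimal output directions under west-first routing.

    Convention assumed here: W is -x, E is +x, N is +y, S is -y.
    """
    dx = dst[0] - cur[0]
    dy = dst[1] - cur[1]
    if dx < 0:
        return ["W"]  # westward hops come first, with no other choice
    ports = []
    if dx > 0:
        ports.append("E")
    if dy > 0:
        ports.append("N")
    if dy < 0:
        ports.append("S")
    return ports
```

Because a packet never turns into the west direction after travelling any other way, the turns that would close a cycle are never taken.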

Page 12: Routing Algorithm - Faculty

Minimal turn restrictions in 2D

[Figure: turn restrictions in 2D for west-first, north-last, and negative-first routing, drawn on -x, +x, -y, +y axes]

Page 13: Routing Algorithm - Faculty

Outline

• 3 main aspects of networks
– Topology
– Routing
– Flow Control
• Designing Switch Hardware
• Case Studies

Page 14: Routing Algorithm - Faculty

Buffer-less Flow Control

Circuit Switching
• Establish route then send data
• Like the telephone system
• No buffers needed
• This approach differs from packet switching, which is what we’ve implicitly assumed until this slide

Hot Potato Routing
• No buffers needed, since all incoming packets get sent out on some link without waiting

Packet Discarding
• If two packets contend for same resource, one gets dropped → relies on higher-order mechanism (retry)

Page 15: Routing Algorithm - Faculty

Buffered Flow Control

• Packet switched networks do not reserve bandwidth, which can lead to contention
• Solution: prevent packets from entering until contention is reduced (e.g., metering lights)
• Flow control: between pairs of receivers and senders; use feedback to tell the sender when it is allowed to send the next packet
– Link-level: flow control done on per-link basis
– End-to-end: flow control done over entire path length

Page 16: Routing Algorithm - Faculty

Link-Level Flow Control

[Figure: link with Data and Ready signals]

• Transfer single flit when receiver is ready
• Could have long links with many flits in flight

Page 17: Routing Algorithm - Faculty

Credit-based (Window) Flow Control

• Receiver gives N credits to sender
– Sender decrements count
– Stops sending if zero
– Receiver sends back credit as it drains its buffer
– Bundle credits to reduce overhead
• Must account for link latency
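The sender side of the credit scheme can be sketched as a small counter. The class and method names are hypothetical, and link latency is not modeled here.

```python
class CreditSender:
    def __init__(self, credits):
        self.credits = credits  # N credits granted by the receiver

    def try_send(self):
        """Send one flit if a credit is available; return True on success."""
        if self.credits == 0:
            return False  # stop sending at zero
        self.credits -= 1
        return True

    def receive_credits(self, n=1):
        """Credits returned by the receiver, possibly bundled (n > 1)."""
        self.credits += n


sender = CreditSender(credits=2)
sent = [sender.try_send() for _ in range(3)]  # third attempt is blocked
sender.receive_credits(2)                     # receiver drained two slots
sent_after = sender.try_send()
```

On a real link the returning credits arrive only after a delay, so N generally has to be large enough to cover the flits and credits in flight; otherwise the sender idles while waiting for credits.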

Page 18: Routing Algorithm - Faculty

Water Level

• High water, low water
• “Stop” & “go” sent back to source switch (Myrinet)
• Can send redundant stop/go

[Figure: buffer with Stop and Go marks, incoming phits, and outgoing phits]
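The stop/go behavior can be sketched as a buffer with two thresholds. The class name, the exact comparison points, and the choice to signal only on a state change are assumptions made for illustration.

```python
class WatermarkBuffer:
    def __init__(self, high, low):
        assert low < high
        self.high, self.low = high, low
        self.occupancy = 0
        self.stopped = False

    def push(self):
        """A phit arrives; return "stop" when the high-water mark is reached."""
        self.occupancy += 1
        if not self.stopped and self.occupancy >= self.high:
            self.stopped = True
            return "stop"
        return None

    def pop(self):
        """A phit leaves; return "go" once the level falls to the low-water mark."""
        self.occupancy -= 1
        if self.stopped and self.occupancy <= self.low:
            self.stopped = False
            return "go"
        return None


buf = WatermarkBuffer(high=3, low=1)
signals = [buf.push(), buf.push(), buf.push(), buf.pop(), buf.pop()]
```

The gap between the two marks keeps the link from toggling between stop and go on every phit. In a real design the high-water mark would also need headroom for phits already in flight when "stop" is sent.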

Page 19: Routing Algorithm - Faculty

Outline

• 3 main aspects of networks
– Topology
– Routing
– Flow Control
• Designing Switch Hardware
• Case Studies

Page 20: Routing Algorithm - Faculty

A Generic Switch

• At minimum, must route inputs to outputs

[Figure: generic switch with input ports, receivers, input buffers, crossbar, output buffers, transmitters, and output ports, plus control (routing, scheduling)]

Page 21: Routing Algorithm - Faculty

Switch Operation

• Each packet (flit) traverses the switch’s pipeline
– Arrive in input buffer and wait to get to head of queue
– Compute route (once per packet)
– Allocate virtual channel (once per packet)
– Allocate crossbar and output buffer entry
– Traverse crossbar
– Wait in output buffer to be allocated output link
• Switch is like very simple in-order processor pipeline
– Packet can stall at any stage
– Only head flit, though, can stall when computing route or allocating virtual channel

Page 22: Routing Algorithm - Faculty

Switch Buffering

• Must absorb burstiness in arriving traffic
– Unless using hot potato routing
• Must also hold flits that are stalled
• Option #1: Shared buffer pool
– Need high bandwidth to buffer (bottleneck)
– One congested output port could hog all buffer space
• Option #2: Input buffering
– #2a) Separate buffer per input port
– #2b) Separate buffer per virtual channel per input port

Page 23: Routing Algorithm - Faculty

More Input Buffering

• If buffer per input port, then could suffer from head of line (HOL) blocking
– A packet behind the blocked head may be routed to an unused output port, yet it still has to wait
• Either way (#2a or #2b), still likely to need output buffering, but this doesn’t need to be divided up by virtual channel
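Head-of-line blocking is easy to see in a toy model. In the sketch below each queue entry is just the output port a packet wants, and `busy_outputs` is the set of ports that cannot accept a packet this cycle; the function names and this representation are assumptions.

```python
from collections import deque


def deliverable_fifo(queue, busy_outputs):
    """Single FIFO per input: only the head packet may leave."""
    if queue and queue[0] not in busy_outputs:
        return queue[0]
    return None


def deliverable_multi(queues, busy_outputs):
    """Several queues per input: the head of any queue may leave."""
    for q in queues:
        if q and q[0] not in busy_outputs:
            return q[0]
    return None


busy = {0}  # output port 0 is congested
single = deliverable_fifo(deque([0, 1]), busy)             # head wants port 0
split = deliverable_multi([deque([0]), deque([1])], busy)  # second queue is free to go
```

With one FIFO the packet for port 1 is stuck behind the packet for port 0, even though port 1 is idle; splitting the input buffer into several queues lets it proceed.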

Page 24: Routing Algorithm - Faculty

Resource Allocation

• Policies for arbitration for crossbar, output link, etc.
– Static priority
– Random
– Round-robin
– Oldest-first
• Effects of adaptive routing?
– Select output link based on availability
– Requires feedback from output port
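Of the arbitration policies listed above, round-robin is straightforward to sketch. The class below is a software illustration with hypothetical names, not a description of any particular hardware arbiter.

```python
class RoundRobinArbiter:
    def __init__(self, num_inputs):
        self.n = num_inputs
        self.next = 0  # input with the highest priority this round

    def grant(self, requests):
        """requests: set of requesting input indices. Returns the winner or None."""
        for i in range(self.n):
            cand = (self.next + i) % self.n
            if cand in requests:
                # The winner drops to the lowest priority for the next round.
                self.next = (cand + 1) % self.n
                return cand
        return None


arb = RoundRobinArbiter(3)
grants = [arb.grant({0, 2}) for _ in range(3)]
```

Rotating the priority pointer past each winner keeps any single input from monopolizing the resource, which static priority cannot guarantee.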

Page 25: Routing Algorithm - Faculty

For Future Reading

• We have covered only the tip of the iceberg, and we have hidden most of the complexity
• Issues we’ve brushed under the rug:
– Physical design of buffers, arbitration logic, etc.
– Non-crossbar implementations (e.g., using a bus)
– Control logic for managing switch
– Using speculation to reduce switch pipeline depth
– How flow control fits into the switch design
– Etc.
• I refer you to the textbook by Dally and Towles for a more comprehensive treatment of this subject

Page 26: Routing Algorithm - Faculty

Outline

• 3 main aspects of networks
– Topology
– Routing
– Flow Control
• Designing Switch Hardware
• Case Studies

Page 27: Routing Algorithm - Faculty

Case Study Cray T3D

• 1024 switch nodes each connected to 2 processors
• 3D torus, bidirectional, 300 MB/s
• Link: 16 bits, 8 control bits
• Variable size packet (multiple of 16 bits)
• Logical request & response networks
– 2 virtual channels each for deadlock avoidance
• Stacked dimension routing
• Wormhole for large packets, virtual cut-through for small packets

Page 28: Routing Algorithm - Faculty

Real (But Old) Machines


Page 29: Routing Algorithm - Faculty

Alpha 21364 (EV7) Network

• PRESENTATION


Page 30: Routing Algorithm - Faculty

Flattened Butterfly

• PRESENTATION
