Interconnect-Centric Computing
HPCA: 1 Feb 12, 2007
Interconnect-Centric Computing
William J. Dally, Computer Systems Laboratory
Stanford University
HPCA Keynote
February 12, 2007
Outline
• Interconnection Networks (INs) are THE central component of modern computer systems
• Topology driven to high-radix by packaging technology
• Global adaptive routing balances load - and enables efficient topologies
• Case study, the Cray Black Widow
• On-Chip Interconnection Networks (OCINs) face unique challenges
• The road ahead…
INs: Connect Processors in Clusters
IBM Blue Gene
and on chip
MIT RAW
Connect Processors to Memories in Systems
Cray Black Widow
and on chip
Texas TRIPS
provide the fabric for network Switches and Routers
Avici TSR
and connect I/O Devices
Brocade Switch
Group History: Routing Chips & Interconnection Networks
• Mars Router, Torus Routing Chip, Network Design Frame, Reliable Router
• Basis for Intel, Cray/SGI, Mercury, Avici network chips
MARS Router (1984), Torus Routing Chip (1985), Network Design Frame (1988), Reliable Router (1994)
Group History: Parallel Computer Systems
• J-Machine (MDP) led to Cray T3D/T3E
• M-Machine (MAP)
– Fast messaging, scalable processing nodes, scalable memory architecture
• Imagine – basis for SPI
MDP Chip J-Machine Cray T3D MAP Chip Imagine Chip
Interconnection Networks are THE Central Component of Modern Computer Systems
• Processors are a commodity
– Performance no longer scaling (ILP mined out)
– Future growth is through CMPs - connected by INs
• Memory is a commodity
– Memory system performance determined by interconnect
• I/O systems are largely interconnect
• Embedded systems built using SoCs
– Standard components
– Connected by on-chip INs (OCINs)
Technology Trends…
[Figure: bandwidth per router node (Gb/s, log scale 0.1–10,000) vs. year, 1985–2010, for: Torus Routing Chip, Intel iPSC/2, J-Machine, CM-5, Intel Paragon XP, Cray T3D, MIT Alewife, IBM Vulcan, Cray T3E, SGI Origin 2000, AlphaServer GS320, IBM SP Switch2, Quadrics QsNet, Cray X1, Velio 3003, IBM HPS, SGI Altix 3000, Cray XT3, YARC, and BlackWidow.]
High-Radix Router
[Figure: a low-radix router (small number of fat ports) vs. a high-radix router (large number of skinny ports).]
Low-Radix vs. High-Radix Router
[Figure: two networks connecting inputs I0–I15 to outputs O0–O15. Low-radix: 4 hops, 96 channels. High-radix: 2 hops, 32 channels. Latency and cost are compared for each; the high-radix design needs fewer hops and fewer channels.]
Latency
Latency = H·t_r + L/b = 2·t_r·log_k N + 2·k·L/B
where k = radix, B = total router bandwidth, N = number of nodes, L = message size, t_r = per-hop router delay, and b = B/(2k) = per-port bandwidth
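To make the trade-off concrete, here is a small sketch that evaluates this latency formula across radices and finds the minimum. The parameter values (N, B, L, t_r) are hypothetical placeholders, not numbers from the talk:

```python
import math

def latency(k, N=4096, B=2.4e12, L=1024, t_r=20e-9):
    """Total latency = header latency + serialization latency.

    k   : router radix (ports)
    N   : number of nodes               (hypothetical)
    B   : total router bandwidth, b/s   (hypothetical)
    L   : message size, bits            (hypothetical)
    t_r : per-hop router delay, s       (hypothetical)
    """
    header = 2 * t_r * math.log(N, k)  # H * t_r, with H = 2 log_k N
    serial = 2 * k * L / B             # L / b, with per-port bandwidth b = B / (2k)
    return header + serial

# Scan radices and find the minimum-latency radix for these parameters.
best_k = min(range(2, 257), key=latency)
```

Raising k shrinks the header term (fewer hops) but grows the serialization term (thinner ports), so the scan finds an interior minimum rather than a monotone trend.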
Latency vs. Radix
[Figure: latency (nsec) vs. radix (0–250) for 2003 and 2010 technology. As radix grows, header latency decreases while serialization latency increases; the optimum is radix ≈ 40 for 2003 technology and ≈ 128 for 2010 technology.]
Determining Optimal Radix
Latency = header latency + serialization latency
        = H·t_r + L/b
        = 2·t_r·log_k N + 2·k·L/B
Minimizing over k gives the optimal radix:
k·log² k = (B·t_r·log N) / L = Aspect Ratio
where k = radix, B = total router bandwidth, N = number of nodes, L = message size
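Minimizing the latency expression over k yields a relation of the form k·log² k = aspect ratio. Since the left-hand side is monotonically increasing for k > 1, the optimal radix can be recovered from the aspect ratio by bisection. A sketch (function name and search bounds are illustrative, not from the talk):

```python
import math

def optimal_radix(aspect_ratio):
    """Solve k * (log2 k)**2 = aspect_ratio for k.

    The left-hand side increases monotonically for k > 1,
    so a simple bisection converges."""
    f = lambda k: k * math.log2(k) ** 2 - aspect_ratio
    lo, hi = 1.0 + 1e-9, 1e9
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2
```

A larger aspect ratio (more router bandwidth, slower routers, larger networks, or shorter messages) pushes the optimal radix up, matching the trend in the figure.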
Higher Aspect Ratio, Higher Optimal Radix
[Figure: optimal radix k (1–1,000) vs. aspect ratio (10–10,000), log-log scale, with points for 1991, 1996, 2003, and 2010 technology.]
High-Radix Topology
• Use high radix, k, to get low hop count
– H = log_k(N)
• Provide good performance on both benign and adversarial traffic patterns
– Rules out butterfly networks - no path diversity
– Clos networks work well
• H = 2·log_k(N) - with short circuit
– Cayley graphs have nice properties but are hard to route
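A quick sketch of the hop-count formulas above, applied directly (the parameter choices are illustrative):

```python
import math

def _log_k(N, k):
    """ceil(log_k N), rounded first to dodge floating-point
    noise when N is an exact power of k."""
    return math.ceil(round(math.log(N, k), 9))

def butterfly_hops(N, k):
    # One pass through a k-ary butterfly: H = log_k N.
    return _log_k(N, k)

def clos_hops(N, k):
    # Worst case up and back down a folded Clos: H = 2 log_k N
    # (the short circuit lets nearby nodes use fewer hops).
    return 2 * _log_k(N, k)
```

For example, 4096 nodes at radix 64 need only 2 butterfly hops versus 6 at radix 4, which is the payoff of going high-radix.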
Example radix-64 Clos Network
[Figure: two-rank folded Clos. Rank 1 routers Y0–Y31 each connect 32 endpoints (BW0–BW31 on Y0, BW32–BW63 on Y1, …, BW992–BW1023 on Y31); Rank 2 routers Y32–Y63 connect the Rank 1 routers.]
Flattened Butterfly Topology
Packaging the Flattened Butterfly
Packaging the Flattened Butterfly (2)
Cost
Routing in High-Radix Networks
• Adaptive routing avoids transient load imbalance
• Global adaptive routing balances load for adversarial traffic
– Cost/perf of a butterfly on benign traffic and at low loads
– Cost/perf of a Clos on adversarial traffic
A Clos can statically load balance traffic using oblivious routing
[Figure: the radix-64 Clos again; oblivious routing spreads each node's traffic across the Rank 2 routers.]
Transient Imbalance
With Adaptive Routing
Latency for UR (uniform random) traffic
Flattened Butterfly Topology
[Figure: flattened butterfly connecting nodes 0–7.]
What if node 0 sends all of its traffic to node 1?
How much traffic should we route over alternate paths?
Simpler Case: a ring of 8 nodes, sending traffic from node 2 to node 5
• Model: assume the queues form a network of independent M/D/1 queues
[Figure: 8-node ring (nodes 0–7). The flow from 2 to 5 is split: a fraction x₁ on the minimal path and x₂ on the non-minimal path, with total load x = x₁ + x₂. Minimal path delay = D_m(x₁); non-minimal path delay = D_nm(x₂).]
• Routing remains minimal as long as D_m′(x₁) ≤ D_nm′(0)
• Afterwards, route a fraction, x₂, non-minimally such that D_m′(x₁) = D_nm′(x₂)
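The balance condition can be sketched numerically. As a simplifying assumption not spelled out on the slide, treat each hop as an M/D/1 queue with unit service time loaded only by this flow, so a path of H hops has waiting time H·W(x) with W(x) = x/(2(1−x)) and marginal delay H·W′(x):

```python
def mdl_wait_deriv(x):
    """Marginal M/D/1 waiting time: d/dx of x / (2*(1 - x)),
    assuming unit service time. Requires x < 1."""
    return 1.0 / (2.0 * (1.0 - x) ** 2)

def split(total, h_min=3, h_nonmin=5):
    """Split `total` (< 1) load between a minimal path of h_min hops and
    a non-minimal path of h_nonmin hops so marginal delays balance.
    Defaults 3 and 5 match the 2 -> 5 flow on the 8-node ring."""
    d_min = lambda x1: h_min * mdl_wait_deriv(x1)
    d_non = lambda x2: h_nonmin * mdl_wait_deriv(x2)
    # Stay fully minimal while D_m'(total) <= D_nm'(0).
    if d_min(total) <= d_non(0.0):
        return total, 0.0
    # Otherwise bisect for the split where D_m'(x1) = D_nm'(x2).
    lo, hi = 0.0, total
    for _ in range(100):
        x1 = (lo + hi) / 2
        if d_min(x1) > d_non(total - x1):
            hi = x1
        else:
            lo = x1
    return x1, total - x1
```

At low load everything stays on the 3-hop minimal path; past the threshold, load spills onto the 5-hop path until the two marginal delays are equal — the behavior plotted on the next slide.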
Traffic divides to balance delay; load balanced at saturation
[Figure: accepted throughput vs. offered load (both as fractions of capacity, 0–0.6) for the model: overall, minimal, and non-minimal traffic.]
Channel-Queue Routing
• Estimate the delay per hop by the local queue length Q_i
• Estimate overall latency by
– L_i ~ Q_i·H_i
• Route each packet on the route with the lowest estimated L_i
• Works extremely well in practice
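A minimal sketch of the route-selection step. The data layout — a list of (hop count, local queue length) pairs, one per candidate route — is an assumption for illustration:

```python
def pick_route(routes):
    """Pick the candidate route with the lowest estimated latency.

    routes: list of (hops, queue_len) tuples, one per candidate.
    Estimate per the slide: L_i ~ Q_i * H_i, i.e. local queue
    length times hop count. Returns the index of the chosen route."""
    return min(range(len(routes)), key=lambda i: routes[i][0] * routes[i][1])

# Example: a 2-hop route behind a long queue vs. a 4-hop route behind
# a short one - the longer but less congested route wins (20 vs. 12).
choice = pick_route([(2, 10), (4, 3)])
```

The appeal is that the decision uses only locally visible queue lengths, yet the product with hop count acts as a cheap proxy for global congestion.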
Performance on UR Traffic
Performance on WC (worst-case) Traffic
Allocator Design Matters
Putting It All Together: The Cray BlackWidow Network
In collaboration with Steve Scott and Dennis Abts (Cray Inc.)
Cray Black Widow
• Shared-memory vector parallel computer
• Up to 32K nodes
• Vector processor per node
• Shared memory across nodes
Black Widow Topology
• Up to 32K nodes in a 3-level folded Clos
• Each node has four 18.75 Gb/s channels, one to each of 4 network slices
YARC: Yet Another Router Chip
• 64 ports
• Each port is 18.75 Gb/s (3 × 6.25 Gb/s links)
• Table-driven routing
• Fault tolerance
– CRC with link-level retry
– Graceful degradation of links
• 3 bits -> 2 bits -> 1 bit -> OTS
YARC Microarchitecture
• Regular 8x8 array of tiles
– Easy to lay out chip
• No global arbitration
– All decisions local
• Simple routing
• Hierarchical organization
– Input buffers
– Row buffers
– Column buffers
A Closer Look at a Tile
• No global arbitration
• Non-blocking with an 8× internal speedup in the subswitch
• Simple routing
– Small 8-entry routing table per tile
– High routing throughput for small packets
YARC Implementation
• Implemented in a 90 nm CMOS standard-cell ASIC technology
• 192 SerDes on the chip (64 ports × 3 bits per port)
• 6.25 Gbaud data rate
• Estimated power: 80 W (idle), 87 W (peak)
• 17 mm × 17 mm die
HPCA: 52 Feb 12, 2007
Outline
• Interconnection Networks (INs) are THE centralcomponent of modern computer systems
• Topology driven to high-radix by packagingtechnology
• Global adaptive routing balances load - and enablesefficient topologies
• Case study, the Cray Black Widow
• On-Chip Interconnection Networks (OCINs) faceunique challenges
• The road ahead…
Much of the future is on-chip (CMP, SoC, Operand)
[Figure: process roadmap: 2006, 2007.5, 2009, 2010.5, 2012, 2013.5, 2015.]
On-Chip Networks are Fundamentally Different
• Different cost model
– Wires plentiful, no pin constraints
– Buffers expensive (consume die area)
– Slow signal propagation
• Different usage patterns
– Particularly for SoCs
• Significant isochronous traffic
• Hard RT constraints
• Different design problems
– Floorplans
– Energy-efficient transmission circuits
NSF Workshop Identified 3 Critical Issues
• Power
– OCINs will have 10× the required power with current approaches
• Circuit and architecture innovations can close this gap
• Latency
– OCIN latency currently not competitive with buses and dedicated wiring
• Novel flow-control strategies required
• Tool Integration
– OCINs need to be integrated with standard tool flows to enable widespread use
The Road Ahead
• INs become an even more dominant system component
– Number of processors goes up, cost of processors decreases
– Communication dominates performance and cost
– From hand-held media UI devices to huge data centers
• Technology drives topology in new directions
– On-chip, short reach electrical (10 m), optical
– Expect radix to continue to increase
– Hybrid topologies to match each packaging level
• Latency will approach that of dedicated wiring
– Better flow-control and router architecture
– Optimized circuits
• Adaptivity will optimize performance
– Balance load, route around defects, tolerate variation, tune power to load
Summary
• Interconnection Networks (INs) are THE central component of modern computing systems
• High-radix topologies have evolved to exploit packaging/signaling technology
– Including hybrid optical/electrical
– Flattened Butterfly
• Global adaptive routing balances load and enables advanced topologies
– Eliminate transient load imbalance
– Use local queues to estimate global congestion
• Cray Black Widow - an example high-radix network
• On-Chip INs
– Very different constraints
– Three “Gaps” identified - power, latency, tools.
• The road ahead
– Lots of room for improvement, INs are in their infancy
Some very good books
Backup
Virtual Channel Router Architecture
[Figure: virtual-channel router with k input ports and k output ports. Each input port holds v virtual-channel buffers (VC 1 … VC v) feeding a crossbar switch; a routing-computation unit, a VC allocator, and a switch allocator control the datapath.]
Baseline Performance Evaluation
[Figure: latency (cycles) vs. offered load (0–1) for the low-radix network.]
Baseline Performance Evaluation
[Figure: the same plot with the baseline high-radix network added; at this baseline design point the low-radix network is still better.]