Comparison Of Network On Chip Topologies Ahmet Salih BÜYÜKKAYHAN 2007706435 - 2009 Fall.

Post on 29-Mar-2015


Comparison of Network on Chip Topologies

Ahmet Salih BÜYÜKKAYHAN, 2007706435 - 2009 Fall

OUTLINE
◦ Introduction
◦ Basic Definitions
◦ Properties of a Topology
◦ NOC Topologies
◦ Evaluation
◦ Conclusion

Introduction to NOC

NOC
◦ A micronetwork of components
◦ Transfers information between nodes

Challenges
◦ Performance requirements
    ◦ Latency as small as possible
    ◦ As many concurrent transfers as possible
◦ Tight energy boundaries
◦ Reliability requirements
◦ Low cost

NOC Motivation

Moore’s Law: the number of gates doubles every 18 months as the technology dimensions shrink
◦ Smaller wire dimensions raise resistance (R = ρL/A)
◦ Smaller inter-wire spacing raises capacitance (C = ε0·A/d)
◦ Long wires require the periodic insertion of repeaters
◦ They consume more dynamic and leakage power

50% of the power dissipation is due to the (long) wires. [1]

What Characterizes NOC

Topology (What)
◦ Physical interconnection structure of the network graph

Routing Algorithm (Which)
◦ Restricts the set of paths that messages may follow

Switching Strategy (How)
◦ How data in a message traverses a route
◦ Circuit / Packet / Wormhole

Flow Control Mechanism (When)
◦ When a message or portions of it traverse a route
◦ What happens when traffic is encountered?

Properties of a Topology

Performance
◦ Diameter (max routing distance)
◦ Average distance

Cost
◦ Avg. nodal degree (average number of links for each node)
◦ Number of links (total number of links)

Reliability
◦ Min number of links to disconnect the graph
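The properties above can be computed directly from a network graph. The following is a minimal sketch, assuming the topology is given as an adjacency dictionary; the helper names (`topology_metrics`, `reliability`, `hypercube`, `mesh`) are hypothetical and chosen only for illustration. Reliability is found by brute force, which is only practical for small graphs.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, src):
    """Hop count from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def topology_metrics(adj):
    """Number of links, average nodal degree, diameter, average distance."""
    n = len(adj)
    links = sum(len(nbrs) for nbrs in adj.values()) // 2
    total, diameter = 0, 0
    for src in adj:
        dist = bfs_distances(adj, src)
        total += sum(dist.values())
        diameter = max(diameter, max(dist.values()))
    return {
        "links": links,
        "avg_degree": 2 * links / n,
        "diameter": diameter,
        "avg_distance": total / (n * (n - 1)),  # mean over ordered node pairs
    }

def is_connected(adj, removed):
    """True if the graph stays connected after removing the given links."""
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if frozenset((u, v)) not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)

def reliability(adj):
    """Smallest number of links whose removal disconnects the graph (brute force)."""
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    for k in range(1, len(edges) + 1):
        for cut in combinations(edges, k):
            if not is_connected(adj, set(cut)):
                return k

def hypercube(dim):
    """Nodes are integers; neighbours differ in exactly one bit."""
    return {u: [u ^ (1 << b) for b in range(dim)] for u in range(1 << dim)}

def mesh(rows, cols):
    """2-D mesh without wrap-around links."""
    adj = {(r, c): [] for r in range(rows) for c in range(cols)}
    for (r, c) in adj:
        for dr, dc in ((0, 1), (1, 0)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].append(nb)
                adj[nb].append((r, c))
    return adj
```

For example, `topology_metrics(hypercube(3))` describes the 8-node cube and `topology_metrics(mesh(3, 3))` the 9-node mesh.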

NOC Topologies

Shared-Medium Local Networks
◦ Contention Bus, Token Bus and Token Ring

Direct Networks
◦ 1D: Linear, Ring
◦ 2D: Mesh, Tree
◦ 3D: Cube, Toroid

Indirect Networks
◦ Crossbar, Benes, Perfect shuffle and Omega

Hybrid Networks

Shared-Medium Local Networks

Local Area Networks
◦ Contention Bus (Ethernet)
◦ Token Bus (Arcnet)
◦ Token Ring (FDDI Ring, IBM Token Ring)

All communication devices share the transmission medium.

Only one device can drive the network at a time

Contention Bus (Ethernet)

All devices can monitor the state of the bus, such as idle, busy, and collision.

“collision” means that two or more devices are using the bus at the same time and their data collided.

When the collision is detected, the competing devices will quit transmission and try later.

Ethernet adopts carrier-sense multiple access with collision detection (CSMA/CD) protocol.
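The collide-and-retry behaviour can be illustrated with a toy slotted model. This is only a sketch under stated assumptions: time is divided into slots of one frame each, every device has a single frame to send, and the retry delay follows a truncated binary exponential backoff. The function name `simulate_contention_bus` and its parameters are hypothetical.

```python
import random

def simulate_contention_bus(num_devices, max_slots=10000, seed=0):
    """Toy slotted model of a contention bus; each device sends one frame."""
    rng = random.Random(seed)
    next_try = {d: 0 for d in range(num_devices)}    # slot of the next attempt
    collisions = {d: 0 for d in range(num_devices)}  # collisions seen per device
    sent = []                                        # (slot, device) of each success
    for slot in range(max_slots):
        ready = [d for d in next_try if next_try[d] <= slot]
        if len(ready) == 1:
            # Exactly one device drives the bus: the frame goes through.
            sent.append((slot, ready[0]))
            del next_try[ready[0]]
        elif len(ready) > 1:
            # Collision: every competing device quits and tries again later.
            for d in ready:
                collisions[d] += 1
                window = 2 ** min(collisions[d], 10)  # assumed backoff window
                next_try[d] = slot + 1 + rng.randrange(window)
        if not next_try:
            break
    return sent, collisions
```

The order in which devices succeed depends on the random backoff, which is the nondeterminism that makes a contention bus hard to use for real-time traffic.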

Token Bus & Token Ring

◦ Contention Bus has a nondeterministic nature
◦ Not suitable for real-time applications

Solution:
◦ Passing a token among network devices
◦ The owner of the token has the right to access the bus
◦ Maximum token holding time

Token Ring:
◦ Natural extension of token bus
◦ Passing of the token forms a ring structure
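A minimal sketch of token passing with a maximum holding time, assuming the holding time is measured in frames and the token visits devices in index order. The names `token_ring_schedule` and `worst_case_wait` are hypothetical. Because each holder is capped, the wait for the token is bounded, unlike on a contention bus.

```python
def token_ring_schedule(queues, max_hold, rounds):
    """queues[i] is the number of frames waiting at device i.

    Returns the transmission schedule and the frames still pending.
    """
    pending = list(queues)
    schedule = []  # (device, frames sent while holding the token)
    for _ in range(rounds):
        for device in range(len(pending)):
            sent = min(pending[device], max_hold)  # holder may not exceed max_hold
            pending[device] -= sent
            schedule.append((device, sent))
    return schedule, pending

def worst_case_wait(num_devices, max_hold):
    """Upper bound, in frame times, before the token comes back to a device."""
    return (num_devices - 1) * max_hold
```

For instance, with three devices and `max_hold=2`, a device never waits for more than 4 frame times of other traffic before it may transmit again.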

Properties of Shared Medium LAN

Bus system is not scalable because the bus becomes the bottleneck.

Devices are fully connected to each other

Bus systems:
◦ Diameter = 1
◦ Avg. Dist = 1
◦ Reliability = 1
◦ Number of links = N + Bus
◦ Nodal Degree = 1

Ring systems:
◦ Diameter = N/2
◦ Avg. Dist = N/2 = (N-1)*N / (2*(N-1))
◦ Number of links = N
◦ Nodal Degree = 2
◦ Reliability = 2

Direct Networks (Router Based)
◦ Strictly Orthogonal Topologies
    ◦ Mesh
    ◦ Torus
    ◦ Hypercube
◦ Other Topologies
    ◦ Trees
    ◦ Cube connected cycles

Node processors are connected directly with each other by the network

Each node performs dataflow routing

Every direct network can be represented as indirect, by splitting each node into a terminal and a switch

Orthogonal

Every link and node can be arranged in such a way that it produces a displacement in a single dimension

Most of the implemented networks have an orthogonal topology.

Orthogonal Topologies

                   4-ary 2-dim Mesh   8-node Cube
Diameter           6                  3
Number of links    24                 12
Node degree        3                  3
Avg. distance      3                  1.71
Reliability        2                  3

Hypercubes

Diameter = log2(N)
Node Degree = log2(N)
Reliability = log2(N)
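These values follow from the binary labelling of hypercube nodes: two nodes are linked when their labels differ in exactly one bit, so a node has log2(N) neighbours and the distance between two nodes is the number of differing bits. The sketch below assumes integer node labels; the dimension-order routing function `ecube_route` is an illustrative assumption and not something the slides specify.

```python
def hypercube_neighbors(node, dim):
    """Flip each of the dim address bits to get the dim neighbours."""
    return [node ^ (1 << bit) for bit in range(dim)]

def hypercube_distance(a, b):
    """Hop count equals the number of differing address bits."""
    return bin(a ^ b).count("1")

def ecube_route(src, dst, dim):
    """Correct differing bits from lowest to highest dimension."""
    path = [src]
    cur = src
    for bit in range(dim):
        if (cur ^ dst) & (1 << bit):
            cur ^= 1 << bit
            path.append(cur)
    return path
```

With `dim = 4` (N = 16), the farthest node from 0 is 15 at 4 hops, which is log2(16).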

Trees

Binary Tree
◦ Diameter: 2 log(N)
◦ Reliability: 1
◦ Total number of links: N-1
◦ Average nodal degree: between 1 and 2

Problems
◦ Congestion
◦ Fault tolerance is low

Fat Trees

Fatter links (really more of them) as you go up, so bisection BW scales with N

There are many possible paths, so at each level the routing processor chooses a path at random, in order to balance the load.

Cube Connected Cycles

Like an n-dimensional hypercube of virtual nodes

Each virtual node is a ring with n nodes, for a total of n * 2^n nodes

Each node in the ring is connected to a single dimension of the hypercube

Diameter is comparable to that of a hypercube of similar size

Cube Connected Cycles

Total number of links: (3 * n * 2^n) / 2

Node Degree = Reliability: 3 (two ring links plus one hypercube link per node)

Diameter: about 2*n

Embed Multiple Dimensions

Embed multiple logical dimensions in one physical dimension using long wires

Indirect Networks (Switch Based)
◦ Crossbar
◦ Fully Connected
◦ Perfect Shuffle
◦ Multistage Interconnection Networks
    ◦ Blocking Networks: Omega, Banyan
    ◦ Non-Blocking Networks: Clos Network, Benes Network

[Figure: node processors (1 ... n) and node switches]

Switches
◦ Perform the routing
◦ Provide a programmable connection between their ports
◦ Do not perform information processing

Crossbar

Free of interconnect contention

Crossbar networks are used in the design of high-performance small-scale multiprocessors

However, the bit energy increases linearly with the number of input and output ports N

Fully Connected Switch

Using a single N × N crossbar is much cheaper than using a fully connected direct network topology, which requires N routers, each one having an internal N × N crossbar

Perfect Shuffle Network

[Figure: a) The perfect shuffle, b) Inverse perfect shuffle, c) Bit reversal permutations for N=8]
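The three permutations in the figure can be written as operations on the binary address of each line: the perfect shuffle rotates the address bits left by one, the inverse shuffle rotates them right, and bit reversal mirrors them. A minimal sketch, with hypothetical function names and `bits = log2(N)`:

```python
def perfect_shuffle(i, bits):
    """Rotate the address bits left by one position."""
    mask = (1 << bits) - 1
    return ((i << 1) | (i >> (bits - 1))) & mask

def inverse_shuffle(i, bits):
    """Rotate the address bits right by one position."""
    return (i >> 1) | ((i & 1) << (bits - 1))

def bit_reversal(i, bits):
    """Mirror the address bits."""
    return int(format(i, f"0{bits}b")[::-1], 2)

# Permutations for N = 8 (3 address bits).
shuffle_8 = [perfect_shuffle(i, 3) for i in range(8)]   # [0, 2, 4, 6, 1, 3, 5, 7]
inverse_8 = [inverse_shuffle(i, 3) for i in range(8)]   # [0, 4, 1, 5, 2, 6, 3, 7]
reversal_8 = [bit_reversal(i, 3) for i in range(8)]     # [0, 4, 2, 6, 1, 5, 3, 7]
```

The shuffle corresponds to interleaving the two halves of a deck of cards: lines 0 to 3 go to the even positions and lines 4 to 7 to the odd positions.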

Omega Networks

The omega network is another example of a banyan multistage interconnection network that can be used as a switch fabric

The omega uses the “perfect shuffle”
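A sketch of how a cell can find its way through an omega network, assuming the usual destination-tag scheme: before every stage the lines are wired with a perfect shuffle, and each 2 × 2 switch sends the cell to its upper output for a 0 bit and its lower output for a 1 bit of the destination address, most significant bit first. The function name `omega_route` is hypothetical.

```python
def omega_route(src, dst, bits):
    """Return the line number the cell occupies after each of the bits stages."""
    mask = (1 << bits) - 1
    pos = src
    path = []
    for stage in range(bits):
        # Perfect shuffle wiring in front of the stage (rotate address left).
        pos = ((pos << 1) | (pos >> (bits - 1))) & mask
        # The switch output is selected by one destination bit, MSB first.
        dst_bit = (dst >> (bits - 1 - stage)) & 1
        pos = (pos & ~1) | dst_bit
        path.append(pos)
    return path

# 8 x 8 network: a cell entering at input 0 and destined for output 4.
path_0_to_4 = omega_route(0, 4, 3)   # [1, 2, 4]
```

After log2(N) stages every source bit has been shifted out and replaced by a destination bit, so the cell arrives at the requested output regardless of where it entered.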

[Figure sequence: Omega Networks, 8 × 8 network with input and output ports 0 to 7, routing a cell labelled 4]

Path Contention

The omega network has the same problems as the delta network with output port contention and path contention

Again, the result in a bufferless switch fabric is cell loss (one cell wins, one loses)

Path contention and output port contention can seriously degrade the achievable throughput of the switch
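Both kinds of contention can be detected by tracing each cell through the stages and checking whether two cells need the same line after a stage. The sketch below assumes destination-tag routing with a perfect shuffle before every stage; `omega_path` and `find_conflicts` are hypothetical helper names, and cells that lose a conflict are still traced to the end for simplicity.

```python
def omega_path(src, dst, bits=3):
    """Line occupied after each stage under destination-tag routing."""
    mask = (1 << bits) - 1
    pos, path = src, []
    for stage in range(bits):
        pos = ((pos << 1) | (pos >> (bits - 1))) & mask   # perfect shuffle
        pos = (pos & ~1) | ((dst >> (bits - 1 - stage)) & 1)
        path.append(pos)
    return path

def find_conflicts(cells, bits=3):
    """cells is a list of (src, dst); returns (stage, line, competing cells)."""
    paths = {cell: omega_path(cell[0], cell[1], bits) for cell in cells}
    conflicts = []
    for stage in range(bits):
        users = {}
        for cell, path in paths.items():
            users.setdefault(path[stage], []).append(cell)
        conflicts += [(stage, line, group)
                      for line, group in users.items() if len(group) > 1]
    return conflicts

# Path contention: different outputs (4 and 5), same internal line.
path_clash = find_conflicts([(0, 4), (4, 5)])
# Output port contention: both cells want output 3.
port_clash = find_conflicts([(0, 3), (1, 3)])
# No contention at all.
no_clash = find_conflicts([(0, 0), (4, 4)])
```

In the first case the cells from inputs 0 and 4 meet in the same first-stage switch and both need its lower output, even though their destinations differ.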

[Figure sequence: Path Contention, 8 × 8 omega network with ports 0 to 7, cells labelled 4 and 5 competing for a path]

Batcher Sorter & Banyan Network

One solution to the contention problem is to sort the cells into increasing order based on desired destination port

Banyan networks are a class of MINs with the property that there is a unique path between any pair of source and destination
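The effect of sorting can be checked on a small case. This sketch makes several assumptions: Python's built-in `sorted` stands in for the Batcher sorting network, the banyan stage is modelled as an 8 × 8 omega network with destination-tag routing, and the sorted cells are fed to consecutive inputs starting at 0. The helper names are hypothetical.

```python
def line_after_stage(src, dst, stage, bits=3):
    """Line a cell occupies after the given stage (0-based) of an omega network."""
    mask = (1 << bits) - 1
    pos = src
    for s in range(stage + 1):
        pos = ((pos << 1) | (pos >> (bits - 1))) & mask   # perfect shuffle
        pos = (pos & ~1) | ((dst >> (bits - 1 - s)) & 1)  # destination bit, MSB first
    return pos

def has_contention(dests, bits=3):
    """dests[i] is the destination of the cell entering at input i."""
    for stage in range(bits):
        lines = [line_after_stage(src, dst, stage, bits)
                 for src, dst in enumerate(dests)]
        if len(set(lines)) < len(lines):   # two cells need the same line
            return True
    return False

cells = [1, 0, 4, 6, 7, 3]                    # destinations in arrival order
blocked_unsorted = has_contention(cells)       # True
blocked_sorted = has_contention(sorted(cells)) # False
```

In arrival order, the cells at inputs 1 and 5 collide in the first stage; once sorted to 0, 1, 3, 4, 6, 7 all six cells pass without sharing a line.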

[Figure sequence: Batcher-Banyan Example, 8 × 8 network with ports 0 to 7; cells labelled 1, 0, 4, 6, 7, 3 are sorted step by step into 0, 1, 3, 4, 6, 7]

Clos Networks

Clos networks have three stages: the ingress stage, middle stage, and the egress stage. Each stage is made up of a number of crossbar switches

Benes Networks

Clos networks may also be generalised to any odd number of stages. By replacing each centre stage crossbar switch with a 3-stage Clos network, Clos networks of five stages may be constructed. By applying the same process repeatedly,

Hybrid Networks
◦ Multiple-backplane
◦ Hierarchical buses

Cluster tightly coupled computational units with high communication bandwidth

Provide lower-bandwidth intercluster communication link structures
◦ Performance comparable with homogeneous, high-bandwidth architectures
◦ Energy efficiency is a strong driver toward using hybrid architectures.

Cluster Based 2-D Mesh

At the lower level, each cluster consists of four processors connected by a bus.

At the higher level, a 2-D mesh connects the clusters. The broadcast capability of the bus is used at the cluster level

Evaluation I

Nodes  Topology  # of links  Nodal degree  Diameter  Avg. Dist  Reliability
7      BinTree   6           1.71          4         2.21       1
8      Ring      8           2             4         2.21       2
9      Mesh      12          2.66          4         2          2
8      Cube      12          3             3         1.71       3

Evaluation I

Nodes  Topology    # of links  Nodal degree  Reliability  Diameter  Avg. Dist
15     BinTree     14          1.87          1            6         3.5
16     Mesh        24          3             2            6         3
16     HyperCube   32          4             4            4         2.13
16     Chord.Ring  32          4             4            3         2

Power Consumption Under Different Number of Ports

Conclusion

Shared-medium topologies have a bottleneck on the shared medium, so they are not extensible

Direct topologies are easily extensible, but there are trade-offs between cost, performance and reliability

Embedding multiple logical dimensions in one physical dimension requires long wires, which is another disadvantage

Conclusion

Indirect topologies: blocking topologies have contention problems. Non-blocking networks have extra stages and costs.

Non-blocking networks are cheaper than a crossbar of the same size

Hybrid networks achieve high bandwidth and energy efficiency through clustering
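The cost comparison with a crossbar can be made concrete by counting crosspoints. This is a rough sketch under explicit assumptions: the non-blocking network is a Benes network built from 2 × 2 switches, each counted as 4 crosspoints, with 2·log2(N) − 1 stages of N/2 switches, and N is a power of two. The function names are hypothetical.

```python
def crossbar_crosspoints(n_ports):
    """A single N x N crossbar needs one crosspoint per input/output pair."""
    return n_ports * n_ports

def benes_crosspoints(n_ports):
    """Benes network of 2 x 2 switches; n_ports must be a power of two."""
    log_n = n_ports.bit_length() - 1          # exact log2 for powers of two
    stages = 2 * log_n - 1
    switches_per_stage = n_ports // 2
    return stages * switches_per_stage * 4    # 4 crosspoints per 2 x 2 switch

comparison = {n: (crossbar_crosspoints(n), benes_crosspoints(n))
              for n in (8, 16, 64)}
# {8: (64, 80), 16: (256, 224), 64: (4096, 1408)}
```

Under this counting the multistage network only becomes cheaper from about 16 ports upward, and its advantage grows quickly with N, since N² grows faster than N·log2(N).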

Conclusion

Interconnect contention (internal blocking) induces significant power consumption on internal buffers, and the power consumption on buffers will increase sharply as throughput increases.

References

[1] N. Magen, A. Kolodny, U. Weiser, and N. Shamir. Interconnect-power dissipation in a microprocessor. In SLIP’04, Feb. 2004.

[2] I. Cidon and I. Keidar. Zooming in on Network on Chip Architectures. Technion Department of Electrical Engineering, 2005.

[3] J. Duato, S. Yalamanchili, and L. Ni. Interconnection Networks: An Engineering Approach. IEEE Computer Society Press, Los Alamitos, CA, 1997.

[4] T. T. Ye. On-Chip Multiprocessor Communication Network Design and Analysis. Stanford University, Electrical Engineering, Dec. 2003.

[5] L. Benini and G. De Micheli. Networks on chips: a new SoC paradigm. IEEE Computer 35(1), 2002, pp. 70–78.

Questions ???

Thanks