CHAPTER 5

RECONFIGURABLE NETWORK ON CHIP ROUTER USING SPATIAL DIVISION MULTIPLEXING TECHNIQUE

5.1 INTRODUCTION

Multi-Processor System-on-Chip (MPSoC) architectures have become an attractive solution for the consumer multimedia embedded market (Wolf 2004). Although MPSoCs promise to significantly improve the processing capabilities and versatility of embedded systems, one major problem in their current and future design is the effectiveness of the interconnection mechanisms between the internal components, as the number of components grows with each new technology node. Bus-based designs are not able to cope with the heterogeneous and demanding communication requirements of MPSoCs (David Atienza 2008).

Networks-on-Chip (NoCs) have been suggested as a promising solution to the aforementioned scalability problem of forthcoming MPSoCs (Dally and Towles 2001; Benini and Micheli 2002). Unlike traditional bus-based on-chip communication architectures, NoCs use packets to route data from the source to the destination component, via a network fabric that consists of switches (routers) and interconnection links (groups of wires). NoCs build on the latest evolutions of bus architectures in terms of advanced protocols and topology design and, by bringing packet-based communication paradigms to the on-chip domain, they address many of the upcoming issues of interconnect fabric design better than buses (Angiolini 2006).

A NoC consists of several Processing Elements (PEs) connected together via routers and regular-sized wires. A PE (also referred to as a node) can be any component such as a microprocessor, an application-specific integrated circuit (ASIC) block, or a memory, or a combination of components connected together. A Network Interface (NI) at the boundary of each PE packetizes any data generated by the PE. The NI is connected to a router, which has buffers at its inputs to accept data packets from a PE or from other routers connected to it. A switch inside the router routes the data packets from the input buffers to the appropriate output link, based on the address in the packet header. Currently, NoCs exploit Time Division Multiplexing (TDM) to share network resources among circuits (Anthony Leroy 2008). A TDM router is basically composed of a switch, which connects the router input ports to the output ports, and an Output Reservation Table (ORT).

The ORT contains the switch configuration for each time slot, based on the decisions made by the routing algorithm (Anthony Leroy 2008). It is implemented as an SRAM that is read at each time slot to set up the corresponding switch configuration. This typically results in high network area and energy overhead and a long circuit set-up time (Anthony Leroy 2008). An alternative is the Spatial Division Multiplexing (SDM) technique, which exploits the fact that on-chip network links are physically made up of a set of wires (Anthony Leroy 2008). In SDM, only a subset of the link wires is allocated to a given virtual circuit. Messages are digit-serialized on a portion of the link (i.e., serialized on a group of wires). The switch configuration is set once and for all at connection set-up (Anthony Leroy 2008). No configuration memory is therefore needed inside the router, and the constraints on the reservation of the circuits are relaxed (Anthony Leroy 2008).

In this chapter, a reconfigurable NoC router based on the SDM technique is proposed. The proposed router uses a multiplexer switch instead of the standard crossbar, since the complexity of a crossbar increases with the number of inputs. A fixed network topology can be a disadvantage in NoC platforms because it may not match the application requirements (Pau 2008). It is therefore desirable to incorporate a certain level of configurability, and so the proposed router switch is used to build a reconfigurable SDM based router.

Dally and Towles (2001) introduced the concept of on-chip networks. The paper sketches the design of a simple on-chip network, the design choices made in that network, the advantages and disadvantages of on-chip networks, and the challenges and open research issues in the design of such networks. Benini and Micheli (2002) proposed viewing a SoC as a micronetwork of components, borrowing models, techniques, and tools from the network design field and applying them to SoC design. The network is the abstraction of the communication among components and must satisfy quality-of-service requirements (such as reliability, performance, and energy bounds) under the limitation of intrinsically unreliable signal transmission and significant communication delays on wires (Luca Benini 2002). David Atienza (2008) presents an overview of the benefits of state-of-the-art NoCs using a complete NoC synthesis flow, together with a detailed scalability analysis of different NoC implementations for the latest nanometer-scale technology nodes. That paper studies the current state of the art of NoC implementations using a design flow that targets the trends imposed by deep submicron manufacturing processes, and presents a comparative analysis of different NoC fabrics ranging from regular topologies to highly tuned custom NoCs.

Anthony Leroy (2008) compares two approaches for Switched Virtual Circuit (SVC) NoCs: TDM-based SVC and SDM-based SVC. The paper shows that the virtual-circuit set-up time, the area overhead, and the energy consumption of the SDM technique are generally better than those of the traditional TDM technique, and concludes that SDM is a valuable alternative to TDM. Pau (2008) proposed a configurable TDM router intended as a dedicated embedded module for NoC support in an FPGA. The configurable router is packet-switched and provides five bidirectional ports. Switching is performed by two 3x3 crossbars instead of a full 5x5 crossbar as a means of reducing area and power consumption. The control logic in each crossbar orchestrates all of the switching activities and channel arbitration according to the routing algorithm of the selected network topology and the traffic conditions during operation. In this chapter, some of the advantages of Pau (2008) and Anthony Leroy (2008) are combined in a reconfigurable SDM based NoC router.

The existing SDM router contains a switch and a switch control unit. The switch is slightly larger than in a TDM router, as it must be able to interconnect any group of wires present at a router input port to a group of wires of any output port. The TDM router offering m time slots is based on a PxP n-bit wide crossbar. For SDM, an n-bit port is divided into m individually switchable groups of n/m wires (Anthony Leroy 2008). Therefore, for the same bandwidth and number of segments, at the same clock frequency, the number of input and output ports of the switch is increased by a factor m for SDM, while the ports' bit-width is divided by a factor m. The SDM router would thus require a (Pxm) x (Pxm) n/m-bit wide crossbar. The SDM technique combines the simplicity of a non-virtual circuit-switched implementation (as the switch configuration is fixed for the circuit lifetime) with the flexibility of bandwidth allocation proper to the Switched Virtual Circuit technique (Anthony Leroy 2008).
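The crossbar dimensions quoted above can be compared with a small calculation. The following is a minimal sketch, assuming that crossbar cost is proportional to the number of single-bit cross-points (inputs x outputs x bit-width); this cost model and the function names are illustrative assumptions, not taken from Anthony Leroy (2008).

```python
def crossbar_crosspoints(inputs: int, outputs: int, width_bits: int) -> int:
    """Single-bit cross-points of an inputs x outputs crossbar (assumed cost model)."""
    return inputs * outputs * width_bits


def tdm_crosspoints(ports: int, width_bits: int) -> int:
    # TDM: one P x P crossbar, n bits wide.
    return crossbar_crosspoints(ports, ports, width_bits)


def sdm_crosspoints(ports: int, width_bits: int, segments: int) -> int:
    # SDM: a (P*m) x (P*m) crossbar, n/m bits wide.
    if width_bits % segments != 0:
        raise ValueError("port width must be divisible by the number of segments")
    return crossbar_crosspoints(
        ports * segments, ports * segments, width_bits // segments
    )


if __name__ == "__main__":
    P, n, m = 4, 16, 4  # illustrative values only
    print("TDM cross-points:", tdm_crosspoints(P, n))
    print("SDM cross-points:", sdm_crosspoints(P, n, m))
```

Under this assumed model the SDM crossbar needs m times as many cross-points as the TDM crossbar (P x P x n x m against P x P x n), which is one way to see why a full crossbar becomes costly as an SDM switch.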

Full crossbars are too complex to be used as an SDM router's switch (Anthony Leroy 2008). An interesting alternative to crossbars is to use Multistage Interconnection Network (MIN) switches. MIN switches are available in a wide variety and can reduce the cross-point cost. The price of using such a switch is paid either in bandwidth (longer clock cycles) or in delay (pipelined stages and multiple cycles to traverse the switch). As the number of cross-points in MIN switches is reduced, some input-output connections can no longer be realized, because one cross-point can be required by two connections simultaneously, resulting in a blocking state (Anthony Leroy 2008). MIN switches can be classified by how easily those blocking states can be avoided (Anthony Leroy 2008). In Strictly Non Blocking (SNB) switches, any new connection from a free input to a free output can always be realized (Anthony Leroy 2008). The same holds for Non Blocking (NB) switches, provided that the path taken through the switch is chosen carefully. In Rearrangeable Non Blocking (RNB) switches, an internal rerouting of existing connections might be necessary in certain situations to find a non-blocking solution, but a solution always exists (Anthony Leroy 2008). Finally, in blocking switches, some connections can be blocked by others without any alternative solution.

Among the different implementation possibilities, SNB switches are attractive, but their minimum cross-point cost is still high, which would lead to an area overhead comparable to that of the crossbar. To reduce the switch overhead to a minimum, an RNB Benes switch is proposed in Anthony Leroy (2008). However, the delay is systematically larger for the Benes switch than for the crossbar, because the Benes switch is composed of several switch stages interconnected by complex interconnect patterns that are relatively difficult to route on a 2D plane. Another disadvantage is that the control of a Benes switch is more complex than the control of a crossbar. The Benes switch needs a dedicated switch control unit that resolves any potential contention within the router, reserves the route within the switch, and controls the corresponding atomic switches (Anthony Leroy 2008). To overcome these disadvantages, a simple switch with less area, power, and delay overhead is proposed in this chapter. The proposed switch does not need complex control logic, so the configuration set-up of the NoC is easy to achieve.

5.2 PROPOSED WORK

In this section, two designs are proposed: a router switch and a reconfigurable SDM based router. Firstly, the design aspects of the proposed router switch are discussed. Secondly, the five topologies considered for the analysis of the NoC are discussed. Thirdly, the design of the proposed reconfigurable SDM router (incorporating the proposed router switch) is discussed.

5.2.1 Proposed Router Switch Design

Crossbars are commonly used for routing in SDM switches, but they incur high energy and area overhead, and full crossbars are too complex to be used as an SDM router's switch. Hence, an alternative switch for the router, built from a single multiplexer, is proposed in this section. An SDM router consists of a switch and a switch control unit. Figure 5.1 shows the structure of the proposed NoC router switch. The proposed router switch has four I/O ports, namely EAST, WEST, NORTH, and SOUTH. Each port is 16 bits wide. The 16 lines (wires) of a port (link) are divided into 4 groups of 4 wires each. The 4 groups are named A, B, C, and D, as shown in Figure 5.1.

Figure 5.1 Proposed NoC router switch

The switch shown in Figure 5.1 consists of a single multiplexer. This switch is present inside the router of every node. In each clock cycle, the control word (i.e. configuration) of the switch is updated. Depending on the control word, the multiplexer can switch to any group of wires of any port. In Figure 5.1, inputs from four sources A, B, C, and D are assumed to arrive at port 1 of the router. The bandwidth is 16 bits, since the port is 16 bits wide. Since there are four sources, the bandwidth is divided by four so that each source gets an equal share. The sources A, B, C, and D therefore each get a bandwidth of 4 bits (i.e. 25% of the total bandwidth). The bandwidth can be divided between the sources in any way, depending on the specifications. For example, source A can be allocated a bandwidth of 8 bits, source B 4 bits, source C 2 bits, and source D 2 bits. The group of wires allocated to each source during the initialization of the NoC is used only by that source throughout the lifetime of the NoC. In this section, the bandwidth is assumed to be 4 bits for all sources. The switch can also be extended to more than four sources.
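The behaviour described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the bit ordering (group A on the four least significant wires), the form of the control word as a (port, group) pair, and all function names are hypothetical and are not taken from the VHDL design.

```python
GROUPS = ("A", "B", "C", "D")
GROUP_WIDTH = 4  # wires per group for the equal-share case
PORT_WIDTH = 16  # wires per port


def extract_group(port_word: int, group: str) -> int:
    """Return the 4-bit value carried on one wire group of a 16-bit port word.

    Assumption: group A uses bits 3..0, B bits 7..4, C bits 11..8, D bits 15..12.
    """
    shift = GROUPS.index(group) * GROUP_WIDTH
    return (port_word >> shift) & 0xF


def mux_select(port_words: dict, control_word: tuple) -> int:
    """Model the single multiplexer: the control word names a port and a group."""
    port, group = control_word
    return extract_group(port_words[port], group)


def allocate_wires(shares: dict, port_width: int = PORT_WIDTH) -> dict:
    """Assign contiguous wire indices to each source, e.g. A=8, B=4, C=2, D=2."""
    if sum(shares.values()) != port_width:
        raise ValueError("shares must add up to the port width")
    allocation, start = {}, 0
    for source, width in shares.items():
        allocation[source] = range(start, start + width)
        start += width
    return allocation


if __name__ == "__main__":
    words = {"EAST": 0xDCBA, "WEST": 0x1234, "NORTH": 0x0000, "SOUTH": 0xFFFF}
    print(hex(mux_select(words, ("EAST", "A"))))
    print(allocate_wires({"A": 8, "B": 4, "C": 2, "D": 2}))
```

The `allocate_wires` helper mirrors the unequal 8/4/2/2 split mentioned in the text; the allocation is computed once and then left unchanged, in line with the fixed assignment of wires at initialization.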

5.2.2 Design of NoC for five different topologies

Let the sources be named S1, S2, S3 … Sn and the destinations D1, D2, D3 … Dn. In SDM, any two nodes are permanently connected by a link containing a fixed number of wires determined by the bandwidth (Anthony Leroy 2008). If the bandwidth of the NoC is assumed to be 16 bits, then the link between any two nodes consists of 16 wires. In SDM, each link is divided into subsets of wires, whereas in TDM the links between the nodes are not divided into groups or subsets of wires. In an SDM based NoC, each node contains a router switch that uses a control word to select an output port during data transfer. The routing path is consulted once during initialization and the wires are fixed according to it. Hence, SDM does not need to store the routing table in a separate memory block.

In TDM, no separate wires are allocated to each source. A single link (containing 16 wires) is present between any two nodes, and each source is given a distinct time slot, which is the only time during which it can transfer messages. In TDM, data from two or more sources may arrive at a router at the same instant, but only one can be forwarded through a node at a time. Hence, arbitration is required at every node of a TDM topology. In this work, the arbitration at every node is based on priority: data from source S1 is given the first priority, followed by S2, S3 … Sn.
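The fixed-priority rule described above can be sketched in a few lines. This is an illustrative model only, assuming that pending requests at a node are keyed by the source index (1 for S1, 2 for S2, and so on); the function names and data layout are hypothetical.

```python
def arbitrate(requests: dict):
    """Pick the pending request with the lowest source index (S1 first).

    `requests` maps a source index to its pending data. Returns a
    (source_index, data) pair, or None when nothing is pending.
    """
    if not requests:
        return None
    winner = min(requests)
    return winner, requests[winner]


def drain_order(requests: dict) -> list:
    """Order in which simultaneous arrivals are forwarded, one per turn."""
    pending = dict(requests)
    order = []
    while pending:
        winner, _ = arbitrate(pending)
        order.append(winner)
        del pending[winner]
    return order


if __name__ == "__main__":
    arrivals = {3: "data from S3", 1: "data from S1", 2: "data from S2"}
    print(arbitrate(arrivals))
    print(drain_order(arrivals))
```

A fixed-priority scheme like this is simple, but lower-priority sources wait longest when several sources arrive together, which is consistent with the waiting time attributed to TDM later in the chapter.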

The routing algorithm is based on the shortest-distance path between two nodes. It is static, that is, the routing path between a source and a destination is fixed initially and does not change afterwards. In this work, the routing path between any source-destination pair is assumed to be the same for both the TDM and SDM based NoCs that are modeled.

5.2.3 NoC with 16 nodes

The structure and routing table of 16 node NoCs modeled using various topologies are discussed in this section. The same structure and routing table are assumed for both the TDM and SDM NoCs analyzed in this work. The topologies considered are Mesh, Butterfly, Fat Tree, Ring, and Benes.

5.2.3.1 Mesh Topology

Figure 5.2 shows a 4x4 mesh topology. It is a direct blocking type

topology network (Sudeep Pasricha 2008).

S1, S2 … S8: Sources; D1, D2 … D4: Destinations; X: Undefined source; N1, N2 … N16: Nodes

Figure 5.2 Mesh Topology

There are 16 nodes (N1 to N16) in Figure 5.2, arranged as a 4x4 grid. The sources are denoted by S1, S2 … S8. The undefined sources indicated by X denote that sources can be added to those nodes in future if required. The 2-D mesh is one of the most popular NoC topologies because all links have the same length, which eases physical design (Sudeep Pasricha 2008). Every node in a 2-D mesh is connected to four neighboring nodes, except for the nodes at the edges. The area of a mesh grows linearly with the number of nodes. Meshes must also be designed so as to avoid traffic accumulating in the center of the mesh, which reduces performance (Sudeep Pasricha 2008).

The common routing path followed by TDM and SDM is shown in

Table 5.1. It shows the path taken by the data in traveling from a particular

source to a particular destination. For example, if source S1 needs to transfer

data to destination D1 it follows the path consisting of nodes numbered

{1, 2, 3, 4} as shown in Table 5.1.

Table 5.1 Routing table used in Mesh topology

Source | D1 | D2 | D3 | D4
S1 | 1,2,3,4 | 1,6,7,8 | 1,10,11,12 | 1,14,15,16
S2 | 5,2,3,4 | 5,6,7,8 | 5,10,11,12 | 5,14,15,16
S3 | 9,2,3,4 | 9,6,7,8 | 9,10,11,12 | 9,14,15,16
S4 | 13,2,3,4 | 13,6,7,8 | 13,10,11,12 | 13,14,15,16
S5 | 4 | 4,8 | 4,8,12 | 4,8,12,16
S6 | 8,4 | 8 | 8,12 | 8,12,16
S7 | 12,8,4 | 12,8 | 12 | 12,16
S8 | 16,12,8,4 | 16,12,8 | 16,12 | 16

S1, S2 … S8: Sources; D1, D2 … D4: Destinations
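Because the routing is static, Table 5.1 can be treated as a fixed lookup structure. The sketch below encodes the table entries as given and looks up a path; the dictionary layout and the function names are illustrative assumptions rather than part of the modeled design.

```python
# Static routes copied from Table 5.1 (node numbers along each path).
MESH_ROUTES = {
    "S1": {"D1": [1, 2, 3, 4], "D2": [1, 6, 7, 8], "D3": [1, 10, 11, 12], "D4": [1, 14, 15, 16]},
    "S2": {"D1": [5, 2, 3, 4], "D2": [5, 6, 7, 8], "D3": [5, 10, 11, 12], "D4": [5, 14, 15, 16]},
    "S3": {"D1": [9, 2, 3, 4], "D2": [9, 6, 7, 8], "D3": [9, 10, 11, 12], "D4": [9, 14, 15, 16]},
    "S4": {"D1": [13, 2, 3, 4], "D2": [13, 6, 7, 8], "D3": [13, 10, 11, 12], "D4": [13, 14, 15, 16]},
    "S5": {"D1": [4], "D2": [4, 8], "D3": [4, 8, 12], "D4": [4, 8, 12, 16]},
    "S6": {"D1": [8, 4], "D2": [8], "D3": [8, 12], "D4": [8, 12, 16]},
    "S7": {"D1": [12, 8, 4], "D2": [12, 8], "D3": [12], "D4": [12, 16]},
    "S8": {"D1": [16, 12, 8, 4], "D2": [16, 12, 8], "D3": [16, 12], "D4": [16]},
}


def mesh_route(source: str, destination: str) -> list:
    """Return a copy of the fixed node path for a source-destination pair."""
    return list(MESH_ROUTES[source][destination])


def nodes_on_route(source: str, destination: str) -> int:
    """Number of nodes traversed, including the first and last node."""
    return len(MESH_ROUTES[source][destination])


if __name__ == "__main__":
    print(mesh_route("S1", "D1"))
    print(nodes_on_route("S8", "D1"))
```

In an SDM NoC such a table would only be read during initialization to fix the wire groups, whereas a TDM router would consult an equivalent structure at run time.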

5.2.3.2 FAT Tree Topology

Figure 5.3 shows fat tree topology. It is an indirect multistage

topology network (Sudeep Pasricha 2008). It has a tree structure with parent

and child nodes.

N1, N2 … N16: Nodes

Figure 5.3 FAT Tree Topology

There are 16 sources/destinations in total (S1/D1, S2/D2 … S16/D16), one at each node. The number of links between adjacent switches increases towards the root of the tree (Sudeep Pasricha 2008), which allocates more bandwidth to the channels that carry higher traffic (Sudeep Pasricha 2008).

The common routing path followed by TDM and SDM is shown in Table 5.2, which gives the routing path for transferring data from each source to each destination. For example, if source S1 needs to transfer data to destination D13, it follows the path consisting of nodes numbered {1, 3, 6, 13}, as shown in Table 5.2.

Table 5.2 Routing table used in FAT Tree topology

Destination | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 | S11 | S12 | S13 | S14 | S15 | S16
D16 | 1,2,4,8,16 | 2,4,8,16 | Nil | 4,8,16 | Nil | Nil | Nil | 8,16 | Nil | Nil | Nil | Nil | Nil | Nil | Nil | 16
D15 | 1,3,7,15 | Nil | 3,7,15 | Nil | Nil | Nil | 7,15 | Nil | Nil | Nil | Nil | Nil | Nil | Nil | 15 | M/C
D14 | 1,3,7,14 | Nil | 3,7,14 | Nil | Nil | Nil | 7,14 | Nil | Nil | Nil | Nil | Nil | Nil | 14 | M/C | M/C
D13 | 1,3,6,13 | Nil | 3,6,13 | Nil | Nil | 6,13 | Nil | Nil | Nil | Nil | Nil | Nil | 13 | M/C | M/C | M/C
D12 | 1,3,6,12 | Nil | 3,6,12 | Nil | Nil | 6,12 | Nil | Nil | Nil | Nil | Nil | 12 | M/C | M/C | M/C | M/C
D11 | 1,2,5,11 | 2,5,11 | Nil | Nil | 5,11 | Nil | Nil | Nil | Nil | Nil | 11 | M/C | M/C | M/C | M/C | M/C
D10 | 1,2,5,10 | 2,5,10 | Nil | Nil | 5,10 | Nil | Nil | Nil | Nil | 10 | M/C | M/C | M/C | M/C | M/C | M/C
D9 | 1,2,4,9 | 2,4,9 | Nil | 4,9 | Nil | Nil | Nil | Nil | 9 | M/C | M/C | M/C | M/C | M/C | M/C | M/C
D8 | 1,2,4,8 | 2,4,8 | Nil | 4,8 | Nil | Nil | Nil | 8 | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C
D7 | 1,3,7 | Nil | 3,7 | Nil | Nil | Nil | 7 | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C
D6 | 1,3,6 | Nil | 3,6 | Nil | Nil | 6 | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C
D5 | 1,2,5 | 2,5 | Nil | Nil | 5 | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C
D4 | 1,2,4 | 2,4 | Nil | 4 | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C
D3 | 1,3 | Nil | 3 | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C
D2 | 1,2 | 2 | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C
D1 | 1 | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C | M/C

S1, S2 … S16: Sources; D1, D2 … D16: Destinations

M/C: Mother/Child node; Nil: No Connection
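The numeric entries of Table 5.2 follow a regular pattern: every listed path descends the tree, and each node k appears directly below node k // 2 (for example 13 below 6, 6 below 3, 3 below 1). The sketch below reproduces those entries under that assumed numbering, which is inferred from the table and not stated in the text. It returns None for every pair that the table marks as Nil or M/C, and does not distinguish between those two markings.

```python
def fat_tree_path(source_node: int, destination_node: int):
    """Downward path from source_node to destination_node, or None.

    Assumption (inferred from Table 5.2): the parent of node k is k // 2,
    with node 1 as the root. A path is returned only when the source is
    the destination itself or one of its ancestors.
    """
    if source_node < 1 or destination_node < 1:
        raise ValueError("node numbers start at 1")
    path = [destination_node]
    node = destination_node
    # Walk up from the destination until reaching or passing the source.
    while node > source_node:
        node //= 2
        path.append(node)
    if node != source_node:
        return None  # table entry would be Nil or M/C
    return path[::-1]


if __name__ == "__main__":
    print(fat_tree_path(1, 13))
    print(fat_tree_path(2, 16))
    print(fat_tree_path(2, 13))
```

Such a rule is only a compact description of the tabulated routes; the modeled NoC uses the fixed table itself.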

5.2.3.3 Ring Topology

Figure 5.4 shows a unidirectional ring topology network. It is a direct, blocking type of topology.

N1, N2 … N16: Nodes

Figure 5.4 Ring Topology

There are 16 sources/destinations in total (S1/D1, S2/D2 … S16/D16), one at each node. Each node can act either as a source or as a destination (Sudeep Pasricha 2008). The common routing path followed by TDM and SDM is shown in Appendix 1, which gives the routing path for transferring data from each source to each destination. For example, if source S1 needs to transfer data to destination D6, it follows the path of nodes numbered {1, 2, 3, 4, 5, 6} (Appendix 1).
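In a unidirectional ring the path is fully determined by the direction of travel. The sketch below reproduces the S1 to D6 example, assuming that data travels in ascending node order and wraps from N16 back to N1; the wrap-around direction is an assumption, since Appendix 1 is not reproduced here, and the function name is hypothetical.

```python
def ring_path(source: int, destination: int, size: int = 16) -> list:
    """Nodes visited on a unidirectional ring, travelling in ascending order.

    Assumption: node k forwards to node k + 1, and node `size` forwards to node 1.
    """
    if not (1 <= source <= size and 1 <= destination <= size):
        raise ValueError("node numbers must lie between 1 and size")
    path = [source]
    node = source
    while node != destination:
        node = node % size + 1  # next node, wrapping size -> 1
        path.append(node)
    return path


if __name__ == "__main__":
    print(ring_path(1, 6))
    print(ring_path(15, 2))
```

Under this assumption the worst case visits all 16 nodes, which is in line with the Ring topology showing the largest delay in the results that follow.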

5.2.3.4 Butterfly Topology

Figure 5.5 shows Butterfly topology network with 8 sources (S1,

S2…..S8), 8 destinations (D1, D2…D8) and 16 nodes (N1, N2…N16). It is an

indirect type topology network (Sudeep Pasricha 2008).

S1, S2 … S8: Sources; D1, D2 … D8: Destinations; N1, N2 … N16: Nodes

Figure 5.5 Butterfly Topology

The butterfly network is a blocking multi-stage network, which implies that information may be temporarily blocked or dropped in the network if contention occurs (Sudeep Pasricha 2008). The routing table used in the Butterfly topology network is shown in Appendix 1; it gives the routing path for transferring data from each source to each destination. For example, if source S1 needs to transfer data to destination D1, it follows the path {1, 2, 3, 4} (Appendix 1).

5.2.3.5 Benes Topology

Figure 5.6 shows a 16 node Benes topology. The Benes network is

an example of a rearrangeable network in which paths may have to be

rearranged to provide a connection, requiring an appropriate controller. It is a

non-blocking network (Sudeep Pasricha 2008). Hence, an alternative routing

path also exists between any source-destination pair.

S1, S2 … S8: Sources; D1, D2 … D8: Destinations; N1, N2 … N16: Nodes

Figure 5.6 Benes Topology

There are 8 sources (S1, S2 … S8) and 16 nodes (N1, N2 … N16) in the network, and D1/D2, D3/D4, D5/D6, and D7/D8 are destination pairs that share the same routing path. The destinations in a pair are therefore distinguished only by the header information stored along with the data. The routing path followed in the Benes network is shown in Appendix 3, which gives the routing path for transferring data from each source to each destination. For example, if source S1 needs to transfer data to destination D1, it follows the path {1, 2, 3, 4} (Appendix 3).

5.2.4 Proposed reconfigurable SDM based NoC router

The simulation results given in section 5.3.1 show that an SDM based NoC is better than a TDM based NoC in terms of power and execution time (delay). The SDM based NoC is modeled for five different topologies (Mesh, Ring, Fat Tree, Butterfly, and Benes). The simulation results show that each topology has its own advantages and disadvantages. Hence, a reconfigurable SDM NoC that can adapt its topology to the application requirements is proposed in this section.

In the proposed router, the parameter used for reconfigurability is the topology, which is reconfigurable at run time. The two topologies selected for reconfigurability are Mesh and Benes. During operation, the proposed router selects one of the two topologies for transferring data from the source nodes to their corresponding destination nodes concurrently. The criterion used for reconfiguration (i.e. topology selection) is the number of source nodes making a request at a time.

Among the five topologies modeled, the simulation results show that Benes has the minimum delay, logic utilization, and I/O buffer utilization (refer to Table 5.5, Table 5.6, and Table 5.7). Therefore, Benes is the best topology in terms of speed/delay and area. In the Benes topology, the minimum size of NoC that can be constructed is 16 nodes. The disadvantage of the Benes topology is that inserting and deleting source and destination nodes is difficult: a single node cannot be inserted into the structure, and the entire Benes structure has to be replicated (refer to Figure 5.6). The size of the NoC therefore increases, and the area, power, and delay increase with it. The size of the Benes network can be expanded only in steps, to 16, 20, 24, 28, 32 nodes, and so on.

Therefore, the Benes topology is suitable only for NoCs with a large number of nodes. The simulation results also show that, among the five topologies modeled, the Mesh topology is the best in terms of scalability and simplicity. In this topology, each router node is connected to its neighboring router nodes. The interconnections between the nodes are easy to understand and routing is simple. It is scalable because, to increase the size of the NoC, any number of new nodes can be added in future without affecting the existing structure or routing table; that is, inserting or deleting one or more nodes is easy in a Mesh network. Unlike the Benes topology, the size of a Mesh is not fixed, so it scales easily to any size. The disadvantage of a Mesh network is that its delay increases linearly with the number of nodes, so performance degrades as the size of the network is increased. Therefore, the Mesh topology is suitable only for NoCs with a small number of nodes. Thus the two topologies have both advantages and disadvantages. It is therefore better to design a NoC with both topologies that can choose either one during operation, depending on the application requirements. Hence, a reconfigurable NoC is proposed in this section.

The size of the proposed NoC router is 16 nodes. If fewer than 4 source nodes (for example, 3) request the NoC for a data transfer at a time, the proposed router works with the Mesh topology. If the number of source nodes requesting simultaneously is greater than or equal to 4, the proposed router works with the Benes topology. Figure 5.7 shows how reconfiguration takes place in the proposed router. The reason why the value 4 is taken as the limit is discussed in section 5.3.

Decision: is the number of masters that sent a request greater than or equal to 4? Yes: Benes topology is chosen. No: Mesh topology is chosen.

Figure 5.7 Flow Chart showing reconfigurability
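The decision in Figure 5.7 amounts to a single threshold comparison. A minimal sketch follows, assuming the number of simultaneously requesting masters is already available as an integer; the constant and function names are hypothetical and the actual router is modeled in VHDL.

```python
BENES_THRESHOLD = 4  # limit stated in the text


def select_topology(requesting_masters: int) -> str:
    """Return the topology the router would use for this number of requests."""
    if requesting_masters < 0:
        raise ValueError("number of requesting masters cannot be negative")
    return "Benes" if requesting_masters >= BENES_THRESHOLD else "Mesh"


if __name__ == "__main__":
    for count in (1, 3, 4, 8):
        print(count, select_topology(count))
```

Keeping the threshold as a named constant makes it easy to experiment with other limits when repeating the analysis that justifies the value 4.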

5.3 SIMULATION RESULTS

The NoCs are modeled in VHDL and all the simulation results are obtained using Xilinx ISE 9.2i. The topologies analyzed are Mesh, Ring, Fat Tree, Butterfly, and Benes. Two types of NoCs, TDM based and SDM based, are modeled for all five topologies, each with a size of 16 nodes. Firstly, the TDM and SDM based NoCs are compared in terms of delay and power, both obtained from the simulation results. The results show that the SDM based NoC is better than the TDM based NoC, so the SDM based NoC router is analyzed further for two NoC sizes, 16 nodes and 32 nodes. This analysis is carried out for all five topologies, to find which topology gives the best performance in the proposed SDM based NoC router.

The parameters considered for this analysis are delay, buffer utilization, and logic utilization. The simulation results show that the Benes topology is the best in terms of delay/speed, whereas the Mesh topology is the best in terms of scalability and simplicity. Therefore, to bring the advantages of both the Benes and Mesh topologies into the proposed SDM based NoC, reconfigurability is introduced: the proposed NoC can dynamically reconfigure its topology depending on the application's requirements (as mentioned in section 5.2.4). The various analyses performed are discussed below.

5.3.1 Comparison between TDM and SDM based NoC routers

The NoC size selected for this analysis is 16 nodes. The parameters

considered for this analysis are Delay and Power. The delay and power results

are discussed below in this section.

5.3.1.1 Power results

Table 5.3 shows the power results for TDM and SDM based NoCs. The table shows that SDM based NoCs consume less power than TDM based NoCs for all five topologies modeled. A TDM router consists of a switch, switch control logic, and the memory that stores the routing table. Its power consumption is due to two components, both present at every node in the NoC: the logic component (the switch at each node, the switch control logic, etc.) and the memory component (where the routing table is stored). Whenever data from a source reaches a router node, the router of that node consults the routing table in the memory to forward the data to the next node on the path towards its destination. The memory therefore has to be consulted at every node on the communication path. Likewise, at every node the memory has to be refreshed during every time slot. This requires a lot of switching activity in the NoC, so power consumption is high in a TDM based NoC. When the size of the NoC is increased, the size of the routing table and of the memory also increases, which increases the power consumption further.

In an SDM router, the NoC is initially configured according to the contents of the routing table. Once the configuration is done, it is not altered throughout the lifetime of the NoC. The SDM router therefore does not require a memory to store the routing table and, unlike the TDM router, has no need to refresh a memory during every time slot. Hence, the power consumption of the SDM router is lower than that of the TDM router, as shown in Table 5.3.

Table 5.3 Power results for TDM and SDM routers

Topology | TDM: power due to logic component (mW) | TDM: power due to memory component (mW) | TDM: total power (mW) | SDM: total power, due to logic component only (mW)
Mesh | 19 | 34 | 53 | 25
FAT Tree | 10 | 35 | 45 | 23
Ring | 12 | 44 | 56 | 36
Butterfly | 7 | 67 | 74 | 42
Benes | 13 | 23 | 36 | 16
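The relative saving implied by Table 5.3 can be computed directly from the tabulated values. The sketch below uses only the numbers in the table; the percentage figure it derives is an illustration of the comparison and is not a result reported in the text, and the names used are hypothetical.

```python
# (TDM logic, TDM memory, SDM total) in mW, copied from Table 5.3.
POWER_MW = {
    "Mesh": (19, 34, 25),
    "FAT Tree": (10, 35, 23),
    "Ring": (12, 44, 36),
    "Butterfly": (7, 67, 42),
    "Benes": (13, 23, 16),
}


def tdm_total_mw(topology: str) -> int:
    """TDM total power as the sum of its logic and memory components."""
    logic, memory, _ = POWER_MW[topology]
    return logic + memory


def power_reduction_percent(topology: str) -> float:
    """Percentage by which SDM total power is below TDM total power."""
    sdm = POWER_MW[topology][2]
    tdm = tdm_total_mw(topology)
    return 100.0 * (tdm - sdm) / tdm


if __name__ == "__main__":
    for name in POWER_MW:
        print(f"{name}: TDM {tdm_total_mw(name)} mW, "
              f"SDM {POWER_MW[name][2]} mW, "
              f"reduction {power_reduction_percent(name):.1f}%")
```

The sums of the logic and memory columns reproduce the TDM totals in the table, which gives a quick consistency check on the reconstructed rows.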

5.3.1.2 Delay results

The delay parameter refers to the time taken for data to travel from the source node to the destination node. The results are shown in Table 5.4. For the TDM router the delay is high because at each node the router has to consult the routing table in the memory, which introduces a delay overhead at every node. Also, in TDM, the data of a particular source can be transmitted only during its own time slot, so the waiting time is longer in a TDM router. In SDM, the data from each source passes through independent, preallocated links, so the data from all the sources can be transferred concurrently to their respective destinations. The delay is much lower than in TDM, so SDM based NoCs are faster than TDM based NoCs. Since the data from different sources is transferred concurrently, the throughput of the NoC is increased. Hence, SDM routers achieve higher performance.

Table 5.4 Delay results for TDM and SDM routers

Topology | TDM (ns) | SDM (ns)
Mesh | 22.438 | 5.746
FAT Tree | 18.814 | 5.506
Ring | 32.507 | 10.165
Butterfly | 19.139 | 5.513
Benes | 22.895 | 5.408
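The delay figures in Table 5.4 can be expressed as a ratio of TDM delay to SDM delay for each topology. The following sketch only rearranges the tabulated values; the ratio itself is a derived illustration and is not quoted in the text, and the names used are hypothetical.

```python
# (TDM delay, SDM delay) in ns, copied from Table 5.4.
DELAY_NS = {
    "Mesh": (22.438, 5.746),
    "FAT Tree": (18.814, 5.506),
    "Ring": (32.507, 10.165),
    "Butterfly": (19.139, 5.513),
    "Benes": (22.895, 5.408),
}


def delay_ratio(topology: str) -> float:
    """How many times longer the TDM delay is than the SDM delay."""
    tdm, sdm = DELAY_NS[topology]
    return tdm / sdm


if __name__ == "__main__":
    for name in DELAY_NS:
        print(f"{name}: TDM/SDM delay ratio = {delay_ratio(name):.2f}")
    print("largest ratio:", max(DELAY_NS, key=delay_ratio))
```

For the values in the table the ratio is above 3 for every topology, which quantifies the statement that SDM based NoCs are faster than TDM based NoCs.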

Comparing the power and delay results of the TDM and SDM routers shows that SDM routers are more efficient than TDM routers in terms of power consumption and speed/delay. A TDM router needs a memory to store the routing table, whereas an SDM router does not. The size of the router, and hence the area overhead, is therefore smaller for the SDM router. SDM based NoCs are thus considered more efficient than TDM based NoCs.

5.3.2 Comparison among the five topologies used in SDM routers

The previous section concluded that SDM based NoCs are more efficient than TDM based NoCs. In order to identify the best topology, an analysis is made using the following parameters: delay, logic utilization, and buffer utilization. The SDM router is modeled for the five topologies mentioned and for two NoC sizes, 16 nodes and 32 nodes. Table 5.5 shows the delay results for the various topologies. The results show that the delay increases with the number of nodes for all the topologies. Of the five topologies shown in Table 5.5, Benes has the minimum delay, while Ring and Mesh have greater delays than the remaining topologies.


Table 5.5 Delay for various topologies used in SDM router

Topology     16 Nodes (ns)    32 Nodes (ns)
Mesh         5.746            13.784
FAT Tree     5.506            10.393
Ring         10.165           17.108
Butterfly    5.513            11.078
Benes        5.408            10.139
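
The growth in delay with network size can be read off Table 5.5 with a short calculation. The Python sketch below computes the relative increase in delay when the NoC grows from 16 to 32 nodes. The names DELAY_NS, growth_percent and GROWTH are hypothetical, and the percentage metric is an illustrative assumption rather than an analysis taken from the original work.

```python
# Delay values (ns) copied from Table 5.5: (16 nodes, 32 nodes).
DELAY_NS = {
    "Mesh": (5.746, 13.784),
    "FAT Tree": (5.506, 10.393),
    "Ring": (10.165, 17.108),
    "Butterfly": (5.513, 11.078),
    "Benes": (5.408, 10.139),
}


def growth_percent(delay_16, delay_32):
    """Return the relative delay increase from 16 to 32 nodes, in percent."""
    return 100.0 * (delay_32 - delay_16) / delay_16


GROWTH = {name: growth_percent(d16, d32) for name, (d16, d32) in DELAY_NS.items()}

for name, percent in GROWTH.items():
    print(f"{name:10s} +{percent:.1f}%")
```

With these values, Mesh shows the largest relative increase (about 140%), while Benes keeps the lowest absolute delay at both network sizes.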

Logic utilization refers to the total number of logic gates required to model each topology. These values are shown in Table 5.6. From Table 5.6, it is clear that Benes has the minimum logic utilization. Benes uses fewer logic gates since it involves less switching of data.

Table 5.6 Logic Utilization for various topologies in SDM

Topology     16 Nodes    32 Nodes
Mesh         52          82
FAT Tree     64          108
Ring         58          92
Butterfly    54          86
Benes        48          74

I/O Buffer Utilization refers to the number of buffers utilized by the

topologies. Table 5.7 shows the buffer utilization values.


Table 5.7 I/O Buffer Utilization for Various Topologies in SDM

Topology     16 Nodes    32 Nodes
Mesh         72          139
FAT Tree     144         195
Ring         138         172
Butterfly    103         156
Benes        64          97

From Table 5.7, it is clear that Benes has the minimum I/O buffer utilization. Since the other topologies have to transfer data through a larger number of nodes than Benes (Appendix 3), they require more buffers to store the data at each node.

From the values in Tables 5.5, 5.6, and 5.7, it is found that Benes topology has the minimum delay, logic utilization, and I/O buffer utilization. Therefore, Benes topology is considered the best in terms of performance (i.e. speed) and area optimization, which in turn leads to power optimization.
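
The comparison across the three parameters can be checked mechanically. The Python sketch below collects the values of Tables 5.5, 5.6 and 5.7 and reports the topology with the minimum value for each parameter and network size. The names METRICS, best_topology and WINNERS are hypothetical and serve only as an illustration.

```python
# Values copied from Tables 5.5, 5.6 and 5.7 as (16 nodes, 32 nodes).
METRICS = {
    "delay_ns": {
        "Mesh": (5.746, 13.784),
        "FAT Tree": (5.506, 10.393),
        "Ring": (10.165, 17.108),
        "Butterfly": (5.513, 11.078),
        "Benes": (5.408, 10.139),
    },
    "logic_utilization": {
        "Mesh": (52, 82),
        "FAT Tree": (64, 108),
        "Ring": (58, 92),
        "Butterfly": (54, 86),
        "Benes": (48, 74),
    },
    "io_buffer_utilization": {
        "Mesh": (72, 139),
        "FAT Tree": (144, 195),
        "Ring": (138, 172),
        "Butterfly": (103, 156),
        "Benes": (64, 97),
    },
}


def best_topology(table, size_index):
    """Return the topology with the smallest value at the given network size."""
    return min(table, key=lambda name: table[name][size_index])


# One winner per (parameter, network size) pair.
WINNERS = {
    (metric, nodes): best_topology(table, index)
    for metric, table in METRICS.items()
    for index, nodes in enumerate((16, 32))
}

for (metric, nodes), name in WINNERS.items():
    print(f"{metric:22s} {nodes} nodes: {name}")
```

For every parameter and both network sizes the minimum belongs to Benes, which agrees with the conclusion drawn from the three tables.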

5.3.3 Reconfigurability of the proposed router

From the discussions in Sections 5.3.1 and 5.3.2, Benes and Mesh topologies seem to be better than the remaining topologies. Hence, a reconfigurable NoC is designed based on these two topologies, with the “number of masters that make simultaneous requests” as the criterion deciding the mode of configuration. Table 5.8 shows the delay characteristics of the two topologies for a varying number of simultaneous requests made by the masters. If the number of masters that make simultaneous requests is less than 4, the proposed router chooses Mesh topology for its operation. If that number is greater than or equal to 4, the proposed router chooses Benes topology. The choice of 4 as the limit is derived from Table 5.8.

Table 5.8 Number of masters that make simultaneous requests Vs Delay

Number of masters that make    Delay in Mesh     Delay in Benes
simultaneous requests          topology (ns)     topology (ns)
1 Master                       3.758             4.490
2 Masters                      4.321             4.531
3 Masters                      4.499             4.612
4 Masters                      4.780             4.658
5 Masters                      4.800             4.726
6 Masters                      5.213             4.986
7 Masters                      5.572             5.314
8 Masters                      5.746             5.408

Table 5.8 shows that when the number of masters (sources) making simultaneous requests is less than four, the delay of the Mesh network is lower than the delay of the Benes network. When the number of masters making simultaneous requests is four or more, the delay of the Mesh network exceeds the delay of the Benes network. Hence, the value four is taken as the criterion for dynamic topology reconfiguration. When the number of masters is small, the complexity of Mesh topology is low, since the path between a source and a destination contains only a few intermediate nodes, whereas in Benes topology the data has to travel through comparatively more intermediate nodes to reach its destination. When the number of masters is large, the data in Mesh topology has to travel through more intermediate nodes than in Benes topology. Therefore, Mesh topology is suitable when few masters make requests and Benes topology is suitable when many masters make requests.
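
The reconfiguration rule can be summarized in a few lines of Python. This is a behavioral sketch only and not the hardware description of the proposed router: the function names crossover_point and select_topology are hypothetical, and the sketch assumes that the number of masters currently making requests is available as an integer.

```python
# Delay values (ns) copied from Table 5.8 for 1 to 8 simultaneous masters.
MESH_NS = [3.758, 4.321, 4.499, 4.780, 4.800, 5.213, 5.572, 5.746]
BENES_NS = [4.490, 4.531, 4.612, 4.658, 4.726, 4.986, 5.314, 5.408]


def crossover_point(mesh_ns, benes_ns):
    """Return the smallest master count at which Benes is faster than Mesh."""
    for masters, (mesh, benes) in enumerate(zip(mesh_ns, benes_ns), start=1):
        if benes < mesh:
            return masters
    return None  # Benes never becomes faster in the supplied data.


# Threshold derived from the table rather than hard-coded.
THRESHOLD = crossover_point(MESH_NS, BENES_NS)


def select_topology(active_masters, threshold=THRESHOLD):
    """Pick the configuration mode from the number of requesting masters."""
    return "Mesh" if active_masters < threshold else "Benes"


for masters in range(1, 9):
    print(masters, select_topology(masters))
```

Deriving the threshold from the measured delays, instead of fixing it as a constant, makes the dependence on Table 5.8 explicit: with the values above the crossover occurs at four masters.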

5.3.4 Application of multimedia inputs to the proposed NoC router

The working of the proposed NoC router is verified for multimedia inputs. The steps in applying the multimedia inputs are discussed in this section. Multimedia inputs such as audio and images are supplied to the NoC using MATLAB (it is assumed that the multimedia inputs are generated by the sources connected to the NoC). At this stage, the multimedia inputs are present as parallel data streams. In a practical situation, the network interface in the NoC converts these parallel data streams to serial data before transmission through the NoC. Since this work concentrates only on router design, the conversion of parallel data into serial data is done externally in MATLAB. The multimedia inputs are read from files as decimal values. The decimal values are converted into bits (parallel streams), and the parallel streams are then converted into a serial stream of bits. The serial streams of bits from the sources enter the NoC at different nodes and, after traveling through their respective communication paths, reach their respective destinations.
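
The preprocessing steps described above (decimal values, then parallel bit words, then a serial bit stream) can be sketched as follows. The original work performs this step in MATLAB; the Python version below is only an illustration, and the function names as well as the word width of 8 bits are assumptions.

```python
def to_parallel_words(samples, width=8):
    """Convert each non-negative decimal sample to a fixed-width bit string."""
    words = []
    for sample in samples:
        if not 0 <= sample < 2 ** width:
            raise ValueError(f"sample {sample} does not fit in {width} bits")
        words.append(format(sample, f"0{width}b"))
    return words


def serialize(words):
    """Concatenate the parallel words into one serial stream of bits."""
    return "".join(words)


def deserialize(stream, width=8):
    """Recover the decimal samples from a serial stream at the destination."""
    return [int(stream[i:i + width], 2) for i in range(0, len(stream), width)]


samples = [5, 255, 128]                # assumed example values
words = to_parallel_words(samples)     # parallel stream, one word per sample
stream = serialize(words)              # serial stream entering the NoC
print(words)
print(stream)
print(deserialize(stream))
```

The deserialize function is included only to show that the conversion is lossless when the word width is known at the destination.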

5.3.4.1 Audio Data

The audio files must be in wave format to be read into MATLAB. Hence, the default audio wave files of the computer system are used, namely Chimes, Cord, Ringout, and Town. The MATLAB tool stores the input data values in text files. The outputs reaching the destinations are also written to new text files. The input and output files are compared for verification. The comparison shows that the output files contain the same bit streams as the input files. This shows that the transmission is successful and verifies the working of the proposed NoC router.

5.3.4.2 Image Data

Image files are also used as inputs to the NoC. The image files used are the default image files available in MATLAB. The image data are applied as inputs through the source nodes of the NoC. The input and output files are compared with each other for verification. The comparison shows that the output files contain the same bit streams as the input files. This shows that the transmission is successful and verifies the functionality of the proposed NoC router.
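
The verification step used for both the audio and the image tests, a comparison of the input and output text files, can be sketched in Python as follows. The file names and helper functions are hypothetical, and the transfer through the NoC is replaced here by a placeholder that copies the stream unchanged, so the sketch illustrates only the comparison and not the router itself.

```python
import tempfile
from pathlib import Path


def first_mismatch(sent, received):
    """Return the index of the first differing bit, or None if identical."""
    for index, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return index
    if len(sent) != len(received):
        return min(len(sent), len(received))  # one stream is truncated
    return None


def files_match(input_path, output_path):
    """Compare the bit streams stored in the input and output text files."""
    sent = Path(input_path).read_text().strip()
    received = Path(output_path).read_text().strip()
    return first_mismatch(sent, received) is None


def demo():
    """Write a stream, copy it as a stand-in for the NoC, then compare."""
    bits = "0000010111111111"  # assumed example stream
    with tempfile.TemporaryDirectory() as folder:
        source = Path(folder) / "input.txt"
        sink = Path(folder) / "output.txt"
        source.write_text(bits)
        sink.write_text(source.read_text())  # placeholder for the router model
        return files_match(source, sink)


print(demo())
```

Reporting the position of the first mismatch, rather than only a pass or fail result, would help to locate the faulty bit if a transmission error occurred.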

5.4 CONCLUSION

In this work, 16 node NoCs are first modeled based on TDM and SDM techniques. Both the TDM based NoC and the SDM based NoC are modeled for five different topologies, namely, Mesh, Ring, FAT Tree, Butterfly, and Benes. The power and delay results obtained through simulations show that SDM NoCs are faster and consume less power than the corresponding TDM NoCs for all five topologies. So, SDM NoCs are chosen for further analysis.

Secondly, SDM NoCs are modeled for an increased number of nodes (32 nodes) and for the same five topologies. Next, to find the most suitable topology for an SDM NoC, an analysis is made to study the performance and characteristics of the five topologies. The parameters considered are delay, logic utilization, and I/O buffer utilization. The simulation results show that Benes topology has the minimum delay, logic utilization, and buffer utilization values when compared to the remaining four topologies. In addition, Benes topology seems to be more optimized than the remaining topologies when the size of the network is increased from 16 to 32 nodes. Therefore, Benes topology seems to be a better choice for larger NoCs. However, as mentioned earlier (Section 5.4.3.1), Mesh topology has a simple structure that eases physical design, and it is easily scalable. Its main disadvantage is that the area of the mesh increases considerably with the number of nodes in the network. Mesh topology seems to be the best topology when the NoC is very small, with only a few nodes.

Thirdly, a dynamically reconfigurable NoC is proposed. Since Mesh and Benes topologies each have their own advantages, the proposed NoC can dynamically configure itself as either Benes or Mesh topology. The dynamic reconfiguration depends on the number of source nodes currently making a request to the NoC. Different types of data, such as bit streams, audio files, and image files, are applied to the proposed reconfigurable NoC and its functionality is verified to be correct.
