A Reconfigurable Network-on-Chip Architecture for Optimal ...
CHAPTER 5
RECONFIGURABLE NETWORK ON CHIP ROUTER USING SPATIAL DIVISION
MULTIPLEXING TECHNIQUE
5.1 INTRODUCTION
Multi-Processor System-on-Chip (MPSoC) architectures have become a very
attractive solution for the new consumer multimedia embedded market (Wolf
2004). Although MPSoCs promise to significantly improve the processing
capabilities and versatility of embedded systems, one major problem in their
current and future design is the effectiveness of the interconnection
mechanisms between the internal components, as the number of components
grows with each new technological node. Bus-based designs are not able to
cope with the heterogeneous and demanding communication requirements of
MPSoCs (David Atienza 2008).
Networks-on-Chip (NoCs) have been suggested as a promising solution to the aforementioned scalability problem of forthcoming MPSoCs
(Dally and Towles 2001; Benini and Micheli 2002). Unlike traditional bus-based on-chip communication architectures, NoCs use packets to route data
from the source to the destination component, via a network fabric that consists of switches (routers) and interconnection links (group of wires). NoCs build on top of the latest evolutions of bus architectures in terms of advanced protocols and topology design, and, by bringing packet-based
communication paradigms to the on-chip domain, they address many of the upcoming issues of interconnect fabric design better than buses (Angiolini
2006).
A NoC consists of several Processing Elements (PEs) connected
together via routers and regular sized wires. A PE (also referred to as a node)
can be any component such as a microprocessor, application-specific
integrated circuit (ASIC) block or memory, or a combination of components
connected together. A Network Interface (NI) at the boundary of each PE is
used to packetize any data generated by the PE. This NI is connected to a
router, which has buffers at its input to accept data packets from a PE or from
other routers connected to it. A switch inside the router is used to route the
data packets from the input buffers to the appropriate output link, based on the
address in the packet header. Currently, NoCs exploit Time Division
Multiplexing (TDM) to share network resources among circuits (Anthony
Leroy 2008). A TDM router is basically composed of a switch, which
connects the router input ports to output ports, and an Output Reservation
Table (ORT).
The ORT contains the switch configuration for each time slot based
on the decisions performed by the routing algorithm (Anthony Leroy 2008). It
is implemented by an SRAM read at each time slot to set up the
corresponding switch configuration. But this typically results in high network
area and energy overhead with long circuit set-up time (Anthony Leroy 2008).
An alternative solution is the Spatial Division Multiplexing (SDM) technique. This
exploits the fact that on-chip network links are physically made up of a set of
wires (Anthony Leroy 2008). In SDM only a subset of the link wires is
allocated to a given virtual circuit. Messages are digit-serialized on a portion
of the link (i.e., serialized on a group of wires). The switch configuration is
set once and for all at the connection set up (Anthony Leroy 2008). No inside-
router configuration memory is therefore needed and the constraints on the
reservation of the circuits are relaxed (Anthony Leroy 2008).
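The difference between the two schemes can be illustrated with a minimal Python sketch. It is not taken from the cited work: the class names, the dictionary layout of a switch configuration, the port names, and the slot indexing by cycle count are assumptions made only for illustration.

```python
class TdmRouter:
    """Router whose switch is reconfigured from an ORT at every time slot."""

    def __init__(self, ort):
        # ort: one switch configuration per time slot, each mapping an
        # input port to an output port (assumed layout).
        self.ort = list(ort)

    def config_for(self, cycle):
        # The reservation table is read again at each slot.
        return self.ort[cycle % len(self.ort)]


class SdmRouter:
    """Router whose switch is set once when the connection is set up."""

    def __init__(self, group_map):
        # group_map: (input port, wire group) -> (output port, wire group).
        self.config = dict(group_map)

    def config_for(self, cycle):
        # No per-slot lookup: the same configuration holds for every cycle.
        return self.config


tdm = TdmRouter([{"WEST": "EAST"}, {"NORTH": "EAST"}])
sdm = SdmRouter({("WEST", "A"): ("EAST", "A"), ("NORTH", "A"): ("EAST", "B")})
print(tdm.config_for(0), tdm.config_for(1))
print(sdm.config_for(0) is sdm.config_for(1))
```

In the TDM case the two circuits share the EAST output in alternating slots, whereas in the SDM case they share it at the same time on different wire groups.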
In this chapter, a reconfigurable NoC router based on the SDM
technique is proposed. The proposed router is designed using a multiplexer
switch instead of the standard crossbar, since the complexity of a
crossbar increases as the number of inputs increases. A fixed network topology
can be a disadvantage in NoC platforms due to misalignment with application
requirements (Pau 2008). It is therefore desirable to incorporate a certain level
of configurability, so the proposed router switch is used to build a
reconfigurable SDM based router.
In Dally and Towles (2001), the concept of on-chip networks is
introduced. The paper sketches the design of a simple on-chip network, the
design choices made in that simple network, the advantages and
disadvantages of on-chip networks and the challenges and open research
issues in the design of such networks. Benini and Micheli (2002)
proposed viewing a SoC as a micronetwork of components and borrowing
models, techniques, and tools from the network design field for SoC design. The
network is the abstraction of the communication among components and must
satisfy quality-of-service requirements—such as reliability, performance, and
energy bounds—under the limitation of intrinsically unreliable signal
transmission and significant communication delays on wires (Luca Benini
2002). In David Atienza (2008), an overview of the benefits of
state-of-the-art NoCs using a complete NoC synthesis flow and a detailed scalability
analysis of different NoC implementations for the latest nanometer-scale
technology nodes is presented. In this paper, a thorough study of the current
state-of-the-art of NoC implementations using a design flow targeting the new
trends imposed by deep submicron manufacturing processes is performed.
Also, a comparative analysis of different NoC fabrics ranging from regular
topologies to highly tuned custom NoCs is presented.
In Anthony Leroy (2008), two approaches for Switched Virtual
Circuit (SVC) NoCs are compared. The two approaches used are TDM-based
SVC and SDM-based SVC. The paper shows that the virtual-circuit set-up
time, the area overhead, and the energy consumption for the SDM technique
is generally better than for the traditional TDM technique. The paper explains
that SDM technique certainly appears to be a very valuable alternative to
TDM technique. Pau (2008) proposed a configurable TDM router intended as
a dedicated embedded module for NoC support in an FPGA. The configurable
router is packet-switched and it provides five bidirectional ports. Switching is
performed by two 3x3 crossbars instead of a full 5x5 crossbar as a means of
reducing area and power consumption. The control logic in each crossbar
orchestrates all of the switching activities and channel arbitration according to
the routing algorithm of the selected network topology and the traffic
conditions during operation. Therefore, some of the advantages of Pau (2008)
and Anthony Leroy (2008) are combined and a reconfigurable SDM based
NoC router is proposed in this chapter.
The existing SDM router contains a switch and a switch control
unit. The switch is slightly larger than in TDM router as it must be able to
potentially interconnect any group of wires present at the router input port to
another group of wires of any output port. The TDM router offering m time
slots is based on a P x P n-bit wide crossbar. For SDM, an n-bit port is divided
into m individually switchable groups of n/m wires (Anthony Leroy 2008).
Therefore, for the same bandwidth and number of segments, at the same clock
frequency, the number of input and output ports of the switch is increased by
a factor m for SDM, while the bit-width of each port is divided by a factor m.
The SDM router would thus require a (P x m) x (P x m) n/m-bit wide crossbar.
The SDM technique combines the simplicity of nonvirtual circuit-switch
implementation (as the switch configuration is fixed for the circuit lifetime)
and the flexibility of the bandwidth allocation proper to the Switched Virtual
Circuit technique (Anthony Leroy 2008).
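The crossbar dimensions quoted above can be checked with a short Python sketch. The values P = 4, n = 16 and m = 4 are illustrative, and the cost model (one cross-point per input/output pair per bit) is an assumption introduced here, not a figure from the cited work.

```python
def crossbar_shape(P, n, m, scheme):
    """Return (inputs, outputs, bit_width) of the router switch."""
    if n % m != 0:
        raise ValueError("n must be divisible by m")
    if scheme == "TDM":
        return (P, P, n)
    if scheme == "SDM":
        # m times as many ports, each m times narrower.
        return (P * m, P * m, n // m)
    raise ValueError("scheme must be 'TDM' or 'SDM'")


def crosspoint_bits(shape):
    # Assumed cost model: one cross-point per input/output pair per bit.
    inputs, outputs, width = shape
    return inputs * outputs * width


tdm = crossbar_shape(P=4, n=16, m=4, scheme="TDM")
sdm = crossbar_shape(P=4, n=16, m=4, scheme="SDM")
print("TDM", tdm, crosspoint_bits(tdm))
print("SDM", sdm, crosspoint_bits(sdm))
```

Under this cost model the SDM crossbar needs m times as many cross-points as the TDM crossbar, which is in line with the statement that a full crossbar is too costly for an SDM router.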
Full crossbars are too complex to be used as an SDM
router's switch (Anthony Leroy 2008). An interesting alternative to crossbars
is to use Multistage Interconnection Network (MIN) switches.
MIN switches are available in a wide variety and can reduce the
cross-point cost. The cost of using such a switch is paid either in bandwidth
(longer clock cycles) or in delay (pipelined stages and multiple cycles to go
through). As the number of cross points in MIN switches is reduced, some
input-output connections can no longer be realized as one cross-point can be
simultaneously required by two connections, resulting in a blocking state
(Anthony Leroy 2008). A classification of MIN switches can be realized
based on how easily those blocking states can be avoided (Anthony Leroy
2008). In Strictly Non Blocking (SNB) switches, any new connection from a
free input to a free output can always be realized (Anthony Leroy 2008). The
same condition applies to Non Blocking (NB) switches but with the
restriction of carefully choosing the path taken in the switch. In Rearrangeable
Non Blocking (RNB) switches, in certain situations an internal switch
rerouting might be necessary to find a non blocking solution but a solution
always exists (Anthony Leroy 2008). Finally, for blocking switches, some
connections can be blocked by others without any alternative solution.
Among the different implementation possibilities, SNB switches
are attractive, but their minimum cross-point cost is still high, which would
lead to an area overhead comparable to the crossbar's. To reduce the switch
overhead to a minimum, an RNB Benes switch is proposed in Anthony Leroy
(2008). But the delay is systematically larger for the Benes switch than for the
crossbar. The reason is that the Benes switch is composed of several switch
stages interconnected by complex interconnect patterns that are relatively
difficult to route on a 2D plane. Another disadvantage of the Benes switch is
that its control is more complex than the control of a crossbar.
The Benes switch needs a dedicated switch control unit that allows solving
any potential contention within the router, reserving the route within the
switch, and controlling the corresponding atomic switches (Anthony Leroy
2008). Hence, in order to overcome the disadvantages of the Benes switch, a
simple switch with less area, power, and delay overhead is proposed in this
chapter. The proposed switch does not have complex control logic and so the
configuration set-up of the NoC is easy to achieve.
5.2 PROPOSED WORK
In this section, two designs are proposed, namely, the design of a
router switch and the design of a reconfigurable SDM based router. Firstly,
the design aspects of the proposed router switch are discussed. Secondly, the
five topologies considered for analysis of the NoC are discussed. Thirdly, the
design of the proposed reconfigurable SDM router (incorporating the
proposed router switch) is discussed.
5.2.1 Proposed Router Switch Design
Crossbars are commonly used for routing in SDM switches, but
they incur high energy and area overhead, and full crossbars are too
complex to be used as an SDM router's switch. Hence, an alternative switch
for the router using a single multiplexer is proposed in this section. An SDM
router consists of a switch and a switch control unit. Figure 5.1 shows the
structure of the proposed NoC router switch. The proposed router switch has
four I/O ports, namely, EAST, WEST, NORTH, and SOUTH. Each port is 16
bits wide. The 16 lines (wires) of a port (link) are grouped into 4 groups
consisting of 4 wires each. The 4 groups are named as A, B, C and D
respectively, as shown in Figure 5.1.
Figure 5.1 Proposed NoC router switch
The switch shown in Figure 5.1 consists of a single multiplexer.
This switch is present inside the router of every node. For each clock cycle,
the control word (i.e. configuration) of the switch is updated. Depending upon
the control word, the multiplexer can switch to any group of wires of any port.
In Figure 5.1, inputs from four sources A, B, C and D are assumed to arrive at
port1 of the router. The bandwidth is 16 since the port size is 16 bits. Since
there are four sources, the bandwidth is divided by four such that each source
gets an equal share of the bandwidth. So, the sources A, B, C, and D each get a
bandwidth of 4 bits (i.e. 25% of the total bandwidth). The bandwidth
allocation between the sources can be done in any way depending upon the
specifications. For example, source A can be allocated a bandwidth of
8, source B a bandwidth of 4, and sources C and D a bandwidth of 2 each.
The group of wires allocated for each source during the
initialization of the NoC is used only by the respective source throughout the
lifetime of the NoC. In this section, the bandwidth is assumed to be 4 for all
sources. This switch can also be extended for more than four sources.
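The fixed allocation of wires to sources can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and assigning contiguous wire indices in the order the sources are listed is an assumption, since the text does not state which physical wires each source receives.

```python
PORT_WIDTH = 16  # wires per port, as stated in the text


def allocate_wires(shares, port_width=PORT_WIDTH):
    """Assign wire indices of one port to each source at initialization.

    shares maps a source name to the number of wires it is given.
    Contiguous assignment in listing order is an assumption of this sketch.
    """
    if sum(shares.values()) != port_width:
        raise ValueError("shares must add up to the port width")
    allocation, next_wire = {}, 0
    for source, width in shares.items():
        allocation[source] = list(range(next_wire, next_wire + width))
        next_wire += width
    return allocation


equal = allocate_wires({"A": 4, "B": 4, "C": 4, "D": 4})
uneven = allocate_wires({"A": 8, "B": 4, "C": 2, "D": 2})
print(equal)
print(uneven)
```

The first call reproduces the equal split assumed in this section (four wires per source); the second reproduces the 8/4/2/2 example.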
5.2.2 Design of NoC for five different topologies
Let the sources be named as S1, S2, S3…Sn and the destinations be
named as D1, D2, D3…Dn. In SDM, any two nodes are permanently connected
by a link containing a fixed number of wires based on the bandwidth
(Anthony Leroy 2008). If the bandwidth of the NoC is assumed as 16, then
the links between any two nodes consist of 16 wires. In SDM, each link is
divided into subsets of wires, whereas in TDM the links between the nodes are
not divided into groups or subsets of wires. In an SDM based NoC, inside each
node there is a router switch which uses a control word to select an output
port during data transfer in the NoC. The routing path is consulted once during
initialization and the wires are fixed according to it. Hence, there is no need
for storing the routing table in a separate memory block for SDM.
In TDM, there are no separate wires allocated for each source. A
single link (containing 16 wires) is present between any two nodes and each
source is given only a distinct time slot, during which it can transfer
messages. In TDM, data from two or more sources may arrive at the
router at the same instant, but only one of them can be forwarded
through a node at a time. Hence, arbitration is involved in each and every node of
TDM topologies. In this work, arbitration at every node is based on
priority: the data from source S1 is given the first priority, followed by
S2, S3 … Sn.
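The fixed-priority rule described above can be sketched in Python. The function names are hypothetical, and representing each source by its index (1 for S1, 2 for S2, and so on) is an assumption of this sketch.

```python
def arbitrate(requests):
    """Return the index of the source that wins the node, or None.

    requests: indices of the sources whose data arrived at the same
    instant (1 for S1, 2 for S2, ...). A lower index has higher priority.
    """
    requests = list(requests)
    return min(requests) if requests else None


def service_order(requests):
    """Order in which simultaneous requests are forwarded, one at a time."""
    pending = set(requests)
    order = []
    while pending:
        winner = arbitrate(pending)
        order.append(winner)
        pending.remove(winner)
    return order


print(arbitrate([3, 1, 2]))
print(service_order([4, 2, 7]))
```

With this rule a low-priority source waits until every higher-priority request at the same node has been forwarded.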
The routing algorithm is based on the shortest distance path
between two nodes. The routing algorithm used here is static, that is, the
routing path between a source-destination pair is fixed initially and does
not change afterwards. In this proposed
work, the routing path between any source-destination pair is assumed to be
the same for both TDM and SDM based NoCs that are modeled.
5.2.3 NoC with 16 nodes
The structure and routing table of 16 node NoCs modeled using
various topologies are discussed in this section. The same structure and
routing table are assumed for both TDM and SDM NoCs which are
considered for analysis in the proposed work. The topologies considered
are Mesh, Butterfly, Fat Tree, Ring,
and Benes.
5.2.3.1 Mesh Topology
Figure 5.2 shows a 4x4 mesh topology. It is a direct blocking type
topology network (Sudeep Pasricha 2008).
(4x4 grid of nodes, row by row: N1 N2 N3 N4; N5 N6 N7 N8; N9 N10 N11 N12;
N13 N14 N15 N16)
S1, S2 … S8: Sources D1, D2 … D4: Destinations
X: Undefined source N1, N2 … N16: Nodes
Figure 5.2 Mesh Topology
There are 16 nodes in the above figure. The sources are denoted by
S1, S2 … S8. The undefined sources indicated by X denote that sources can be
added to those nodes in future if required. The 2-D mesh is one of the most
popular NoC topologies because all links have the same length, which eases
physical design (Sudeep Pasricha 2008). Every node in a 2-D mesh is
connected to four neighboring nodes, except for the nodes at the edges. The
area of a mesh grows linearly with the number of nodes. Meshes must also be
designed in such a way so as to avoid traffic accumulating in the center of the
mesh, which reduces performance (Sudeep Pasricha 2008).
The common routing path followed by TDM and SDM is shown in
Table 5.1. It shows the path taken by the data in traveling from a particular
source to a particular destination. For example, if source S1 needs to transfer
data to destination D1 it follows the path consisting of nodes numbered
{1, 2, 3, 4} as shown in Table 5.1.
Table 5.1 Routing table used in Mesh topology
Source   D1          D2         D3           D4
S1       1,2,3,4     1,6,7,8    1,10,11,12   1,14,15,16
S2       5,2,3,4     5,6,7,8    5,10,11,12   5,14,15,16
S3       9,2,3,4     9,6,7,8    9,10,11,12   9,14,15,16
S4       13,2,3,4    13,6,7,8   13,10,11,12  13,14,15,16
S5       4           4,8        4,8,12       4,8,12,16
S6       8,4         8          8,12         8,12,16
S7       12,8,4      12,8       12           12,16
S8       16,12,8,4   16,12,8    16,12        16
S1, S2 … S8: Sources D1, D2 … D4: Destinations
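Because the routing is static, a route lookup reduces to reading a fixed table. The Python sketch below encodes the entries of Table 5.1 as a nested dictionary; the data layout and the function name `route` are assumptions made for illustration, not part of the modeled design.

```python
# Node sequences copied from Table 5.1 (source -> destination -> path).
MESH_ROUTES = {
    "S1": {"D1": [1, 2, 3, 4], "D2": [1, 6, 7, 8], "D3": [1, 10, 11, 12], "D4": [1, 14, 15, 16]},
    "S2": {"D1": [5, 2, 3, 4], "D2": [5, 6, 7, 8], "D3": [5, 10, 11, 12], "D4": [5, 14, 15, 16]},
    "S3": {"D1": [9, 2, 3, 4], "D2": [9, 6, 7, 8], "D3": [9, 10, 11, 12], "D4": [9, 14, 15, 16]},
    "S4": {"D1": [13, 2, 3, 4], "D2": [13, 6, 7, 8], "D3": [13, 10, 11, 12], "D4": [13, 14, 15, 16]},
    "S5": {"D1": [4], "D2": [4, 8], "D3": [4, 8, 12], "D4": [4, 8, 12, 16]},
    "S6": {"D1": [8, 4], "D2": [8], "D3": [8, 12], "D4": [8, 12, 16]},
    "S7": {"D1": [12, 8, 4], "D2": [12, 8], "D3": [12], "D4": [12, 16]},
    "S8": {"D1": [16, 12, 8, 4], "D2": [16, 12, 8], "D3": [16, 12], "D4": [16]},
}


def route(source, destination):
    """Return the fixed node sequence for a source-destination pair."""
    return MESH_ROUTES[source][destination]


print(route("S1", "D1"))
print(route("S5", "D3"))
```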
5.2.3.2 FAT Tree Topology
Figure 5.3 shows fat tree topology. It is an indirect multistage
topology network (Sudeep Pasricha 2008). It has a tree structure with parent
and child nodes.
(Tree of 16 nodes: N1 at the root; N2 and N3 below it; then N4 to N7; then
N8 to N15; N16 at the lowest level)
N1, N2 … N16: Nodes
Figure 5.3 FAT Tree Topology
There are 16 sources/destinations in total (S1/D1, S2/D2 … S16/D16),
one at each node. The number of links between adjacent switches increases as
they get closer to the root of the tree (Sudeep Pasricha 2008). Increasing the number of
links near the root of the tree essentially allocates more bandwidth on the
channels that have higher traffic (Sudeep Pasricha 2008).
The common routing path followed by TDM and SDM is shown in
Table 5.2. It gives the routing path for transferring data from source to
destination. For example, if source S1 needs to transfer data to destination D13
it follows the path consisting of nodes numbered {1, 3, 6, 13} as shown in
Table 5.2.
Table 5.2 Routing table used in FAT Tree topology
Dest  S1          S2        S3      S4      S5    S6    S7    S8    S9   S10  S11  S12  S13  S14  S15  S16
D16   1,2,4,8,16  2,4,8,16  Nil     4,8,16  Nil   Nil   Nil   8,16  Nil  Nil  Nil  Nil  Nil  Nil  Nil  16
D15   1,3,7,15    Nil       3,7,15  Nil     Nil   Nil   7,15  Nil   Nil  Nil  Nil  Nil  Nil  Nil  15   M/C
D14   1,3,7,14    Nil       3,7,14  Nil     Nil   Nil   7,14  Nil   Nil  Nil  Nil  Nil  Nil  14   M/C  M/C
D13   1,3,6,13    Nil       3,6,13  Nil     Nil   6,13  Nil   Nil   Nil  Nil  Nil  Nil  13   M/C  M/C  M/C
D12   1,3,6,12    Nil       3,6,12  Nil     Nil   6,12  Nil   Nil   Nil  Nil  Nil  12   M/C  M/C  M/C  M/C
D11   1,2,5,11    2,5,11    Nil     Nil     5,11  Nil   Nil   Nil   Nil  Nil  11   M/C  M/C  M/C  M/C  M/C
D10   1,2,5,10    2,5,10    Nil     Nil     5,10  Nil   Nil   Nil   Nil  10   M/C  M/C  M/C  M/C  M/C  M/C
D9    1,2,4,9     2,4,9     Nil     4,9     Nil   Nil   Nil   Nil   9    M/C  M/C  M/C  M/C  M/C  M/C  M/C
D8    1,2,4,8     2,4,8     Nil     4,8     Nil   Nil   Nil   8     M/C  M/C  M/C  M/C  M/C  M/C  M/C  M/C
D7    1,3,7       Nil       3,7     Nil     Nil   Nil   7     M/C   M/C  M/C  M/C  M/C  M/C  M/C  M/C  M/C
D6    1,3,6       Nil       3,6     Nil     Nil   6     M/C   M/C   M/C  M/C  M/C  M/C  M/C  M/C  M/C  M/C
D5    1,2,5       2,5       Nil     Nil     5     M/C   M/C   M/C   M/C  M/C  M/C  M/C  M/C  M/C  M/C  M/C
D4    1,2,4       2,4       Nil     4       M/C   M/C   M/C   M/C   M/C  M/C  M/C  M/C  M/C  M/C  M/C  M/C
D3    1,3         Nil       3       M/C     M/C   M/C   M/C   M/C   M/C  M/C  M/C  M/C  M/C  M/C  M/C  M/C
D2    1,2         2         M/C     M/C     M/C   M/C   M/C   M/C   M/C  M/C  M/C  M/C  M/C  M/C  M/C  M/C
D1    1           M/C       M/C     M/C     M/C   M/C   M/C   M/C   M/C  M/C  M/C  M/C  M/C  M/C  M/C  M/C
S1, S2 … S16: Sources D1, D2 … D16: Destinations
M/C: Mother/Child node Nil: No Connection
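The path entries of Table 5.2 are consistent with a heap-style numbering in which the parent of node k is node k // 2. Under that assumption, which is inferred from the table and not stated in the text, the listed paths can be regenerated with the following Python sketch (the function name is hypothetical).

```python
def fat_tree_path(src, dst):
    """Nodes visited from source node src down to destination node dst.

    Assumes the parent of node k is k // 2. Returns None when src is
    neither dst itself nor one of its ancestors, which corresponds to
    the 'Nil' and 'M/C' cells of Table 5.2.
    """
    if src < 1 or dst < 1:
        raise ValueError("node numbers start at 1")
    path = [dst]
    node = dst
    while node > src:
        node //= 2          # move one level up towards the root
        path.append(node)
    if node != src:
        return None
    return path[::-1]       # report the path from source to destination


print(fat_tree_path(1, 13))
print(fat_tree_path(2, 16))
print(fat_tree_path(3, 16))
```

The first call reproduces the example in the text ({1, 3, 6, 13}); the last call returns None because node 3 is not on the branch that leads to node 16.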
5.2.3.3 Ring Topology
Figure 5.4 shows uni-directional ring topology network. It is a
direct blocking type of topology.
(16 nodes, N1 to N16, connected in a closed ring)
N1, N2 … N16: Nodes
Figure 5.4 Ring Topology
There are 16 sources/destinations in total (S1/D1, S2/D2 … S16/D16),
one at each node. Each node can act either as source or as destination (Sudeep
Pasricha 2008). The common routing path followed by TDM and SDM is
shown in Appendix 1. It gives the routing path for transferring data from
source to destination. For example, if source S1 needs to transfer data to
destination D6 it follows the path of nodes numbered {1, 2, 3, 4, 5, 6}
(Appendix 1).
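For a unidirectional ring, the path is obtained by stepping from node to node in the single allowed direction. The Python sketch below assumes that the direction is that of increasing node number and that N16 connects back to N1; both are assumptions, since Appendix 1 is not reproduced here, and the function name is hypothetical.

```python
def ring_path(src, dst, n=16):
    """Nodes visited from src to dst on a unidirectional ring of n nodes."""
    if not (1 <= src <= n and 1 <= dst <= n):
        raise ValueError("node numbers must lie between 1 and n")
    path = [src]
    node = src
    while node != dst:
        node = node % n + 1   # next node; node n wraps around to node 1
        path.append(node)
    return path


print(ring_path(1, 6))
print(ring_path(15, 2))
```

The first call reproduces the example in the text; the second shows the wrap-around case under the stated assumption.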
5.2.3.4 Butterfly Topology
Figure 5.5 shows Butterfly topology network with 8 sources (S1,
S2…..S8), 8 destinations (D1, D2…D8) and 16 nodes (N1, N2…N16). It is an
indirect type topology network (Sudeep Pasricha 2008).
(Sources S1 to S8 on the input side; nodes drawn in four rows: N1 to N4,
N5 to N8, N9 to N12, N13 to N16; destinations D1 to D8 on the output side)
S1, S2 … S8: Sources
D1, D2 … D8: Destinations
N1, N2 … N16: Nodes
Figure 5.5 Butterfly Topology
The butterfly network is a blocking multi-stage network, which
implies that information may be temporarily blocked or dropped in the
network if contention occurs (Sudeep Pasricha 2008). The routing table used
in Butterfly topology network is shown in Appendix 1. It gives the routing
path for transferring data from source to destination. For example, if source S1
needs to transfer data to destination D1 it follows the path {1, 2, 3, 4}
(Appendix 1).
5.2.3.5 Benes Topology
Figure 5.6 shows a 16 node Benes topology. The Benes network is
an example of a rearrangeable network in which paths may have to be
rearranged to provide a connection, requiring an appropriate controller. It is a
non-blocking network (Sudeep Pasricha 2008). Hence, an alternative routing
path also exists between any source-destination pair.
(Eight source inputs on the input side; nodes drawn in four rows: N1 to N4,
N5 to N8, N9 to N12, N13 to N16; destinations D1 to D8 on the output side)
S1, S2 … S8: Sources D1, D2 … D8: Destinations
N1, N2 … N16: Nodes
Figure 5.6 Benes Topology
There are 8 sources (S1, S2 … S8) and 16 nodes
(N1, N2 … N16) in the network, and D1/D2, D3/D4, D5/D6, D7/D8 are destination
pairs with the same routing path. So, the two destinations of a pair are
distinguished only by
looking at the header information stored along with the data. The routing path
followed in Benes network is shown in Appendix 3. It gives the routing path
for transferring data from source to destination. For example, if source S1
needs to transfer data to destination D1 it follows the path {1, 2, 3, 4}
(Appendix 3).
5.2.4 Proposed reconfigurable SDM based NoC router
The simulation results given in section 5.3.1 show that SDM based
NoC is better than TDM based NoC in terms of power and execution time
(delay). SDM based NoC is modeled for five different topologies (Mesh,
Ring, Fat tree, Butterfly, and Benes topologies). The simulation results show
that each topology has its own advantages and disadvantages. Hence, a
reconfigurable SDM NoC which can adapt its topology depending on the
application requirements is proposed in this section.
In the proposed router, the parameter used for reconfigurability is
topology. The topology is runtime reconfigurable. The two topologies
selected for reconfigurability are Mesh and Benes topologies. During
operation, the proposed router selects any one of the topologies for
transferring data from the source nodes to their corresponding destination
nodes concurrently. The criterion used for reconfiguration (i.e. topology
selection) is the number of source nodes making a request at a time.
Among the five topologies modeled, the simulation results show
that Benes has the minimum Delay, Logic Utilization and I/O Buffer
Utilization factors (refer Table 5.5, Table 5.6, Table 5.7). Therefore, Benes
is the best topology in terms of speed/delay and area. In Benes topology, the
minimum size of the NoC that could be constructed is 16. The disadvantage
of Benes topology is that insertion and deletion of source nodes and
destination nodes are difficult: a single node cannot be inserted in the
structure, and the entire Benes structure has to be replicated (refer Figure
5.6). So the size of the NoC increases, and the area, power, and delay also
increase. The size of the Benes network can be expanded only in steps such as
16, 20, 24, 28, 32 nodes.
Therefore, Benes topology network is suitable only for NoCs with
a large number of nodes. The simulation results also show that among the five
topologies modeled, Mesh topology is best in terms of scalability and
simplicity. In this topology, each router node is connected only to its
neighboring router nodes. The interconnections between the nodes are
easy to understand and routing is simple. It is scalable because, to increase
the size of the NoC, new nodes can be added in future
without affecting the existing structure or routing table; that is, deletion and
insertion of one or more nodes is easy in the case of a Mesh
network. The size of a Mesh network is not fixed like that of a Benes network,
so it is easily scalable to any size. But the disadvantage of a Mesh network is
the linear increase in delay with respect to the number of nodes, so
performance degrades when the size of the network is increased.
Therefore, Mesh topology is suitable only for NoCs with a small number of
nodes. Thus the two topologies have both advantages and disadvantages.
It is therefore better if a NoC is designed with both topologies and is
capable of choosing either one during its operation, depending
upon the application requirements. Hence, a reconfigurable NoC is proposed
in this section.
The size of the proposed NoC router is 16. If fewer than 4 source
nodes are requesting the NoC for a data transfer at a time, then the proposed
router works with Mesh topology. If the number of source nodes that are requesting
simultaneously is greater than or equal to 4, then the proposed router works
with Benes topology. Figure 5.7 shows how reconfiguration takes place in the
proposed router. The reason why the value 4 is taken as the limit is discussed in
section 5.3.3.
(Decision: is the number of masters that sent a request greater than or equal
to 4? No: Mesh topology is chosen. Yes: Benes topology is chosen.)
Figure 5.7 Flow Chart showing reconfigurability
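The decision of Figure 5.7 amounts to a single threshold comparison. A minimal Python sketch is given below; the function and constant names are hypothetical, and the actual router is modeled in VHDL, so this is only an illustration of the rule.

```python
REQUEST_THRESHOLD = 4  # limit stated in the text


def select_topology(num_requesting_sources, threshold=REQUEST_THRESHOLD):
    """Choose the topology from the number of simultaneous requests."""
    if num_requesting_sources < 0:
        raise ValueError("the number of requests cannot be negative")
    return "Benes" if num_requesting_sources >= threshold else "Mesh"


for requests in (1, 3, 4, 8):
    print(requests, select_topology(requests))
```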
5.3 SIMULATION RESULTS
The NoCs are modeled using VHDL and all the simulation results
are obtained using Xilinx ISE 9.2i. The topologies used for analysis are Mesh,
Ring, Fat Tree, Butterfly, and Benes. Two types of NoCs, TDM based and
SDM based, are modeled for all five topologies with a size of 16 nodes.
Firstly, a comparison is made between TDM and SDM based NoCs. The parameters
used for comparison are Delay and Power, both obtained
from the simulation results. The results show that the SDM based NoC is better
than the TDM based NoC. Therefore, the SDM based NoC
router is analyzed further by considering two sizes of NoC, 16
nodes and 32 nodes. The analysis is carried out for all five topologies to
find which topology gives better performance in the proposed SDM
based NoC router.
The parameters considered for analysis are Delay, Buffer
Utilization, and Logic Utilization. It is found from the simulation results that
Benes topology is the best in terms of delay/speed parameter, whereas, Mesh
topology is best in terms of scalability and simplicity factors. Therefore, to
bring about the advantages of both Benes and Mesh topologies into the
proposed NoC, reconfigurability is introduced in the proposed SDM based
NoC. The reconfigurability is such that the proposed NoC can dynamically
reconfigure its topology depending upon the application's requirements (as
mentioned in section 5.2.4). The various analyses performed are discussed
below.
5.3.1 Comparison between TDM and SDM based NoC routers
The NoC size selected for this analysis is 16 nodes. The parameters
considered for this analysis are Delay and Power. The delay and power results
are discussed below in this section.
5.3.1.1 Power results
Table 5.3 shows power results for TDM and SDM based NoCs. The
table shows that SDM based NoCs consume less power than TDM based
NoCs for all the five topologies modeled. A TDM router consists of a switch,
switch control logic, and the memory which stores the routing table. The
power consumption is due to two components, which are
present at every node in the NoC: the Logic component (due to the switch
at each node, switch control logic, etc.) and the Memory component (where the
routing table is stored). Whenever the data from a source reaches a router
node, the router of that node refers to the routing table in the memory to forward
the data to the next node in the path towards its destination. Therefore, the
memory is accessed at every node on the communication path. Likewise, at
every node the memory has to be refreshed during every time slot. This
requires a lot of switching activity in the NoC. Therefore, power
consumption is very high in a TDM based NoC. When the size of the NoC is
increased, the size of the routing table and the size of the memory also
increase. This increases the power consumption greatly.
In the SDM router, the NoC is initially configured according to the
contents of the routing table. Once the configuration is done, it is not
altered throughout the lifetime of the NoC. So, the SDM router does not
require a memory to store the routing table. Unlike the TDM router, there is no
requirement to refresh the memory during every time slot. Hence, the power
consumption in the SDM router is less than in the TDM router, as shown
in Table 5.3.
Table 5.3 Power results for TDM and SDM routers
            TDM (mW)                                   SDM (mW)
Topology    Logic component  Memory component  Total   Total (logic component only)
Mesh        19               34                53      25
FAT Tree    10               35                45      23
Ring        12               44                56      36
Butterfly   7                67                74      42
Benes       13               23                36      16
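The totals in Table 5.3 and the relative saving of SDM over TDM can be recomputed from the component values. The Python sketch below uses only the numbers of Table 5.3; the percentage saving is a derived quantity introduced here for illustration and is not reported in the original results.

```python
# (TDM logic, TDM memory, SDM total) in mW, copied from Table 5.3.
POWER_MW = {
    "Mesh": (19, 34, 25),
    "FAT Tree": (10, 35, 23),
    "Ring": (12, 44, 36),
    "Butterfly": (7, 67, 42),
    "Benes": (13, 23, 16),
}


def tdm_total(topology):
    """TDM total power: logic component plus memory component."""
    logic, memory, _ = POWER_MW[topology]
    return logic + memory


def sdm_saving_percent(topology):
    """Power saved by SDM, as a percentage of the TDM total."""
    total = tdm_total(topology)
    return 100.0 * (total - POWER_MW[topology][2]) / total


for name in POWER_MW:
    print(f"{name}: TDM {tdm_total(name)} mW, "
          f"SDM saving {sdm_saving_percent(name):.1f}%")
```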
5.3.1.2 Delay results
The parameter delay refers to the time taken for the data to transfer
from source node to destination node. The results are shown in Table 5.4. For
the TDM router the delay is high because at each node the router has to refer to
the routing table in the memory, so a delay overhead is introduced at every
node. Also, in TDM, the data of a particular source can be transmitted only
during its respective time slot. Therefore, the wait time is longer in the TDM
router. But in SDM, the data from each source passes through independent
and preallocated links, so the data from all the sources can be transferred
concurrently to their respective destinations. The delay is much lower than
in TDM. Therefore, SDM based NoCs are faster than TDM based
NoCs. Since the data from different sources are transferred
concurrently, the throughput of the NoC is increased. Hence,
the performance is high in SDM routers.
Table 5.4 Delay results for TDM and SDM routers
Topology    TDM (ns)  SDM (ns)
Mesh        22.438    5.746
FAT Tree    18.814    5.506
Ring        32.507    10.165
Butterfly   19.139    5.513
Benes       22.895    5.408
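The delay advantage of SDM can be expressed as the ratio of the TDM delay to the SDM delay for each topology. The Python sketch below computes this ratio from the values of Table 5.4; the ratio itself is a derived figure added for illustration and does not appear in the original results.

```python
# (TDM delay, SDM delay) in ns, copied from Table 5.4.
DELAY_NS = {
    "Mesh": (22.438, 5.746),
    "FAT Tree": (18.814, 5.506),
    "Ring": (32.507, 10.165),
    "Butterfly": (19.139, 5.513),
    "Benes": (22.895, 5.408),
}


def speedup(topology):
    """How many times shorter the SDM delay is than the TDM delay."""
    tdm, sdm = DELAY_NS[topology]
    return tdm / sdm


for name in DELAY_NS:
    print(f"{name}: SDM is {speedup(name):.2f}x faster")
```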
By comparison of power and delay results of TDM and SDM
routers, it is found that SDM routers are more efficient than TDM routers in
terms of power consumption and speed/delay. Inside the TDM router, a
memory is needed to store the routing table. But for the SDM router, the memory
is not needed. So, the size of the router and hence the area overhead is less
for the SDM router. Therefore, SDM based NoCs are considered to be more
efficient than the TDM based NoCs.
5.3.2 Comparison among the five topologies used in SDM routers
From the previous section, it is considered that SDM based NoCs
are more efficient than TDM based NoCs. In order to identify the best topology, an
analysis is made using the following parameters: delay, logic utilization, and
buffer utilization. The SDM router is modeled for the five topologies
mentioned and for two different sizes of the NoC, namely, 16 node NoC and
32 node NoC. Table 5.5 shows delay results for various topologies. The
results show that when the number of nodes in the NoC increases, the delay also
increases for all the topologies. Out of the five topologies shown in Table 5.5,
Benes topology has the minimum delay at both sizes, while Ring and Mesh
topologies have the largest delay values.
Table 5.5 Delay for various topologies used in SDM router
Topology    16 Nodes (ns)  32 Nodes (ns)
Mesh        5.746          13.784
FAT Tree    5.506          10.393
Ring        10.165         17.108
Butterfly   5.513          11.078
Benes       5.408          10.139
Logic Utilization refers to the total number of logic gates required
to model each topology. These values are shown in Table 5.6. From Table
5.6, it is clear that Benes has the minimum logic utilization. Benes makes use
of fewer logic gates since less switching of data is required.
Table 5.6 Logic Utilization for various topologies in SDM
Topology    16 Nodes   32 Nodes
Mesh        52         82
FAT Tree    64         108
Ring        58         92
Butterfly   54         86
Benes       48         74
I/O Buffer Utilization refers to the number of buffers utilized by the
topologies. Table 5.7 shows the buffer utilization values.
Table 5.7 I/O Buffer Utilization for Various Topologies in SDM
Topology    16 Nodes   32 Nodes
Mesh        72         139
FAT Tree    144        195
Ring        138        172
Butterfly   103        156
Benes       64         97
From Table 5.7, it is clear that Benes has the minimum I/O Buffer
utilization. Since the other topologies have to transfer data through more
nodes than Benes (Appendix 3), they require more buffers to store the data
in each node.
From the values in Tables 5.5, 5.6, and 5.7, it is found that Benes
topology has the minimum delay, logic utilization, and I/O buffer
utilization. Therefore, Benes topology is considered to be the best in terms
of performance (i.e. speed) and area optimization, which in turn leads to
power optimization.
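The conclusion drawn from Tables 5.5 to 5.7 can be cross-checked by picking the topology with the smallest value in each column. The sketch below is a minimal Python illustration using only the published table values. The names `METRICS` and `best_topology` are hypothetical.

```python
# Values copied from Tables 5.5, 5.6 and 5.7 as (16 nodes, 32 nodes).
# The structure and names are illustrative, not taken from the original work.
METRICS = {
    "delay_ns": {
        "Mesh": (5.746, 13.784), "FAT Tree": (5.506, 10.393),
        "Ring": (10.165, 17.108), "Butterfly": (5.513, 11.078),
        "Benes": (5.408, 10.139),
    },
    "logic_utilization": {
        "Mesh": (52, 82), "FAT Tree": (64, 108), "Ring": (58, 92),
        "Butterfly": (54, 86), "Benes": (48, 74),
    },
    "io_buffer_utilization": {
        "Mesh": (72, 139), "FAT Tree": (144, 195), "Ring": (138, 172),
        "Butterfly": (103, 156), "Benes": (64, 97),
    },
}


def best_topology(table: dict, column: int) -> str:
    """Return the topology with the smallest value in the given column."""
    return min(table, key=lambda name: table[name][column])


# One winner per (metric, network size) pair.
winners = {
    (metric, size): best_topology(table, column)
    for metric, table in METRICS.items()
    for column, size in enumerate((16, 32))
}

for (metric, size), name in winners.items():
    print(f"{metric:<22} {size} nodes: {name}")
```

For the values listed, Benes is the minimum in all six columns.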
5.3.3 Reconfigurability of the proposed router
From the discussions in section 5.3.1 and section 5.3.2, Benes and
Mesh topologies seem to be better than the remaining topologies. Hence, a
reconfigurable NoC is designed based on these two topologies, with the
“number of masters that make simultaneous requests” as the criterion
deciding the mode of configuration. Table 5.8 shows the delay
characteristics of the two topologies for a varying number of simultaneous
requests made by the masters. If the number of masters that make
simultaneous requests is less than 4, then the proposed router chooses Mesh
topology for its operation. If the number of masters that make simultaneous
requests is greater than or equal to 4, then the proposed router chooses
Benes topology for its operation. The limit of 4 is derived from Table 5.8.
Table 5.8 Number of masters that make simultaneous requests vs. delay
Number of masters making   Delay in Mesh    Delay in Benes
simultaneous requests      topology (ns)    topology (ns)
1 Master                   3.758            4.490
2 Masters                  4.321            4.531
3 Masters                  4.499            4.612
4 Masters                  4.780            4.658
5 Masters                  4.800            4.726
6 Masters                  5.213            4.986
7 Masters                  5.572            5.314
8 Masters                  5.746            5.408
Table 5.8 shows that when the number of masters (sources) making
simultaneous requests is less than four, the delay of the Mesh network is
lower than the delay of the Benes network. When the number of masters making
simultaneous requests is greater than or equal to four, the delay of
Mesh exceeds the delay of the Benes network (for example, 4.780 ns against
4.658 ns at four masters). Hence, the value four is taken as the criterion
for dynamic topology reconfiguration. When the number of masters is small,
the complexity of Mesh topology is low since the path between a source and
a destination contains fewer intermediate nodes, whereas in Benes topology
the data has to travel through comparatively more intermediate nodes to
reach its destination. When the number of masters is large, the data in
Mesh topology has to travel through more intermediate nodes than in
Benes topology. Therefore, Mesh topology is suitable when few masters
make requests and Benes topology is suitable when many masters make
requests.
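The selection rule described above can be sketched in a few lines of Python. This is a software illustration only, not the hardware implementation of the proposed router. The names `select_topology` and `THRESHOLD` are assumptions, and the delay values are copied from Table 5.8.

```python
# Delays (ns) from Table 5.8, keyed by the number of masters that make
# simultaneous requests. Identifiers are illustrative only.
MESH_DELAY_NS = {1: 3.758, 2: 4.321, 3: 4.499, 4: 4.780,
                 5: 4.800, 6: 5.213, 7: 5.572, 8: 5.746}
BENES_DELAY_NS = {1: 4.490, 2: 4.531, 3: 4.612, 4: 4.658,
                  5: 4.726, 6: 4.986, 7: 5.314, 8: 5.408}

THRESHOLD = 4  # limit stated in the text


def select_topology(active_masters: int) -> str:
    """Choose the configuration mode from the number of requesting masters."""
    if active_masters < 1:
        raise ValueError("at least one master must be requesting")
    return "Mesh" if active_masters < THRESHOLD else "Benes"


# Check the rule against every row of Table 5.8: the selected topology
# should be the one with the lower delay.
for n in sorted(MESH_DELAY_NS):
    faster = "Mesh" if MESH_DELAY_NS[n] < BENES_DELAY_NS[n] else "Benes"
    assert select_topology(n) == faster
    print(f"{n} master(s): {select_topology(n)}")
```

The loop confirms that, for the tabulated values, a threshold of four always selects the lower-delay topology.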
5.3.4 Application of multimedia inputs to the proposed NoC router
The working of the proposed NoC router is verified for multimedia
inputs. The steps in applying the multimedia inputs are discussed in this
section. Multimedia data such as audio and images are given as inputs to the
NoC using MATLAB (it is assumed that the multimedia inputs are generated by
the sources connected in the NoC). At this stage, the multimedia inputs are
present as parallel data streams. In a practical situation, the network
interface in the NoC converts these streams of parallel data to serial data
before transmission through the NoC. Since this work concentrates
only on router design, the preprocessing of parallel data into serial data is
done externally using MATLAB itself. The multimedia inputs are read from
files as decimal values. The decimal values are converted into bits
(parallel streams). Then the parallel streams are converted into a serial
stream of bits. Thus, the serial stream of bits from each source enters the
NoC at a different node and, after traveling through its communication
path, reaches its destination.
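The preprocessing steps above (decimal values, then parallel bit words, then one serial bit stream) can be sketched as follows. The original work performs this step in MATLAB; the Python version here is for illustration only. The 8-bit word width and all function names are assumptions, since the source does not state them.

```python
from typing import Iterable, List

WORD_BITS = 8  # assumed sample width; not specified in the source


def to_parallel(values: Iterable[int], width: int = WORD_BITS) -> List[str]:
    """Convert decimal values into fixed-width binary words (parallel stream)."""
    words = []
    for value in values:
        if not 0 <= value < 2 ** width:
            raise ValueError(f"{value} does not fit in {width} bits")
        words.append(format(value, f"0{width}b"))
    return words


def to_serial(words: Iterable[str]) -> str:
    """Concatenate the parallel words into a single serial bit stream."""
    return "".join(words)


def from_serial(bits: str, width: int = WORD_BITS) -> List[int]:
    """Recover the decimal values from a serial bit stream at the destination."""
    if len(bits) % width != 0:
        raise ValueError("bit stream length is not a multiple of the word width")
    return [int(bits[i:i + width], 2) for i in range(0, len(bits), width)]


samples = [0, 17, 128, 255]           # hypothetical decimal input values
serial = to_serial(to_parallel(samples))
print(serial)
print(from_serial(serial) == samples)
```

Fixed-width words are used so that the destination can split the serial stream back into values without extra delimiters.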
5.3.4.1 Audio Data
The audio files must be in wave format to be read into MATLAB.
Hence, the computer system default audio wave files are used. The MATLAB
tool stores the input data values in text files. The outputs reaching the
destinations are also written to new text files. The input and output files
are compared for verification. The comparison shows that the output files
contain the same bit streams as the input files. This shows that
the transmission is done successfully and verifies the working of the
proposed NoC router. The system default audio wave files like Chimes, Cord,
Ringout, and Town are used for testing the working of the proposed NoC
router.
5.3.4.2 Image Data
Image files are also used as inputs to the NoC. The image files used
are the default image files available in MATLAB. The image data are applied
as inputs through the source nodes of the NoC. The input and output files are
compared with each other for verification. The comparison shows that the
output files contain the same bit streams as the input files. This
shows that the transmission is done successfully and verifies the
functionality of the proposed NoC router.
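The verification step used for both audio and image data is a comparison of the input and output bit streams. The sketch below shows one way to do this in Python, assuming each text file holds a single bit string. The function names and the file layout are assumptions, not details taken from the original work.

```python
from pathlib import Path
from typing import Optional


def first_mismatch(sent: str, received: str) -> Optional[int]:
    """Return the index of the first differing bit, or None if identical."""
    for index, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return index
    if len(sent) != len(received):
        # One stream is a strict prefix of the other.
        return min(len(sent), len(received))
    return None


def verify_files(input_path: str, output_path: str) -> bool:
    """Compare the bit strings stored in two text files (assumed layout)."""
    sent = Path(input_path).read_text().strip()
    received = Path(output_path).read_text().strip()
    return first_mismatch(sent, received) is None


print(first_mismatch("10110010", "10110010"))  # identical streams
print(first_mismatch("10110010", "10100010"))  # streams differ at one bit
```

Reporting the position of the first mismatch, rather than only pass or fail, makes it easier to locate a faulty transfer.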
5.4 CONCLUSION
In this work, 16 node NoCs are first modeled based on
TDM and SDM techniques. Both the TDM based NoC and the SDM based NoC are
modeled for five different topologies, namely, Mesh, Ring, FAT Tree,
Butterfly, and Benes. The power and delay results obtained
through simulations show that SDM NoCs are faster and consume less
power than the corresponding TDM NoCs for all five topologies. So,
SDM NoCs are chosen for further analysis.
Secondly, SDM NoCs are modeled for an increased number of
nodes (32 nodes) for the same five topologies. Next, to find the
most suitable topology for an SDM NoC, an analysis is made of the
performance and characteristics of the five topologies. The
parameters considered are delay, logic utilization, and buffer
utilization. The simulation results show that Benes topology has the
minimum delay, logic utilization, and buffer utilization values when
compared to the remaining four topologies. In addition, Benes topology seems
to be more optimized than the remaining topologies when the size of the
network is increased from 16 to 32 nodes. Therefore, Benes topology seems to
be a better choice for larger NoCs. However, as mentioned earlier (Section
5.4.3.1), Mesh topology has a simple structure that eases physical design,
and it is easily scalable. Its main disadvantage is that the area of the
mesh increases considerably with the number of nodes in the
network. Mesh topology seems to be the best topology when the NoC is very
small, with only a few nodes.
Thirdly, a dynamically reconfigurable NoC is proposed. Since
Mesh and Benes topologies each have their own advantages, a reconfigurable
NoC that can dynamically configure itself as either Benes or Mesh topology
is proposed. The dynamic reconfiguration depends upon the
number of source nodes currently making a request to the NoC. Different
types of data such as bit streams, audio, and image files are applied to the
proposed reconfigurable NoC and its functionality is verified to be correct.