7/30/2019 Biswas4 Seminar 1
1/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
Presented by:
Kevin Biswas
April 1, 2005
Supervisor:
Dr. M. Ahmadi
University of Windsor
Electrical and Computer Engineering
Multiplexer-Based Array Multipliers
7/30/2019 Biswas4 Seminar 1
2/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
Background
In modern digital systems, the component responsible forhandling arithmetic operations is known as the Arithmetic Logic
Unit (ALU) These lie in the critical data path of the core data processing
elements CPU, DSP, and ASIC/FPGA processing andaddressing ICs
Performance of the system, in regards to numerical applications isdirectly related to the structure and design of the ALU whichperforms:
addition/subtraction
multiplication
shift/extension
exponentiation
many others
7/30/2019 Biswas4 Seminar 1
3/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
Background
The most critical function carried out by ALU is multiplication
Digital multiplication is not the most fundamentally complexoperation, but is the most extensively used operation (especiallyin signal processing)
Innumerable schemes have been proposed for realization of theoperation
7/30/2019 Biswas4 Seminar 1
4/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
Basics of Digital Multiplication
Digital multiplication entails a sequence of additions carried out on partialproducts
The method by which this partial product array is summed to give the final
product is the key distinguishing factor amongst the numerous multiplication
schemes
A 16-bit Partial Product Array [Courtesy: Gary Bewick, Stanford]
7/30/2019 Biswas4 Seminar 1
5/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
Multiplication Schemes
Serial Multiplication (Shift-Add)
Standardshift-add multiplier implementation[Courtesy: Gary Bewick, Stanford]
7/30/2019 Biswas4 Seminar 1
6/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
Multiplication Schemes
Serial Multiplication (High Radix Multipliers)
Increased throughput by using more bits of the multiplier
Separate register required storing multiples of the multiplicand
Shift-Add implementation of a basicradix-4 multiplier
7/30/2019 Biswas4 Seminar 1
7/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
8/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
Multiplication Schemes
Tree Multiplier
Offers potential for multiplicationin time O(log n)
Once partial product array isformed, bits are passed toreduction network
Here column-wise compressionof the bits takes place, yieldingtwo final partial products
Final product is obtained by
addition of these two partialproducts
Considered to be irregular inform and does not permitefficient VLSI realization
[Courtesy: Gary Bewick, Stanford]
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
9/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
Multiplication Schemes
Array Multiplier
Partial products are independently computed in parallel
Consider two binary numbers A and B, of mand nbits, respectively:
1 1
0 0
2 2m n
i j
i j
i j
P A B A B
= =
= = i i
1
0
2m
i
i
i
A A
=
= 1
0
2n
j
j
j
B B
=
=
1 1
0 0
2m n
i j
i j
i j
P A B
+
= =
=
1
0
2mn
k
k
k
P P
=
= Pk is known as the partial product term,also called the summand
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
10/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
Multiplication Schemes
Array Multiplier
There are mnsummands that are produced in parallel by a set ofmnAND gates
n x nmultiplier requires n(n-2) full adders, nhalf-adders and n2
AND gates
Worst case delay would be (2n+1)td
Basic cell of a parallel array
multiplier
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
11/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
Multiplication Schemes
Array structure of parallel multiplier
[Courtesy: Karteek Chada, Texas A&M]
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
12/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR
Multiplexer-Based Array Multiplier
New multiplication algorithm based on a different mechanism has beenproposed by Dr. Pekmestzi (National Technical University of Athens)
Algorithm is symmetric because at each step one bit of the multiplier and
one bit of the multiplicand are processed
Mathematically, this method can be described:
Consider two positive integer numbers, X and Y:
Also, define:
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
13/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR
Mathematical Description of
Multiplexer-Based Array Multiplication
Product of the numbers X and Y is therefore:
Now define: and generally,
Where Xj and Yj are the numbers formed by the j least significant bits of X and Y
respectively. The product Pj can then be expressed using the following recursive
equation:
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
14/24
Mathematical Description of
Multiplexer-Based Array MultiplicationP = XY can now be expressed with the following summations:
Where, and therefore:
Above equation can beillustrated in this schematic
Grouping of partial products
are distinguished by connectingthem with solid lines
Folding the array along the lineof symmetry gives the finalform of the algorithm
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
15/24
Mathematical Description of
Multiplexer-Based Array Multiplication Looking at the truth table for Zj, it can be seen that computation is only required
when xj and yj are both equal to 1
At each step, j, only sjand c
j+1are new. The rest of the bits of S
jhave been
formed in the previous j-1 steps according to the relation:
Truth Table for Zj
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
16/24
Mathematical Description of
Multiplexer-Based Array Multiplication Sj can be generated using a carry-propagate adder consisting
of full-adders (FA):
Sj is found to be:
Based on the truth table shown before, each term of the partialproducts Zj can be selected accordingly
This operation can be realized using a 4x1 multiplexer with bits
xj and yj being the selection bits
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
17/24
Parallel Realization of Multiplexer-Based Multiplier
In the parallel realization of the
algorithm, the terms Zj2jandxjyj2
2jare produced in the jth rowof multiplexers, and aresubsequently summed
Implementation of the termsZj2j andxjyj2
2j
[Courtesy: Kiamal Pekmestzi]
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
18/24
Parallel Realization of Multiplexer-Based Multiplier
[Courtesy: Kiamal Pekmestzi]
7/30/2019 Biswas4 Seminar 1
19/24
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
20/24
Parallel Realization of Multiplexer-Based Multiplier
Detail of Cell Types
Cell II consists of:
Full-Adder which produces newbits sj=si and cj+1
Bits si along with xj and yj arebroadcast to all first type cells of
the ith row of the array Bits cj+1 are propagated to the next
second type cells
Bottom circuit completes addition
of the term Zj by summing the lastbit of this term while at the sametime generates xjyj which isrequired in computing the finalproduct
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
21/24
Multiplexer-Based Array Multiplier
Small complexity 2-bit carry-lookahead adders are used in the addition of the
outputs of the right boundary cells
Overall the total multiplier circuit consists of:
full-adders
multiplexers
2n AND gates
n-2 two-bit carry-lookahead (CLA) adders
Multiplexer-based array multiplication scheme yields almost the samecomplexity with arrays based on the more common Modified Booths algorithm
while requiring a considerably smaller number of gates or transistors compared
to other arrays
( 1)2 1
2
n nn
+ +
( 1)
2
n n
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
22/24
Circuit Comparison
Exact gate number:
8.5n2+16.5n-22 (Multiplexer-Based)
10n2+23n+13 (Modified Booth)
Transistor number:
21n2+37n-99 (Multiplexer-Based)
21n2+52n+31 (Modified Booth)
Theoretical operation delay:
T = (n+1)tFA (Multiplexer-Based)
T = ntFA/2 + ntFA (Modified Booth)
Literature shows that multiplexer-based array multiplier outperformsmodified Booth multiplier in both speed and power dissipation up to26%!
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
23/24
Conclusions
Multiplexer-based multiplication algorithm compares favourably tothe most common array multipliers
While having a circuit complexity similar to other array multipliers,multiplexer-based multipliers outperform them in terms of speedand power dissipation
Implementation of multiplexer-based multipliers with faster and
lower power consumption full-adder circuits could further improveperformance
The array structure of the multiplexer-based multiplier permitsefficient VLSI implementation. However, the zig-zag shape
could most likely be modified to a more rectangular form whichwould contribute to significant area-savings.
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
7/30/2019 Biswas4 Seminar 1
24/24
References
Kiamal Z. Pekmestzi, Multiplexer-Based Array Multipliers, IEEE Transactions onComputers, Vol. 48, No. 1, January 1999.
Pedram Mokrian, A Reconfigurable Digital Multiplier Architecture A Masters Thesis,
Department of Electrical and Computer Engineering, University of Windsor, April2003.
Gary W. Bewick, Fast Multiplication: Algorithms and Implementation, A Ph.DDissertation, Stanford University, 1994.
D. Radhakrishnan, Low Voltage CMOS Full Adder Cells, Electronics Letters, Vol.35, pp. 1792-94, October 1999.
Anant Yogeshwar Chitari, VLSI Architecture for a 16-bit Multiply-Accumulator (MAC)Operating in Multiplication Time, A Masters Thesis, Department of ElectricalEngineering, Texas A&M University Kingsville, May 2000.
Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Designs, OxfordUniversity Press, New York, 2000.