Download - Biswas4 Seminar 1

7/30/2019 Biswas4 Seminar 1

1/24

RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR

Presented by:

Kevin Biswas

April 1, 2005

Supervisor:

Dr. M. Ahmadi

University of Windsor

Electrical and Computer Engineering

Multiplexer-Based Array Multipliers


2/24


Background

In modern digital systems, the component responsible forhandling arithmetic operations is known as the Arithmetic Logic

Unit (ALU) These lie in the critical data path of the core data processing

elements CPU, DSP, and ASIC/FPGA processing andaddressing ICs

Performance of the system, in regards to numerical applications isdirectly related to the structure and design of the ALU whichperforms:

addition/subtraction

multiplication

shift/extension

exponentiation

many others


3/24


Background

The most critical function carried out by ALU is multiplication

Digital multiplication is not the most fundamentally complexoperation, but is the most extensively used operation (especiallyin signal processing)

Innumerable schemes have been proposed for realization of theoperation


4/24


Basics of Digital Multiplication

Digital multiplication entails a sequence of additions carried out on partialproducts

The method by which this partial product array is summed to give the final

product is the key distinguishing factor amongst the numerous multiplication

schemes

A 16-bit Partial Product Array [Courtesy: Gary Bewick, Stanford]


5/24


Multiplication Schemes

Serial Multiplication (Shift-Add)

Standardshift-add multiplier implementation[Courtesy: Gary Bewick, Stanford]


6/24



Serial Multiplication (High Radix Multipliers)

Increased throughput by using more bits of the multiplier

Separate register required storing multiples of the multiplicand

Shift-Add implementation of a basicradix-4 multiplier


7/24

RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR


8/24



Tree Multiplier

Offers potential for multiplicationin time O(log n)

Once partial product array isformed, bits are passed toreduction network

Here column-wise compressionof the bits takes place, yieldingtwo final partial products

Final product is obtained by

addition of these two partialproducts

Considered to be irregular inform and does not permitefficient VLSI realization

[Courtesy: Gary Bewick, Stanford]



9/24



Array Multiplier

Partial products are independently computed in parallel

Consider two binary numbers A and B, of mand nbits, respectively:

1 1

0 0

2 2m n

i j

i j

i j

P A B A B

= =

= = i i

1

0

2m

i

i

i

A A

=

= 1

0

2n

j

j

j

B B

=

=

1 1

0 0

2m n

i j

i j

i j

P A B

+

= =

=

1

0

2mn

k

k

k

P P

=

= Pk is known as the partial product term,also called the summand



10/24



Array Multiplier

There are mnsummands that are produced in parallel by a set ofmnAND gates

n x nmultiplier requires n(n-2) full adders, nhalf-adders and n2

AND gates

Worst case delay would be (2n+1)td

Basic cell of a parallel array

multiplier



11/24



Array structure of parallel multiplier

[Courtesy: Karteek Chada, Texas A&M]



12/24


Multiplexer-Based Array Multiplier

New multiplication algorithm based on a different mechanism has beenproposed by Dr. Pekmestzi (National Technical University of Athens)

Algorithm is symmetric because at each step one bit of the multiplier and

one bit of the multiplicand are processed

Mathematically, this method can be described:

Consider two positive integer numbers, X and Y:

Also, define:



13/24


Mathematical Description of

Multiplexer-Based Array Multiplication

Product of the numbers X and Y is therefore:

Now define: and generally,

Where Xj and Yj are the numbers formed by the j least significant bits of X and Y

respectively. The product Pj can then be expressed using the following recursive

equation:



14/24


Multiplexer-Based Array MultiplicationP = XY can now be expressed with the following summations:

Where, and therefore:

Above equation can beillustrated in this schematic

Grouping of partial products

are distinguished by connectingthem with solid lines

Folding the array along the lineof symmetry gives the finalform of the algorithm



15/24


Multiplexer-Based Array Multiplication Looking at the truth table for Zj, it can be seen that computation is only required

when xj and yj are both equal to 1

At each step, j, only sjand c

j+1are new. The rest of the bits of S

jhave been

formed in the previous j-1 steps according to the relation:

Truth Table for Zj



16/24


Multiplexer-Based Array Multiplication Sj can be generated using a carry-propagate adder consisting

of full-adders (FA):

Sj is found to be:

Based on the truth table shown before, each term of the partialproducts Zj can be selected accordingly

This operation can be realized using a 4x1 multiplexer with bits

xj and yj being the selection bits



17/24

Parallel Realization of Multiplexer-Based Multiplier

In the parallel realization of the

algorithm, the terms Zj2jandxjyj2

2jare produced in the jth rowof multiplexers, and aresubsequently summed

Implementation of the termsZj2j andxjyj2

2j

[Courtesy: Kiamal Pekmestzi]



18/24


[Courtesy: Kiamal Pekmestzi]


19/24



20/24


Detail of Cell Types

Cell II consists of:

Full-Adder which produces newbits sj=si and cj+1

Bits si along with xj and yj arebroadcast to all first type cells of

the ith row of the array Bits cj+1 are propagated to the next

second type cells

Bottom circuit completes addition

of the term Zj by summing the lastbit of this term while at the sametime generates xjyj which isrequired in computing the finalproduct



21/24

Multiplexer-Based Array Multiplier

Small complexity 2-bit carry-lookahead adders are used in the addition of the

outputs of the right boundary cells

Overall the total multiplier circuit consists of:

full-adders

multiplexers

2n AND gates

n-2 two-bit carry-lookahead (CLA) adders

Multiplexer-based array multiplication scheme yields almost the samecomplexity with arrays based on the more common Modified Booths algorithm

while requiring a considerably smaller number of gates or transistors compared

to other arrays

( 1)2 1

2

n nn

+ +

( 1)

2

n n



22/24

Circuit Comparison

Exact gate number:

8.5n2+16.5n-22 (Multiplexer-Based)

10n2+23n+13 (Modified Booth)

Transistor number:

21n2+37n-99 (Multiplexer-Based)

21n2+52n+31 (Modified Booth)

Theoretical operation delay:

T = (n+1)tFA (Multiplexer-Based)

T = ntFA/2 + ntFA (Modified Booth)

Literature shows that multiplexer-based array multiplier outperformsmodified Booth multiplier in both speed and power dissipation up to26%!



23/24

Conclusions

Multiplexer-based multiplication algorithm compares favourably tothe most common array multipliers

While having a circuit complexity similar to other array multipliers,multiplexer-based multipliers outperform them in terms of speedand power dissipation

Implementation of multiplexer-based multipliers with faster and

lower power consumption full-adder circuits could further improveperformance

The array structure of the multiplexer-based multiplier permitsefficient VLSI implementation. However, the zig-zag shape

could most likely be modified to a more rectangular form whichwould contribute to significant area-savings.



24/24

References

Kiamal Z. Pekmestzi, Multiplexer-Based Array Multipliers, IEEE Transactions onComputers, Vol. 48, No. 1, January 1999.

Pedram Mokrian, A Reconfigurable Digital Multiplier Architecture A Masters Thesis,

Department of Electrical and Computer Engineering, University of Windsor, April2003.

Gary W. Bewick, Fast Multiplication: Algorithms and Implementation, A Ph.DDissertation, Stanford University, 1994.

D. Radhakrishnan, Low Voltage CMOS Full Adder Cells, Electronics Letters, Vol.35, pp. 1792-94, October 1999.

Anant Yogeshwar Chitari, VLSI Architecture for a 16-bit Multiply-Accumulator (MAC)Operating in Multiplication Time, A Masters Thesis, Department of ElectricalEngineering, Texas A&M University Kingsville, May 2000.

Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Designs, OxfordUniversity Press, New York, 2000.