Low Power Functions

  • 8/12/2019 Low Power Functions

    1/22

    Approaches to Low-Power Implementations of DSP Systems

    Class Advisor: Dr. Fakhraie
    Presenter: Nariman Moezi
    DSP Design & Implementation Course Seminar, Spring 2004

    Outline

    - Reduced two's complement representation
    - Low-power scheduling techniques for embedded DSP software
    - Low-power multipliers
      - Mitchell-based logarithmic multiplier
      - Power-aware pipelined multiplier

    Reduced two's complement representation

    Two's complement representation is widely used in the implementation of arithmetic operations.

    If X has a small magnitude and switches between a positive and a negative value, its sign extension switches between strings of zeros and strings of ones.
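
    The effect can be illustrated with a minimal Python sketch. The 16-bit word length, the 2-bit reduced width, and the helper names are assumptions chosen for illustration; the sketch only counts bit toggles between consecutive words and is not a power model.

```python
def to_twos_complement(x, n_bits):
    """Return the n_bits-wide two's complement bit pattern of x as an unsigned int."""
    return x & ((1 << n_bits) - 1)


def count_toggles(samples, n_bits):
    """Count bit flips between consecutive n_bits-wide two's complement words."""
    words = [to_twos_complement(x, n_bits) for x in samples]
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


samples = [1, -1, 1, -1]  # small magnitude, alternating sign (illustrative data)

full_width = count_toggles(samples, 16)  # sign-extension bits flip on every sample
reduced = count_toggles(samples, 2)      # only the bits needed for |x| < 2
print(full_width, reduced)  # 45 3
```

    Almost all of the toggles in the full-width case come from the sign-extension bits, which carry no information about the magnitude.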

    If X has magnitude less than 2^(m-1) (m

    Application: low-power FIR filter using reduced two's complement representation

    Consider a hybrid-form adaptive FIR filter whose inputs are 5-level data symbols taking values in {-2, -1, 0, 1, 2}. Assume the coefficients are N-bit two's complement numbers.

    Multiplying a coefficient by such a symbol is simply a shift and complement operation.

    Assume we detect that the maximum magnitude of a coefficient H is less than 2^(m-2). Since the symbol magnitude is at most 2, the corresponding partial product P has a magnitude less than 2^(m-1).
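
    A minimal Python sketch of the shift-and-complement multiplication and of the magnitude bound, under stated assumptions: the function name and the word length m = 6 are hypothetical, and Python integers stand in for hardware words.

```python
def partial_product(symbol, coeff):
    """Multiply coeff by a 5-level symbol using only a shift and a negation."""
    if symbol not in (-2, -1, 0, 1, 2):
        raise ValueError("symbol must be one of -2, -1, 0, 1, 2")
    if symbol == 0:
        return 0
    shifted = coeff << (abs(symbol) - 1)  # x1: no shift, x2: shift left by one bit
    return -shifted if symbol < 0 else shifted


m = 6                     # hypothetical reduced word length
bound_h = 1 << (m - 2)    # coefficients satisfy |H| < 2^(m-2)
bound_p = 1 << (m - 1)    # claimed bound on |P|

worst = max(abs(partial_product(s, h))
            for s in (-2, -1, 0, 1, 2)
            for h in range(-bound_h + 1, bound_h))
print(worst, bound_p)  # 30 32
```

    The exhaustive check confirms that every partial product stays below 2^(m-1) for this word length.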

    - Coefficient maximum-magnitude detection (an example with two taps and 6-bit coefficients)

    - Partial-product generation using reduced two's complement representation
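
    In software terms, maximum-magnitude detection amounts to finding the smallest m for which every coefficient satisfies |H| < 2^(m-2). The sketch below is only an arithmetic illustration of that condition; the function name and the two tap values are hypothetical, and it does not describe the detection circuit itself.

```python
def reduced_word_length(coeffs):
    """Smallest m such that every coefficient satisfies |H| < 2**(m - 2)."""
    max_magnitude = max(abs(h) for h in coeffs)
    # bit_length() is the smallest b with max_magnitude < 2**b, so m - 2 = b.
    return max_magnitude.bit_length() + 2


taps = [5, -3]  # two illustrative tap coefficients
m = reduced_word_length(taps)
print(m)  # 5, since 5 < 2**3 but 5 >= 2**2
```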

    - As the adaptive filter updates the coefficients, the word length of the reduced representation changes, and so does the error introduced by using the reduced representation. We can build a compensation-vector correction path that imitates the error propagation in the accumulation path.

    - A test chip was implemented in 0.25 um CMOS technology. It used a hybrid-form filter of 160 taps with 8 taps per hybrid section. The coefficient word length is 10 bits. When operating at 2.5 V with a 100 MHz clock, a 32% power saving was measured, as summarized in the table.

    Low-Power Scheduling Techniques for Embedded DSP Software

    This section describes an instruction-level power model for a DSP processor (Fujitsu) and techniques to reduce the power of this processor.

    - The DSP processor has a special architecture that allows instructions to be packed into pairs.
    - The Booth multiplier on this processor is a major source of energy consumption for DSP programs, so a micro-architectural power model of the on-chip Booth multiplier is developed and analyzed for further power minimization.
    - Based on this model, an effective technique of local code modification by operand swapping is used to further reduce power consumption.

    (S. Malik, IEEE Trans., 1997)

    The sum of the measured currents for the four instructions is 204 mA. The sum of the base costs (37.2 + 14.4 + 36.6 + 14.4) and the overhead costs of adjacent instructions (18.4 + 18.4 + 18.4 + 18.4) is only 176.2 mA, which underestimates the actual cost by 13.6%.

    The difference of 27.8 mA between the two figures comes from the circuit-state overhead between the non-adjacent instructions 1 and 3.
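
    The arithmetic behind these figures can be checked with a few lines of Python. The numbers are the ones quoted above; the variable names are illustrative.

```python
base_costs = [37.2, 14.4, 36.6, 14.4]      # mA, base cost of each instruction
overhead_costs = [18.4, 18.4, 18.4, 18.4]  # mA, overhead between adjacent instructions
measured_total = 204.0                     # mA, sum of the measured currents

estimated_total = sum(base_costs) + sum(overhead_costs)
shortfall = measured_total - estimated_total
underestimate_pct = 100.0 * shortfall / measured_total

print(round(estimated_total, 1), round(shortfall, 1), round(underestimate_pct, 1))
# 176.2 27.8 13.6
```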

    This is due to a special design at the inputs of the multiplier: there is a latch between each operand and the multiplier that retains the old value until the next multiply instruction is executed. The overhead therefore depends on the previous and current values of the input latches for each multiply operation.

    An example of a sequence of four instructions where the overhead cost between instructions 1 and 3 cannot be ignored

    Instruction packing for low power

    A special feature of the target DSP processor is its ability to pack an ALU-type instruction and a data-transfer instruction into a single codeword for simultaneous execution.

    The average current for a packed instruction is only slightly higher than the average current for a sequence of the two unpacked instructions.

    Comparison of energy consumed by packed and unpacked instructions

    Micro-architectural model for the Booth multiplier

    As to the overhead cost of MAC instructions: when MAC is packed with a data-transfer instruction, especially LAB, which changes the data values in registers A and B used by MAC as inputs, a wide variation in the overhead cost is observed (from 1.4 mA to 33.0 mA).

    This wide variation is mainly due to the complex Booth multiplier implemented in the MAC unit.

    The fundamental idea behind the Booth multiplier is to recode B using the "skipping over 1s" technique.

    For example, the 7-digit B value 0011110, which would need four additions of shifted A, can be recoded to a new value that requires only one addition and one subtraction. The weight is the number of non-zero digits:

    0011110 (weight = 4)  --recode-->  0 1 0 0 0 -1 0 (weight = 2)
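
    The recoding in this example can be reproduced with a minimal sketch of textbook radix-2 Booth recoding. This is a generic illustration and not a description of the multiplier circuit on the target processor; the function names are hypothetical.

```python
def booth_recode(b, n_bits):
    """Radix-2 Booth recoding of an n_bits two's complement pattern, MSB first.

    Digit i is b[i-1] - b[i] (with b[-1] = 0), so each digit is -1, 0 or +1.
    """
    digits = []
    prev_bit = 0
    for i in range(n_bits):
        bit = (b >> i) & 1
        digits.append(prev_bit - bit)
        prev_bit = bit
    return digits[::-1]


def weight(digits):
    """Number of non-zero digits, i.e. additions or subtractions of shifted A."""
    return sum(1 for d in digits if d != 0)


b = 0b0011110
recoded = booth_recode(b, 7)
value = sum(d * (1 << i) for i, d in enumerate(reversed(recoded)))

print(recoded)          # [0, 1, 0, 0, 0, -1, 0]
print(weight(recoded))  # 2
print(value)            # 30, the same value as 0b0011110
```

    The run of four 1s is replaced by +32 and -2, which is where the saving from four additions to one addition and one subtraction comes from.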

    We can reduce the number of additions and subtractions simply by swapping the operands in registers A and B, which can reduce the current. The table gives three experiments with operand swapping.

    Another factor that determines the power consumption of the multiplier is switching activity.

    For the Booth multiplier, the relevant characteristic of A is its switching activity; for B, it is both the weight factor and the switching activity.

    Variation of measured current when swapping operands op1 and op2 in registers A and B for MAC:LAB instructions.
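
    A minimal sketch of a weight-based swapping rule, under stated assumptions: it considers only the Booth weight of the two operands and ignores switching activity, and the function names and the 7-bit width are hypothetical. It is not the scheduling procedure of the cited work.

```python
def booth_weight(value, n_bits):
    """Number of non-zero radix-2 Booth digits of an n_bits two's complement word."""
    mask = (1 << n_bits) - 1
    v = value & mask
    # A Booth digit is non-zero exactly where bit i differs from bit i-1.
    return bin(((v << 1) ^ v) & mask).count("1")


def order_operands(op1, op2, n_bits):
    """Return (a, b) so that the operand with the lower Booth weight goes to B."""
    if booth_weight(op2, n_bits) <= booth_weight(op1, n_bits):
        return op1, op2
    return op2, op1


a, b = order_operands(0b0011110, 0b0101010, 7)
print(bin(a), bin(b))  # 0b101010 0b11110: the operands were swapped
```

    Here 0101010 has Booth weight 6 and 0011110 has Booth weight 2, so the rule places 0011110 in B.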

    Average current drawn by MAC:LAB for different characteristics of consecutive values in A and B.

    In a typical DSP application, MAC:LAB instructions are usually applied to a sequence of data for filter operations such as

    $$\sum_i c_i X_i$$

    Since only the coefficients C are known and there is no information about X, we consider C as the value in B. If the switching activity or weight factor of C is high, we can swap the operands.

    Comparison of power consumption for 5 DSP programs using different scheduling techniques

    Improved Mitchell-Based Logarithmic Multiplier for Low-Power DSP Applications

    The technique of multiplying two numbers using logarithms is simple: take the logarithms of the two multiplicands, add the logarithms together, and then take the antilogarithm of the sum.

    Mitchell's method of calculating logarithms: assume N = 25 (decimal) = 11001 (binary). The MSB is bit 4, which gives a characteristic of 100 (binary), and the remaining bits (1001) give the fraction. This gives a logarithm value of 100.1001 (binary), which is 4.5625 in decimal. The correct value of log2(25) is 4.6439.
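
    The worked example can be reproduced with a short Python sketch. The function name is illustrative and floating-point arithmetic stands in for the fixed-point hardware.

```python
import math


def mitchell_log2(n):
    """Mitchell's approximation of log2(n) for a positive integer n."""
    k = n.bit_length() - 1         # characteristic: position of the MSB
    x = (n - (1 << k)) / (1 << k)  # fraction: the remaining bits read as 0.xxxx
    return k + x


approx = mitchell_log2(25)
exact = math.log2(25)
print(approx, round(exact, 4))  # 4.5625 4.6439
```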

    (Duncan J. McLaren et al., IEEE 2003)

    A binary number N can be written as:

    Note that k represents the characteristic and x the binary fraction, with x in the range 0 ≤ x < 1. The true logarithm and the approximation using the Mitchell method are:

    The logarithm of a product is equal to the sum of the logarithms of the multiplicands:

    The antilogarithms of these two equations are:

    To correct the error, the following is used:
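
    For reference, the standard relations behind these statements can be written as follows, using the characteristic k and fraction x defined above. The subscripts 1 and 2 for the two operands and the symbol P for the Mitchell product are notation assumed here; the exact form of the correction used in the cited work is not reproduced.

```latex
% Number, true logarithm, and Mitchell approximation
N = 2^{k}(1 + x), \qquad 0 \le x < 1
\log_2 N = k + \log_2(1 + x) \approx k + x

% Logarithm of a product
\log_2(N_1 N_2) = k_1 + k_2 + \log_2(1 + x_1) + \log_2(1 + x_2)
               \approx k_1 + k_2 + x_1 + x_2

% Antilogarithms: true product and Mitchell product
N_1 N_2 = 2^{k_1 + k_2}\,(1 + x_1 + x_2 + x_1 x_2)
P = \begin{cases}
      2^{k_1 + k_2}\,(1 + x_1 + x_2),  & x_1 + x_2 < 1 \\
      2^{k_1 + k_2 + 1}\,(x_1 + x_2),  & x_1 + x_2 \ge 1
    \end{cases}

% Error of the Mitchell product
N_1 N_2 - P = \begin{cases}
      2^{k_1 + k_2}\, x_1 x_2,              & x_1 + x_2 < 1 \\
      2^{k_1 + k_2}\,(1 - x_1)(1 - x_2),    & x_1 + x_2 \ge 1
    \end{cases}
```

    The error is never negative, so the uncorrected Mitchell product always underestimates or equals the true product.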

    This shows that, to obtain the correct answer, an error-correction factor should be added to the sum before the antilogarithm is calculated.

    However, computing the exact factor would be impractical. The approach is to average the value of the correction factor over a range of x values and add this average to the sum. This results in a multiplier of improved accuracy.

    The two fractional parts are each split into 8 ranges, from 0 to 1 in steps of 0.125. This means that the 3 most significant bits of x can be used to select the error-correction factor, which is precalculated.
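
    A minimal sketch of a table-corrected Mitchell multiplier, under loudly stated assumptions: the correction for each pair of ranges is taken as the exact correction evaluated at the range midpoints, which stands in for the averaged factor; the exact correction formula is derived here from the true product and is not taken from the cited work; all names are hypothetical and floating point stands in for fixed-point hardware.

```python
def split(n):
    """Return (k, x) with n = 2**k * (1 + x) and 0 <= x < 1."""
    k = n.bit_length() - 1
    return k, n / (1 << k) - 1.0


def antilog(k, f):
    """Mitchell antilogarithm of k + f for f >= 0."""
    whole = int(f)  # a carry out of the fraction moves into the characteristic
    return (1 << (k + whole)) * (1.0 + (f - whole))


def exact_correction(x1, x2):
    """Value to add to x1 + x2 so that antilog() returns the exact product."""
    if x1 + x2 + x1 * x2 < 1.0:
        return x1 * x2
    return (1.0 - x1) * (1.0 - x2) / 2.0


STEPS = 8  # ranges of width 0.125, selected by the 3 MSBs of each fraction
MIDPOINTS = [(i + 0.5) / STEPS for i in range(STEPS)]
# Assumption: the midpoint value stands in for the averaged correction factor.
TABLE = [[exact_correction(a, b) for b in MIDPOINTS] for a in MIDPOINTS]


def mitchell_multiply(a, b, corrected=False):
    """Approximate a * b for positive integers using Mitchell logarithms."""
    k1, x1 = split(a)
    k2, x2 = split(b)
    f = x1 + x2
    if corrected:
        f += TABLE[int(x1 * STEPS)][int(x2 * STEPS)]
    return antilog(k1 + k2, f)


true_product = 1000 * 777
plain = mitchell_multiply(1000, 777)
improved = mitchell_multiply(1000, 777, corrected=True)
print(true_product, plain, improved)  # 777000 771072.0 778240.0
```

    For this operand pair the absolute error drops from 5928 to 1240. A single example does not characterize the average accuracy, which depends on how the correction factors are chosen.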

    To test the multiplier further, it was used as part of a real application, in this case a finite impulse response (FIR) filter. The filter was an 11-tap low-pass FIR with a normalized cut-off frequency of 0.25. It was implemented in Verilog using the standard multiplier, the unmodified Mitchell multiplier, and the improved Mitchell multiplier. The input was 16-bit and the output was 32-bit. The figure below shows the magnitude response of each of the three implementations.

    Power-Aware Pipelined Multiplier Design Based on 2-Dimensional Pipeline Gating

    Although Boolean multipliers are naturally power aware with respect to changes in input precision, deeply pipelined designs do not have this benefit.

    In Boolean unpipelined multipliers, a low-input-precision calculation (like 0001 × 0001) dissipates much less power than a high-input-precision calculation (like 1111 × 1111), so these multipliers are naturally power aware with respect to changes in input precision.

    In deeply pipelined designs, the number of registers is much larger than that of other elements, so these designs do not have this natural power awareness.

    (Jia Di, J. S. Yuan et al., GLSVLSI 2003)

    To solve this problem and improve the power awareness of deeply pipelined multipliers, a novel technique, 2-dimensional pipeline gating, is proposed. The technique gates the clock to the registers in both the vertical and the horizontal direction.

    In a 4×4 multiplier, when the input precision is 4 (for example, calculating 1111 × 1111), S is generated from all of the inner partial products. If the input precision is 2 (for example, calculating 0011 × 0011), the partial products containing X2 or Y2 (the ones enclosed by a rectangle) can also be disabled.
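
    The idea of disabling partial products by input precision can be sketched in Python. This models only which partial products x_i * y_j are needed for a given input; it does not model clock gating of the pipeline registers. Unsigned operands and the function names are assumptions of the sketch.

```python
def precision(value):
    """Number of significant bits of an unsigned operand (0001 -> 1, 0011 -> 2)."""
    return value.bit_length()


def active_partial_products(x, y, n_bits):
    """Indices (i, j) of the partial products x_i * y_j left enabled for this input."""
    px, py = precision(x), precision(y)
    return [(i, j) for i in range(n_bits) for j in range(n_bits)
            if i < px and j < py]


def gated_multiply(x, y, n_bits):
    """Sum only the enabled partial products; the disabled ones are known to be zero."""
    return sum((((x >> i) & 1) & ((y >> j) & 1)) << (i + j)
               for i, j in active_partial_products(x, y, n_bits))


print(len(active_partial_products(0b1111, 0b1111, 4)))  # 16
print(len(active_partial_products(0b0011, 0b0011, 4)))  # 4
print(gated_multiply(0b0011, 0b0011, 4))                # 9
```

    With an input precision of 2, only 4 of the 16 partial products remain enabled, and the result is unchanged because every disabled partial product contains a zero bit.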

    References

    M. T. Lee, V. Tiwari, S. Malik, and M. Fujita, "Power analysis and minimization techniques for embedded DSP software," IEEE Trans. VLSI Syst., vol. 5, pp. 123-135, Mar. 1997.

    Jia Di, J. S. Yuan et al., "Power-aware Pipelined Multiplier Design Based on 2-Dimensional Pipeline Gating," GLSVLSI'03, April 28-29, 2003.

    Zhan Yu et al., "A Low Power Adaptive Filter Using Dynamic Reduced 2's-Complement Representation," IEEE Custom Integrated Circuits Conference, 2002.

    Duncan J. McLaren et al., "Improved Mitchell-Based Logarithmic Multiplier for Low-Power DSP Applications," IEEE, 2003.