conf-socc-2004

  • 8/6/2019 conf-socc-2004

    Mrinmoy Ghosh

    Hot Caches

    [Figure: per-unit percentage breakdowns. Processors named: Alpha 21264, ARM 920T, Intel Pentium 4 (Willamette).]

    First set of recovered labels: Bus Interface Unit 12%, Data Cache 14%, Integer Units 16%, Data Path 32%, Mem. Controller 19%, Instruction Cache 7%.

    Second set of recovered labels: I Cache 25%, D MMU 5%, I MMU 4%, ARM 9 25%, PATag RAM 1%, CP15 2%, BIU 8%, SysCtl 3%, Clocks 4%, Other 4%, D Cache 19%.

    Motivation

    [Figure: plot with a logarithmic vertical axis (1 to 1,000,000,000) and a horizontal axis from 1 to 63; series: Max, Min, Avg.]

    Salient Features

    Reuse most significant byte

    Counting granularity in bits rather than bytes

    Compression scheme is a hybrid of two schemes, where the better scheme is chosen dynamically

    Each scheme gives power savings of around 30-45% if applied independently, and the hybrid scheme saves around 3-5% more than either of the schemes.
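As a rough illustration of the bit-versus-byte granularity point above, the sketch below counts how many data bits each scheme would still have to read for a 32-bit word. It is an assumption-laden toy, not the circuit from the talk: it ignores the CE, count, and ZIB overhead bits, and the function names are illustrative.

```python
WORD_BITS = 32  # assumed word width


def bits_read_byte_scheme(word: int) -> int:
    """Byte-granularity zero scheme: only nonzero bytes are read."""
    word &= 0xFFFFFFFF
    nonzero_bytes = sum(1 for i in range(4) if (word >> (8 * i)) & 0xFF)
    return 8 * nonzero_bytes


def bits_read_count_scheme(word: int) -> int:
    """Bit-granularity scheme: the leading run of zeroes or ones is skipped."""
    word &= 0xFFFFFFFF
    msb = word >> (WORD_BITS - 1)
    # Flip the word when the run is made of ones, so we always count zeroes.
    probe = word ^ (0xFFFFFFFF if msb else 0)
    count = WORD_BITS - probe.bit_length()
    return WORD_BITS - count


for w in (0x00000123, 0xFFFFFFF0, 0x12003400):
    print(hex(w), bits_read_byte_scheme(w), bits_read_count_scheme(w))
```

For small positive or negative values the count scheme reads fewer bits, while a word with zero bytes in the middle favors the byte scheme, which is the kind of case a hybrid can exploit.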

    CoolPression Cache

    [Figure: cache read datapath drawn once per step. Recovered labels: SRAM Cell Array (36 bits and 6 bits), Sense Amps, CoolCount Circuit, Bitline Enable Lines, ZIBs, CE Bit, Data from Cache, Data Out; bus widths 37, 33, and 32.]

    Step 1: Read in the first 7 bits and the ZIBs

    Step 2a: Read only 32 - count bits and append with leading zeroes or ones

    Step 2b: Read bytes that are not zeroes

    Counting Leading Zeroes And Ones

    [Figure: priority encoder circuit; output labeled "No of Leading Zeroes or Ones - 1".]
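The slide shows this count being produced by a priority encoder in hardware. A minimal software sketch of the same quantity, assuming a 32-bit word (the function names are illustrative, not from the slides), might look like this:

```python
WORD_BITS = 32  # assumed word width


def leading_run(word: int) -> tuple[int, int]:
    """Return (bit, count): the value of the most significant bit and
    how many consecutive copies of it start the 32-bit word."""
    word &= 0xFFFFFFFF
    msb = word >> (WORD_BITS - 1)
    # Flip the word when the run is made of ones, so we always count zeroes.
    probe = word ^ (0xFFFFFFFF if msb else 0)
    count = WORD_BITS - probe.bit_length()
    return msb, count


def encoded_count(word: int) -> int:
    """The value the slide labels 'No of Leading Zeroes or Ones - 1'."""
    return leading_run(word)[1] - 1
```

Storing the count minus one means a run of 1 to 32 bits fits in 5 bits, which is one plausible reason for the "- 1" on the slide.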

    Bitline Precharge Enabling Circuit

    Read Data From Cache

    Read in the Count Enable (CE) bit and the first 6 bits of data

    CE == 1?

    Yes: Enable the least significant ~count bit lines, read data from those lines, and append count leading zeroes or ones

    No: Read data for the bytes where the ZIB is not set and make the other bytes zero
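The read flow above can be sketched in software as follows. The field layout is an assumption made for illustration only: a 6-bit header holding the run bit followed by a 5-bit (count - 1), and a 4-bit ZIB mask with one bit per byte. The slides do not spell out the exact encoding, and `decode_word` is a hypothetical name.

```python
WORD_BITS = 32  # assumed word width


def decode_word(ce: int, header: int, payload: int, zibs: int = 0) -> int:
    """Rebuild a 32-bit word from its stored form.

    ce      -- Count Enable bit
    header  -- assumed 6-bit field: run bit, then (count - 1) in 5 bits
    payload -- stored data bits
    zibs    -- assumed 4-bit mask, bit i set when byte i is zero
    """
    if ce == 1:
        run_bit = (header >> 5) & 1
        count = (header & 0x1F) + 1
        low_bits = WORD_BITS - count
        # Only the least significant (32 - count) bits are actually read.
        low = payload & ((1 << low_bits) - 1)
        if run_bit:
            # Restore the leading ones that were not read.
            return (((1 << count) - 1) << low_bits) | low
        return low
    word = 0
    for i in range(4):
        if not (zibs >> i) & 1:
            # Read only the bytes whose ZIB is clear; the rest stay zero.
            word |= payload & (0xFF << (8 * i))
    return word
```

In the CE == 0 branch the payload bytes covered by a set ZIB are never consulted, which mirrors the idea that their bit lines need not be activated.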

    Write Data To Cache

    Count the number of leading zeroes or ones

    Check for bytes which are zero

    Count > 8?

    Yes: Set the CE bit to one, enable the most significant 6 bit lines and the least significant ~count bit lines, and write the encoded data to the cache

    No: Set the CE bit to 0 and write the data to the cache, setting ZIBs where necessary
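A minimal sketch of the write flow above, under stated assumptions: the Count > 8 threshold comes from the slide, but the 6-bit header layout (run bit plus 5-bit count - 1), the 4-bit ZIB mask, and the names `StoredWord` and `encode_word` are illustrative choices, not details confirmed by the talk.

```python
from dataclasses import dataclass

WORD_BITS = 32        # assumed word width
COUNT_THRESHOLD = 8   # from the slide: use the count scheme when count > 8


@dataclass
class StoredWord:
    ce: int       # Count Enable bit
    header: int   # CE=1: assumed 6-bit field, run bit then (count - 1)
    payload: int  # CE=1: the low (32 - count) bits; CE=0: the full word
    zibs: int     # CE=0: assumed 4-bit mask, bit i set when byte i is zero


def leading_run(word: int) -> tuple[int, int]:
    """Return the most significant bit and the length of its leading run."""
    msb = (word >> (WORD_BITS - 1)) & 1
    probe = word ^ (0xFFFFFFFF if msb else 0)
    return msb, WORD_BITS - probe.bit_length()


def encode_word(word: int) -> StoredWord:
    word &= 0xFFFFFFFF
    run_bit, count = leading_run(word)
    if count > COUNT_THRESHOLD:
        # Count scheme: keep only the bits below the leading run.
        low_bits = WORD_BITS - count
        payload = word & ((1 << low_bits) - 1)
        header = (run_bit << 5) | (count - 1)
        return StoredWord(ce=1, header=header, payload=payload, zibs=0)
    # Zero-byte scheme: flag every byte that is entirely zero.
    zibs = 0
    for i in range(4):
        if ((word >> (8 * i)) & 0xFF) == 0:
            zibs |= 1 << i
    return StoredWord(ce=0, header=0, payload=word, zibs=zibs)
```

For example, `encode_word(0x00000123)` takes the count branch (23 leading zeroes), while `encode_word(0x12003400)` has only 3 leading zeroes and falls back to the ZIB branch with two bytes flagged.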

    Effect on Performance

    [Figure: chart of IPC (axis 0 to 2.5) for Crafty, Gcc, Gzip, Mcf, Parser, Twolf, Vortex, VPR, and Avg; series: Normal Cache, Coolcount Cache.]

    Results

    [Figure: 16K Data Cache. Chart (axis 0 to 1.2) for Bzip2, Crafty, GCC, GZIP, MCF, Parser, Vortex, Vpr, and Avg; series: Dcache Base, Dcache CoolCount, Dcache DZC, Dcache CoolPression.]

    [Figure: 16K Instruction Cache. Chart (axis 0.75 to 1.05) for Bzip2, Crafty, GCC, GZIP, MCF, Parser, Vortex, Vpr, and Avg; series: Icache Base, Icache CoolCount, Icache DZC, Icache CoolPression.]

    Conclusions

    System Transparent Hybrid Zero Compression Scheme

    Bit level and Byte level compressibility used to save power

    Energy Savings of over 35% over baseline cache

    Potential use at other places where data transfer takes place, like L2 Cache to Memory

    Thank You