
A Dictionary Construction Technique for Code Compression Systems with Echo Instructions

Philip Brisk, Jamie Macbeth, Ani Nahapetian, Majid Sarrafzadeh

Embedded and Reconfigurable Systems Lab
Computer Science Department
University of California, Los Angeles

{philip, macbeth, ani, majid}@cs.ucla.edu

LCTES ’05. June 16, 2005. Chicago, IL

Outline

• Introduction: Code Compression

• Dictionary Compression

• Dictionary Construction

• Overview of the Algorithm

• Experimental Methodology and Results

• Summary

Introduction: Code Compression for Embedded Systems

Why Reduce Program Size?
• Reduces Memory Requirements
• Silicon Cost of Program Storage in on-chip ROMs
• As Embedded Systems Become More Complex, Ever-More Functionality Will Migrate to Software

Costs of Runtime Decompression
• Performance Overhead
• Area of the Decoder Circuitry

Dictionary Compression

1. Find Repeated Code Sequences

2. Place Each Sequence Into a Dictionary

3. Replace Each Sequence in the Program with a Codeword that Accesses the Dictionary

[Figure: Program and Dictionary]
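The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' algorithm: it only looks for exact repeats of one fixed length, and the instruction names, the `("CODEWORD", index)` encoding, and the function names are all hypothetical.

```python
from collections import Counter

def build_dictionary(program, length):
    """Steps 1 and 2: collect every sequence of `length` instructions seen more than once."""
    counts = Counter(
        tuple(program[i:i + length]) for i in range(len(program) - length + 1)
    )
    return [seq for seq, n in counts.items() if n > 1]

def compress(program, dictionary):
    """Step 3: replace each dictionary sequence with a codeword naming its entry."""
    out, i = [], 0
    while i < len(program):
        for index, seq in enumerate(dictionary):
            if tuple(program[i:i + len(seq)]) == seq:
                out.append(("CODEWORD", index))  # hypothetical codeword encoding
                i += len(seq)
                break
        else:
            out.append(program[i])  # no match here: keep the instruction as is
            i += 1
    return out

program = ["ld", "add", "st", "mul", "ld", "add", "st", "sub"]
dictionary = build_dictionary(program, 3)
compressed = compress(program, dictionary)
```

Here the eight-instruction program shrinks to four entries plus one three-instruction dictionary entry. A real compressor would also have to handle overlapping repeats and variable lengths, which this sketch ignores.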

CALD and Echo Instructions

CALD Instructions
• Place Each Sequence in a Dictionary
• All Codewords Point to the Dictionary

Echo Instructions
• Leave One Instance of the Sequence Inline
• All Codewords Point to the Sequence

[Figure: a Program with a separate Dictionary, beside a Program with no separate Dictionary]
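The difference between the two codeword styles can be illustrated by expanding both back into the same instruction stream. The tuple encodings below, `("cald", address, length)` and `("echo", offset, length)`, are assumptions made for this sketch; the slides do not give an instruction format.

```python
def expand_cald(program, dictionary):
    """CALD: every codeword fetches its sequence from a separate dictionary."""
    trace = []
    for instr in program:
        if isinstance(instr, tuple) and instr[0] == "cald":
            _, address, length = instr
            trace.extend(dictionary[address:address + length])
        else:
            trace.append(instr)
    return trace

def expand_echo(program):
    """Echo: every codeword points back at the one instance left inline."""
    trace = []
    for pc, instr in enumerate(program):
        if isinstance(instr, tuple) and instr[0] == "echo":
            _, offset, length = instr
            start = pc - offset  # assumed: offset counts instructions back from the echo
            # Assumption: the echoed region contains ordinary instructions only.
            trace.extend(program[start:start + length])
        else:
            trace.append(instr)
    return trace

cald_program = [("cald", 0, 3), "mul", ("cald", 0, 3), "sub"]
cald_dictionary = ["ld", "add", "st"]
echo_program = ["ld", "add", "st", "mul", ("echo", 4, 3), "sub"]

cald_trace = expand_cald(cald_program, cald_dictionary)
echo_trace = expand_echo(echo_program)
```

Both programs expand to the same eight instructions; the Echo version simply keeps the first occurrence in place instead of moving it to a dictionary.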

Compression Algorithms

The Traditional Approach: Compression Performed at Link Time

• Substring Matching [Fraser et al., 1984]
  + Register Renaming [Cooper and McIntosh, 1999] [Debray et al., 2000]
  + Instruction Rescheduling [De Sutter et al., 2002]

Our Approach is Somewhat Different…

• Identify Repeated Isomorphic Patterns that Occur within the Intermediate Representation PRIOR TO Register Allocation [Brisk et al., 2004]

Dictionary Construction

Sequence 1 (DAG 1)
A: R1 ← R2 + R3
B: R4 ← R5 + R6
C: R7 ← R1 + R4

Sequence 2 (DAG 2)
A: R1 ← R2 + R3
C: R7 ← R1 + R4

DAG 2 is isomorphic to a subgraph of DAG 1

2 Schedules Exist for DAG 1

Dictionary 1 (5 operations)
A: R1 ← R2 + R3
B: R4 ← R5 + R6
C: R7 ← R1 + R4
A: R1 ← R2 + R3
C: R7 ← R1 + R4

Dictionary 2 (3 operations)
B: R4 ← R5 + R6
A: R1 ← R2 + R3
C: R7 ← R1 + R4
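The claim that DAG 1 has two schedules can be checked by brute force. The sketch below enumerates every topological order of a small DAG; the edge lists are read off the register dependences in the example (C uses R1 from A and R4 from B), and the helper name `schedules` is hypothetical.

```python
from itertools import permutations

def schedules(nodes, edges):
    """Enumerate every topological order (legal schedule) of a small DAG."""
    result = []
    for order in permutations(nodes):
        position = {n: i for i, n in enumerate(order)}
        # A schedule is legal if every producer comes before its consumer.
        if all(position[u] < position[v] for u, v in edges):
            result.append("".join(order))
    return result

dag1_schedules = schedules("ABC", [("A", "C"), ("B", "C")])
dag2_schedules = schedules("AC", [("A", "C")])
```

Enumerating permutations is only practical for tiny patterns, but it makes the point: DAG 1 admits both ABC and BAC, while DAG 2 admits only AC.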

Isomorphic Pattern Generation

Edge Contraction
• Add an Operation to a Pattern
• Combine 2 Patterns into a Larger One
• Build a Subgraph Hierarchy (SH)

[Figure: animation in which successive edge contractions produce patterns T1, T2, T3, and T4 and add each to the SH]
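Edge contraction itself is a small graph operation. The sketch below merges the two endpoints of an edge into a new pattern node and records which sub-patterns it was built from. The node names, the `hierarchy` dictionary, and the function name `contract` are illustrative assumptions, and the sketch does not check that the contraction keeps the graph acyclic.

```python
def contract(edges, u, v, merged):
    """Contract edge (u, v): replace both endpoints with the single node `merged`."""
    def rename(n):
        return merged if n in (u, v) else n

    new_edges = set()
    for a, b in edges:
        a, b = rename(a), rename(b)
        if a != b:  # the contracted edge itself disappears
            new_edges.add((a, b))
    return new_edges

# A three-node chain: operation x feeds pattern T1, which feeds operation y.
edges = {("x", "T1"), ("T1", "y")}
hierarchy = {}  # pattern -> the two sub-patterns it was built from

edges = contract(edges, "x", "T1", "T2")  # add an operation to a pattern
hierarchy["T2"] = ("x", "T1")

edges = contract(edges, "T2", "y", "T3")  # grow the pattern once more
hierarchy["T3"] = ("T2", "y")
```

After both contractions the graph has no edges left, and `hierarchy` holds one entry per generated pattern, which is the raw material for a subgraph hierarchy.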

An SH Grammar

The SH is also a DAG
• Generate a Pattern Tk from Sub-Patterns Ti and Tj
• Contract Edge (Ti, Tj)
• Create a Production: Tk → TiTj

Example Productions
T2 → xT1
T4 → T3T2

[Figure: SH containing patterns T1, T2, T3, and T4 and operation x]

Derivations and Scheduling

[Figure: a DAG on operations a, b, c, d, e, f, g, decomposed into sub-patterns G1 through G7]

Grammar
G1 → G2G3
G2 → G4b
G3 → G5g
G4 → ac
G5 → G6f
G5 → G7e
G6 → de
G7 → df

Derivations

[Figure: two derivation trees for G1, one expanding G5 through G6 and one expanding G5 through G7]

acbdefg
acbdfeg
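The link between derivations and schedules can be reproduced mechanically: expanding the grammar in every possible way yields exactly the schedules it encodes. The sketch below uses the productions from this slide; the dictionary layout and the function name `derive` are choices made for the illustration.

```python
from itertools import product

# Productions from the slide; G5 has two alternatives.
GRAMMAR = {
    "G1": [["G2", "G3"]],
    "G2": [["G4", "b"]],
    "G3": [["G5", "g"]],
    "G4": [["a", "c"]],
    "G5": [["G6", "f"], ["G7", "e"]],
    "G6": [["d", "e"]],
    "G7": [["d", "f"]],
}

def derive(symbol):
    """Return every terminal string (schedule) derivable from `symbol`."""
    if symbol not in GRAMMAR:  # a terminal, i.e. a single operation
        return [symbol]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine every derivation of each right-hand-side symbol, in order.
        for parts in product(*(derive(s) for s in rhs)):
            results.append("".join(parts))
    return results

all_schedules = derive("G1")
```

The two alternatives for G5 are the only choice point, so G1 has exactly two derivations.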

Compatibility

Ti, Tj – patterns
Si, Sj – schedules for Ti, Tj
Assume Ti is a Subgraph of Tj

If We Want Ti and Tj to Share the Same Dictionary Entry, Then Si Must be a Contiguous Subsequence of Sj.

Schedule ABC
A: R1 ← R2 + R3
B: R4 ← R5 + R6
C: R7 ← R1 + R4

Schedule AC
A: R1 ← R2 + R3
C: R7 ← R1 + R4

Schedule BAC
B: R4 ← R5 + R6
A: R1 ← R2 + R3
C: R7 ← R1 + R4

AC is a Contiguous Subsequence of BAC but not ABC
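When schedules are written as strings of operation labels, the contiguity requirement is a substring test. The sketch below also counts how many operations the dictionary needs under each schedule; the function names are hypothetical.

```python
def shares_entry(sub_schedule, schedule):
    """True if the smaller schedule is a contiguous subsequence of the larger one."""
    return sub_schedule in schedule

def dictionary_size(schedule, sub_schedule):
    """Operations stored when both patterns must appear in the dictionary."""
    if shares_entry(sub_schedule, schedule):
        return len(schedule)  # the sub-pattern reuses part of the larger entry
    return len(schedule) + len(sub_schedule)  # two separate entries

size_with_abc = dictionary_size("ABC", "AC")
size_with_bac = dictionary_size("BAC", "AC")
```

Choosing the schedule BAC lets AC reuse the tail of the larger entry, so three operations are stored instead of five.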

Convex Cuts in DAGs

Let G = (V, E) be a DAG
• A Cut is a Partition of V
• A Convex Cut cannot have edges that cross the boundary of a cut in BOTH directions
• SH Construction Ensures Convex Cuts

[Figure: three panels: DAG, Non-Convex Cut, Convex Cut / Scheduling]
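The convexity condition translates directly into a check on the edge list. The sketch below is one straightforward reading of the definition above, using a hypothetical three-node chain as the example.

```python
def is_convex_cut(edges, part):
    """A cut (part, rest) is convex if edges cross the boundary in one direction only."""
    part = set(part)
    outgoing = any(u in part and v not in part for u, v in edges)
    incoming = any(u not in part and v in part for u, v in edges)
    return not (outgoing and incoming)

chain = [("a", "b"), ("b", "c")]  # a -> b -> c

convex = is_convex_cut(chain, {"a", "b"})      # only b -> c crosses the boundary
non_convex = is_convex_cut(chain, {"a", "c"})  # a -> b leaves and b -> c enters
```

A convex cut is what makes scheduling possible: one side can be emitted entirely before the other, whereas the non-convex cut {a, c} would require b to run both after and before its own side.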

Convex Cuts and Compatibility

[Figure: DAG G1 on operations a, b, c, d, e, f, g, cut one way into G2 and G3 and another way into G4 and G5. Panels G1→(2,3) and G1→(4,5) show each cut alone; panel G1→(2,3),(4,5) shows both cuts together and is marked CYCLE!]

G1 → G2G3
G1 → G4G5

Generalized Compatibility

Given a Set of Productions with G1 on the LHS…
G1 → G2G3
G1 → G4G5
…
G1 → G2kG2k+1

How can we Tell if they are Compatible?

Three Criteria Equivalent to Compatibility
1. G1→(2,3),(4,5),…,(2k,2k+1) is Acyclic
2. G2 ⊂ G4 ⊂ … ⊂ G2k
3. G2k+1 ⊂ … ⊂ G5 ⊂ G3

The Pragmatic Question:

If the Productions are NOT All Compatible, what is the Largest Compatible Subset?

The Subset/Subgraph View of Compatibility and Scheduling

[Figure: pattern Gi drawn inside pattern Gj, with the remainder labeled Gj - Gi; schedule Si is followed by schedule Sj-i]

1. Construct a Schedule Si for Gi

2. Construct a Schedule Sj-i for Gj-i

3. Construct a Schedule Sj = SiSj-i for Gj
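The three steps can be sketched as follows. The DAG, the choice of Gi, and the helper names are hypothetical, and the sketch assumes the cut is convex with no edge running from Gj-i back into Gi; otherwise the concatenation would not be a legal schedule.

```python
def schedule(nodes, edges):
    """Kahn-style topological sort of the subgraph induced by `nodes`."""
    remaining, order = set(nodes), []
    while remaining:
        # A node is ready once none of its producers is still unscheduled.
        ready = sorted(
            n for n in remaining
            if not any(v == n and u in remaining for u, v in edges)
        )
        if not ready:
            raise ValueError("graph contains a cycle")
        order.append(ready[0])  # ties broken alphabetically
        remaining.remove(ready[0])
    return order

def is_legal(order, edges):
    """Check that every producer precedes its consumer."""
    position = {n: i for i, n in enumerate(order)}
    return all(position[u] < position[v] for u, v in edges)

edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
g_i = {"a", "b"}
g_j = {"a", "b", "c", "d"}

s_i = schedule(g_i, edges)           # step 1
s_rest = schedule(g_j - g_i, edges)  # step 2
s_j = s_i + s_rest                   # step 3: Sj = Si Sj-i
```

Because Si appears as a prefix of Sj, a dictionary entry holding Sj also serves the smaller pattern Gi.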

A Production Compatibility Graph

Represent the Subgraph Relation as a DAG called the Production Compatibility Graph (PCG)
• Productions G1 → Gi… and G1 → Gj… create vertices Gi and Gj
• Add an Edge (Gi, Gj) to the PCG if
  1. Gi ⊂ Gj
  2. There is no Gk such that Gj ⊃ Gk ⊃ Gi

Any PATH in the PCG Corresponds to a Subset of Patterns that can be Scheduled Contiguously within a Dictionary entry for G1.
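A PCG built this way is the transitive reduction of the subset relation. The sketch below represents each pattern by the set of operations it covers, which is a simplification of the subgraph relation, and the four patterns are made up for the illustration rather than taken from the slides.

```python
def build_pcg(patterns):
    """patterns: name -> frozenset of operations. Returns the list of PCG edges."""
    names = list(patterns)
    edges = []
    for i in names:
        for j in names:
            if patterns[i] < patterns[j]:  # condition 1: proper subset
                # Condition 2: no third pattern sits strictly between the two.
                between = any(
                    patterns[i] < patterns[k] < patterns[j] for k in names
                )
                if not between:
                    edges.append((i, j))
    return edges

patterns = {
    "G2": frozenset("ab"),
    "G4": frozenset("abc"),
    "G6": frozenset("abcd"),
    "G8": frozenset("ac"),
}
pcg_edges = build_pcg(patterns)
```

G2 and G8 are both contained in G6, but no direct edge is added because G4 lies between them; the containment is still expressed by the paths G2, G4, G6 and G8, G4, G6.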

PCG Example

[Figure: five cuts of DAG G1 on operations a, b, c, d, e, f, g, into (G2, G3), (G4, G5), (G6, G7), (G8, G9), and (G10, G11), with the resulting PCG on vertices G8, G2, G4, G6, and G10]

Algorithm Overview

Recall that the Subgraph Hierarchy is a DAG
• Process SH Entries in Topological Order
• All Sub-Patterns Processed Before Each Pattern

Construct a PCG for each SH Entry
• Assign Vertex Weights to Each Pattern based on the Number of Sub-Patterns in the Dictionary Entry
• Find Max Vertex-Weighted Path in the PCG

Determine the Maximum Gain Pattern in the SH
• Remove the Max Gain Pattern – and all Sub-Patterns Selected for its Dictionary Entry
• Repeat until the SH is Empty
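The maximum vertex-weighted path step has a standard dynamic-programming solution on a DAG. The sketch below assumes the vertices are supplied in topological order; the PCG and the weights are invented for the example and do not come from the experiments.

```python
def max_weight_path(vertices, edges, weight):
    """Return (total weight, path) for the heaviest path in a DAG.

    `vertices` must be listed in topological order.
    """
    # best[v] is the heaviest path found so far that ends at v.
    best = {v: (weight[v], [v]) for v in vertices}
    for v in vertices:
        for u, w in edges:
            if u == v:
                candidate = best[v][0] + weight[w]
                if candidate > best[w][0]:
                    best[w] = (candidate, best[v][1] + [w])
    return max(best.values(), key=lambda item: item[0])

vertices = ["G8", "G2", "G4", "G6"]
edges = [("G2", "G4"), ("G4", "G6"), ("G8", "G4")]
weight = {"G8": 3, "G2": 1, "G4": 2, "G6": 1}  # hypothetical vertex weights

total, path = max_weight_path(vertices, edges, weight)
```

With these weights the heaviest path runs through G8 rather than G2, so G8, G4, and G6 would be the patterns chosen to share one dictionary entry.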

Experimental Framework

Algorithm Built into the Machine SUIF Compiler
1. Consolidate Each Application using link_suif Pass
   • All Unrolled Loops Manually Re-rolled
2. Standard Front End Compilation Script
   • One Round of Constant Folding/DCE
3. Instruction Selection for Alpha Architecture
   • ARM Back End Recently Released…
4. Detect Recurring Isomorphic Patterns in the IR
   • Analysis described in [Brisk et al., 2004]
5. Dictionary Construction as Described Here

Experimental Methodology

Cannot Compare with Substring Matching
• Many Schedules Exist for Each DAG
• Substring Matching Assumes Scheduled Code
• How to Determine the Best Schedule for Each DAG?
• Our Algorithm Determines a Schedule for the Entire Set of DAGs to Maximize Pattern Overlap

Naïve Approach – Each Pattern Gets Its Own Dictionary Entry

Our Approach – Isomorphism/Scheduling

Experimental Results

[Figure: Dictionary Size. Bar chart of Number of Operations (axis from 0 to 10000) for the Naive and Heuristic approaches on Epic, G.721, GSM, PGP (RSA), Rasta, JPEG, MPEG2 Dec, MPEG2 Enc, Pegwit, and PGP]

Applications Taken from MediaBench [Lee et al., 1997]

Compilation Time

Benchmark    Total (sec)   Dictionary (sec)   Dictionary (%)
Epic                9.88              0.524            5.30%
G.721               2.71              0.196            7.23%
GSM                 33.6              0.821            2.44%
JPEG                 362               16.1            4.45%
MPEG2 Dec           32.3               1.31            4.06%
MPEG2 Enc           65.1               1.99            3.06%
Pegwit              32.6               1.10            3.37%
PGP                  198               5.64            2.85%
PGP (RSA)           9.06              0.520            5.74%
Rasta               18.1              0.871            4.81%

Conclusion

Algorithm Given for Dictionary Construction
• What Is Built is Actually an Intermediate Representation of a Dictionary
• Combination of 3 Classically Hard Problems
  • Graph/Subgraph Isomorphism
  • Scheduling
  • Dictionary Construction/Compression

Future Work: Register Allocation and Assignment
• Make a Best Effort to Assign Registers So that Isomorphic Patterns have Identical Register Usage

References

1. Brisk, P., Nahapetian, A., and Sarrafzadeh, M. Instruction Selection for Compilers that Target Architectures with Echo Instructions, SCOPES 2004.

2. Fraser, C. W., Myers, E., and Wendt, A. Analyzing and Compressing Assembly Code, Symposium on Compiler Construction, 1984.

3. Cooper, K. D., and McIntosh, N. Enhanced Code Compression for Embedded RISC Processors, PLDI 1999.

4. De Sutter, B., De Bus, B., and De Bosschere, K. Sifting out the Mud: Low-Level C++ Code Reuse, OOPSLA 2002.

5. Debray, S., Evans, W., Muth, R., and De Sutter, B. Compiler Techniques for Code Compaction, TOPLAS, 2000.

6. Lee, C., Potkonjak, M., and Mangione-Smith, W. H. MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems, MICRO-30, 1997.

Questions?