A Dictionary Construction Technique for Code
Compression Systems with Echo Instructions
Embedded and Reconfigurable Systems Lab
Computer Science Department
University of California, Los Angeles
{philip, macbeth, ani, majid}@cs.ucla.edu
LCTES ’05. June 16, 2005. Chicago, IL
Philip Brisk Jamie Macbeth Ani Nahapetian Majid Sarrafzadeh
Outline
• Introduction: Code Compression
• Dictionary Compression
• Dictionary Construction
• Overview of the Algorithm
• Experimental Methodology and Results
• Summary
Why Reduce Program Size?
• Reduces Memory Requirements
• Silicon Cost of Program Storage in on-chip ROMs
• As Embedded Systems Become More Complex, Ever-More Functionality Will Migrate to Software
Costs of Runtime Decompression
• Performance Overhead
• Area of the Decoder Circuitry
Introduction: Code Compression For Embedded Systems
Dictionary Compression
1. Find Repeated Code Sequences
2. Place Each Sequence Into a Dictionary
3. Replace Each Sequence in the Program with a Codeword that Accesses the Dictionary
[Figure: a Program whose repeated sequences are replaced by codewords that access a Dictionary]
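The three steps above can be sketched in a few lines. This is a minimal illustration, not the method of the talk: it works on already-scheduled code, handles a single fixed-length sequence, and the `("CALD", index)` tuple encoding and the function names are assumptions made for the example.

```python
from collections import Counter

def compress(program, length):
    """Replace the most frequent `length`-instruction sequence with codewords."""
    # Step 1: find repeated sequences (here: fixed-length sliding windows).
    windows = Counter(tuple(program[i:i + length])
                      for i in range(len(program) - length + 1))
    if not windows:
        return list(program), []
    sequence, count = windows.most_common(1)[0]
    if count < 2:
        return list(program), []
    # Step 2: place the sequence into a dictionary.
    dictionary = [list(sequence)]
    # Step 3: replace each occurrence with a codeword (hypothetical encoding).
    out, i = [], 0
    while i < len(program):
        if tuple(program[i:i + length]) == sequence:
            out.append(("CALD", 0))
            i += length
        else:
            out.append(program[i])
            i += 1
    return out, dictionary

def decompress(compressed, dictionary):
    """Expand every codeword back into its dictionary entry."""
    out = []
    for item in compressed:
        if isinstance(item, tuple) and item[0] == "CALD":
            out.extend(dictionary[item[1]])
        else:
            out.append(item)
    return out
```

For example, `compress(["ld", "add", "st", "mul", "ld", "add", "st"], 3)` leaves a program of three items and a dictionary holding one three-instruction entry, and `decompress` restores the original list.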
CALD Instructions
• Place each sequence in a dictionary
• All Codewords Point to the Dictionary
Echo Instructions
• Leave one Instance of the Sequence Inline
• All Codewords Point to the Sequence
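CALD and Echo Instructions
[Figure: CALD codewords in the Program point into a separate Dictionary; Echo codewords point to a sequence left inline in the Program]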
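The difference from CALD is that an echo codeword refers back into the program itself. A minimal sketch of expansion follows; the `("ECHO", start, length)` tuple is a hypothetical encoding chosen for illustration, not the actual instruction format, and the sketch assumes the echoed region contains no echo codewords of its own.

```python
def expand_echo(program):
    """Expand echo codewords by copying the inline instance they point to."""
    out = []
    for item in program:
        if isinstance(item, tuple) and item[0] == "ECHO":
            _, start, length = item
            # The one inline instance lives in the program, not a dictionary.
            out.extend(program[start:start + length])
        else:
            out.append(item)
    return out
```

Here `["ld", "add", "st", "mul", ("ECHO", 0, 3)]` expands to the seven-instruction program, with no separate dictionary storage.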
The Traditional Approach: Compression Performed at Link Time
• Substring Matching [Fraser et al., 1984]
  + Register Renaming [Cooper and McIntosh, 1999] [Debray et al., 2000]
  + Instruction Rescheduling [De Sutter et al., 2002]
Our Approach is Somewhat Different…
• Identify Repeated Isomorphic Patterns that Occur within the Intermediate Representation PRIOR TO Register Allocation [Brisk et al., 2004]
Compression Algorithms
Dictionary Construction
Sequence 1 (DAG 1):
A: R1 ← R2 + R3
B: R4 ← R5 + R6
C: R7 ← R1 + R4
Sequence 2 (DAG 2):
A: R1 ← R2 + R3
C: R7 ← R1 + R4
• 2 Schedules Exist for DAG 1
• DAG 2 is isomorphic to a subgraph of DAG 1
Dictionary 1 (5 operations): Sequence 1 scheduled as A, B, C, plus a separate entry for A, C
Dictionary 2 (3 operations): Sequence 1 scheduled as B, A, C, so that A, C is shared
Isomorphic Pattern Generation
Edge Contraction
• Add an Operation to a Pattern
• Combine 2 Patterns into a Larger One
• Build a Subgraph Hierarchy (SH)
[Figure: step-by-step edge contraction producing patterns T1 through T4, with the Subgraph Hierarchy (SH) growing alongside]
An SH Grammar
The SH is also a DAG
• Generate a pattern Tk from sub-patterns Ti and Tj
• Contract edge (Ti, Tj)
• Create a Production: Tk → TiTj
Example productions (x is a single operation):
T2 → xT1
T4 → T3T2
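Because the SH is a DAG, its productions can be ordered so that every sub-pattern comes before the patterns built from it. The sketch below stores the two example productions in a dictionary and orders them with the standard-library `graphlib` module (Python 3.9 or later). The data layout and the name `sh_order` are assumptions for illustration; T1 and T3 are treated as leaves only because their own productions are not shown.

```python
from graphlib import TopologicalSorter

# Each production Tk -> Ti Tj records the two sub-patterns that were contracted.
PRODUCTIONS = {
    "T2": ("x", "T1"),
    "T4": ("T3", "T2"),
}

def sh_order(productions):
    """Return the SH symbols with every sub-pattern before its parent."""
    # TopologicalSorter expects a mapping: node -> its predecessors.
    graph = {lhs: set(rhs) for lhs, rhs in productions.items()}
    return list(TopologicalSorter(graph).static_order())
```

Any order returned by `sh_order(PRODUCTIONS)` places T1 before T2 and both T2 and T3 before T4.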
Derivations and Scheduling
[Figure: a DAG with vertices a through g, its sub-patterns G1 through G7, and two derivation trees]
Grammar:
G1 → G2G3
G2 → G4b
G3 → G5g
G4 → ac
G5 → G6f
G5 → G7e
G6 → de
G7 → df
Derivations: acbdefg, acbdfeg
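Each derivation of the grammar spells out one schedule of the DAG. As a small sketch, the grammar from this slide can be expanded exhaustively; the dictionary layout and the function name `derivations` are choices made for the example.

```python
from itertools import product

# Productions from the slide; G5 has two alternatives.
GRAMMAR = {
    "G1": [["G2", "G3"]],
    "G2": [["G4", "b"]],
    "G3": [["G5", "g"]],
    "G4": [["a", "c"]],
    "G5": [["G6", "f"], ["G7", "e"]],
    "G6": [["d", "e"]],
    "G7": [["d", "f"]],
}

def derivations(symbol, grammar):
    """Return every terminal string derivable from `symbol`."""
    if symbol not in grammar:          # terminal: a single operation
        return [symbol]
    results = []
    for rhs in grammar[symbol]:
        parts = [derivations(s, grammar) for s in rhs]
        for combo in product(*parts):  # combine alternatives of each part
            results.append("".join(combo))
    return results

print(derivations("G1", GRAMMAR))  # ['acbdefg', 'acbdfeg']
```

The two strings match the derivations listed on the slide; they differ only in which alternative of G5 is chosen.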
Compatibility
Ti, Tj – patterns
Si, Sj – schedules for Ti, Tj
Assume Ti is a Subgraph of Tj
We want Ti and Tj to Share the Same Dictionary Entry
Then Si must be a Contiguous Subsequence of Sj.
Schedule ABC:
A: R1 ← R2 + R3
B: R4 ← R5 + R6
C: R7 ← R1 + R4
Pattern AC:
A: R1 ← R2 + R3
C: R7 ← R1 + R4
Schedule BAC:
B: R4 ← R5 + R6
A: R1 ← R2 + R3
C: R7 ← R1 + R4
AC is a Contiguous Subsequence of BAC but not ABC
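The contiguity condition is easy to state in code. A minimal sketch, with the function name chosen for illustration and schedules written as strings of operation labels:

```python
def is_contiguous_subsequence(si, sj):
    """True if schedule `si` appears as one unbroken run inside schedule `sj`."""
    si, sj = list(si), list(sj)
    n = len(si)
    return any(sj[k:k + n] == si for k in range(len(sj) - n + 1))

print(is_contiguous_subsequence("AC", "BAC"))  # True
print(is_contiguous_subsequence("AC", "ABC"))  # False
```

With schedule BAC, the entry for AC is simply the last two operations of the larger entry, so the two patterns can share storage.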
Convex Cuts in DAGs
Let G = (V, E) be a DAG
• A Cut is a Partition of V
• A Convex Cut cannot have edges that cross the boundary of a cut in BOTH directions
• SH Construction Ensures Convex Cuts
[Figure: a DAG, a Non-Convex Cut, and a Convex Cut / Scheduling]
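The convexity test follows directly from the definition. The sketch below uses a hypothetical four-vertex DAG, since the edges of the DAG in the figure are not recoverable from the transcript; the function name is also an assumption.

```python
def is_convex_cut(edges, part):
    """True if edges cross the cut (part, rest) in at most one direction."""
    part = set(part)
    forward = any(u in part and v not in part for u, v in edges)
    backward = any(u not in part and v in part for u, v in edges)
    return not (forward and backward)

# Hypothetical diamond-shaped DAG: a -> b -> c and a -> d -> c.
EDGES = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c")]

print(is_convex_cut(EDGES, {"a", "b"}))  # True: all crossing edges leave the part
print(is_convex_cut(EDGES, {"a", "c"}))  # False: edges cross in both directions
```

A convex cut matters for scheduling because one side can be emitted in full before the other side begins.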
Convex Cuts and Compatibility
[Figure: DAG G1 with vertices a through g, cut two ways, shown as G1→(2,3) and G1→(4,5). Applying both cuts at once, G1→(2,3),(4,5), produces a CYCLE!]
G1 → G2G3
G1 → G4G5
Generalized Compatibility
Given a Set of Productions with G1 on the LHS…
G1 → G2G3
G1 → G4G5
…
G1 → G2kG2k+1
How can we Tell if they are Compatible?
Three Criteria Equivalent to Compatibility
1. G1→(2,3),(4,5),…,(2k,2k+1) is Acyclic
2. G2 ⊆ G4 ⊆ … ⊆ G2k
3. G2k+1 ⊆ … ⊆ G5 ⊆ G3
The Pragmatic Question:
If all Productions are NOT Compatible, what is the Largest Compatible Subset?
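One way to test a set of productions is to look for the nesting described by criteria 2 and 3, read here as subset chains (the relation symbols are lost in the transcript, so this reading is an assumption). The sketch represents each production G1 → left right as a pair of vertex sets; the representation, the function name, and the example cuts are all hypothetical.

```python
def compatible_order(productions):
    """Sort productions so the left parts grow and the right parts shrink.

    `productions` is a list of (left, right) vertex sets. Returns the sorted
    list if the nesting holds for every neighbouring pair, otherwise None.
    """
    ordered = sorted(productions, key=lambda p: len(p[0]))
    for (l1, r1), (l2, r2) in zip(ordered, ordered[1:]):
        if not (l1 <= l2 and r2 <= r1):   # subset tests on Python sets
            return None
    return ordered

V = set("abcdefg")
cut_a = ({"a"}, V - {"a"})
cut_b = ({"a", "b", "c"}, V - {"a", "b", "c"})
cut_c = ({"b", "d", "f"}, V - {"b", "d", "f"})

print(compatible_order([cut_b, cut_a]) is not None)  # True: {a} nests in {a,b,c}
print(compatible_order([cut_b, cut_c]) is not None)  # False: neither nests
```

Sorting by size is enough because in any chain of subsets the sizes are non-decreasing. The sketch does not check the acyclicity condition of criterion 1 separately.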
The Subset/Subgraph View of Compatibility and Scheduling
[Figure: Gi is a subgraph of Gj; Gj is split into Gi and Gj − Gi, with schedules Si and Sj-i placed back to back]
1. Construct a Schedule Si for Gi
2. Construct a Schedule Sj-i for Gj-i
3. Construct a Schedule Sj = SiSj-i for Gj
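The three steps can be sketched with the standard-library `graphlib` module (Python 3.9 or later). The example DAG and the function names are hypothetical, and the sketch assumes no edge runs from Gj − Gi back into Gi; otherwise the two parts would have to be concatenated in the opposite order.

```python
from graphlib import TopologicalSorter

def schedule(vertices, edges):
    """Topologically order `vertices`, using only edges internal to them."""
    vertices = set(vertices)
    preds = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            preds[v].add(u)
    return list(TopologicalSorter(preds).static_order())

def concatenated_schedule(gi, gj, edges):
    """Build Sj = Si Sj-i so that Si is a contiguous prefix of Sj."""
    si = schedule(gi, edges)                          # step 1
    sj_minus_i = schedule(set(gj) - set(gi), edges)   # step 2
    return si + sj_minus_i                            # step 3

# Hypothetical DAG: a -> b -> d and a -> c -> d.
EDGES = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(concatenated_schedule({"a", "b"}, {"a", "b", "c", "d"}, EDGES))
# ['a', 'b', 'c', 'd']
```

Because Si is emitted as one block, the dictionary entry for Gj contains the entry for Gi as a contiguous run.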
A Production Compatibility Graph
Represent the Subgraph Relation as a DAG called the Production Compatibility Graph (PCG)
• Productions G1 → Gi… and G1 → Gj… create vertices Gi and Gj
• Add an Edge (Gi, Gj) to the PCG if
1. Gi ⊂ Gj
2. There is no Gk such that Gj ⊃ Gk ⊃ Gi
Any PATH in the PCG Corresponds to a Subset of Patterns that can be Scheduled Contiguously within a Dictionary entry for G1.
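The two edge conditions describe the covering relation of the subgraph order: an edge joins a pattern to the next larger pattern containing it, with nothing in between. A minimal sketch, treating each pattern as a set of vertices and reading the lost relation symbols as proper containment (an assumption); the pattern names P, Q, R, S are hypothetical.

```python
def build_pcg(patterns):
    """Return PCG edges (i, j): pattern i is properly contained in pattern j
    and no third pattern lies strictly between them."""
    edges = set()
    for i, gi in patterns.items():
        for j, gj in patterns.items():
            if gi < gj and not any(gi < gk < gj for gk in patterns.values()):
                edges.add((i, j))
    return edges

PATTERNS = {
    "P": frozenset("a"),
    "Q": frozenset("ab"),
    "R": frozenset("abc"),
    "S": frozenset("ad"),
}
print(sorted(build_pcg(PATTERNS)))  # [('P', 'Q'), ('P', 'S'), ('Q', 'R')]
```

There is no edge (P, R) because Q lies between them, so the path P, Q, R carries the same information.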
PCG Example
[Figure: DAG G1 with vertices a through g and five cuts, (G2, G3), (G4, G5), (G6, G7), (G8, G9), (G10, G11); the PCG drawn for them has vertices G2, G4, G6, G8, and G10]
Algorithm Overview
Recall that the Subgraph Hierarchy is a DAG
• Process SH Entries in Topological Order
• All Sub-Patterns Processed Before Each Pattern
Construct a PCG for each SH Entry
• Assign Vertex Weights to Each Pattern based on the Number of Sub-Patterns in the Dictionary Entry
• Find Max Vertex-Weighted Path in the PCG
Determine the Maximum Gain Pattern in the SH
• Remove the Max Gain Pattern – and all Sub-Patterns Selected for its Dictionary Entry
• Repeat until the SH is Empty
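Finding a maximum vertex-weighted path in a DAG is a standard dynamic program over a topological order. The sketch below shows that one step only, not the full algorithm; the weights and edges are hypothetical, weights are assumed non-negative, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

def max_weight_path(weights, edges):
    """Return (total weight, path) of a maximum vertex-weighted path in a DAG."""
    preds = {v: set() for v in weights}
    for u, v in edges:
        preds[v].add(u)
    best = {}  # vertex -> (best total ending here, previous vertex on that path)
    for v in TopologicalSorter(preds).static_order():
        prev = max(preds[v], key=lambda u: best[u][0], default=None)
        base = best[prev][0] if prev is not None else 0
        best[v] = (base + weights[v], prev)
    end = max(best, key=lambda v: best[v][0])
    total = best[end][0]
    path = []
    while end is not None:       # walk the predecessor links backwards
        path.append(end)
        end = best[end][1]
    return total, path[::-1]

WEIGHTS = {"P": 1, "Q": 2, "R": 3, "S": 4}
PCG_EDGES = [("P", "Q"), ("Q", "R"), ("P", "S")]
print(max_weight_path(WEIGHTS, PCG_EDGES))  # (6, ['P', 'Q', 'R'])
```

The patterns on the returned path are the ones that would be scheduled contiguously within a single dictionary entry.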
Experimental Framework
Algorithm Built into the Machine SUIF Compiler
1. Consolidate Each Application using link_suif Pass
• All Unrolled Loops Manually Re-rolled
2. Standard Front End Compilation Script
• One Round of Constant Folding/DCE
3. Instruction Selection for Alpha Architecture
• ARM Back End Recently Released…
4. Detect Recurring Isomorphic Patterns in the IR
• Analysis described in [Brisk et al., 2004]
5. Dictionary Construction as Described Here
Experimental Methodology
Cannot Compare with Substring Matching
• Many Schedules Exist for Each DAG
• Substring Matching Assumes Scheduled Code
• How to Determine the Best Schedule for Each DAG?
• Our Algorithm Determines a Schedule for the Entire Set of DAGs to Maximize Pattern Overlap
Naïve Approach – Each Pattern Gets Its Own Dictionary Entry
Our Approach - Isomorphism/Scheduling
Experimental Results
[Figure: bar chart of Dictionary Size (y-axis: Number of Operations, 0 to 10000) for Epic, G.721, GSM, JPEG, MPEG2 Dec, MPEG2 Enc, Pegwit, PGP, PGP (RSA), and Rasta; two series: Naive and Heuristic]
Applications Taken from MediaBench [Lee et al., 1997]
Compilation Time
Benchmark    Total (sec)   Dictionary (sec)   Dictionary (%)
Epic         9.88          0.524              5.30%
G.721        2.71          0.196              7.23%
GSM          33.6          0.821              2.44%
JPEG         362           16.1               4.45%
MPEG2 Dec    32.3          1.31               4.06%
MPEG2 Enc    65.1          1.99               3.06%
Pegwit       32.6          1.10               3.37%
PGP          198           5.64               2.85%
PGP (RSA)    9.06          0.520              5.74%
Rasta        18.1          0.871              4.81%
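The percentage column is the dictionary-construction time divided by the total compilation time. A short check over the table values, with the variable and function names chosen for illustration:

```python
# (benchmark, total seconds, dictionary seconds, reported percent)
ROWS = [
    ("Epic",      9.88, 0.524, 5.30),
    ("G.721",     2.71, 0.196, 7.23),
    ("GSM",       33.6, 0.821, 2.44),
    ("JPEG",      362,  16.1,  4.45),
    ("MPEG2 Dec", 32.3, 1.31,  4.06),
    ("MPEG2 Enc", 65.1, 1.99,  3.06),
    ("Pegwit",    32.6, 1.10,  3.37),
    ("PGP",       198,  5.64,  2.85),
    ("PGP (RSA)", 9.06, 0.520, 5.74),
    ("Rasta",     18.1, 0.871, 4.81),
]

def dictionary_share(total, dictionary):
    """Dictionary construction time as a percentage of total compile time."""
    return 100.0 * dictionary / total

for name, total, dictionary, reported in ROWS:
    print(f"{name:10s} {dictionary_share(total, dictionary):5.2f}%  (reported {reported:.2f}%)")
```

The recomputed shares agree with the reported column up to rounding of the three-digit inputs.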
Conclusion
Algorithm Given for Dictionary Construction
• What Is Built is Actually an Intermediate Representation of a Dictionary
• Combination of 3 Classically Hard Problems
  • Graph/Subgraph Isomorphism
  • Scheduling
  • Dictionary Construction/Compression
Future Work: Register Allocation and Assignment
• Make a Best Effort to Assign Registers So that Isomorphic Patterns have Identical Register Usage
References
1. Brisk, P., Nahapetian, A., and Sarrafzadeh, M. Instruction Selection for Compilers that Target Architectures with Echo Instructions, SCOPES 2004.
2. Fraser, C. W., Myers, E., and Wendt, A. Analyzing and Compressing Assembly Code, Symposium on Compiler Construction, 1984.
3. Cooper, K. D., and McIntosh, N. Enhanced Code Compression for Embedded RISC Processors, PLDI 1999.
4. De Sutter, B., De Bus, B., and De Bosschere, K. Sifting out the Mud: Low-Level C++ Code Reuse, OOPSLA 2002.
5. Debray, S., Evans, W., Muth, R., and De Sutter, B. Compiler Techniques for Code Compaction, TOPLAS 2000.
6. Lee, C., Potkonjak, M., and Mangione-Smith, W. H. MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems, MICRO-30, 1997.
Questions?