The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1...

28
The Use of Traces for Inlining in Java Programs Borys J. Bradel Tarek S. Abdelrahman Edward S. Rogers Sr.Department of Electrical and Computer Engineering University of Toronto Toronto, Ontario, Canada

Transcript of The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1...

Page 1: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

The Use of Traces for Inlining in Java Programs

Borys J. Bradel

Tarek S. Abdelrahman

Edward S. Rogers Sr.Department of Electrical and Computer Engineering

University of Toronto

Toronto, Ontario, Canada

Page 2: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

2

Introduction

Feedback-directed systems provide information to a compiler regarding program behaviour

Examples: Jikes RVM [AFG+00] Open Runtime Platform [Mic03]

Source Code

Compiler

Program

Feedback

Page 3: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

3

Work Overview

Explore whether traces are useful in offline feedback directed systems

Create trace collection system for Jikes Use traces to guide Jikes’s built in optimizing

compiler Help with a single optimization, inlining Improves execution time

Page 4: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

4

Outline

Background

Implementation

Results

Related work

Conclusion

Page 5: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

5

Trace Definition

A trace is a frequently executed sequence of unique basic blocks or instructions

a=0i=0

goto B2

a+=ii++

if (i<5) goto B1

return a

B0

B1

B2

B3

Trace 1

public static int foo() {

int a=0;

for (int i=0;i<5;i++)

a++;

return a;

}

Page 6: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

6

Traces and Optimization

Traces may offer a better opportunity for optimization: Enable inter-procedural analysis Reduce the amount of instructions optimized Simplify the control flow graph, allowing for more

optimization

Page 7: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

7

Multiple Methods

Inter-procedural analysis without an additional framework

Increase possibility of optimization B1,A1,B2 can be

simplified to two instructions a+=(5+i) i++

B0

t=returned valuea+=ti++

B3

B4

B1 call g(i)

B2

t=5+ireturn t

A1

Trace 1

Page 8: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

8

Fewer Instructions

Fewer instructions to optimize

May allow for extra optimization If know that B3 is

executed then know that t=5

B0

B6

B1

B5

B6

B2: t=f(...)

Trace 1

B3: t=5

B4

Page 9: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

9

Trace Exits

Traces usually contain many basic blocks

Traces may not execute completely Unlike basic blocks

B0

B6

B1

B5

B6

B2

Trace 1

B3

B4

Page 10: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

10

Trace Collection System

Monitor program execution Record traces Start traces at frequently

occurring events Backward branches Trace exits Returns

Stop at backward branches and trace starts

Captures frequently executed loops and functions

a=0i=0

goto B2

a+=ii++

if (i<5) goto B1

return a

B0

B1

B2

B3

Trace 1

Page 11: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

11

Jikes

BaselineCompiler

OptimizingCompiler

Program

AdaptiveSystem

Page 12: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

12

Jikes and our TCS

BaselineCompiler

OptimizingCompiler

Program

AdaptiveSystem

TCS

Inform TCS

TraceInformation

Page 13: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

13

Jikes – Second Phase

BaselineCompiler

OptimizingCompiler

Program

AdaptiveSystem

TraceInformation

Page 14: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

14

Inlining and Traces

Traces are executed frequently

Therefore invocations on traces should be inlined Reduce invocation

overhead Allow for more

opportunities for optimization

May lead to large code expansion

a:call b()

b: …

method a()…invoke b()…

method b()…

Page 15: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

15

Code Expansion Control

There are ways to control inline expansion

Inline sequences [HG03,BB04]

Selectively inlining: What if compile method a()? What if compile method b()?

a:call b()

b:call c()

c:…

Page 16: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

16

Code Expansion Control

Compile method a() Inline methods b() and c()

Compile method b() No inlining

method a()…invoke b()…

method c()…

method b()…invoke c()…

method b()…invoke c()…

method c()…

Page 17: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

17

Results

Provide inline information to Jikes based on previous executions

Compare our approach to two others: Inline information provided by the Adaptive system

of Jikes A greedy algorithm based on work by Arnold et al.

[Arn00] Evaluate two approaches: Just in Time and

Ahead of Time Measure overhead of system

Page 18: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

18

JIT Inlining – Execution Time

0

0.2

0.4

0.6

0.8

1

1.2

201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean

No

rmal

ized

Tim

e

Adaptive 25.6s Greedy 23.3s Trace 22.7s

0

Page 19: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

19

JIT Inlining – Compilation Time

0

0.5

1

1.5

2

2.5

201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean

No

rmal

ized

Tim

e

Adaptive 0.52s Greedy 0.61s Trace 0.69s

0

Page 20: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

20

JIT Inlining – Code Expansion

0

0.5

1

1.5

2

2.5

201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean

No

rmal

ized

Siz

e

Adaptive 21.3kb Greedy 22.8kb Trace 27.7kb

0

Page 21: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

21

AOT Inlining – Execution Time

0

0.2

0.4

0.6

0.8

1

1.2

201 202 209 222 228 2a1 2a2 2a3 2a4 2a5 mean

No

rmal

ized

Tim

e

Adaptive 29.3s Trace 21.8s

0

Page 22: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

22

AOT Inlining – Compilation Time

0

0.5

1

1.5

2

2.5

3

3.5

4

201 202 209 222 228 2a1 2a2 2a3 2a4 2a5 mean

No

rmal

ized

Tim

e

Adaptive 3.8s Trace 5.6s

0

Page 23: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

23

Overhead

0

0.5

1

1.5

2

2.5

3

201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean

No

rmali

zed

Tim

e

Base 77s Base+ 90s Base+ and TCS 174s

0

Page 24: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

24

Related Work

Arnold et al. [Arn00] Feedback-directed inlining in Java Collected edge counts at method invocations Used a greedy algorithm to select inlines that

maximize invocations relative to code expansion Dynamo [BDB99] Trace collection system PA-RISC architecture Assembly Instructions Compiled traces

Page 25: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

25

Conclusions

Traces are beneficial for inlining: Decreased execution time compared to one

approach Decrease competitive with another approach Increases compilation time and code size

A potential avenue of future research

Page 26: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

26

Future Work

Different trace collection strategies Trace based compilation and execution Reduction of code size Application of traces to other optimizations Usage of an online feedback directed system

Page 27: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

27

References

[MSD00] Matthew Arnold, Stephen Fink, David Grove, Michael Hind, and Peter F. Sweeney. Adaptive optimization in the Jalapeno JVM. ACM SIGPLAN Notices, 35(10):47-65, 2000.

[Mic03] Michael Cierniak et al. The open runtime platform: A flexible high-performance managed runtime environment. Intel Technology Journal, February 2003.

[HG03] Kim Hazelwood and David Grove. Adaptive online context-sensitive inlining. International Symposium on Code Generation and Optimization, p 253-264, 2003.

[BB04] Bradel, B.J.: The use of traces in optimization. Master’s thesis, University of Toronto (2004).

[Arn00] Matthew Arnold et al: A comparative study of static and profile-based heuristics for inlining. SIGPLAN Workshop on Dynamic and Adaptive Compilation and Optimization. (2000) 52-64.

[BDB99] Vasanth Bala, Evelyn Duesterwald, and Sanjeev Banerjia. Transparent dynamic optimization: The design and implementation of dynamo. HP Laboratories Technical Report HPL1999 –78, 1999.

Page 28: The Use of Traces in Optimizationamaral/cascon/CDP04/...JIT Inlining – Compilation Time 0 0.5 1 1.5 2 2.5 201 202 209 213 222 228 2a1 2a2 2a3 2a4 2a5 mean N o r m a l i z e d T i

28

AOT – Compilation Time (Wall Time)

0

0.5

1

1.5

2

2.5

201 202 209 222 228 2a1 2a2 2a3 2a4 2a5 mean

No

rmal

ized

Tim

e

Adaptive 7.3sTrace 8.2s

0