10. The PyPy translation tool chain
Toon Verwaest
Thanks to Carl Friedrich Bolz for his kind permission to reuse and adapt his notes.
© Toon Verwaest
The PyPy tool chain
Roadmap
> What is PyPy?
> The PyPy interpreter
> The PyPy translation tool chain
What is PyPy?
> Reimplementation of Python in Python
> Framework for building interpreters and VMs
> L * O * P configurations
— L dynamic languages
— O optimizations
— P platforms
The PyPy Interpreter
> Python: imperative, object-oriented dynamic language
> Stack-based bytecode interpreter (like JVM, Smalltalk)
def f(x):
    return x + 1

>>> import dis
>>> dis.dis(f)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (1)
              6 BINARY_ADD
              7 RETURN_VALUE
The PyPy Bytecode Compiler
> Written in Python
> .py to .pyc
> Standard, flexible compiler
— Lexer
— Parser
— AST builder
— Bytecode generator
> You only have to build this once
Bytecode interpreter
> Focuses on language semantics. No low-level details!
> Written in RPython
— Run as an ordinary Python program on top of CPython (before translation), it is very slow: about 2000x slower than CPython
> PyPy's Python bytecode compiler and interpreter are not the hot topic of the PyPy project!
The PyPy Translation Tool Chain
> Model-driven interpreter (VM) development
— Focus on language model rather than implementation details
— Executable models (meta-circular Python)
> Translate models to low-level (LL) back-ends
— Considerably lower than Python
— Weave in implementation details (GC, JIT)
— Allow compilation to different back-ends (OO, procedural)
PyPy “Parser”
> Tool chain starts from loaded Python bytecode
> Translator shares Python environment with the target
> Relies on Python's reflective capabilities
> Allows meta-programming (runtime initialization)
def a_decorator(an_f):
    def g(b):
        an_f(b+10)
    return g

@a_decorator
def f(a):
    print a

f(4) -> 14
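The decorator example can be checked in plain Python. The sketch below uses Python 3 syntax, and `collected` is a hypothetical stand-in for the `print` in the slide. It illustrates why a tool that starts from live function objects needs no parser of its own: by the time it looks at `f`, the decorator has already run, so it sees the bytecode of the inner function `g`.

```python
collected = []  # hypothetical stand-in for printed output

def a_decorator(an_f):
    def g(b):
        an_f(b + 10)
    return g

@a_decorator
def f(a):
    collected.append(a)

f(4)

# The module-level name f is now bound to g: the decorator ran at load time.
print(f.__name__)               # g
print(f.__code__.co_varnames)   # ('b',)
print(collected)                # [14]
```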
PyPy Control-Flow Graph
> Consists of Blocks and Links
> Starting from entry_point
> “Static Single Information” form
def f(n):
    return 3*n+2

Block(v1):    # input argument
    v2 = mul(Constant(3), v1)
    v3 = add(v2, Constant(2))
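As a rough illustration of "Blocks and Links", the graph for `f` can be built and interpreted by hand. This is a minimal sketch: the class names are loosely modelled on the slide's vocabulary, but the classes themselves, the `OPS` table and the `run` helper are assumptions made for this example, not PyPy's actual flow-graph classes.

```python
class Variable:
    def __init__(self, name):
        self.name = name

class Constant:
    def __init__(self, value):
        self.value = value

class Operation:
    def __init__(self, opname, args, result):
        self.opname, self.args, self.result = opname, args, result

class Link:
    def __init__(self, args, target):
        self.args, self.target = args, target

class Block:
    def __init__(self, inputargs):
        self.inputargs = inputargs
        self.operations = []
        self.exits = []   # outgoing Links; empty for the return block

OPS = {"mul": lambda x, y: x * y, "add": lambda x, y: x + y}

def run(block, values):
    """Interpret a graph starting at `block` with concrete input values."""
    while True:
        env = dict(zip(block.inputargs, values))
        def read(arg):
            return arg.value if isinstance(arg, Constant) else env[arg]
        for op in block.operations:
            env[op.result] = OPS[op.opname](*[read(a) for a in op.args])
        if not block.exits:
            return values[0]
        link = block.exits[0]   # no branching in this sketch
        values = [read(a) for a in link.args]
        block = link.target

# Graph for: def f(n): return 3*n+2
v1, v2, v3, ret = (Variable(n) for n in ("v1", "v2", "v3", "ret"))
return_block = Block([ret])
start = Block([v1])
start.operations.append(Operation("mul", [Constant(3), v1], v2))
start.operations.append(Operation("add", [v2, Constant(2)], v3))
start.exits.append(Link([v3], return_block))

print(run(start, [4]))  # 14
```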
PyPy CFG: “Static Single Information”
> Remember SSA: PHIs at dominance frontiers
def test(a):
    if a > 0:
        if a > 5:
            return 10
        return 4
    if a < -10:
        return 3
    return 10
> SSI: “PHIs” for all used variables
– Blocks as “functions without branches”
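One way to picture the "blocks as functions" idea is to write `test` as a set of small functions, one per block. This is only a sketch under assumptions: the block names and the `run` driver are invented for illustration, and the single conditional in each function stands in for the block's choice of outgoing link. Each block receives every variable it uses as a fresh parameter (`a0`, `a1`, `a2`), so no value is shared implicitly between blocks.

```python
# Each block is a function that takes every variable it uses as a parameter
# and returns (next_block, arguments); None as next_block means "return".

def block_entry(a0):
    if a0 > 0:
        return block_positive, (a0,)    # link renames a0 -> a1
    return block_nonpositive, (a0,)     # link renames a0 -> a2

def block_positive(a1):
    if a1 > 5:
        return None, (10,)
    return None, (4,)

def block_nonpositive(a2):
    if a2 < -10:
        return None, (3,)
    return None, (10,)

def run(block, *args):
    """Follow links from block to block until a return is reached."""
    while block is not None:
        block, args = block(*args)
    return args[0]

print([run(block_entry, a) for a in (7, 3, -20, 0)])  # [10, 4, 3, 10]
```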
Why type inference?
> Python is dynamically typed
> We want to translate to statically typed code
— For efficiency reasons
What do we need to infer?
> Type for every variable
> Messages sent to an object must be defined in the compile-time type or a supertype
How to infer types?
> Starting from entry_point
— Can reach the whole program
— We know the types of its arguments and return value
> Forward propagation
— Iteratively, until all links in the CFG have been followed at least once
— Results in a large dictionary mapping variables to types
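The forward propagation described above can be sketched on a toy graph. Everything here is an assumption made for illustration: the tuple-based graph encoding, the `annotate` function and the single typing rule (an arithmetic operation on identical types keeps that type) are far simpler than a real annotator. Variables are strings and constants are plain Python numbers, so this toy cannot represent string constants.

```python
# Each block: (input variables, operations, outgoing links).
# An operation is (result, opname, args); a link is (target, args passed).
GRAPH = {
    "start": (["n"], [("v1", "mul", [3, "n"]), ("v2", "add", ["v1", 2])],
              [("return", ["v2"])]),
    "return": (["result"], [], []),
}

def type_of(arg, bindings):
    return bindings[arg] if isinstance(arg, str) else type(arg)

def result_type(opname, arg_types):
    # Toy rule: arithmetic on identical types keeps that type.
    if len(set(arg_types)) == 1:
        return arg_types[0]
    raise TypeError("cannot unify %r in %s" % (arg_types, opname))

def annotate(graph, entry, entry_types):
    bindings = {}                       # variable -> type
    pending = [(entry, entry_types)]    # links still to follow
    while pending:
        name, incoming = pending.pop()
        inputs, operations, links = graph[name]
        changed = False
        for var, typ in zip(inputs, incoming):
            if var not in bindings:
                bindings[var] = typ
                changed = True
            elif bindings[var] is not typ:
                raise TypeError("conflicting types for %s" % var)
        if not changed and inputs:
            continue                    # nothing new flowed into this block
        for result, opname, args in operations:
            bindings[result] = result_type(
                opname, [type_of(a, bindings) for a in args])
        for target, args in links:
            pending.append((target, [type_of(a, bindings) for a in args]))
    return bindings

print(annotate(GRAPH, "start", [int]))
```

Starting the same graph with `[str]` raises a `TypeError`, because `3 * n` would mix `int` and `str` under the toy rule.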
Implications of applying type inference
Applying type inference restricts the kind of input programs that can be translated
RPython: Demo
def plus(a, b):
    return a + b

def entry_point(argv=None):
    print plus(20, 22)      # called with two ints
    print plus("4", "2")    # called with two strings: a and b no longer have a single type
RPython: Demo
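from pypy.rlib import objectmodel    # rpython.rlib in later releases

@objectmodel.specialize.argtype(0)   # one copy of plus per type of argument 0
def plus(a, b):
    return a + b

def entry_point(argv=None):
    print plus(20, 22)
    print plus("4", "2")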
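The effect of specializing on an argument type can be imitated at run time in ordinary Python 3. This is a hedged sketch, not how RPython implements it: `specialize_argtype` and the `copies` attribute are hypothetical names, and the real specialization happens during translation rather than on each call. The point is only that each argument type gets its own copy of the function, so every copy stays type inferable.

```python
import types

def specialize_argtype(index):
    """Hypothetical runtime imitation of per-argument-type specialization."""
    def decorator(func):
        copies = {}   # argument type -> private copy of func
        def dispatcher(*args):
            key = type(args[index])
            if key not in copies:
                copies[key] = types.FunctionType(
                    func.__code__, func.__globals__,
                    "%s__%s" % (func.__name__, key.__name__))
            return copies[key](*args)
        dispatcher.copies = copies
        return dispatcher
    return decorator

@specialize_argtype(0)
def plus(a, b):
    return a + b

print(plus(20, 22))      # 42
print(plus("4", "2"))    # 42
print(sorted(c.__name__ for c in plus.copies.values()))
```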
RPython is Zen
> Subset of Python
> Informally: The subset of Python which is type inferable
> Actually: type inferable stabilized bytecode
— Allows load-time meta-programming (see parser)
— Messages sent to an object must be defined in the compile-time type or supertype
RTyper
> Bridge between annotator and low-level code generators
> Different low-level models for different target groups— LLTypeSystem C-style (structures, pointers and arrays)
— OOTypeSystem JVM, CLI, Squeak (trace-off: single inheritance, )
> Does not need to iterate until a fixpoint is reached
> Replaces all operations by low-level ones
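The "replace all operations by low-level ones" step can be pictured as a single pass that picks a typed operation for each high-level one, using the annotator's result. This is a toy sketch: the operation tuples, the `LOWERING` table and the `rtype` function are assumptions for this example, and the low-level names are written in the style of C-level operations purely for illustration.

```python
# High-level operations plus the annotator's result (variable -> type).
operations = [("v2", "mul", ["v1", "c3"]), ("v3", "add", ["v2", "c2"]),
              ("s2", "add", ["s1", "s1"])]
annotations = {"v1": int, "c3": int, "v2": int, "c2": int, "v3": int,
               "s1": str, "s2": str}

# Toy lowering table: (high-level op, argument type) -> low-level op name.
LOWERING = {
    ("add", int): "int_add",
    ("mul", int): "int_mul",
    ("add", float): "float_add",
    ("mul", float): "float_mul",
    ("add", str): "direct_call(ll_strconcat)",
}

def rtype(operations, annotations):
    lowered = []
    for result, opname, args in operations:   # single pass, no fixpoint
        arg_type = annotations[args[0]]
        lowered.append((result, LOWERING[(opname, arg_type)], args))
    return lowered

for op in rtype(operations, annotations):
    print(op)
```

Because the types are already known, the same high-level `add` becomes an integer addition in one place and a call to a string helper in another.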
Back-end Optimizations
> Some general optimizations
— Inlining
— Constant folding
— Escape analysis (allocating objects on the stack)
> Partly assumes that code is generated for an optimizing back-end
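Of the optimizations listed, constant folding is the easiest to sketch. The version below works on a flat list of integer operations; the tuple encoding, the `FOLDABLE` table and the `constant_fold` function are assumptions made for this example rather than the tool chain's actual pass.

```python
import operator

FOLDABLE = {"int_add": operator.add, "int_mul": operator.mul}

def constant_fold(operations):
    """Fold operations whose arguments are all known integers."""
    known = {}       # variable -> constant value
    remaining = []   # operations that must still run at run time
    for result, opname, args in operations:
        # Substitute variables whose value is already known.
        args = [known.get(a, a) if isinstance(a, str) else a for a in args]
        if opname in FOLDABLE and all(isinstance(a, int) for a in args):
            known[result] = FOLDABLE[opname](*args)
        else:
            remaining.append((result, opname, args))
    return remaining, known

ops = [("v1", "int_mul", [3, 4]),
       ("v2", "int_add", ["v1", "n"]),
       ("v3", "int_add", ["v1", 2])]
print(constant_fold(ops))
```

Only `v2` survives as a run-time operation, with `v1` already replaced by `12`.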
Back-end Optimizations: “Object Explosion”
> OO: lots of helper objects
> Allocating objects is expensive
> Replace unneeded objects with direct calls
Exception Handling and Memory Management
> C has no support for:
— automatic memory management
— exception handling
> Translate explicit exception handling to flags and if/else
> Memory management in PyPy spirit:
— not language specific
— weave garbage collector in during translation
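The "flags and if/else" scheme can be imitated in Python to show its shape. This is a sketch under assumptions: `current_exception`, `ll_div` and `ll_safe_ratio` are hypothetical names, and the generated C code is not claimed to look exactly like this. A failing function sets a global flag and returns a dummy value, and every caller checks the flag after the call.

```python
# Hypothetical global "current exception" slot, as generated C code might use.
current_exception = None

def ll_div(a, b):
    global current_exception
    if b == 0:
        current_exception = "ZeroDivisionError"   # set the flag...
        return -1                                  # ...and return a dummy value
    return a // b

def ll_safe_ratio(a, b):
    """Equivalent of: try: return a / b  except ZeroDivisionError: return 0"""
    global current_exception
    result = ll_div(a, b)
    if current_exception is not None:              # check the flag after the call
        if current_exception == "ZeroDivisionError":
            current_exception = None               # exception handled
            return 0
        return -1                                  # propagate: leave flag set
    return result

print(ll_safe_ratio(10, 2), ll_safe_ratio(1, 0))   # 5 0
```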
JIT Compiler
> Makes VMs fast
— Dynamic information is key
> Is an implementation detail
> Still under development
> “As you surely know, the key idea of PyPy is that we are too lazy to write a JIT of our own: so, instead of passing nights writing a JIT, we pass years coding a JIT generator that writes the JIT for us :-)”
Weave in while translating to low-level!
Code Generation
> One C-function per Control-Flow Graph
> All low-level statements can be translated directly
> Gets compiled to binary format with C compiler
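To illustrate "one C function per control-flow graph", the sketch below emits C source for a single-block graph of integer operations. The `emit_c` function, the `C_OPERATORS` table and the use of `long` for every value are assumptions for this example; the real generator handles many more operation kinds and multi-block graphs.

```python
C_OPERATORS = {"int_add": "+", "int_mul": "*"}

def emit_c(name, inputargs, operations, result):
    """Emit one C function for a single-block graph of integer operations."""
    params = ", ".join("long %s" % arg for arg in inputargs)
    lines = ["long %s(%s) {" % (name, params)]
    for res, opname, (left, right) in operations:
        # Each low-level operation maps directly to one C statement.
        lines.append("    long %s = %s %s %s;"
                     % (res, left, C_OPERATORS[opname], right))
    lines.append("    return %s;" % result)
    lines.append("}")
    return "\n".join(lines)

source = emit_c("f", ["v1"],
                [("v2", "int_mul", (3, "v1")), ("v3", "int_add", ("v2", 2))],
                "v3")
print(source)
```

The resulting text is what would then be handed to a C compiler.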
PyPy Performance
> Translator
— Slow
— Uses quite some memory
— Produces lots of source code (200 kloc for 5 kloc source)
— But: our models are executable (2000x slower than CPython)
> Resulting Interpreter
— Currently: two times slower to two times faster than CPython
— First experiments with JIT: up to 500x faster for special cases
— But most importantly: very adaptable!
More PyPy & Getting Involved
> http://codespeak.net/pypy
> http://morepypy.blogspot.com
> irc://irc.freenode.org/pypy
> PyPy sprints
Summary
> PyPy project has two main parts
— Language interpreter models
— PyPy translation tool chain
> PyPy translation tool chain
— Has no typical parser
— Uses SSI
— Applies type inference
  – Limits input from Python to RPython
— Compiles to low-level and object-oriented back-ends
— Weaves in implementation details
What you should know!
> What is the goal of the PyPy project?
> What are the main steps of the PyPy tool chain?
> When is a program RPython?
Can you answer these questions?
> Why do we want to keep the language model separated from implementation details?
> Why wouldn't we want to keep those details separated?
> Why is it not really a problem that the tool chain can only compile RPython code?
License
> http://creativecommons.org/licenses/by-sa/2.5/
Attribution-ShareAlike 2.5
You are free:
• to copy, distribute, display, and perform the work
• to make derivative works
• to make commercial use of the work
Under the following conditions:
Attribution. You must attribute the work in the manner specified by the author or licensor.
Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.
• For any reuse or distribution, you must make clear to others the license terms of this work.
• Any of these conditions can be waived if you get permission from the copyright holder.
Your fair use and other rights are in no way affected by the above.