JVM: A Platform for Multiple Languages

Copyright © 2012, Oracle and/or its affiliates. All rights reserved.

description

My presentation at SDCC 2012 (http://sdcc.csdn.net/index_en.html). A video recording of this session is available at http://v.csdn.hudong.com/s/article.html?arcid=2810640

Transcript of JVM: A Platform for Multiple Languages

Page 2: JVM: A Platform for Multiple Languages

JVM: A Platform for Multiple Languages
Krystal Mo
Member of Technical Staff
HotSpot JVM Compiler Team

Page 3: JVM: A Platform for Multiple Languages

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Page 4: JVM: A Platform for Multiple Languages

Once there was a time…

Source: http://www.tiobe.com

Java

Page 5: JVM: A Platform for Multiple Languages

Oh wait…

Source: http://www.tiobe.com

Java™

Page 6: JVM: A Platform for Multiple Languages

But really, what we meant was…

Source: http://www.tiobe.com

Java™

Page 7: JVM: A Platform for Multiple Languages

Fortunately, clearer minds prevail

Language Implementations on JVM

Fantom

Fortress

(and many more…)

BeanShell

Jaskell

ANTLR

JudoScript

ABCL

Erjang

X10

myForth

C

jdart

jgo

Nice

Gosu

Jacl

Page 8: JVM: A Platform for Multiple Languages

Agenda

Why make a language on JVM

Language features by emulation

What we did in JDK 7

Building the future

Page 9: JVM: A Platform for Multiple Languages

Agenda

Why make a language on JVM

Language features by emulation

What we did in JDK 7

Building the future

Page 10: JVM: A Platform for Multiple Languages

Why Make a Language At All? (1)

Syntax
– the “easy” part
– pick one that fits your eyes

Semantics and Capabilities
– static vs. dynamic
– sequential vs. parallel
– …one that fits the problem domain

Page 11: JVM: A Platform for Multiple Languages

Why Make a Language At All? (2)
Versus writing a library

A language can use alternative syntax
– whereas a library has to adhere to the syntax of its host language

A language can impose more restrictions
– e.g. controlling capability
– whereas a library has no control over the host language’s capabilities

Page 12: JVM: A Platform for Multiple Languages

Why on JVM?

Mature low-level services
– Dynamic (“JIT”) compilation
– Garbage collection
– Threading
– Debugging support

Cross-platform

Vast array of libraries

Page 13: JVM: A Platform for Multiple Languages

JVM Strengths
Compiler Optimizations

compiler tactics
– delayed compilation
– tiered compilation
– on-stack replacement
– delayed reoptimization
– program dependence graph representation
– static single assignment representation

proof-based techniques
– exact type inference
– memory value inference
– memory value tracking
– constant folding
– reassociation
– operator strength reduction
– null check elimination
– type test strength reduction
– type test elimination
– algebraic simplification
– common subexpression elimination
– integer range typing

flow-sensitive rewrites
– conditional constant propagation
– dominating test detection
– flow-carried type narrowing
– dead code elimination

language-specific techniques
– class hierarchy analysis
– devirtualization
– symbolic constant propagation
– autobox elimination
– escape analysis
– lock elision
– lock fusion
– de-reflection

speculative (profile-based) techniques
– optimistic nullness assertions
– optimistic type assertions
– optimistic type strengthening
– optimistic array length strengthening
– untaken branch pruning
– optimistic N-morphic inlining
– branch frequency prediction
– call frequency prediction

memory and placement transformation
– expression hoisting
– expression sinking
– redundant store elimination
– adjacent store fusion
– card-mark elimination
– merge-point splitting

loop transformations
– loop unrolling
– loop peeling
– safepoint elimination
– iteration range splitting
– range check elimination
– loop vectorization

global code shaping
– inlining (graph integration)
– global code motion
– heat-based code layout
– switch balancing
– throw inlining

control flow graph transformation
– local code scheduling
– local code bundling
– delay slot filling
– graph-coloring register allocation
– linear scan register allocation
– live range splitting
– copy coalescing
– constant splitting
– copy removal
– address mode matching
– instruction peepholing
– DFA-based code generator

Page 14: JVM: A Platform for Multiple Languages

Why in Java?
Instead of C/C++?

Robustness: runtime exceptions are not fatal
Reflection: annotations instead of macros
Tooling: Java IDEs speed up the development process
etc.

Page 15: JVM: A Platform for Multiple Languages

Ease of Development
Excellent Tooling Support

Good IDEs
Good profilers
Good tooling for developing parsers and other language support (e.g. ANTLR)

Page 16: JVM: A Platform for Multiple Languages

Developing a Language on JVM

Syntax
• Mature parser libraries
• e.g. ANTLR, JavaCC

Semantics
• (your work goes here)
• Backed by various libraries
• e.g. ASM, dynalink

Low-level Details
• Backed by JVM

Page 17: JVM: A Platform for Multiple Languages

Case Study: Writing a Compiler in Java
Using Reflection

    for (IfNode n : graph.getNodes(IfNode.class)) { ... }

    class CompareNode extends FloatingNode implements ValueNumberable, Canonicalizable {
        @Input ValueNode x;
        @Input ValueNode y;
        @Data Condition condition;

        public Node canonical(CanonicalizerTool t) { return this; }
    }
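A minimal sketch of how annotations plus reflection can stand in for macros, in the spirit of the @Input fields above. The names Input, AddNode and InputScanner are hypothetical and are not taken from the compiler shown on the slide; the sketch only assumes that edges are discovered by scanning annotated fields at run time.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class InputScanner {
    // Collects the names of all fields declared on nodeClass that carry @Input.
    // Note: getDeclaredFields() does not guarantee any particular order.
    static List<String> inputFieldNames(Class<?> nodeClass) {
        List<String> names = new ArrayList<>();
        for (Field field : nodeClass.getDeclaredFields()) {
            if (field.isAnnotationPresent(Input.class)) {
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(inputFieldNames(AddNode.class));
    }
}

// Hypothetical marker annotation, modeled on the @Input seen on the slide.
// RUNTIME retention is what makes it visible to reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Input {}

// Hypothetical node class: only the fields tagged @Input count as graph edges.
class AddNode {
    @Input Object x;
    @Input Object y;
    String debugName; // plain data, not an input edge
}
```

A generic graph walker can then visit every input of every node type without per-class boilerplate, which is the job a macro would do in C/C++.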

Page 18: JVM: A Platform for Multiple Languages

Agenda

Why make a language on JVM

Language features by emulation

What we did in JDK 7

Building the future

Page 19: JVM: A Platform for Multiple Languages

It can always be done
Even without direct native support from JVM

Java and JVM provide a rich set of primitives to build on
Almost any language feature can be implemented on JVM
– albeit not necessarily efficiently

Page 20: JVM: A Platform for Multiple Languages

David Wheeler

“All problems in computer science can be solved by another level of indirection.”

Page 21: JVM: A Platform for Multiple Languages

Kevlin Henney

“… except for the problem of too many layers of indirection.”

Page 22: JVM: A Platform for Multiple Languages

Case Study: A Bytecode Interpreter in Java
“Double interpretation”

Java source program:
    i = j + 1

Bytecode:
    iload_2
    iconst_1
    iadd
    istore_1

Bytecode interpreter in Java:
    while (true) {
        byte opcode = code[pc++];
        switch (opcode) {
            // ...
            case ILOAD_2:
                int i = locals[2];
                stack[sp++] = i;
                break;
            // ...
        }
    }

Host JVM (also an interpreter)
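The interpreter loop on this slide can be filled out into a small runnable sketch. It handles only the four opcodes needed for i = j + 1 plus return. The opcode values follow the JVM specification; the class name MiniInterpreter and the choice of local slots (i in slot 1, j in slot 2) are assumptions made for illustration.

```java
class MiniInterpreter {
    // Opcode values as defined by the JVM specification.
    static final int ICONST_1 = 0x04;
    static final int ILOAD_2  = 0x1c;
    static final int ISTORE_1 = 0x3c;
    static final int IADD     = 0x60;
    static final int RETURN   = 0xb1;

    // Bytecode for "i = j + 1", with i in local slot 1 and j in local slot 2.
    static final byte[] CODE = {
        (byte) ILOAD_2, (byte) ICONST_1, (byte) IADD, (byte) ISTORE_1, (byte) RETURN
    };

    // Interprets the code against the given local variable array.
    static void run(byte[] code, int[] locals) {
        int[] stack = new int[4];
        int sp = 0;
        int pc = 0;
        while (true) {
            int opcode = code[pc++] & 0xff; // bytes are signed in Java
            switch (opcode) {
                case ILOAD_2:
                    stack[sp++] = locals[2];
                    break;
                case ICONST_1:
                    stack[sp++] = 1;
                    break;
                case IADD: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left + right;
                    break;
                }
                case ISTORE_1:
                    locals[1] = stack[--sp];
                    break;
                case RETURN:
                    return;
                default:
                    throw new IllegalStateException("unsupported opcode " + opcode);
            }
        }
    }

    // Runs the snippet for a given j and returns the resulting i.
    static int evaluate(int j) {
        int[] locals = {0, 0, j};
        run(CODE, locals);
        return locals[1];
    }

    public static void main(String[] args) {
        System.out.println(evaluate(41));
    }
}
```

Every interpreted instruction here costs an array load, a switch dispatch and several stack operations in the host JVM, which is the "double interpretation" overhead the slide refers to.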

Page 23: JVM: A Platform for Multiple Languages

Case Study: A JVM in Java
Ideally, redundant indirections are squeezed out

Java source program:
    i = j + 1

Bytecode:
    iload_2
    iconst_1
    iadd
    istore_1

Compiler in Java

Host machine:
    lea eax, [edx+1]

Page 24: JVM: A Platform for Multiple Languages

Alternative Method Dispatching

e.g. prototype-based dispatch, metaclass, etc.
Emulate with reflection
– Custom lookup / binding
– Then java.lang.reflect.Method.invoke()
– Reflective invocation overhead
  • Security checking
  • Argument boxing / unboxing
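The emulation described above can be sketched as follows: a custom lookup by name and argument count, followed by Method.invoke(). The class ReflectiveDispatch, its send method and the cache are hypothetical. A real language runtime would apply its own lookup rules; this sketch simply takes the first public method whose name and arity match.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

class ReflectiveDispatch {
    // Hypothetical cache: "class#name/arity" -> resolved Method.
    private static final Map<String, Method> CACHE = new HashMap<>();

    // Sends a "message" to the receiver: custom lookup, then reflective invoke.
    static Object send(Object receiver, String name, Object... args) {
        Class<?> type = receiver.getClass();
        String key = type.getName() + "#" + name + "/" + args.length;
        Method method = CACHE.get(key);
        if (method == null) {
            // Custom lookup / binding: the first public method matching name and arity.
            for (Method candidate : type.getMethods()) {
                if (candidate.getName().equals(name)
                        && candidate.getParameterCount() == args.length) {
                    method = candidate;
                    break;
                }
            }
            if (method == null) {
                throw new IllegalArgumentException("no method " + key);
            }
            CACHE.put(key, method);
        }
        try {
            // Arguments travel as Object[], so primitives are boxed and unboxed here.
            return method.invoke(receiver, args);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(send("hello", "length"));
        System.out.println(send("hello", "concat", " world"));
    }
}
```

Even with the cache, each call still pays for access checks and for packing arguments into an Object[], which are the overheads listed on the slide.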

Page 25: JVM: A Platform for Multiple Languages

Tail-calls

Often seen in functional languages
Emulate with trampoline loop
Special case:
– Direct tail-recursions can easily be transformed into loops
– e.g. Scala does this

Page 26: JVM: A Platform for Multiple Languages

Case Study: tail-call
Rewrite into trampoline

Original:
    static int a() { return b(); }
    static int b() { return c(); }
    static int c() { return 42; }

Rewritten:
    static int trampolineLoop(Task t) {
        Context ctx = new Context();
        while (t != null) {
            t = t.invoke(ctx);
        }
        return ctx.value;
    }

    // Pseudocode: #b and #c stand for references to the methods b and c
    static Task a(Context ctx) { return new Task(#b); }
    static Task b(Context ctx) { return new Task(#c); }
    static Task c(Context ctx) { ctx.value = 42; return null; }
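The trampoline above uses #b as pseudocode for a method reference. A runnable sketch using Java 8 method references and lambdas could look like this. Task is written here as a functional interface, and the even/odd pair is an added illustration (not from the slides) showing that deep mutual tail calls run in constant stack space.

```java
class Trampoline {
    // Holds the final result once the chain of tail calls ends.
    static final class Context {
        int value;
    }

    // One bounce: does some work, then returns the next task, or null when done.
    interface Task {
        Task invoke(Context ctx);
    }

    static int trampolineLoop(Task t) {
        Context ctx = new Context();
        while (t != null) {
            t = t.invoke(ctx); // each tail call returns to this loop first
        }
        return ctx.value;
    }

    // The slide's a -> b -> c chain, with method references in place of #b and #c.
    static Task a(Context ctx) { return Trampoline::b; }
    static Task b(Context ctx) { return Trampoline::c; }
    static Task c(Context ctx) { ctx.value = 42; return null; }

    // Added illustration: mutually recursive tail calls. Result is 1 for true, 0 for false.
    static Task even(int n) {
        return ctx -> {
            if (n == 0) { ctx.value = 1; return null; }
            return odd(n - 1);
        };
    }

    static Task odd(int n) {
        return ctx -> {
            if (n == 0) { ctx.value = 0; return null; }
            return even(n - 1);
        };
    }

    public static void main(String[] args) {
        System.out.println(trampolineLoop(Trampoline::a));
        System.out.println(trampolineLoop(even(1_000_000)));
    }
}
```

The price is one Task allocation and one indirect call per bounce, which is the kind of overhead a VM-level tail call would avoid.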

Page 27: JVM: A Platform for Multiple Languages

Case Study: tail-recursion
Rewrite into loop

Original:
    static int fib(int n) { return fibInner(n, 0, 1); }

    static int fibInner(int n, int a, int b) {
        if (n < 2) return b;
        return fibInner(n - 1, b, a + b);
    }

Rewritten:
    static int fib(int n) {
        int a = 0, b = 1;
        while (n >= 2) {
            n = n - 1;
            int temp = a + b;
            a = b;
            b = temp;
        }
        return b;
    }

Page 28: JVM: A Platform for Multiple Languages

Coroutines

Emulate with threads
– Can implement full (“stackful”) coroutine semantics
– Often use thread pooling as an optimization
– Wastes (virtual) memory
– Could leak memory
– e.g. used by JRuby on stock JVMs
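A minimal sketch of the thread-based emulation, assuming strict hand-off between the caller and one dedicated thread per coroutine. ThreadCoroutine, yieldValue and resume are hypothetical names; this is not how any particular runtime implements it. The sketch covers only never-ending bodies: resume() would block forever once the body returns.

```java
import java.util.concurrent.SynchronousQueue;

class ThreadCoroutine<T> {
    // The body of a coroutine; it may call yieldValue from any call depth.
    interface Body<T> {
        void run(ThreadCoroutine<T> self) throws InterruptedException;
    }

    private final SynchronousQueue<T> values = new SynchronousQueue<>();
    private final SynchronousQueue<Boolean> resumes = new SynchronousQueue<>();

    ThreadCoroutine(Body<T> body) {
        Thread worker = new Thread(() -> {
            try {
                resumes.take(); // do not start until first resumed
                body.run(this);
            } catch (InterruptedException e) {
                // coroutine abandoned; let the thread end
            }
        });
        worker.setDaemon(true); // a parked coroutine must not keep the JVM alive
        worker.start();
    }

    // Called inside the coroutine: hand a value to the caller, then wait to be resumed.
    void yieldValue(T value) throws InterruptedException {
        values.put(value);
        resumes.take();
    }

    // Called by the consumer: let the coroutine run until its next yieldValue.
    T resume() {
        try {
            resumes.put(Boolean.TRUE);
            return values.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Example: natural numbers, yielding from a nested helper call ("stackful").
    static ThreadCoroutine<Integer> naturals() {
        return new ThreadCoroutine<Integer>(self -> {
            for (int i = 1; ; i++) {
                emit(self, i);
            }
        });
    }

    static void emit(ThreadCoroutine<Integer> co, int value) throws InterruptedException {
        co.yieldValue(value);
    }

    public static void main(String[] args) {
        ThreadCoroutine<Integer> co = naturals();
        System.out.println(co.resume() + " " + co.resume() + " " + co.resume());
    }
}
```

An abandoned coroutine leaves its thread parked with a full stack reserved, which illustrates the memory waste and leak risk listed above.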

Page 29: JVM: A Platform for Multiple Languages

Coroutines

Emulate with Finite State Machines
– Compile-time transformation
– Can only implement “stackless coroutines”
  • Can only yield from the main method
– e.g. C# does this with its iterator
– e.g. there’s a coroutines library for Java that does the same

Page 30: JVM: A Platform for Multiple Languages

Case Study: C#’s iterator
Original source

    static IEnumerable<int> GetNaturals() {
        int i = 1;
        while (true) {
            yield return i++;
        }
    }

Page 31: JVM: A Platform for Multiple Languages

Case Study: C#’s iterator
Transformed into FSM (simplified from actual generated code)

    static IEnumerable<int> GetNaturals() {
        return new NaturalsIterator(0);
    }

    sealed class NaturalsIterator : IEnumerable<int>, IEnumerable,
            IEnumerator<int>, IEnumerator, IDisposable {
        int _current, _state, _i;

        public NaturalsIterator(int state) { _state = state; }

        int IEnumerator<int>.Current {
            get { return _current; }
        }

        bool IEnumerator.MoveNext() {
            switch (_state) {
                case 0: _i = 1; break;
                case 1: break;
                default: return false;
            }
            _current = _i++;
            _state = 1;
            return true;
        }
    }
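For comparison, the same state machine can be written by hand in Java as a java.util.Iterator. This is a sketch of what a compile-time transformation might emit, not the output of any actual tool. The class name mirrors the C# one for readability.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class NaturalsIterator implements Iterator<Integer> {
    private int state = 0; // 0 = not started, 1 = suspended at the yield
    private int i;         // the local variable "i", hoisted into a field

    @Override
    public boolean hasNext() {
        return true; // the original loop never terminates
    }

    @Override
    public Integer next() {
        switch (state) {
            case 0: // first entry: run the code before the loop
                i = 1;
                state = 1;
                break;
            case 1: // re-entry: resume just after the yield
                break;
            default:
                throw new NoSuchElementException();
        }
        return i++; // corresponds to "yield return i++"
    }

    public static void main(String[] args) {
        Iterator<Integer> naturals = new NaturalsIterator();
        System.out.println(naturals.next() + " " + naturals.next() + " " + naturals.next());
    }
}
```

Because the only saved state is the fields of this one object, a yield can happen only in the transformed method itself, which is the "stackless" restriction noted on the previous slide.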

Page 32: JVM: A Platform for Multiple Languages

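Reference Counting

Emulate with reified reference
– Boxing overhead
– You really don’t want to do this…

    import java.util.concurrent.atomic.AtomicInteger;

    public class CountedReference<T> {
        // AtomicInteger, because ++ and -- on a volatile int are not atomic
        private final AtomicInteger refCount = new AtomicInteger(1);
        private T target; // not final: release() clears it
        // Assumption: cleanup is supplied as a callback, since Object.finalize()
        // is protected and cannot be called on an arbitrary T
        private final Runnable cleanup;

        public CountedReference(T target, Runnable cleanup) {
            this.target = target;
            this.cleanup = cleanup;
        }

        public T addRef() {
            refCount.incrementAndGet();
            return target;
        }

        public void release() {
            if (refCount.decrementAndGet() == 0 && target != null) {
                cleanup.run();
                target = null;
            }
        }
    }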

Page 33: JVM: A Platform for Multiple Languages

Infinite Precision Integer

Emulate with java.math.BigInteger
– Boxing overhead
  • Performance impact
  • Heap bloat
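One common way to limit this cost is to stay on 64-bit arithmetic and promote to BigInteger only on overflow. The sketch below assumes that strategy. DynamicInt and its methods are hypothetical names, and the values are still boxed as Long or BigInteger, so the boxing overhead noted above remains.

```java
import java.math.BigInteger;

class DynamicInt {
    // Adds two values that are each either a Long or a BigInteger.
    static Number add(Number a, Number b) {
        if (a instanceof Long && b instanceof Long) {
            try {
                // Fast path: Math.addExact throws instead of wrapping around.
                return Long.valueOf(Math.addExact(a.longValue(), b.longValue()));
            } catch (ArithmeticException overflow) {
                // fall through to the arbitrary-precision path
            }
        }
        return toBig(a).add(toBig(b));
    }

    private static BigInteger toBig(Number n) {
        return (n instanceof BigInteger) ? (BigInteger) n : BigInteger.valueOf(n.longValue());
    }

    public static void main(String[] args) {
        System.out.println(add(1L, 2L));
        System.out.println(add(Long.MAX_VALUE, 1L));
    }
}
```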

Page 34: JVM: A Platform for Multiple Languages

Closer to the Metal
Reducing redundant emulation

Fewer indirections
Better performance

Page 35: JVM: A Platform for Multiple Languages

The “J” in JVM
Geared towards Java semantics

Semantics match => good performance, easy to implement
Closer to Java => closer to the metal on JVM

Page 36: JVM: A Platform for Multiple Languages

The “J” in JVM
Should optimize for non-Java features, too

Languages on JVM shouldn’t be forced to be like Java to be performant
Improve the VM to accommodate non-Java language features
– In turn, benefits Java itself, e.g. lambdas

Page 37: JVM: A Platform for Multiple Languages

Agenda

Why make a language on JVM

Language features by emulation

What we did in JDK 7

Building the future

Page 38: JVM: A Platform for Multiple Languages

Why dynamic languages?

Fast turnaround time for simple programs
– no compile step required
– direct interpretation possible
– loose binding to the environment

Data-driven programming
– program shape can change along with data shape
– radically open-ended code (plugins, aspects, closures)

Page 39: JVM: A Platform for Multiple Languages

Dynamic languages are here to stay

Source: http://www.tiobe.com

Page 40: JVM: A Platform for Multiple Languages

What slows down a JVM

Non-Java languages require special call sites.
– e.g.: Smalltalk message sending (no static types).
– e.g.: JavaScript or Ruby method call (different lookup rules).

In the past, special calls required simulation overheads
– ...such as reflection and/or extra levels of lookup and indirection
– ...which have inhibited JIT optimizations.

Result: Pain for non-Java developers.
Enter Java 7.

Page 41: JVM: A Platform for Multiple Languages

Key Features

New bytecode instruction: invokedynamic.
– Linked reflectively, under user control.
– User-visible object: java.lang.invoke.CallSite
– Dynamic call sites can be linked and relinked, dynamically.

New unit of behavior: method handle
– The content of a dynamic call site is a method handle.
– Method handles are function pointers for the JVM.
– (Or if you like, each MH implements a single-method interface.)
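A small sketch of these two pieces using the java.lang.invoke API: method handles are looked up like function pointers, and a MutableCallSite is relinked from one target to another. The class CallSiteDemo and its greet methods are hypothetical. Java source cannot emit an invokedynamic instruction directly, so the call site's dynamicInvoker() stands in for one here.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class CallSiteDemo {
    static String greetEn(String name) { return "Hello, " + name; }
    static String greetFr(String name) { return "Bonjour, " + name; }

    // Calls through one call site twice, relinking it in between.
    static String[] demo() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class, String.class);
            MethodHandle en = lookup.findStatic(CallSiteDemo.class, "greetEn", type);
            MethodHandle fr = lookup.findStatic(CallSiteDemo.class, "greetFr", type);

            MutableCallSite site = new MutableCallSite(en); // initially linked to greetEn
            MethodHandle invoker = site.dynamicInvoker();   // always calls the current target

            String first = (String) invoker.invokeExact("JVM");
            site.setTarget(fr);                             // relink the call site
            String second = (String) invoker.invokeExact("JVM");
            return new String[] { first, second };
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        String[] results = demo();
        System.out.println(results[0]);
        System.out.println(results[1]);
    }
}
```

This sketch relinks from the same thread that makes the calls. Publishing a new target to other threads needs extra care, for example MutableCallSite.syncAll or a VolatileCallSite.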

Page 42: JVM: A Platform for Multiple Languages

Dynamic Program Composition

[Diagram: Bytecodes, Dynamic Call Sites, Method Handles, JVM JIT]

A dynamic call site is created for each invokedynamic call bytecode
Each call site is bound to one or more method handles, which point back to bytecoded methods
Bytecodes are created by Java compilers or dynamic runtimes
The JVM seamlessly integrates execution, optimizing to native code as necessary

Page 43: JVM: A Platform for Multiple Languages

Passing the burden to the JVM

Non-Java languages require special call sites.
In the past, special calls required simulation overheads

Now, invokedynamic call sites are fully user-configurable
– ...and are fully optimizable by the JIT.

Result: Much simpler code for language implementors
– ...and new leverage for the JIT.

Page 44: JVM: A Platform for Multiple Languages

What’s in a method call? (before invokedynamic)

Stages (table columns, left to right): Source code, Bytecode, Linking, Executing

Naming: Identifiers; Utf8 constants; JVM “dictionary”
Selecting: Scopes; Class names; Loaded classes; V-table lookup
Adapting: Argument conversion; C2I / I2C adapters; Receiver narrowing
Calling: Jump with arguments

Page 45: JVM: A Platform for Multiple Languages

What’s in a method call? (using invokedynamic)

Stages (table columns, left to right): Source code, Bytecode, Linking, Executing

Naming: ∞; ∞; ∞; ∞
Selecting: ∞; Bootstrap methods; Bootstrap method call
Adapting: ∞; Method handles
Calling: Jump with arguments
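To make the "bootstrap method" entries concrete: a bootstrap method receives a lookup object, the call site's name and its type, and returns a CallSite whose target is a method handle. The sketch below uses the standard three-argument bootstrap signature from java.lang.invoke and calls it by hand, because plain Java source cannot emit invokedynamic. BootstrapDemo and twice are hypothetical names, and resolving by name with findStatic is just one possible selection rule.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class BootstrapDemo {
    // Hypothetical runtime function that a call site might be linked to.
    static int twice(int x) { return 2 * x; }

    // Bootstrap method: the JVM would call this once, when the call site is first linked.
    // The language runtime is free to apply any selection rule it likes here.
    public static CallSite bootstrap(MethodHandles.Lookup caller, String name, MethodType type)
            throws ReflectiveOperationException {
        MethodHandle target = caller.findStatic(BootstrapDemo.class, name, type);
        return new ConstantCallSite(target); // linked permanently to the chosen handle
    }

    // Simulates what an invokedynamic instruction would do: link, then call.
    static int demo() {
        try {
            CallSite site = bootstrap(MethodHandles.lookup(), "twice",
                    MethodType.methodType(int.class, int.class));
            return (int) site.dynamicInvoker().invokeExact(21);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In a real language implementation the bytecode generator (for example one built on ASM) would emit the invokedynamic instruction with a reference to this bootstrap method, and later calls would jump straight to the linked target.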

Page 46: JVM: A Platform for Multiple Languages

Charles Nutter
JRuby Lead, Red Hat

“Invokedynamic is the most important addition to Java in years. It will change the face of the platform.”

Page 47: JVM: A Platform for Multiple Languages

Agenda

Why make a language on JVM

Language features by emulation

What we did in JDK 7

Building the future

Page 48: JVM: A Platform for Multiple Languages

Loose ends in the Java 7 API

Method handle introspection (reflection)
Generalized proxies (more than single-method interfaces)
Class hierarchy analysis (override notification)
Smaller issues:
– usability (MethodHandle.toString, polymorphic bindTo)
– sharp corners (MethodHandle.invokeWithArguments)
– repertoire (tryFinally, more fold/spread/collect options)
Integration with other APIs (java.lang.reflect)

Page 49: JVM: A Platform for Multiple Languages

Support for Lambda in OpenJDK 8

More transforms for SAM types (as needed).
Faster bindTo operation to create bound MHs
– No JNI calls.
– Maybe multiple-value bindTo.
Faster inexact invoke (as needed).

Page 50: JVM: A Platform for Multiple Languages

Let’s continue building our “future VM”

Da Vinci Machine Project: an open source incubator for JVM futures
Contains code fragments (patches).
Movement to OpenJDK requires:
– a standard (e.g., JSR 292)
– a feature release plan (7 vs. 8 vs. ...)
bsd-port for developer friendliness.
[email protected]
http://hg.openjdk.java.net/mlvm/mlvm/hotspot/

Page 51: JVM: A Platform for Multiple Languages

Current Da Vinci Machine Patches

MLVM patches
– meth: method handles implementation
– indy: invokedynamic
– coro: lightweight coroutines (Lukas Stadler)
– inti: interface injection (Tobias Ivarsson)
– tailc: hard tail call optimization (Arnold Schwaighofer)
– tuple: integrating tuple types (Michael Barker)
– hotswap: online general class schema updates (Thomas Wuerthinger)
– anonk: anonymous classes; lightweight bytecode loading

Page 52: JVM: A Platform for Multiple Languages

Caveat: Change is hard and slow
(especially the “last 20%”)

Hacking code is relatively simple.
Removing bugs is harder.
Verifying is difficult (millions of users).
Integrating into a giant system is very hard.
– interpreter, multiple compilers
– managed heap (multiple GC algorithms)
– debugging, monitoring, profiling machinery
– security interactions
Specifying is hard (the last 20%...).
Running the process is time-consuming.

Page 53: JVM: A Platform for Multiple Languages

Further Reading

Multi-Language VM (MLVM) Project on OpenJDK
JVM Language Summit
JSR 292 Cookbook

Page 54: JVM: A Platform for Multiple Languages

References

VM Optimizations for Language Designers, John Pampuch, JVM Language Summit 2008

Method Handles and Beyond, Some basis vectors, John Rose, JVM Language Summit 2011
