JVM: A Platform for Multiple Languages

Post on 10-May-2015


My presentation at SDCC 2012 (http://sdcc.csdn.net/index_en.html). The video recording of this session is available at http://v.csdn.hudong.com/s/article.html?arcid=2810640

Transcript of JVM: A Platform for Multiple Languages

Copyright © 2012, Oracle and/or its affiliates. All rights reserved.

JVM: A Platform for Multiple Languages

Krystal Mo

Member of Technical Staff

HotSpot JVM Compiler Team

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Once there was a time…

Source: http://www.tiobe.com

Java

Oh wait…

Source: http://www.tiobe.com

Java™

But really, what we meant was…

Source: http://www.tiobe.com

Java™

Fortunately, clearer minds prevail

Language Implementations on JVM

Fantom

Fortress

(and many more…)

BeanShell

Jaskell

ANTLR

JudoScript

ABCL

Erjang

X10

myForth

C

jdart

jgo

Nice

Gosu

Jacl

Agenda

Why make a language on JVM

Language features by emulation

What we did in JDK 7

Building the future

Why Make a Language At All? (1)

Syntax

– the “easy” part

– pick one that fits your eyes

Semantics and Capabilities

– static vs. dynamic

– sequential vs. parallel

– …one that fits the problem domain

Why Make a Language At All? (2)

Language can use alternative syntax

– whereas a library has to adhere to some host language

Language can impose more restrictions

– e.g. controlling capability

– whereas a library has no control over the host language’s capabilities

Versus writing a library

Why on JVM?

Mature low-level services

– Dynamic (“JIT”) compilation

– Garbage collection

– Threading

– Debugging Support

Cross-platform

Vast array of libraries

JVM Strengths

Compiler Optimizations

compiler tactics
– delayed compilation, tiered compilation, on-stack replacement, delayed reoptimization, program dependence graph representation, static single assignment representation

proof-based techniques
– exact type inference, memory value inference, memory value tracking, constant folding, reassociation, operator strength reduction, null check elimination, type test strength reduction, type test elimination, algebraic simplification, common subexpression elimination, integer range typing

flow-sensitive rewrites
– conditional constant propagation, dominating test detection, flow-carried type narrowing, dead code elimination

language-specific techniques
– class hierarchy analysis, devirtualization, symbolic constant propagation, autobox elimination, escape analysis, lock elision, lock fusion, de-reflection

speculative (profile-based) techniques
– optimistic nullness assertions, optimistic type assertions, optimistic type strengthening, optimistic array length strengthening, untaken branch pruning, optimistic N-morphic inlining, branch frequency prediction, call frequency prediction

memory and placement transformation
– expression hoisting, expression sinking, redundant store elimination, adjacent store fusion, card-mark elimination, merge-point splitting

loop transformations
– loop unrolling, loop peeling, safepoint elimination, iteration range splitting, range check elimination, loop vectorization

global code shaping
– inlining (graph integration), global code motion, heat-based code layout, switch balancing, throw inlining

control flow graph transformation
– local code scheduling, local code bundling, delay slot filling, graph-coloring register allocation, linear scan register allocation, live range splitting, copy coalescing, constant splitting, copy removal, address mode matching, instruction peepholing, DFA-based code generator

Why in Java?

Robustness: Runtime exceptions not fatal

Reflection: Annotations instead of macros

Tooling: Java IDEs speed up the development process

etc.

Instead of C/C++?

Good IDEs

Good Profilers

Good tooling for developing parsers and other language support

Excellent Tooling Support

Ease of Development

ANTLR

Developing a Language on JVM

Syntax

– Mature parser libraries

– e.g. ANTLR, JavaCC

Semantics

– (your work goes here)

Low-level Details

– Backed by JVM

– Backed by various libraries

– e.g. ASM, dynalink

Case Study: Writing a Compiler in Java

Using Reflection

    // Iterate over all nodes of one type in the compiler's graph.
    for (IfNode n : graph.getNodes(IfNode.class)) { ... }

    // Annotations mark which fields are graph inputs and which are plain data.
    class CompareNode extends FloatingNode implements ValueNumberable, Canonicalizable {
        @Input ValueNode x;
        @Input ValueNode y;
        @Data Condition condition;

        public Node canonical(CanonicalizerTool t) { return this; }
    }
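To make the "annotations instead of macros" idea concrete, here is a minimal sketch of how reflection can discover annotated fields at run time. The `Input` annotation, `ConstNode`, `AddNode`, and `InputScanner` are hypothetical stand-ins written for this illustration; they are not the classes of the compiler shown on the slide.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker annotation, standing in for the @Input seen on the slide.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Input {}

// Hypothetical IR nodes for illustration only.
class ConstNode {
    final int value;
    ConstNode(int value) { this.value = value; }
}

class AddNode {
    @Input ConstNode x;
    @Input ConstNode y;
    String debugName = "add"; // not annotated, so the scanner skips it
    AddNode(ConstNode x, ConstNode y) { this.x = x; this.y = y; }
}

final class InputScanner {
    // Returns the values of every field of the node that carries @Input.
    static List<Object> inputsOf(Object node) {
        List<Object> inputs = new ArrayList<>();
        for (Field f : node.getClass().getDeclaredFields()) {
            if (f.isAnnotationPresent(Input.class)) {
                try {
                    f.setAccessible(true);
                    inputs.add(f.get(node));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return inputs;
    }

    public static void main(String[] args) {
        AddNode add = new AddNode(new ConstNode(1), new ConstNode(2));
        System.out.println(inputsOf(add).size() + " inputs found");
    }
}
```

A generic graph walker built this way needs no per-node-class code, which is roughly what a macro would otherwise generate in C/C++.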

Agenda

Why make a language on JVM

Language features by emulation

What we did in JDK 7

Building the future

It can always be done

Java and JVM provide a rich set of primitives to build on

Almost any language feature can be implemented on JVM

– albeit not necessarily efficiently

Even without direct native support from JVM

David Wheeler

“All problems in computer science can be solved by another level of indirection.”

Kevlin Henney

“… except for the problem of too many layers of indirection.”

Case Study: A Bytecode Interpreter in Java

“Double interpretation”

Java Source Program:

    i = j + 1

Bytecode:

    iload_2
    iconst_1
    iadd
    istore_1

Bytecode Interpreter in Java:

    while (true) {
        byte opcode = code[pc++];
        switch (opcode) {
            // ...
            case ILOAD_2:
                int i = locals[2];
                stack[sp++] = i;
                break;
            // ...
        }
    }

Host JVM (also an interpreter)
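The interpreter fragment above can be fleshed out into a small runnable sketch. The class `TinyInterpreter` and its `run` method are hypothetical names for this illustration; the opcode values are those of the JVM instruction set, and only the handful of instructions needed for `i = j + 1` are handled.

```java
// A minimal sketch of a switch-based bytecode interpreter, assuming only the
// four instructions from the slide plus `return`. Not a full JVM interpreter.
final class TinyInterpreter {
    // Opcode values as defined by the JVM instruction set.
    static final int ICONST_1 = 0x04;
    static final int ILOAD_2  = 0x1c;
    static final int ISTORE_1 = 0x3c;
    static final int IADD     = 0x60;
    static final int RETURN   = 0xb1;

    // Executes `code` against the given local variable array and returns it.
    static int[] run(byte[] code, int[] locals) {
        int[] stack = new int[8]; // operand stack
        int sp = 0;               // next free stack slot
        int pc = 0;               // index of the next opcode
        while (true) {
            int opcode = code[pc++] & 0xff; // bytes are signed in Java
            switch (opcode) {
                case ICONST_1:
                    stack[sp++] = 1;
                    break;
                case ILOAD_2:
                    stack[sp++] = locals[2];
                    break;
                case IADD: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left + right;
                    break;
                }
                case ISTORE_1:
                    locals[1] = stack[--sp];
                    break;
                case RETURN:
                    return locals;
                default:
                    throw new IllegalArgumentException("unsupported opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // i lives in local 1, j in local 2, as in the slide's bytecode.
        byte[] code = { ILOAD_2, ICONST_1, IADD, ISTORE_1, (byte) RETURN };
        int[] locals = run(code, new int[] { 0, 0, 41 });
        System.out.println("i = " + locals[1]);
    }
}
```

Every interpreted instruction here costs a fetch, a switch, and several array accesses in the host JVM, which is the extra layer of indirection the slide calls double interpretation.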

Case Study: A JVM in Java

Ideally, redundant indirections are squeezed out

Java Source Program:

    i = j + 1

Bytecode:

    iload_2
    iconst_1
    iadd
    istore_1

Compiler in Java

Host Machine:

    lea eax, [edx+1]

Alternative Method Dispatching

e.g. prototype-based dispatch, metaclass, etc.

Emulate with reflection

– Custom lookup / binding

– Then java.lang.reflect.Method.invoke()

– Reflective invocation overhead: security checking, argument boxing / unboxing
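As a rough sketch of the emulation described above, the following code does a custom lookup by name and arity and then calls `java.lang.reflect.Method.invoke()`. The class `ReflectiveSend` and its `send` method are hypothetical, and the lookup rule (first public method with a matching name and parameter count) is a deliberately naive assumption; a real language runtime would apply its own rules.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical Smalltalk-style "send": the method is chosen at run time
// by name, not by static type.
final class ReflectiveSend {
    static Object send(Object receiver, String name, Object... args) {
        // Custom lookup: first public method with this name and arity.
        for (Method m : receiver.getClass().getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == args.length) {
                try {
                    // Arguments travel as Object[], so primitives are boxed
                    // and the result is boxed on the way back.
                    return m.invoke(receiver, args);
                } catch (IllegalAccessException | InvocationTargetException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("no method named " + name);
    }

    public static void main(String[] args) {
        System.out.println(send("hello", "length"));
        System.out.println(send("hello", "concat", "!"));
    }
}
```

Each call repeats the lookup, performs an access check, and boxes arguments and results, which are the overheads the slide lists.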

Tail-calls

Often seen in functional languages

Emulate with trampoline loop

Special case:

– Direct tail-recursions can easily be transformed into loops

– e.g. Scala does this

Case Study: tail-call

Rewrite into trampoline

    // Before: ordinary tail calls
    static int a() { return b(); }

    static int b() { return c(); }

    static int c() { return 42; }

    // After: each method returns the next task instead of calling it.
    // (#b is pseudo-syntax for a reference to method b.)
    static int trampolineLoop(Task t) {
        Context ctx = new Context();
        while (t != null) {
            t = t.invoke(ctx);
        }
        return ctx.value;
    }

    static Task a(Context ctx) { return new Task(#b); }

    static Task b(Context ctx) { return new Task(#c); }

    static Task c(Context ctx) { ctx.value = 42; return null; }
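The trampoline rewrite above uses `#b` as shorthand for a method reference. A runnable version can be sketched with Java 8 method references, assuming `Task` is a single-method interface and `Context` a simple mutable holder (both are reconstructed here, since the slide does not show their definitions). The `countDown` method is an extra, hypothetical example added to show that the stack does not grow.

```java
// A minimal sketch of the trampoline technique, assuming Java 8 or later.
final class Trampoline {
    // Carries the final result out of the chain of tasks.
    static final class Context { int value; }

    // One step of work; returns the next step, or null when finished.
    @FunctionalInterface
    interface Task { Task invoke(Context ctx); }

    static int trampolineLoop(Task t) {
        Context ctx = new Context();
        while (t != null) {
            t = t.invoke(ctx); // each step returns before the next one starts
        }
        return ctx.value;
    }

    static Task a(Context ctx) { return Trampoline::b; }

    static Task b(Context ctx) { return Trampoline::c; }

    static Task c(Context ctx) { ctx.value = 42; return null; }

    // Hypothetical extra example: n chained "tail calls" at constant stack depth.
    static Task countDown(int n) {
        return ctx -> {
            if (n == 0) return null;
            ctx.value++;              // count the steps taken
            return countDown(n - 1);  // builds the next task, does not run it
        };
    }

    public static void main(String[] args) {
        System.out.println(trampolineLoop(Trampoline::a));
        System.out.println(trampolineLoop(countDown(1_000_000)));
    }
}
```

The price is one heap-allocated task object and one indirect call per step, where a native tail call would need neither.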

Case Study: tail-recursion

Rewrite into loop

    // Before: direct tail recursion
    static int fib(int n) {
        return fibInner(n, 0, 1);
    }

    static int fibInner(int n, int a, int b) {
        if (n < 2) return b;
        return fibInner(n - 1, b, a + b);
    }

    // After: the recursive call becomes a jump back to the loop head
    static int fib(int n) {
        int a = 0, b = 1;
        while (n >= 2) {
            n = n - 1;
            int temp = a + b;
            a = b;
            b = temp;
        }
        return b;
    }

Coroutines

Emulate with threads

– Can implement full (“stackful”) coroutine semantics

– Often use thread pooling as an optimization

– Waste (virtual) memory

– Could leak memory

– e.g. used by JRuby on stock JVMs
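A minimal sketch of the thread-based emulation follows. `ThreadCoroutine`, `Body`, `yieldValue`, `resume`, and `close` are hypothetical names for this illustration, and the sketch assumes the coroutine body never finishes on its own (a finite body would need an extra completion signal). Two `SynchronousQueue` hand-offs keep exactly one of the two threads running at a time.

```java
import java.util.concurrent.SynchronousQueue;

// Hypothetical helper: a coroutine backed by one dedicated thread.
final class ThreadCoroutine<T> {
    interface Body<T> {
        void run(ThreadCoroutine<T> co) throws InterruptedException;
    }

    private final SynchronousQueue<T> toCaller = new SynchronousQueue<>();
    private final SynchronousQueue<Boolean> toBody = new SynchronousQueue<>();
    private final Thread thread;

    ThreadCoroutine(Body<T> body) {
        thread = new Thread(() -> {
            try {
                toBody.take();  // stay parked until the first resume()
                body.run(this);
            } catch (InterruptedException e) {
                // close() was called: let the thread end
            }
        });
        thread.setDaemon(true);
        thread.start();
    }

    // Called on the coroutine's thread, at any call depth.
    void yieldValue(T value) throws InterruptedException {
        toCaller.put(value); // hand the value to the caller
        toBody.take();       // park until resumed again
    }

    // Called by the consumer: runs the body up to its next yieldValue().
    T resume() {
        try {
            toBody.put(Boolean.TRUE);
            return toCaller.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Without this, an abandoned coroutine keeps its parked thread alive.
    void close() { thread.interrupt(); }

    // Yields from inside a helper method, which a stackless scheme cannot do.
    static void countFrom(ThreadCoroutine<Integer> co, int start) throws InterruptedException {
        for (int i = start; ; i++) co.yieldValue(i);
    }

    public static void main(String[] args) {
        ThreadCoroutine<Integer> co = new ThreadCoroutine<Integer>(c -> countFrom(c, 1));
        System.out.println(co.resume() + ", " + co.resume() + ", " + co.resume());
        co.close();
    }
}
```

Each coroutine reserves a full thread stack even while parked, and one that is never closed stays parked indefinitely, which matches the memory concerns listed above.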

Coroutines

Emulate with Finite State Machines

– Compile-time transformation

– Can only implement “stackless coroutines”: can only yield from the main method

– e.g. C# does this with its iterator

– e.g. there’s a coroutines library for Java that does the same

Case Study: C#’s iterator

Original source

    static IEnumerable<int> GetNaturals() {
        int i = 1;
        while (true) {
            yield return i++;
        }
    }

Case Study: C#’s iterator

Transformed into FSM (simplified from actual generated code)

    static IEnumerable<int> GetNaturals() {
        return new NaturalsIterator(0);
    }

    sealed class NaturalsIterator : IEnumerable<int>, IEnumerable,
            IEnumerator<int>, IEnumerator, IDisposable {
        int _current, _state, _i;

        public NaturalsIterator(int state) { _state = state; }

        int IEnumerator<int>.Current {
            get { return _current; }
        }

        bool IEnumerator.MoveNext() {
            switch (_state) {
                case 0: _i = 1; break;   // first call: run the code before the loop
                case 1: break;           // later calls: re-enter the loop body
                default: return false;
            }
            _current = _i++;
            _state = 1;
            return true;
        }
    }
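The same state-machine shape can be written by hand in Java. The sketch below is a hypothetical analogue of the generated C# class, not the output of any compiler or of the Java coroutines library mentioned earlier; `NaturalsIterator` here implements `java.util.Iterator`.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hand-written finite state machine equivalent to a generator that yields 1, 2, 3, ...
final class NaturalsIterator implements Iterator<Integer> {
    private int state = 0; // 0 = not started, 1 = suspended inside the loop
    private int i;         // the generator's local variable, hoisted into a field

    @Override
    public boolean hasNext() {
        return true; // the sequence is unbounded
    }

    @Override
    public Integer next() {
        switch (state) {
            case 0:
                i = 1;     // code that precedes the loop runs once
                state = 1;
                break;
            case 1:
                break;     // resume inside the loop
            default:
                throw new NoSuchElementException();
        }
        return i++;        // corresponds to "yield return i++"
    }

    public static void main(String[] args) {
        Iterator<Integer> naturals = new NaturalsIterator();
        System.out.println(naturals.next() + ", " + naturals.next() + ", " + naturals.next());
    }
}
```

Because the local variable and the resume point live in fields of one object, there is no stack to preserve, which is also why such a generator cannot yield from within a nested call.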

Reference Counting

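Emulate with reified reference

– Boxing overhead

– You really don’t want to do this…

    import java.util.concurrent.atomic.AtomicInteger;

    public class CountedReference<T extends AutoCloseable> {
        // Atomic counter: ++ and -- on a volatile int are not atomic.
        private final AtomicInteger refCount = new AtomicInteger(1);
        private volatile T target; // not final, because release() clears it

        public CountedReference(T target) { this.target = target; }

        public T addRef() {
            refCount.incrementAndGet();
            return target;
        }

        public void release() throws Exception {
            if (refCount.decrementAndGet() == 0 && target != null) {
                // close() stands in for finalize(), which is protected
                // and cannot be called on an arbitrary T.
                target.close();
                target = null;
            }
        }
    }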

Infinite Precision Integer

Emulate with java.math.BigInteger

– Boxing overhead: performance impact, heap bloat
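A short sketch of what this emulation looks like in practice: the same factorial computed with a primitive `long`, which silently wraps around past 20!, and with `java.math.BigInteger`, which stays exact but allocates an object per step. The class and method names are hypothetical.

```java
import java.math.BigInteger;

final class BigFactorial {
    // Fixed-width arithmetic: fast, but overflows without warning for n > 20.
    static long factorialLong(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    // Arbitrary precision: exact, but every multiply creates a new heap object.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("20! as long:       " + factorialLong(20));
        System.out.println("21! as long:       " + factorialLong(21) + " (wrapped)");
        System.out.println("21! as BigInteger: " + factorial(21));
    }
}
```

A language whose integers are unbounded by default has to take the `BigInteger` path, or guard a primitive fast path with overflow checks, on every arithmetic operation.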

Closer to the Metal

Fewer indirections

Better performance

Reducing redundant emulation

The “J” in JVM

Semantics match => good perf, easy to impl

Closer to Java => closer to the metal on JVM

Geared towards Java semantics

The “J” in JVM

Languages on JVM shouldn’t be forced to be like Java to be performant

Improve the VM to accommodate non-Java language features

– In turn, benefits Java itself, e.g. lambdas

Should optimize for non-Java features, too

Agenda

Why make a language on JVM

Language features by emulation

What we did in JDK 7

Building the future

Why dynamic languages?

Fast turnaround time for simple programs

– no compile step required

– direct interpretation possible

– loose binding to the environment

Data-driven programming

– program shape can change along with data shape

– radically open-ended code (plugins, aspects, closures)

Dynamic languages are here to stay

Source: http://www.tiobe.com

What slows down a JVM

Non-Java languages require special call sites.

– e.g.: Smalltalk message sending (no static types).

– e.g.: JavaScript or Ruby method call (different lookup rules).

In the past, special calls required simulation overheads

– ...such as reflection and/or extra levels of lookup and indirection

– ...which have inhibited JIT optimizations.

Result: Pain for non-Java developers.

Enter Java 7.

Key Features

New bytecode instruction: invokedynamic.

– Linked reflectively, under user control.

– User-visible object: java.lang.invoke.CallSite

– Dynamic call sites can be linked and relinked, dynamically.

New unit of behavior: method handle

– The content of a dynamic call site is a method handle.

– Method handles are function pointers for the JVM.

– (Or if you like, each MH implements a single-method interface.)
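To illustrate "function pointers for the JVM", here is a minimal sketch using the `java.lang.invoke` API from JDK 7. It looks up `String.length()` and `String.concat(String)` as method handles, binds a receiver with `bindTo`, and invokes the results. The class `MethodHandleDemo` and its method names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class MethodHandleDemo {
    // Calls String.length() through a method handle instead of a direct call.
    static int lengthOf(String s) {
        try {
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            // invokeExact requires the call's static types to match the handle's
            // type exactly: (String)int here. No boxing takes place.
            return (int) length.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Binds the receiver "Hello, " so that the handle becomes a one-argument function.
    static String greet(String name) {
        try {
            MethodHandle concat = MethodHandles.lookup().findVirtual(
                    String.class, "concat", MethodType.methodType(String.class, String.class));
            MethodHandle greeter = concat.bindTo("Hello, ");
            return (String) greeter.invokeExact(name);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthOf("invokedynamic"));
        System.out.println(greet("JVM"));
    }
}
```

Unlike `java.lang.reflect.Method.invoke()`, access is checked once when the handle is looked up, and the typed `invokeExact` call passes primitives without boxing.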

Dynamic Program Composition

Bytecodes

Dynamic Call Sites

Method Handles

JVM JIT

A dynamic call site is created for each invokedynamic call bytecode

Each call site is bound to one or more method handles, which point back to bytecoded methods

Bytecodes are created by Java compilers or dynamic runtimes

The JVM seamlessly integrates execution, optimizing to native code as necessary
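Java source cannot emit an `invokedynamic` instruction directly, but linking and relinking of a call site can be sketched with `java.lang.invoke.MutableCallSite` and its `dynamicInvoker()`, which behaves like an invokedynamic instruction linked to that site. The class `RelinkDemo` and the methods `english`, `french`, and `relinkTwice` are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

final class RelinkDemo {
    static String english(String name) { return "Hello, " + name; }

    static String french(String name) { return "Bonjour, " + name; }

    // Both targets share the call site's type: (String)String.
    static final MethodType TYPE = MethodType.methodType(String.class, String.class);

    static MethodHandle find(String method) {
        try {
            return MethodHandles.lookup().findStatic(RelinkDemo.class, method, TYPE);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static String call(MethodHandle invoker, String arg) {
        try {
            return (String) invoker.invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Calls through the same invoker before and after relinking the site.
    static String[] relinkTwice() {
        MutableCallSite site = new MutableCallSite(find("english"));
        MethodHandle invoker = site.dynamicInvoker(); // always follows the current target
        String before = call(invoker, "JVM");
        site.setTarget(find("french"));               // relink; the invoker is unchanged
        String after = call(invoker, "JVM");
        return new String[] { before, after };
    }

    public static void main(String[] args) {
        for (String s : relinkTwice()) {
            System.out.println(s);
        }
    }
}
```

In a real language runtime the initial target would be installed by a bootstrap method, and relinking would happen when, for example, a class is modified or an inline cache misses.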

Passing the burden to the JVM

Non-Java languages require special call sites.

In the past, special calls required simulation overheads

Now, invokedynamic call sites are fully user-configurable

– ...and are fully optimizable by the JIT.

Result: Much simpler code for language implementors

– ...and new leverage for the JIT.

What’s in a method call? (before invokedynamic)

Stages: Source code, Bytecode, Linking, Executing

Naming: Identifiers, Utf8 constants, JVM “dictionary”

Selecting: Scopes, Class names, Loaded classes, V-table lookup

Adapting: Argument conversion, C2I / I2C adapters, Receiver narrowing

Calling: Jump with arguments

What’s in a method call? (using invokedynamic)

Stages: Source code, Bytecode, Linking, Executing

Naming: ∞, ∞, ∞, ∞

Selecting: ∞, Bootstrap methods, Bootstrap method call

Adapting: ∞, Method handles

Calling: Jump with arguments

Charles Nutter

JRuby Lead, Red Hat

“Invokedynamic is the most important addition to Java in years. It will change the face of the platform.”

Agenda

Why make a language on JVM

Language features by emulation

What we did in JDK 7

Building the future

Loose ends in the Java 7 API

Method handle introspection (reflection)

Generalized proxies (more than single-method intfs)

Class hierarchy analysis (override notification)

Smaller issues:

– usability (MethodHandle.toString, polymorphic bindTo)

– sharp corners (MethodHandle.invokeWithArguments)

– repertoire (tryFinally, more fold/spread/collect options)

Integration with other APIs (java.lang.reflect)

Support for Lambda in OpenJDK8

More transforms for SAM types (as needed).

Faster bindTo operation to create bound MHs

– No JNI calls.

– Maybe multiple-value bindTo.

Faster inexact invoke (as needed).

Let’s continue building our “future VM”

Da Vinci Machine Project: an open source incubator for JVM futures

Contains code fragments (patches).

Movement to OpenJDK requires:

– a standard (e.g., JSR 292)

– a feature release plan (7 vs. 8 vs. ...)

bsd-port for developer friendliness.

mlvm-dev@openjdk.java.net

http://hg.openjdk.java.net/mlvm/mlvm/hotspot/

Current Da Vinci Machine Patches

MLVM patches

meth: method handles implementation

indy: invokedynamic

coro: lightweight coroutines (Lukas Stadler)

inti: interface injection (Tobias Ivarsson)

tailc: hard tail call optimization (Arnold Schwaighofer)

tuple: integrating tuple types (Michael Barker)

hotswap: online general class schema updates (Thomas Wuerthinger)

anonk: anonymous classes; lightweight bytecode loading

Caveat: Change is hard and slow

Hacking code is relatively simple.

Removing bugs is harder.

Verifying is difficult (millions of users).

Integrating into a giant system is very hard.

– interpreter, multiple compilers

– managed heap (multiple GC algos.)

– debugging, monitoring, profiling machinery

– security interactions

Specifying is hard (the last 20%...).

Running process is time-consuming.

(especially the “last 20%”)

Further Reading

Multi-Language VM (MLVM) Project on OpenJDK

JVM Language Summit

JSR 292 Cookbook

References

VM Optimizations for Language Designers, John Pampuch, JVM Language Summit 2008

Method Handles and Beyond, Some basis vectors, John Rose, JVM Language Summit 2011
