Post on 29-Dec-2015
Interlanguage Working Without Tears:
Blending SML with Java
Andrew Kennedy
Nick Benton
Microsoft Research Cambridge
Goal
Fun for functional programmers: GUIs, 3-d, sound, video, email, crypto, imaging, server-side code, phone, TV, …
Achieved by an interface between SML and Java.
Implemented in MLj, a compiler that generates Java class files from SML’97 source.
Three approaches to interop
1. Bilateral interface with marshalling and explicit calling conventions (e.g. JNI, O’Caml interface for C).
2. Multilateral interface with IDL (e.g. COM, CORBA) together with particular language mappings (e.g. H/Direct, Caml COM, MCORBA).
3. Language integration (MLj).
1. Explicit Bilateral Interface
Two languages have distinct type systems and calling conventions.
Interface by:
  Marshalling data between “compatible” types (e.g. java.lang.String to const char* by copying). Often restricted to a subset of the type system.
  Giving directives for exporting and importing functions with language-specific calling conventions (e.g. _pascal, _cdecl).
Usually tied to particular compiler implementations (e.g. SML/NJ and MLWorks have different C interfaces).
Realistically used only by experts.
Example: JNI
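JNIEXPORT jstring JNICALL
Java_Prompt_getLine(JNIEnv *env, jobject obj, jstring prompt)
{
    char buf[128];
    /* Copy the Java string out as a C string */
    const char *str = (*env)->GetStringUTFChars(env, prompt, 0);
    printf("%s", str);
    /* Release the copy once it is no longer needed */
    (*env)->ReleaseStringUTFChars(env, prompt, str);
    …
    /* Width limit keeps the read inside buf */
    scanf("%127s", buf);
    return (*env)->NewStringUTF(env, buf);
}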
2. IDL-based interop
Idea: Use a language-independent interface definition language (IDL) to describe the signatures of functions that are to be called across the border.
Generate stub code using a language-specific tool.
Good because it separates the interface from the language and supports multilateral interop.
But: the programmer has to write IDL code.
3. Our approach: Integration
Idea: When the “semantic gap” between two languages is small, integrate features of one language into the other.
If done well, can be used by novices.
But: language (and perhaps implementation) specific.
Our languages: SML & Java
Both languages are strongly typed with good correspondences:
  Numeric types match closely
  Strings are immutable vectors
  Arrays have run-time sizing and bounds checking
  Neither language has explicit pointer types
Both languages have automatic storage management.
Exception handling in both languages is similar.
But: there are significant differences too.
Interop in MLj 0.1
MLj 0.1: the bolt-it-on approach.
  Name Java types (Java.int, “java.util.Vector”) and provide coercions between ML and Java types (e.g. Java.fromInt, Java.toInt)
  Provide new constructs corresponding to some Java language constructs (in fact, often closer to JVM bytecodes)
Example
_public _method "handleEvent" (e : Event option) : Java.boolean =
  if Java.toInt(_getfield "id" (valOf e)) =
     Java.toInt(_getfield Event "WINDOW_DESTROY")
  then OS.Process.terminate OS.Process.success
  else _invoke "handleEvent" (_super, e)
Response from some users:
Ugh!
New design for MLj
MLj 1.0: the blending approach.
Don’t just attempt to replicate Java constructs. Instead:
  re-use SML concepts where appropriate
  invent clean new syntax elsewhere
Design goals
Simplicity:
  Lightweight syntax
  Easy to convert Java code into MLj
Compatibility: SML’97 programs typecheck and run without change
Safety: Java-style type safety + avoid NullPointerException
Power: Improve on Java where possible
A non-goal
To pass ML-specific values into Java
  It’s less useful: write code in MLj instead
  It could compromise safety (e.g. by mutating ML values)
  It requires uniform data representations, but we want the chance to optimise the representations
Example code
open javax.swing java.awt java.awt.event

_classtype SampleApplet () : JApplet ()
with local
  val prefix = "Counter: "
  val count = ref 0
  val label = JLabel (prefix ^ "0", JLabel.CENTER)
  fun makeButton (title, increment) =
    let
      val button = JButton (title:string)
      val listener = ActionListener () with
        actionPerformed (e : ActionEvent option) =
          (count := !count + increment;
           label.#setText(prefix ^ Int.toString (!count)))
      end
    in
      button.#addActionListener(listener);
      button
    end
in
  init () =
    let
      val SOME pane = this.#getContentPane ()
      val button1 = makeButton ("Add One", 1)
      val button2 = makeButton ("Add Two", 2)
    in
      pane.#add(button1, BorderLayout.WEST);
      pane.#add(label, BorderLayout.CENTER);
      pane.#add(button2, BorderLayout.EAST)
    end
end
Analogies between ML and Java
Java                 SML
static field         val binding
static method        fun binding
package              structure
class name           type identifier
import               open
void                 unit
null                 NONE
multiple args        tuple
mutability           ref
private fields       local decs
Java features listed without an SML analogue: casts, instanceof, class defs, non-static fields, non-static methods
Types
Java type               ML type
boolean                 bool
byte                    Int8.int
char                    char
double                  real
float                   Real32.real
int                     int
long                    Int64.int
short                   Int16.int
java.lang.String        string
java.lang.Exception     exn
java.math.BigInteger    IntInf.int
java.util.Calendar      Date.date
X[]                     X array
Null values
Java reference values (arrays & objects) can take the value null.
ML doesn’t have this notion, so values of array and class types are interpreted as “non-null instance”.
Then
  datatype 'a option = NONE | SOME of 'a
is used for possibly-null objects and arrays.
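A rough Java-side analogy for this convention, offered only as a sketch: java.util.Optional makes "possibly null" explicit in much the same way that option does in ML. This is not how MLj represents values; the class NullAsOption and its lookup method are hypothetical names invented for the illustration.

```java
import java.util.Optional;

// Hypothetical illustration: Optional<T> stands in for ML's 'a option.
class NullAsOption {
    // A Java-style API that signals "no result" by returning null.
    static String lookup(String key) {
        return key.equals("red") ? "#ff0000" : null;
    }

    // Make the possible null explicit: null becomes empty (like NONE),
    // any other reference becomes a present value (like SOME v).
    static Optional<String> lookupOpt(String key) {
        return Optional.ofNullable(lookup(key));
    }

    public static void main(String[] args) {
        System.out.println(lookupOpt("red").isPresent());
        System.out.println(lookupOpt("mauve").orElse("NONE"));
    }
}
```

A caller of lookupOpt cannot forget the empty case without writing an explicit get or orElse, which is the safety property the option encoding is after.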
Fields and methods
Final fields (Java’s “const”) = ML values
Non-final fields = ML refs
Methods are given function types with
  Tuples for multiple args
  Unit for void arg and result
  Implicit Java-style casts on arguments + T to T option
Fields and methods, cont.
Static fields & methods are just bindings in ML structures (= Java class) embedded in a hierarchy of structures (= Java packages)
Non-static members are accessed through .# notation
Constructors are just bindings with the same name as the type (= Java new C)
Improving on Java: first-class fields and methods, e.g.
  val colours = map (valOf o java.awt.Color.getColor) ["red","green"]
  val labels = map javax.swing.JLabel ["ICFP","PLI"]
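For comparison, later versions of Java (8 onwards) allow something similar through method and constructor references. The sketch below maps a constructor and a static method over a list, in the spirit of the two ML bindings above. The class FirstClassMembers and its method names are hypothetical, and BigInteger is used instead of JLabel only so the example needs no GUI.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of constructors and static methods used as functions.
class FirstClassMembers {
    // Constructor as a function: BigInteger::new acts as String -> BigInteger.
    static List<BigInteger> bigs(List<String> digits) {
        return digits.stream().map(BigInteger::new).collect(Collectors.toList());
    }

    // Static method as a function: Integer::parseInt acts as String -> Integer.
    static List<Integer> ints(List<String> digits) {
        return digits.stream().map(Integer::parseInt).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("12", "34");
        System.out.println(bigs(input));
        System.out.println(ints(input));
    }
}
```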
Casts and typecase
Java-style upcasts, using Caml-like syntax:
  val c = JButton ("My button") :> Component
Also used for downcasts, but a neater alternative is “cast patterns”:
  case (e : Expr) of
    ce :> CondExpr => …
  | ae :> AssignExpr => …
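For reference, the Java idiom that cast patterns replace is a chain of instanceof tests, each followed by a downcast. The sketch below assumes a small hypothetical Expr hierarchy; the fields test and target are invented so that each branch has something to use.

```java
// Hypothetical Expr hierarchy, used only to show the instanceof-then-cast idiom.
class CastPatterns {
    static class Expr {}

    static class CondExpr extends Expr {
        final String test;
        CondExpr(String test) { this.test = test; }
    }

    static class AssignExpr extends Expr {
        final String target;
        AssignExpr(String target) { this.target = target; }
    }

    // Each branch tests the run-time class, then downcasts to reach the subclass members.
    static String describe(Expr e) {
        if (e instanceof CondExpr) {
            CondExpr ce = (CondExpr) e;
            return "conditional on " + ce.test;
        } else if (e instanceof AssignExpr) {
            AssignExpr ae = (AssignExpr) e;
            return "assignment to " + ae.target;
        }
        return "some other expression";
    }

    public static void main(String[] args) {
        System.out.println(describe(new AssignExpr("x")));
    }
}
```

The test and the cast are separate steps here, so nothing stops them from disagreeing; a cast pattern binds the narrowed value in the same place the test is made.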
Creating Java classes in ML
1. Export an ML structure as a class, with functional values interpreted as static methods, non-functional values interpreted as static fields
2. New _classtype construct
Example
_classtype Point(xinit, yinit)
with local
  val x = ref xinit
  val y = ref yinit
in
  getX() = !x
  and getY() = !y
  and move(xinc,yinc) = (x := !x+xinc; y := !y+yinc)
  and moveHoriz xinc = this.#move(xinc, 0)
  and moveVert yinc = this.#move(0, yinc)
end
Example
Single constructor, with args used throughout definition (as in O’Caml)
No fields! (Instead, use local definitions)
Use of this as in Java
Example, cont.
_classtype ColouredPoint(x,y,c) : Point(x,y)
with
  getColour() = c : java.awt.Color
  and move (xinc,yinc) = this.##move(xinc*2, yinc*2)
end
Superclass specified with arguments to its constructor
Overriding of methods
Special syntax for superclass method invocation
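For readers coming from Java, the two class definitions correspond roughly to the following sketch. It is not compiler output. It assumes int coordinates (the ML types are inferred) and that moveHoriz changes only x and moveVert only y; the wrapper class PointDemo is a hypothetical name.

```java
import java.awt.Color;

// Hypothetical Java rendering of the Point and ColouredPoint examples.
class PointDemo {
    static class Point {
        // The ML version keeps these as local refs rather than fields.
        private int x;
        private int y;

        Point(int xinit, int yinit) { x = xinit; y = yinit; }

        int getX() { return x; }
        int getY() { return y; }
        void move(int xinc, int yinc) { x += xinc; y += yinc; }
        void moveHoriz(int xinc) { this.move(xinc, 0); }
        void moveVert(int yinc) { this.move(0, yinc); }
    }

    static class ColouredPoint extends Point {
        private final Color c;

        // Superclass constructor arguments, as in ": Point(x,y)".
        ColouredPoint(int x, int y, Color c) { super(x, y); this.c = c; }

        Color getColour() { return c; }

        // super.move plays the role of this.##move.
        @Override
        void move(int xinc, int yinc) { super.move(xinc * 2, yinc * 2); }
    }

    public static void main(String[] args) {
        ColouredPoint p = new ColouredPoint(1, 2, Color.RED);
        p.moveHoriz(3); // dispatches to the overriding move, so x grows by 6
        System.out.println(p.getX() + "," + p.getY());
    }
}
```

Because moveHoriz goes through this.move, the override in ColouredPoint also affects the inherited helper methods, just as it would for this.#move in the ML version.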
Finale
Classic functional techniques:
  Backtracking & lazy lists to solve Eight Queens
  Combinators for music (à la Hudak)
Interpreted using Java multimedia libraries…
Conclusion
Language interop is hard to get right: it’s a language design problem like any other
We think we’ve done a good job!
See the paper for formalisation in the style of the Definition of Standard ML
Main line of future work: better inference
  Currently, some programs with unique typings are rejected because types are inferred on-the-fly
  Instead, first do a pass over the term generating constraints, then solve them
Available soon in MLj. For now, see http://www.dcs.ed.ac.uk/~mlj