Slides: useR-2013/slides/47.pdf
Renjin: The new R interpreter built on the JVM
Alexander Bertram
BeDataDriven
What?
Renjin is a new interpreter for the R language.
- Core & Builtins: written in Java
- Base, Stats, Graphics: GNU R language packages
Why?
- Performance: memory, speed
- Easier integration
Java Virtual Machine
- GC, JIT, parallelism
- 500k libs, tools
Sure, but why Renjin?
Packages: biglm, bigvis, scaleR
+ High performance for specific applications
- Require rewriting existing code
- Limited applicability
Forks: pqR
+ Marginal improvements for all code
- Unable to address underlying limitations of the GNU R interpreter
What do I get, like, today?
Flexible
> renjin -f myscript.R
Command-Line Interpreter
Embeddable Java Library
Web-based REPL
Multiple In-process Sessions, Shared Data
- One Java Virtual Machine hosts Renjin Session 1, Renjin Session 2, and Renjin Session 3, one per web request.
- The sessions share the same Vector; immutable data structures allow this sharing.
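The sharing idea on this slide can be illustrated with a small, self-contained Java sketch. It is not Renjin's code: `ImmutableDoubleVector`, `SessionSketch`, and `SharedVectorDemo` are hypothetical names invented here, and plain threads stand in for sessions handling web requests. The point it shows is that an object whose state never changes after construction can be read by several sessions at once without locks or per-session copies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical immutable vector: its contents are fixed at construction. */
final class ImmutableDoubleVector {
    private final double[] values;

    ImmutableDoubleVector(double[] values) {
        this.values = values.clone(); // defensive copy, so callers cannot mutate our state
    }

    int length() { return values.length; }

    double get(int index) { return values[index]; }
}

/** Hypothetical stand-in for one interpreter session serving one web request. */
final class SessionSketch {
    private final ImmutableDoubleVector shared;

    SessionSketch(ImmutableDoubleVector shared) { this.shared = shared; }

    /** Reads the shared vector; no synchronization is needed because nothing writes to it. */
    double sum() {
        double total = 0;
        for (int i = 0; i < shared.length(); i++) {
            total += shared.get(i);
        }
        return total;
    }
}

final class SharedVectorDemo {
    /** Runs the given number of concurrent sessions over ONE shared vector. */
    static List<Double> runSessions(ImmutableDoubleVector shared, int sessions) {
        ExecutorService pool = Executors.newFixedThreadPool(sessions);
        try {
            List<Future<Double>> pending = new ArrayList<>();
            for (int s = 0; s < sessions; s++) {
                SessionSketch session = new SessionSketch(shared);
                Callable<Double> request = session::sum;
                pending.add(pool.submit(request));
            }
            List<Double> results = new ArrayList<>();
            for (Future<Double> result : pending) {
                results.add(result.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("session failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        ImmutableDoubleVector v = new ImmutableDoubleVector(new double[] {1, 2, 3, 4});
        System.out.println(runSessions(v, 3)); // prints [10.0, 10.0, 10.0]
    }
}
```

All three sessions hold a reference to the same vector object, so memory use does not grow with the number of sessions.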
Memory Efficiency
#                              GNU R      Renjin
x <- runif(1e8)              # +721 MB    +721 MB
y <- x + 1                   # +761 MB
comment(y) <- "important!"   # +763 MB
Vector Interface
- getAttributes()
- length()
- getElement(int index)
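One way an interface like this can avoid extra copies is to let `x + 1` and `comment(y) <- ...` return lightweight views over the original data. The Java sketch below illustrates that idea under stated assumptions: `VectorSketch`, `ArrayVector`, `PlusConstantView`, and `WithAttributeView` are hypothetical names, attributes are simplified to a `Map<String, String>`, and elements are plain doubles. Renjin's actual classes may differ.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified version of the three-method interface listed on the slide. */
interface VectorSketch {
    Map<String, String> getAttributes();
    int length();
    double getElement(int index);
}

/** Backed by a real array: the only class here that stores one value per element. */
final class ArrayVector implements VectorSketch {
    private final double[] values;

    ArrayVector(double[] values) { this.values = values.clone(); }

    public Map<String, String> getAttributes() { return Collections.emptyMap(); }
    public int length() { return values.length; }
    public double getElement(int index) { return values[index]; }
}

/** View for "x + c": keeps a reference and a constant, computes each element on demand. */
final class PlusConstantView implements VectorSketch {
    private final VectorSketch source;
    private final double constant;

    PlusConstantView(VectorSketch source, double constant) {
        this.source = source;
        this.constant = constant;
    }

    public Map<String, String> getAttributes() { return source.getAttributes(); }
    public int length() { return source.length(); }
    public double getElement(int index) { return source.getElement(index) + constant; }
}

/** View for setting one attribute: new attribute map, same element data. */
final class WithAttributeView implements VectorSketch {
    private final VectorSketch source;
    private final Map<String, String> attributes;

    WithAttributeView(VectorSketch source, String name, String value) {
        Map<String, String> merged = new HashMap<>(source.getAttributes());
        merged.put(name, value);
        this.source = source;
        this.attributes = Collections.unmodifiableMap(merged);
    }

    public Map<String, String> getAttributes() { return attributes; }
    public int length() { return source.length(); }
    public double getElement(int index) { return source.getElement(index); }
}

final class VectorViewDemo {
    public static void main(String[] args) {
        VectorSketch x = new ArrayVector(new double[] {0.25, 0.5, 0.75});
        VectorSketch y = new PlusConstantView(x, 1.0);                      // like y <- x + 1
        VectorSketch commented = new WithAttributeView(y, "comment", "important!");
        System.out.println(commented.getElement(1));    // prints 1.5
        System.out.println(commented.getAttributes());  // prints {comment=important!}
    }
}
```

In this sketch each view costs a few object fields regardless of vector length, at the price of recomputing an element every time it is read.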
packages.renjin.org
- Proper Dependency Management
- Pre-built Package Repository
- Automated Testing of Renjin
- Translation of C/Fortran to JVM Bytecode
Seamless Access to Java/Scala Classes
import(com.acme.Customer)
bob <- Customer$new(name='Bob', age=36)
carol <- Customer$new(name='Carole', age=41)
bob$name <- "Bob II"
cat(c("Name: ", bob$name, "; Age: ", bob$age))
Simple to embed in larger systems
import java.io.FileReader;
import java.io.InputStreamReader;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

// create a script engine manager
ScriptEngineManager factory = new ScriptEngineManager();
// create an R engine
ScriptEngine engine = factory.getEngineByName("Renjin");
// load package from classpath
engine.eval("library(survey)");
// evaluate R code from String
engine.eval("print('Hello, World')");
// evaluate R script on disk
engine.eval(new FileReader("myscript.R"));
// evaluate R script from classpath
engine.eval(new InputStreamReader(
    getClass().getResourceAsStream("myScript.R")));
Package Development in Java
@DataParallel
@Deferrable
public static String chartr(
    String oldChars, String newChars, @Recycle String x) {
  StringBuilder translation = new StringBuilder(x.length());
  // step by whole code points so surrogate pairs are not visited twice
  for (int i = 0; i < x.length(); ) {
    int codePoint = x.codePointAt(i);
    int charIndex = oldChars.indexOf(codePoint);
    if (charIndex == -1) {
      // no mapping: copy the character unchanged
      translation.appendCodePoint(codePoint);
    } else {
      // replace with the character at the same position in newChars
      translation.appendCodePoint(newChars.codePointAt(charIndex));
    }
    i += Character.charCount(codePoint);
  }
  return translation.toString();
}
Under the hood
Specialized Execution Modes
“Slow” AST Interpreter
- Supports full dynamism of R
- Compute on the language
Vector Pipeliner
- Acts like a query planner
- Batches, auto-parallelizes vector workflows
Scalar Compiler
- Partially evaluates & compiles loop bodies, apply functions to JVM byte code
Queuing up work for the Vector Pipeliner
x <- runif(1e6)
y <- sqrt(x + 1)
z <- mean(y) - mean(x)
attr(z, 'comments') <- 'still not computed'
print(length(z)) # prints "1" but doesn't evaluate the mean
print(z) # triggers computation
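The behaviour in this snippet (length is known immediately, the value is computed only when printed) can be mimicked with a small lazy wrapper. The Java sketch below is an illustration, not Renjin's pipeliner: `DeferredScalar` and `DeferredDemo` are hypothetical names, and for brevity only `mean` and the subtraction are deferred, while `sqrt(x + 1)` is computed eagerly.

```java
import java.util.function.DoubleSupplier;

/** Hypothetical deferred scalar: its length is known without running the computation. */
final class DeferredScalar {
    private final DoubleSupplier computation;
    private boolean computed = false;
    private double value;

    DeferredScalar(DoubleSupplier computation) { this.computation = computation; }

    /** A scalar always has length 1, so no work is needed to answer this. */
    int length() { return 1; }

    boolean isComputed() { return computed; }

    /** Forces the computation on first use and caches the result. */
    double get() {
        if (!computed) {
            value = computation.getAsDouble();
            computed = true;
        }
        return value;
    }

    /** Queues a subtraction; neither operand is evaluated here. */
    DeferredScalar minus(DeferredScalar other) {
        return new DeferredScalar(() -> this.get() - other.get());
    }
}

final class DeferredDemo {
    /** Returns a deferred mean; the loop runs only when get() is called. */
    static DeferredScalar mean(double[] xs) {
        return new DeferredScalar(() -> {
            double total = 0;
            for (double x : xs) {
                total += x;
            }
            return total / xs.length;
        });
    }

    public static void main(String[] args) {
        double[] x = {0.0, 3.0, 8.0};
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = Math.sqrt(x[i] + 1); // eager in this sketch
        }
        DeferredScalar z = mean(y).minus(mean(x));
        System.out.println(z.length());     // prints 1
        System.out.println(z.isComputed()); // prints false
        System.out.println(z.get());        // triggers computation
        System.out.println(z.isComputed()); // prints true
    }
}
```

Because the queued work is only a chain of closures until `get()` is called, a planner has the chance to inspect, batch, or parallelize it first, which is the role the slides assign to the Vector Pipeliner.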
Real-world case study:
Distance Correlation in the Energy Package
Distance correlation: robust measure of association. Zero if and only if variables are independent.
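For reference, the sample statistic computed by the `dcor` function in this case study (with `index = 1`) can be written as follows. The symbols $a_{kl}$, $A_{kl}$, and $B_{kl}$ are notation assumed here to mirror the code's `Akl` helper; they do not appear on the slides. $B_{kl}$ is defined from $y$ in the same way as $A_{kl}$ is from $x$.

```latex
\begin{align*}
a_{kl} &= \lVert x_k - x_l \rVert, &
A_{kl} &= a_{kl} - \bar{a}_{k\cdot} - \bar{a}_{\cdot l} + \bar{a}_{\cdot\cdot} \\
\mathrm{dCov}_n^2(x, y) &= \frac{1}{n^2} \sum_{k,l=1}^{n} A_{kl} B_{kl}, &
\mathrm{dVar}_n^2(x) &= \frac{1}{n^2} \sum_{k,l=1}^{n} A_{kl}^2 \\
\mathrm{dCor}_n(x, y) &= \frac{\mathrm{dCov}_n(x, y)}{\sqrt{\mathrm{dVar}_n(x)\,\mathrm{dVar}_n(y)}}
\end{align*}
```

When the denominator is zero, the code sets the distance correlation to 0.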
dcor <- function (x, y, index = 1) {
  x <- as.matrix(dist(x))
  y <- as.matrix(dist(y))
  n <- nrow(x)
  m <- nrow(y)
  dims <- c(n, ncol(x), ncol(y))
  Akl <- function(x) {
    d <- as.matrix(x)^index
    m <- rowMeans(d)
    M <- mean(d)
    a <- sweep(d, 1, m)
    b <- sweep(a, 2, m)
    return(b + M)
  }
  A <- Akl(x)
  B <- Akl(y)
  dCov <- sqrt(mean(A * B))
  dVarX <- sqrt(mean(A * A))
  dVarY <- sqrt(mean(B * B))
  V <- sqrt(dVarX * dVarY)
  if (V > 0)
    dCor <- dCov/V
  else dCor <- 0
  return(list(dCov = dCov, dCor = dCor,
              dVarX = dVarX, dVarY = dVarY))
}
Slide annotations:
- dist(x): evaluates as a view
- rowMeans(x): deferred until later
- Need to evaluate
[Figure: Run time of distance correlation of 10 pairs of variables. x-axis: Number of Observations (1000, 2000, 5000, 10000); y-axis: 0 to 200; series: GNU R, C, Renjin]
Where do we go from here?
Inspired by…
Join us!
- Download & Test
- Contribute!
- Contract us for Commercial Support
- Sponsor Development!
> Renjin.org