Julia - Easier, Better, Faster, Stronger
Easier, Better, Faster, Stronger
Kenta Sato
July 02, 2014
Agenda
1. The Julia Language
2. Easier
   Familiar Syntax
   Just-In-Time Compiler
3. Better
   Types for Technical Computing
   Library Support
   Type System
4. Faster
   Benchmark
   N Queens Puzzle
5. Stronger
   Multiple Dispatch
   Macros
Notations
Here I use the following special notation in examples.
<expression> #> <value>: The <expression> is evaluated to the <value>.
<expression> #: <output>: When the <expression> is evaluated, it prints the <output> to the screen.
<expression> #! <error>: When the <expression> is evaluated, it throws the <error>.
Examples:
    42                         #> 42
    2 + 3                      #> 5
    "hello, world"             #> "hello, world"
    println("hello, world")    #: hello, world
    42 + "hello, world"        #! ERROR: no method +(Int64, ASCIIString)
The Julia Language
"Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library.
The core of the Julia implementation is licensed under the MIT license. Various libraries used by the Julia environment include their own licenses such as the GPL, LGPL, and BSD (therefore the environment, which consists of the language, user interfaces, and libraries, is under the GPL)."
Source: http://julialang.org/
Easier - Familiar Syntax
At a glance, the syntax of Julia will feel familiar.
The usage of for, while, and if is very close to that of Ruby or Python.
continue, break, and return work as you expect.
Defining a function is also straightforward: the function name is followed by its arguments.
You can specify the types of arguments, but this is optional.
@inbounds is a macro; macros always start with the @ character.
    function sort!(v::AbstractVector, lo::Int, hi::Int, ::InsertionSortAlg, o::Ordering)
        @inbounds for i in lo+1:hi
            j = i
            x = v[i]
            while j > lo
                if lt(o, x, v[j-1])
                    v[j] = v[j-1]
                    j -= 1
                    continue
                end
                break
            end
            v[j] = x
        end
        return v
    end
(base/sort.jl)
Easier - Familiar Syntax
n:m creates a range that is inclusive on both ends.
Python's range(n, m) includes the left end but excludes the right end, which is often confusing.
[... for x in xs] creates an array from xs, which can be anything iterable.
This notation is known as list comprehension in Python and Haskell.
    4:8                     #> 4:8
    [x for x in 4:8]        #> [4,5,6,7,8]
    [4:8]                   #> [4,5,6,7,8]
    [x * 2 for x in 4:8]    #> [8,10,12,14,16]
Easier - Familiar Syntax
The index of an array always starts with 1, not 0.
That means when you allocate an array with size n, all indices in 1:n are accessible.
You can use a range to copy a part of an array.
The step of a range can be placed between the start and stop (i.e. start:step:stop).
You can also specify a negative step, which creates a reversed range.
There is a special index, end, indicating the last index of an array.
    xs = [8, 6, 4, 2, 0]
    xs[1:3]        #> [8,6,4]
    xs[4:end]      #> [2,0]
    xs[1:2:end]    #> [8,4,0]
    xs[end:-2:1]   #> [0,4,8]
Easier - Just-In-Time Compiler
To run a program written in Julia, there is no need to compile it beforehand. You only have to give the entry point file to Julia's JIT (Just-In-Time) compiler:
    % cat myprogram.jl
    n = 10
    xs = [1:n]
    println("the total between 1 and $n is $(sum(xs))")
    % julia myprogram.jl
    the total between 1 and 10 is 55
From version 0.3, the standard libraries are precompiled when you build Julia, which greatly shortens the startup time of your program.
    % time julia myprogram.jl
    the total between 1 and 10 is 55
            0.80 real         0.43 user         0.10 sys
Better - Types for Technical Computing
Julia supports various numerical types with different sizes.
Integer types
    Type      Signed?   Number of bits   Smallest value   Largest value
    Int8      yes       8                -2^7             2^7 - 1
    Uint8     no        8                0                2^8 - 1
    Int16     yes       16               -2^15            2^15 - 1
    Uint16    no        16               0                2^16 - 1
    Int32     yes       32               -2^31            2^31 - 1
    Uint32    no        32               0                2^32 - 1
    Int64     yes       64               -2^63            2^63 - 1
    Uint64    no        64               0                2^64 - 1
    Int128    yes       128              -2^127           2^127 - 1
    Uint128   no        128              0                2^128 - 1
    Bool      N/A       8                false (0)        true (1)
    Char      N/A       32               '\0'             '\Uffffffff'
Better - Types for Technical Computing
Floating-point types
    Type      Precision   Number of bits
    Float16   half        16
    Float32   single      32
    Float64   double      64
    10000            #> 10000
    typeof(10000)    #> Int64
    0x12             #> 0x12
    typeof(0x12)     #> Uint8
    0x123            #> 0x0123
    typeof(0x123)    #> Uint16
    1.2              #> 1.2
    typeof(1.2)      #> Float64
    1.2e-10          #> 1.2e-10
Complex numbers and rational numbers are also available:
    1 + 2im   # 1 + 2i
    6//9      # 2/3
http://julia.readthedocs.org/en/latest/manual/integers-and-floating-point-numbers/#integers-and-floating-point-numbers
Better - Types for Technical Computing
If you need more precise values, arbitrary-precision arithmetic is supported. There are two data types that offer this arithmetic:
BigInt - arbitrary precision integer
BigFloat - arbitrary precision floating point numbers
    big_prime = BigInt("5052785737795758503064406447721934417290878968063369478337")
    typeof(big_prime)    #> BigInt
    precise_pi = BigFloat("3.14159265358979323846264338327950288419716939937510582097")
    typeof(precise_pi)   #> BigFloat
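To see why these types matter, here is a minimal sketch contrasting fixed-width and arbitrary-precision arithmetic. It assumes a current Julia 1.x release on a 64-bit machine (the slides target v0.3), and the variable names are only illustrative.

```julia
# Int64 arithmetic wraps around on overflow instead of raising an error.
wrapped = typemax(Int64) + 1
println(wrapped == typemin(Int64))   # prints true

# big(2) is a BigInt, so the power is computed exactly.
exact = big(2)^64
println(typeof(exact))               # prints BigInt
println(exact)                       # prints 18446744073709551616

# BigFloat keeps more significant bits than Float64 by default.
println(precision(BigFloat) > precision(Float64))   # prints true
```

The wrap-around is silent, so switching to BigInt is a decision you make explicitly when the range of Int64 is not enough.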
If you need customized types, you can create a new type. A user-defined type is instantiated by calling a function with the same name as the type, called a constructor:
    type Point
        x::Float64
        y::Float64
    end

    # Point is the constructor.
    p1 = Point(1.2, 3.4)
    p2 = Point(0.2, -3.1)
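As a rough illustration of how such a type is used, the sketch below defines a function over it. It assumes Julia 1.x, where the `type` keyword shown on the slide has been renamed (`struct` for immutable types, `mutable struct` for mutable ones); the `distance` helper is hypothetical and not part of the talk.

```julia
# Julia 1.x spelling of a composite type with two Float64 fields.
struct Point
    x::Float64
    y::Float64
end

# Hypothetical helper: Euclidean distance between two points.
distance(p::Point, q::Point) = sqrt((p.x - q.x)^2 + (p.y - q.y)^2)

p1 = Point(0.0, 0.0)
p2 = Point(3, 4)             # the default constructor converts the Ints to Float64
println(p2.x)                # prints 3.0
println(distance(p1, p2))    # prints 5.0
```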
Better - Library Support
Julia bundles various libraries. They are incorporated into the standard library, so you rarely need to know the details of the underlying APIs.
Numerical computing
    OpenBLAS - basic linear algebra subprograms
    LAPACK - linear algebra routines for solving systems
    Intel® Math Kernel Library (optional) - fast math library for Intel processors
    SuiteSparse - linear algebra routines for sparse matrices
    ARPACK - subroutines designed to solve large scale eigenvalue problems
    FFTW - library for computing discrete Fourier transforms
Other tools
    PCRE - Perl-compatible regular expressions library
    libuv - asynchronous IO library
Better - Library Support
Here are some functions of the linear algebra library.
    a = randn((50, 1000))     # 50x1000 matrix
    b = randn((50, 1000))     # 50x1000 matrix
    x = randn((1000, 1000))   # 1000x1000 matrix

    # dot product
    dot(vec(a), vec(b))
    # matrix multiplication
    a * x
    # LU factorization
    lu(x)
    # eigenvalues and eigenvectors
    eig(x)
Note: the vec function converts a multi-dimensional array into a vector without copying.
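The remark that vec does not copy can be checked with a small sketch like the following (assuming Julia 1.x; the matrix values are arbitrary). Writing through the vector changes the original matrix, whereas an explicit copy is independent.

```julia
a = [1 2; 3 4]        # 2x2 matrix, stored column by column
v = vec(a)            # 4-element vector over the same memory
println(v)            # prints [1, 3, 2, 4]

v[1] = 100            # writing through the vector ...
println(a[1, 1])      # prints 100 (... changes the matrix too)

c = copy(vec(a))      # an explicit copy is independent of a
c[2] = -1
println(a[2, 1])      # prints 3
```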
Better - Type System
Julia's type system uses dynamic type checking: type safety is verified at runtime.
But each value has a concrete type, and that type is not implicitly converted to another type at runtime.
You can almost always assume that types must be converted explicitly.
There are two notable exceptions: arithmetic operators and constructors.
    x = 12
    typeof(x)   #> Int64
    y = 12.0
    typeof(y)   #> Float64

    # this function only accepts an Int64 argument
    function foo(x::Int64)
        println("the value is $x")
    end

    foo(x)   #: the value is 12
    foo(y)   #! ERROR: no method foo(Float64)
Better - Type System
Arithmetic operators are functions in Julia.
For example, addition of Float64 is defined as +(x::Float64, y::Float64) at float.jl:125.
But you can also use these operators on values of different types.
This automatic type conversion is called promotion, which is defined by the promote_rule function.
    x = 12
    y = 12.0
    x + y   #> 24.0
    x - y   #> 0.0
    x * y   #> 144.0
    x / y   #> 1.0
The promotion rule is defined as:
    promote_rule(::Type{Float64}, ::Type{Int64}) = Float64
Constructors also do type conversion implicitly.
    type Point
        x::Float64
        y::Float64
    end

    Point(x, y)   #> Point(12.0, 12.0)
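Promotion can also be observed directly. The sketch below (assuming Julia 1.x on a 64-bit machine) uses the standard promote, promote_type, and convert functions to show what happens before a mixed-type addition; the variable names px and py are only illustrative.

```julia
x = 12      # Int64 on a 64-bit machine
y = 12.0    # Float64

# promote converts all arguments to a common type.
px, py = promote(x, y)
println(typeof(px))                      # prints Float64
println(promote_type(Int64, Float64))    # prints Float64

# Mixed arithmetic gives the same result as arithmetic on the promoted values.
println(x + y == px + py)                # prints true

# Outside arithmetic and constructors, conversion has to be requested explicitly.
println(convert(Float64, x))             # prints 12.0
```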
Better - Type System
Types can be parameterized by other types or values. These are called type parameters.
For example, an array has two type parameters: the element type and the number of dimensions.
The Array{T,D} type contains elements typed as T, and is a D-dimensional array.
    typeof([1, 2, 3])                 #> Array{Int64,1}
    typeof([1.0, 2.0, 3.0])           #> Array{Float64,1}
    typeof(["one", "two", "three"])   #> Array{ASCIIString,1}
    typeof([1 2; 3 4])                #> Array{Int64,2}
Julia allows you to define parameterized types as follows:
    type Point{T}
        x::T
        y::T
    end

    Point{Int}(1, 2)           #> Point{Int64}(1,2)
    Point{Float64}(4.2, 2.1)   #> Point{Float64}(4.2,2.1)
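A parameterized type pairs naturally with methods that are generic over the same parameter. The following is a minimal sketch assuming Julia 1.x syntax (`struct` and `where`); the names GenericPoint and add are hypothetical and chosen to avoid clashing with the Point examples in the slides.

```julia
# Julia 1.x spelling of a parameterized composite type.
struct GenericPoint{T}
    x::T
    y::T
end

# One method covers every element type; T is bound at the call site.
add(p::GenericPoint{T}, q::GenericPoint{T}) where {T} =
    GenericPoint{T}(p.x + q.x, p.y + q.y)

a = GenericPoint(1, 2)           # T is inferred as Int
b = GenericPoint(4.2, 2.1)       # T is inferred as Float64
println(a isa GenericPoint{Int})             # prints true
println(add(a, GenericPoint(10, 20)).x)      # prints 11
println(add(b, b).y)                         # prints 4.2

# GenericPoint(1, 2.0) would throw a MethodError,
# because both fields must share the same T.
```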
Faster - Benchmark
The performance of Julia is comparable to other compiled languages like C and Fortran, and much faster than other interpreted languages.
[Figure: benchmark times relative to C (smaller is better, C performance = 1.0), on a logarithmic axis from 10^-3 to 10^8. Languages: Fortran, Go, JavaScript, Julia, Mathematica, Matlab, Octave, Python, R. Benchmarks: fib, mandel, pi_sum, rand_mat_mul, rand_mat_stat, printfd, quicksort, parse_int.]
Faster - Benchmark
Table: benchmark times relative to C (smaller is better, C performance = 1.0).
                    Fortran     Julia   Python   R        Matlab    Octave    Mathematica   JavaScript     Go
                    gcc 4.8.1   0.2     2.7.3    3.0.2    R2012a    3.6.4     8.0           V8 3.7.12.22   go1
    fib             0.26        0.91    30.37    411.36   1992.00   3211.81   64.46         2.18           1.03
    parse_int       5.03        1.60    13.95    59.40    1463.16   7109.85   29.54         2.43           4.79
    quicksort       1.11        1.14    31.98    524.29   101.84    1132.04   35.74         3.51           1.25
    mandel          0.86        0.85    14.19    106.97   64.58     316.95    6.07          3.49           2.36
    pi_sum          0.80        1.00    16.33    15.42    1.29      237.41    1.32          0.84           1.41
    rand_mat_stat   0.64        1.66    13.52    10.84    6.61      14.98     4.52          3.28           8.12
    rand_mat_mul    0.96        1.01    3.41     3.98     1.10      3.41      1.16          14.60          8.51
Note: C compiled by gcc 4.8.1, taking best timing from all optimization levels (-O0 through -O3). C, Fortran and Julia use OpenBLAS v0.2.8. The Python implementations of rand_mat_stat and rand_mat_mul use NumPy (v1.6.1) functions; the rest are pure Python implementations.
Faster - N Queens Puzzle
Place N queens on an N × N chessboard so that no two queens attack each other, and return the number of possible arrangements.
[Figure: some of the solutions when N = 8.]
Weisstein, Eric W. "Queens Problem." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/QueensProblem.html
Faster - N Queens Puzzle
When N gets bigger, the number of solutions grows drastically.
It may take a long time to get the answer when N is sufficiently large.
The algorithm uses a lot of arithmetic, iteration, recursive function calls, and branching.
So this puzzle is suitable for testing the efficiency of a programming language.
The number of solutions of the N queens puzzle:
    N            4   5    6   7    8    9     10    11      12       13       14        15
    #Solutions   2   10   4   40   92   352   724   2,680   14,200   73,712   365,596   2,279,184
Faster - N Queens Puzzle
Program in Julia.
solve(n::Int): put n queens on a board, then return the number of solutions.
search(places, i, n): put a queen on the ith row.
isok(places, i, j): check whether you can put a queen at (i, j).
This algorithm is not optimal; you can exploit the symmetry of positions, but this is enough to time the speed of Julia.
Julia:
    function solve(n::Int)
        places = zeros(Int, n)
        search(places, 1, n)
    end

    function search(places, i, n)
        if i == n + 1
            return 1
        end

        s = 0
        @inbounds for j in 1:n
            if isok(places, i, j)
                places[i] = j
                s += search(places, i + 1, n)
            end
        end
        s
    end

    function isok(places, i, j)
        qi = 1
        @inbounds for qj in places
            if qi == i
                break
            elseif qj == j || abs(qi - i) == abs(qj - j)
                return false
            end
            qi += 1
        end
        true
    end
Note: in isok, you can iterate over enumerate(places) instead, but that killed the performance of the code.
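For reference, the enumerate-based variant mentioned for isok might look like the sketch below. The name isok_enumerate and the sample board are hypothetical, and the code assumes Julia 1.x; the performance remark in the slides refers to v0.3 and has not been re-measured here.

```julia
# Same check as isok, but the row counter comes from enumerate.
function isok_enumerate(places, i, j)
    for (qi, qj) in enumerate(places)
        if qi == i
            break
        elseif qj == j || abs(qi - i) == abs(qj - j)
            return false
        end
    end
    true
end

places = [2, 4, 0, 0]                     # queens already on (1, 2) and (2, 4)
println(isok_enumerate(places, 3, 1))     # prints true
println(isok_enumerate(places, 3, 3))     # prints false (same diagonal as (2, 4))
```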
Faster - N Queens Puzzle
Python and C++ are competitors in our benchmark.
Python:
    def solve(n):
        places = [-1] * n
        return search(places, 0, n)

    def search(places, i, n):
        if i == n:
            return 1

        s = 0
        for j in range(n):
            if isok(places, i, j):
                places[i] = j
                s += search(places, i + 1, n)
        return s

    def isok(places, i, j):
        for qi, qj in enumerate(places):
            if qi == i:
                break
            elif qj == j or abs(qi - i) == abs(qj - j):
                return False
        return True
C++:
    #include <cstdlib>  // abs
    #include <vector>   // std::vector

    // forward declarations, so that solve and search compile in this order
    int search(std::vector<int>& places, int i, int n);
    bool isok(const std::vector<int>& places, int i, int j);

    int solve(int n)
    {
        std::vector<int> places(n, -1);
        return search(places, 0, n);
    }

    int search(std::vector<int>& places, int i, int n)
    {
        if (i == n)
            return 1;

        int s = 0;
        for (int j = 0; j < n; j++) {
            if (isok(places, i, j)) {
                places[i] = j;
                s += search(places, i + 1, n);
            }
        }
        return s;
    }

    bool isok(const std::vector<int>& places, int i, int j)
    {
        int qi = 0;
        for (int qj : places) {
            if (qi == i)
                break;
            else if (qj == j || abs(qi - i) == abs(qj - j))
                return false;
            qi++;
        }
        return true;
    }
Faster - N Queens Puzzle
I measured the total time to get the answers corresponding to N = 4, 5, ..., 14.
Julia - v0.3 (commit: da158df6b5b7f918989a73317a799c909d639e5f)
    % time julia.jl eightqueen.jl 14 > /dev/null
           10.05 real         9.89 user         0.11 sys
Python - v3.4.1
    % time python3 eightqueen.py 14 > /dev/null
         1283.34 real      1255.18 user         2.67 sys
C++ - v503.0.40
    % clang++ -O3 --std=c++11 --stdlib=libc++ eightqueen.cpp
    % time ./a.out 14 > /dev/null
            8.24 real         8.17 user         0.01 sys
Faster - N Queens Puzzle
And N = 15.
Julia
    % time julia.jl eightqueen.jl 15 > /dev/null
           64.75 real        63.73 user         0.17 sys
C++
    % time ./a.out 15 > /dev/null
           54.31 real        53.89 user         0.05 sys
Note that the Julia result includes JIT compilation time, whereas the C++ program was compiled before execution.
Note: the execution time of Python was not measured because Python took too long when N = 15.
Platform Info: System: Darwin (x86_64-apple-darwin13.2.0), CPU: Intel(R) Core(TM) i5-2435M CPU @ 2.40GHz
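One rough way to see the JIT compilation cost inside a single session is to time the same call twice with the standard @elapsed macro. This is only a sketch assuming Julia 1.x; sum_of_squares is a hypothetical workload, not the benchmark program, and the first call is usually but not necessarily the slower one.

```julia
# Hypothetical workload; any function would do.
function sum_of_squares(n)
    s = 0
    for k in 1:n
        s += k * k
    end
    s
end

first_call = @elapsed sum_of_squares(10_000)    # includes JIT compilation
second_call = @elapsed sum_of_squares(10_000)   # reuses the compiled code

println("first call:  $first_call seconds")
println("second call: $second_call seconds")
println(sum_of_squares(10))                     # prints 385
```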
Stronger - Multiple Dispatch
We often want to use a single function name to handle different types.
    Adding floats and adding integers are completely different procedures, but we always want to use the + operator in both cases.
Leaving some parameters optional is useful.
    maximum(A, dims) computes the maximum value of an array A over the given dimensions.
    maximum(A) computes the maximum value of an array A, ignoring dimensions.
A unified API means there is less to memorize.
    fit(model, x, y) trains model based on the input x and the output y.
    The model may be a Generalized Linear Model, Lasso, Random Forest, SVM, and so on.
Julia satisfies these demands using multiple dispatch: a method is selected according to the number of arguments and their types.
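The unified fit(model, x, y) idea can be sketched with two toy models. Everything here is hypothetical (Model, MeanModel, LastValueModel, and this fit are not from any real package), and the code assumes Julia 1.x syntax; it only shows how one function name can carry one method per model type.

```julia
abstract type Model end              # Julia 1.x syntax for an abstract type

struct MeanModel <: Model end        # toy model: predicts the mean of y
struct LastValueModel <: Model end   # toy model: predicts the last observed y

# One generic function name, one method per model type.
fit(::MeanModel, x, y) = sum(y) / length(y)
fit(::LastValueModel, x, y) = y[end]

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 9.0]
println(fit(MeanModel(), x, y))        # prints 5.0
println(fit(LastValueModel(), x, y))   # prints 9.0
```

Adding a new model then means adding a new type and a new method, without touching the existing ones.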
Stronger - Multiple Dispatch
When the foo function is called, one of the following methods is selected based on the number of arguments.
    function foo()
        println("foo 0:")
    end

    function foo(x)
        println("foo 1: $x")
    end

    function foo(x, y)
        println("foo 2: $x $y")
    end

    foo()           #: foo 0:
    foo(100)        #: foo 1: 100
    foo(100, 200)   #: foo 2: 100 200
Stronger - Multiple Dispatch
Multiple dispatch also distinguishes the types of the arguments: the method whose type signature matches the given values is selected.
    function foo(x::Int, y::Int)
        println("foo Int Int: $x $y")
    end

    function foo(x::Float64, y::Float64)
        println("foo Float64 Float64: $x $y")
    end

    function foo(x::Int, y::Float64)
        println("foo Int Float64: $x $y")
    end

    foo(1, 2)       #: foo Int Int: 1 2
    foo(1.0, 2.0)   #: foo Float64 Float64: 1.0 2.0
    foo(1, 2.0)     #: foo Int Float64: 1 2.0
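When several methods match, Julia picks the most specific one. The sketch below (assuming Julia 1.x) uses a hypothetical function named describe, with an abstract Number fallback, to make that selection visible.

```julia
# Hypothetical function; the name avoids clashing with foo.
describe(x::Number, y::Number) = "Number Number"   # general fallback
describe(x::Int, y::Number)    = "Int Number"      # more specific first argument
describe(x::Int, y::Int)       = "Int Int"         # most specific

println(describe(1, 2))       # prints Int Int
println(describe(1, 2.0))     # prints Int Number
println(describe(1.5, 2))     # prints Number Number
println(length(methods(describe)))   # prints 3
```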
Stronger - Macros
Macros allow you to inspect or modify your code from within Julia itself.
In the following example, the assert macro receives the given expression (x > 0), then evaluates it in place. When the result is false, it throws an assertion error. Note that the error message contains the captured expression (x > 0) that evaluated to false; this information is useful for debugging.
    x = -5
    @assert x > 0   #! ERROR: assertion failed: x > 0
You can also specify an error message to show instead of the expression:
    x = -5
    @assert x > 0 "x must be positive"   #! ERROR: assertion failed: x must be positive
Stronger - Macros
The assert macro is defined as follows in the standard library.
The macro is called with an expression (ex) and zero or more messages (msgs...).
If no message is given, the expression itself becomes the error message (msg).
Then the error message is constructed.
Finally, the assertion code is spliced into the calling place.
    macro assert(ex,msgs...)
        msg = isempty(msgs) ? ex : msgs[1]
        if !isempty(msgs) && isa(msg, Expr)
            # message is an expression needing evaluating
            msg = :(string("assertion failed: ", $(esc(msg))))
        elseif isdefined(Base,:string)
            msg = string("assertion failed: ", msg)
        else
            # string() might not be defined during bootstrap
            msg = :(string("assertion failed: ", $(Expr(:quote,msg))))
        end
        :($(esc(ex)) ? $(nothing) : error($msg))
    end
(base/error.jl)
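The same technique, capturing the expression as data and splicing code into the call site, can be tried with a much smaller macro. This is a sketch assuming Julia 1.x; the macro name labeled is hypothetical and not part of the standard library.

```julia
# Hypothetical macro: pairs the source text of an expression with its value.
macro labeled(ex)
    text = string(ex)                       # runs at macro-expansion time
    :(string($text, " = ", $(esc(ex))))     # spliced in and run at the call site
end

x = -5
println(@labeled(x > 0))      # prints x > 0 = false
println(@labeled(x + 10))     # prints x + 10 = 5
```

As in the assert macro, esc(ex) makes the expression refer to the caller's x rather than to a variable inside the macro.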
Future - :)
https://twitter.com/stuinzuri/status/459352855124525056