Amis Consulting LLP 1977-1981: Research in Combustion/Fluids. 1983-1991: Scientific computing image...
Amis Consulting LLP
1977-1981 : Research in Combustion/Fluids.
1983-1991 : Scientific computing image processing
1992-1997 : UK Healthcare / Imperial College
1997-2003 : Dotcom Boom (and bust) !!!
2003- : Financial Systems.
Currently involved in High Performance Computing and (of course) Big Data, as well as all the other stuff.
Worked with a variety of technologies
● Languages (in anger) : Fortran / C / Ada / Perl / Python / Lisp / Java / PHP / Groovy / NodeJS
… our GOTO languages remain C and Perl but ???
● Back-ends : Unix (not just Linux) and Windows (so some .NET)
● Databases : Both relational and NoSQL (Redis, Mongo, Neo4j)
● Moving into the cloud: AWS: Map-reduce, Redshift, Google App Server
Then along came R …
● At Kings in the late 2000s
● Interest was in HPC (mainly CUDA) applied to financial systems.
● Started using Matlab but was looking for a similar type of package for personal/company usage.
● Gnu/Octave and R both fitted the bill; R won – at the time.
● Looked at (and was impressed by) Python
History
● Gang of “four”:
– Jeff Bezanson, Viral Shah
– Stefan Karpinski, Alan Edelman
● Started at MIT in 2010
● First release February 2012
● Still actively maintained by G4
● MIT using Julia in courses (on YouTube)
What happened to Ada?
● Designed 1977/83 for the US DoD in order to supersede the hundreds of languages the DoD used.
● Mandated its use in 1987.
● Dropped the mandate in 1997.
● Still used in air traffic control systems such as iFacts, GNATS.
● Nearest meetup group is in Stockholm.
Runners and Riders
Current field:
1. Runners: Matlab, R, Python
2. Riders: C/C++, Java
3. Outsiders: Scala, Clojure
4. Non-starter: Perl
What makes a good Data Science Language? (1)
● Be a general purpose language with a sizable user community and an array of general purpose libraries, including good GUI libraries, networking and web frameworks.
● Be free, open-source and platform independent.
● Be fast and efficient.
● Have a good, well-designed library for scientific computing, including non-uniform random number generation and linear algebra.
● Have a strong type system, and be statically typed with good compile-time type checking and type safety.
● Have reasonable type inference.
● Have a REPL for interactive use
What makes a good Data Science Language? (2)
● Have good tool support - including build tools, doc tools, testing tools, and an intelligent IDE.
● Have excellent support for functional programming, including support for immutability and immutable data structures and “monadic” design
● Allow imperative programming for occasions where it makes sense.
● Be designed with concurrency and parallelism in mind, having excellent language and library support for building really scalable concurrent and parallel applications.
● Have excellent built-in data capabilities.
● Have comprehensive math and statistical routines.
Comparison with Matlab
● Julia syntax is similar to Matlab but its construction is purposely very different.
● Matlab has only one data structure (the matrix) and is optimised for matrix operations. Other native computations can be very slow.
● The focus on matrices led to some important differences in MATLAB’s design compared to general-purpose programming languages such as Julia.
● Julia uses similar matrix syntax to Matlab but also incorporates list comprehensions.
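A minimal sketch of the last two points, written against Julia 1.x; the variable names are made up for illustration. It shows Matlab-style matrix literals next to comprehensions, plus one non-matrix data structure.

```julia
# Matlab-style matrix literals: spaces separate columns, semicolons separate rows.
A = [1 2; 3 4]
B = [A A]     # horizontal concatenation, a 2x4 matrix
C = [A; A]    # vertical concatenation, a 4x2 matrix

# Comprehensions build arrays from a generating expression.
squares = [i^2 for i in 1:5]                 # a vector: 1, 4, 9, 16, 25
table = [i * j for i in 1:3, j in 1:3]       # a 3x3 multiplication table

# Arrays are one data structure among many; a dictionary is another.
ages = Dict("ann" => 31, "bob" => 27)

println(size(B), " ", size(C))
println(squares)
```

The two-index comprehension returns a matrix rather than a nested list, which is why it sits comfortably alongside the Matlab-like literal syntax.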
Comparison with R
● Origins as open-source clone of S+.
● Still seen as a “statistical” DSL.
● R is single threaded and hard to speed up.
● Introduced the data frame structure which is also present in Julia
● Julia also has an RDatasets package.
● R has very good graphic and data visualisation support.
● Julia has a Google group: julia-stats.
● Julia can call R modules using the Rif package.
Comparison with Python
● Python now seen by many as the Data Science language.
● Strength lies in its community support.
● Modules such as numpy, scipy, matplotlib and pandas are very powerful.
● Speed up using PyPy
● Mature frameworks such as Django
● Julia approach is co-operation not confrontation, via the PyCall package and also IJulia (Julia in the IPython notebook)
What makes Julia special?
● It is written in Julia, apart from a small core, and the code is available to look at.
● The designers are data scientists and not tied to companies such as Google (Go) or Mozilla (Rust).
● It has been designed for parallelism / distributed computation
● It takes every opportunity to cooperate rather than confront.
● Julia intends to combine the best from MATLAB, R and Python into one language that is to be consistent, well designed and fast.
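To make the parallelism point concrete, here is a small sketch using the `Distributed` standard library of Julia 1.x. At the time of these slides the equivalent macro was called `@parallel`. The function name `sum_of_squares` is hypothetical.

```julia
using Distributed

# Parallel reduction: the loop range is split across the available worker
# processes and the partial results are combined with (+).
# If no workers have been added (for example with addprocs), the loop
# simply runs on the current process.
function sum_of_squares(n::Integer)
    @distributed (+) for i in 1:n
        i^2
    end
end

println(sum_of_squares(1000))
```

The same function body works unchanged whether Julia is started with one process or many, which is the attraction of building this into the language and its standard library.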
Special features
• Easy installation
• JIT compilation
• Built-in package manager
• Coroutines and green threads
• Multiple dispatch
• Dynamic type system
• Meta programming with Lisp-like macros
• Call C functions directly
• Call Python functions: (PyCall)
• Best-of-breed C and Fortran libraries
• Unicode support
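Three of the features listed above can be sketched in a few lines of Julia 1.x. The names `describe`, `@twice` and `counter` are invented for illustration, and the `ccall` assumes a Unix-like system where the C library's `strlen` is visible to the Julia process.

```julia
# Multiple dispatch: one function name, several methods; the method is
# chosen from the types of ALL the arguments, not just the first.
describe(x::Integer, y::Integer) = "two integers"
describe(x::Number, y::Number) = "two numbers"
describe(x, y) = "something else"

println(describe(1, 2))      # two integers
println(describe(1, 2.5))    # two numbers
println(describe("a", 2))    # something else

# Calling a C function directly, with no wrapper code:
# strlen from the C standard library (assumed to be available).
len = ccall(:strlen, Csize_t, (Cstring,), "Julia")

# A Lisp-like macro: receives the expression itself and emits it twice.
macro twice(ex)
    quote
        $(esc(ex))
        $(esc(ex))
    end
end

counter = Ref(0)
@twice counter[] += 1
```

After the last line `counter[]` holds 2, because the macro expanded the increment into two copies before anything ran.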
The ones to read …
● Parallel computing
– http://julia.readthedocs.org/en/latest/manual/parallel-computing
● Metaprogramming
– http://docs.julialang.org/en/latest/manual/metaprogramming
● Networking and streams
– http://docs.julialang.org/en/latest/manual/networking-and-streams
● Calling C and Fortran code
– http://julia.readthedocs.org/en/latest/manual/calling-c-and-fortran-code
Modules and packages
● Julia has its own built-in package manager
● There are (now) 250+ packages.
● These include:
– Statistics
– Graphics
– System tools
– Database
– Web and Cloud
– Simulation
● It’s quite easy to add your own package (via GitHub)
100+ contributors, 1000+ mailing list subscribers, 175+ packages
AWS, ArgParse, BSplines, Benchmark, BinDeps, BioSeq, BloomFilters, Cairo, Calculus, Calendar, Cartesian, Catalan, ChainedVectors, ChemicalKinetics, Clang, Clp, ClusterManagers, Clustering, Codecs, CoinMP, Color, Compose, ContinuedFractions, Cpp, Cubature, Curl, DICOM, DWARF, DataFrames, DataStructures, Datetime, Debug, DecisionTree, Devectorize, DictUtils, DictViews, DiscreteFactor, Distance, Distributions, DualNumbers, ELF, Elliptic, Example, ExpressionUtils, FITSIO, FactCheck, FastaIO, FastaRead, FileFind, FunctionalCollections, FunctionalUtils, GLFW, GLM, GLPK, GLUT, GSL, GZip, Gadfly, Gaston, GeoIP, GeometricMCMC, GetC, GoogleCharts, Graphs, Grid, Gtk, Gurobi, HDF5, HDFS, HTTP, HTTPClient, Hadamard, HttpCommon, HttpParser, HttpServer, HypothesisTests, ICU, ImageView, Images, ImmutableArrays, IniFile, Iterators, Ito, JSON, JudyDicts, JuliaWebRepl, KLDivergence, LIBSVM, Languages, LazySequences, LibCURL, LibExpat, LinProgGLPK, Loss, MAT, MATLAB, MCMC, MDCT, MLBase, MNIST, MarketTechnicals, MathProg, MathProgBase, Meddle, Memoize, Meshes, Metis, MixedModels, Monads, Mongo, Mongrel2, Morsel, Mustache, NHST, NIfTI, NLopt, Named, NetCDF, NumericExtensions, NumericFunctors, ODBC, ODE, OpenGL, OpenSSL, Optim, Options, PLX, PTools, PatternDispatch, Phylo, Phylogenetics, Polynomial, Profile, ProgressMeter, ProjectTemplate, PyCall, PyPlot, PySide, Quandl, QuickCheck, RDatasets, REPL, RNGTest, RPMmd, RandomMatrices, Readline, Regression, Resampling, Rif, Rmath, RobustStats, Roots, SDE, SDL, SVM, SemidefiniteProgramming, SimJulia, SimpleMCMC, Sims, Sodium, Soundex, Sqlite, Stats, StrPack, Sundials, SymPy, TOML, Terminals, TextAnalysis, TextWrap, TimeModels, TimeSeries, Tk, TopicModels, TradingInstrument, Trie, URLParse, UTF16, Units, ValueDispatch, WAV, WebSockets, Winston, YAML, ZMQ, Zlib
Julia does have graphics!
● Winston (Standard 2D graphics)
● Gadfly (Like 'ggplot2')
● Gaston (Uses gnuplot as graphics engine)
● PyPlot (Uses Python's matplotlib)
● Plotly (http://plot.ly/api)
Simulated Stock Market
    julia> plothist(randn(100000), 100)
    julia> plot(cumsum(randn(10000)))
What’s missing?
● Cached package loading
– At present all modules are compiled on the fly
– Preloading would reduce startup times
● Better database connectivity
– Uses ODBC
– Simple d/b support via SQLite
– No native Oracle, MySQL or PostgreSQL
● More comprehensive NoSQL support
– Packages for Mongo, Redis.
– JSON package helps with CouchDB, Neo4j
Familiar syntax for Matlab/Octave users
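    # Written for the Julia of the time. In Julia 1.x, trace is named tr
    # (in LinearAlgebra) and std/mean live in the Statistics library.
    function randmatstat(t; n=10)
        v = zeros(t)
        w = zeros(t)
        for i = 1:t
            a = randn(n,n)
            b = randn(n,n)
            c = randn(n,n)
            d = randn(n,n)
            P = [a b c d]      # blocks side by side: n x 4n
            Q = [a b; c d]     # blocks in a 2 x 2 grid: 2n x 2n
            v[i] = trace((P'*P)^4)
            w[i] = trace((Q'*Q)^4)
        end
        std(v)/mean(v), std(w)/mean(w)
    end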
Simulating an Asian Option
    # Plotting calls (FramedPlot, Curve, add) are from the Winston package.
    S0  = 100;      # Spot price
    K   = 102;      # Strike price
    r   = 0.05;     # Risk free rate
    q   = 0.0;      # Dividend yield
    v   = 0.2;      # Volatility
    tma = 0.25;     # Time to maturity
    T   = 100;      # Number of time steps
    dt  = tma/T;    # Time increment

    S = zeros(Float64,T);
    S[1] = S0;
    dW = randn(T)*sqrt(dt);
    for t = 2:T
        S[t] = S[t-1] * (1 + (r - q - 0.5*v*v)*dt + v*dW[t] + 0.5*v*v*dW[t]*dW[t])
    end

    x = linspace(1, T, T);    # one x value per time step
    p = FramedPlot(title = "Random Walk, drift 5%, volatility 20%")
    add(p, Curve(x,S,color="red"))
    display(p)
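The slide code above simulates and plots a single price path. One way to turn that into an option value is sketched below. Its assumptions are not stated on the slides: the option is taken to be an arithmetic-average Asian call with payoff max(mean(S) − K, 0), averaged over the simulated steps and discounted at the risk-free rate. The function name `asian_call` and its keyword arguments are hypothetical, and the code targets Julia 1.3 or later.

```julia
using Random

# Monte Carlo estimate of an arithmetic-average Asian call.
# Assumed payoff (not taken from the slides): max(mean(S) - K, 0).
function asian_call(S0, K, r, q, v, tma; T=100, nruns=100_000,
                    rng=Random.default_rng())
    dt = tma / T
    total = 0.0
    for _ in 1:nruns
        S = S0
        path_sum = 0.0
        for _ in 1:T
            dW = sqrt(dt) * randn(rng)
            # Same update rule as the single-path code on the slide.
            S *= 1 + (r - q - 0.5 * v^2) * dt + v * dW + 0.5 * v^2 * dW^2
            path_sum += S
        end
        total += max(path_sum / T - K, 0.0)   # payoff on the path average
    end
    return exp(-r * tma) * total / nruns      # discounted mean payoff
end

price = asian_call(100.0, 102.0, 0.05, 0.0, 0.2, 0.25; rng=MersenneTwister(42))
println(price)
```

Keeping the running sum instead of storing the whole path avoids allocating an array per run, which matters when the loop is repeated 100,000 times. Wrapping the call in `@elapsed` gives a rough timing.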
Random Walk on Julia Studio
Going further …
● Start with the julialang.org website
● Install Julia and read the documentation
● Look at the training material
– http://julialang.org/teaching/
● Try the Julia Studio
● Read/subscribe to Google-groups sites
– julia-users, julia-stats, julia-opt, julia-dev
● Join the LJuUG
– http://www.meetup.com/London-Julia-User-Group
My Benchmarks
| Language | Timing (c = 1) | Asian Option |
|---|---|---|
| c | 1.0 | 1.681 |
| julia | 1.41 | 1.680 |
| python (v3) | 32.67 | 1.671 |
| R | 154.3 | 1.646 |
| Octave | 789.3 | 1.632 |
Results for 100,000 runs of 100 steps, (c ~ 0.73 s)
Samsung RV711 laptop with an i5 processor and 4 GB RAM running CentOS 6.5 (Final)