Python for Sciences and Engineering
Transcript of Python for Sciences and Engineering
-
7/30/2019 Python for Sciences and Engineering
1/89
Python for Science and Engineering
Dr Edward Schofield
A*STAR / Singapore Computational Sciences Club Seminar
June 14, 2011
-
Scientific programming in 2011
Most scientists and engineers are:
programming for 50+% of their work time (and rising)
self-taught programmers
using inefficient programming practices
using the wrong programming languages: C++,
FORTRAN, C#, PHP, Java, ...
-
Scientific programming needs
Rapid prototyping
Efficiency for computational kernels
Pre-written packages!
Vectors, matrices, modelling, simulations, visualisation
Extensibility; web front-ends; database backends; ...
-
Ed's story: How I found Python
PhD in statistical pattern recognition: 2001-2006
Needed good tools for my research!
Discovered Python in 2002 after frustration with C++, Matlab, Java, Perl
Contributed to NumPy and SciPy:
maxent, sparse matrices, optimization, Monte Carlo, etc.
Managed six releases of SciPy in 2005-6
-
1. Why Python?
-
Introducing Python
What is it?
What is it good for?
Who uses it?
-
What is Python?
interpreted
strongly but dynamically typed
object-oriented
intuitive, readable
open source, free
batteries included
-
batteries included
Python's standard library is:
very large
well-supported
well-documented
-
Python's standard library
data types         strings           networking   threads
operating system   compression       GUI          arguments
CGI                complex numbers   FTP          cryptography
testing            multimedia        databases    CSV files
calendar           email             XML          serialization
-
What is an efficient programming language?
Native Python code executes 10x more slowly than C and FORTRAN
-
Would you build a racing car ...
... to get to Kuala Lumpur ASAP?
-
Date        Cost per GFLOPS (US $)   Technology
1961        $1.1 trillion            17 million IBM 1620s
1984        $15,000,000              Cray X-MP
1997        $30,000                  Two 16-CPU clusters of Pentiums
2000, Apr   $1,000                   Bunyip Beowulf cluster
2003, Aug   $82                      KASY0
2007, Mar   $0.42                    Ambric AM2045
2009, Sep   $0.13                    ATI Radeon R800
Source: Wikipedia: FLOPS
-
Unit labor cost growth
Proxy for cost of programmer time
-
Efficiency
When FORTRAN was invented, computer time was more expensive than programmer time.
In the 1980s and 1990s that reversed.
-
Efficient programming
Python code is 10x faster to write than C and FORTRAN
-
What if ...
... you now need to reach Sydney?
-
Advantages of Python
Easy to write
Easy to maintain
Great standard libraries
Thriving ecosystem of third-party packages
Open source
-
Batteries included
Python's standard library is:
very large
well supported
well documented
-
Python's standard library
data types         strings           networking   threads
operating system   compression       GUI          arguments
CGI                complex numbers   FTP          cryptography
testing            multimedia        databases    CSV files
calendar           email             XML          serialization
-
Question: What is the date 177 days from now?
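One way to answer this with only the standard library is sketched below. The helper name days_from is hypothetical, and the fixed start date (the date of this talk) is used only so that the first result is reproducible.

```python
from datetime import date, timedelta


def days_from(start, days):
    """Return the calendar date `days` days after `start`."""
    return start + timedelta(days=days)


# Fixed start date (the date of the talk), so the answer is reproducible.
talk_date = date(2011, 6, 14)
print(days_from(talk_date, 177))     # 2011-12-08

# The question as asked: 177 days from whenever this is run.
print(days_from(date.today(), 177))
```

The datetime module handles month lengths and leap years, so no calendar arithmetic has to be written by hand.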
-
Natural applications of Python
Rapid prototyping
Plotting, visualisation, 3D
Numerical computing
Web and database programming
All-purpose glue
-
Python vs other languages
-
Languages used at CSIRO
Python Fortran Java
Matlab C VB.net
IDL C++ R
Perl C# +5-10 others!
-
Which language do I choose?
A different language for each task?
A language you know?
A language others in your team are using: support and help?
-
                             Python     Matlab
Interpreted                  Yes        Yes
Powerful data input/output   Yes        Yes
Great plotting               Yes        Yes
General-purpose language     Powerful   Limited
Cost                         Free       $$$
Open source                  Yes        No
-
                             Python   C++
Powerful                     Yes      Yes
Portable                     Yes      In theory
Standard libraries           Vast     Limited
Easy to write and maintain   Yes      No
Easy to learn                Yes      No
-
                                                                  Python   C
Fast to write                                                     Yes      No
Good for embedded systems, device drivers and operating systems   No       Yes
Good for most other high-level tasks                              Yes      No
Standard library                                                  Vast     Limited
-
Open source
Python is open source software
Benefits:
No vendor lock-in
Cross-platform
Insurance against bugs in the platform
Free
-
Python success stories
Computer graphics:
Industrial Light & Magic
Web:
Google: News, Groups, Maps, Gmail
Legacy system integration:
AstraZeneca - collaborative drug discovery
-
Python success stories (2)
Aerospace:
NASA
Research:
universities worldwide ...
Others:
YouTube, Reddit, BitTorrent, Civilization IV, ...
-
United Space Alliance
A common sentiment:
"We achieve immediate functioning code so much faster in Python than in any other language that it's staggering."
- Robin Friedrich, Senior Project Engineer
-
Case study: air-traffic control
Eric Newton, "Python for Critical Applications": http://metaslash.com/brochure/recall.html
Metaslash, Inc: 1999 to 2001
Mission-critical system for air-traffic control
Replicated, fault-tolerant data storage
-
Case study: air-traffic control
Python prototype -> C++ implementation -> Python again
Why?
C++ dependencies were buggy
C++ threads, STL were not portable enough
Python's advantages over C++
More portable
75% less code: more productivity, fewer bugs
-
More case studies
See http://www.python.org/about/success/ for lots more case studies and success stories
-
2. The scientific Python ecosystem
-
NumPy
An n-dimensional array/matrix package
-
NumPy
Centre of Python's numerical computing ecosystem
-
NumPy
The most fundamental tool for numerical computing in Python
Fast multi-dimensional array capability
-
What NumPy defines:
Two fundamental objects:
1. n-dimensional array
2. universal function
a rich set of numerical data types
nearly 400 functions and methods on arrays:
type conversions
mathematical
logical
-
NumPy's features
Fast. Written in C with BLAS/LAPACK hooks.
Rich set of data types
Linear algebra: matrix inversion, decompositions, ...
Discrete Fourier transforms
Random number generation
Trig, hypergeometric functions, etc.
-
Elementwise array operations
Loops are mostly unnecessary
Operate on entire arrays!
>>> a = numpy.array([20, 30, 40, 50])
>>> a < 35
array([True, True, False, False], dtype=bool)
>>> b = numpy.arange(4)
>>> a - b
array([20, 29, 38, 47])
>>> b**2
array([0, 1, 4, 9])
-
Universal functions
NumPy defines 'ufuncs' that operate on entire arrays and other sequences (hence 'universal')
Example: sin()
>>> a = numpy.array([20, 30, 40, 50])
>>> c = 10 * numpy.sin(a)
>>> c
array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])
-
Array slicing
Arrays can be sliced and indexed powerfully:
>>> a = numpy.arange(10)**3
>>> a
array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])
>>> a[2:5]
array([ 8, 27, 64])
-
Fancy indexing
Arrays can be used as indices into other arrays:
>>> a = numpy.arange(12)**2
>>> ind = numpy.array([ 1, 1, 3, 8, 5 ])
>>> a[ind]
array([ 1,  1,  9, 64, 25])
-
Other linear algebra features
Matrix inversion: mat(A).I
Or: linalg.inv(A)
Linear solvers: linalg.solve(A, x)
Pseudoinverse: linalg.pinv(A)
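A minimal sketch of the calls listed above, using numpy.linalg. The 2x2 matrix A and the right-hand side b are made-up illustration values, not taken from the talk; the slide writes the second argument of linalg.solve as x, which is the right-hand side of the system.

```python
import numpy as np

# Illustrative values only (assumed, not from the slides).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([1.0, 2.0])

A_inv = np.linalg.inv(A)     # explicit inverse
x = np.linalg.solve(A, b)    # solve A x = b without forming the inverse
A_pinv = np.linalg.pinv(A)   # Moore-Penrose pseudoinverse

print(x)
print(np.allclose(A @ x, b))        # the solution satisfies the system
print(np.allclose(A_inv, A_pinv))   # for an invertible A the two coincide
```

When only the solution of a linear system is needed, linalg.solve is generally preferable to multiplying by an explicit inverse, since it is cheaper and numerically better behaved.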
-
What is SciPy?
A community
A conference
A package of scientific libraries
-
Python for scientific software
Back-end: computational work
Front-end: input / output, visualization, GUIs
Dozens of great scientific packages exist
-
Python in science (2)
NumPy: numerical / array module
Matplotlib: great 2D and 3D plotting library
IPython: nice interactive Python shell
SciPy: set of scientific libraries: sparse matrices, signal processing, ...
RPy: integration with the R statistical environment
-
Python in science (3)
Cython: C language extensions
Mayavi: 3D graphics, volumetric rendering
Nitime, Nipype: Python tools for neuroimaging
SymPy: symbolic mathematics library
-
Python in science (4)
VPython: easy, real-time 3D programming
UCSF Chimera, PyMOL, VMD: molecular graphics
PyRAF: Hubble Space Telescope interface to IRAF astronomical data
BioPython: computational molecular biology
Natural language toolkit: symbolic + statistical NLP
Physics: PyROOT
-
The SciPy package
BSD-licensed software for maths, science, engineering
integration     signal processing        sparse matrices
optimization    linear algebra           maximum entropy
interpolation   ODEs                     statistics
FFTs            n-dim image processing   scientific constants
clustering      interpolation            C/C++ and Fortran integration
-
SciPy optimisation example
Fit a model to noisy data: $y = \frac{a}{x^b} \sin(cx) + \epsilon$
-
Example: fitting a model with scipy.optimize
Task: Fit a model of the form $y = \frac{a}{x^b} \sin(cx) + \epsilon$ to noisy data.
Spec:
1. Generate noisy data
2. Choose parameters (a, b, c) to minimize the sum of squared errors
3. Plot the data and fitted model (next session)
-
SciPy optimisation example
import numpy
import pylab
from scipy.optimize import leastsq

def myfunc(params, x):
    (a, b, c) = params
    return a / (x**b) * numpy.sin(c * x)

true_params = [1.5, 0.1, 2.]
def f(x):
    return myfunc(true_params, x)

def err(params, x, y):  # error function
    return myfunc(params, x) - y
-
SciPy optimisation example
# Generate noisy data to fit
n = 30; xmin = 0.1; xmax = 5
x = numpy.linspace(xmin, xmax, n)
y = f(x)
# Uniform noise scaled to 20% of the data range
y += numpy.random.rand(len(x)) * 0.2 * (y.max() - y.min())

v0 = [3., 1., 4.]  # initial param estimate

# Fitting
v, success = leastsq(err, v0, args=(x, y), maxfev=10000)

print('Estimated parameters: ', v)
print('True parameters: ', true_params)
X = numpy.linspace(xmin, xmax, 5 * n)
pylab.plot(x, y, 'ro', X, myfunc(v, X))
pylab.show()
-
SciPy optimisation example
Fit a model to noisy data: $y = \frac{a}{x^b} \sin(cx) + \epsilon$
-
Ingredients for this example
numpy.linspace
numpy.random.rand for the noise model (uniform)
scipy.optimize.leastsq
-
Sparse matrix example
Construct and solve a sparse linear system
-
Sparse matrices
Sparse matrices are mostly zeros.
They can be symmetric or asymmetric.
Sparsity patterns vary:
block sparse, band matrices, ...
They can be huge!
Only non-zeros are stored.
-
Sparse matrices in SciPy
SciPy supports seven sparse storage schemes
... and sparse solvers in Fortran.
-
Sparse matrix creation
To construct a 1000x1000 lil_matrix and add values:
>>> from scipy.sparse import lil_matrix
>>> from numpy.random import rand
>>> from scipy.sparse.linalg import spsolve
>>> A = lil_matrix((1000, 1000))
>>> A[0, :100] = rand(100)
>>> A[1, 100:200] = A[0, :100]
>>> A.setdiag(rand(1000))
-
Solving sparse matrix systems
Now convert the matrix to CSR format and solve Ax=b:
>>> A = A.tocsr()
>>> b = rand(1000)
>>> x = spsolve(A, b)
# Convert it to a dense matrix and solve, and check that the result is the same:
>>> from numpy.linalg import solve, norm
>>> x_ = solve(A.todense(), b)
# Compute norm of the error:
>>> err = norm(x - x_)
>>> err < 1e-10
True
-
Matplotlib
Great plotting package in Python
Matlab-like syntax
Great rendering: anti-aliasing etc.
Many backends: Cairo, GTK, Cocoa, PDF
Flexible output: to EPS, PS, PDF, TIFF, PNG, ...
-
Matplotlib: worked examples
Search the web for 'Matplotlib gallery'
-
Example: NumPy vectorization
1. Use a Monte Carlo algorithm to estimate π:
   1. Generate uniform random variates (x, y), each over [0, 1].
   2. Estimate π from the proportion p that land in the unit circle (π ≈ 4p).
2. Time two ways of doing this:
   1. Using for loops
   2. Using array operations (vectorized)
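The exercise above can be sketched as follows. This is one possible solution rather than the one used in the talk: the function names, the sample size n, and the fixed seeds are all assumptions chosen for illustration. A point (x, y) drawn uniformly from the unit square falls inside the quarter circle x² + y² ≤ 1 with probability π/4, so π is estimated as 4p.

```python
import random
import time

import numpy as np


def estimate_pi_loop(n, seed=0):
    """Estimate pi with an explicit Python for loop."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point lies inside the quarter circle
            inside += 1
    return 4.0 * inside / n


def estimate_pi_vectorized(n, seed=0):
    """Estimate pi with NumPy array operations and no Python-level loop."""
    rng = np.random.default_rng(seed)
    xy = rng.random((n, 2))  # n points; columns are x and y
    inside = np.count_nonzero((xy ** 2).sum(axis=1) <= 1.0)
    return 4.0 * inside / n


n = 200_000
for estimator in (estimate_pi_loop, estimate_pi_vectorized):
    start = time.perf_counter()
    estimate = estimator(n)
    elapsed = time.perf_counter() - start
    print(f"{estimator.__name__}: pi ~ {estimate:.4f} in {elapsed:.4f} s")
```

The two functions use different random number generators, so their estimates differ slightly; the point of the comparison is the running time, which depends on the machine.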
-
3. Scaling
-
HPC
High-performance computing
-
Aspects to HPC
Supercomputers Distributed clusters / grids
Parallel programming Scripting
Caches, shared memory Job control
Code porting Specialized hardware
-
Python for HPC
Advantages Disadvantages
Portability Global interpreter lock
Easy scripting, glue Less control than C
Maintainability Native loops are slow
Profiling to identify hotspots
Vectorization with NumPy
-
Large data sets
Useful Python language features:
Generators, iterators
Useful packages:
Great HDF5 support from PyTables!
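To illustrate why the generators and iterators mentioned above help with large data sets, here is a minimal sketch that computes a mean over a stream of values without ever holding them all in memory. The function names read_values and streaming_mean are hypothetical, and the generated lines stand in for a large file.

```python
def read_values(lines):
    """Lazily yield one float per non-blank line."""
    for line in lines:
        line = line.strip()
        if line:
            yield float(line)


def streaming_mean(values):
    """Consume an iterable one item at a time and return its mean."""
    total = 0.0
    count = 0
    for value in values:
        total += value
        count += 1
    return total / count if count else float("nan")


# A generator expression stands in for a large file; an open file object
# would work the same way, since it is also an iterator over lines.
lines = (f"{i}\n" for i in range(1, 1_000_001))
print(streaming_mean(read_values(lines)))  # 500000.5
```

Only one line is materialised at a time, so memory use stays constant regardless of how many values are processed.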
-
Hierarchical data
Databases without the relational baggage
-
Great interface for HDF5 data
Efficient support for massive data sets
-
Applications of PyTables
aeronautics telecommunications
drug discovery data mining
financial analysis statistical analysis
climate prediction etc.
-
Breaking news: June 2011
PyTables Pro is now being open sourced.
Indexed searches for speed
Merging with PyTables
Working project name: NewPyTables
-
PyTables performance
OPSI indexing engine speed:
Querying 10 billion rows can take hundredths of a second!
Target use-case:
mostly read-only or append-only data
-
Principles for efficient code
-
Important principles
1. "Premature optimization is the root of all evil"
Don't write cryptic code just to make it more efficient!
2. 1-5% of the code takes up the vast majority of the
computing time!
... and it might not be the 1-5% that you think!
-
Checklist for efficient code
From most to least important:
1. Check: Do you really need to make it more efficient?
2. Check: Are you using the right algorithms and data structures?
3. Check: Are you reusing pre-written libraries wherever
possible?
4. Check: Which parts of the code are expensive?
Measure, don't guess!
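For the last step of the checklist, the standard library's cProfile module is one way to measure where time goes. The sketch below is illustrative only: the functions slow_sum_of_squares, fast_constant and workload are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately slow: a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_constant(n):
    """Deliberately trivial, for contrast in the profile."""
    return n


def workload():
    slow_sum_of_squares(200_000)
    fast_constant(200_000)


# Profile only the workload, then print the top entries by cumulative time.
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists each function with its call count and time, which makes the expensive part visible before any optimization effort is spent.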
-
Relative efficiency gains
Exponential-order and polynomial-order speedups are
possible by choosing the right algorithm for a task.
These require the right data structures!
These dwarf 10-25x linear-order speedups from:
using lower-level languages
using different language constructs.
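A small sketch of a polynomial-order speedup that comes purely from the choice of data structure, under assumed inputs: finding the items common to two collections. The function names and the input sizes are hypothetical. Testing membership in a list scans it element by element, so the first version does work proportional to len(a) * len(b); building a set first makes each lookup constant time on average, so the second does work proportional to len(a) + len(b).

```python
import time


def common_items_quadratic(a, b):
    """b is a list, so each `x in b` is a linear scan."""
    return [x for x in a if x in b]


def common_items_linear(a, b):
    """Build a set once; each lookup is then constant time on average."""
    b_set = set(b)
    return [x for x in a if x in b_set]


a = list(range(0, 4000, 2))  # multiples of 2
b = list(range(0, 4000, 3))  # multiples of 3

for func in (common_items_quadratic, common_items_linear):
    start = time.perf_counter()
    result = func(a, b)
    elapsed = time.perf_counter() - start
    print(f"{func.__name__}: {len(result)} items in {elapsed:.5f} s")
```

Both functions return the same list; the gap in running time widens as the inputs grow, which is the sense in which algorithmic gains dwarf constant-factor ones.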
-
4. About Python Charmers
-
The largest Python training provider in South-East Asia
Delighted customers include:
-
Most popular course topics
Python for Programmers 3 days
Python for Scientists and Engineers 4 days
Python for Geoscientists 4 days
Python for Bioinformaticians 4 days
Python for Financial Engineers 4 days
Python for IT Security Professionals 3 days
New courses:
-
Python Charmers: Topics of expertise
Python: beginners, advanced
Scientific data processing with Python
Software engineering with Python
Large-scale problems: HPC, huge data sets, grids
Statistics and Monte Carlo problems
-
Python Charmers: Topics of expertise (2)
Spatial data analysis / GIS
General scripting, job control, glue
GUIs with PyQt
Integrating with other languages: R, C, C++, Fortran, ...
Web development in Django
-
How to get in touch
See PythonCharmers.com
or email us at:
[email protected]
-