Astronomical Data Analysis with Python - Lecture 8

Post on 09-Feb-2022

3 views 0 download

Transcript of Astronomical Data Analysis with Python - Lecture 8

Astronomical Data Analysis with PythonLecture 8

Yogesh Wadadekar

NCRA-TIFR

July August 2010

Yogesh Wadadekar (NCRA-TIFR) Topical course 1 / 27

Slides available at:

http://www.ncra.tifr.res.in/∼yogesh/python_course_2010/

Yogesh Wadadekar (NCRA-TIFR) Topical course 2 / 27

Comments on assignment

Yogesh Wadadekar (NCRA-TIFR) Topical course 3 / 27

SciPy - superset of numpy http://scipy.org

constants: physical constants and conversion factors (sinceversion 0.7.0[1])cluster: hierarchical clustering, vector quantization, K-meansfftpack: Discrete Fourier Transform algorithmsintegrate: numerical integration routinesinterpolate: interpolation toolsio: data input and outputlib: Python wrappers to external librarieslinalg: linear algebra routinesmisc: miscellaneous utilities (e.g. image reading/writing)optimize: optimization algorithms including linear programmingsignal: signal processing toolssparse: sparse matrix and related algorithmsspatial: KD-trees, nearest neighbors, distance functions

Yogesh Wadadekar (NCRA-TIFR) Topical course 4 / 27

SciPy

special: special functionsstats: statistical functionsweave: tool for writing C/C++ code as Python multiline stringsndimage: image processing

Via RPy, SciPy can interface to the R statistical package R.

Yogesh Wadadekar (NCRA-TIFR) Topical course 5 / 27

SciPy Cookbook

http://www.scipy.org/CookbookThis page hosts worked examples of commonly-done tasks. Some areintroductory in nature, while others are quite advanced (these are atthe bottom of the page).

Yogesh Wadadekar (NCRA-TIFR) Topical course 6 / 27

power of matplotlib

import numpy as npx, y = np.random.randn(2, 100000)H, xedges, yedges = np.histogram2d(x, y, bins=50)extent = [xedges[0], xedges[-1], yedges[0],yedges[-1]]imshow(H, extent=extent)

Yogesh Wadadekar (NCRA-TIFR) Topical course 7 / 27

Three main reasons to extend Python

1 speed.2 to leverage existing libraries that are not already wrapped.3 to allow Python programs to access devices at absolute memory

addresses, using C functions as an intermediary.

If neither of these reasons apply, you can happily code only in Python.

Yogesh Wadadekar (NCRA-TIFR) Topical course 8 / 27

Questions to answer before trying to extend Python

is the compiled language code you need small enough to recodein Python. e.g. a Numerical recipes routine that requires perhaps3-4 other routines should be easily rewritable in Python.

if the code is large, check whether it already has a Pythoninterface. e.g. LAPACKif the code in question is in octave, idl, matlab, mathematica andyou have a licensed copy of these software, it will be easier to usethe interfaces to those languages.Do you really need to pass arrays or are disk files a good way ofexchanging data?It is more efficient to wrap a set of related subroutines rather thanjust one (or a few). Can you collaborate to do this? Ask on astropymailing list.

Yogesh Wadadekar (NCRA-TIFR) Topical course 9 / 27

Questions to answer before trying to extend Python

is the compiled language code you need small enough to recodein Python. e.g. a Numerical recipes routine that requires perhaps3-4 other routines should be easily rewritable in Python.if the code is large, check whether it already has a Pythoninterface. e.g. LAPACK

if the code in question is in octave, idl, matlab, mathematica andyou have a licensed copy of these software, it will be easier to usethe interfaces to those languages.Do you really need to pass arrays or are disk files a good way ofexchanging data?It is more efficient to wrap a set of related subroutines rather thanjust one (or a few). Can you collaborate to do this? Ask on astropymailing list.

Yogesh Wadadekar (NCRA-TIFR) Topical course 9 / 27

Questions to answer before trying to extend Python

is the compiled language code you need small enough to recodein Python. e.g. a Numerical recipes routine that requires perhaps3-4 other routines should be easily rewritable in Python.if the code is large, check whether it already has a Pythoninterface. e.g. LAPACKif the code in question is in octave, idl, matlab, mathematica andyou have a licensed copy of these software, it will be easier to usethe interfaces to those languages.

Do you really need to pass arrays or are disk files a good way ofexchanging data?It is more efficient to wrap a set of related subroutines rather thanjust one (or a few). Can you collaborate to do this? Ask on astropymailing list.

Yogesh Wadadekar (NCRA-TIFR) Topical course 9 / 27

Questions to answer before trying to extend Python

is the compiled language code you need small enough to recodein Python. e.g. a Numerical recipes routine that requires perhaps3-4 other routines should be easily rewritable in Python.if the code is large, check whether it already has a Pythoninterface. e.g. LAPACKif the code in question is in octave, idl, matlab, mathematica andyou have a licensed copy of these software, it will be easier to usethe interfaces to those languages.Do you really need to pass arrays or are disk files a good way ofexchanging data?

It is more efficient to wrap a set of related subroutines rather thanjust one (or a few). Can you collaborate to do this? Ask on astropymailing list.

Yogesh Wadadekar (NCRA-TIFR) Topical course 9 / 27

Questions to answer before trying to extend Python

is the compiled language code you need small enough to recodein Python. e.g. a Numerical recipes routine that requires perhaps3-4 other routines should be easily rewritable in Python.if the code is large, check whether it already has a Pythoninterface. e.g. LAPACKif the code in question is in octave, idl, matlab, mathematica andyou have a licensed copy of these software, it will be easier to usethe interfaces to those languages.Do you really need to pass arrays or are disk files a good way ofexchanging data?It is more efficient to wrap a set of related subroutines rather thanjust one (or a few). Can you collaborate to do this? Ask on astropymailing list.

Yogesh Wadadekar (NCRA-TIFR) Topical course 9 / 27

Numpy C extensions

http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays

Yogesh Wadadekar (NCRA-TIFR) Topical course 10 / 27

SWIG

SWIG is a software development tool that connects programs writtenin C and C++ with a variety of high-level programming languages.SWIG is used with different types of languages such as Perl, PHP,Python, Tcl, Ruby, C#, Common Lisp (CLISP, Allegro CL, CFFI, UFFI),Java, Lua, Modula-3, OCAML, Octave and R. Also several interpretedand compiled Scheme implementations (Guile, MzScheme, Chicken)are supported. SWIG is most commonly used to create high-levelinterpreted or compiled programming environments, user interfaces,and as a tool for testing and prototyping C/C++ software.

Yogesh Wadadekar (NCRA-TIFR) Topical course 11 / 27

The example.c file

Yogesh Wadadekar (NCRA-TIFR) Topical course 12 / 27

The example.i interface file

Now, in order to add these files to your favorite language, you need towrite an “interface file” which is the input to SWIG. An interface file forthe C functions we want to wrap is needed.

Yogesh Wadadekar (NCRA-TIFR) Topical course 13 / 27

Building the module

$ swig -python example.i$ gcc -c example.c example_wrap.c-I/usr/include/python2.6$ ld -shared example.o example_wrap.o -o _example.so

Yogesh Wadadekar (NCRA-TIFR) Topical course 14 / 27

Dynamic typing gives way to static typing

You can use an assert before the function call to the external function.

Yogesh Wadadekar (NCRA-TIFR) Topical course 15 / 27

F2PY - Fortran to Python interface generator

autogenerates interface files to allow Python to call a Fortransubroutine.

Yogesh Wadadekar (NCRA-TIFR) Topical course 16 / 27

Installing f2py

Recent versions of numpy already include f2py. So, if you installednumpy there is nothing more to do.

Yogesh Wadadekar (NCRA-TIFR) Topical course 17 / 27

The Hello example

C File hello.fsubroutine foo (a)integer aprint*, "Hello from Fortran!"print*, "a=",aend Of course, there may be multiple subroutines in hello.f, all ofwhich will get wrapped in one go.

Yogesh Wadadekar (NCRA-TIFR) Topical course 18 / 27

Build the module

$ f2py -c -m hello hello.f__doc__ are autocreated.

Yogesh Wadadekar (NCRA-TIFR) Topical course 19 / 27

A more complicated example - fib3.f

Calculate the first n Fibbonacci numbers.

Yogesh Wadadekar (NCRA-TIFR) Topical course 20 / 27

If you can’t edit the Fortran code

you can work with signature files which are like the interface files in C.

Yogesh Wadadekar (NCRA-TIFR) Topical course 21 / 27

Features of f2py

F2PY automatically generates __doc__ strings (and optionallyLaTeX documentation) for extension modules.F2PY generated functions accept arbitrary (but sensible) Pythonobjects as arguments. The F2PY interface automatically takescare of type-casting.most Fortran 90 features also work.

Pyfort, a competitor to f2pY is not as feature rich

Yogesh Wadadekar (NCRA-TIFR) Topical course 22 / 27

Attributes and statements of f2py

intent([ in | inout | out | hide | in,out |inout,out | c | copy | cache | callback | inplace |aux ])dimension(<dimspec>)common, parameterallocatableoptional, required, externaldepend([<names>])check([<C-booleanexpr>])note(<LaTeX text>)usercode, callstatement, callprotoargument,threadsafe, fortrannamepymethoddefentry

Yogesh Wadadekar (NCRA-TIFR) Topical course 23 / 27

Other options for extending Python

SIP: lighter version of SWIG. Only works with C/C++. Originallywritten for PyQtctypes: no interface code needed, can directly call compiledfunctions in a binary library file. Some features work only onWindows.Boost.python: good for wrapping entire libraries in C or C++Pyrex: is a Python like language designed for writing Pythonextension modules. Almost any Python code is valid Pyrex code.

Yogesh Wadadekar (NCRA-TIFR) Topical course 24 / 27

Extending Python to Java and C#

Java: Jython is an implementation of Python for the JVM. Jythontakes the Python programming language syntax and enables it torun on the Java platform. This allows seamless integration withthe use of Java libraries and other Java-based applications. ButCPython extensions don’t work.C# - IronPython is an implementation of the Python programminglanguage running under .NET and Silverlight

Yogesh Wadadekar (NCRA-TIFR) Topical course 25 / 27

My recommendations

see if numpy C extensions will do the job for you.if you use Fortran (any flavour), use f2py. If you need to write freshcode (for speedy execution), write in Fortran and wrap with f2py.For C/C++ use SWIG or may SIPFor Java - use Jython, for C# use IronPythonFor all other languages, SWIG is the richest optionand please don’t forget regression tests.

Yogesh Wadadekar (NCRA-TIFR) Topical course 26 / 27

Feel free to contact me for any teething problems...

yogesh@ncra.tifr.res.in

Yogesh Wadadekar (NCRA-TIFR) Topical course 27 / 27