Reproducible Computational Experiments Using MADAGASCAR Software Package
Sergey Fomel
Bureau of Economic Geology
University of Texas at Austin
Applied Inverse Problems
Vancouver BC
June 29, 2007
http://rsf.sf.net/
http://rsf.sourceforge.net/
Principles of Scientific Software
Encapsulation
File Formats
Testing
Reproducibility
Maintenance
Encapsulation
Information hiding (Parnas, 1972)
Separation of concerns (Dijkstra, 1974)
Separate physics from mathematics
– A is physics
– Going from b to x̂ is mathematics
$\hat{x} = \arg\min_x \left( \|A x - b\|^2 + R(x) \right)$
Example: Velocity Transform
Physics of Velocity Transform
Encapsulation in Programming
Separation of concerns
– Classes or templates (C++)
– Function pointers (C)
– Function interfaces (Fortran-90)
/* initialize velocity transform (A) */
veltran_init (true, x0, dx, nx, s0, ds, nv, o1, d1, nt, s02, anti, psun1, psun2);
/* least-squares minimization of |A x - b|^2, x=vscan, b=cmp */
sf_solver (veltran_lop, sf_cgstep, ntv, ntx, vscan, cmp, niter,
           "err", error, "nmem", 0, "nfreq", miter, "mwt", mask, "end");
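The same separation can be sketched in a few lines of Python. This is an illustration only, not Madagascar's sf_solver: the function name cgls, the toy matrix A, and the forward/adjoint helpers are all assumptions introduced here. The solver (the mathematics) sees the operator (the physics) only through two callables, so either side can be replaced without touching the other.

```python
def cgls(forward, adjoint, b, nx, niter):
    """Minimize |A x - b|^2 by conjugate gradients on the normal equations.

    The solver knows A only through forward(x) -> A x and adjoint(y) -> A^T y.
    """
    x = [0.0] * nx
    r = list(b)                      # residual b - A x for the starting x = 0
    s = adjoint(r)                   # A^T r, the negative gradient up to a factor
    p = list(s)                      # first search direction
    gamma = sum(v * v for v in s)
    for _ in range(niter):
        if gamma == 0.0:             # already at the least-squares solution
            break
        q = forward(p)
        alpha = gamma / sum(v * v for v in q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = adjoint(r)
        gamma_new = sum(v * v for v in s)
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return x


# "Physics": a hypothetical 3x2 matrix standing in for the operator A.
A = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

def forward(x):
    """Apply A to a model vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def adjoint(y):
    """Apply A^T to a data vector."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]

x_true = [3.0, -1.0]
b = forward(x_true)                  # consistent synthetic data
x_est = cgls(forward, adjoint, b, nx=2, niter=2)
```

With two unknowns and consistent data, two iterations recover x_true up to rounding error. Swapping in a different operator requires only a new forward/adjoint pair.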
Encapsulation in UNIX
Write programs that do one thing and do it well.
Write programs to work together.
Write programs to handle text streams, because that is a universal interface.
Encapsulation in UNIX Shell
bash$ sfveltran < cmp.rsf > vtran.rsf adj=y v0=1 dv=0.025 nv=60
bash$ sfdottest sfveltran mod=vtran.rsf dat=cmp.rsf v0=1 dv=0.025 nv=60
sfdottest: L[m]*d=21665.9
sfdottest: L'[d]*m=21665.9
bash$ sfdottest sfveltran mod=vtran.rsf dat=cmp.rsf v0=1 dv=0.025 nv=60
sfdottest: L[m]*d=21906.2
sfdottest: L'[d]*m=21906.2
bash$ sfconjgrad sfveltran < cmp.rsf > vtran.rsf niter=3 v0=1 dv=0.025 nv=60
sfconjgrad: iter 1 of 3
sfconjgrad: grad=6.36797e+09
sfconjgrad: iter 2 of 3
sfconjgrad: grad=1.39068e+09
sfconjgrad: iter 3 of 3
sfconjgrad: grad=7.50257e+08
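The check that sfdottest reports, L[m]*d against L'[d]*m, is the dot-product test: an operator and its adjoint must give the same number for any model m and data d. The Python sketch below illustrates the idea and is not the sfdottest source. The names dot_test, causal_sum, and anticausal_sum are assumptions, and a running sum stands in for the velocity transform. The two sfdottest runs above print different values, presumably because the test vectors are drawn at random each time; the sketch fixes a seed instead.

```python
import random


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def dot_test(forward, adjoint, nm, nd, seed=0):
    """Return (L[m].d, L'[d].m) for random m and d; the two should agree."""
    rng = random.Random(seed)
    m = [rng.uniform(-1.0, 1.0) for _ in range(nm)]
    d = [rng.uniform(-1.0, 1.0) for _ in range(nd)]
    return dot(forward(m), d), dot(adjoint(d), m)


def causal_sum(m):
    """Toy forward operator: y[i] = m[0] + ... + m[i]."""
    out, total = [], 0.0
    for v in m:
        total += v
        out.append(total)
    return out


def anticausal_sum(d):
    """Adjoint of causal_sum: x[j] = d[j] + ... + d[n-1]."""
    out, total = [], 0.0
    for v in reversed(d):
        total += v
        out.append(total)
    return out[::-1]


lhs, rhs = dot_test(causal_sum, anticausal_sum, nm=50, nd=50)
print("L[m]*d  =", lhs)
print("L'[d]*m =", rhs)
```

If the two printed numbers disagree beyond rounding error, the pair is not a true adjoint pair and a conjugate-gradient solver built on it cannot be trusted.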
The Art of UNIX Programming
(Raymond, 2004)
To design a perfect anti-Unix, make all file formats binary and opaque, and require heavyweight tools to read and edit them.
If you feel an urge to design a complex binary file format, or a complex binary application protocol, it is generally wise to lie down until the feeling passes.
RSF (Regularly Sampled Format)
SEPlib (Stanford Exploration Project)
Data separated from text headers
Conceptually N-dimensional hypercubes
Multiple files for complex geometries
Not application specific
Header:
n1=1000 n2=500 n3=100 d1=0.001 d2=0.1 o2=1
in="/path/data.rsf@"
Data:
/path/data.rsf@
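Because the header is plain text, a few lines of any scripting language can read it. The Python sketch below parses the key=value pairs of the example header. It is a simplified illustration, not Madagascar's own reader: the function name parse_header is an assumption, and so is the rule that a later assignment overrides an earlier one.

```python
import shlex


def parse_header(text):
    """Collect key=value pairs from header text into a dictionary."""
    params = {}
    for token in shlex.split(text):      # shlex strips the quotes around paths
        if "=" in token:
            key, value = token.split("=", 1)
            params[key] = value          # assumed: the last assignment wins
    return params


header = 'n1=1000 n2=500 n3=100 d1=0.001 d2=0.1 o2=1 in="/path/data.rsf@"'
params = parse_header(header)

# Shape of the hypercube and the total number of samples in the data file.
shape = [int(params[key]) for key in ("n1", "n2", "n3")]
nsamples = shape[0] * shape[1] * shape[2]

print("data file:", params["in"])
print("shape:", shape, "samples:", nsamples)
```

For this header the hypercube holds 1000 × 500 × 100 = 50,000,000 samples, stored in the separate file named by in=.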
Testing
Test-driven development (Beck, 2003)
YAGNI principle
– Always implement things when you actually need them, never when you just foresee that you need them.
In scientific software development, tests are computational experiments
Testing with SCons
Software Construction
Replacement for "make"
– reliable and extensible dependency analysis
– configuration files are Python scripts
– cross-platform
– open-source
http://www.scons.org
SConstruct File
# Mobil AVO CMP gather 807 at well4 location
Fetch('cmp807_raw.HH','rad')

# Preprocessing
Flow('cmp','cmp807_raw.HH',
     'dd form=native | tpow tpow=2 | mutter half=n v0=1.3 tp=0.2')
Plot('cmp','grey title="Input CMP Gather" ')

# Velocity Transform
Flow('veltran','cmp','veltran s02=0.25 v0=1.250 dv=0.025 nv=60 adj=y')
Plot('veltran','grey title="Velocity Scan" ')

# Display Side by Side
Result('veltran','cmp veltran','SideBySideAniso')
Experimenting with SCons
bash$ scons
retrieve(["cmp807_raw.HH"], [])
< cmp807_raw.HH sfdd form=native | sftpow tpow=2 | sfmutter half=n v0=1.3 tp=0.2 > cmp.rsf
< cmp.rsf sfgrey title="Input CMP Gather" > cmp.vpl
< cmp.rsf sfveltran s02=0.25 v0=1.250 dv=0.025 nv=60 adj=y > veltran.rsf
< veltran.rsf sfgrey title="Velocity Scan" > veltran.vpl
vppen yscale=2 vpstyle=n gridnum=2,1 cmp.vpl veltran.vpl > Fig/veltran.vpl
bash$ sed s/Velocity/Slowness/ < SConstruct > SConstruct2
bash$ mv SConstruct2 SConstruct
bash$ scons
< veltran.rsf sfgrey title="Slowness Scan" > veltran.vpl
vppen yscale=2 vpstyle=n gridnum=2,1 cmp.vpl veltran.vpl > Fig/veltran.vpl
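In the second scons run above, only the plot with the changed title and the figure that depends on it are rebuilt; the preprocessing and the velocity transform are left alone. A toy model of that decision is sketched below in Python. It is not SCons's implementation: the class ToyBuilder, the function signature, and the choice of SHA-256 are assumptions made for illustration. A target counts as out of date when the combined signature of its command and its source contents differs from the one recorded at the last build.

```python
import hashlib


def signature(command, source_contents):
    """Combine a command string and the contents of its sources into one hash."""
    h = hashlib.sha256()
    h.update(command.encode())
    for content in source_contents:
        h.update(hashlib.sha256(content).digest())
    return h.hexdigest()


class ToyBuilder:
    """Remember the signature of each target from its last build."""

    def __init__(self):
        self.stored = {}                 # target name -> signature

    def out_of_date(self, target, command, source_contents):
        """Report whether the target needs rebuilding, then record the new state."""
        sig = signature(command, source_contents)
        stale = self.stored.get(target) != sig
        self.stored[target] = sig
        return stale


builder = ToyBuilder()
data = b"stand-in for the contents of veltran.rsf"   # hypothetical bytes

first = builder.out_of_date("veltran.vpl", 'sfgrey title="Velocity Scan"', [data])
same = builder.out_of_date("veltran.vpl", 'sfgrey title="Velocity Scan"', [data])
changed = builder.out_of_date("veltran.vpl", 'sfgrey title="Slowness Scan"', [data])

print(first, same, changed)
```

The first build is always needed, an identical request is skipped, and editing the title changes the command string and so triggers a rebuild of that one target.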
Reproducible Research at Stanford
(Knuth, 1992)
– A computer program should be written with human readability as a primary goal.
(Claerbout and Karrenbach, 1992)
– The purpose of reproducible research is to facilitate someone going a step further by changing something.
(Buckheit and Donoho, 1995)
– An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship.
Reproducible Experiments
Within the world of science, computation is now rightly seen as a third vertex of a triangle complementing experiment and theory. However, as it is now often practiced, one can make a good case that computing is the last refuge of the scientific scoundrel […] Where else in science can one get away with publishing observations that are claimed to prove a theory or illustrate the success of a technique without having to give a careful description of the methods used, in sufficient detail that others can attempt to repeat the experiment? (LeVeque, 2006)
Maintenance
Computational experiments that are not continuously maintained lose reproducibility.
– Regression testing (Brooks, 1975)
Contribute computational software and experiments to a community-maintained repository to enable research productivity.
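A regression test for a computational experiment can be as simple as rerunning it and comparing the output with a stored reference. The Python sketch below shows one way to make that comparison robust to rounding differences between machines. The function name matches_reference, the tolerances, and the sample values are all hypothetical.

```python
import math


def matches_reference(result, reference, rel_tol=1e-6, abs_tol=1e-12):
    """Return True when every sample agrees with the reference within tolerance."""
    if len(result) != len(reference):
        return False
    return all(
        math.isclose(r, ref, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, ref in zip(result, reference)
    )


reference = [1.0, 2.0, 3.0]              # hypothetical stored output
rerun_ok = [1.0, 2.0000001, 3.0]         # differs only at rounding level
rerun_broken = [1.0, 2.5, 3.0]           # a real change in behavior

print(matches_reference(rerun_ok, reference))
print(matches_reference(rerun_broken, reference))
```

Running such a comparison whenever the shared code changes flags an experiment whose results have drifted while the change is still fresh.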
Open Science
Conclusions
Principles of Scientific Software
– Encapsulation
– File Formats
– Testing
– Reproducibility
– Maintenance
Madagascar software package
– Open source, open community, open science
http://rsf.sf.net/