Post-Processing Cactus Data
Wolfgang Kastaun
AEI Hannover
ET school, Lisbon, Sep. 2018
Post-Processing
General tasks
I Read different data formats from simulation
I Analyse data (e.g. statistics, integrals)
I Visualize data
I Need to combine different tools
I Need custom infrastructure to move data between tools
Visualization Tools
Package       Good for        Interface             License
Matplotlib    1D,2D           Python                Free
Bokeh         1D,2D           Python                Free
Plotly        1D,2D           Python                Free
(P)YGraph     1D evolution    GUI                   Free
R             1D,2D           Own language          Free
Gnuplot       simple 1D       archaic               Free
Mathematica   1D,2D           Own language          Non-Free
Yt            2D,3D           Python                Free
VisIt         3D              GUI, Python           Free
VTK           3D              C++,Python,Java,..    Free
DataVault     3D              GUI                   Free
Blender       Raytracing      GUI, Python           Free
⇒ Once data can be imported in Python, you can do anything.
Visualization Tools
Package       Good for        How to import Cactus data
Matplotlib    1D,2D           PostCactus
(P)YGraph     1D evolution    Supports Cactus
R             1D,2D           ?
Gnuplot       simple 1D       Only plain text format
Mathematica   1D,2D           SimulationTools
Yt            2D,3D           SimulationIO
VisIt         3D              Plugin
VTK           3D              PostCactus or VisIt
DataVault     3D              Conversion tool
Blender       Raytracing      PostCactus+VTK
Data Analysis Tools
Mathematica
I Non-Free
I Very specialized programming language
I Powerful, well-tested, very robust
I Access Cactus data via SimulationTools module
I Integration, differentiation, Fourier analysis, ODE solvers, statistics, vectors, matrices.
I Huge collection of mathematical methods
I Notebook interface
Data Analysis Tools
Python based
I Python is a free, well-known general purpose language
I Numerical capabilities via numpy + scipy packages
I Great for reading files (text, HDF5, JSON, ..)
I Access Cactus data via PostCactus package
I Integration, differentiation, Fourier analysis, ODE solvers, statistics
I Notebook web interfaces: Jupyter, Sage
I Sage Math: Python environment for numerics
I Sage Manifolds: Differential Geometry support
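As a minimal sketch of the numerical capabilities listed above, the following uses only NumPy on a synthetic time series. The signal and all variable names are illustrative assumptions and are not part of PostCactus.

```python
import numpy as np

# Synthetic time series: a 5 Hz sine sampled every 1 ms for 2 s (illustrative only).
t = np.linspace(0.0, 2.0, 2001)
y = np.sin(2.0 * np.pi * 5.0 * t)

# Differentiation: finite differences on the sample times.
dydt = np.gradient(y, t)

# Integration: trapezoidal rule, written out explicitly.
integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t))

# Fourier analysis: frequency of the strongest peak in the amplitude spectrum.
dt = t[1] - t[0]
spectrum = np.abs(np.fft.rfft(y))
freqs = np.fft.rfftfreq(len(y), d=dt)
f_peak = freqs[np.argmax(spectrum)]
```

For ODE solvers and statistics one would typically reach for the corresponding SciPy subpackages instead of hand-written loops.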
Workflows
Jupyter notebook
I Specialized web-server
I Can be run locally (laptop) or remotely (cluster)
I Running remote needs some setup
I Support for many languages, e.g. Python, R, Julia
I Interactive Python coding via web interface
I Self-documenting workflow
I Notebooks are files, can be shared
I Can embed plots created on the fly
I Can use version control (diffs are ugly though)
Workflows
Python scripts
I Great to automate tasks
I More reproducible
I Trivial to run on cluster
I Easily version-controlled
I Typical use case: figure in publication
Workflows
VisIt
I Explore data interactively
I Easily make movies
I Difficult to repeat stuff
I Restricted by GUI
I Can be scripted though
IPython
I Interactive Python commandline
I Convenient tab-completion, history, “magic” commands
I Good for quick one-time tasks
Workflows
Version control
I git or mercurial (hg)
I Don’t use svn, darcs, or even CVS (Ew!)
I Use on all scripts, articles, talks
I Can be used on notebooks
I Not suitable for large simulation data (maybe with git-lfs or git-annex)
I Second purpose: sync between machines
I Central repo hosting: github, gitlab, bitbucket
Workflows
Anaconda
I Installing Python environment can be tricky
I Cannot install everything because of dependency collisions, e.g. Python 2 versus Python 3
I Create virtual environments
I Need specialized package management
I Anaconda does both
Workflows
Docker
I OS inside OS
I Cheaper than virtual machines
I Provides well defined environments (images)
I Build on one machine, use everywhere
I No more OS updates breaking scripts
I Can use Anaconda and Jupyter inside container
I Can make snapshots of ongoing work
Cactus Simulation Folders
I Typically contains one subfolder for each restart
I Restarts overlap in time
I Folder structure up to user
I Separate files for different data types
I No infrastructure for metadata
I Many data formats based on HDF5 file format.
I Different file formats in use for the same type of data
I Different variants of each format
I Grid and scalar data: one file per variable or one file per group.
Total mess, impossible to use directly ⇒ Need abstraction layer
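One job of such an abstraction layer is to join data from restarts that overlap in time. The sketch below shows one possible policy, in which data from the later restart replaces the overlapping tail of the earlier one. The function name and the sample values are hypothetical, and this is not necessarily how PostCactus merges restarts.

```python
def merge_restarts(segments):
    """Merge (times, values) segments given in restart order.

    Where restarts overlap in time, samples from the later restart replace
    the tail of the earlier one (an assumed policy for this sketch).
    """
    merged_t, merged_y = [], []
    for times, values in segments:
        if not times:
            continue
        # Drop already-merged samples at or after the start of this restart.
        while merged_t and merged_t[-1] >= times[0]:
            merged_t.pop()
            merged_y.pop()
        merged_t.extend(times)
        merged_y.extend(values)
    return merged_t, merged_y


# Two made-up restarts; the second resumes from a checkpoint at t = 2.
restart0 = ([0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0])
restart1 = ([2.0, 3.0, 4.0], [12.0, 13.5, 14.0])
t, y = merge_restarts([restart0, restart1])
```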
Cactus Simulation Folders
Typical restart folder contents
I Parfile: *.par
I Logfiles for normal output (*.out) and errors (*.err)
I Scalar data Timeseries: *..asc
I Reduction results Timeseries: *.minimum.asc (minimum), *.norm2.asc (2-norm), etc
I 3D Grid data: *.xyz.h5, * file*.h5
I 2D cuts, e.g. *.xy.h5
I 1D cuts, e.g. *.x.h5 and/or *.x.asc
I Multipole data: mp_*_l*_m*_r*.asc or mp_*.h5
I GW data: mp_psi4_l*_m*_r*.asc or mp_psi4.h5
I Black hole properties BH_diagnostics.ah*.gp
I Apparent horizon shape h.t*.ah2.gp
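The naming conventions above can be turned into a simple file classifier. This is a sketch using shell-style patterns from the Python standard library. The labels, the underscores in the multipole and horizon file names, and the extra reduction suffixes (maximum, norm1) are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# (label, pattern) pairs; the first matching pattern wins, so order matters.
PATTERNS = [
    ("parameter file", "*.par"),
    ("logfile", "*.out"),
    ("logfile", "*.err"),
    ("black hole properties", "BH_diagnostics.ah*.gp"),
    ("horizon shape", "h.t*.ah*.gp"),
    ("multipole data", "mp_*.asc"),
    ("multipole data", "mp_*.h5"),
    ("reduction time series", "*.minimum.asc"),
    ("reduction time series", "*.maximum.asc"),
    ("reduction time series", "*.norm1.asc"),
    ("reduction time series", "*.norm2.asc"),
    ("scalar time series", "*..asc"),
    ("3D grid data", "*.xyz.h5"),
    ("2D cut", "*.xy.h5"),
    ("1D cut", "*.x.h5"),
    ("1D cut", "*.x.asc"),
]


def classify(filename):
    """Return the label of the first matching pattern, or 'unknown'."""
    for label, pattern in PATTERNS:
        if fnmatchcase(filename, pattern):
            return label
    return "unknown"
```

For example, `classify("rho.xy.h5")` yields `"2D cut"`, while a file that matches no pattern is reported as `"unknown"`.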
PostCactus Framework – Overview
I Access Cactus data formats from Python
  I Transparent merging of restarts
  I Hides technicalities (e.g. data formats, extensions)
I Also provides some analysis tools
  I Time Series: differentiation, resampling, FFT
  I Gravitational waves: strain, spectra
  I Grid data: interpolation, histograms, percentiles, gradient
I Some helper functions to integrate with matplotlib and VTK
  I Simplify 2D data color- and contour-plots
  I VTK: isosurfaces, volume rendering, field lines
I History: grown from piles of postprocessing scripts
PostCactus – Supported Data Types
I Grid data
  I 1D,2D,3D hdf5 (1 file per variable only)
  I 1D ASCII
I Scalar data
  I ASCII format (1 file per variable or per group)
  I Scalar, min, max, norms
  I Integrals from norms (needs grid volume)
  I Transparent decompression
I Multipole data: ASCII, HDF5
I GW signal from Ψ4 multipoles or WaveExtract (deprecated)
I Apparent horizon data
  I AHFinderDirect, QuasiLocalMeasures, IsolatedHorizons (deprecated)
  I Horizon shape: ASCII format
I Partial support for parameter files
I Partial support for parameter files
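The remark that integrals can be obtained from norms can be illustrated as follows. The sketch assumes that the 1-norm is the average of |f| over the grid points, that the 2-norm is the square root of the average of f², and that the grid is uniform with total volume V. Under these assumptions the integral of |f| is V times the 1-norm, and the integral of f² is V times the squared 2-norm. The function names and values are made up, and this is not the PostCactus API.

```python
import math


def integral_abs(norm1, volume):
    """Integral of |f|, assuming norm1 is the grid average of |f|."""
    return volume * norm1


def integral_square(norm2, volume):
    """Integral of f**2, assuming norm2 is sqrt of the grid average of f**2."""
    return volume * norm2 ** 2


# Check against a direct sum on a tiny uniform grid (made-up values).
values = [1.0, -2.0, 3.0, -4.0]
cell_volume = 0.5
volume = cell_volume * len(values)

norm1 = sum(abs(v) for v in values) / len(values)
norm2 = math.sqrt(sum(v * v for v in values) / len(values))

direct_abs = sum(abs(v) for v in values) * cell_volume
direct_sq = sum(v * v for v in values) * cell_volume
```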
PostCactus – Limitations
I Does not support one file per group hdf5 format
I Reads scalar data only in ASCII format
I More data format readers in preparation
I Parameter file language not fully supported
I No MPI support for postprocessing
I Not ready for Python 3
PostCactus Framework – Installation
Required packages
I Python 2.7
I H5Py
I NumPy, SciPy
Recommended packages
I IPython, Jupyter notebook server
I Matplotlib
I Sphinx (Documentation)
I ffmpeg (Movies)
I VTK (3D)
Easiest way to install dependencies: Anaconda Python distribution
PostCactus Framework – Installation
I Download from public repository
hg clone https://bitbucket.org/DrWhat/pycactuset
I If using Anaconda, activate environment
source activate <your environment name>
I Install Python package
cd pycactuset/PostCactus
python setup.py install
Can use --prefix= option to install in custom location
I Build documentation
cd doc
make html
See also https://bitbucket.org/DrWhat/pycactuset/wiki/Home
PostCactus – Gallery
I Cactus GW data → PostCactus → Matplotlib
I Instantaneous frequency using PostCactus timeseries
[Figure: GW strain h+ and h×, scaled to a distance of 100 Mpc, and instantaneous frequency f [kHz], both versus retarded time (t−r) [ms]]
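An instantaneous frequency like the one in this example can be obtained from the phase of the complex strain. The sketch below uses plain NumPy on a synthetic chirp. It does not use the PostCactus time series classes, and the function name and test signal are assumptions.

```python
import numpy as np


def instantaneous_frequency(t, h_plus, h_cross):
    """Frequency from the phase of the complex strain h = h+ - i hx."""
    h = np.asarray(h_plus) - 1j * np.asarray(h_cross)
    # Remove the 2*pi jumps so the phase becomes a smooth function of time.
    phase = np.unwrap(np.angle(h))
    # |dphi/dt| / (2 pi); the sign convention of the phase is ignored here.
    return np.abs(np.gradient(phase, t)) / (2.0 * np.pi)


# Synthetic chirp sweeping linearly from 1 kHz to 2.5 kHz over 30 ms.
t = np.linspace(0.0, 0.03, 3001)              # seconds
f_true = 1000.0 + 50000.0 * t                 # Hz
phi = 2.0 * np.pi * (1000.0 * t + 25000.0 * t ** 2)
h_plus, h_cross = np.cos(phi), np.sin(phi)

f_inst = instantaneous_frequency(t, h_plus, h_cross)
```

Phase unwrapping only works if the signal is sampled finely enough that the phase changes by less than π between samples, and real data usually needs smoothing before differentiation.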
PostCactus – Gallery
I Cactus GW data → PostCactus → Matplotlib
I Spectrum using PostCactus GW utilities
[Figure: GW spectrum. Axis labels: h̃_eff, f [kHz], (t−r) [ms]]
PostCactus – Gallery
I Cactus 2D + BH data → PostCactus → Matplotlib
I 2D horizon cuts done by PostCactus.
[Figure: 2D snapshots at t = 3.0 ms, 8.9 ms, and 15.0 ms. Axes: x [km], y [km]; color scale: log10(ρ [g/cm³])]
PostCactus – Gallery
I Cactus 3D data → PostCactus → VTK
I Fieldline integration and selection by custom Python code
PostCactus – Gallery
Cactus 3D data → PostCactus → VTK → Blender
PostCactus – Utilities
I simsync: transfers specified variables of a simulation
I pardiff: parses two parameter files and prints differences
I simrep: automated generation of html reports for runs
I simvideo: make movies
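To illustrate what a tool like pardiff has to do, here is a strongly simplified sketch that parses `name = value` assignments and reports the differences. It ignores multi-line values and `#` characters inside quoted strings, the function names are hypothetical, and it is not the actual pardiff implementation.

```python
def parse_parfile(text):
    """Parse 'name = value' assignments; '#' starts a comment."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def diff_params(a, b):
    """Return {name: (value_a, value_b)} for parameters that differ."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Two made-up parameter file fragments.
par_a = """
# run A
Cactus::cctk_itlast = 1000
Time::dtfac = 0.25
"""
par_b = """
Cactus::cctk_itlast = 2000   # longer run
Time::dtfac = 0.25
IO::out_dir = "run2"
"""
differences = diff_params(parse_parfile(par_a), parse_parfile(par_b))
```

A parameter present in only one file shows up with `None` on the other side.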
SimVideo framework
I Produce movies from Cactus data
I Movies implemented as Python modules
I Each contains code to plot single frame & load required data.
I Support for matplotlib and VTK
I Uses ffmpeg to assemble frames
I Code not parallel yet
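The structure described above (a module that loads data and plots single frames, plus ffmpeg to assemble them) might look roughly like this. All function names, the file naming scheme, and the placeholder "plot" are assumptions for illustration and do not reflect the real SimVideo interface.

```python
import os
import tempfile


def render_frames(load_data, plot_frame, out_dir):
    """Load the data once, then write one image file per frame."""
    paths = []
    for index, item in enumerate(load_data()):
        path = os.path.join(out_dir, "frame_%05d.png" % index)
        plot_frame(item, path)
        paths.append(path)
    return paths


def ffmpeg_command(out_dir, movie_path, fps=25):
    """Build (but do not run) an ffmpeg call that assembles the frames."""
    pattern = os.path.join(out_dir, "frame_%05d.png")
    return ["ffmpeg", "-y", "-framerate", str(fps), "-i", pattern,
            "-pix_fmt", "yuv420p", movie_path]


# Stand-ins for a real movie module (hypothetical).
def load_data():
    return [0.0, 0.5, 1.0]  # e.g. one entry per output time


def plot_frame(time, path):
    # A real module would draw with matplotlib or VTK and save an image here.
    with open(path, "w") as f:
        f.write("placeholder frame at t=%g\n" % time)


out_dir = tempfile.mkdtemp()
paths = render_frames(load_data, plot_frame, out_dir)
cmd = ffmpeg_command(out_dir, "movie.mp4")
```

With real image frames and ffmpeg installed, the command list could be passed to `subprocess.run(cmd, check=True)`.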
SimRep framework
I Automatic generation of html reports from simulation data
I Modular design, easy to design own report pages
I Python based document description language
I Can run arbitrary postprocessing scripts to get plots
I Available modules
  I Logfiles
  I Global quantities (total baryon mass, max density, lapse, ..)
  I Constraint violation
  I Performance (rudimentary, only memory and speed)
  I GW signal
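The idea of a modular, Python-based report description can be sketched with a few functions that each return an HTML fragment. The function names and the placeholder values are invented for this example and are unrelated to the actual SimRep document language.

```python
from html import escape


def section(title, rows):
    """One report module: a titled table of (name, value) rows."""
    body = "".join(
        "<tr><td>%s</td><td>%s</td></tr>" % (escape(str(k)), escape(str(v)))
        for k, v in rows
    )
    return "<h2>%s</h2><table>%s</table>" % (escape(title), body)


def report(title, sections):
    """Assemble the sections into a single HTML page."""
    return (
        "<html><head><title>%s</title></head><body><h1>%s</h1>%s</body></html>"
        % (escape(title), escape(title), "".join(sections))
    )


# Placeholder content; a real report would compute these from simulation data.
page = report(
    "Example run",
    [
        section("Global quantities", [("max density", "n/a"), ("min lapse", "n/a")]),
        section("Performance", [("memory", "n/a"), ("speed", "n/a")]),
    ],
)
```

Writing `page` to a file such as `index.html` gives a report that can be opened in any browser.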