CASA - NRAO Safe Server...3 CASA Workload Determine the landscape Characterize the problem, survey...
CASA Algorithms R&D
S. Bhatnagar
NRAO, Socorro
Outline
● Broad areas of work
1. Processing for wide-field wide-band imaging
Full-beam, Mosaic, wide-band, full-polarization
Wide-band continuum and spectral-line imaging
2. Related High Performance Computing (HPC)
Multi-threading, Cluster computing, GPU, ...
3. Pipeline processing (imaging)
4. Establish cost-performance equation
● Relatively low-FTE effort
  ● 1 – 1.5 FTEs spread across 3 – 4 scientists
Workload
● Determine the landscape
  ● Characterize the problem, survey existing solutions, estimate the parameter space, etc.
● Develop a path with scientifically useful intermediate stops
● R&D for solution (start simple), stabilize the implementation, scientific testing, characterize the algorithm, etc.
● Publish papers in appropriate refereed journals
● Integrate with production software
  ● Overheads of issues related to new code in CASA software system
● Usable HPC is necessary, so the group is also involved with the HPC effort(s)
● Write documentation, maintain code, and even provide user support
Current relevant activities
● Algorithms for wide-band wide-field imaging
  ● Frequency dependence of the sky-brightness distribution
● Instrumental: Time- and frequency-dependent PB
Heterogeneous PBs
● Scientific testing, characterization of limits
– Some results & details in later slides
● Full-polarization imaging
– In-beam polarization (full-Mueller (?) imaging)
    ● Numerical characterization of the problem
– Extend WB PB corrections to full-pol
– Wide-band full-pol. MT-MFS or Cube imaging
    ● RM synthesis
Current relevant activities
● Related HPC activities
  ● Large data + high computing load
– Data parallelism on Cluster + Multi-threading
– GPU computing
● Determine balance between computing resources and imaging performance
– Needs scientific testers with domain expertise in advanced algorithms
– Establish cost-performance equation
    ● Important for development going forward
    ● Crucial for usable and reliable pipeline processing
    ● Of great interest for the larger RA community and algorithms R&D
Current relevant activities
● Pipeline processing
  ● Develop heuristics to determine an optimal path through the imager parameter space
– Needs understanding and characterization of limits, estimate of cost-performance equations, scaling laws, etc.
● Software development [Details in presentations later]
  ● Re-factor imager framework
● Integrate with existing parallelization framework, re-integrate with new parallelization framework when it is ready
● Test for correctness, performance
– Many overheads + inherently time-consuming
Some terminology / definitions
● “Wide-band”: Frequency dependent effects are significant
– Fractional bandwidth used for imaging > ~20%
– High Spectral Index sources
● Wide-field imaging: Imaging FoV requires PB or W-term corrections
– Imaging beyond the 50% point of the PB at a reference frequency
    ● Single pointing wide-band imaging at lower EVLA bands
    ● Mosaicking (by definition!) at any of the EVLA or ALMA bands
– Imaging when $\frac{B\lambda}{f D^2} > 1$ (error due to the W-term is significant)
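The W-term criterion, Bλ/(fD²) > 1, can be checked numerically for a given array. The sketch below is only illustrative: the slide does not define its symbols, so reading B as the maximum baseline length, λ as the observing wavelength, D as the antenna diameter and f as a dimensionless scaling factor is an assumption, as are the function name and the example numbers.

```python
C_M_PER_S = 299_792_458.0  # speed of light in m/s

def w_term_ratio(baseline_m, freq_hz, dish_diameter_m, f=1.0):
    """Return B * lambda / (f * D**2) under the assumed symbol meanings."""
    wavelength_m = C_M_PER_S / freq_hz
    return baseline_m * wavelength_m / (f * dish_diameter_m ** 2)

# Hypothetical example: 36 km baseline, 1.5 GHz, 25 m dishes, f = 1 (assumed values)
ratio = w_term_ratio(36_000.0, 1.5e9, 25.0)
needs_w_correction = ratio > 1.0
print(f"B*lambda/(f*D^2) = {ratio:.1f}, W-term significant: {needs_w_correction}")
```

With these assumed numbers the ratio is well above 1, which is the regime the slide flags as needing W-term corrections.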
Some terminology / definitions
● MT-MFS: Multi-term Multi-Frequency Synthesis algorithm
– To account for the frequency dependence of the sky brightness distribution
– Important for fractional bandwidth of > ~20% and dynamic range (DR) > ~10^3
● A-Projection: Algorithm to correct for Direction-dependent (DD) effects (PB effects) as a function of time and polarization
– Useful for sensitive spectral-line imaging
– a.k.a “Narrow Band A-Projection” or “NB A-Projection”
● WB A-Projection: Algorithm to also account for frequency dependent DD effects (frequency dependent PB)
– PB corrections beyond 50% point in single-pointing imaging
– For accurate mosaic imaging at DR in the range of few x 10^3
– Probably at even lower DR for full-pol imaging at any of the ALMA or EVLA bands
Wideband imaging
● MT-MFS for frequency dependence of sky brightness [Rau & Cornwell, A&A, 2010]
$S(\nu, \vec{l}) \propto S(\nu_o, \vec{l}) \left( \frac{\nu}{\nu_o} \right)^{\alpha(\nu, \vec{l})}$
• 3C286, BW = 1.0-2.1 GHz
  ● No wide-band modeling of the sky emission
  ● DR: 1600
• 3C286, BW = 1.0-2.1 GHz
  ● With MS-MFS (frequency-dependent model for the sky emission)
  ● DR: >110,000
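To see why a single-term (flat-spectrum) model breaks down over this band, the power law can be compared with its truncated Taylor expansion in (ν − ν_o)/ν_o, which is the kind of multi-term spectral model that MT-MFS fits. This is a minimal sketch for one pixel only: the spectral index of −0.7, the choice of band centre as reference frequency, and all function names are assumptions for illustration, not values from the slides.

```python
import numpy as np

def power_law(flux_ref, freq, freq_ref, alpha):
    """Flux at freq for a pure power-law spectrum with spectral index alpha."""
    return flux_ref * (freq / freq_ref) ** alpha

def taylor_model(flux_ref, freq, freq_ref, alpha, nterms):
    """Truncated Taylor expansion of the power law in x = (freq - freq_ref) / freq_ref."""
    x = (freq - freq_ref) / freq_ref
    coeffs = [1.0, alpha, alpha * (alpha - 1.0) / 2.0][:nterms]
    return flux_ref * sum(c * x ** k for k, c in enumerate(coeffs))

freqs = np.linspace(1.0e9, 2.1e9, 12)   # band quoted on the slide
freq_ref = 1.55e9                        # band centre (assumed reference frequency)
alpha = -0.7                             # hypothetical spectral index
fractional_bw = (freqs[-1] - freqs[0]) / freq_ref

truth = power_law(1.0, freqs, freq_ref, alpha)
max_err = {
    n: float(np.max(np.abs(taylor_model(1.0, freqs, freq_ref, alpha, n) - truth)))
    for n in (1, 2, 3)
}
print(f"fractional bandwidth = {fractional_bw:.2f}")
for n, err in max_err.items():
    print(f"{n} Taylor term(s): max model error = {err:.3f} of reference flux")
```

The worst-case model error shrinks as terms are added, which is consistent with the slide's point that ignoring the frequency dependence limits the achievable dynamic range.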
PB effects: Characterization
● A-Projection for in-beam time and pol effects
● WB A-Projection for frequency dependence
$I^{\mathrm{Continuum}}(\vec{l}, \mathrm{Pol}) = \iint I(\vec{l}, \nu)\, PB(\vec{l}, \nu, t, \mathrm{Pol})\, d\nu\, dt$
[Equation annotations: time-dependent DD effects; Pol-dependent DD effects]
Time and polarization dependence
● Effects of time and polarization dependence of the PB
[Figure: Stokes-I and Stokes-V panels. Labels: errors due to PB squint + rotation + pointing errors; purely instrumental Stokes-V artifacts; due to avg. PB]
PB Effects: Frequency dependence
● WB A-Projection for in-beam frequency dependence
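$I^{\mathrm{Continuum}}(\vec{l}, \mathrm{Pol}) = \iint I(\vec{l}, \nu)\, PB(\vec{l}, \nu, t, \mathrm{Pol})\, d\nu\, dt$
$I^{\mathrm{Spectral}}(\nu, \vec{l}, \mathrm{Pol}) = \int I(\nu, \vec{l})\, PB(\vec{l}, \nu, t, \mathrm{Pol})\, dt$
[Figure: PB frequency dependence (blue curve)]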
Wideband widefield imaging: Characterization
● Effect of instrumental frequency dependence
[Figure labels: pulsar, spectral index -3.0; artificially steep spectral index]
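The artificial steepening can be illustrated with a toy model. The sketch below assumes a Gaussian PB whose width scales as 1/ν; the slides do not specify a PB model, so the Gaussian shape, the 30 arcmin reference width and the function names are all assumptions. Under that model, a flat-spectrum source away from the pointing centre picks up an apparent spectral index purely from the instrument.

```python
import numpy as np

def gaussian_pb(theta, freq, fwhm_ref, freq_ref):
    """Assumed Gaussian primary beam whose FWHM scales as 1/freq."""
    fwhm = fwhm_ref * freq_ref / freq
    return np.exp(-4.0 * np.log(2.0) * (theta / fwhm) ** 2)

def pb_induced_alpha(theta, fwhm_ref, freq_ref, step=0.01):
    """Local log-log slope of the PB gain versus frequency at offset theta."""
    f_lo, f_hi = freq_ref * (1.0 - step), freq_ref * (1.0 + step)
    g_lo = gaussian_pb(theta, f_lo, fwhm_ref, freq_ref)
    g_hi = gaussian_pb(theta, f_hi, fwhm_ref, freq_ref)
    return float(np.log(g_hi / g_lo) / np.log(f_hi / f_lo))

fwhm_ref = 30.0    # arcmin at the reference frequency (hypothetical value)
freq_ref = 1.5e9   # Hz (hypothetical value)

alpha_center = pb_induced_alpha(0.0, fwhm_ref, freq_ref)
alpha_half_power = pb_induced_alpha(fwhm_ref / 2.0, fwhm_ref, freq_ref)
print(f"PB-induced spectral index at beam centre:      {alpha_center:+.2f}")
print(f"PB-induced spectral index at half-power point: {alpha_half_power:+.2f}")
```

In this toy model the induced index at the half-power point is about −1.4 (analytically −2 ln 2), so an uncorrected off-axis source looks considerably steeper than it is.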
Wideband widefield imaging: Performance evaluation
● Combined MT-MFS and WB A-Projection algorithm
[Figure: image panels labeled MFS + Standard Imaging; MT-MFS + NB A-Projection; MT-MFS + WB A-Projection; MT-MFS + Standard Imaging; WB A-Projection. Ap.J., 2013]
Wideband widefield imaging: Performance evaluation
● Characterize performance, limits [Rau & Bhatnagar, in prep.]
● Heterogeneous PB correction [Kundert & Rau, Masters thesis]
– Time-, shape-dependence, in-beam effects important at DR > 10^4
– Size-dependent functions sufficient for ALMA for now (usable already)
– Size-dependent full-pol support for ALMA may be required next
[Figure: comparison panels. Labels in transcript order: MT-MFS + WB A-Projection, 2 uJy rms; Cube + NB A-Projection, 3 uJy rms; MT-MFS; Cube; brightest source: 100 mJy; 4 uJy rms, peak res: 20 uJy; 6 uJy rms*, peak res: 15 uJy]
Wideband Mosaic Imaging
● Characterize performance, limits [Rau & Bhatnagar, in prep.]
[Figure panels: Intensity: Reconstructed / True; Alpha: Reconstructed - True]
Wideband Mosaic Imaging
● Characterize performance, limits [Rau & Bhatnagar, in prep.]
[Figure panels: RMS: 0.3 uJy; Intensity: Reconstructed / True; Alpha: Reconstructed - True]
Full polarization imaging: Work in progress
● Extend WB A-Projection to full-polarization (full-Mueller?) [PhD Thesis project of P. Jagannathan]
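$\begin{bmatrix} I_{I}^{o} \\ I_{Q}^{o} \\ I_{U}^{o} \\ I_{V}^{o} \end{bmatrix} = \big[\, \text{4x4 direction-dependent Mueller matrix (in Stokes basis); entries not preserved in this transcript} \,\big] \begin{bmatrix} I_{I}^{obs} \\ I_{Q}^{obs} \\ I_{U}^{obs} \\ I_{V}^{obs} \end{bmatrix}$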
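How a Mueller matrix couples the Stokes parameters at one sky position can be sketched with a small numerical example. Everything numeric here is hypothetical: the matrix entries are made-up leakage terms, not measured EVLA or ALMA values, and the convention that the observed Stokes vector equals the matrix times the true one is an assumption, since the transcript does not preserve the matrix or its direction of application.

```python
import numpy as np

# Hypothetical Mueller matrix at one direction: identity plus small leakage terms.
mueller = np.eye(4)
mueller[1, 0] = 0.01   # assumed I -> Q leakage
mueller[2, 0] = 0.005  # assumed I -> U leakage
mueller[3, 0] = 0.02   # assumed I -> V leakage (e.g. a squint-like effect)

# Hypothetical true Stokes vector (I, Q, U, V) with no circular polarization.
stokes_true = np.array([1.0, 0.05, 0.02, 0.0])

# Assumed convention: observed = Mueller @ true.
stokes_obs = mueller @ stokes_true

# Correction step: solve the 4x4 linear system instead of forming the inverse.
stokes_corrected = np.linalg.solve(mueller, stokes_obs)

print("observed :", stokes_obs)
print("corrected:", stokes_corrected)
```

Even though the true Stokes-V is zero, the observed vector carries a spurious V of 2% of Stokes-I in this toy case, which is the kind of purely instrumental artifact that a full-Mueller correction is meant to remove.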
Parameter Space for HPC
● In terms of algorithm design
  ● Move towards higher compute-to-I/O ratio
● Minimize memory footprint
– Remain inside the green box
[Figure: parameter-space diagram. Labels: Computing; I/O; Memory; Compute-to-I/O ratio; more memory per FLOP; less memory per FLOP]
Related HPC
● Large data volumes (few 100 GB to few TB), higher computing load, higher memory footprint
● Distributed major-cycle on compute cluster + Lustre FS [EVLA Memo #132,133, 2009]
– Favorable compute-to-I/O ratio
– Good scaling: 60 – 70% efficiency
● Memory footprint an issue beyond a certain scale. Solutions:
– Multi-threaded gridding [Golap]
    ● A single instance of gridder utilizing all available cores
– Optimal W-Projection planes [Golap]
    ● Determine number of w-planes from the data rather than FoV
● Scientific testing in progress
● Frequency resolution for wide-band PB correction
● Rotation with PA: interpolation vs. caching
● Oversampling
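The scaling efficiency quoted above is conventionally the measured speedup divided by the number of processes. A minimal sketch of that arithmetic follows; the run times and process count are hypothetical numbers chosen only to land inside the quoted 60 – 70% range, and the function name is illustrative.

```python
def parallel_efficiency(serial_time, parallel_time, n_processes):
    """Return (speedup, efficiency), where efficiency = speedup / n_processes."""
    speedup = serial_time / parallel_time
    return speedup, speedup / n_processes

# Hypothetical timings: 100 h on one process, 4 h on 40 processes.
speedup, efficiency = parallel_efficiency(100.0, 4.0, 40)
print(f"speedup = {speedup:.1f}x, efficiency = {efficiency:.1%}")
```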
Testing: MT-MFS + WB AWP + Mosaic + HPC Result
● 80-pointing EVLA WB mosaic @L-Band
● MTMFS + WB A-P using ~40 processes
[Rau & Bhatnagar, (work in progress)]
[Figure: Stokes-I image]
Testing: MT-MFS + WB AWP + Mosaic + HPC Result
● 80-pointing EVLA WB mosaic @L-Band
● MTMFS + WB A-P using ~40 processes
● Unresolved issues
  ● Numerical noise with wide-band and “large” number of pointings
● TODO
  ● Evaluate solutions for WB OTFM
[Rau & Bhatnagar, (work in progress)]
[Figure: intensity-weighted spectral index]
Imaging pipeline
● The imager can be configured into a large number of states with vastly varying computing cost and imaging performance
● Staged development: Heuristics to determine imager parameters
  ● Minimize computing costs and maximize imaging performance
● Auto-flagging: Existing algorithms (tfcrop, rflag) a good start, but need heuristics to use in a pipeline
Other, longer-term activities
● Asp-Clean type deconvolution algorithm [L. Zhang’s PhD thesis]
  ● Positively impacts memory footprint and Spectral Index imaging performance
● CompSens ideas built-in
– Evaluate other similar ideas
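[Figure: residuals in the image domain ($I^{d} - B I^{M}$) and in the visibility domain ($V^{True} - V^{Model}$) for four deconvolution algorithms. Niter: Clean ~60K; MEM 50; MS-Clean ~15K; Asp-Clean ~1K]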
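For context on the iteration counts being compared, the simplest of the four algorithms can be written in a few lines. The following is a textbook one-dimensional Hogbom-style CLEAN loop, shown only as a sketch: it is not Asp-Clean and not the CASA implementation, and the sinc-shaped PSF, loop gain, threshold and source position are assumed values.

```python
import numpy as np

def hogbom_clean(dirty, psf, gain=0.1, threshold=1e-3, max_iter=10_000):
    """Textbook 1-D Hogbom CLEAN using circular shifts.

    The PSF must have the same length as the dirty signal, with its
    peak (normalized to 1) at index len(psf) // 2.
    """
    residual = dirty.astype(float).copy()
    model = np.zeros_like(residual)
    center = len(psf) // 2
    n_components = 0
    while n_components < max_iter:
        peak = int(np.argmax(np.abs(residual)))
        value = residual[peak]
        if abs(value) < threshold:
            break
        # Add a fraction of the peak to the model and subtract the
        # correspondingly scaled, shifted PSF from the residual.
        model[peak] += gain * value
        residual -= gain * value * np.roll(psf, peak - center)
        n_components += 1
    return model, residual, n_components

n = 128
x = np.arange(n) - n // 2
psf = np.sinc(x / 4.0)                  # assumed PSF with sidelobes, peak 1 at the centre
source_pos, source_flux = 40, 1.0       # hypothetical point source
dirty = source_flux * np.roll(psf, source_pos - n // 2)

model, residual, n_components = hogbom_clean(dirty, psf)
print(f"recovered flux = {model[source_pos]:.4f} after {n_components} components")
```

Even a single point source takes dozens of components at a loop gain of 0.1, which gives a feel for why scale-insensitive CLEAN needs very large iteration counts on extended emission.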
Other, longer-term activities
● GPU computing
  ● Collaboration with NVIDIA Dev. Tech. Division + Univ. group
● Efficient for on-the-fly (OTF) convolution function computation and multi-scale computations
● Not yet clear if useful for gridding (the dominant cost)
● PB measurements/modeling
[Figure labels: Total; FFT; Multi-scale image computation]
Summary
● Pace of work is resource limited
● Resources with the right skill set are crucial
● Following the 2009 memo plan and EVLA/ALMA requirements
● Wide-band continuum imaging requires MT-MFS for DR > few x 10^3
● PB effects important @ DR > few x 10^3 - 10^4; @ few x 10^3 for full-pol.
● MT-MFS + WB A-P required for mosaic and spectral index mapping
● Work in progress
● Test combined WB imaging algorithm (including mosaicking) in production code
● Test deployment on HPC platforms (necessary for practical usability)
● Characterizing effects of in-beam polarization for full-pol imaging
● Towards developing wide-field full-band full-pol imaging capability
● Research deconvolution algorithms with smaller memory footprint