Workshop on Semiparametric Methodology, University of of ...

1
Workshop on Semiparametric Methodology, University of of Florida, Jan. 2009 Incorporating Systematic Uncertainties into Spectral Fitting H.Lee * , V.Kashyap, J.Drake, A.Connors, R.Izem, T.Park, P.Ratzlaff, A.Siemiginowska, D.van Dyk, A.Zezas H. Lee * Harvard-Smithsonian Center for Astrophysics [email protected] Introduction X-ray spectral fitting is a process of solving the inverse problem (eq.1) to infer θ, the parameter(s) of a source model S. The observed Poisson photon counts O(E i ) are the main source of the uncertainty in θ estimates, so called statistical error, whereas the systematic uncertainties in A (effective area) and R (response matrix) have been ignored in spectral fitting. This igno- rance generally underestimates the error bars of θ. This presentation focuses on handling the A uncertainty in the spectral fitting process and illustrates an efficient way to obtain calibration uncertainty incorporated error bars. O(E i )= S (E ; θ )R(E i ; E )A(E )dE (1) Effective Area (arf) The plot below shows the coverage of a sample of 1000 ACIS-S arfs gen- erated by Drake et al. (2006) and the default arf (a o ) is in a black line. E [keV] ACIS-S effective area (cm 2 ) 0 200 400 600 800 0.2 1 10 In order to incorporate the arf uncertainty into spectral fitting that affects error calibration results, we propose Bayesian hierarchical modeling for spec- tral fitting by devising the MCMC algorithms by van Dyk et al. (2001). Summarizing the black box of arfs (A) Principal Component Analysis (PCA) reduces dimensionality and summa- rizes the arf set A with a small number of principal components (PCs), ready to be utilized into spectral fitting instead of the entire 1000 arf sample by calibration scientists. Let a j ={1,...,M } ∈A be given arfs by calibration scientists on which we perform PCA. Scree plot: 8 PCs explain 96% of total variation; 12 PCs explain 99%. We will use the first 8 PCs (v n ) and 8 coefficients (r n ) to simu- late arfs. Then, an arf a (j * ) is generated via: PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 PC11 PC12 0.0 0.1 0.2 0.3 0.4 0.44 0.68 0.78 0.87 0.91 0.93 0.95 0.96 0.97 0.98 0.98 0.99 Contribution a (j * ) = a * o + δ a + 8 n=1 e n r n v n , e n N (0, 1) (2) where a * o is the supplied default arf, a o is the default arf, ¯ a the mean of a j s, and δ a a - a o . The figure below shows the coverage of simulated 1000 arfs in red lines. The 8 PCs are sufficient to match the arf uncertainty represented by gray lines. E [keV] ACIS-S effective area (cm 2 ) 0 200 400 600 800 0.2 1 10 Marginalizing over arfs p(θ |y)= A p(θ |y, a)p(a)da = 1 M M j =1 p(θ |y, a j ) How to marginalize over arfs? Drake et al. (2006) proposed a strategy [B.0] using standard packages (e.g., XSPEC). We propose three algorithms [B.1-B.3] with BLoCXS (van Dyk et al., 2001). [B.0] tends to be tedious and time consuming depending on the size of the arf library, whose Bayesian counterpart is [B.1]. To speed up [B.1], we introduce [B.2] by selecting arfs randomly from the arf library. To improve computational efficiency, we introduce [B.3]. Given the observed spectrum, [B.0-B.3] work as follows: [B.0] Fit with XSPEC Require: M arfs and spectral fitting engines; for j =1, ..., M do Set a new arf a j and fit the spectrum yielding a best fit ˆ θ j . end for Compute mean and variance of { ˆ θ j } j =1,...,M . Repeat fitting procedures as many times as the size of the arf library instead of the supplied default arf. Very tedious!!! [B.1] Fit with Gibbs sampler Require: M arfs and Bayesian spectral fitting engines; Set initial values including priors for j =1, ..., M do repeat Augment data y k |j given θ k -1|j and a j Draw θ k |j from p(θ |y k |j , a j ) until the chain {θ k |j } is stable, k =1, ..., n j . Drop n b draws of a burn-in period. end for Compute mean and variance of {θ k |j } j =1,...,M . Extra-tedious! The individual gibbs sequence {θ k |j } offers a statistical error that varies depending on an arf. See the plot at the lower right. [B.2] Fit with randomized arfs Require: M arfs and Bayesian spectral fitting engines; Set initial values including priors repeat Choose a (j ) randomly among M arfs. Augment data y k (j ) given θ k -1(j -1) and a (j ) Draw θ k (j ) from p(θ |y k (j ) , a (j ) ) until the chain {θ k (j ) } is stable, k =1, ..., n. Drop n b draws of a burn-in period. Compute mean/variance or mode/HPD from {θ k (j ) }. Randomizing arfs saves the for loop in [B.1]. [B.3] Fit with PC simulated arfs Require: PCs (v n ), coefficients (r n ), and spectral fitting engines; Set initial values including priors repeat Simulate a (j * ) based on PCs. (see eq.(2).) Augment data y k (j * ) given θ k -1(j * -1) and a (j * ) Draw θ k (j * ) from p(θ |y k (j * ) , a (j * ) ) until the chain {θ k (j * ) } is stable, k =1, ..., n. Drop n b draws of a burn-in period. Compute mean/variance or mode/HPD from {θ k (j * ) }. We distinguish (j * ), PC simulation from (j ), randomization. Comparison across algorithms Results from these algorithms work very similarly as shown below but [B.3] is most efficient. One histogram of best fits [B.0] and three posterior density profiles [B.1-B.3] from fitting an absorbed power-law spectrum of photon index α =2, column density N H = 10 23 cm -2 , and total counts 10 5 are shown. The black bar indicates a best fit± ˆ σ only with the default arf. The widths of posterior densities represent errors including calibration un- certainty. α 1.6 1.8 2.0 2.2 2.4 B.0 B.1 B.2 B.3 N H 9.0 9.5 10.0 10.5 11.0 B.0 B.1 B.2 B.3 Another absorbed power-law spectrum (α =1, N H = 10 21 cm -2 , 10 5 cnts). α 0.85 0.95 1.05 1.15 B.0 B.1 B.2 B.3 N H 0.06 0.08 0.10 0.12 0.14 B.0 B.1 B.2 B.3 How many arfs? PCs and coefficients depend on the arf library provided by calibration sci- entists but our results from PCA indicate that a relatively small number of arfs is sufficient to incorporate calibration uncertainty instead of thousands. Law of Total Variance (LTV) LTV explains the complexity of the error decomposition. A best fit depends on arfs and its uncertainty has two components, statistical error and cali- bration error which are not independent. LTV indicates that the calibration error is dominant with high count data where the statistical error becomes minuscule. This law also explains that [B.3] of 8 PCs (96% calibration error) tends to result in slightly narrower profiles than other algorithms. V [θ ]= V [E [θ |a]] + E [V [θ |a]] Behaviors of calibration and statistical errors Depending on the model used, these two errors may not be separable. In the plot below, two groups of 15 similar arfs are colored and the histograms of gibbs sequences are colored according to the arf colors (default arf in black). The shifting patterns of posteriors do not match between these two spectra. This figure clearly shows that best fit values change with arfs and that calibration uncertainty must be incorporated into spectral fitting. an absorbed power-law spectrum E [keV] ACIS-S effective area (cm 2 ) 0 200 400 600 800 0.3 1 8 α =2, N H = 10 23 cm -2 , 10 5 counts α 1.7 1.8 1.9 2.0 2.1 2.2 2.3 N H 9.0 9.5 10.0 10.5 α =1, N H = 10 21 cm -2 , 10 5 counts α 0.85 0.95 1.05 1.15 N H 0.08 0.10 0.12 0.14 Asymptotics of calibration error In the figure below, the horizontal solid line represents the average uncer- tainty derived from [B.0] and the dashed lines represent the range in this uncertainty obtained from 20 simulations (α =2, N H = 10 23 cm -2 , 10 5 counts). Also shown are the results obtained from combining posterior pdfs by using different numbers of arfs. Dots represent the mean uncertainty and vertical bars denote errors on the means; in other words, N arfs from 1000 are randomly chosen to get the uncertainty of 1 N (j ) p(θ |y, a (j ) ) for 200 times, and the means and rms errors of these uncertainties are the dots and bars. This figure shows that after N25, the estimated uncertainty is stabi- lized and therefore, 25 fits with different arfs are sufficient to account for calibration uncertainty provided that the full posterior pdf on the parameters is obtained. ●●● ●● ●●● ●●●● ●●●●●●●●●●● ●●●●● 0 10 20 30 40 50 0.04 0.08 0.12 N σ tot α ●● ●● ●● ●●●● ●●●●●● ●● ●● ●●●●●● 0 10 20 30 40 50 0.10 0.20 0.30 N σ tot N H Analyzing Quasar Spectra Sixteen radio loud quasar spectra from Chandra Data Archive (CDA) are analyzed based on the PowerLaw*abs model. Both panels display the error characteristics of estimated power law index α. Calibration error with/without the arf uncertainty is denoted by σ tot /σ stat . Three or 4 digit numbers indicate ObsID in CDA. The left panel shows non zero limits in er- rors due to calibration uncertainty. The right panel displays that systematic errors become more significant in high count spectra than low count ones. Summary We have developed a fast, robust, and general method to incorporate effec- tive area calibration uncertainties in model fitting of low-resolution spectra. Because such uncertainties are ignored during spectral fits, the error bars derived for model parameters are generally underestimated. Incorporating them directly into spectral analysis with existing analysis packages is not possible without extensive case-specific simulations, but it is possible to do so in a generalized manner in a Markov chain Monte Carlo framework. We describe our implementation of this method here, in the context of recently codified Chandra effective area uncertainties. We develop our method and apply it to both simulated as well as actual Chandra ACIS-S data. We estimate the posterior probability densities of absorbed power-law model pa- rameters that include the effects of such uncertainties. Overall, a single run of the Bayesian spectral fitting algorithm incorporates calibration uncertainty effectively. References Drake, J. et al. (2006). Proc. SPIE, 6270, p.49 van Dyk, D. et al. (2001). ApJ, 548(1), p.224 Acknowledgment This work was supported by NASA/AISRP grant NNG06GF17G.

Transcript of Workshop on Semiparametric Methodology, University of of ...

Workshop on Semiparametric Methodology, University of of Florida, Jan. 2009

Incorporating Systematic Uncertainties into Spectral FittingH.Lee∗, V.Kashyap, J.Drake, A.Connors, R.Izem, T.Park, P.Ratzlaff, A.Siemiginowska, D.van Dyk, A.Zezas

H. Lee∗Harvard-SmithsonianCenter for [email protected]

IntroductionX-ray spectral fitting is a process of solving the inverse problem (eq.1) to

infer θ, the parameter(s) of a source model S. The observed Poisson photon

counts O(Ei) are the main source of the uncertainty in θ estimates, so called

statistical error, whereas the systematic uncertainties in A (effective area)

and R (response matrix) have been ignored in spectral fitting. This igno-

rance generally underestimates the error bars of θ. This presentation focuses

on handling the A uncertainty in the spectral fitting process and illustrates

an efficient way to obtain calibration uncertainty incorporated error bars.

O(Ei) =

∫S(E; θ)R(Ei; E)A(E)dE (1)

Effective Area (arf)The plot below shows the coverage of a sample of 1000 ACIS-S arfs gen-

erated by Drake et al. (2006) and the default arf (ao) is in a black line.

E [keV]

ACIS

−S e

ffecti

ve a

rea

(cm

2 )

020

040

060

080

0

0.2 1 10

In order to incorporate the arf uncertainty into spectral fitting that affects

error calibration results, we propose Bayesian hierarchical modeling for spec-

tral fitting by devising the MCMC algorithms by van Dyk et al. (2001).

Summarizing the black box of arfs (A)Principal Component Analysis (PCA) reduces dimensionality and summa-

rizes the arf set A with a small number of principal components (PCs),

ready to be utilized into spectral fitting instead of the entire 1000 arf sample

by calibration scientists. Let aj={1,...,M} ∈ A be given arfs by calibration

scientists on which we perform PCA.

Scree plot: 8 PCs explain

96% of total variation; 12

PCs explain 99%. We will

use the first 8 PCs (vn) and

8 coefficients (rn) to simu-

late arfs. Then, an arf a(j∗)is generated via:

PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 PC11 PC12

0.0

0.1

0.2

0.3

0.4 0.44

0.680.780.870.910.930.950.960.970.980.980.99

Contribution

a(j∗) = a∗o + δa +

8∑n=1

enrnvn, en∼N(0, 1) (2)

where a∗o is the supplied default arf, ao is the default arf, a the mean of

ajs, and δa = a − ao. The figure below shows the coverage of simulated

1000 arfs in red lines. The 8 PCs are sufficient to match the arf uncertainty

represented by gray lines.

E [keV]

ACIS

−S e

ffecti

ve a

rea

(cm

2 )

020

040

060

080

0

0.2 1 10

Marginalizing over arfs

p(θ|y) =∫A p(θ|y, a)p(a)da = 1

M

∑Mj=1 p(θ|y, aj)

How to marginalize over arfs?Drake et al. (2006) proposed a strategy [B.0] using standard packages (e.g.,

XSPEC). We propose three algorithms [B.1-B.3] with BLoCXS (van Dyk et

al., 2001). [B.0] tends to be tedious and time consuming depending on the

size of the arf library, whose Bayesian counterpart is [B.1]. To speed up

[B.1], we introduce [B.2] by selecting arfs randomly from the arf library. To

improve computational efficiency, we introduce [B.3]. Given the observed

spectrum, [B.0-B.3] work as follows:

[B.0] Fit with XSPEC

Require: M arfs and spectral fitting engines;

for j = 1, ...,M doSet a new arf aj and fit the spectrum yielding a best fit θj.

end forCompute mean and variance of {θj}j=1,...,M .

Repeat fitting procedures as many times as the size of the arf library instead

of the supplied default arf. Very tedious!!!

[B.1] Fit with Gibbs sampler

Require: M arfs and Bayesian spectral fitting engines;

Set initial values including priors

for j = 1, ...,M dorepeat

Augment data yk|j given θk−1|j and aj

Draw θk|j from p(θ|yk|j, aj)

until the chain {θk|j} is stable, k = 1, ..., nj.

Drop nb draws of a burn-in period.

end forCompute mean and variance of {θk|j}j=1,...,M .

Extra-tedious! The individual gibbs sequence {θk|j} offers a statistical error

that varies depending on an arf. See the plot at the lower right.

[B.2] Fit with randomized arfs

Require: M arfs and Bayesian spectral fitting engines;

Set initial values including priors

repeatChoose a(j) randomly among M arfs.

Augment data yk(j) given θk−1(j−1) and a(j)Draw θk(j) from p(θ|yk(j), a(j))

until the chain {θk(j)} is stable, k = 1, ..., n.

Drop nb draws of a burn-in period.

Compute mean/variance or mode/HPD from {θk(j)}.

Randomizing arfs saves the for loop in [B.1].

[B.3] Fit with PC simulated arfs

Require: PCs (vn), coefficients (rn), and spectral fitting engines;

Set initial values including priors

repeatSimulate a(j∗) based on PCs. (see eq.(2).)

Augment data yk(j∗) given θk−1(j∗−1) and a(j∗)Draw θk(j∗) from p(θ|yk(j∗), a(j∗))

until the chain {θk(j∗)} is stable, k = 1, ..., n.

Drop nb draws of a burn-in period.

Compute mean/variance or mode/HPD from {θk(j∗)}.

We distinguish (j∗), PC simulation from (j), randomization.

Comparison across algorithmsResults from these algorithms work very similarly as shown below but [B.3]

is most efficient. One histogram of best fits [B.0] and three posterior density

profiles [B.1-B.3] from fitting an absorbed power-law spectrum of photon

index α = 2, column density NH = 1023cm−2, and total counts ∼ 105

are shown. The black bar indicates a best fit±σ only with the default arf.

The widths of posterior densities represent errors including calibration un-

certainty.

αα

1.6 1.8 2.0 2.2 2.4

B.0B.1B.2B.3

NH

9.0 9.5 10.0 10.5 11.0

B.0B.1B.2B.3

Another absorbed power-law spectrum (α = 1, NH = 1021cm−2, ∼ 105 cnts).

αα

0.85 0.95 1.05 1.15

B.0B.1B.2B.3

NH

0.06 0.08 0.10 0.12 0.14

B.0B.1B.2B.3

How many arfs?PCs and coefficients depend on the arf library provided by calibration sci-

entists but our results from PCA indicate that a relatively small number of

arfs is sufficient to incorporate calibration uncertainty instead of thousands.

Law of Total Variance (LTV)LTV explains the complexity of the error decomposition. A best fit depends

on arfs and its uncertainty has two components, statistical error and cali-

bration error which are not independent. LTV indicates that the calibration

error is dominant with high count data where the statistical error becomes

minuscule. This law also explains that [B.3] of 8 PCs (96% calibration error)

tends to result in slightly narrower profiles than other algorithms.

V [θ] = V [E[θ|a]] + E[V [θ|a]]

Behaviors of calibration and statistical errorsDepending on the model used, these two errors may not be separable. In

the plot below, two groups of 15 similar arfs are colored and the histograms

of gibbs sequences are colored according to the arf colors (default arf in

black). The shifting patterns of posteriors do not match between these two

spectra. This figure clearly shows that best fit values change with arfs and

that calibration uncertainty must be incorporated into spectral fitting.

an absorbed

power-law

spectrum

E [keV]

ACIS

−S e

ffect

ive

area

(cm

2 )

020

040

060

080

0

0.3 1 8

α = 2,

NH =

1023cm−2,

∼ 105 counts

αα

1.7 1.8 1.9 2.0 2.1 2.2 2.3

NH

9.0 9.5 10.0 10.5

α = 1,

NH =

1021cm−2,

∼ 105 counts

αα

0.85 0.95 1.05 1.15

NH

0.08 0.10 0.12 0.14

Asymptotics of calibration errorIn the figure below, the horizontal solid line represents the average uncer-

tainty derived from [B.0] and the dashed lines represent the range in this

uncertainty obtained from 20 simulations (α = 2, NH = 1023cm−2, ∼ 105

counts). Also shown are the results obtained from combining posterior pdfs

by using different numbers of arfs. Dots represent the mean uncertainty and

vertical bars denote errors on the means; in other words, N arfs from 1000

are randomly chosen to get the uncertainty of 1N

∑(j) p(θ|y, a(j)) for 200

times, and the means and rms errors of these uncertainties are the dots and

bars. This figure shows that after N≈25, the estimated uncertainty is stabi-

lized and therefore, ∼25 fits with different arfs are sufficient to account for

calibration uncertainty provided that the full posterior pdf on the parameters

is obtained.

●●

●●

●●●●●●●●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

0 10 20 30 40 50

0.0

40

.08

0.1

2

N

σσ to

t

αα●

●●

●●

●●

●●●●●●

●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

0 10 20 30 40 50

0.1

00

.20

0.3

0

N

σσ to

t

NH

Analyzing Quasar SpectraSixteen radio loud quasar spectra from Chandra Data Archive (CDA) are

analyzed based on the PowerLaw*abs model. Both panels display the

error characteristics of estimated power law index α. Calibration error

with/without the arf uncertainty is denoted by σtot/σstat. Three or 4 digit

numbers indicate ObsID in CDA. The left panel shows non zero limits in er-

rors due to calibration uncertainty. The right panel displays that systematic

errors become more significant in high count spectra than low count ones.

SummaryWe have developed a fast, robust, and general method to incorporate effec-

tive area calibration uncertainties in model fitting of low-resolution spectra.

Because such uncertainties are ignored during spectral fits, the error bars

derived for model parameters are generally underestimated. Incorporating

them directly into spectral analysis with existing analysis packages is not

possible without extensive case-specific simulations, but it is possible to do

so in a generalized manner in a Markov chain Monte Carlo framework. We

describe our implementation of this method here, in the context of recently

codified Chandra effective area uncertainties. We develop our method and

apply it to both simulated as well as actual Chandra ACIS-S data. We

estimate the posterior probability densities of absorbed power-law model pa-

rameters that include the effects of such uncertainties. Overall, a single run

of the Bayesian spectral fitting algorithm incorporates calibration uncertainty

effectively.

ReferencesDrake, J. et al. (2006). Proc. SPIE, 6270, p.49

van Dyk, D. et al. (2001). ApJ, 548(1), p.224

AcknowledgmentThis work was supported by NASA/AISRP grant NNG06GF17G.