A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis...
-
Upload
mae-wilkinson -
Category
Documents
-
view
220 -
download
1
Transcript of A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis...
![Page 1: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/1.jpg)
A taste of statisticsA taste of statistics
Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations for many experiments
Poisson distribution generally appropriate for counting experiments related to random processes (e.g., radioactive decay of elementary particles)
Statistical tests: 2 and F-test
Statistical errors and contour plots
Low-count regime: the C-statistic
![Page 2: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/2.jpg)
2
The Gaussian (normal error) The Gaussian (normal error) distributiondistribution
Casual errors are above and below the “true” (most “common”) valuebell-shape distribution if systematic errors are negligible
σσ=standard deviation =standard deviation of the functionof the function
ГГ=half-width at =half-width at half maximum=1.17half maximum=1.17σσ
μμ=mean=mean
FWHMFWHM
![Page 3: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/3.jpg)
3
Gaussian probability functionGaussian probability function
22 2σ/xe− function centered on 0
function centered on μ
normalization factor, so that ∫f(x) dx=1
![Page 4: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/4.jpg)
4
Pro
bab
ilit
yP
rob
ab
ilit
y
σσ
σσ
![Page 5: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/5.jpg)
5
The Poissoin distributionThe Poissoin distribution
Describes experimental results where events are counted and the uncertainty is not related to the measurement but reflects the
intrinsically casual behavior of the process (e.g., radioactive decay of particles, X-ray photons, etc.)
!/)( υμυ υμμ
−=eP
μμ>0: main parameter of the Poisson distribution>0: main parameter of the Poisson distributionνν=observed number of events in a time interval (frequency of events)=observed number of events in a time interval (frequency of events)
∑∑∞
=
−∞
=
===00 υ
υμ
υ
μ μυμυυυυ !/)( eP
μμ=average number of expected events if the experiment is =average number of expected events if the experiment is repeated many timesrepeated many times
average average number number of eventsof events
![Page 6: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/6.jpg)
6
μυμμυ
μυσ
μυ
υ=−=
=⟩−⟨=
−∞
=∑ e
!)(
)(
2
0
22
the Poisson distribution with average counts=the Poisson distribution with average counts=μμ has has standard deviation standard deviation √√μμ
μμ » : » : the Poisson distribution is approximated by the the Poisson distribution is approximated by the Gaussian distributionGaussian distribution
expectation value of the square of the deviations
![Page 7: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/7.jpg)
7
defined by only one parameter defined by only one parameter μμ
![Page 8: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/8.jpg)
8
Statistical tests: Statistical tests: 22
Test to compare the observed distribution of the results with that expected
2
1
2
∑=
−=
n
k k
kk
EEO )(
χ
OOkk=observed values =observed values
EEkk=expected values=expected values
K=number of intervalsK=number of intervals
n≤2
the observed and expected distributions are similar
![Page 9: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/9.jpg)
9
Degrees of freedomDegrees of freedom
Degrees of freedom (d.o.f.)=number of observed data – number of parameters computed from the data and used in the calculation
d.o.f.=n-cd.o.f.=n-cwhere
nn=number of data (e.g., spectral bins)cc=number of parameters which must be computed from the data to
obtain the expected Ek
X-ray spectral fitsX-ray spectral fitsd.o.f.=number of spectral data points – number of free parameters d.o.f.=number of spectral data points – number of free parameters
![Page 10: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/10.jpg)
10
d.o.f.=d.o.f.=channels (PHA bins) – channels (PHA bins) – free parametersfree parameters
.../ fod22
= reduced
![Page 11: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/11.jpg)
11
![Page 12: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/12.jpg)
12
F-testF-test
If two statistics following the 2 distribution have been determined, the ratio of the reduced chi-squares is distributed according to
the F distribution
2
2
2
1
2
121
υ
υυυ/
/),;( =fPf
Example: improvement to a spectral fit due to the Example: improvement to a spectral fit due to the assumption of a different model, with possibly assumption of a different model, with possibly additional terms additional terms see the F-test tables for see the F-test tables for the corresponding probabilities (specific command in XSPEC)the corresponding probabilities (specific command in XSPEC)
k/2
Δ∝with k=number of additional with k=number of additional terms (parameters)terms (parameters)
![Page 13: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/13.jpg)
An application of the F-test
low F value likely low significance
![Page 14: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/14.jpg)
14
Errors within XSPEC: the contour Errors within XSPEC: the contour plotsplots
68%68%90%90%
99%99%
contour plots:contour plots: show the statistical show the statistical uncertainties related uncertainties related to two parametersto two parameters
CONFIDENCE INTERVALSCONFIDENCE INTERVALSConfidence sigma (Delta)chi2
68.3% 1.0 1.00 90.0% 1.6 2.71 95.5% 2.0 4.00 99.0% 2.6 6.63 99.7% 3.0 9.00 see Avni (1976)
![Page 15: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/15.jpg)
Low-counting statistic regime
The fit statistic routinely used is referred to as the 2 statistic
€
S = (Si − Bii
∑ ts / tb −mits)2 /((σ S )i
2 + (σ B )i2)
where Si = src observation counts in the I={1,…,N} data bins withexposure tS, Bi = background counts with exposure tB and mi = model predicted count rate;(σS)2 and (σB)2 = variance on the src and background counts, typically estimated by Si and Bi
BUTBUTthe 2 statistic fails in low-counting regime
(few counts in each data bin)
![Page 16: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/16.jpg)
16
Possible solution: to rebin the data so that each bin contains a large enough number of counts
BUTLoss of information and dependence
upon the rebinning method adopted
Viable solution: to modify S so that it performs better in low-count regime
Estimate the variance for a given data bin by the average counts from surrounding bins
(Churazov et al. 1996)
BUTneed for Monte-Carlo simulations to support the result
![Page 17: A taste of statistics Normal error (Gaussian) distribution most important in statistical analysis of data, describes the distribution of random observations.](https://reader038.fdocuments.us/reader038/viewer/2022103101/56649f165503460f94c2c54f/html5/thumbnails/17.jpg)
17
The Cash statistic
Construct a maximum-likelihood estimator based on the Poisson distribution of the detected counts
(Cash 1979; Wachter et al. 1979) in presence of background counts implemented into XSPEC
Spectrum fitted using the C-stat and then rebinned just for presentation
purposes