The Higgs Boson What it is and how to find it Roger Barlow Manchester University.
Statistics for HEP Roger Barlow Manchester University
description
Transcript of Statistics for HEP Roger Barlow Manchester University
Slide 1
Statistics for HEPRoger Barlow
Manchester University
Lecture 2: Distributions
Slide 2
0
0.2
0.4
0 1 2 3 4 5
r
The Binomialn trials r successesIndividual success probability
p
rnr pp
rnr
npnrP
)1(
)!(!
!),;(
Variance
V=<(r- )2>=<r2>-<r>2
=np(1-p)
Mean
=<r>=rP( r )
= np
1-p p q
Met with in Efficiency/Acceptance calculations
Slide 3
Binomial Examples
0
0.5
1
r
0
0.1
0.2
0.3
r
0
0.1
0.2
0.3
0.4
r
0
0.1
0.2
0.3
0.4
r
n=10 p=0.2 p=0.5 p=0.8
0
0.1
0.2
0.3
r
p=0.1 n=20
0
0.1
0.2
r
n=50 n=5
Slide 4
Poisson
‘Events in a continuum’e.g. Geiger Counter clicksMean rate in time intervalGives number of events in
data0
0.1
0.2
0.3
r
!);(
rerP
r Mean
=<r>=rP( r )
= Variance
V=<(r- )2>=<r2>-<r>2
=
=2.5
Slide 5
Poisson Examples
0
0.1
0.2
0.3
r
0
0.2
0.4
0.6
0.8
r
0
0.1
0.2
0.3
0.4
r
0
0.1
0.2
r
0
0. 2
r
0
0. 1
r
=25=10=5.0
=0.5 =2.0=1.0
Slide 6
Binomial and Poisson
From an exam paperA student is standing by the road, hoping to hitch a lift. Cars pass
according to a Poisson distribution with a mean frequency of 1 per minute. The probability of an individual car giving a lift is 1%. Calculate the probability that the student is still waiting for a lift
(a) After 60 cars have passed
(b) After 1 hour
a) 0.9960=0.5472 b) e-0.6=0.5488
Slide 7
Poisson as approximate binomial
0
0.1
0.2
0.3
r
Poisson mean 2
Binomial: p=0.1, 20 tries Binomial: p=0.01, 200 tries
Use: MC simulation (Binomial) of Real data (Poisson)
Slide 8
Two Poissons
2 Poisson sources, means 1 and 2 Combine samplese.g. leptonic and hadronic decays of W
Forward and backward muon pairsTracks that trigger and tracks that don’t
What you get is a Convolution
P( r )= P(r’; 1 ) P(r-r’; 2 )
Turns out this is also a Poisson with mean 1+2 Avoids lots of worry
Slide 9
The GaussianProbability
Density
ex
xP22 2/)(
2
1),;(
Mean
=<x>=xP( x ) dx
=Variance
V=<(x- )2>=<x2>-<x>2
=
Slide 10
Different GaussiansThere’s only one!
Normalisation (if required)
Location change
Width scaling factor
Falls to 1/e of peak at x=
Slide 11
Probability Contents
68.27% within 195.45% within 299.73% within 3
90% within 1.645 95% within 1.960 99% within 2.576 99.9% within
3.290These numbers apply to Gaussians and only Gaussians
Other distributions have equivalent values which you could use of you wanted
Slide 12
Central Limit TheoremOr: why is the Gaussian Normal?
If a Variable x is produced by the convolution of variables x1,x2…xN
I) <x>=1+2+…N
II) V(x)=V1+V2+…VN
III) P(x) becomes Gaussian for large N
There were hints in the Binomial and Poisson examples
Slide 13
CLT demonstration
Convolute Uniform distribution with itself
0
0.02
0.04
0.06
0.08
0.1
0.12
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Series1
00.010.020.030.040.050.060.070.08
1 5 9
13
17
21
25
29
0
0.02
0.04
0.06
0.08
0.1
0.12
1 2 3 4 5 6 7 8 9 100
0.02
0.04
0.06
0.08
0.1
0.12
1 4 7 10 13 16 19
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0.08
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
1 7 13 19 25 31 37 43 49
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
1 7 13 19 25 31 37 43 49 55
Slide 14
CLT Proof (I) Characteristic functions
Given P(x) consider
The Characteristic Function
For convolutions, CFs multiplyIf then Logs of CFs Add
)()(~
)( kkPdxxPee ikxikx
)()()( xhxgxf )(~
)(~)(~
khkgkf
Slide 15
CLT proof (2) Cumulants
CF is a power series in k<1>+<ikx>+<(ikx)2/2!>+<(ikx)3/3!>+…
1+ik<x>-k2<x2>/2!-ik3<x3>/3!+…Ln CF can then be expanded as a series
ikK1 + (ik)2K2/2! + (ik)3K3/3!…
Kr : the “semi-invariant cumulants of Thiele”Total power of xr
If xx+a then only K1 K1+a
If xbx then each Kr brKr
Slide 16
CLT proof (3)
• The FT of a Gaussian is a Gaussian
Taking logs gives power series up to k2
Kr=0 for r>2 defines a Gaussian• Selfconvolute anything n times: Kr’=n Kr
Need to normalise – divide by n Kr’’=n-r Kr’= n1-r Kr
• Higher Cumulants die away fasterIf the distributions are not identical but similar the same
argument applies
2/2/ 2222 kx ee
Slide 17
CLT in real life
Examples• Height• Simple
Measurements• Student final
marks
Counterexamples• Weight• Wealth• Student entry
grades
Slide 18
Multidimensional Gaussian
e yxyxyyxx yxyx
yx
yxyxyxP
/))((2/)(/)()1(2
1
2
22222
12
1
),,,,;,(
ee yyxx yx
yxyxyxyxP
2222 2/)(2/)(
2
1),,,;,(
Slide 19
Chi squared
Sum of squared discrepancies, scaled by expected error
Integrate all but 1-D of multi-D Gaussian
n
i i
iix
1
2
2
2/22/
2 2
)2/(
2);(
e
nnP n
n
Mean n Variance 2n
CLT slow to operate
Slide 20
Generating DistributionsGiven int rand()in stdlib.hfloat Random()
{return ((float)rand())/RAND_MAX;}float uniform(float lo, float hi)
{return lo+Random()*(hi-lo);}float life(float tau)
{return -tau*log(Random());}float ugauss() // really crude. Do not use
{float x=0; for(int i=0;i<12;i++) x+=Random(); return x-6;}
float gauss(float mu, float sigma) {return mu+sigma*ugauss();}
Slide 21
A better Gaussian Generator
float ugauss(){
static bool igot=false;
static float got;
if(igot){igot=false; return got;}
float phi=uniform(0.0F,2*M_PI);
float r=life(1.0f);
igot=true;
got=r*cos(phi);
return r*sin(phi);}
Slide 22
More complicated functions
Find P0=max[P(x)]. Overestimate if in doubtRepeat :
Repeat: Generate random xFind P(x)Generate random P in range 0-P0
till P(x)>P
till you have enough data If necessary make x non-random and compensate
0
5
10
15
20
25
1 3 5 7 9 11 13 15 17 19
Slide 23
Other distributions
Uniform(top hat)=width/12
Breit Wigner (Cauchy)Has no variance – useful for wide tails
LandauHas no variance or mean
Not given by .Use CERNLIB ee
Slide 24
Functions you need to know
•Gaussian/Chi squared
• Poisson• Binomial• Everything else