Practical Statistics for Particle Physicists Lecture 1
Harrison B. Prosper, Florida State University
European School of High-Energy Physics, Parádfürdő, Hungary
5 – 18 June, 2013
Outline
- Lecture 1: Descriptive Statistics; Probability & Likelihood
- Lecture 2: The Frequentist Approach; The Bayesian Approach
- Lecture 3: Analysis Example
Descriptive Statistics
Descriptive Statistics – 1
Definition: A statistic is any function of the data X. Given a sample X = x1, x2, …, xN, it is often of interest to compute statistics such as the sample average

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

and the sample variance

$$S^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2$$

In any analysis, it is good practice to study ensemble averages, denoted by ⟨…⟩, of relevant statistics.
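As a quick illustration (mine, not the lecture's), both statistics can be computed directly with NumPy; note that np.var with ddof=0 matches the 1/N convention used above.

```python
import numpy as np

# Draw a sample X = x1, ..., xN (Gaussian with mean 10, sigma 2,
# chosen arbitrarily for the example).
rng = np.random.default_rng(seed=1)
x = rng.normal(loc=10.0, scale=2.0, size=100)

N = len(x)
xbar = x.sum() / N                   # sample average
S2 = ((x - xbar) ** 2).sum() / N     # sample variance (1/N convention)

# np.mean and np.var(ddof=0) implement exactly these definitions.
assert np.isclose(xbar, np.mean(x))
assert np.isclose(S2, np.var(x, ddof=0))
print(f"sample average  = {xbar:.3f}")
print(f"sample variance = {S2:.3f}")
```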
Descriptive Statistics – 2
Ensemble averages of a statistic x, where μ denotes the true value:

Mean: $\langle x \rangle$
Error: $x - \mu$
Bias: $b = \langle x \rangle - \mu$
Variance: $V = \langle (x - \langle x \rangle)^2 \rangle$
Mean square error: $\text{MSE} = \langle (x - \mu)^2 \rangle$
Descriptive Statistics – 3
The MSE is the most widely used measure of closeness of an ensemble of statistics {x} to the true value μ. It decomposes into a variance term and a bias-squared term:

$$\text{MSE} = V + b^2$$

The root mean square (RMS) is

$$\text{RMS} = \sqrt{\text{MSE}}$$

Exercise 1: Show this.
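A minimal Monte Carlo sketch of this decomposition (my illustration, not the lecture's), using a deliberately biased statistic invented for the example:

```python
import numpy as np

# Monte Carlo check of MSE = V + b^2. The statistic is a shifted sample
# mean, deliberately biased (by +0.3) for the true value mu.
rng = np.random.default_rng(seed=2)
mu, sigma, N, n_ensemble = 5.0, 1.0, 20, 200_000

samples = rng.normal(mu, sigma, size=(n_ensemble, N))
stat = samples.mean(axis=1) + 0.3        # one statistic per ensemble member

mse = np.mean((stat - mu) ** 2)          # MSE = <(x - mu)^2>
b = np.mean(stat) - mu                   # bias b = <x> - mu
V = np.var(stat)                         # V = <(x - <x>)^2>

print(f"MSE     = {mse:.5f}")
print(f"V + b^2 = {V + b**2:.5f}")       # agrees up to floating point
print(f"RMS     = {np.sqrt(mse):.5f}")
```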
Descriptive Statistics – 4
Consider the ensemble average of the sample variance. First expand the sample variance:

$$S^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2 = \frac{1}{N}\sum_{i=1}^{N}x_i^2 - \frac{2}{N}\sum_{i=1}^{N}x_i\,\bar{x} + \frac{1}{N}\sum_{i=1}^{N}\bar{x}^2$$

$$= \frac{1}{N}\sum_{i=1}^{N}x_i^2 - \bar{x}^2 = \overline{x^2} - \bar{x}^2$$
Descriptive Statistics – 5
The ensemble average of the sample variance,

$$\langle S^2 \rangle = \langle \overline{x^2} \rangle - \langle \bar{x}^2 \rangle = \langle x^2 \rangle - \frac{\langle x^2 \rangle}{N} - \left(\frac{N-1}{N}\right)\langle x \rangle^2 = V - \frac{V}{N},$$

has a negative bias of −V / N.

Exercise 2: Show this.
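A numerical check of this bias (a simulation, not the proof the exercise asks for):

```python
import numpy as np

# Estimate <S^2> over many samples of size N and compare with V - V/N.
rng = np.random.default_rng(seed=3)
V, N, n_ensemble = 4.0, 5, 500_000       # true variance V = 4

samples = rng.normal(0.0, np.sqrt(V), size=(n_ensemble, N))
S2 = samples.var(axis=1, ddof=0)         # 1/N sample variance of each sample

print(f"<S^2>   = {S2.mean():.4f}")      # close to 3.2
print(f"V - V/N = {V - V / N:.4f}")      # (N-1)/N * V = 3.2
```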
Descriptive Statistics – Summary

The sample average

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

is an unbiased estimate of the ensemble average, while the sample variance

$$S^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2$$

is a biased estimate of the ensemble variance.
Probability
Probability – 1
Basic Rules
1. P(A) ≥ 0
2. P(A) = 1 if A is true
3. P(A) = 0 if A is false

Sum Rule
4. P(A+B) = P(A) + P(B) if AB is false *

Product Rule
5. P(AB) = P(A|B) P(B) *

* A+B = A or B, AB = A and B, A|B = A given that B is true
Probability – 2

By definition, the conditional probability of A given B is

$$P(A \mid B) = \frac{P(AB)}{P(B)}$$

P(A) is the probability of A without restriction, whereas P(A|B) is the probability of A when we restrict to the conditions under which B is true. Likewise,

$$P(B \mid A) = \frac{P(AB)}{P(A)}$$
Probability – 3

From

$$P(AB) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$

we deduce Bayes' theorem:

$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}$$
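A small numeric illustration of Bayes' theorem; all probabilities below are invented for the example:

```python
# Bayes' theorem with invented numbers. Take B = "the collision produced
# something interesting" and A = "the event passed a selection".
p_B = 1e-6              # P(B): prior probability (hypothetical)
p_A_given_B = 0.5       # P(A|B): selection efficiency (hypothetical)
p_A_given_notB = 1e-4   # P(A|not B): background pass rate (hypothetical)

# P(A) via the sum and product rules: P(A) = P(A|B)P(B) + P(A|not B)P(not B)
p_A = p_A_given_B * p_B + p_A_given_notB * (1.0 - p_B)

# Bayes' theorem: P(B|A) = P(A|B) P(B) / P(A)
p_B_given_A = p_A_given_B * p_B / p_A
print(f"P(B|A) = {p_B_given_A:.3e}")  # ~5e-3: selection boosts the probability
```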
Probability – 4

A and B are mutually exclusive if

P(AB) = 0

A and B are exhaustive if

P(A) + P(B) = 1

Theorem:

$$P(A + B) = P(A) + P(B) - P(AB)$$

Exercise 3: Prove the theorem.
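A Monte Carlo sanity check of the theorem (not a proof), with two overlapping events defined on a uniform random number:

```python
import numpy as np

# Monte Carlo check of P(A + B) = P(A) + P(B) - P(AB), with two
# overlapping events defined on a uniform random number u.
rng = np.random.default_rng(seed=4)
u = rng.uniform(size=1_000_000)

A = u < 0.5          # A: u in [0, 0.5)
B = u > 0.3          # B: u in (0.3, 1); overlaps A on (0.3, 0.5)

lhs = np.mean(A | B)                            # P(A + B)
rhs = np.mean(A) + np.mean(B) - np.mean(A & B)  # P(A) + P(B) - P(AB)
print(f"P(A+B) = {lhs:.4f}, P(A)+P(B)-P(AB) = {rhs:.4f}")
```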
Probability: Binomial & Poisson Distributions
Binomial & Poisson Distributions – 1

A Bernoulli trial has two outcomes: S = success or F = failure.

Example: Each collision between protons at the LHC is a Bernoulli trial in which something interesting happens (S) or does not (F).
Binomial & Poisson Distributions – 2

Let p be the probability of a success, which is assumed to be the same at each trial. Since S and F are exhaustive, the probability of a failure is 1 − p. For a given order O of n trials, the probability Pr(k, O | n) of exactly k successes and n − k failures is

$$\Pr(k, O \mid n) = p^k (1-p)^{n-k}$$
Binomial & Poisson Distributions – 3

If the order O of successes and failures is irrelevant, we can eliminate the order from the problem by summing over all possible orders:

$$\Pr(k \mid n) = \sum_O \Pr(k, O \mid n) = \sum_O p^k (1-p)^{n-k}$$

Since every order has the same probability and there are $\binom{n}{k}$ of them, this yields the binomial distribution,

$$\text{Binomial}(k, n, p) \equiv \binom{n}{k} p^k (1-p)^{n-k},$$

which is sometimes written as k ~ Binomial(n, p).
Binomial & Poisson Distributions – 4

We can prove that the mean number of successes a is

a = p n

Exercise 4: Prove it.

Suppose that the probability p of a success is very small. Then, in the limit p → 0 and n → ∞ such that a is constant, Binomial(k, n, p) → Poisson(k, a).

Exercise 5: Show that Binomial(k, n, p) → Poisson(k, a).

The Poisson distribution is generally regarded as a good model for a counting experiment.
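The limit in Exercise 5 is easy to see numerically; a sketch using scipy.stats:

```python
import numpy as np
from scipy.stats import binom, poisson

# Compare Binomial(k, n, p) with Poisson(k, a) at fixed a = n*p
# as n -> infinity and p -> 0.
a = 3.0
k = np.arange(10)
for n in (10, 100, 10_000):
    p = a / n
    diff = np.max(np.abs(binom.pmf(k, n, p) - poisson.pmf(k, a)))
    print(f"n = {n:6d}: max |Binomial - Poisson| = {diff:.2e}")
```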
Common Distributions and Densities

$$\begin{aligned}
\text{Uniform}(x, a) &= 1/a \\
\text{Gaussian}(x, \mu, \sigma) &= \exp[-(x-\mu)^2 / (2\sigma^2)] \,/\, (\sigma\sqrt{2\pi}) \\
\text{Chisq}(x, n) &= x^{n/2-1} \exp(-x/2) \,/\, [2^{n/2}\,\Gamma(n/2)] \\
\text{Gamma}(x, a, b) &= x^{b-1} a^b \exp(-ax) \,/\, \Gamma(b) \\
\text{Exp}(x, a) &= a \exp(-ax) \\
\text{Binomial}(k, n, p) &= \binom{n}{k} p^k (1-p)^{n-k} \\
\text{Poisson}(k, a) &= a^k \exp(-a) \,/\, k! \\
\text{Multinomial}(k, n, p) &= \frac{n!}{k_1! \cdots k_K!} \prod_{i=1}^{K} p_i^{k_i}, \quad \sum_{i=1}^{K} p_i = 1, \quad \sum_{i=1}^{K} k_i = n
\end{aligned}$$
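For reference, each entry has a counterpart in scipy.stats; beware that scipy parameterizes by loc/scale, so a rate parameter a above maps to scale = 1/a:

```python
from scipy import stats

# scipy.stats counterparts of the table above.
x, k, n, p, a, b, mu, sigma = 1.2, 3, 10, 0.3, 2.0, 4.0, 0.5, 1.5

print(stats.uniform.pdf(x, loc=0, scale=a))    # Uniform(x, a) = 1/a on [0, a]
print(stats.norm.pdf(x, loc=mu, scale=sigma))  # Gaussian(x, mu, sigma)
print(stats.chi2.pdf(x, df=n))                 # Chisq(x, n)
print(stats.gamma.pdf(x, a=b, scale=1.0 / a))  # Gamma(x, a, b): shape b, rate a
print(stats.expon.pdf(x, scale=1.0 / a))       # Exp(x, a) = a exp(-ax)
print(stats.binom.pmf(k, n, p))                # Binomial(k, n, p)
print(stats.poisson.pmf(k, a))                 # Poisson(k, a)
print(stats.multinomial.pmf([2, 3, 5], n=10, p=[0.2, 0.3, 0.5]))  # Multinomial
```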
Probability – What is it Exactly?
There are at least two interpretations of probability:

1. Degree of belief in, or plausibility of, a proposition.
Example: It will snow in Geneva on Friday.

2. Relative frequency of outcomes in an infinite sequence of identically repeated trials.
Example:
trials: proton-proton collisions at the LHC
outcome: the creation of a Higgs boson
Likelihood
Likelihood – 1

The likelihood function is simply the probability, or probability density function (pdf), evaluated at the observed data.

Example 1: Top quark discovery (D0, 1995)

p(D | d) = Poisson(D | d) — the probability to get a count D
p(17 | d) = Poisson(17 | d) — the likelihood of the observation D = 17

where

$$\text{Poisson}(D \mid d) = \frac{e^{-d}\, d^D}{D!}$$
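A sketch of this likelihood in Python: with D fixed at 17, Poisson(17 | d) becomes a function of the unknown expected count d, maximized at d = 17:

```python
import numpy as np
from scipy.stats import poisson

D = 17  # the observed count, now fixed

# The likelihood is Poisson(17 | d) viewed as a function of the unknown
# expected count d.
d = np.linspace(5.0, 35.0, 301)
L = poisson.pmf(D, d)

print(f"likelihood is maximized near d = {d[np.argmax(L)]:.1f}")  # d = 17
```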
Likelihood – 2

Example 2: Multiple counts D_i with a fixed total count N,

$$p(D \mid p) = \text{Multinomial}(D, N, p), \quad D = D_1, \ldots, D_K, \quad p = p_1, \ldots, p_K, \quad \sum_{i=1}^{K} D_i = N$$

This is an example of a multi-binned likelihood.
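A minimal evaluation of such a multinomial likelihood with scipy.stats (counts and bin probabilities invented for illustration):

```python
import numpy as np
from scipy.stats import multinomial

# Observed bin counts D with fixed total N = sum(D), and candidate bin
# probabilities p (all values invented for illustration).
D = np.array([12, 30, 8])
N = D.sum()
p = np.array([0.25, 0.55, 0.20])

print(f"multinomial likelihood = {multinomial.pmf(D, n=N, p=p):.4e}")
```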
Likelihood – 3

Example 3: Red shift and distance modulus measurements of N = 580 Type Ia supernovae,

$$p(D \mid \Omega_M, \Omega_D, Q) = \prod_{i=1}^{N} \text{Gaussian}(x_i, \mu(z_i, \Omega_M, \Omega_D, Q), \sigma_i), \quad D = \{z_i, x_i \pm \sigma_i\}$$

This is an example of an un-binned likelihood.
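Schematically, such an un-binned likelihood is a product of Gaussian densities, one per supernova. The sketch below uses a hypothetical toy model mu_model in place of the real cosmological distance-modulus prediction mu(z, Omega_M, Omega_D, Q):

```python
import numpy as np
from scipy.stats import norm

# Un-binned likelihood: one Gaussian density per measured supernova.
# mu_model is a hypothetical toy stand-in for the real distance-modulus
# prediction; it is NOT the actual cosmological model.
def mu_model(z, theta):
    return theta[0] + theta[1] * np.log10(z)

def log_likelihood(theta, z, x, sigma):
    return np.sum(norm.logpdf(x, loc=mu_model(z, theta), scale=sigma))

# toy data (invented): redshifts z and distance moduli x with errors sigma
z = np.array([0.02, 0.10, 0.50, 1.00])
x = np.array([34.5, 38.2, 42.1, 44.0])
sigma = np.array([0.20, 0.15, 0.25, 0.30])
print(f"log L = {log_likelihood([43.0, 5.0], z, x, sigma):.2f}")
```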
Likelihood – 4

Example 4: Higgs to γγ. The discovery of the neutral Higgs boson in the di-photon final state made use of an un-binned likelihood,

$$p(x \mid s, m, w, b) = \exp[-(s + b)] \prod_{i=1}^{N} \left[ s\, f_s(x_i \mid m, w) + b\, f_b(x_i) \right]$$

where
x = di-photon masses
m = mass of new particle
w = width of resonance
s = expected signal
b = expected background
f_s = signal model
f_b = background model

Exercise 6: Show that a binned multi-Poisson likelihood yields an un-binned likelihood of this form as the bin widths go to zero.
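A sketch of evaluating an extended likelihood of this form, with stand-in Gaussian signal and exponential background shapes (the real analysis uses more elaborate, properly normalized models):

```python
import numpy as np
from scipy.stats import norm, expon

# Extended un-binned likelihood
#   p(x | s, m, w, b) = exp[-(s+b)] * prod_i [ s f_s(x_i|m,w) + b f_b(x_i) ]
# with stand-in shapes: Gaussian signal f_s, exponential background f_b.
# (A real analysis would normalize both pdfs on the fit window.)
def log_likelihood(x, s, m, w, b):
    f_s = norm.pdf(x, loc=m, scale=w)
    f_b = expon.pdf(x - 100.0, scale=50.0)
    return -(s + b) + np.sum(np.log(s * f_s + b * f_b))

# toy di-photon masses in GeV (invented)
x = np.array([118.0, 124.8, 125.3, 126.1, 133.5, 141.2])
print(f"log L = {log_likelihood(x, s=3.0, m=125.0, w=1.5, b=3.0):.2f}")
```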
Likelihood – 5
Given the likelihood function we can answer questions such as:
1. How do I estimate a parameter?
2. How do I quantify its accuracy?
3. How do I test a hypothesis?
4. How do I quantify the significance of a result?

Writing down the likelihood function requires:
1. identifying all that is known, e.g., the observations
2. identifying all that is unknown, e.g., the parameters
3. constructing a probability model for both
Likelihood – 6
Example: Top Quark Discovery (1995), D0 Results

knowns:
D = 17 events
B = 3.8 ± 0.6 background events

unknowns:
b = expected background count
s = expected signal count
d = b + s = expected event count

Note: we are uncertain about the unknowns, so 17 ± 4.1 is a statement about d, not about the observed count 17!
Likelihood – 7
Probability:

$$p(D \mid s, b) = \text{Poisson}(D, s + b)\, \text{Poisson}(Q, bk) = \frac{(s+b)^D e^{-(s+b)}}{D!} \cdot \frac{(bk)^Q e^{-bk}}{\Gamma(Q+1)}$$

Likelihood:

p(17 | s, b)

where B = Q / k and δB = √Q / k, which give

Q = (B / δB)² = (3.8 / 0.6)² = 40.11
k = B / δB² = 3.8 / 0.6² = 10.56
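A sketch of this likelihood in Python. Since Q is not an integer, Q! is implemented via Γ(Q + 1) using scipy.special.gammaln:

```python
import numpy as np
from scipy.special import gammaln

D, B, dB = 17, 3.8, 0.6
Q = (B / dB) ** 2   # = 40.11
k = B / dB ** 2     # = 10.56

def log_p(s, b):
    # log of Poisson(D, s+b) * Poisson(Q, b*k); Gamma(Q+1) replaces Q!
    # because Q is not an integer
    d = s + b
    return (D * np.log(d) - d - gammaln(D + 1)
            + Q * np.log(b * k) - b * k - gammaln(Q + 1))

print(f"Q = {Q:.2f}, k = {k:.2f}")
print(f"log p(17 | s=13.2, b=3.8) = {log_p(13.2, 3.8):.3f}")
```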
Summary
Statistic: A statistic is any function of potential observations.

Probability: Probability is an abstraction that must be interpreted.

Likelihood: The likelihood is the probability (or probability density) of potential observations evaluated at the observed data.
Tutorials
Location: http://www.hep.fsu.edu/~harry/ESHEP13

Download tutorials.tar.gz and unpack it:

tar zxvf tutorials.tar.gz

Needed: a recent version of ROOT linked with RooFit and TMVA.