Results from Small Numbers Roger Barlow Manchester 26 th September 2007.
-
date post
20-Dec-2015 -
Category
Documents
-
view
215 -
download
0
Transcript of Results from Small Numbers Roger Barlow Manchester 26 th September 2007.
Results from Small Numbers
Roger Barlow
Manchester
26th September 2007
Results from Small Numbers
2
Summary: 3 problems
1. Number of events small (including 0)
2. Number of events Background
3. Uncertainties in background and efficiency
Results from Small Numbers
3
Preamble: Particle Physics is about counting
Pretty much everything is Poisson Statistics
NNumbers of events give cross sections, branching
ratios…E.g. Branching ratio for some rare B decayHave T=200M B mesons, Efficiency E=0.02Observe 100 events. Error 100=10
BR=N/TEBR=(25.0 2.5) 10-6
Results from Small Numbers
4
1. What do you do with zero?Observe 0 events. (Many searches do)BR=00 is obviously wrongWe know BR is small. But not that it’s exactly 0.
Combination of small value+bad luck can give 0 events
Need to go back two steps to consider what we mean by measurement errors what we mean by probability
There are two ways of doing this…
Results from Small Numbers
5
Probability (conventional definition) Limit of frequency
P(A)= Limit N N(A)/N
Standard (frequentist) definition.
Two tricky features
1. P(A) depends not just on A but on the ensemble. Several ensembles may be possible
2. If you cannot define an Ensemble there is no such thing as probability
Ensemble of Everything
A
Results from Small Numbers
6
Feature 1:There can be many Ensembles
Probabilities belong to the event and the ensemble Insurance company data shows P(death) for 40 year old male
clients = 1.4% (Classic example due to von Mises) Does this mean a particular 40 year old German has a 98.6%
chance of reaching his 41st Birthday? No. He belongs to many ensembles
German insured males German males Insured nonsmoking vegetarians German insured male racing drivers …
Each of these gives a different number. All equally valid.
Results from Small Numbers
7
Feature 2: Unique events have no ensemble
Some events are unique.
Consider
“It will probably rain tomorrow.”
There is only one tomorrow (Thursday 27th September). There is NO ensemble. P(rain) is either 0/1 =0 or 1/1 = 1
Strict frequentists cannot say 'It will probably rain tomorrow'.
This presents severe social problems.
Results from Small Numbers
8
Circumventing the limitationA frequentist can say:
“The statement ‘It will rain tomorrow’ has a 70% probability of being true.”
by assembling an ensemble of statements and ascertaining that 70% (say) are true.
(E.g. Weather forecasts with a verified track record)
Say “It will rain tomorrow” with 70% confidence
For unique events, confidence level statements replace probability statements.
Results from Small Numbers
9
What is a measurement?MT=1743 GeV
What does it mean?For true value and standard deviation the probability (density) for a
result x is (for the usual Gaussian measurement)
P(x ; , )=(1/ 2) exp-[(x -)2/22]
So is there a 68% probability that MT lies between 171 and 177 GeV?
No. MT is unique. It is either in that range or outside. (Soon we will know.)
For a given , the probability that x lies within is 68%This does not mean that for a given x, the ‘inverse’ probability that
lies within is 68%P(x; , ) cannot be used as a probability for .(It is called the likelihood function for given x.)
Results from Small Numbers
10
What a measurement error saysA Gaussian measurement gives a result within 1 of
the true value in 68% of all cases.
The statement “[x - , x + ]” has a 68% probability of being true, using the ensemble of all such statements.
We say “[x - , x + ]”, or “MT lies between 169 and 179 GeV” with 68% confidence.
Can also say “[x - 2, x + 2]” @ 95% or“[-, x + ]” @ 84% or whatever
Results from Small Numbers
11
Extension beyond simple GaussianChoose construction (functions x1(), x2()) for
which
P(x[x1(), x2()]) CL for all
Given a measurement X, make statement
[LO, HI] @ CL
Where X=x2(LO), X=x1(HI)
Results from Small Numbers
12
Confidence Belt
x
Example: proportional Gaussian
= 0.1
Measures with 10% accuracy
Result (say) 100.0
LO=90.91 HI= 111.1
Constructed horizontally such that the probability of a result lying inside the belt is 68%(or whatever)
Read vertically using the measurement
X
Results from Small Numbers
13
Use for small numbers
Can choose CL
Just use one curve to give upper limit
Discrete observable makes smooth curves into ugly staircases
Observe n. Quote upper limit as HI from solving
0n P(r, HI) = 0
n e-HI HI r/r! = 1-CL
English translation. n is small. can’t be very large. If the true value is HI (or higher) then the chance of a result this small (or smaller) is only (1-CL) (or less)
Results from Small Numbers
14
Poisson table
Upper limitsn 90% 95% 99%0 2.30 3.00 4.611 3.89 4.74 6.642 5.32 6.30 8.413 6.68 7.75 10.054 7.99 9.15 11.605 9.27 10.51 13.11
.....
Results from Small Numbers
15
Alternative: Bayesian (Subjective) Probability
P(A) is a number describing my degree of belief in A1=certain belief. 0=total disbeliefIntermediate beliefs can be calibrated against simple
frequentist probabilities. P(A)=0.5 means: I would be indifferent given the choice of betting on A or betting on a coin toss.
A can be anything. Measurements, true values, rain, MT, MH, horse races, existence of God, age of the current king of France…
Very adaptable. But no guarantee my P(A) is the same as your P(A). Subjective = unscientific?
Results from Small Numbers
16
Bayes’ Theorem
General (uncontroversial) form
P(A|B)=P(B|A) P(A)
P(B)
“Bayesian” form
P(Theory|Data)=P(Data|Theory) P(Theory)
P(Data)
PriorPosterior
Results from Small Numbers
17
Bayes at workPrior and Posterior can be numbers
Successful predictions boost belief in theorySeveral experiments modify belief cumulatively
Prior and Posterior can be distributions P(|x) P(x|) P()
Ignore normalisation problems
= X
Results from Small Numbers
18
Example: PoissonP(r,)=exp(- ) r/r!
With uniform prior this gives posterior for
Shown for various small r results
Read off intervals...r=6
r=2
r=1
r=0 P()
Results from Small Numbers
19
Upper limitsUpper limit from n events
0
HI exp(- ) n/n! d = CL
Repeated integration by parts:0
n exp(- HI) HIr/r! = 1-CL
Same as frequentist limit This is a coincidence! Lower Limit formula is not
the same
Results from Small Numbers
20
Result depends on Prior
Example: 90% CL Limit from 0 events
Prior flat in
Prior flat in
X
X =
=1.65
2.30
Results from Small Numbers
21
Health Warning
Results using Bayesian Statistics will depend on the prior
Choice of prior is arbitrary (almost always). ‘Uniform’ is not the answer. Uniform in what?
Serious statistical analyses will try several priors and check how much the result shifts (robustness)
Many physicists don’t bother
Results from Small Numbers
22
2. Next problem: add a background
=S+bFrequentist Approach:1. Find range for 2. Subtract b to get range for SExamples: See 5 events, background 1.2
95% Upper limit: 10.5 9.3 See 5 events, background 5.1
95% Upper limit: 10.5 5.2 ? See 5 events, background 10.6
95% Upper limit: 10.5 -0.1
Results from Small Numbers
23
S< -0.1? What’s going on?
If N<b we know that there is a downward fluctuation in the background. (Which happens…)
But there is no way of incorporating this information without messing up the ensemble
Really strict frequentist procedure is to go ahead and publish.
We know that 5% of 95%CL statements are wrong – this is one of them
Suppressing this publication will bias the global results
Results from Small Numbers
24
=S+b for Bayesians No problem! Prior for is uniform for Sb Multiply and normalise as before
Posterior Likelihood Prior
Read off Confidence Levels by integrating posterior
=X
Results from Small Numbers
25
Incorporating Constraints: PoissonWork with total source strength (s+b) you know
is greater than the background b
Need to solve
Formula not as obvious as it looks.
1 CL 0
ne s b s b r r !
0
ne b b r r !
Results from Small Numbers
26
Feldman Cousins MethodWorks by attacking what looks like a different problem...
E xampl e:
Y ou hav e a back grou n d of 3 .2
O bserv e 5 ev en t s? Q u ot e on e-si ded u pper l i mi t (9 .27 -3 .2 =6.07@ 90% )
O bserv e 25 ev en t s? Q u ot e t w o-si ded l i mi t s
Physic ist s ar e humanI deal Physic ist1. Choose S t r at egy2 . E x amine dat a3 . Q uot e r esult
Real Physic ist1. E x amine dat a2 . Choose S t r at egy3 . Q uot e R esult
A ls o c a lle d * ‘th e U n ifie d A p p ro a c h ’
* by Feldman and Cousins, mostly
Results from Small Numbers
27
Feldman Cousins: =s+bb is known. N is measured. s is what we're after
This is called 'flip-flopping' and BAD because is wrecks the whole design of the Confidence Belt
Suggested solution:
1) Construct belts at chosen CL as before
2) Find new ranking strategy to determine what's inside and what's outside
1 sided 90%
2 sided 90%
Results from Small Numbers
28
Feldman Cousins: RankingFirst idea (almost right)Sum/integrate over range of (s+b) values with
highest probabilities for this observed N.(advantage that this is the shortest interval)
Glitch: Suppose N small. (low fluctuation)P(N;s+b) will be small for any s and never get
countedInstead: compare to 'best' probability for this N, at
s=N-b or s=0 and rank on that numberSuch a plot does an automatic ‘flip-flop’N~b single sided limit (upper bound) for sN>>b 2 sided limits for s
Results from Small Numbers
29
How it worksHas to be computed for the
appropriate value of background b. (Sounds complicated, but there is lots of software around)
As n increases, flips from 1-sided to 2-sided limits – but in such a way that the probability of being in the belt is preserved
s
n
Means that sensible 1-sided limits are quoted instead of nonsensical 2-sided limits!
Results from Small Numbers
30
Arguments against using Feldman Cousins
Argument 1
It takes control out of hands of physicist. You might want to quote a 2 sided limit for an expected process, an upper limit for something weird
Counter argument:
This is the virtue of the method. This control invalidates the conventional technique. The physicist can use their discretion over the CL. In rare cases it is permissible to say ”We set a 2 sided limit, but we're not claiming a signal”
Results from Small Numbers
31
Feldman Cousins: Argument 2 Argument 2
If zero events are observed by two experiments, the one with the higher background b will quote the lower limit. This is unfair to hardworking physicists
Counterargument
An experiment with higher background has to be ‘lucky’ to get zero events. Luckier experiments will always quote better limits. Averaging over luck, lower values of b get lower limits to report.
Example: you reward a good student with a lottery ticket which has a 10% chance of winning £10. A moderate student gets a ticket with a 1% chance of winning £ 20. They both win. Were you unfair?
Results from Small Numbers
32
3. Including Systematic Errors =aS+b is predicted number of eventsS is (unknown) signal source strength.
Probably a cross section or branching ratio or decay rate
a is an acceptance/luminosity factor known with some (systematic) error
b is the background rate, known with some (systematic) error
Results from Small Numbers
33
3.1 Full Bayesian
Assume priors for S (uniform?) For a (Gaussian?) For b (Poisson or Gaussian?)
Write down the posterior P(S,a,b).
Integrate over all a,b to get marginalised P(s)
Read off desired limits by integration
Results from Small Numbers
34
3.2 Hybrid Bayesian
Assume priors For a (Gaussian?) For b (Poisson or Gaussian?)Integrate over all a,b to get marginalised P(r,S)
Read off desired limits by 0nP(r,S) =1-CL etc
Done approximately for small errors (Cousins and Highland). Shows that limits pretty insensitive to a , b
Numerically for general errors (RB: java applet on SLAC web page). Includes 3 priors (for a) that give slightly different results
Results from Small Numbers
35
3.3-3.9
Extend Feldman Cousins Profile Likelihood: Use P(S)=P(n,S,amax,bmax)
where amax,bmax give maximum for this S,n Empirical Bayes And more…
Results being compared as outcome from Banff workshop
Results from Small Numbers
36
Summary Straight Frequentist approach is objective and
clean but sometimes gives ‘crazy’ results Bayesian approach is valuable but has
problems. Check for robustness under choice of prior
Feldman-Cousins deserves more widespread adoption
Lots of work still going on This will all be needed at the LHC