The search for new resonances using the ATLAS detector at the LHC: A statistical analysis


Eric Williams

Columbia University

April 20, 2012

Memorial Sloan Kettering Cancer Center

Overview

The Large Hadron Collider

The ATLAS detector

Why new particles?

How to find new particles?

The statistical analysis of search results

E. Williams (Columbia U.) April 20, 2012 2 / 40

The Large Hadron Collider (LHC)

The LHC is the world’s largest and highest-energy particle accelerator.

Located just outside of Geneva, Switzerland at the European Organization for Nuclear Research (CERN)

Collides counter-rotating proton beams at center-of-mass energy = 7 TeV (!)

Consists of a 17 mile long tunnel, 300 feet underground

Beams collide at the centers of four experiments (detectors):

ATLAS, ALICE, CMS and LHC-b


The Large Hadron Collider (LHC)

Many challenges:

1232 superconducting (1.9 K) dipole magnets at 8.33 T

7600 km Nb-Ti superconducting cable carrying 11850 A

Ultra-high vacuum: $\sim 10^{-13}$ atm ($\tfrac{1}{10} \times P_{\mathrm{moon}}$)

Number of bunches per beam (2011): 1380, with $1 \times 10^{11}$ protons per bunch


The ATLAS Detector

The ATLAS (A Toroidal LHC ApparatuS) detector is designed to be a ‘general-purpose’ detector undertaking a broad range of physics analyses.


The ATLAS Detector

It’s BIG!


The ATLAS Detector

ATLAS is composed of components, each optimized for particular functions

Inner Detector: measures the momenta and trajectories of charged particles

Electromagnetic Calorimeters: measure the energies of electrons, photons, and others

Hadronic Calorimeters: measure the energies of the hadronic particles (‘jets’, protons, neutrons)

Muon System: measures the momenta of muons in the event

The goal of particle detection is to reconstruct the kinematics of each collision (particle energies, directions, charges and masses), to determine whether something “interesting”¹ happened during that event

¹ In this talk, the term “interesting” refers to relevance for new particle detection


Why search for new particles?

Why new particles?

The Standard Model of particle physics is a wildly successful theory, in that it provides a description of the fundamental particles and their interactions:

however...



Why new particles?

... as it stands, it is an incomplete theory of nature (even with a Higgs):

Does not incorporate dark matter or dark energy

Says nothing about gravity

Neutrinos have mass??

Matter-antimatter asymmetry of the universe

Has inherent ‘unnaturalness’ in the hierarchy problem (why is gravity so weak?)

No Higgs seen yet... or has it?? (more on this later)

Bullet Cluster as evidence of dark matter

Proposed solutions to these issues all have one thing in common: new particles!

particle/theory | motivation | what is it? | seen at LHC?
Supersymmetry | Hierarchy problem & dark matter | All SM particles have a SUSY ‘twin’ | Not yet
Extra dimensions² | Gravity & hierarchy problem | 4D space-time rests on a ‘brane’ of larger-dimensional space | Not yet
Higgs | Explains particle mass | Generator of electroweak symmetry breaking | maybe (not yet)

² My particular interest (thesis topic)


http://en.wikipedia.org/wiki/Bullet_Cluster

How to search for new particles?


The problem: y-axis – the production rate (or cross section) of a given process



The true challenge in the search for new physics is finding the signal…



underneath all this background!




Discovery rate ~11 orders of magnitude below the noise*!




* Today’s “noise” is the physics of 1979’s (electroweak theory) and 1984’s (W and Z discovery) Nobel Prizes



How to hack through 11 orders of magnitude of particle jungle?


Three steps:

1 ‘Trigger’: An event filter designed to keep only ‘interesting’ events

2 Kinematic ‘cuts’: Use distinguishing variables specific to the decay of our particle of interest to discriminate between signal and background

3 Statistical analysis: Quantify the statistical significance of discovery or exclusion


How to find new particles? 1) Trigger

There are 20 million bunch crossings per second in the ATLAS detector, each producing about 10 collisions. Event filtering (triggering) is not only necessary for identifying the ‘interesting’ events out of the huge backgrounds, but also crucial for controlling the otherwise unmanageably large amount of data output.

Without triggers: $2 \times 10^{7}\ \mathrm{crossings/s} \times \sim 10\ \mathrm{collisions/crossing} \times \sim 4\ \mathrm{MB/event} \rightarrow \sim \mathrm{TB/s}$ of data!

With triggers: Recorded event rate reduced to few hundred events per second

Example of what to trigger on:

[Figure: two event displays, labelled not “interesting” and “interesting”]


How to find new particles? 2) Kinematic “cuts”

A “cut” on an event is simply a choice of whether or not to keep the event, depending on its observable properties. Examples:

Data quality: Was the detector working when event was recorded?

Object quality: Were the objects (particles) in the event recorded properly?

Kinematic: Do the objects in the event look more like signal than they do background?

Example: Searching for Higgs: H → WW → eνeν: expect to see 2 electrons, 2 neutrinos

Can cut (remove) background events:

Z → µµ/ττ (no electrons or neutrinos)

W → eν/µν/τν (single electron or neutrino)

tt̄ → W+bW−b̄ → eνeν + bb̄ (veto on bb̄)

But it is often the case that the background ‘looks’ just like the signal: e.g. pp → WW → eνeν or Z → ee + detector mis-measurement (looks like νν)
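As a rough illustration of how cuts of this kind act on a list of reconstructed events, here is a minimal Python sketch. The event fields (`n_electrons`, `missing_et`, `n_bjets`), the 25 GeV threshold and the toy events are all hypothetical, chosen only to mirror the examples above; this is not the ATLAS selection.

```python
# Toy event selection for an H -> WW -> e nu e nu style search.
# All field names, thresholds and events below are illustrative assumptions.

def passes_cuts(event, min_missing_et=25.0):
    """Return True if the event survives every cut."""
    if event["n_electrons"] != 2:             # signal has exactly 2 electrons
        return False
    if event["missing_et"] < min_missing_et:  # neutrinos appear as missing E_T (GeV)
        return False
    if event["n_bjets"] > 0:                  # veto b-jets to suppress t-tbar
        return False
    return True

events = [
    {"name": "signal-like",  "n_electrons": 2, "missing_et": 48.0, "n_bjets": 0},
    {"name": "Z->mumu",      "n_electrons": 0, "missing_et": 3.0,  "n_bjets": 0},
    {"name": "W->e nu",      "n_electrons": 1, "missing_et": 35.0, "n_bjets": 0},
    {"name": "ttbar",        "n_electrons": 2, "missing_et": 60.0, "n_bjets": 2},
    {"name": "WW continuum", "n_electrons": 2, "missing_et": 40.0, "n_bjets": 0},
]

selected = [e["name"] for e in events if passes_cuts(e)]
print(selected)
```

In this toy, the WW continuum event passes every cut alongside the signal-like one, which is the situation described above where the background ‘looks’ just like the signal.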



Can you tell the difference?

“Ordinary” event with 2 “jets”

Z boson decaying to electron + antielectron (its discovery was rewarded by a Nobel Prize in 1984)



We know what the signal (and background) decay products should look like through Monte-Carlo simulations

From these simulations, we can identify certain variables that will give best signal/background separation

Many additional cuts are made on object energies, directions, combinations, composition, etc.



After optimizing cuts made on final state particles for best signal-to-background rejection, we choose a ‘discriminating variable’:

A quantity that gives best separation between signal and background after all the event level cuts are applied

Common discriminating variables:

Number of estimated signal/background/data events

Physical observable (below), e.g. invariant mass of the decay particles

Output from neural net or decision tree

Use as input to statistical analysis!
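To make the invariant-mass example concrete, the following Python sketch computes the invariant mass of a two-particle system from its four-momenta, using m² = (E₁+E₂)² − |p₁+p₂|² in natural units. The function name and the example four-vectors are assumptions made for illustration.

```python
import math

def invariant_mass(p1, p2):
    """Invariant mass of two particles given as (E, px, py, pz) in GeV."""
    e = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding

# Two back-to-back, effectively massless electrons of 45.6 GeV each
electron = (45.6, 45.6, 0.0, 0.0)
positron = (45.6, -45.6, 0.0, 0.0)
m_ee = invariant_mass(electron, positron)
print(round(m_ee, 1))  # 91.2
```

A histogram of such a mass over all selected events is one possible binned input to the statistical analysis.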


Statistical Analysis of Search Results

Signal Significance

The ultimate goal of an experimental search for a new particle is to state whether or not a statistically significant observation of the signal has been made. In other words, to answer the canonical question:

Given the data, is it possible to distinguish between two hypotheses?

Three main steps toward answering this question:

1 Define a test-statistic which optimizes the separation of the signal+background hypothesis (H1) and the background-only hypothesis (H0)

2 Run an appropriate number of pseudo-experiments (Frequentist) for both hypotheses, incorporating all signal and background nuisance parameters (systematics) in a coherent way (Bayesian).

3 Define confidence levels designating exclusions or discoveries

[Figure: 2012 Higgs → γγ 4.7 fb$^{-1}$ result]


https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/CONFNOTES/ATLAS-CONF-2011-161/

1) Define test-statistic: Likelihood-Ratio

The Neyman-Pearson lemma states that the most powerful test for statistically separating two point hypotheses is the likelihood-ratio test, that is:

$$\Lambda = \frac{L(s+b \mid x)}{L(b \mid x)}$$

where s = signal, b = background, x = data, L = likelihood.

The rate of signal or background events follows a Poisson distribution, so an appropriate choice for the functional form of the likelihood is:

$$L(s+b) = \frac{(s+b)^{x} e^{-(s+b)}}{x!}, \qquad L(b) = \frac{b^{x} e^{-b}}{x!}$$


http://www.jstor.org/stable/91247
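For a single counting experiment the ratio of the two Poisson likelihoods can be evaluated directly. The sketch below does this in Python; the function names and the example yields (s = 5, b = 10) are illustrative assumptions, not values from the analysis.

```python
import math

def poisson_likelihood(mu, x):
    """Poisson probability of observing x events when mu are expected."""
    return math.exp(x * math.log(mu) - mu - math.lgamma(x + 1))

def likelihood_ratio(s, b, x):
    """Lambda = L(s+b | x) / L(b | x) for a single counting experiment."""
    return poisson_likelihood(s + b, x) / poisson_likelihood(b, x)

# Illustrative numbers (assumed): b = 10 expected background, s = 5 signal
lam_signal_like = likelihood_ratio(s=5.0, b=10.0, x=16)
lam_background_like = likelihood_ratio(s=5.0, b=10.0, x=9)
print(lam_signal_like > 1.0, lam_background_like < 1.0)
```

The x! terms cancel in the ratio, leaving Λ = (1 + s/b)^x e^{−s}: an observed count near s + b pushes Λ above 1, and a count near b pushes it below 1.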

1) Define test-statistic: Likelihood-Ratio

With this choice, combining likelihoods from multiple channels (e.g. X → Y and X → Z) as well as from multiple bins within a discriminating variable (e.g. M(X)) is natural:

$$\Lambda(x) = \prod_{i}^{\mathrm{channels}} \prod_{j}^{\mathrm{bins}} \left. \frac{(s_{ij}+b_{ij})^{x_{ij}} e^{-(s_{ij}+b_{ij})}}{x_{ij}!} \middle/ \frac{b_{ij}^{x_{ij}} e^{-b_{ij}}}{x_{ij}!} \right.$$

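In the high-statistics limit the distribution of $-2\ln\Lambda$ is expected to converge to $(\chi^2_{s+b} - \chi^2_{b})$, thus it is more common to use:

$$\mathrm{NLLR}(x) = -2\ln\Lambda(x) = 2 \sum_{i}^{\mathrm{channels}} \sum_{j}^{\mathrm{bins}} \left[ s_{ij} - x_{ij} \ln\left(1 + \frac{s_{ij}}{b_{ij}}\right) \right]$$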

This test statistic decreases monotonically for increasingly signal-like (decreasingly background-like) experiments. Can be used to order data outcomes relative to each other in hypothesis significance
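A minimal Python sketch of the binned NLLR follows. It evaluates −2 ln Λ by expanding the Poisson terms bin by bin, which gives 2 Σ [s − x ln(1 + s/b)]. The nested-list layout (channels, then bins) and all yields are made-up assumptions for illustration.

```python
import math

def nllr(s, b, x):
    """NLLR = -2 ln(Lambda) = 2 * sum_ij [ s_ij - x_ij * ln(1 + s_ij / b_ij) ].

    s, b, x are lists (channels) of lists (bins) holding expected signal,
    expected background and observed counts.
    """
    total = 0.0
    for s_ch, b_ch, x_ch in zip(s, b, x):
        for s_ij, b_ij, x_ij in zip(s_ch, b_ch, x_ch):
            total += s_ij - x_ij * math.log(1.0 + s_ij / b_ij)
    return 2.0 * total

# Two hypothetical channels with three bins each (made-up yields)
s = [[1.0, 4.0, 1.0], [0.5, 2.0, 0.5]]
b = [[20.0, 15.0, 10.0], [8.0, 6.0, 4.0]]
x_bkg_like = [[20, 15, 10], [8, 6, 4]]
x_sig_like = [[21, 19, 11], [8, 8, 5]]
print(nllr(s, b, x_sig_like) < nllr(s, b, x_bkg_like))
```

The comparison prints `True`: the outcome with more events in the signal-rich bins has the lower NLLR, matching the ordering described above.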


2) Pseudo-Experiments: A Semi-Frequentist Approach

Assuming that the data is drawn randomly from a Poisson parent distribution, we can create pdfs of NLLR(x) for both the signal+background hypothesis (H1) and the background-only (H0) hypothesis, by conducting pseudo-experiments

Systematic uncertainties (nuisance parameters) are incorporated by sampling a bifurcated Gaussian distribution with the ±σ uncertainties estimated for each source (hence ‘Semi’-Frequentist)

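The pseudo-experiment background ($B^{m}_{j}$) and signal ($S^{m}_{k}$) yields are then given as:

$$B^{m}_{j} = B^{0,m}_{j} \left(1 + \sum_{i}^{N^{\mathrm{bkgd}}_{\mathrm{sys}}} g^{\mathrm{bkgd}}_{i}\right)$$

$$S^{m}_{k} = S^{0,m}_{k} \left(1 + \sum_{i}^{N^{\mathrm{sig}}_{\mathrm{sys}}} g^{\mathrm{sig}}_{i}\right)$$

where $B^{0,m}_{j}$ ($S^{0,m}_{k}$) is the nominal background (signal) Poisson yield for channel j (k) and bin m, and $g^{\mathrm{bkgd}}_{i}$ ($g^{\mathrm{sig}}_{i}$) is the contribution from systematic uncertainty i.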
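The smearing of a nominal yield by systematic uncertainties might be sketched as follows in Python. This is one possible way to sample a bifurcated Gaussian (each side chosen with probability proportional to its width); the talk does not say which sampling scheme was used. The function names, the clamp at zero and the fractional uncertainties are assumptions.

```python
import random

def sample_bifurcated_gaussian(sigma_down, sigma_up, rng):
    """Draw a fractional shift from a Gaussian with different widths
    below (sigma_down) and above (sigma_up) zero."""
    # Each side is picked with probability proportional to its width,
    # which keeps the density continuous at zero.
    if rng.random() < sigma_up / (sigma_up + sigma_down):
        return abs(rng.gauss(0.0, sigma_up))
    return -abs(rng.gauss(0.0, sigma_down))

def smeared_yield(nominal, systematics, rng):
    """nominal * (1 + sum_i g_i), with one g_i drawn per systematic source."""
    shift = sum(sample_bifurcated_gaussian(lo, hi, rng) for lo, hi in systematics)
    return max(nominal * (1.0 + shift), 0.0)  # assumption: clamp yields at zero

rng = random.Random(2012)
# Hypothetical (-sigma, +sigma) fractional uncertainties for one background bin
bkg_systematics = [(0.05, 0.08), (0.03, 0.03)]
toys = [smeared_yield(100.0, bkg_systematics, rng) for _ in range(5)]
print([round(t, 1) for t in toys])
```

Each call to `smeared_yield` plays the role of one pseudo-experiment value of $B^{m}_{j}$ (or $S^{m}_{k}$) for a single bin.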



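Running O(20k) pseudo-experiments, we evaluate the NLLR distributions under the H0, NLLR(x = $D_{b}$), and H1, NLLR(x = $D_{s+b}$), hypotheses, where:

$$D_{b} = \sum_{m}^{N_{\mathrm{bins}}} \sum_{j}^{N_{b}} B^{m}_{j}, \qquad D_{s+b} = \sum_{m}^{N_{\mathrm{bins}}} \left( \sum_{j}^{N_{b}} B^{m}_{j} + \sum_{k}^{N_{s}} S^{m}_{k} \right)$$

[Figure: number of pseudo-experiments N versus −2 ln(Λ(x)), for the Bkgd Only and Sig + Bkgd hypotheses]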




Location of measured data on NLLR pdf (Prior Predictive Ensemble) used to quantify exclusion/discovery
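The pseudo-experiment procedure can be sketched for a single counting bin as below. This toy omits systematic smearing and uses a simple Poisson sampler (Knuth's method) to keep the example within the Python standard library. The yields s = 5 and b = 10, the seed and the function names are assumptions for illustration.

```python
import math
import random

def poisson_draw(mu, rng):
    """Knuth's method for a Poisson random integer (fine for modest mu)."""
    limit, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def nllr(s, b, x):
    """Single-bin NLLR = 2 * [s - x * ln(1 + s / b)]."""
    return 2.0 * (s - x * math.log(1.0 + s / b))

def run_toys(s, b, n_toys, rng):
    """Return NLLR values for background-only and signal+background toys."""
    h0 = [nllr(s, b, poisson_draw(b, rng)) for _ in range(n_toys)]
    h1 = [nllr(s, b, poisson_draw(s + b, rng)) for _ in range(n_toys)]
    return h0, h1

rng = random.Random(42)
nllr_b, nllr_sb = run_toys(s=5.0, b=10.0, n_toys=20000, rng=rng)
mean_b = sum(nllr_b) / len(nllr_b)
mean_sb = sum(nllr_sb) / len(nllr_sb)
print(mean_sb < mean_b)
```

The signal+background toys sit at lower NLLR than the background-only toys; the observed NLLR is then located relative to these two distributions.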


3) Modified Frequentist Confidence Levels: CLs

Confidence levels are defined as the fraction of outcomes predicted to fall outside of the specified confidence interval

CLs+b: fraction of H1 pseudo-experiments less signal-like than data

$$CL_{s+b} = P_{s+b}(X \geq X_{obs}) = \int_{\mathrm{NLLR}(x = D_{obs})}^{\infty} P(x = D_{s+b})\, dP$$

CLb: fraction of H0 pseudo-experiments less signal-like than data

$$CL_{b} = P_{b}(X \geq X_{obs}) = \int_{\mathrm{NLLR}(x = D_{obs})}^{\infty} P(x = D_{b})\, dP$$

Therefore...

High CLs+b → data signal-like. (otherwise, used for exclusion)

High CLb (or low 1 − CLb) → data not background-like.

For discovery, (1 − CLb) ≡ p-value = the probability, under the H0 hypothesis, that the background fluctuated to produce the observed signal. Typically require (1 − CLb) below the 5σ threshold (one-sided Gaussian tail probability of about $2.9 \times 10^{-7}$) to claim discovery
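With pseudo-experiments in hand, the integrals above reduce to counting toys. The Python sketch below estimates CLs+b and CLb as tail fractions; the short hand-made NLLR lists and the observed value are hypothetical and serve only to make the counting easy to follow.

```python
def tail_fraction(toy_values, observed):
    """Fraction of toys with NLLR >= observed, i.e. less signal-like than data."""
    return sum(1 for v in toy_values if v >= observed) / len(toy_values)

# Small hand-made NLLR samples (hypothetical) for the two hypotheses
nllr_b = [3.1, 2.4, 1.9, 1.2, 0.8, 0.1, -0.5, 2.8, 1.5, 0.9]          # H0 toys
nllr_sb = [-2.9, -1.7, -0.4, 0.3, -3.5, -2.2, 1.1, -1.0, -0.2, -4.1]  # H1 toys
nllr_obs = 0.0

cl_sb = tail_fraction(nllr_sb, nllr_obs)
cl_b = tail_fraction(nllr_b, nllr_obs)
print(cl_sb, cl_b, round(1.0 - cl_b, 3))
```

Here 2 of the 10 H1 toys and 9 of the 10 H0 toys lie at or above the observed value, so CLs+b = 0.2, CLb = 0.9 and 1 − CLb = 0.1. A real analysis would use many more toys to resolve small tail probabilities.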



[Figure: N versus −2 ln(Λ(x)) for the Bkgd Only and Sig + Bkgd hypotheses, with NLLR(x_data) marked and the 1 − CLb and CLs+b regions indicated]



The strictly frequentist CLs+b confidence level, while a powerful statistical tool, is unstable if the background model dramatically disagrees with the data:

Background overestimated → low CLs+b → possible exclusion!

Background underestimated → high CLs+b → possible discovery!

The solution: The modified frequentist confidence level, CLs

$$CL_{s} \equiv \frac{CL_{s+b}}{CL_{b}}$$

Normalizing CLs+b by CLb reduces the dependence on background modelling and leads to more conservative limits on the H1 hypothesis, as well as a lower false exclusion rate (type II error) than the nominal (1 − CL)

A signal model is then excluded at or above the 95% confidence level if CLs ≤ 0.05
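A small Python sketch of the CLs exclusion decision follows. The helper names and the numerical inputs are hypothetical; they are chosen to contrast a downward background fluctuation with a well-modelled background.

```python
def cls(cl_sb, cl_b):
    """Modified frequentist confidence level CLs = CLs+b / CLb."""
    if cl_b <= 0.0:
        raise ValueError("CLb must be positive")
    return cl_sb / cl_b

def is_excluded(cl_sb, cl_b, threshold=0.05):
    """Signal hypothesis excluded at 95% CL if CLs <= 0.05."""
    return cls(cl_sb, cl_b) <= threshold

# Hypothetical case: data fluctuate below the background expectation,
# so both CLs+b and CLb are small.
print(is_excluded(cl_sb=0.04, cl_b=0.20))  # CLs is about 0.2
# Hypothetical case: background describes the data well.
print(is_excluded(cl_sb=0.02, cl_b=0.50))  # CLs is about 0.04
```

In the first case CLs+b alone (0.04) would have excluded the signal, but CLs does not, which is the more conservative behaviour described above.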


CLs in action

Latest results from the ATLAS Higgs search

Summary plots from a complicated combination of 7 different Higgs decay channels



CLs provides ranges of 95% CL exclusion:

113–116 GeV, 131–238 GeV, 251–466 GeV


CLs in action

If we look in the non-excluded regions:

The significance of an excess is quantified by (1 − CLb) = p-value ≡ the probability (p0) that a background-only experiment is more signal-like than observed

The observed local significance for a 126 GeV Higgs boson is 3.5σ (expected SM Higgs significance is 2.5σ)

The observed global probability for such an excess to be found in the full search range, in the absence of a signal, is approximately 1.4%, corresponding to 2.2σ.
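The conversion between a p-value and a significance in σ can be sketched with the Python standard library. The sketch assumes the one-sided Gaussian convention, under which a probability of 1.4% maps to about 2.2σ as quoted above. The function names are illustrative.

```python
from statistics import NormalDist

def p_to_sigma(p_value):
    """One-sided Gaussian significance Z for a given p-value."""
    return -NormalDist().inv_cdf(p_value)

def sigma_to_p(z):
    """One-sided Gaussian tail probability beyond z standard deviations."""
    return NormalDist().cdf(-z)

print(round(p_to_sigma(0.014), 1))  # global probability of about 1.4%
print(sigma_to_p(3.5))              # tail probability for a 3.5 sigma local excess
```

Under the same convention, `sigma_to_p(5.0)` gives the tail probability associated with a 5σ discovery threshold.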


Other results and the future!

These statistical methods have been adopted collaboration wide and many limits have been set thus far:


Other results and the future!

In 2012, ATLAS will collect 4× the amount of data, and the LHC has ramped up to a center-of-mass energy of 8 TeV!

We will also be able to probe even further for new physics.

This means, assuming the Higgs exists, a 5σ Higgs discovery will happen within the year!

Stay tuned!


Backups

Cross section exclusion

It is common to express confidence interval exclusions in terms of signal cross sections

A confidence limit for exclusion is defined as the value of a parameter (e.g. cross section) which is excluded at a specified confidence level.


Signal mass exclusion

Cross section exclusion plots can also be used to express an upper limit on signal mass exclusion

Signal model excluded where signal limits cross observed limits (Randall-Sundrum Graviton search shown as example)

