Università degli Studi di Roma Tor Vergata
FACOLTÀ DI SCIENZE MATEMATICHE, FISICHE E NATURALI
Corso di Laurea Magistrale in Fisica Nucleare e Subnucleare
Search for the Standard Model Higgs boson
in the four leptons decay channel with multivariate analysis
at the ATLAS experiment
Candidato:
Cristina Papaleo
Relatore:
Anna Di Ciaccio
Anno Accademico 2010-2011
Università degli Studi di Roma Tor Vergata
FACOLTÀ DI SCIENZE MATEMATICHE, FISICHE E NATURALI
Corso di Laurea Magistrale in Fisica Nucleare e Subnucleare
Ricerca del bosone di Higgs del Modello Standard
nel canale di decadimento a quattro leptoni
applicando l’analisi multivariata
nell’ambito dell’esperimento ATLAS
Candidato:
Cristina Papaleo
Relatore:
Anna Di Ciaccio
Anno Accademico 2010-2011
Contents
Introduction 1
Introduzione 3
1 The Standard Model and The Higgs Boson 6
1.1 The Standard Model (SM) . . . . . . . . . . . . . . . . . . . . . . 6
1.1.1 Particles and Interactions . . . . . . . . . . . . . . . . . . 6
1.1.2 The Gauge Field Theories . . . . . . . . . . . . . . . . . . 8
1.1.3 The Higgs Mechanism . . . . . . . . . . . . . . . . . . . . 12
1.2 The SM Higgs Boson . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.2.1 Theoretical Constraints on the Higgs Mass . . . . . . . . 19
1.2.2 Experimental Limits on the Higgs Mass . . . . . . . . . . 26
1.3 Higgs at the LHC . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
1.3.1 Production Mechanisms . . . . . . . . . . . . . . . . . . . 33
1.3.2 Decay Modes . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.3.3 Discovery Potential . . . . . . . . . . . . . . . . . . . . . . 45
1.3.4 The Higgs Mass and Total Decay Width . . . . . . . . . . 46
2 LHC and the ATLAS Detector 49
2.1 The Large Hadron Collider . . . . . . . . . . . . . . . . . . . . . 49
2.1.1 Architectural Overview . . . . . . . . . . . . . . . . . . . 51
2.2 ATLAS Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.2.1 Geometry and Definitions . . . . . . . . . . . . . . . . . . 55
2.2.2 Physics Requirements . . . . . . . . . . . . . . . . . . . . 56
2.2.3 ATLAS Detector Overview . . . . . . . . . . . . . . . . . 57
2.2.4 The Inner Detector (ID) . . . . . . . . . . . . . . . . . . . 58
2.2.5 The Calorimeters . . . . . . . . . . . . . . . . . . . . . . . 63
2.2.6 The Magnet System . . . . . . . . . . . . . . . . . . . . . 67
2.2.7 The Muon Spectrometer . . . . . . . . . . . . . . . . . . . 68
2.2.8 Trigger System . . . . . . . . . . . . . . . . . . . . . . . . 76
2.2.9 Electron Reconstruction and Identification . . . . . . . . . 79
2.2.10 Muon Reconstruction and Identification . . . . . . . . . . 84
3 Higgs search in the decay channel H → ZZ(∗) → 4l 89
3.1 Signal and Main Backgrounds . . . . . . . . . . . . . . . . . . . . 90
3.1.1 Data and Monte Carlo Samples . . . . . . . . . . . . . . . 92
3.2 Pileup Reweighting . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.3 Lepton Reconstruction and Identification . . . . . . . . . . . . . 97
3.3.1 GSF Electrons . . . . . . . . . . . . . . . . . . . . . . . . 99
3.4 Lepton Corrections . . . . . . . . . . . . . . . . . . . . . . . . . . 99
3.5 Event Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
3.5.1 Preliminary Cuts . . . . . . . . . . . . . . . . . . . . . . . 101
3.5.2 Event Preselection . . . . . . . . . . . . . . . . . . . . . . 102
3.5.3 Quadruplet Candidates and Higgs Candidate Selection . . 105
3.5.4 Reducible Background Rejection . . . . . . . . . . . . . . 106
3.5.5 Higgs Boson Mass Reconstruction . . . . . . . . . . . . . 110
3.6 Background Estimation . . . . . . . . . . . . . . . . . . . . . . . 111
3.7 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
3.7.1 Combined Results . . . . . . . . . . . . . . . . . . . . . . 115
4 Angular Analysis and TMVA 119
4.1 Angular Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.1.1 Kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.1.2 Angular and pT Distributions . . . . . . . . . . . . . . . . 124
4.2 TMVA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.2.1 What is TMVA . . . . . . . . . . . . . . . . . . . . . . . . 125
4.2.2 Boosted Decision Tree (BDT) . . . . . . . . . . . . . . . . 130
4.2.3 Artificial Neural Network (ANN) . . . . . . . . . . . . . . 135
4.2.4 Optimization of the MVA methods . . . . . . . . . . . . . 139
4.2.5 Monte Carlo Samples . . . . . . . . . . . . . . . . . . . . 139
4.2.6 Input Variables . . . . . . . . . . . . . . . . . . . . . . . . 140
4.2.7 Tuning Parameters for the implemented BDTG . . . . . . 141
4.2.8 Tuning Parameters for the implemented ANNs . . . . . . 144
4.2.9 Comparing MVA Methods Performance . . . . . . . . . . 150
4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Conclusions 163
A Vector Boson Fusion Angular Distributions 166
Bibliography 169
Introduction
The Standard Model of elementary particles is the theory that describes
three of the four fundamental interactions (the strong, weak and electromagnetic) in a coherent framework. It was developed over the last century and has proven extremely successful, providing an excellent description of all the phenomena observed in particle physics up to the energies explored by LEP (the Large Electron-Positron collider) and the Tevatron. Nevertheless,
the origin of particle masses remains an open question. The electroweak symmetry breaking mechanism provides an elegant answer. However, it also predicts a yet-undiscovered particle, the Higgs boson. The Higgs boson is searched for
through its direct production and its virtual effects on electroweak observables,
but so far physicists have only been able to set limits on the mass of this particle
since no signal has been observed. This quest is very important and the future
of particle physics will be driven by either the discovery or the exclusion of the
SM Higgs boson. In case of discovery, the Higgs boson mass can give hints about the scale at which new physics occurs; in case of exclusion, deep theoretical work
will be needed to find an alternative explanation of the experimental successes
of the SM.
In 2009 the Large Hadron Collider (LHC) started to provide proton-proton col-
lision data at the highest energy ever reached. This machine has been designed
to provide the ultimate answer about the Higgs boson existence because it will
be able to explore the whole mass range, from 114.4 GeV up to 1 TeV, where
the Higgs boson is expected to be.
This thesis describes the work dedicated to the search of the Higgs boson in
the ATLAS experiment at LHC. Since the Higgs mass is unknown, many mass
channels need to be studied; I have studied the channel H → ZZ(∗) → 4l, where
l denotes an electron or a muon. The presence of a real Z provides two high-pT leptons in the final state, together with two other leptons coming from the virtual Z. A mass constraint can be applied to both lepton pairs. The application of kinematic cuts on the lepton pT, isolation requirements for the leptons in the final state, and impact parameter requirements provide sufficient background rejection. This cut-based analysis has been implemented and its results have been validated against the ones obtained by the ATLAS Collaboration; I have then integrated it with a multivariate analysis (MVA).
For this purpose, I first carried out an angular analysis to find additional variables, besides the invariant mass of the selected four-lepton events, that can potentially discriminate between signal and background processes. I have selected the angular variable cos Θ of the momentum of the incoming quark in the CM frame, together with the Higgs transverse momentum and invariant mass, as input variables for the multivariate analysis, applied to a Higgs signal sample with mH = 360 GeV. The signal at high mass has
been considered as the benchmark case because in this range the discriminating
power of the chosen variables is greater. Then, using these observables, a search optimization has been performed by applying advanced techniques such as artificial neural networks and boosted decision trees. Therefore this
thesis will describe the application of the traditional cut-based analysis and of
the MVA to the Higgs search in the four leptons channel and will compare the
signal efficiency and background rejection for the two analyses.
In summary this thesis is organized as follows: in order to provide the nec-
essary preliminary notions, in the first chapter an introduction to the Stan-
dard Model and the SM electroweak symmetry breaking is given along with our
present knowledge about the SM Higgs boson and the latest results from the
direct and the indirect searches. The second chapter presents the main features
of the LHC accelerator and the ATLAS detector. Chapter 3 then describes the cut-based Higgs search analysis and the results that I have obtained by applying it to the data collected by ATLAS in 2011. The subsequent chapter explains the multivariate analysis tool and illustrates in detail the tuning of the multivariate methods selected to optimize the sensitivity for finding the Higgs boson in the considered channel. Finally, the optimization results for the different methods are discussed. Concluding remarks and perspectives are presented in a dedicated chapter.
Introduzione
Il Modello Standard è la teoria che descrive tre delle quattro interazioni fondamentali (forte, debole ed elettromagnetica) in un quadro coerente. È stato sviluppato durante l'ultimo secolo e si è dimostrato estremamente valido fornendo un'eccellente descrizione di tutti i fenomeni osservati in fisica delle particelle fino alle energie esplorate da LEP e Tevatron. Tuttavia l'origine della massa delle particelle rimane ancora una questione aperta. Il meccanismo di rottura della simmetria elettrodebole fornisce un'elegante soluzione. Tale meccanismo però predice anche l'esistenza di una particella non ancora scoperta, il bosone di Higgs. La ricerca di tale bosone avviene attraverso la sua produzione diretta e attraverso misure indirette, a partire dagli effetti virtuali che il bosone di Higgs induce sugli osservabili elettrodeboli. Finora i fisici sono stati in grado solo di fissare dei limiti sulla massa di questa particella poiché nessun segnale è stato osservato. Questa ricerca è molto importante e il futuro della fisica particellare verrà determinato dalla scoperta o dall'esclusione del bosone di Higgs predetto dal Modello Standard. In caso di scoperta, la massa del bosone di Higgs aiuta ad individuare la scala di energia alla quale appare nuova fisica; in caso di esclusione invece un profondo lavoro teorico sarà necessario per trovare una spiegazione alternativa ai successi sperimentali del Modello Standard.
Nel 2009 l'acceleratore LHC ha iniziato a fornire eventi di collisioni protone-protone alla più alta energia mai raggiunta. Questa macchina è stata costruita al fine di fornire la risposta definitiva sull'esistenza del bosone di Higgs, poiché sarà in grado di esplorare l'intero intervallo di massa, che va da 114.4 GeV ad 1 TeV, in cui ci si aspetta di trovarlo.
Questa tesi descrive il lavoro dedicato alla ricerca del bosone di Higgs condotta nell'ambito dell'esperimento ATLAS ad LHC. Poiché la massa del bosone di Higgs non è nota, molti sono i canali di ricerca che devono essere indagati; io ho studiato il canale H → ZZ(∗) → 4l, dove con l si indica un elettrone o un muone. La presenza di una Z reale produce due leptoni di alto pT nello stato
finale insieme ad altri due leptoni provenienti dalla Z virtuale. Un vincolo sulla massa può essere posto per entrambe le coppie leptoniche. L'applicazione di tagli sul pT dei leptoni, richieste di isolamento per i leptoni nello stato finale e le condizioni poste sul parametro di impatto forniscono una reiezione del fondo sufficiente. Questa analisi, basata sull'applicazione di alcuni tagli (cut-based), è stata implementata e i suoi risultati sono stati convalidati con quelli ottenuti dalla Collaborazione ATLAS; ho quindi poi integrato questa analisi con una multivariata (MVA).
A tal scopo, dapprima ho effettuato un'analisi angolare per trovare ulteriori variabili, oltre la massa invariante degli eventi a quattro leptoni selezionati, che potenzialmente potessero contribuire alla separazione tra segnale e processi di fondo. Ho selezionato la variabile angolare cos Θ dell'impulso del quark incidente nel sistema di riferimento del centro di massa, insieme al momento trasverso dell'Higgs e alla sua massa invariante, come variabili di input per applicare un'analisi multivariata ad un campione di segnale, corrispondente ad un Higgs di massa mH = 360 GeV. È stato considerato il segnale ad alta massa come caso di riferimento perché in questa regione il potere discriminante delle variabili scelte è maggiore. Usando quindi gli osservabili così trovati, è stata eseguita una ricerca di ottimizzazione applicando tecniche avanzate come le reti neurali artificiali e gli alberi decisionali (boosted decision tree). Pertanto in questo lavoro di tesi si presenterà l'applicazione della tradizionale analisi cut-based e di MVA alla ricerca dell'Higgs nel canale di decadimento in quattro leptoni e si confronterà l'efficienza di segnale e la reiezione del fondo per le due analisi.
In breve questa tesi è organizzata come segue: al fine di fornire le nozioni preliminari necessarie, nel primo capitolo un'introduzione al Modello Standard e alla rottura della simmetria elettrodebole viene data insieme alle attuali conoscenze del bosone di Higgs del Modello Standard e agli ultimi risultati delle ricerche dirette e indirette. Il secondo capitolo presenta le principali caratteristiche dell'acceleratore LHC e del rivelatore ATLAS. Poi, il Capitolo 3 descrive l'analisi cut-based per la ricerca dell'Higgs ed i risultati ottenuti applicandola ai dati raccolti da ATLAS nel 2011. Il successivo capitolo illustra lo strumento di analisi multivariata e come sono stati scelti i parametri caratterizzanti i metodi selezionati al fine di ottimizzare la sensibilità per trovare il bosone di Higgs nel canale considerato. Infine, i risultati di ottimizzazione per diversi metodi sono discussi. Osservazioni conclusive e prospettive future sono delineate in un capitolo a parte.
Chapter 1
The Standard Model and The Higgs Boson
The Standard Model (SM) is a successful theory incorporating the present
understanding of fundamental particles and their interactions. The SM is perturbative at sufficiently high energies and renormalizable thanks to its gauge-invariant formulation. It accommodates essentially all the known experimental facts and precision measurements performed at high-energy particle colliders over the last decades. However it remains “incomplete”: the existence of dark
matter and the gravitational interaction are not described; the mechanism for
electroweak symmetry breaking that gives masses to the particles is not iden-
tified, and the associated particle, the Higgs boson, has not been observed yet.
Therefore it is crucial to prove the Higgs existence and the validity of the theory
or completely exclude it over the entire allowed mass range.
This chapter briefly describes the Standard Model and its key ingredients with
some attention to the mechanism which predicts the existence of the Higgs par-
ticle and describes the origin of the masses of the fundamental particles. Then
it will discuss the Higgs boson production, decay modes and the limits on its
mass.
1.1 The Standard Model (SM)
1.1.1 Particles and Interactions
In our current understanding, the physical world is composed of a few fundamental building blocks, which are collectively called matter, and is shaped
by their interactions, which are collectively called forces. In the Standard
Model particles can be divided in two categories: fermions and bosons [1].
The fermions are the building blocks of ordinary matter. By definition they have half-integer spin, and the fundamental fermions in nature, and therefore in the SM, all have spin 1/2. Fermions obey Fermi-Dirac statistics and the Pauli exclusion principle. The fermions can be grouped into leptons (l) and quarks (q), as shown in Table 1.1. Quarks have an electric charge that is a fraction of the electron's charge and carry a color charge. They do not exist as free particles but combine to form particles called hadrons. Leptons are typically divided into charged and neutral ones. The latter are referred to as neutrinos; they interact with matter only weakly and can transform into one another as they propagate through matter. Moreover, both leptons and quarks can be organized in three symmetric sets of particles, called families, with increasing mass. Each family includes two quarks and two leptons.
There are four kinds of interactions between the fermions: the gravitational, the electromagnetic, the weak and the strong.

  Quarks
    Up-type:    Q = +2/3, T3,L = +1/2, T3,R = 0, YL = +1/3, YR = +4/3
    Down-type:  Q = −1/3, T3,L = −1/2, T3,R = 0, YL = +1/3, YR = −2/3
    Generation I:    u  1.7 − 3.3 MeV            d  4.1 − 5.8 MeV
    Generation II:   c  1.18 − 1.34 GeV          s  80 − 130 MeV
    Generation III:  t  173.1 ± 0.6 ± 1.1 GeV    b  4.13 − 4.37 GeV

  Leptons
    Charged:    Q = −1, T3,L = −1/2, T3,R = 0, YL = −1, YR = −2
    Neutral:    Q = 0,  T3,L = +1/2, T3,R = 0, YL = −1, YR = 0
    Generation I:    e  0.511 MeV    νe < 3 eV
    Generation II:   µ  106 MeV      νµ < 19 keV
    Generation III:  τ  1.78 GeV     ντ < 18.2 MeV

Table 1.1: Fermions of the SM. The properties of these particles are expressed in terms of their mass, their charge (Q), their weak isospin (T3 is its third component) and their hypercharge (Y). These quantum numbers are related by Q = T3 + Y/2. The table also distinguishes the left-handed and right-handed components of the fermionic fields (indicated by the subscripts L and R) [1].

Each interaction is mediated by
one or more massive or massless spin-1 particles, summarized in Table 1.2 along
with the interaction’s range. The integer spin particles obey the Bose-Einstein
statistics and are referred to as bosons. Gravitation, too, should be mediated by a boson, the graviton, although with spin 2 rather than 1, but there is still no evidence of its existence. The last boson which appears in Table 1.2 is the Higgs boson. This particle has never been observed and its search is the topic of this thesis. Here it is only mentioned that, unlike the other bosons, it is not the mediator of a force; SM predictions expect it to be scalar, i.e. spin 0, and neutral. It appears as a consequence
of the Higgs mechanism after the spontaneous electroweak symmetry breaking
(see Section 1.1.3). Through this mechanism, as a consequence of the interac-
tion with the Higgs field, the vector bosons and the fermions acquire mass.
It is noted that in the following the natural system of units ℏ = c = 1 will be used.
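The relation Q = T3 + Y/2 quoted in the caption of Table 1.1 can be checked numerically. The sketch below is purely illustrative (it is not part of the thesis analysis) and uses the left-handed quantum numbers from the table:

```python
# Check of Q = T3 + Y/2 for the left-handed fermion quantum numbers
# listed in Table 1.1 (illustrative sketch, not part of the thesis).
from fractions import Fraction as F

# (name, electric charge Q, weak isospin T3 (left-handed), hypercharge Y (left-handed))
fermions = [
    ("up-type quark",   F(2, 3),  F(1, 2),  F(1, 3)),
    ("down-type quark", F(-1, 3), F(-1, 2), F(1, 3)),
    ("charged lepton",  F(-1),    F(-1, 2), F(-1)),
    ("neutrino",        F(0),     F(1, 2),  F(-1)),
]

for name, Q, T3, Y in fermions:
    assert Q == T3 + Y / 2, name
    print(f"{name:>15}: Q = {Q} = {T3} + ({Y})/2")
```

Exact rational arithmetic (`fractions.Fraction`) avoids any floating-point ambiguity in the 1/3 and 2/3 charges.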
  Force            Relative strength   Range [m]   Boson          Spin   Mass [GeV]
  Strong           1                   10⁻¹⁵       8 gluons (g)   1      0
  Electromagnetic  10⁻²                ∞           photon (γ)     1      0
  Weak             10⁻²                10⁻¹³       W±             1      80.399 ± 0.023
                                                   Z              1      91.1876 ± 0.0021
  Gravitational    10⁻⁴⁰               ∞           graviton (?)   2      0
  —                —                   —           Higgs (H)      0      ?

Table 1.2: The particle interactions with their carrier particles, whose mass and spin are reported. The relative strength and effective range of the forces are also shown [1]. The gravitational force is not described by the SM. The graviton and the Higgs particle have not yet been observed experimentally.
1.1.2 The Gauge Field Theories
The Standard Model has been developed from models proposed in the 1960s by Glashow, Weinberg and Salam. The Standard Model description of particles
and forces in nature is based on the mathematical language of the Quantum
Field Theory (QFT), where particles are excitations of fundamental fields which
are functions of, or extend in, space and time. Particle dynamics are described
by a Lagrangian density L, simply referred to as the Lagrangian hereafter¹.
Fermions are represented mathematically by matter fields, while the interactions between them are represented by gauge fields that operate on the matter fields. For a given Lagrangian description of a system, gauge invariance means that L is invariant under local symmetry transformations². Any such symmetry corresponds to a conservation law and vice versa (Noether's theorem): for instance, if invariance under temporal translations is required, conservation of energy is obtained, whereas spatial translation invariance implies conservation of momentum.
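Noether's theorem can be illustrated with a minimal symbolic sketch (an illustration with sympy, not part of the thesis, using a one-dimensional harmonic oscillator rather than a field theory): time-translation invariance of the Lagrangian implies that the associated Noether charge, the energy, is conserved on the equation of motion.

```python
import sympy as sp

t, m, k = sp.symbols('t m k', positive=True)
x = sp.Function('x')

# Lagrangian of a 1-D particle in a harmonic potential: L = T - V
L = m * sp.diff(x(t), t)**2 / 2 - k * x(t)**2 / 2

# Noether charge of time-translation invariance: the energy E = x' * dL/dx' - L
E = sp.diff(x(t), t) * sp.diff(L, sp.diff(x(t), t)) - L

# Euler-Lagrange equation of motion: m x'' = -k x
xdd = sp.solve(sp.Eq(m * sp.diff(x(t), t, 2), -k * x(t)), sp.diff(x(t), t, 2))[0]

# dE/dt vanishes once the equation of motion is imposed
dEdt_on_shell = sp.diff(E, t).subs(sp.diff(x(t), t, 2), xdd)
assert sp.simplify(dEdt_on_shell) == 0
```

The same logic carries over to field theories, with the Lagrangian density replacing L and conserved currents replacing conserved charges.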
The SM is a renormalizable quantum field theory which provides a unified ap-
proach for the description of the electromagnetic, weak and strong interactions.
The Standard Model is based on the gauge symmetry SU(3)C×SU(2)L×U(1)Y .
¹ In the following we will not use the Lagrangian L to describe the system, but the Lagrangian density L, which is related to L by L = ∫ L d³x. For simplicity, L will be called the Lagrangian as well.
² These symmetries are said to be global if they are the same at any point in the Universe. In the SM local symmetries are imposed as well as global ones. These stricter requirements imply that certain local (position-dependent) transformations should leave all physical quantities conserved locally, apart from the globally preserved ones.
SU(3)C is the group of color symmetry, described within the frame of Quantum
Chromo Dynamics (QCD), SU(2)L the one of the weak isospin symmetry and
U(1)Y the one for the hypercharge symmetry. The symmetry SU(2)L ×U(1)Y ,
that represents the unified weak and electromagnetic interaction, is broken spon-
taneously by the Higgs mechanism (SU(2)L × U(1)Y → U(1)Q) [2, 3].
1.1.2.1 Quantum Electrodynamics (QED)
The Lagrangian of the massless electromagnetic field Aµ interacting with a
spin-1/2 field ψ of bare mass m is
L = −(1/4) FµνFµν + ψ̄(iγµDµ − m)ψ   (1.1)

where ψ̄ = ψ†γ0 and γµ are the 4 × 4 Dirac matrices satisfying the anticommutation relation {γµ, γν} = 2gµν, with gµν being the metric tensor. The
electromagnetic field tensor is defined as
Fµν = ∂µAν − ∂νAµ (1.2)
The Lagrangian in (1.1) is obtained by requiring that the Dirac Lagrangian of the free spin-1/2 particle

L = ψ̄(iγµ∂µ − m)ψ   (1.3)

becomes invariant under local U(1)Q transformations³ of the form:

U(x) = exp (−ieQα(x))   (1.4)

where e is the unit electric charge and Q is the charge operator⁴.
The Lagrangian invariance can be maintained with the addition of a spin-1 field Aµ, called a gauge boson. Under this transformation ψ and Aµ change as:

ψ(x) → ψ′(x) = ψ(x) exp (−ieQα(x))   (1.5)
Aµ(x) → A′µ(x) = Aµ(x) + ∂µα(x)   (1.6)
For (1.3) to be invariant under (1.4), the covariant derivative Dµ

Dµ = ∂µ + ieAµQ   (1.7)

³ U(1) is the group of all complex numbers with modulus one.
⁴ Qψ = qf ψ, where qf = 1 for the electrons.
has to be introduced. Therefore the new Lagrangian can be written as (1.1). It is characterized by a term representing the original electron field (ψ̄(iγµ∂µ − m)ψ), composed of the fermionic kinetic term (iψ̄γµ∂µψ) and the fermionic mass term (mψ̄ψ). The term eψ̄γµAµQψ describes the interaction between the vector field Aµ and the electromagnetic current; the strength of the interaction is proportional to the value of the constant e. The new field Aµ is thus the photon field, and the interaction term appearing in the Lagrangian due to the local gauge invariance describes the electromagnetic interactions mediated by photons. Finally, the first term, −(1/4)FµνFµν, is the kinetic term of the photon.
The respective conserved current is the electric current:
jµ = eψ̄γµQψ   (1.8)
The photon is massless: a mass term of the form m²AµAµ cannot be introduced, as it would break gauge invariance.
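These two statements can be checked symbolically. The sketch below (an illustration with sympy, not from the thesis) verifies that Fµν of eq. (1.2) is unchanged under the gauge transformation (1.6), while a naive mass term of the form m²ΣµAµAµ is not:

```python
import sympy as sp

t, x1, x2, x3 = sp.symbols('t x1 x2 x3')
X = (t, x1, x2, x3)

alpha = sp.Function('alpha')(*X)                           # arbitrary gauge function α(x)
A = [sp.Function('A%d' % mu)(*X) for mu in range(4)]       # photon field Aµ
Ap = [A[mu] + sp.diff(alpha, X[mu]) for mu in range(4)]    # transformed field A'µ, eq. (1.6)

def F(Af, mu, nu):
    """Field-strength tensor Fµν = ∂µAν − ∂νAµ, eq. (1.2)."""
    return sp.diff(Af[nu], X[mu]) - sp.diff(Af[mu], X[nu])

# Fµν is gauge invariant: the mixed partial derivatives of α cancel
assert all(sp.simplify(F(Ap, mu, nu) - F(A, mu, nu)) == 0
           for mu in range(4) for nu in range(4))

# A naive mass term ~ ΣµAµAµ is NOT invariant (metric signs are
# ignored here; they do not change the conclusion)
assert sp.simplify(sum(Ap[mu]**2 - A[mu]**2 for mu in range(4))) != 0
```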
1.1.2.2 Quantum Chromodynamics (QCD)
The strong interactions are described by a local non-abelian gauge theory,
in which SU(3)C is the gauge group and gluons are the gauge bosons. The
corresponding Lagrangian is:
L = −(1/4) Gµνa Gaµν + q̄j(iγµ Dµjk − Mjk) qk   (1.9)
where Mjk is the quark mass matrix. The Latin indices refer to color and
assume values a = 1,2,. . . 8 for the eight gluons and j,k =1,2,3 for the three
quarks. The gluon field tensor is defined as:
Gµνa = ∂µGνa − ∂νGµa − gs fabc Gµb Gνc   (1.10)
here Gµa are the gluon fields, gs is the strong coupling and fabc are the structure
constants of the SU(3) group. The covariant derivative acting on the quark
fields is:
Dµjk = δjk ∂µ + i gs (Ta)jk Gµa   (1.11)
where Ta are the generators of the SU(3) group, defined by the commutation relation [Ta, Tb] = i fabc Tc. In particular Ta = λa/2, where the λa are the eight Gell-Mann matrices: hermitian, traceless 3 × 3 matrices.
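These properties can be checked numerically. The following sketch (illustrative, not part of the thesis) builds the Gell-Mann matrices in the standard convention, verifies that they are hermitian and traceless, and extracts the structure constants fabc from the commutation relation:

```python
import numpy as np

# The eight Gell-Mann matrices λa (standard convention)
l = np.zeros((8, 3, 3), dtype=complex)
l[0][0, 1] = l[0][1, 0] = 1
l[1][0, 1] = -1j; l[1][1, 0] = 1j
l[2][0, 0] = 1; l[2][1, 1] = -1
l[3][0, 2] = l[3][2, 0] = 1
l[4][0, 2] = -1j; l[4][2, 0] = 1j
l[5][1, 2] = l[5][2, 1] = 1
l[6][1, 2] = -1j; l[6][2, 1] = 1j
l[7][0, 0] = l[7][1, 1] = 1 / np.sqrt(3); l[7][2, 2] = -2 / np.sqrt(3)

T = l / 2                                   # generators Ta = λa/2

# hermitian and traceless, as stated in the text
assert all(np.allclose(m, m.conj().T) for m in l)
assert all(abs(np.trace(m)) < 1e-12 for m in l)

# structure constants from fabc = -2i Tr([Ta, Tb] Tc), using Tr(Ta Tb) = δab/2
f = np.zeros((8, 8, 8))
for a in range(8):
    for b in range(8):
        comm = T[a] @ T[b] - T[b] @ T[a]
        for c in range(8):
            f[a, b, c] = np.real(-2j * np.trace(comm @ T[c]))

# e.g. f123 = 1 and f458 = sqrt(3)/2 in this normalization (0-based indices below)
assert np.isclose(f[0, 1, 2], 1.0)
assert np.isclose(f[3, 4, 7], np.sqrt(3) / 2)
```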
The second term of (1.9) contains a quark-gluon interaction vertex, while the
first term contains three and four gluon couplings. These self-interactions of the
gluons, which have no analog in QED where the photon is electrically neutral,
are a consequence of the fact that gluons also carry color charge due to the
non-abelian nature of the group.
Gluons are required to be massless, since the presence of a mass term for the gauge fields would break the gauge invariance of the Lagrangian.
1.1.2.3 Weak Interactions and Electroweak Unification
The fermions are grouped into left-handed and right-handed fields:

ψL = PLψ = (1/2)(1 − γ5)ψ   (1.12)
ψR = PRψ = (1/2)(1 + γ5)ψ   (1.13)

where PL,R are the chirality⁵ operators and γ5 = iγ0γ1γ2γ3 [4].
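That PL and PR are indeed projectors onto the two chirality eigenspaces can be verified numerically. The sketch below (illustrative, using the Dirac representation of the γ matrices) checks the projector algebra:

```python
import numpy as np

I2 = np.eye(2)
Z2 = np.zeros((2, 2))
s = [np.array([[0, 1], [1, 0]]),
     np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]])]                      # Pauli matrices

# Dirac matrices in the Dirac representation
g0 = np.block([[I2, Z2], [Z2, -I2]])
g = [np.block([[Z2, si], [-si, Z2]]) for si in s]      # γ1, γ2, γ3
g5 = 1j * g0 @ g[0] @ g[1] @ g[2]                      # γ5 = iγ0γ1γ2γ3

I4 = np.eye(4)
PL = (I4 - g5) / 2
PR = (I4 + g5) / 2

# PL and PR are orthogonal projectors that sum to the identity
assert np.allclose(PL @ PL, PL) and np.allclose(PR @ PR, PR)
assert np.allclose(PL @ PR, np.zeros((4, 4)))
assert np.allclose(PL + PR, I4)
```

The checks only rely on γ5² = 1, so they hold in any representation of the Dirac algebra.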
Only left-handed particles and right-handed antiparticles participate in the weak
interaction. The left-handed fermions are SU(2)⁶ doublets⁷,

ψL = (νe, e−)L, (νµ, µ−)L, (ντ, τ−)L, (u, d′)L, (c, s′)L, (t, b′)L   (1.14)

while the right-handed fermions are singlets

ψR = (e−)R, (µ−)R, ..., (u)R, (d)R, ...   (1.15)
As a consequence of these couplings the SU(2) symmetry is denoted as SU(2)L.
The electromagnetic and weak interactions are unified in the Glashow, Salam and Weinberg (GSW) theory, the Electroweak (EW) theory. The simplest unification of the parity-violating weak force and the parity-conserving electromagnetic force is the SU(2)L × U(1)Y gauge theory.
Local gauge invariance under SU(2) transformations requires the introduction of three massless spin-1 gauge bosons Wµi, i = 1, 2, 3. The conserved quantity is called weak isospin (Ta, with a = 1, 2, 3). An additional U(1) symmetry is added to include the electromagnetic interaction in the EW theory. It is an independent gauge symmetry of the weak hypercharge (Y), which is specified by the formula Q = T3 + Y/2, where Q is the electric charge and T3 the third component of the weak isospin. This symmetry is denoted as U(1)Y and it requires an additional spin-1 gauge boson Bµ. The U(1)Y gauge boson
⁵ Chirality is a property of the field defined by the operator γ5, which is formed, as shown in the text, by the product of the Dirac matrices, so that it anti-commutes with all the others. For massless particles chirality coincides with helicity: fermions with right-handed (left-handed) helicity are those whose spin points in the same (opposite) direction as the momentum. For antifermions this convention is reversed.
⁶ SU(2) is the group of special unitary 2 × 2 matrices. A unitary matrix satisfies T†a = T⁻¹a, where T†a is the hermitian conjugate matrix. The generators of SU(2) are Ta = τa/2, where the τa are the Pauli matrices defined in weak isospin space.
⁷ d′, s′, b′ are the eigenstates of the weak interactions.
couples to both the right-handed and the left-handed components⁸.
The gauge invariant Lagrangian describing the electroweak interactions is

L = −(1/4) Waµν Wµνa − (1/4) Bµν Bµν + ψ̄ iγµDµ ψ   (1.18)

where the field tensors Wiµν and Bµν are defined as:

Wiµν = ∂µWiν − ∂νWiµ − g εijk Wjµ Wkν   (1.19)
Bµν = ∂µBν − ∂νBµ   (1.20)
Additionally, cubic and quartic self-couplings of the Wiµ fields have been introduced. The covariant derivative is

Dµ = ∂µ + ig Waµ Ta + i(g′/2) Bµ Y   (1.21)
where g is the SU(2) coupling constant and g′ the U(1) coupling constant. The interaction between the fermions and the gauge fields is

Lint = −ψ̄L γµ [ (g/2) Waµ τa + (g′/2) Bµ Y ] ψL − ψ̄R γµ [ (g′/2) Bµ Y ] ψR   (1.22)
It should be noted that, for local gauge invariance to be preserved, no mass terms for the fermions or the gauge bosons (m²BµBµ and m²WµWµ) can be introduced in the Lagrangian.
1.1.3 The Higgs Mechanism
The Standard Model, i.e. the SU(3)C × SU(2)L × U(1)Y theory, is the
combination of the electroweak theory and QCD. The symmetry of SU(2)L × U(1)Y, i.e. the invariance under a local gauge transformation, requires the presence of massless gauge bosons in the EW theory. This conflicts with the experimental measurements of the W± and Z gauge bosons, according to which their masses are large and cannot be neglected (see Table 1.2). A solution has been proposed by F. Englert, R. Brout, P. Higgs and, independently, G. Guralnik, C. R. Hagen and T. Kibble. They conjectured that the massless gauge bosons of the weak interaction acquire their mass through the interaction with a scalar field (the
⁸ Transformations of the left-handed and right-handed components under SU(2)L × U(1)Y:

• Under an SU(2)L transformation:
  ψL(x) → ψ′L(x) = exp(ig αa(x) τa/2) ψL(x),   ψR(x) → ψ′R(x) = ψR(x)   (1.16)

• Under a U(1)Y transformation:
  ψL(x) → ψ′L(x) = exp(ig′ β(x) Y/2) ψL(x),   ψR(x) → ψ′R(x) = exp(ig′ β(x) Y/2) ψR(x)   (1.17)
Higgs Field), resulting in a single massless gauge boson (the photon) and three
massive gauge bosons (W± and Z). This is possible because the Higgs field has
a potential function which allows degenerate vacuum solutions with a non-zero
vacuum expectation value [5].
In the context of the SU(2)L × U(1)Y symmetry, the Higgs mechanism is implemented through an additional SU(2)L doublet of complex scalar fields (four real scalar fields)

φ = (φ+, φ0) = (1/√2) (φ1 + iφ2, φ3 + iφ4),   (1.23)
the self-interaction of which leads to the spontaneous electroweak symmetry
breaking⁹. The quantum numbers of these fields are summarized in Table 1.3.
The Higgs sector of the Lagrangian is
        T     T3     Y/2   Q
  φ+   1/2   +1/2   1/2    1
  φ0   1/2   −1/2   1/2    0

Table 1.3: The quantum numbers of the complex scalar fields of the SU(2)L doublet φ.
LH = (Dµφ)†(Dµφ)− V (φ) (1.24)
where the most general renormalizable form of the scalar potential is
V (φ) = µ2φ†φ+ λ(φ†φ)2 = µ2|φ|2 + λ|φ|4 (1.25)
The potential is chosen to be an even function of the scalar field, i.e. V(φ) = V(−φ), so that the Lagrangian is invariant under the parity transformation φ → −φ. The potential is parametrized by λ and µ. The parameter λ, the strength of the quartic self-coupling of the scalar field (the φ⁴ term), is required to be positive so that the energy is bounded from below; this requirement ensures the existence of stable ground states. Two qualitatively different cases, corresponding to manifest or spontaneously broken symmetry, may be distinguished depending on the sign of the coefficient µ².
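The two regimes can be illustrated with a short numerical sketch (not from the thesis; λ = 0.5 and the grid are arbitrary choices) that locates the minimum of V as a function of |φ|²:

```python
import numpy as np

def V(phi2, mu2, lam=0.5):
    """Higgs potential V = µ²|φ|² + λ|φ|⁴ as a function of |φ|², cf. eq. (1.25)."""
    return mu2 * phi2 + lam * phi2**2

phi2 = np.linspace(0, 4, 100001)   # grid of |φ|² values

# µ² > 0: unique minimum at φ = 0 (unbroken symmetry)
assert phi2[np.argmin(V(phi2, mu2=+1.0))] == 0

# µ² < 0: minimum shifted to |φ|² = −µ²/(2λ), a non-zero vacuum expectation value
mu2, lam = -1.0, 0.5
phi2_min = phi2[np.argmin(V(phi2, mu2, lam))]
assert np.isclose(phi2_min, -mu2 / (2 * lam), atol=1e-3)   # = v²/2 = 1 here
```

For µ² < 0 the grid minimum reproduces the analytic stationary point −µ²/2λ of the text.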
If µ2 > 0, the potential has a unique minimum at φ = 0 that corresponds to
the ground state, i.e. the vacuum. In terms of a quantum field theory, where
⁹ One might break this symmetry by simply introducing by hand a mass term for the gauge bosons, which violates the symmetry; however, such a procedure would destroy the renormalizability of the theory. A more elegant way to break the symmetry, called “spontaneous symmetry breaking”, is used instead. In this scenario the gauge invariant Lagrangian is maintained, while the state of lowest energy, which is interpreted as the vacuum state, is not gauge invariant. There is an infinite number of states, each of which has the same ground-state energy, and nature chooses one as the state of “true” vacuum.
1. The SM and The Higgs Boson 1.1 The Standard Model (SM)
Figure 1.1: Illustration of the Higgs potential for a scalar field φ = φ1 + iφ2, with (a) µ² > 0 and (b) µ² < 0.
φ is an operator, the precise statement is that the operator φ has zero vacuum
expectation value (vev), i.e. 〈φ〉0 = 〈0|φ|0〉 = 0. The vacuum obeys the re-
flection symmetry of the Lagrangian. In this case, aside from the φ4 term the
Lagrangian is just the Lagrangian for a charged scalar particle of mass µ and
massless gauge bosons.
If µ2 < 0, the Lagrangian has a mass term of the wrong sign for the field φ and
the minimum energy is not at φ = 0. The potential adopts a shape known as
the “Mexican hat”, with a maximum at φ = 0, as can be seen in Figure 1.1(b).
The vacuum expectation value is obtained by looking at the stationary points
of L:
\[
\frac{\partial \mathcal{L}}{\partial(\phi^\dagger\phi)} = 0
\;\Rightarrow\;
\phi_0^2 = \phi^\dagger\phi \equiv \frac{1}{2}\left(\phi_1^2+\phi_2^2+\phi_3^2+\phi_4^2\right)
= -\frac{\mu^2}{2\lambda} \equiv \frac{v^2}{2} \neq 0 \tag{1.26}
\]
where v is referred to as the vacuum expectation value of the scalar field. It must
be noted that it is not zero. The values of (Reφ+, Imφ+, Reφ0, Imφ0) can range
over the surface of a 4-dimensional sphere of radius v, such that v2 = −µ2/λ
and φ†φ = |φ⁺|² + |φ⁰|². This implies that the Lagrangian of φ is invariant under
rotations of this 4-dimensional sphere. The minimum of this potential no longer
corresponds to a unique value of φ but there is an infinite number of states
with the same lowest energy: the vacuum is degenerate. The location of the
minima, φ0, satisfies
\[
\phi_0 = e^{i\theta}\sqrt{-\frac{\mu^2}{2\lambda}} \tag{1.27}
\]
where 0 ≤ θ ≤ 2π is the angle around the axis of the potential, V (φ). Choosing
one of the non-zero ground states φ0 for µ2 < 0 spontaneously breaks the
SU(2)L × U(1)Y symmetry down to U(1)Q. The Lagrangian is still invariant, but
the ground state is no longer symmetric under SU(2)L × U(1)Y. This is illustrated
in Figure 1.1: the potential in Figure 1.1(b) is symmetric under rotations but
any minimum chosen is not. Usually, θ = 0 is used to fix one vacuum state:
φ0(θ = 0) ≡ φvacuum. The direction of the minimum in the SU(2)L × U(1)Y
space is not determined since the potential depends only on the combination
φ†φ. Without loss of generality, the Higgs field is now fixed such that the vacuum
expectation value of φ is defined to be a real parameter in the φ⁰ direction, i.e.
φ1 = φ2 = φ4 = 0, φ3² = −µ²/λ:
\[
\phi_0 = \langle 0|\phi|0\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v \end{pmatrix} \tag{1.28}
\]
φ⁺ is chosen to be zero because the vacuum state has to be neutral in order to
break the SU(2)L × U(1)Y symmetry while preserving U(1)Q. Acting with the charge
operator Q on φ0 and using the properties of the Higgs field, this leads to:
\[
Q\phi_0 = \left(T_3 + \frac{Y}{2}\right)\phi_0
= \left(-\frac{1}{2} + \frac{1}{2}\right)\phi_0 = 0 \tag{1.29}
\]
Since Qφ0 = 0, as shown above, a U(1)Q transformation leaves the vacuum invariant:
\[
\phi_0 \to \phi_0' = \exp\left(-ie\alpha(x)Q\right)\phi_0
\simeq \left(1 - ie\alpha Q\right)\phi_0 = \phi_0 \tag{1.30}
\]
Thus this choice for the vacuum state breaks both SU(2)L and U(1)Y but not
U(1)Q. In this way the vacuum stays neutral but it carries a hypercharge and
an isospin so that it couples to weak bosons.
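As a quick numerical illustration (not part of the original derivation), one can verify that for µ² < 0 the potential V = µ²|φ|² + λ|φ|⁴ of (1.25) is minimized at |φ|² = −µ²/(2λ) = v²/2, as stated in (1.26); the values of µ² and λ below are arbitrary toy inputs.

```python
# Toy numerical check of Eq. (1.26): for mu^2 < 0 the potential
# V = mu2*|phi|^2 + lam*|phi|^4 is minimised at |phi|^2 = -mu2/(2*lam) = v^2/2.
# The parameter values are arbitrary illustrative choices.
mu2 = -1.0   # mu^2 < 0: spontaneously broken phase
lam = 0.5    # quartic coupling, required to be positive

def V(phi2):
    """Potential as a function of |phi|^2."""
    return mu2 * phi2 + lam * phi2 ** 2

# coarse scan over |phi|^2 to locate the minimum numerically
phi2_min = min(range(5000), key=lambda k: V(0.001 * k)) * 0.001
v2 = -mu2 / lam                      # v^2 from Eq. (1.26)
print(phi2_min, v2 / 2)              # numerical and analytic minima agree
```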
The physical content of the theory is revealed by a perturbative expansion of the
Lagrangian around the ground state: expanding φ(x) about this particular vacuum,
one can parametrize excitations from the ground state by
\[
\phi = \frac{1}{\sqrt{2}}\, e^{\,i\xi_a(x)\tau^a/2v}
\begin{pmatrix} 0 \\ v + H(x) \end{pmatrix} \tag{1.31}
\]
where the real fields ξ1, ξ2, ξ3 and H have a zero vacuum expectation value. This
gives rise to a scalar field H(x), the massive Higgs field, which describes radial
excitations from the ground state changing the potential energy, and to three
massless scalar fields ξa(x), the Goldstone bosons, corresponding to angular
excitations without potential energy change. These three massless scalar bosons
correspond to the three broken symmetry generators. These phase factors,
and thus the Goldstone bosons, can be eliminated by a local SU(2)L gauge
transformation with α(x) = ξ(x)/v; this gauge choice is referred to as the unitary
gauge, leading to the following parametrization of the scalar field:
\[
\phi = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + H(x) \end{pmatrix} \tag{1.32}
\]
Here the degrees of freedom represented by the Goldstone bosons are absorbed
(“eaten up”) by the vector particles W± and Z, giving them an additional degree
of freedom, a longitudinal polarization; thus the vector bosons acquire mass.
Only massive particles, which travel below the speed of light, can have longitudinal
degrees of freedom; the photon has only transverse polarizations10.
Therefore the ξa(x) disappear from the Lagrangian and reappear as the longitu-
dinal component of the massive gauge bosons. Since Qφ = 0, the ground state
is still symmetric under U(1)Q and the photon will remain massless.
1.1.3.1 Massive Gauge Bosons
The coupling of φ to the gauge bosons takes place through the covariant
derivative Dµ. By expanding around the ground state of φ, i.e. introducing the
ansatz (1.32) into the Lagrangian of the electroweak theory (the Higgs sector is
expressed in (1.24)), it is now straightforward to see how the Higgs mechanism
generates masses for W± and Z bosons. Evaluating the resulting kinetic term
(Dµφ)†(Dµφ) at the vacuum expectation value φ0:
\[
(D_\mu\phi_0)^\dagger(D^\mu\phi_0)
= \left| \left( \partial_\mu + ig W^a_\mu \frac{\tau_a}{2}
+ ig' \frac{1}{2} B_\mu Y \right) \phi_0 \right|^2 \tag{1.33}
\]
10It is instructive to count the degrees of freedom after the spontaneous symmetry breaking has occurred. The starting point is a Lagrangian with a complex scalar SU(2)L doublet φ and four massless vector bosons. Counting degrees of freedom gives four from the scalars and eight from the vector bosons, for a total of twelve. Through the Higgs mechanism the Lagrangian is transformed into one with a real scalar, three massive vector bosons and one massless vector boson. The massless vector boson is of course to be identified with the photon and the single remaining scalar with the Higgs boson. Counting degrees of freedom again gives one from the Higgs, two from the photon and nine from the massive vector bosons, again adding up to twelve.
The relevant terms are (it should be noted that φ has hypercharge Y = 1):
\[
\begin{aligned}
\Delta\mathcal{L}
&= \frac{1}{2}\,(0\;\; v)\left(\frac{1}{2}gW^j_\mu\tau_j + \frac{1}{2}g'B_\mu\right)
\left(\frac{1}{2}gW^{k\mu}\tau_k + \frac{1}{2}g'B^\mu\right)\begin{pmatrix}0\\ v\end{pmatrix} \\
&= \frac{1}{8}\,(0\;\; v)\begin{pmatrix}
gW^3_\mu + g'B_\mu & g(W^1_\mu - iW^2_\mu) \\
g(W^1_\mu + iW^2_\mu) & -gW^3_\mu + g'B_\mu
\end{pmatrix}^2 \begin{pmatrix}0\\ v\end{pmatrix} \\
&= \frac{1}{8}v^2\left[ g^2 (W^1_\mu - iW^2_\mu)(W^{1\mu} + iW^{2\mu})
+ (-gW^3_\mu + g'B_\mu)^2 \right] \\
&= \frac{1}{8}v^2 g^2\left[ (W^1_\mu)^2 + (W^2_\mu)^2 \right]
+ \frac{1}{8}v^2 \left(g'B_\mu - gW^3_\mu\right)\left(g'B^\mu - gW^{3\mu}\right) \\
&= \left(\frac{vg}{2}\right)^2 W^+_\mu W^{-\mu}
+ \frac{1}{8}v^2\,(W^3_\mu \;\; B_\mu)
\begin{pmatrix} g^2 & -gg' \\ -gg' & g'^2 \end{pmatrix}
\begin{pmatrix} W^{3\mu} \\ B^\mu \end{pmatrix}
\end{aligned} \tag{1.34}
\]
where the charged gauge bosons W± have already acquired mass and a mixing
between W³µ and Bµ is observed. The corresponding mass eigenstates for the
neutral gauge bosons are obtained by diagonalizing the mass matrix:
\[
W^\pm_\mu = \frac{1}{\sqrt{2}}\left(W^1_\mu \mp i W^2_\mu\right)
\quad \text{with} \quad m_W = \frac{1}{2}gv \tag{1.35}
\]
\[
Z_\mu = \frac{gW^3_\mu - g'B_\mu}{\sqrt{g^2 + g'^2}}
\quad \text{with} \quad m_Z = \frac{1}{2}v\sqrt{g^2 + g'^2} \tag{1.36}
\]
\[
A_\mu = \frac{g'W^3_\mu + gB_\mu}{\sqrt{g^2 + g'^2}}
\quad \text{with} \quad m_A = 0 \tag{1.37}
\]
The spontaneous symmetry breaking rotates the four SU(2)L × U(1)Y gauge
bosons to their mass eigenstates by means of the gauge interaction term of the
Higgs fields: {W¹µ, W²µ} → {W⁺µ, W⁻µ} and {W³µ, Bµ} → {Aµ, Zµ}.
W± is associated with the charged current processes, A with the electromagnetic
currents and Z with the neutral currents. Once Aµ is recognized as the photon,
the three couplings previously described are related to each other by
\[
e = \frac{gg'}{\sqrt{g^2 + g'^2}} \tag{1.38}
\]
Usually the ratio between g and g′ is expressed through the weak mixing angle
θW, also known as the Weinberg angle:
\[
\tan\theta_W = \frac{g'}{g} \tag{1.39}
\]
It is important to notice that θW is a free parameter of the model since it is the
ratio of two coupling constants related to independent symmetry groups. Given
θW, with sin θW = g′/√(g² + g′²), all gauge couplings are determined by the electric
charge:
e = g sin θW = g′ cos θW (1.40)
and thus the electroweak unification is achieved. In terms of θW, the photon
and Z boson fields are
\[
\begin{pmatrix} Z_\mu \\ A_\mu \end{pmatrix}
= \begin{pmatrix} \cos\theta_W & -\sin\theta_W \\ \sin\theta_W & \cos\theta_W \end{pmatrix}
\begin{pmatrix} W^3_\mu \\ B_\mu \end{pmatrix} \tag{1.41}
\]
and the relation
\[
\frac{m_W}{m_Z} = \cos\theta_W \tag{1.42}
\]
for the masses of the gauge bosons is predicted.
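Relation (1.42) can be checked numerically against the measured boson masses; the mass values below are illustrative PDG-style inputs, not taken from the text.

```python
# Tree-level check of Eq. (1.42): m_W / m_Z = cos(theta_W).
# The measured masses are illustrative inputs (assumptions, not from the text).
m_W = 80.38   # GeV
m_Z = 91.19   # GeV

cos_theta_W = m_W / m_Z
sin2_theta_W = 1.0 - cos_theta_W ** 2   # the corresponding weak mixing angle

print(f"cos(theta_W)   = {cos_theta_W:.4f}")
print(f"sin^2(theta_W) = {sin2_theta_W:.4f}")
```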
By inserting (1.32) into the expression (1.25) for the Higgs potential V (φ), the
mass term −µ2H2 for the Higgs field H appears, implying the existence of a
new physical particle, the Higgs boson as already said, with mass
\[
m_H = \sqrt{-2\mu^2} = \sqrt{2\lambda}\, v \tag{1.43}
\]
It can be noticed that this model contains essentially two coupling constants, g and
g′, related to the symmetry SU(2)L × U(1)Y, and two parameters of the Higgs
potential, µ and λ. Usually they are traded for the observables α, the
fine structure constant, GF, the Fermi constant, mZ, the Z boson mass, and
mH, the mass of the Higgs boson, for which the relations are summarized here:
\[
\alpha = \frac{g^2 g'^2}{4\pi (g^2 + g'^2)} = \frac{g^2 \sin^2\theta_W}{4\pi} \tag{1.44}
\]
\[
m_Z = \frac{1}{2} v \sqrt{g^2 + g'^2} \tag{1.45}
\]
\[
G_F = \frac{1}{\sqrt{2}\, v^2} \tag{1.46}
\]
\[
m_H = \sqrt{-2\mu^2} = \sqrt{2\lambda}\, v \tag{1.47}
\]
GF is the strength of the weak interaction in the effective and point-like de-
scription of weak interactions formulated by Fermi. The parameter v can be
determined from the measurement of the muon lifetime in the weak charged
current decay µ → e ν̄e νµ. The interaction strength for muon decay is measured
very precisely to be GF = 1.16637(1) × 10⁻⁵ GeV⁻² [1], giving the value of the
vacuum expectation value
\[
v = \left(\sqrt{2}\, G_F\right)^{-1/2} \approx 246\ \text{GeV} \tag{1.48}
\]
This value sets the scale of the electroweak symmetry breaking11, but it is not
predicted by the SM. The relation between GF and the vev v comes from
\[
\frac{G_F}{\sqrt{2}} = \frac{g^2}{8 m_W^2} = \frac{1}{2 v^2} \tag{1.50}
\]
which is a comparison between the Fermi theory and the charged current in the
limit of highly massive gauge bosons. In fact the muon decay at the leading
order can be described by the propagation of a W boson but Fermi showed
that this can be simplified to one vertex with the coupling constant GF. Once
the values of α, GF and mZ are known, the mass of the W boson can be predicted
from (1.35) and (1.36): at lowest order mW = mZ cos θW ≈ 80 GeV, which has
been confirmed experimentally by the discovery of the W and Z at the SppS
(Super Proton-Antiproton Synchrotron) and by precise measurements
of mW and mZ at LEP (Large Electron Positron Collider). The v parameter
can be experimentally determined but there is no way to measure the value of
λ before a discovery of the Higgs.
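The two numbers quoted above can be reproduced directly; a minimal sketch, with GF from the text and illustrative values for mZ and sin²θW (the latter two are assumptions, not from the text):

```python
import math

G_F = 1.16637e-5       # Fermi constant, GeV^-2 (from the muon lifetime)
m_Z = 91.19            # GeV, illustrative input
sin2_theta_W = 0.223   # illustrative on-shell value, an assumption

# Eq. (1.48): vacuum expectation value of the Higgs field
v = (math.sqrt(2.0) * G_F) ** -0.5
print(f"v   = {v:.1f} GeV")    # ~246 GeV

# Lowest-order prediction m_W = m_Z cos(theta_W) from (1.35)-(1.36)
m_W = m_Z * math.sqrt(1.0 - sin2_theta_W)
print(f"m_W = {m_W:.1f} GeV")  # ~80 GeV
```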
1.2 The SM Higgs Boson
1.2.1 Theoretical Constraints on the Higgs Mass
Despite the prediction of a Higgs boson, the Standard Model does not provide
a value for its mass. Although the Higgs boson mass is a free parameter of the
theory, constraints on the possible mass values can be derived using theoretical
arguments regarding the energy regime in which the perturbative expansion of
the Standard Model is valid. These arguments rest on very
reasonable considerations, but they cannot provide stringent limits since they
depend on the absence of new physics up to a cut-off energy scale. As will be seen,
this means that a range of masses can be set that is valid as long as no virtual
effects of new physics enter in the calculation of the Higgs boson mass. These
arguments are: the unitarity in longitudinal scattering amplitudes; the pertur-
bativity of the Higgs self-coupling and the stability of the electroweak vacuum
[5, 6, 7, 8, 9].
1.2.1.1 Unitarity
A major deficit of Fermi’s theory of weak interactions was the violation of
unitarity at the electroweak scale √s ∼ GF^(−1/2), due to the assumption of
point interactions.
11By the relation
\[
m_W = \frac{gv}{2} = \frac{g}{2\sqrt{2\lambda}}\, m_H \tag{1.49}
\]
the Higgs mass is seen to set the electroweak scale.
The introduction of the massive intermediate vector bosons pushed
this problem to higher energies. However, owing to the high energy behaviour
of their longitudinal polarisation:
\[
\varepsilon^\mu_L = \left( \frac{|\vec{p}\,|}{m_V},\, 0,\, 0,\, \frac{E}{m_V} \right)
\xrightarrow{\;E \gg m_V\;} \frac{p^\mu}{m_V} \tag{1.51}
\]
unitarity violation is still expected in processes involving the longitudinal com-
ponents of the vector bosons. Indeed, in the Standard Model vector bosons
are predicted to have self-interactions; the SM calculation of the scattering
amplitude of longitudinal gauge bosons VLVL → VLVL, where V = W±, Z,
leads to the conclusion that, if virtual effects of the Higgs boson or of new physics
are not included, this amplitude grows proportionally to the center of mass
energy of the scattering. This behaviour violates unitarity, which means
that at some energy this process would have a probability of occurring greater
than one, which is unphysical.
Figure 1.2: Divergent WW cross section graphs and their cancellation. The upper three diagrams violate unitarity starting from √s ≈ 1.2 TeV. These unitarity violations are cancelled by the lower two diagrams involving Higgs boson exchange.
A characteristic example of such a process is W⁺L W⁻L scattering at high
energies. In the following it will be shown that, without an additional interaction,
the cross section of longitudinal WW scattering, given by the first three graphs in
Figure 1.2, would diverge and violate unitarity bounds from √s = 1.2 TeV
onwards. The Higgs mechanism cancels the divergences of the cross section of
the longitudinal degrees of freedom of the W bosons by destructive interference
of the last two graphs of Figure 1.2 with the first three. Thus the cross
section does not diverge and no violation of unitarity occurs. This mechanism
only works if the Higgs boson is not too heavy, otherwise it would not contribute
enough to the scattering amplitudes before unitarity is violated.
Considering the W⁺L W⁻L → W⁺L W⁻L process at tree level, and neglecting the
diagrams with Higgs boson exchange in the s and t channels, the remaining
processes are:
1. The exchange in s and t channels of a Z boson.
2. The exchange in s and t channels of a photon.
3. The direct coupling in one vertex of the four W bosons.
The amplitudes of these processes can be computed; each of them can be
parametrized in the following way:
\[
A = \alpha \left( \frac{p}{m_W} \right)^4 + \beta \left( \frac{p}{m_W} \right)^2 + \gamma \tag{1.52}
\]
Adding up all amplitudes, it results that the α term of process 3 cancels all
the others, whereas the β term is cancelled only when the Higgs-exchange
processes in the s and t channels are included. Expanding the amplitude in
partial waves, one obtains
\[
A = 16\pi \sum_L (2L+1)\, a_L(s)\, P_L(\cos\theta) \tag{1.53}
\]
The unitarity condition of the scattering matrix is expressed by
\[
|a_L(s)| < 1 \;\Rightarrow\; |\mathrm{Re}(a_L(s))| \leq \frac{1}{2} \tag{1.54}
\]
Then, ignoring the contributions of the Higgs diagrams (or equivalently if
mH² ≫ s), and considering only the s wave, the amplitude turns out to be:
\[
a_0(s) = -\frac{s}{32\pi v^2} \tag{1.55}
\]
leading to the bound √s ≲ 1.7 TeV. Considering more constraining channels:
\[
a_0(s) = -\frac{g^2 s}{64\pi m_W^2} = -\frac{G_F s}{8\pi\sqrt{2}} \tag{1.56}
\]
and the following bound
\[
s \leq \frac{4\pi\sqrt{2}}{G_F} \approx (1.2\ \text{TeV})^2 \tag{1.57}
\]
on the center of mass energy √s can be derived for the validity of a theory of
weakly coupled massive gauge bosons. The increase of the scattering amplitude
can be damped by Higgs boson exchange: including these diagrams, the β term
is cancelled and the s wave amplitude in the high center of mass energy limit,
s ≫ mH², is:
\[
a_0(s) = -\frac{m_H^2 G_F}{4\pi\sqrt{2}} \tag{1.58}
\]
This puts an upper limit on the Higgs boson mass as follows:
\[
m_H \leq \sqrt{\frac{2\pi\sqrt{2}}{G_F}} \approx 870\ \text{GeV} \tag{1.59}
\]
This bound can be improved by requiring unitarity in all the relevant scattering
amplitudes, like W⁺L W⁻L → ZL ZL. In this case the obtained upper bound
is mH < 710 GeV [8]. Within the canonical formulation of the SM, internal
consistency therefore requires mH < 1 TeV. If there is no Higgs boson or its
mass is greater than 710 GeV, an additional strong force acting on the W
bosons is needed to cancel the divergences. It means that the couplings in the
W and Z boson sector become so large that the whole concept of the Higgs
mechanism as a perturbative expansion around the vacuum expectation value
breaks down. Then there must be a critical energy scale (Λ) above which new
physics appears to ensure unitarity and this scale should be around 1 or 2 TeV,
as extracted in (1.57). In other words, if energies up to 2 TeV can be explored,
it will be possible either to discover the Higgs boson or to exclude it, reaching
the limit where the SM fails. Luckily, this is now possible thanks to
the collisions produced at the Large Hadron Collider that will be described in
the next chapter.
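The unitarity bounds (1.57) and (1.59) follow from simple arithmetic with GF; a minimal sketch:

```python
import math

G_F = 1.16637e-5  # Fermi constant, GeV^-2

# Eq. (1.57): energy where WW scattering without a Higgs saturates unitarity
sqrt_s_max = math.sqrt(4.0 * math.pi * math.sqrt(2.0) / G_F)
print(f"sqrt(s) bound: {sqrt_s_max / 1000.0:.2f} TeV")  # ~1.2 TeV

# Eq. (1.59): corresponding upper limit on the Higgs boson mass
m_H_max = math.sqrt(2.0 * math.pi * math.sqrt(2.0) / G_F)
print(f"m_H bound: {m_H_max:.0f} GeV")                  # ~870 GeV
```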
1.2.1.2 Triviality
Figure 1.3: Diagrams generating the evolution of the Higgs self-interaction λ [9].
Quite restrictive bounds on the Higgs mass depend on the energy scale Λ
up to which the Standard Model is valid, i.e. the scale up to which no new
interactions and no new particles should appear. These bounds are derived from
the evolution of the quartic Higgs self-coupling λ with the energy E determined
by the quantum fluctuations,
\[
\frac{d\lambda}{dt} = \frac{1}{16\pi^2}\left[ 12\lambda^2 + 12\lambda g_t^2 - 12 g_t^4
- \frac{3}{2}\lambda\left(3g^2 + g'^2\right)
+ \frac{3}{16}\left(2g^4 + (g^2 + g'^2)^2\right) \right] \tag{1.60}
\]
with t = log(E²/v²), where the electroweak symmetry breaking scale v has
been used as the reference point, gt is the energy-dependent top-Higgs Yukawa
coupling12, and g and g′ are the electroweak gauge couplings. The three main con-
tributions to the evolution of λ are depicted in Figure 1.3. Due to their different
spin statistics, the Higgs loop gives rise to an indefinite increase of the quartic
coupling, while the top quark loop drives the coupling to smaller values,
eventually even below zero.
If λ is large, for moderate top masses the contribution from Higgs loops
dominates over the top loops. Then the first term in (1.60),
\[
\frac{d\lambda}{dt} \approx \frac{3}{4\pi^2}\lambda^2 \tag{1.61}
\]
dominates, leading to
\[
\lambda(E^2) = \frac{\lambda(v^2)}{1 - \dfrac{3\lambda(v^2)}{4\pi^2} \log\dfrac{E^2}{v^2}} \tag{1.62}
\]
Hence, λ(E²) grows to infinity as the energy E increases and tends to zero as the
energy decreases. Without the λ(φ†φ)² interaction, however, the theory would become
a non-interacting theory at low energy, termed a trivial theory. Were λ instead to
grow to infinity, no well-defined theory would exist, since the Higgs potential in
Figure 1.1 would be reduced to an infinitesimally thin band with a vev equal to
0 and infinitely strong interactions. In particular, it follows that λ(E²) becomes
infinite at the Landau pole, corresponding to the energy
\[
\Lambda^2 = v^2 \exp\left( \frac{4\pi^2}{3\lambda} \right)
= v^2 \exp\left( \frac{8\pi^2 v^2}{3 m_H^2} \right) \tag{1.63}
\]
It is therefore required that the quartic coupling λ is finite up to a large scale
Λ where no new physics appears,
\[
\frac{1}{\lambda(\Lambda)} > 0 \quad \text{i.e.} \quad 0 < \lambda(\Lambda) < \infty \tag{1.64}
\]
which, with mH = √(2λ) v, leads to the upper bound on the Higgs mass:
\[
m_H^2 < \frac{8\pi^2 v^2}{3 \log(\Lambda^2/v^2)} \tag{1.65}
\]
This mass bound is related logarithmically to the energy Λ up to which the
Standard Model is assumed to be valid. The Renormalization Group equation
(1.60) can be used to establish the energy domain in which the SM is valid as a
12gt = −mt/v.
function of the Higgs mass. From Equation (1.63) it can be seen that for large
cut-off scales, the Higgs mass should be small, while for small cut-off scales the
Higgs boson mass can be rather heavy.
The maximal value of mH for the minimal cut-off Λ ∼ 1 TeV is ∼750 GeV. The
caveat in this estimate is that for large λ perturbation theory
breaks down and the constraint is not valid. However, lattice simulations
of gauge theories, properly including non-perturbative effects, provide an upper
bound of the Higgs mass of mH < 700 GeV, in agreement with the above
calculation [7].
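The triviality bound (1.65) can be evaluated numerically; with v = 246 GeV it reproduces the ∼750 GeV figure for Λ ∼ 1 TeV quoted above (the Planck-scale value comes out near 145 GeV here because this simple formula omits the top and gauge loop corrections discussed below):

```python
import math

v = 246.0  # GeV, electroweak vacuum expectation value

def m_H_triviality_bound(cutoff):
    """Upper bound of Eq. (1.65): m_H^2 < 8 pi^2 v^2 / (3 log(Lambda^2/v^2))."""
    log_term = math.log(cutoff ** 2 / v ** 2)
    return math.sqrt(8.0 * math.pi ** 2 * v ** 2 / (3.0 * log_term))

print(f"Lambda = 1 TeV     -> m_H < {m_H_triviality_bound(1e3):.0f} GeV")
print(f"Lambda = 10^19 GeV -> m_H < {m_H_triviality_bound(1e19):.0f} GeV")
```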
If the cut-off energy is at the Planck scale, around 10^19 GeV, then the Higgs
boson mass should be small, mH < 190 GeV. The lower this energy is set, the
looser the upper constraint on the Higgs boson mass. However, in the Higgs
boson mass calculation the contribution from top and gauge boson loops cannot
be neglected. If these corrections are included and it is required that the theory
is perturbative (i.e. λ is finite) below a given energy, then an upper limit on
mH as a function of the top quark mass can be set. For mt = 175 ± 6 GeV,
mH < 180 ± 4 ± 5 GeV if Λ = 10^19 GeV [6]. Figure 1.4 shows the upper limits
that prevent the self-interaction to become infinite. In the picture there are also
lower limits that will be explained in the following subsection.
1.2.1.3 Vacuum Stability
In the previous subsection only the Higgs boson self-interaction has been
included in the running of the quartic coupling, which is a good approximation
in the region where the coupling is large. For completeness the contributions
from gauge bosons and fermions should be also included. Nevertheless, out of
all the fermions only the top quark contributes significantly, as can be seen
from (1.60). For small values of the coupling λ the top quark contribution can
dominate:
\[
\frac{d\lambda}{dt} \approx \frac{1}{16\pi^2}\left[ -12 g_t^4
+ \frac{3}{16}\left(2g^4 + (g^2+g'^2)^2\right) \right] \tag{1.66}
\]
driving λ(E²) to a negative value for large gt. In this case the scalar potential
has no minimum, there is no ground state and the vacuum is not stable; no
stable spontaneous symmetry breaking occurs. Equation (1.66) is easily
solved to find
\[
\lambda(\Lambda) = \lambda(v) + \frac{1}{16\pi^2}\left[ -12 g_t^4
+ \frac{3}{16}\left(2g^4 + (g^2+g'^2)^2\right) \right]
\log\left(\frac{\Lambda^2}{v^2}\right) \tag{1.67}
\]
The requirement to have a scalar potential which is bounded from below at all
scales Λ, and therefore keeping λ(Λ) > 0, results in a lower limit for the Higgs
boson mass which depends on the cut-off scale:
\[
m_H^2 > \frac{v^2}{8\pi^2}\left[ 12 g_t^4
- \frac{3}{16}\left(2g^4 + (g^2+g'^2)^2\right) \right]
\log\left(\frac{\Lambda^2}{v^2}\right) \tag{1.68}
\]
At one-loop order this lower limit is:
\[
\Lambda = 1\ \text{TeV} \;\Rightarrow\; m_H > 70\ \text{GeV} \tag{1.69}
\]
\[
\Lambda = 10^{19}\ \text{GeV} \;\Rightarrow\; m_H > 130\ \text{GeV} \tag{1.70}
\]
Figure 1.4: Theoretical bounds on the SM Higgs mass as a function of the cut-off scale. It is assumed that the SM is a valid theory up to the scale Λ. The upper solid area indicates the sum of theoretical uncertainties in the mH upper bound for mt = 175 GeV. The upper edge corresponds to Higgs masses for which the SM Higgs sector ceases to be meaningful at scale Λ, and the lower edge indicates a value of mH for which perturbation theory is certainly expected to be reliable at scale Λ. The lower solid area represents the theoretical uncertainties in the mH lower bounds derived from stability requirements using mt = 175 GeV and αs(mZ) = 0.118, where αs is the strong coupling constant [6].
1.2.1.4 Combining the Theoretical Arguments
Therefore it is seen that when λ is small (a light Higgs boson) radiative
corrections from the top quark and gauge couplings become important and lead
to a lower limit on the Higgs boson mass from the requirement of vacuum sta-
bility, λ(Λ) > 0. If λ is large (a heavy Higgs boson) then triviality arguments,
(1/λ(Λ) > 0), lead to an upper bound on the Higgs mass. The allowed region
for the Higgs mass from these considerations is shown in Figure 1.4 as a function
of the scale up to which the Standard Model is expected to be valid and no new
physics is required, Λ. The width of the bands corresponds to the theoretical
uncertainties due to the truncation of the perturbative expansion and to the
experimental uncertainties of the input parameters. It is known that Λ ≤ MP
because at energies above the Planck scale MP ≈ 10^19 GeV quantum gravitational
effects become significant and the Standard Model must be replaced by a
more fundamental theory which incorporates gravity. According to Figure 1.4,
i.e. considering mt = 175 GeV, if the Standard Model is valid up to the Planck
scale, the Higgs mass is bounded to the mass range [9]:
130 GeV . mH . 190 GeV (1.72)
On the other hand, if a Higgs boson with a higher mass is found, it will
imply a breakdown of the Standard Model at a lower energy scale. At the scale
Λ = 1 TeV, the Higgs boson can lie in a broader mass region:
55 GeV . mH . 700 GeV (1.73)
1.2.2 Experimental Limits on the Higgs Mass
Experimental limits for the Higgs mass are deduced both from direct [1, 10]
and indirect searches [5, 7, 8, 11, 12].
1.2.2.1 Direct Searches
The first direct search was carried out at LEP. The LEP collider operated
in two phases: in the first (LEP I) the center of mass energy was close to
mZ , while in the second (LEP II) the energy was gradually increased from
189 GeV to 209 GeV. The principal mechanism for producing the SM Higgs
boson in e+e− collisions at LEP energies was Higgs-strahlung in the s channel,
e+e− → Z(∗) → HZ, where the electron and the positron annihilate producing
a virtual vector boson which becomes real emitting the Higgs boson. The Z
boson in the final state is either virtual (LEP I), or on mass shell (LEP II).
The Standard Model Higgs searches at LEP13 were concentrated in four final
state topologies:
• four-jet topology: H → bb and Z → qq;
• τ -lepton production: H → τ+τ− and Z → qq or H → bb and Z → τ+τ−;
13At LEP I, only the modes Z → l+l− and Z → νν were used because the backgrounds in the other channels were prohibitive. For the data collected at LEP II, all decay modes were used.
• missing energy14, mainly in the process H → bb and Z → νν;
• leptonic final states: H → bb and Z → e+e−, µ+µ−.
The combination of the data of the four LEP experiments (ALEPH, DELPHI, L3
and OPAL) set a lower bound of mH > 114.4 GeV at 95% confidence level
(C.L.) [1].
Another accelerator where direct searches have been carried out was the Tevatron
at Fermi National Accelerator Laboratory, a proton-antiproton collider running
at √s = 1.96 TeV until it shut down on September 30, 2011.
Figure 1.5: Higgs boson production cross sections (fb) at the Tevatron (√s = 1.96 TeV) for the most relevant production mechanisms as a function of the Higgs boson mass [13].
The decays mode of the Higgs boson and its production in hadron colliders will
be discussed in Section 1.3. However, the most important Higgs production
mechanisms at Tevatron, whose cross sections are visible in Figure 1.5, were:
• gluon fusion: gg → H;
• associated production with a vector boson (Higgs-strahlung): qq →W±H
or ZH;
• vector boson fusion: qq → qqH, where the quarks radiate weak gauge
bosons that fuse to form the H.
14If a particle traverses the detector without leaving a trace, the total sum of momenta is found to be non-zero. The energy corresponding to the particle is said to be “missing”.
For masses less than about 135 GeV, the most promising discovery channel
was the associated production with H → bb. The final states analyzed for
the vector bosons were: W± → l±ν (l = e, µ, τ), Z → νν and Z → l+l−
(l = e, µ). In addition, both Tevatron collaborations (CDF and DØ) searched
for the SM H → τ⁺τ⁻ decay, where the gg → H, WH, ZH and vector boson fusion
production processes were considered. Another search channel was the SM
process H → γγ, making use of all production modes. The CDF Collaboration
searched also for WH + ZH → jjbb processes, making use of signal events in
which the vector boson decays to jets.
In the high mass region, above 135 GeV, studies were focused on the
gg → H → WW(*) → l⁺νl⁻ν channel15 (l = e, µ, τ). Nevertheless, WH production,
ZH production and vector boson fusion qqH production contribute additional
signal in this channel, which was used in the searches.
The sensitivity reached by combining the results from all the analyses carried out
by the two experiments at the Tevatron, CDF and DØ, enabled new limits to be
set on the SM Higgs boson mass in 2011. Analyzing an integrated luminosity from
4.3 fb⁻¹ up to 10.0 fb⁻¹ (CDF) and up to 9.7 fb⁻¹ (DØ) at √s = 1.96 TeV, the ratios
of the 95% C.L. expected and observed limit to the SM cross section are shown
in Figure 1.6 for the combined CDF and DØ analyses. The regions of Higgs
boson masses excluded at the 95% C.L. are 147 GeV < mH < 179 GeV and
100 GeV < mH < 106 GeV. The expected exclusion region, given the current
sensitivity, is 141 GeV < mH < 184 GeV and 100 GeV < mH < 119 GeV
(masses below mH < 100 GeV were not studied). There is an excess of data
events with respect to the background estimation in the mass range 115 GeV
< mH < 135 GeV. At mH = 120 GeV, the p-value for a background fluctuation
to produce this excess is ∼ 3.5 × 10−3, corresponding to a local significance of
2.7σ. The global significance for such an excess anywhere in the full mass range
is approximately 2.2σ [10].
1.2.2.2 Indirect Searches
Indirect evidence for a light Higgs boson can be derived from the high-
precision measurements of electroweak observables at LEP and elsewhere. In-
deed, the fact that the Standard Model is renormalizable only after including the
top and Higgs particles in the loop corrections indicates that the electroweak
observables are sensitive to the masses of these particles. The Higgs boson en-
ters into one-loop radiative corrections in the Standard Model and so precision
electroweak measurements can bound the Higgs boson mass. For example, the
Fermi coupling can be rewritten in terms of the weak coupling and the W mass;
15The star indicates that below the H → W⁺W⁻ threshold, one of the W± bosons is virtual.
[Figure 1.6: Tevatron Run II preliminary combination, L ≤ 10 fb⁻¹ (February 27, 2012); 95% C.L. limit/SM versus mH (GeV/c²), with the LEP, Tevatron, ATLAS and CMS exclusion regions indicated.]
Figure 1.6: Observed and expected (median, for the background-only hy-pothesis) 95% C.L. upper limits on the ratios to the SM cross section, asfunctions of the Higgs boson mass for the combined CDF and DØ analyses.The limits are expressed as a multiple of the SM prediction for test masses(every 5 GeV/c2) for which both experiments have performed dedicatedsearches in different channels. The points are joined by straight lines forbetter readability. The bands indicate the 68% and 95% probability regionswhere the limits can fluctuate, in the absence of signal [10].
at lowest order, GF/√2 = g²/(8mW²). After substituting the electromagnetic
coupling α, the electroweak mixing angle and the Z mass for the weak coupling
and the W mass, this relation can be rewritten as
\[
\frac{G_F}{\sqrt{2}} = \frac{2\pi\alpha}{\sin^2 2\theta_W\, m_Z^2}
\left[ 1 + \Delta r_\alpha + \Delta r_t + \Delta r_H \right] \tag{1.74}
\]
The ∆ terms take account of the radiative corrections: ∆rα describes the shift
Figure 1.7: One-loop diagrams for the contributions to the W mass involving the Higgs boson (a) and the top quark (b).
in the electromagnetic coupling α if evaluated at the scale mZ instead of zero-
momentum; ∆rt denotes the top (and bottom) quark contributions to the W
and Z masses, which are quadratic in the top mass. Finally, ∆rH accounts for
the virtual Higgs contributions to the masses; the Higgs boson, for instance,
enters in the one-loop corrections to the W mass, illustrated in Figure 1.7. This
latter term depends only logarithmically on the Higgs mass at leading order [7]:
\[
\Delta r_H = \frac{G_F m_W^2}{8\sqrt{2}\,\pi^2} \cdot \frac{11}{3}
\left[ \log\frac{m_H^2}{m_W^2} - \frac{5}{6} \right]
\qquad (m_H^2 \gg m_W^2) \tag{1.75}
\]
Since the dependence on the Higgs boson mass is only logarithmic, the limits
derived on the Higgs boson from this method are relatively weak. In contrast,
the top quark contributes quadratically to many electroweak observables. It is
proved that at one-loop all electroweak parameters have at most a logarithmic
dependence on mH . This fact has been glorified by the name of the “screening
theorem”. Since in general electroweak radiative corrections involving the Higgs
boson take the form [5],
\[
g^2\left( \log\frac{m_H}{m_W} + g^2 \frac{m_H^2}{m_W^2} + \dots \right) \tag{1.76}
\]
the quadratic effects in the Higgs mass are always screened by two additional
powers of g relative to the lower order logarithmic effects and so radiative cor-
rections involving the Higgs boson can never be large, being dominated by the
logarithmic term. Although the sensitivity on the Higgs mass is only logarith-
mic, the increasing precision in the measurement of the electroweak observables
allows one to derive interesting estimates and constraints on the Higgs mass. The
Higgs boson mass can be extracted indirectly from precision fits of all the mea-
sured electroweak observables, within the fit uncertainty. This is actually one
of the most important results that can be obtained from precision tests of the
Standard Model and greatly illustrates the predictivity of the Standard Model
itself. The two observables that are most sensitive to the Higgs boson mass
are the W boson mass and the effective leptonic weak mixing angle sin2 θleff16.
Even if the precision of mW is better compared to sin2 θleff , the latter has a
more pronounced dependence on mH .
Stringent limits on the Higgs boson mass are set by a global fit, performed historically by the LEP Electroweak Working Group and more recently with the Gfitter toolkit. The SM predictions for the electroweak precision observables measured by the LEP, SLC (Stanford Linear Collider), and Tevatron experiments are fully implemented in the Gfitter software [11, 12]. The list of floating fit SM parameters is17: mZ, mH, mc, mb, mt, ∆α(5)had(m²Z) and αs(m²Z), where only the last parameter is kept fully unconstrained, allowing an independent measurement. In particular, for the running quark masses mc and mb, which are the MS renormalized masses of the c and b quarks, the world average values are used as input data. The combined top-quark mass is taken from the Tevatron Electroweak Working Group.
The fit is performed by minimizing the test statistic χ², which quantifies the difference between the measurements and the SM predictions. The results are produced either ignoring (“standard fit” or “blue-band”) or including (“complete fit”) the direct searches at LEP, at the Tevatron and at the LHC. The Gfitter values, using the Tevatron results from July/August 2011, are the following:
Electroweak precision data only:  mH = 95 +30/−24 [+74/−43] GeV,   (1.78)
Including direct searches:  mH = 125 +8/−10 [+21/−11] GeV,   (1.79)
where the errors are quoted at the 1σ [2σ] level.

16 sin²θleff is a particular renormalization prescription of

sin²θW = 1 − m²W/m²Z   (1.77)

which can be taken as the tree-level definition of the Weinberg angle.

17 The masses of leptons and light quarks are fixed to their world average values; GF has been determined through the measurement of the µ lifetime, giving GF = 1.16637(1)·10⁻⁵ GeV⁻². The leptonic and top-quark vacuum polarisation contributions to the running of the electromagnetic coupling are precisely known or small. Only the hadronic contribution of the five lighter quarks, ∆α(5)had(m²Z), adds significant uncertainties and replaces the electromagnetic coupling at the Z peak, α(m²Z), as a floating parameter in the fit.

The results, very similar to those obtained by the LEP group, are illustrated in Figure 1.8, which shows ∆χ² ≡ χ² − χ²min of the global least-squares fit of the Standard Model predictions to
the electroweak data as a function of the Higgs boson mass. The preferred
value for its mass corresponds to the minimum of the curve. The data clearly
favours a low mass Higgs boson. It may be concluded from these numbers that
Figure 1.8: ∆χ² as a function of mH for the “standard” fit. The solid (dashed) line gives the results when including (ignoring) theoretical errors [14].
the canonical formulation of the Standard Model including the existence of a
light Higgs boson is compatible with the electroweak data. However, alternative
mechanisms cannot be ruled out if the system is opened up to contributions from
physics beyond the Standard Model.
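The logic of the ∆χ² scan above can be sketched with a deliberately simplified toy model: a single pseudo-observable with an invented logarithmic dependence on mH, scanned to extract the preferred mass and the 1σ band. This is only a cartoon of what the Gfitter fit does, with made-up numbers:

```python
import math

# Toy pseudo-observable O(m_H) = a + b*log(m_H/100 GeV); all numbers invented.
a, b = 80.385, -0.060          # toy prediction parameters (GeV)
o_meas, sigma = 80.376, 0.015  # toy "measurement" and its uncertainty (GeV)

def chi2(m_h):
    prediction = a + b * math.log(m_h / 100.0)
    return ((o_meas - prediction) / sigma) ** 2

masses = list(range(50, 1001))                 # scan 50..1000 GeV
curve = [(m, chi2(float(m))) for m in masses]
chi2_min = min(c for _, c in curve)
best_m = min(curve, key=lambda mc: mc[1])[0]
one_sigma = [m for m, c in curve if c - chi2_min <= 1.0]  # Delta chi^2 <= 1 band
print(best_m, one_sigma[0], one_sigma[-1])
```

The asymmetric band reflects the logarithmic dependence on mH, the same feature that makes the real curve in Figure 1.8 steep at low mass and shallow at high mass.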
1.3 Higgs at the LHC
In the Standard Model once the Higgs boson mass is fixed, all the properties
of the particle are uniquely determined. The SM Higgs couplings to fundamental
fermions are proportional to the fermion masses, and the couplings to bosons
are proportional to the squares of the boson masses. In particular, its couplings
to gauge bosons, Higgs bosons and fermions are given by [1]:
gHff = mf/v ,   gHVV = 2m²V/v ,   gHHVV = 2m²V/v² ,
gHHH = 3m²H/v   and   gHHHH = 3m²H/v²   (1.80)
where V = W± or Z. These are s-wave couplings, which are even under parity and charge conjugation, corresponding to the JPC = 0++ assignment of the Higgs spin and parity quantum numbers.
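To make the proportionalities in (1.80) concrete, the following sketch (my own illustration; v = 246 GeV and approximate particle masses are assumed, with an illustrative mH = 125 GeV) evaluates the couplings numerically:

```python
# Numerical values of the couplings in Eq. (1.80), with v = 246 GeV.
# Particle masses are approximate and used for illustration only.
v = 246.0                        # vacuum expectation value, GeV
m_w, m_z = 80.4, 91.2            # gauge boson masses, GeV
m_top, m_b = 172.5, 4.18         # quark masses, GeV
m_h = 125.0                      # an illustrative Higgs mass, GeV

g_Hff  = {"top": m_top / v, "b": m_b / v}            # Yukawa couplings
g_HVV  = {"W": 2 * m_w**2 / v, "Z": 2 * m_z**2 / v}
g_HHVV = {"W": 2 * m_w**2 / v**2, "Z": 2 * m_z**2 / v**2}
g_HHH  = 3 * m_h**2 / v          # trilinear Higgs self-coupling
g_HHHH = 3 * m_h**2 / v**2       # quartic Higgs self-coupling

print(g_Hff, g_HVV, g_HHH)
```

The hierarchy g_Hff(top) ≫ g_Hff(b) is what makes the top loop dominate the Hgg and (together with the W) the Hγγ couplings discussed below.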
In the next sections it will be argued that in Higgs boson production and decay
processes, the dominant mechanisms involve the coupling of the H to the W±,
Z and/or the third generation quarks and leptons. The Higgs boson coupling
to gluons, Hgg, is induced at leading order by a one-loop graph in which the
H couples mostly to a virtual tt pair. Likewise, the Higgs boson coupling to
photons, Hγγ, is also generated via loops, although in this case the one-loop
graph with a virtual W+W− pair provides the dominant contribution.
1.3.1 Production Mechanisms
A Higgs particle can be created from the fusion of beam particle constituents,
from the fusion of heavy particles produced at the collision, or it can be radiated
off a massive virtual particle. In Figure 1.9 the first process (A) is referred to
as direct production, whereas the other processes (B-E) are referred to as as-
sociated production. The distinction is important: the final state of associated
production will also contain the signatures of the two quarks or the massive
vector boson that radiated the Higgs. Their experimental signatures can be
used as a label in the search for Higgs events.
As illustrated by the diagrams in Figure 1.9, the four main production mecha-
nisms for the Standard Model Higgs boson at the LHC are [7, 15, 13, 16]:
• gluon fusion: gg → H;
• vector boson fusion (VBF): qq → qq +H;
• associated production with W/Z (or Higgs-strahlung off W, Z): qq → W/Z + H;
• associated production with heavy quarks (or Higgs bremsstrahlung off
heavy quarks): gg, qq → QQ+H.
In addition to these main mechanisms, one could add bottom fusion bb→ H+X
and Higgs production in association with single top qb → qtH. In Figure 1.10
the fully inclusive Higgs boson production cross sections for √s = 7 TeV at
the LHC (for a description of the experimental apparatus see Chapter 2) are
summarized as a function of the Higgs mass.
For the 7 TeV proton-proton collider, the dominant process for Higgs production is gluon fusion. The Higgs boson does not couple directly to the gluons, as gluons are massless. Instead, the coupling is mediated by a quark loop, most often a top loop and, to a lesser extent, a bottom loop, as the Higgs-quark coupling is proportional to the quark mass.
The Higgs production cross section from weak vector boson fusion is typically
Figure 1.9: Feynman diagrams of the main Higgs production processes: gluon fusion (A), W+W− and ZZ fusion (B), W± and Z Higgs-strahlung (C), qq fusion (D), and Higgs bremsstrahlung off a top or bottom quark (E).
an order of magnitude lower than gluon fusion. At higher Higgs masses, how-
ever, it becomes important, because of the decrease in cross section for gluon
fusion. It will be competitive with the dominant gluon fusion mechanism for
large masses, mH ∼ 1 TeV. In this reaction two quarks radiate two vector bosons which annihilate, producing the Higgs boson. Moreover, the peculiar final state with two hard jets in the forward and backward regions of the detector (coming from the fragmentation of the two quarks after the vector boson radiation), with a large rapidity gap between them, provides a distinctive signature that can be efficiently used to disentangle the signal from the background18.
Note also that the main contribution to the cross section is due to the W+W−
fusion channel, σ(W+W− → H) ∼ 3σ(ZZ → H) at the LHC, a consequence of
the fact that the W boson couplings to fermions are larger than those of the Z
boson.
Besides the fact that the gluon fusion and the VBF are important for the high
cross sections, they are also complementary because the former is specified by
the Yukawa Higgs boson couplings to fermions, while the latter is fixed by the
gauge Higgs boson couplings to vector bosons. Since the Yukawa and the gauge
sectors are not really connected, it is important to explore both these production processes in order to understand the role of the Higgs boson in the SM Lagrangian.

18 Since it is a pure electroweak process, there are no color fields connecting the two quarks. The result is that gluons cannot be emitted in the central part of the detector but are mostly radiated collinearly with respect to the interacting quarks, while the Higgs boson decays in the central part of the detector. A Central Jet Veto is therefore an effective cut to discriminate between signal and backgrounds.
The Higgs boson production in association with a vector boson (W, Z) is called Higgs-strahlung. The Higgs-strahlung production processes have even lower cross sections. The production in association with a W± or a Z boson is important in the intermediate mass region (mH < 2mZ); its cross section is about one to two orders of magnitude smaller than the gluon fusion cross section for Higgs masses mH < 200 GeV. Its cross section falls rapidly with increasing mH, but the associated W± or Z can decay into leptons, which allow for effective tags.
The Higgs production in association with heavy quarks, which are mostly top quarks, is less important because its cross section is about five times smaller than that for W±H or ZH for mH < 200 GeV; for Higgs masses above 500 GeV it rises above the W±H/ZH cross sections, but it is still far below the gluon fusion cross section. However, for light Higgs bosons, ttH is expected to be the only channel where H → bb is observable. So although the largest cross section over the whole mass range is that of gg → H production, it is often convenient to consider the associated production channels.
Considering that the maximum Higgs boson production cross section for low
Higgs masses is σH ≈ 30 pb and the total pp cross section at the LHC is
σtot ≈ 110 mb, a major challenge of the LHC experiments becomes clear: com-
pared to other pp reactions, the Higgs boson signal is suppressed by ten orders
of magnitude. Extreme care has to be taken to understand and reject the back-
ground processes.
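The quoted suppression factor can be checked with a one-line computation (using the cross-section values given above):

```python
import math

# Cross sections quoted in the text: sigma_H ~ 30 pb, sigma_tot ~ 110 mb.
PB_PER_MB = 1e9                       # 1 mb = 10^9 pb
sigma_h   = 30.0                      # pb
sigma_tot = 110.0 * PB_PER_MB         # pb

ratio  = sigma_h / sigma_tot          # fraction of pp events producing a Higgs
orders = -math.log10(ratio)           # suppression in orders of magnitude
print(ratio, orders)
```

The ratio of a few times 10⁻¹⁰ is indeed a suppression of about ten orders of magnitude.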
[Figure: σ(pp → H+X) [pb] as a function of MH at √s = 7 TeV (LHC HIGGS XS WG 2010); curves: pp → H (NNLO+NNLL QCD + NLO EW), pp → qqH (NNLO QCD + NLO EW), pp → WH (NNLO QCD + NLO EW), pp → ZH (NNLO QCD + NLO EW), pp → ttH (NLO QCD).]
Figure 1.10: Higgs boson production cross sections (pb) at the LHC, √s = 7 TeV, for different production channels as a function of the Higgs mass [17].
1.3.2 Decay Modes
In the Standard Model the possible Higgs boson decay channels are to pairs of fermions or bosons; the most important decay modes are shown in Figure 1.11. Like the production, the decay of the Higgs depends on the mass of the boson, and the Branching Ratios (BRs19), which are the probabilities of decay into specific final states, are shown for the various Higgs decay channels in Figure 1.12 as a function of the Higgs mass. The BRs are known at next-to-leading order (NLO), including both QCD and electroweak corrections. Besides the production modes, the branching ratio is the second main feature that has to be considered in order to find the best channels in which the Higgs boson might be detected. However, one cannot simply choose the final states with the highest branching ratios, since it is necessary to consider whether such signal events can actually be recognized among the backgrounds. There are two kinds of background: non-signal events with the same experimental signature (“irreducible” background) and misidentifications by the detector (“reducible” background). The latter is set by the quality of the detector and is therefore controllable to a certain extent. The former may be reduced by applying constraints on the experimental signature in the final analysis. In Table 1.4 the main decay channels considered by ATLAS and their reducible and irreducible backgrounds are listed [15, 16, 18].
The Higgs boson coupling to a particle is proportional to the mass of that
Figure 1.11: Tree level (first diagrams from left) and loop processes (last two diagrams from left) for the Higgs boson decays.
particle, as seen in (1.80), thus its main decay channels are to the most massive
particles, insofar as the decay is kinematically allowed. So certain decay possi-
bilities open up only if mH is larger than the kinematical threshold as can be
seen in Figure 1.12. Note in particular the opening of the WW channel, which
causes a dip in the branching ratio of the ZZ(∗) channel and the subsequent
19 The Branching Ratio of an initial state |i〉 into a final state |f〉 is defined as the ratio between the partial decay width |i〉 → |f〉 and the total decay width of |i〉:

BR(|i〉 → |f〉) = Γ|i〉→|f〉 / Γtot   (1.81)
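The definition in (1.81) can be illustrated with a toy set of partial widths (the numbers below are invented purely to show the normalization):

```python
# BR(|i> -> |f>) = Gamma(i -> f) / Gamma_tot, as in Eq. (1.81).
# The partial widths below (in MeV) are invented toy numbers.
partial = {"bb": 2.3, "WW*": 0.9, "gg": 0.35, "tautau": 0.25,
           "cc": 0.12, "ZZ*": 0.10}
gamma_tot = sum(partial.values())      # total width = sum of partial widths

br = {mode: g / gamma_tot for mode, g in partial.items()}
print(gamma_tot, br["bb"])             # BRs sum to unity by construction
```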
Figure 1.12: The main branching ratios BR(H) of the Standard ModelHiggs decay channels [19].
Decay channel             | Irreducible background          | Reducible background
WH, H → bb                | WZ → lνbb, Wbb → lνbb           | tt → WWbb, Wj
ttH, H → bb               | ttZ, ttbb                       | ttjj, Wjjjj
(WH, ttH, ZH), H → γγ     | qq → γγ, gg → γγ, qg → qγ → γγ  | jj, γj, Z → ee
H → τ+τ−                  | Z → τ+τ− + jets                 | Z → ll + jets (l = e, µ), W → lν + jets, tt (+jets), WW/WZ/ZZ + jets, QCD di-jets
H → ZZ(∗) → 4l            | ZZ(∗), Zγ(∗)                    | tt → 4l + X, Zbb → 4l
H → WW(∗) → lνlν          | WW(∗), WZ/ZZ → lνlν + X         | tt → WWbb → lνlν + X, Wt → WWb → lνlν + X
H → ZZ → 4l               | Z(γ(∗))Z(γ(∗)) → 4l             |
H → ZZ → llνν             | ZZ                              | Zjj, WZ/ZZ, tt
H → WW → lνjj             | WW → lνjj                       | Wjj, tt → lνjjbb

Table 1.4: Higgs decay channels searched for by ATLAS and their irreducible and reducible backgrounds.
opening of the ZZ channel which re-establishes the decay into both WW and
ZZ.
Since the pole masses of the gauge bosons and fermions are known (the electron and light quark masses are too small to be relevant), all the partial widths for the Higgs decays into these particles can be predicted. The decay widths into massive gauge bosons V = W, Z and into fermions are directly proportional to the HVV and Hff couplings, respectively. At leading order the partial decay widths are given by the expressions [16]:
Γ(H → ff) = Nc (GF mH m²f)/(4√2 π) · [1 − 4m²f/m²H]^(3/2) ;   (1.82)

Γ(H → VV) = (GF m³H)/(16√2 π) · δV √(1 − 4x) (1 − 4x + 12x²) ,   x = m²V/m²H ,   (1.83)

where Nc is the color factor, Nc = 3 (1) for quarks (leptons), δW = 2 and δZ = 1.
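A quick numerical sketch of (1.82) and (1.83) (my own illustration, with approximate masses; on-shell decays only) reproduces the hierarchy of widths visible in Figure 1.12:

```python
import math

G_F = 1.16637e-5                   # Fermi constant, GeV^-2
m_w, m_z, m_b = 80.4, 91.2, 4.18   # approximate masses in GeV

def gamma_ff(m_h, m_f, n_c):
    """Leading-order Gamma(H -> f fbar) of Eq. (1.82), in GeV."""
    beta2 = 1.0 - 4.0 * m_f**2 / m_h**2
    return n_c * G_F * m_h * m_f**2 / (4.0 * math.sqrt(2.0) * math.pi) * beta2**1.5

def gamma_vv(m_h, m_v, delta_v):
    """Leading-order Gamma(H -> VV) of Eq. (1.83) for on-shell V, in GeV."""
    x = m_v**2 / m_h**2
    return (G_F * m_h**3 / (16.0 * math.sqrt(2.0) * math.pi) * delta_v
            * math.sqrt(1.0 - 4.0 * x) * (1.0 - 4.0 * x + 12.0 * x**2))

m_h = 200.0                        # an illustrative heavy Higgs mass
g_bb = gamma_ff(m_h, m_b, 3)       # a few MeV
g_ww = gamma_vv(m_h, m_w, 2.0)     # about 1 GeV
g_zz = gamma_vv(m_h, m_z, 1.0)     # about 0.4 GeV
print(g_bb, g_ww, g_zz)
```

At mH = 200 GeV the gauge boson widths are of order 1 GeV while Γ(H → bb) is only a few MeV, which is why the fermionic BRs collapse once the WW and ZZ channels open.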
The massive decay particles usually have short lifetimes themselves, thus only their decay products are detected. The signatures of these decay products are roughly divided into four categories: charged tracks (e.g. electrons, muons), jets
(e.g. quarks, hadrons), electromagnetic showers (e.g. photons, electrons), and
missing transverse energy (e.g. neutrinos). With the exception of the H → γγ
channel, the search for Higgs boson events starts with the reconstruction of
the intermediate particles from such signatures. The Higgs boson itself is then
reconstructed as a mass resonance, either from the final decay products or from
the reconstructed intermediates. The most promising channels are those where
the final state of the Higgs event stands out clearly against the huge background
of soft (hadronic) events. This limits the ability to discover the Higgs in several
of the existing channels.
The varying production cross sections and decay branching ratios of the Higgs result in different approaches to a direct search at ATLAS20, which will eventually be able to detect a SM Higgs boson over the full kinematic range between the LEP limit and a theoretical upper limit of 1 TeV. To discuss the Higgs decays,
it is useful to split the mass range from 100 GeV to 1 TeV in three regions:
• the “low mass” range: mH ≲ 130 GeV,
• the “intermediate mass” range: 130 GeV ≲ mH ≲ 180 GeV,
• the “high mass” range: 180 GeV ≲ mH ≲ 1 TeV.
20 ATLAS is one of the experiments installed at the LHC; its detector will be described in Chapter 2.
1.3.2.1 Low Mass Region
In the “low mass” range (mH ≲ 130 GeV), the most important decay channel for the Higgs, compatible with the accessible phase space, is H → bb (BR ∼ 75−50% for mH = 115−130 GeV, see Figure 1.12). Then there is
a set of final states with branching ratios one order of magnitude smaller than
bb, which are τ+τ− (BR∼ 7−5%), gg (BR∼ 7%) and cc (BR∼ 3−2%). Finally
the last ones with a probability of a few per mille are γγ and Zγ. As already
mentioned, decays to massless particles, gluons and photons, proceed through a
virtual loop of heavy fermions and/or gauge bosons with the major contribution
coming from the top quarks in the gluon channel and the W boson in case of
photons. Among all these final states not all of them can be used.
H → bb has the highest branching ratio at low masses, but it suffers from huge QCD backgrounds (the main background of this channel is the continuum tt production). The detection of the Higgs boson is therefore not feasible in the inclusive production; associated production can be used to gain additional rejection. This can be obtained by considering final states such as ttH, WH, ZH and exploiting the leptons coming from the decays of gauge bosons and top quarks21. As listed in Table 1.4, the main irreducible backgrounds to this channel for the associated production WH (searched for in the final state lνbb) are WZ → lνbb and Wbb. The reducible backgrounds come largely from events with a bb pair in the final state, mainly tt → WWbb, and from events in which jets are misidentified as b-jets, mainly W + jet. For the ttH production channel, the complexity of the final state (lνjjbbbb, in which the lepton pair and the jet pair come from the decays of the W s coming from the top quarks, one bb pair comes from the top quarks and the other from the Higgs) allows one to reduce the backgrounds, due to W + jets and QCD multijet production; for instance, four b-jets are required in the event. The irreducible backgrounds are due to ttbb and ttZ.
The search for the H → bb channel requires an efficient selection of b-jets22 among the jets, for the reconstruction of their invariant mass, and a good identification of the leptons. Experimentally, high-pT leptons (pT > 20 GeV) are searched for in the final state as the channel signature. This decay poses further problems in the reconstruction: the Higgs mass has to be reconstructed from two jets, with difficulties coming from the invisible energy of escaping neutrinos and from the energy lost outside the jet cone. As a result the mass peak will be wide. The ttH channel
also suffers from the combinatorial problem of selecting the correct combination of b-jets.

21 The HZ production is already suppressed in comparison with the HW production; taking into account the factor-of-three lower branching ratio for leptonic decays, the HZ production mode is of limited interest, as the rate will be low.

22 The method, called b-tagging, is based either on the long lifetime of the b-quarks, which causes secondary vertices, or on the high rate of leptons in B meson decays. While HW will in general have two b-quarks in the final state, Htt will have four, as t → Wb.
The H → γγ decay channel, despite its very low branching ratio of O(10−3), is one of the most important channels in the low mass range. The Higgs boson does
not couple directly to photons, which are massless particles. The H → γγ decay
is however possible through a W± boson or top quark loop, where the Higgs
couples to this virtual W± boson or top quark which in turn couples to the two
photons.
The trigger requires two isolated electromagnetic clusters. This decay mode suffers
less from background processes than the H → bb decay in associated produc-
tion23 and it is therefore the preferred channel in this mass range. Indeed, it
has a very clear signature with two isolated very energetic photons in the final
state forming a narrow invariant mass peak. The associated resolution on the
Higgs mass is around 1− 2 GeV.
This decay mode is particularly interesting for the case of the associated produc-
tion. In fact, if the associated W± or Z boson decays leptonically, the (isolated)
lepton can be used to find the decay vertex, yielding a better mass resolution
than what is possible for the direct production.
The irreducible background comes from the direct production of γγ (together
with W , Z or tt in the case of associated production), while the main reducible
background comes from jj and jγ final states where the jets have been misidentified as photons. This can occur especially if the jet is composed of one leading π0 and a number of soft hadrons. The rejection of these jets requires high angular granularity to distinguish the two photons coming from the π0 decay. Reducible backgrounds also include Z → e+e− events, in which electrons are misidentified as photons. These events are only a problem if mH ≈ mZ, which is ruled out by the LEP results. A photon that traverses material may be absorbed and subsequently knock a high-energy electron out of the material. If this happens before the photon is detected, then the electron is observed instead. Similarly, a high-energy electron may lose a sizable fraction of its energy by emitting a bremsstrahlung photon. Both processes lead to misidentification of photons.
It is the reducible background that puts strict constraints on the detector. For
a general purpose detector, the ability to reconstruct photon conversions is
required, as well as a good electron identification. In this region the Higgs
natural width is negligible, as a result the sensitivity is heavily affected by the
di-photon mass resolution, which calls for excellent reconstruction of the energy
and the direction of the photons. This is obtained by combining the high-granularity presampler and the first strip layer of the ATLAS electromagnetic calorimeters; a good separation of photons from jets is in fact required, along with a good resolution in the energy and in the opening angle of the photons.

23 The same decay can occur through direct production or through the associated production of WH, ZH or ttH, in which the final state contains jets generated from gauge boson or top quark decays.
The H → τ+τ− decay mode has the second highest branching fraction at low masses.
The final states depend on the tau decays: 42% of the time both taus decay
hadronically, in 46% of the cases one goes to hadrons and the other one to an
electron or a muon, and the remaining 12% are fully leptonic modes. The fully
hadronic channels are very challenging, while semi-leptonic and leptonic ones
(H → τ+τ− → ντντ νl−νl+, where l = e, µ) can more easily lead to a discovery
at LHC using the vector boson fusion production mode. The tau products in
the central region and tagging jets from VBF in the forward part of the detector
with a rapidity gap containing little hadronic activity are a rare topology for
QCD events. The rapidity gap arises from the lack of color flow between the
initial interacting particles. A central jet veto is therefore included allowing ef-
fective discrimination against backgrounds. Neutrinos in the final state prevent
the full reconstruction of the event and the Higgs mass is calculated using the
collinear approximation, which assumes that the visible decay products of the τ follow the same direction as the τ itself. This is a good approximation since mH/2 ≫ mτ and hence the taus are highly boosted. Resolutions of around 30% on mH are obtained.
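The collinear approximation described above can be sketched as follows (my own minimal implementation, not ATLAS code; four-vectors are simple (px, py, pz, E) tuples):

```python
import math

def collinear_mass(p1, p2, met):
    """
    Di-tau invariant mass in the collinear approximation (a sketch of the
    method described in the text). p1, p2 are the visible tau decay
    products as (px, py, pz, E); met is (mex, mey).
    """
    # Solve met = (1/x1 - 1)*p1_T + (1/x2 - 1)*p2_T for the visible
    # momentum fractions x1, x2 (a 2x2 linear system in the transverse plane).
    det = p1[0] * p2[1] - p1[1] * p2[0]
    x1 = 1.0 / (1.0 + (met[0] * p2[1] - met[1] * p2[0]) / det)
    x2 = 1.0 / (1.0 + (p1[0] * met[1] - p1[1] * met[0]) / det)
    # Visible invariant mass, then scale up by the momentum fractions
    e = p1[3] + p2[3]
    px, py, pz = (p1[i] + p2[i] for i in range(3))
    m_vis = math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
    return m_vis / math.sqrt(x1 * x2)   # m_tautau ~ m_vis / sqrt(x1*x2)

# Toy event: massless taus carrying 60% and 80% visible momentum fractions.
p1, p2 = (30.0, 0.0, 0.0, 30.0), (-24.0, 32.0, 0.0, 40.0)
met = (14.0, 8.0)
print(collinear_mass(p1, p2, met))
```

In this exactly collinear, massless toy event the method recovers the true di-tau mass; in real events the finite MET resolution smears it, giving the ∼30% resolution quoted above.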
The main sources of background are the Z + jets and tt production; the secondary backgrounds include W + jets and single top production.
The main background after the selection cuts is Z → τ+τ−, with an invariant
mass peak close to the signal for mH . 130 GeV.
In conclusion, the bb decay mode is typically overwhelmed by the QCD background, while cc and gg are never considered because, firstly, if the hadronic final states could be detected they would be dominated by bb and, secondly, there are no proper “tagging” algorithms for c quarks and gluons as there are for b quarks. Therefore at the LHC the two main decay modes explored in the low mass range are τ+τ− and γγ. Figure 1.13 shows the achievable discovery sensitivity for these channels at 30 fb−1; the importance of ttH in the region around the LEP limit is evident from the plot.
1.3.2.2 Intermediate Mass Region
If 130 GeV ≲ mH ≲ 2mZ, the Higgs boson decays mostly into vector bosons: WW(∗) or ZZ(∗), with one virtual gauge boson (denoted by the “∗” superscript) below the 2mV (V = W, Z) kinematical threshold.
The main decay mode of the Z boson is hadronic (∼ 70%), typically resulting in
two jets. A large fraction of the leptonic decays are to two neutrinos (∼ 20%),
which are invisible. Decays to pairs of electrons, muons, or τs make up ∼ 10%
of the total. The W± boson decays mainly to hadrons as well (∼ 68%). The re-
maining leptonic decays are not to pairs, but always to a neutrino and a charged
lepton because of charge and lepton number conservation [1]. The neutrinos will
result in missing transverse energy.
Apart from ZZ(∗) and WW(∗), the only other decay mode which survives is
the bb decay which has a branching ratio that drops from 50% at mH ∼ 130
GeV to the level of a few percent for mH ∼ 2mW . The WW (∗) decay starts to
dominate at mH ∼ 130 GeV and becomes gradually overwhelming, in particular for 2mW ≲ mH ≲ 2mZ where the W boson is real while the Z boson is still virtual, strongly suppressing the H → ZZ(∗) mode and leading to a WW branching ratio of almost 100% [16]. As can be seen in Figure 1.12, around 160
GeV, the threshold for the production of a pair of real W bosons, all BRs into fermions
and even into ZZ drop.
Therefore at these Higgs boson masses H → WW is the dominant decay mode,
with both W bosons decaying as W → lν. It has the highest discovery potential (see Figure 1.13)24.
This is a consequence of the large branching ratios and rather clean signature.
The presence of two high-pT isolated leptons and large missing transverse energy
provides efficient trigger and great reduction against QCD processes. Unfortu-
nately, the involvement of neutrinos in the leptonic decays of the W bosons implies that no narrow invariant mass peak can be reconstructed.
The irreducible backgrounds are due to the production of the continuum W+W− → l+νl−ν25 and of ZZ → l+l−νν. The reducible backgrounds come from tt (the two top quarks decay into a pair of W bosons and two b-jets), Wt, Wbb, bb and W + jets, in which a jet is misidentified as an electron (lepton). The dominant backgrounds are in particular WW and tt (as shown in Table 1.4), with real leptons and neutrinos in the final state; they can be distinguished from the signal using the jet activity and the angle between the leptons. For the signal, the charged leptons tend to go together, given the scalar nature of the Higgs and the chirality of the neutrinos. The backgrounds in which the lepton pair comes from a Z are instead rejected by requiring that the invariant mass of the di-lepton system is less than 80 GeV.
All the production modes can be explored, but an exclusive VBF analysis can
further improve signal over background ratios. The use of the vector boson
fusion production features, with the two hard jets in the forward and backward
regions usually identified with a specific forward tagging algorithm, and the lack
of hadronic activity in the central region, indicated by applying a veto on jets with transverse energy above some threshold, allows rejection of the reducible background.

24 In the decay H → W+W− → l+νl−ν plus two jets from the VBF production, the significance for 30 fb−1 of integrated luminosity is above 5σ over the full intermediate mass range.

25 The WW continuum is of the same order as the WW signal.
Up to mH ∼ 150 GeV and above mH ∼ 170 GeV the Higgs boson decay into two Z bosons, each of which decays into two leptons, is important. Between these two values, as already mentioned, the opening of H → WW into on-shell W bosons causes a drop in the H → ZZ BR, and thus reduces its possible contribution to the discovery (see Figure 1.13). As the Higgs boson is not massive
enough, one of the Z bosons will be virtual. The most promising channels are
those where the Z bosons decay into electrons or muons. Although τ leptons
can be used, the reconstruction of a Higgs mass peak from Z → ττ decays is
difficult and relatively inefficient, because of the presence of neutrinos which
escape detection. Furthermore, the efficiency of identifying a τ -pair is rather
low. Therefore, only the decays into muons and electrons, and the backgrounds
that affect these channels, are to be considered.
The H → ZZ(∗) → 4l channel is rather clean and it is therefore referred to as
the “Golden Channel” at ATLAS. The excellent energy resolution and linearity
of the reconstructed electrons and muons leads to a narrow 4-leptons invariant
mass peak on top of a smooth background.
The presence of a real Z provides two high-pT leptons in the final state, together with two other leptons coming from the virtual Z. A mass constraint can be applied to both lepton pairs. The analysis then requires from the detector high-quality lepton identification, lepton trajectory reconstruction, and good lepton momentum resolution. The former two are needed to find the four-lepton signature, and the latter is important in order to reconstruct the intermediate (real) Z bosons, allowing one to reduce the irreducible background coming from the ZZ(∗) and Zγ(∗) continua, followed by purely leptonic decays in the first case and also by pair production in the latter. The most important reducible backgrounds are tt (tt → Wb + Wb → lν + lνc + lν + lνc) and Zbb (Zbb → llll) events that result in a four-lepton final state. The former dominates because of the large top production cross section; the latter is harder to reject because of the real Z in the experimental signature, but isolation requirements on the leptons in the final state, an efficient ability to identify b-jets and impact parameter requirements26 provide a sufficient background rejection.
In this channel the Higgs mass can be fully reconstructed; for mH < 220 GeV, where the Higgs is a narrow resonance, the mass resolution is therefore a driving factor for the discovery potential. H → 4l analyses will be discussed in detail in Chapter 3.
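The reconstruction of the Higgs as a four-lepton mass resonance amounts to summing four-vectors; a minimal sketch with an invented toy event (lepton pairs at an on-shell-Z-like and a lighter Z∗-like mass, all numbers illustrative):

```python
import math

def inv_mass(particles):
    """Invariant mass of a list of (px, py, pz, E) four-vectors, in GeV."""
    px, py, pz, e = (sum(p[i] for p in particles) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

# Toy event: two back-to-back massless lepton pairs, chosen so that the
# pair masses come out at round numbers (80 and 60 GeV).
leptons = [(40.0, 0.0, 0.0, 40.0), (-40.0, 0.0, 0.0, 40.0),
           (0.0, 30.0, 0.0, 30.0), (0.0, -30.0, 0.0, 30.0)]

m12 = inv_mass(leptons[:2])   # first lepton pair (Z-like)
m34 = inv_mass(leptons[2:])   # second lepton pair (Z*-like)
m4l = inv_mass(leptons)       # the reconstructed Higgs candidate mass
print(m12, m34, m4l)
```

The mass constraint on the lepton pairs mentioned above corresponds to requiring m12 (and, where possible, m34) to be compatible with mZ before accepting the 4l candidate.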
26 The impact parameter has average values for the signal that are lower than those for the background.
1.3.2.3 High Mass Region
In the “high mass” range (180 GeV . mH . 1 TeV) the Higgs boson decays
exclusively into the massive gauge boson channels with a branching ratio of
∼ 2/3 for WW and ∼ 1/3 for ZZ final states, slightly above the ZZ threshold
[16]. While the latter involves two identical particles (ZZ), the former includes
two different ones (W±W∓), two bosons with opposite charge, which justifies
the factor 2 between them. Then even at the ZZ kinematic threshold, the
WW final state is still more probable than ZZ because the neutral current
(NC) coupling is smaller than the charged current (CC) one. Finally around
mH & 350 GeV also the decay in tt is allowed, but its branching ratio remains
smaller than WW and ZZ ones (see Figure 1.12). In particular for high Higgs
masses: the H → tt branching ratio is at the level of 20% slightly above the
2mt threshold and starts decreasing for mH ∼ 500 GeV to reach a level below
10% at mH ∼ 800 GeV. The reason is that while the H → tt partial decay
width grows as mH , the partial decay width into (longitudinal) gauge bosons
increases as m3H (see 1.82 and 1.83). Then the decay H → tt is not a Higgs
boson discovery channel at LHC.
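The competition between these two scalings can be sketched numerically. In the Python toy below the overall normalisations are arbitrary placeholders; only the mH versus mH^3 dependence is taken from the text:

```python
# Leading-order scalings quoted in the text (Eqs. 1.82-1.83):
#   Gamma(H -> tt) grows like m_H   (far above the 2*m_t threshold),
#   Gamma(H -> VV) grows like m_H^3 (longitudinal gauge bosons).
# The coefficients c_tt and c_vv are arbitrary: only the mass
# dependence of the ratio matters for the argument.

def br_tt(m_h, c_tt=1.0, c_vv=1.0):
    """Toy branching ratio of H -> tt against H -> WW/ZZ."""
    gamma_tt = c_tt * m_h         # ~ m_H
    gamma_vv = c_vv * m_h ** 3    # ~ m_H^3
    return gamma_tt / (gamma_tt + gamma_vv)

# The ratio of the two partial widths shrinks like 1/m_H^2,
# so BR(H -> tt) decreases monotonically as m_H grows:
assert br_tt(800.0) < br_tt(500.0) < br_tt(400.0)
```

Whatever the normalisations, the tt fraction is driven down by the cubic growth of the gauge-boson widths, which is the point made above.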
In the mass region 180 GeV ≲ mH ≲ 700 GeV, the H → ZZ → 4l decay
provides a powerful, almost background-free, channel for the Higgs discovery.
Since leptons coming from real Z boson decays have a high pT, the ZZ and
Zγ backgrounds can be rejected with an efficient pT cut on the reconstructed
Z.
If mH ≳ 700 GeV the discovery in this channel is limited by the decrease of the
inclusive Higgs production cross section and by the increase of the Higgs width:
a larger signal width increases the irreducible continuum background, and the
large width of a heavy Higgs also makes it impossible to observe a mass peak.
Full coverage of the theoretically allowed mH range is therefore obtained
by looking at channels with larger rates, even though they suffer from large QCD
backgrounds, such as:
• H → ZZ → l+l−νν;
• H →W+W− → lνjj;
• H → ZZ → lljj.
The first channel has a branching ratio six times larger than that of the four-lepton
channel27. It has a large missing transverse energy signature, which can be
exploited if the detector provides a complete measurement of the energy flow,
without holes. The leptons in these channels provide additional handles for experimental
27Since the branching ratio of the Z into neutrinos is larger than that into charged leptons.
identification. The H → W+W− → lνjj channel has a fourfold larger rate than the
process with a double leptonic decay. The last channel dominates over
the purely leptonic signature because of the large branching ratio for the
hadronic Z decay.
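The factor of six quoted for the first channel can be checked from the Z branching ratios; the sketch below uses rounded PDG values, which are not quoted in the text:

```python
# Approximate Z branching ratios (rounded PDG values, assumed here):
br_z_nunu = 0.20    # Z -> nu nubar, summed over the 3 neutrino flavours
br_z_ll = 0.067     # Z -> l+ l-, summed over e and mu only

# In H -> ZZ -> l+ l- nu nubar either Z can be the one decaying
# invisibly, which gives the combinatorial factor of 2:
ratio = 2 * br_z_nunu / br_z_ll
print(f"BR(ll nunu) / BR(4l) ~ {ratio:.1f}")   # ~ 6
```

The result, close to 6, matches the rate advantage claimed for the l+l−νν final state over the four-lepton one.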
The main sources of background for these channels consist of the WW, WZ,
and ZZ continua, which are of the same order of magnitude as the signal, as well
as tt, Zjj, and W±jj events, which are three orders of magnitude larger. The
reconstruction of the intermediate vector bosons and the subsequent application
of a mass window allow for an effective rejection of the latter three background
processes. The intermediate bosons can have a very large transverse momentum
(pT > 350 GeV/c) because of the high mass of the Higgs boson, and the same is
then true for the jets and the leptons that result from their decay. The jets will be
confined to a small region because of the Lorentz boost and, with a sufficiently
high-granularity measurement of the energy flow to disentangle the two jets,
provide an additional signature to reject background. Finally, if the Higgs boson
is produced through gauge boson fusion, there will be a color suppression
in the central region (no central jets other than those coming from the W/Z
decay) and one or two forward jets with a large transverse momentum (one in
the positive η hemisphere, the other in the negative one). These jets are the
result of the quarks left over from the fusion (see Figure 1.9). These kinematic
features make it possible in particular to reduce the tt background, which is
characterized by a large number of central jets.
1.3.3 Discovery Potential
Figure 1.13: The significance for the SM Higgs boson discovery in various channels at ATLAS as a function of mH, considering 30 fb−1 of collected data over the full mass region at √s = 14 TeV [20].
The observation of a signal with a significance of 5 standard deviations (5σ),
defined according to the estimator S/√B, where S (B) is the expected number
of signal (background) events, can be claimed as a discovery of the Higgs boson.
Figure 1.13 shows the sensitivity for the Higgs discovery in units of S/√B for
the individual channels as well as for the combination of the various channels,
assuming an integrated luminosity of 30 fb−1 at √s = 14 TeV. In this evaluation
no K-factors, i.e. higher-order QCD corrections to the Higgs production cross
sections and distributions, given by the ratio between the NLO and LO total
cross sections, K = σNLO/σLO, have been included28. Note that this plot does
not reflect the current situation: by the end of 2012 ∼ 20 fb−1 should be
collected by the LHC running at √s = 8 TeV, and the luminosity collected so far is
∼ 5.61 fb−1.
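The counting estimator and the role of the K-factors can be illustrated with a short sketch; the signal and background yields below are hypothetical numbers chosen for illustration only:

```python
import math

def significance(s, b):
    """Simple counting significance S / sqrt(B), as used in Figure 1.13."""
    return s / math.sqrt(b)

def significance_with_k(s, b, k_s, k_b):
    """Rescaling both yields by K-factors multiplies S/sqrt(B)
    by K_S / sqrt(K_B) (footnote 28)."""
    return significance(k_s * s, k_b * b)

# Hypothetical yields, for illustration only:
s, b = 20.0, 10.0
assert significance(s, b) > 5.0   # would qualify as a "5 sigma" discovery

# Neglecting K-factors is conservative whenever K_S > sqrt(K_B):
assert significance_with_k(s, b, 2.0, 1.3) > significance(s, b)
```

This makes footnote 28 explicit: including K-factors with K_S > √K_B can only increase the quoted significance.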
1.3.4 The Higgs Mass and Total Decay Width
After the detection of the Higgs boson, the LHC experiments may, with higher
integrated luminosity, also look into some of its properties, such as its mass and width.
An integrated luminosity of 300 fb−1 at √s = 14 TeV is assumed in the
following [16, 21, 20].
The Higgs mass can be measured with very good accuracy. In the range
mH ≲ 400 GeV, where the total width is not too large, a relative precision of
∆mH/mH ∼ 0.1% can be achieved in the channel H → ZZ(∗) → 4l. In the low
Higgs mass range, a slight improvement can be obtained by reconstructing the
sharp H → γγ peak. In the range mH ≳ 400 GeV the precision starts to deteriorate,
because the Higgs boson width becomes large and because the smaller
production rates increase the statistical error. However, a precision of the
order of 1% can still be achieved for mH ∼ 700 GeV if theoretical errors, such
as width effects, are not taken into account.
The total decay width of the Standard Model Higgs boson is shown in Figure
1.14. For low masses, below 130 GeV, the Higgs boson is a very narrow resonance
with ΓH ≲ 10 MeV, but once the real and virtual gauge boson decay channels
open, the width grows rapidly, reaching ∼ 1 GeV slightly above the ZZ
threshold. For larger Higgs masses, mH ≳ 500 GeV, the Higgs boson becomes
“obese”: its decay width is comparable to its mass because of the longitudinal
gauge boson contributions in the decays H → WW, ZZ. For mH ∼ 1 TeV one
has a total decay width of ΓH ∼ 700 GeV, resulting in a very broad resonant
structure. The resonance is no longer visible in the invariant mass plots, since
it is too spread out, which makes a discovery through a mass peak impossible.
So the Higgs boson signal cannot
28This is a conservative assumption, provided the K-factor for the signal process of interest is larger than the square root of the K-factor for the corresponding background process.
be detected in an invariant mass fit, but only in an event-counting experiment. In
this case the lifetime of the Higgs boson is so short that it would be improper
to call it a “particle”: it is better to regard the Higgs boson as an intermediate
state that enhances the cross sections or opens new final-state channels
in the scattering of two particles. However, as previously discussed, such masses
(mH ≳ 500 GeV) are highly disfavoured by electroweak precision data.
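The disappearance of the peak can be illustrated with a relativistic Breit-Wigner line shape; the "contrast" criterion below is a toy construction, not a quantity defined in the text:

```python
def rel_breit_wigner(m, m_h, gamma):
    """Relativistic Breit-Wigner line shape (unnormalised) as a
    function of the invariant mass m."""
    return 1.0 / ((m ** 2 - m_h ** 2) ** 2 + (m_h * gamma) ** 2)

def peak_contrast(m_h, gamma):
    """Peak height relative to the value at m = 1.5 * m_H, a crude
    measure of how much the resonance stands out."""
    return rel_breit_wigner(m_h, m_h, gamma) / rel_breit_wigner(1.5 * m_h, m_h, gamma)

# Narrow Higgs (mH = 200 GeV, Gamma ~ 2 GeV): enormous contrast.
assert peak_contrast(200.0, 2.0) > 1e4
# "Obese" Higgs (mH ~ 1 TeV, Gamma_H ~ 700 GeV): contrast of order
# unity, i.e. no visible peak above the continuum.
assert peak_contrast(1000.0, 700.0) < 10
```

With ΓH comparable to mH the line shape is essentially flat over the accessible mass range, which is why only an event-counting analysis remains possible.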
The Higgs boson width can be experimentally obtained from a measurement
of the width of the reconstructed Higgs peak, after unfolding the contribution
of the detector resolution. Using H → ZZ → 4l, this direct measurement is only
possible for Higgs masses larger than 200 GeV, above which the intrinsic
width of the resonance becomes comparable to or larger than the experimental
mass resolution, which is typically of the order of a few GeV. While the precision
is rather poor near this mass value, approximately 60%, it improves to
the level of ∼ 5% around mH ∼ 400 GeV29 and stays almost constant up to
mH ∼ 700 GeV. Below mH ≃ 200 GeV the width is too small to be resolved
experimentally and can only be determined indirectly.
Figure 1.14: SM Higgs total decay width in GeV as a function of mH [19].
29For the higher masses the intrinsic width becomes larger and its contribution to the total resolution dominates compared to the detector resolution.
Chapter 2
LHC and the ATLAS Detector
The Large Hadron Collider (LHC) [22] is currently the world’s largest hadron
collider, with protons accelerated in a 27 km circumference synchrotron and
colliding at energies never reached before. The LHC is designed to collide pp
pairs at a center of mass energy (√s) of 14 TeV at a peak luminosity of
10^34 cm^−2 s^−1, and is currently colliding at a center of mass energy of 7 TeV. The
ATLAS (acronym for A Toroidal LHC ApparatuS) detector is one of the two
general purpose detectors positioned on the synchrotron ring, aimed at detecting
new rare physics, in particular the Higgs boson.
This chapter presents an overview of the LHC in Section 2.1 and of the ATLAS
detector in Section 2.2, with a focus on the ATLAS subdetectors used to measure
the energy and momentum of leptons, jets and missing ET (the missing transverse
energy, the neutrino signature). A brief description of the ATLAS trigger system
and of the lepton reconstruction is also presented in Sections 2.2.8, 2.2.9 and 2.2.10
respectively.
2.1 The Large Hadron Collider
The LHC is installed in a circular tunnel, previously occupied by LEP (the Large
Electron Positron collider), at the European Organization for Nuclear Research
(CERN1) near the French-Swiss border, north-west of Geneva. The tunnel
is about 27 km long and lies about 100 m underground.
The LHC is a high energy, high luminosity collider, aimed at discovering new
physics. The principal “modus operandi” is delivering as high a luminosity as
possible, expanding the statistical reach to very rare events, while delivering as
high an energy as possible, enabling the production of rare physics at higher
1 Conseil Européen pour la Recherche Nucléaire.
2. LHC and the ATLAS Detector 2.1 The Large Hadron Collider
probability. It is a proton-proton collider: protons are chosen as collision objects
to enable both factors. The use of protons rather than electrons allows a
higher collision energy by avoiding large energy losses through synchrotron radiation.
The choice of proton-proton instead of proton-antiproton collisions is due to
the limited production capability of antiprotons.
Figure 2.1: The cross section of the physics processes as a function of the center of mass energy at a hadron collider. The dashed lines correspond to the Tevatron (1.96 TeV) and LHC (7 and 14 TeV) collision energies [23].
For accelerators like the LHC, the most important parameters are the beam
energy and the number of interesting collisions, since the production rate of a
particular process varies with these two quantities (see Figure 2.1). The number
of collisions is quantified by the machine luminosity, defined as
L = Nb² nb frev γr F / (4π εn β∗),    (2.1)
where Nb is the number of particles per bunch, nb is the number of bunches
per beam, frev is the revolution frequency, γr is the relativistic factor, εn is the
normalized transverse beam emittance, β∗ is the beta function at the collision
point, and F is the geometric luminosity reduction factor due to the crossing
angle at the interaction point.
The number of events per second generated in the LHC collisions is given by
Nevent = Lσevent, (2.2)
where σevent is the cross section for the process under study.
The LHC is designed to collide proton beams at a center of mass energy of
14 TeV and an instantaneous luminosity of L = 10^34 cm^−2 s^−1, which will extend
the frontiers of particle physics. The nominal numbers of bunches and of protons per
bunch are nb = 2808 and Nb = 1.15·10^11 respectively. The revolution frequency
of a bunch of protons is 11.245 kHz and beam crossings are 25 ns apart.
Considering only the inelastic cross section, which is about 60 mbarn at √s = 7
TeV, the inelastic event rate at nominal luminosity (NInEvent) is
NInEvent = 10^34 · 60 · 10^−3 · 10^−24 = 600 million/s.    (2.3)
The average crossing rate (R) is given by
R = nb frev = 2808 · 11245 = 31.6 MHz,    (2.4)
so the number of inelastic events per crossing at nominal luminosity (NCevent)
turns out to be
NCevent = NInEvent / R ≃ 19.    (2.5)
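Equations (2.1)-(2.5) can be checked numerically. The sketch below takes the nominal parameters from Table 2.1 and derives γr from the proton mass (0.938 GeV, a value not quoted in the text):

```python
import math

def luminosity(n_b, nb, f_rev, gamma_r, eps_n, beta_star, F=1.0):
    """Machine luminosity, Eq. (2.1); SI inputs give L in m^-2 s^-1."""
    return n_b ** 2 * nb * f_rev * gamma_r * F / (4 * math.pi * eps_n * beta_star)

# Nominal LHC parameters from Table 2.1 (gamma_r = E / m_p assumed):
L = luminosity(n_b=1.15e11, nb=2808, f_rev=11245,
               gamma_r=7000 / 0.938, eps_n=3.75e-6, beta_star=0.55)
L_cm = L / 1e4                      # m^-2 s^-1  ->  cm^-2 s^-1
# Without the geometric reduction factor F this lands close to 10^34:
assert 1.0e34 < L_cm < 1.3e34

# Event-rate bookkeeping of Eqs. (2.2)-(2.5):
sigma_inel = 60e-27                 # cm^2 (~60 mbarn)
n_inel = 1e34 * sigma_inel          # inelastic events per second
per_crossing = n_inel / (2808 * 11245)
assert round(per_crossing) == 19    # ~19 pile-up events per crossing
```

The small excess over 10^34 in the first check is absorbed by the crossing-angle factor F < 1 of Eq. (2.1).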
The LHC also has the capacity to collide heavy ions, in particular lead nuclei,
at 2.76 TeV per nucleon, at a design luminosity of 10^27 cm^−2 s^−1.
When the bunch spacing is 75 ns or less, the beams are brought into collision
at a crossing angle to reduce beam-beam interactions at points other than the
nominal collision point.
The particle energy is mainly limited by the magnetic field of the bending
dipoles that keep the beam in a circular orbit. The magnets can produce a
magnetic dipole field of 8.33 T, as required at a beam energy of 7 TeV.
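The quoted field and beam energy are consistent with the standard bending relation p[GeV] ≈ 0.3 B[T] ρ[m]; the effective bending radius used below (~2804 m) is an approximate figure assumed for illustration, not taken from the text:

```python
# p[GeV] ~ 0.3 * B[T] * rho[m] for a singly charged particle on a
# circular orbit. The effective dipole bending radius of the LHC is
# assumed here to be ~2804 m (the tunnel is larger, since only part
# of the ring is filled with dipoles).
B = 8.33            # T, nominal dipole field
rho = 2804.0        # m, assumed effective bending radius
p = 0.3 * B * rho   # GeV
print(f"maximum beam momentum ~ {p / 1000:.1f} TeV")
```

The result is close to 7 TeV, matching the design beam energy quoted above.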
The specific parameters for the LHC at the latest and nominal operation are
summarized in Table 2.1.
2.1.1 Architectural Overview
As a particle-particle collider, the LHC needs two rings with counter-rotating
beams, unlike particle-antiparticle colliders, which can have both beams
Parameter                   Late 2011                   Nominal
Beam energy                 3.5 TeV                     7 TeV
Instantaneous luminosity    3.6·10^33 cm^−2 s^−1        10^34 cm^−2 s^−1
Bunch spacing               50 ns                       25 ns
Particles per bunch         1.1·10^11                   1.15·10^11
Bunches per beam            1380                        2808
Crossing angle              120 µrad                    285 µrad
β∗                          1 m                         0.55 m
εn                          1.9-2.3 µm                  3.75 µm

Table 2.1: The LHC parameters for the 2011 operation and their design values [22, 24]. The quoted value for the instantaneous luminosity is the peak value.
sharing the same phase space in a single ring. The limited space in the tunnel
led to the “two-in-one” superconducting magnet design, which accommodates
the windings for the two beam channels in a common cold mass and cryostat,
with the magnetic flux circulating in opposite senses through the two channels.
The LHC synchrotron uses a total of about 1600 bending and focusing
magnets: 1232 identical dipole magnets, which keep the particles in their nearly
circular orbits, and 392 identical quadrupole magnets, which focus the beams. The
dipoles are placed in the curved sections of the LHC ring and the quadrupoles
in the straight sections. All of these magnets use superconducting
niobium-titanium (Nb-Ti) cables and operate at a low temperature of 1.9 K.
As shown in Figure 2.2(a), the LHC is not a perfect circle: it is made of eight
arcs and eight straight sections.
Figure 2.2: Figure (a): layout of the LHC ring with the four interaction points [25]. Figure (b): the LHC injection scheme [26].
An insertion consists of a long straight section
plus two (one at each end) transition regions, the so-called “dispersion suppressors”.
The exact layout of the straight section depends on the specific use of
the insertion: physics (beam collisions within an experiment), injection, beam
dumping, beam cleaning. A sector is defined as the part of the machine between
two insertion points. An octant starts from the middle of an arc and ends in
the middle of the following arc and thus spans a full insertion.
Four insertions are used as experimental insertions, where six experiments are
installed: ALICE (A Large Ion Collider Experiment), ATLAS, CMS (Compact
Muon Solenoid), LHCb (Large Hadron Collider beauty), LHCf (Large Hadron
Collider forward) and TOTEM (TOTal Elastic and diffractive cross section
Measurement). ALICE, ATLAS, CMS and LHCb are installed in four huge
underground caverns built around the four collision points of the LHC beams,
in particular, ATLAS and CMS are located at the two highest-luminosity
interaction points, diametrically opposite each other. TOTEM is installed close to the
CMS interaction point and LHCf is installed near ATLAS.
• ATLAS and CMS are two general purpose detectors designed to cover the
widest possible range of physics at the LHC, from the search for Higgs
boson to supersymmetry (SUSY) and extra dimensions.
• ALICE is a detector specialized in analysing lead-ion collisions. It will
study the properties of quark-gluon plasma2.
• LHCb is designed mainly for the study of B-physics. Specialized study
of the slight asymmetry between matter and antimatter present in the
decays of B hadrons may lead to the discovery of new physics.
• LHCf is a small experiment that is constructed to measure neutral parti-
cles produced very close to the direction of the beams at the LHC. The
motivation is to test models used to estimate the primary energy of the
ultra high-energy cosmic rays.
• TOTEM is also a small experiment dedicated to the measurement of the
total proton-proton cross section with a luminosity-independent method.
2.1.1.1 Acceleration Chain
Before entering the LHC main ring, protons need to go through a pre-
acceleration chain, called the LHC injector chain, which increases their energy
to 450 GeV. The pre-acceleration chain involves a linear accelerator (LINAC)
and three smaller synchrotron rings. Figure 2.2(b) shows the injection scheme
of LHC.
2A state of matter where quarks and gluons, under conditions of very high temperaturesand densities, are no longer confined inside hadrons. Such a state of matter probably existedjust after the Big Bang, before particles such as protons and neutrons were formed.
2. LHC and the ATLAS Detector 2.2 ATLAS Detector
Protons are produced by stripping the electrons from hydrogen atoms taken
from a simple bottle of hydrogen gas, and are pre-accelerated in a radio-frequency
(RF) cavity to 750 keV. After this, they are injected into LINAC2, which
increases their energy to 50 MeV. Afterwards, they are injected step by step into
the Proton Synchrotron Booster (PSB), the Proton Synchrotron (PS), and the
Super Proton Synchrotron (SPS). The protons stay in each of the three rings
until they reach the targeted energy: they leave the PSB at 1.4 GeV, leave the
PS at 25 GeV, and in the SPS reach 450 GeV, after which they are transferred
to the LHC synchrotron, where they are accelerated for about 20 minutes to
their nominal energy of 7 TeV. In Figure 2.2(b) the corresponding proton
velocity at the end of each acceleration step is indicated.
Once the desired energy is reached, the beams are kept at that energy for several
hours (from 10 to 20 hours) and squeezed (compressed to be as thin as possible)
so that the collisions at the experiments reach the best possible collision
rate (luminosity). The beams are then brought into collision and the physics of
the collisions is recorded by the experiments listed above. Once the beams are
depleted, the remains are dumped and the current in the magnets is ramped
down; then the whole cycle restarts from the beginning.
On September 19th, 2008, the LHC suffered a quench incident during
commissioning of the final LHC sector (sector 3-4) for operation at a beam energy of
5 TeV, resulting in a large helium leak into the tunnel and serious mechanical
damage to 24 dipole magnets and 5 quadrupole magnets. After a year-long
shutdown, during which replacement magnets were installed, the damage was
repaired and the quench monitoring system was improved, the LHC resumed
operation in late 2009. In December 2009, the first proton-proton collisions at a
center of mass energy of 900 GeV were delivered by the LHC. At the end of March
2010, the first collisions at √s = 7 TeV were recorded. The LHC ran at 7 TeV
until the end of 2011; it has been decided that it will run at 8 TeV until the end
of 2012, after which there will be a 1.5-year shutdown to make improvements.
In 2014 the LHC is expected to come online at 14 TeV.
2.2 ATLAS Detector
ATLAS (A Toroidal LHC Apparatus) [27, 28] is one of the four experiments
approved to run at the LHC, and it has been designed to be a general-purpose
detector, meaning it should be versatile enough to detect physics signals with a
wide range of signatures. The ATLAS detector is searching for new discoveries
in the head-on collisions of protons of extraordinarily high energy. ATLAS will
learn about the basic forces that have shaped our Universe since the beginning
of time and that will determine its fate. Among the possible unknowns are the
origin of mass, extra dimensions of space, unification of fundamental forces, and
evidence for dark matter candidates in the Universe.
2.2.1 Geometry and Definitions
In the following discussion, a right-handed Cartesian coordinate system is
defined by placing the origin at the interaction point (I.P.) in the middle of the
detector. The z-axis is defined along the beam direction and the x−y plane
is transverse to the beam direction. The positive x-axis is defined as pointing
from the interaction point to the center of the LHC ring and the positive y-axis
is defined as pointing vertically upwards. Side A of the detector is defined as
that with positive z and side C as that with negative z.
Customarily a cylindrical coordinate system is used as well; it is defined by R,
φ and θ.
• R is the radial distance from the beam axis (R = √(x² + y²));
• φ is the azimuthal angle, measured around the beam axis in the plane
perpendicular to z (−π < φ < π);
• θ is the polar angle, formed by the direction of the emitted particle with
the z-axis and measured from the beam axis (0 < θ < π).
At collider experiments it is quite common to introduce the pseudorapidity
variable η in place of θ. The pseudorapidity is related to θ by
η = − ln tan(θ/2),    (2.6)
and has the great advantage of transforming additively under Lorentz boosts
along the beam axis, so that the difference ∆η = η2 − η1 is a relativistic invariant.
In the case of massive objects such as jets, the rapidity
y = (1/2) ln[(E + pz)/(E − pz)]    (2.7)
is used instead, as the pseudorapidity approximates the rapidity only when the
mass can be neglected with respect to the energy.
The transverse momentum pT, the transverse energy ET, and the missing
transverse energy ETmiss are defined in the x−y plane unless stated otherwise.
Additional parameters needed to fully describe the track of a particle are the
transverse impact parameter d0, the distance of the track’s point of closest
approach to the beam axis in the transverse plane, and z0, the longitudinal
coordinate of this same point (see Section 3.5.2). The spatial separation of two
particle tracks is expressed in terms of ∆R, defined in pseudorapidity-azimuthal
angle space as
∆R = √(∆η² + ∆φ²).
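The definitions above translate directly into small helper functions; the φ-wrapping convention in `delta_r` is an implementation choice, not something specified in the text:

```python
import math

def pseudorapidity(theta):
    """Eq. (2.6): eta = -ln tan(theta / 2)."""
    return -math.log(math.tan(theta / 2))

def rapidity(e, pz):
    """Eq. (2.7): y = (1/2) ln[(E + pz) / (E - pz)]."""
    return 0.5 * math.log((e + pz) / (e - pz))

def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation, with the phi difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

# For a massless particle E = |p|, so rapidity and pseudorapidity agree:
theta, p = 0.4, 50.0
eta = pseudorapidity(theta)
y = rapidity(p, p * math.cos(theta))
assert abs(eta - y) < 1e-9

# Tracks at phi = +3 and phi = -3 are actually close in angle:
assert delta_r(0.0, 3.0, 0.0, -3.0) < 1.0
```

The final check is exactly the statement that η approximates y only up to mass effects: for a massless particle the two coincide identically.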
2.2.2 Physics Requirements
The main goals of the ATLAS detector are to perform precision tests of QCD
and electroweak interactions and to search for the Higgs boson and new physics. In
particular, the search for the SM Higgs boson has been used as a benchmark
for the design of the ATLAS detector. Since the dominant decay channel of the
Higgs boson is unknown, due to its unknown mass, the detector has to be able to
cope with all possible decay scenarios. These benchmark requirements, combined
with the high luminosity, high beam energy and high background production at
the LHC, have been translated into the following set of design requirements:
• Because of the very high luminosity and large particle flux, the detectors
need fast, radiation-hard electronics and sensor elements. In addition,
high granularity is needed to handle the large number of particles and to
reduce the influence of overlapping events.
• Large acceptance in pseudorapidity with almost full azimuthal angle cov-
erage is required. It ensures no high momentum particle can escape de-
tection.
• Good charged-particle momentum resolution and reconstruction efficiency
in the inner tracker are essential. For offline tagging of τ -leptons and b-
jets, vertex detectors close to the interaction region are required to observe
secondary vertices.
• Excellent electromagnetic (EM) calorimetry is needed for electron and
photon identification and measurements as well as a full coverage hadronic
calorimetry for accurate jet and missing transverse energy measurements.
• Good muon identification and momentum resolution over a wide range of
momenta and the ability to determine unambiguously the charge of high
pT muons are fundamental requirements.
• Highly efficient triggering on low transverse momentum objects with suffi-
cient background rejection is a prerequisite to achieve an acceptable trigger
rate for most physics processes of interest.
2.2.3 ATLAS Detector Overview
Figure 2.3: Schematic ATLAS layout: the main subdetectors are depicted [27].
The ATLAS detector is housed in a hall about 100 m underground at “Point 1”
of the LHC ring, the interaction point closest to the Meyrin site. ATLAS covers
almost the whole solid angle with its onion-like structure. It is cylindrical,
weighs approximately 7000 tons, is 44 m long and 25 m high, and is nominally
forward-backward symmetric with respect to the interaction point. The ATLAS
detector is divided into three longitudinal regions, one central and two lateral;
sub-detectors in the central part are named with the barrel prefix, the others
with the extended-barrel or end-cap prefixes.
In Figure 2.3 ATLAS and its sub-detectors are depicted: in the central part,
near the beam line, sits the innermost sub-system, the tracker, embedded in a
2 T solenoidal magnetic field; the solenoid is the structure around the tracker
and holds the electromagnetic calorimeter, which in turn is surrounded by the
hadronic calorimeter. Around everything there are 8 giant coils providing the
toroidal magnetic field, whose purpose is to bend the escaping muons, measured
by the external muon chambers. The ATLAS detector is thus formed by six
sub-systems: the magnet system, the inner detector (ID), the calorimeters, the
muon spectrometer, the trigger system and the data acquisition system. Each
of these systems is formed by different parts, distinguished by the technologies
used according to their function.
Table 2.2 summarizes the general goals for the ATLAS sub-detectors and
their coverage. For the calorimeter, the requirement is on the energy resolution
σE/E; for the inner detector and the muon spectrometer, the requirement is on
Figure 2.4: Sagitta S in a three-point measurement. l is the distance between the outer measurements A and B. The sagitta S of the circular trajectory with curvature radius ρ through the points A, D, B is defined as the shortest distance CD from this trajectory to the middle point C on the line connecting A and B.
the momentum resolution σpT/pT, which can be expressed as
σpT/pT ∼ σS/S ∼ pT σS / (l² B),    (2.8)
where B is the magnetic field strength, l is the length of the arc of the track
(determined by the size of the tracking detector) and S is the sagitta3, defined
as the maximum deviation of a circle from a straight line; see Figure 2.4. It is
determined from the measurement of the muon track position in three successive
chamber stations. The precision track position measurement is performed
in the bending plane of the toroidal magnetic field. This means measuring with
high precision the z coordinate of the track points in the barrel and the R
coordinate in the x−y plane in the end-caps. Note that the sagitta is larger, and
can be measured with higher relative accuracy, when the distance l between the
outer measurements A and B in Figure 2.4 is larger.
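A quick sketch of Eq. (2.9) shows why a large lever arm matters; the field and arm-length values below are illustrative assumptions, and the 0.3 is the usual GeV/T/m unit-conversion factor implicit in the formula:

```python
# Sagitta of a track measured at three stations (Figure 2.4), using
# p_T[GeV] ~ 0.3 * B[T] * rho[m] for the curvature radius.
def sagitta(p_t, B, l):
    """S = l^2 / (8 * rho) for a track of transverse momentum p_t [GeV],
    field B [T] and lever arm l [m]; returns S in metres."""
    rho = p_t / (0.3 * B)        # curvature radius in metres
    return l ** 2 / (8 * rho)

# A 1 TeV muon over a ~5 m lever arm in a ~0.5 T average toroid field
# (illustrative numbers, not detector specifications) bends by ~0.5 mm:
s = sagitta(1000.0, 0.5, 5.0)
print(f"sagitta ~ {s * 1e3:.2f} mm")
```

A sub-millimetre sagitta measured to ~10% requires ~50 µm chamber precision, which motivates the emphasis on precision tracking in the bending plane.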
In the following sections, descriptions will be given for the three major subde-
tector systems (inner detector, calorimeters and muon spectrometer) and for
the magnet system, followed by an introduction to the ATLAS trigger system
and lepton reconstruction.
The Inner Detector (ID), shown in Figure 2.5, is the heart of the ATLAS
detector, with a length of 6.2 m and a diameter of 2.1 m. It is confined internally
by the beam pipe, in which the protons travel, and externally by the
superconducting central solenoid, which provides a nominal magnetic field of 2 T along
the z-axis (see Section 2.2.6). Due to its close vicinity to the interaction point
(I.P.), a high-granularity detector is required. It is designed to provide hermetic
and robust pattern recognition, excellent momentum resolution4 and both
primary and secondary vertex measurements for charged tracks above a given pT
threshold of ∼ 0.1 GeV and within the pseudorapidity range |η| < 2.5. It also
provides electron identification over |η| < 2.0 and a wide range of energies
(between 0.5 GeV and 150 GeV).
3The sagitta is linked to the transverse momentum pT of the muon: pT = l² B / (8S). (2.9)

Subdetector                         Required resolution            η coverage        η coverage
                                                                   (measurement)     (trigger)
Tracking                            σpT/pT = 0.05% pT ⊕ 1%         ±2.5              —
EM calorimetry                      σE/E = 10%/√E ⊕ 0.7%           ±3.2              ±2.5
Hadronic calorimetry (jets):
  barrel and end-cap                σE/E = 50%/√E ⊕ 3%             ±3.2              ±3.2
  forward                           σE/E = 100%/√E ⊕ 10%           3.1 < |η| < 4.9   3.1 < |η| < 4.9
Muon spectrometer                   σpT/pT = 10% at pT = 1 TeV     ±2.7              ±2.4

Table 2.2: Required resolution and coverage of the main ATLAS sub-systems. Note that for high-pT muons the muon spectrometer performance is independent of the inner detector system. The units for E and pT are GeV.
Figure 2.5: Cut-away view of the ATLAS inner detector [27].
The ID consists of three independent but complementary sub-detectors, with
the highest precision in the innermost layers, nearest the interaction point:
• Closest to the interaction point a semiconductor pixel detector providing
3-dimensional space points and secondary vertex reconstruction;
• In the middle, a silicon strip detector (SCT, “Semiconductor Tracker”),
which provides 3-dimensional space points;
• Surrounding the other two, a straw tracker (TRT, “Transition Radiation
Tracker”), providing measurements in the bending plane and particle iden-
tification.
Each of these subdetector systems is separated into cylindrical barrel sections
with active detector elements perpendicular to the radial direction, and end-
cap sections with detector elements perpendicular to the beam; this design
optimizes the resolution perpendicular to the particle path and minimizes the
total material that particles pass through. A particle from I.P. traversing the
complete inner detector will cross on average at least 3 pixel layers, 4 SCT strip
layers and about 36 TRT tubes, see Figure 2.6(a). The main parameters of the
ID subdetectors are summarized in Table 2.3.
The inner detector will give a typical momentum resolution of
σpT/pT = 0.05% · pT ⊕ 1%.    (2.10)
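The ⊕ in Eq. (2.10) denotes a sum in quadrature of the two terms; a minimal sketch:

```python
import math

def tracking_resolution(p_t):
    """Eq. (2.10): sigma_pT / pT = 0.05% * pT (+) 1%, where (+) is a
    sum in quadrature; p_t in GeV, returns the relative resolution."""
    return math.hypot(0.0005 * p_t, 0.01)

# At low pT the constant 1% term dominates; at high pT the term linear
# in pT (from the finite sagitta precision) takes over:
assert abs(tracking_resolution(10.0) - 0.0112) < 1e-3   # ~1.1% at 10 GeV
assert tracking_resolution(500.0) > 0.25                # ~25% at 500 GeV
```

The crossover between the two regimes sits around pT ∼ 20 GeV, where the two contributions are equal.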
The high-radiation environment imposes stringent conditions on the inner de-
tector sensors, on-detector electronics, mechanical structure and services. Over
the ten-year design lifetime of the experiment, the pixel inner vertexing layer
(B-layer) must be replaced after approximately three years of operation at de-
sign luminosity. In order to minimize dark current noise as radiation builds up
in the detection material, the silicon sensors must be kept at low temperature
approximately from −5 to −10 ℃. In contrast, the TRT is designed to operate
at room temperature.
4 The magnetic field bends the charged particles, allowing the momentum to be measured from the curvature of the tracks.
Figure 2.6: Figure (a): drawing showing the sensors and structural elements traversed by a charged track of pT = 10 GeV in the barrel inner detector (η = 0.3). The track traverses successively the beryllium beam-pipe, the 3 cylindrical silicon-pixel layers, the 4 cylindrical double layers (one axial and one with a stereo angle of 40 mrad) of barrel silicon-microstrip sensors (SCT), and approximately 36 axial straws contained in the barrel transition-radiation tracker modules. Figure (b): plan view of a quarter-section of the ATLAS inner detector showing each of the major detector elements with its active dimensions and envelopes. The labels PP1, PPB1 and PPF1 indicate the patch-panels for the ID services. Figures are taken from [27].
System   Position                          Resolution σ [µm]    Channels [10^6]   η coverage
Pixels   1 removable barrel layer          Rφ = 10, z = 115     13.2              ±2.5
Pixels   2 barrel layers                   Rφ = 10, z = 115     54                ±1.7
Pixels   2×3 end-cap disks                 Rφ = 10, R = 115     26.4              1.7-2.5
SCT      4 barrel layers                   Rφ = 17, z = 580     3.2               ±1.4
SCT      2×9 end-cap disks                 Rφ = 17, R = 580     3.0               1.4-2.5
TRT      73 axial barrel straw planes      130 (per straw)      0.1               ±0.7
TRT      160 radial end-cap straw planes   130 (per straw)      0.32              0.7-2.5

Table 2.3: Main parameters of the inner detector [27].
2.2.4.1 The Pixel Detector
The pixel detector consists of three concentric cylindrical layers in the barrel
around the beam axis, at radii of 50.5 mm, 88.5 mm and 122.5 mm from the center
of ATLAS, so that all tracks in the region |η| < 1.9 pass through all three layers,
and of three disks perpendicular to the beam axis in each end-cap, at distances of
495 mm, 580 mm and 650 mm, positioned in such a way that the region of three-
layer coverage is extended to |η| < 2.5, as seen in Figure 2.6(b). Each layer
is made of overlapping, identical silicon sensors mounted on modules that are
segmented in R − φ and z into small rectangles, the pixels. The nominal pixel
size is 50×400 µm². Each charged particle hits a cluster of sensors, and the
amount of collected charge is used to determine the cluster’s center. The
resolution in the barrel is 10 µm (R−φ) and 115 µm (z), and in the disks it is
10 µm (R−φ) and 115 µm (R). The innermost barrel layer, called the B-layer,
is only 50.5 mm away from the beam line and enhances the ability to identify
secondary vertices for b-tagging.
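The charge-weighted centroid determination described above can be sketched as follows; the cluster data and the helper name are invented for illustration, and real pixel clustering also applies calibrations not shown here:

```python
def cluster_centroid(hits):
    """Charge-weighted centroid of a pixel cluster.
    hits: list of (column, row, collected_charge) tuples."""
    total = sum(q for _, _, q in hits)
    col = sum(c * q for c, _, q in hits) / total
    row = sum(r * q for _, r, q in hits) / total
    return col, row

# Hypothetical two-pixel cluster sharing its charge 3:1 between columns 10 and 11.
col, row = cluster_centroid([(10, 5, 3000), (11, 5, 1000)])
print(col, row)  # 10.25 5.0
```

Interpolating between pixels in this way is what allows a position resolution well below the 50 µm pixel pitch.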
2.2.4.2 The SCT detector
Like the pixel detector, the SCT uses silicon sensors, here segmented
into strips. The silicon microstrip tracker consists of four concentric
barrel layers, and each of its end-caps has a total of nine disk layers; the
geometry can be seen in Figure 2.6(b). Each layer carries a double layer of
strips glued back-to-back, with the back-side strips rotated by 40 mrad relative
to the front-side strips, so that when a charged particle passes through, the
information on which strips on each side collected charge can be combined. The
average width of the strips, the strip pitch, is 80 µm, which results in an
intrinsic point resolution of about 23 µm per single-side measurement in the
coordinate perpendicular to the strip direction. The intrinsic accuracies per
module in the barrel are 17 µm (R − φ) and 580 µm (z), and in the disks they are
17 µm (R − φ) and 580 µm (R).
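How the 40 mrad stereo rotation recovers the coordinate along the strips can be illustrated with a toy geometry (a small-angle sketch with invented numbers, not the actual SCT reconstruction):

```python
import math

STEREO_ANGLE = 0.040  # rad: back-side strips rotated by 40 mrad

def along_strip(x_axial: float, u_stereo: float) -> float:
    """Recover the coordinate z along the strip from the axial measurement x
    and the stereo-side measurement u = x*cos(a) + z*sin(a)."""
    return (u_stereo - x_axial * math.cos(STEREO_ANGLE)) / math.sin(STEREO_ANGLE)

# Toy hit at x = 12.0 mm across the strips and z = 80.0 mm along them.
x, z = 12.0, 80.0
u = x * math.cos(STEREO_ANGLE) + z * math.sin(STEREO_ANGLE)
print(round(along_strip(x, u), 6))  # 80.0
```

Because the stereo angle is small, the error on the transverse measurements enters z amplified by roughly 1/sin(40 mrad) ≈ 25, which is why the quoted 580 µm z resolution is so much coarser than the 17 µm transverse one.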
2.2.4.3 The TRT detector
The outermost component of the inner detector, the TRT, is a straw tracker
combined with transition-radiation detection for electron identification. The
basic detecting unit of the TRT is a polyimide drift tube with a 4 mm diameter.
The straws are installed in three cylindrical layers of barrel TRT modules and
20×2 disks of end-cap modules. Each module contains multiple layers of straws5.
The central anode wire, 31 µm in diameter, is made of tungsten plated with gold.
The cathode, on the inside of the tube itself, is made of aluminum protected by
a layer of graphite-polyimide. The standard gas mixture is 70% Xe, 27% CO2,
3% O2; the gas is ionized by passing particles, with an electron collection time
of 48 ns. The drift time of the ionization electrons to the wire is measured to
compute the distance from the wire at which the particle crossed the tube.
Polypropylene/polyethylene foils are installed in the space between the layers
of straws as radiators, which cause high-energy electrons (momentum above 2
GeV) to produce significant numbers of transition-radiation photons6. The Xe
absorbs these low-energy photons, producing a significantly amplified signal; the
front-end electronics have a high-threshold discriminator to detect these signals,
allowing electrons to be identified as tracks with a significant number of
high-threshold hits. There are in fact two discriminators: one with a low
threshold to detect minimum-ionizing particles and the other with a high
threshold to detect transition radiation.
With a small average distance between the straws, the TRT provides a large
number of tracking points (typically 36) per track. The relatively low resolution
per tracking point (130 µm) is compensated by the large number of measurements
and by the larger size of the detector (Equation (2.8)).
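The two-threshold electron identification can be sketched as a cut on the fraction of high-threshold (HT) hits along a track; the 10% cut value and the hit patterns below are purely illustrative, not the ATLAS selection:

```python
def ht_fraction(hits):
    """Fraction of TRT hits passing the high-threshold discriminator.
    hits: list of booleans, True = high-threshold (transition-radiation) hit."""
    return sum(hits) / len(hits)

def is_electron_like(hits, cut=0.10):  # the cut value is a hypothetical example
    return ht_fraction(hits) > cut

# A track typically crosses ~36 straws; electrons yield many more HT hits.
electron_track = [True] * 8 + [False] * 28  # 8/36, about 22% HT hits
pion_track = [True] * 1 + [False] * 35      # 1/36, about 3% HT hits
print(is_electron_like(electron_track), is_electron_like(pion_track))  # True False
```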
2.2.5 The Calorimeters
The ATLAS calorimeters [29, 30] are sub-detectors used to measure the
energy of particles. They comprise an electromagnetic calorimeter and a
hadronic one, since different materials are needed for the measurement of
electrons and photons on the one hand and of hadrons on the other. A cut-away
view of the ATLAS calorimeters is shown in Figure 2.7. Each part consists of detectors
with full φ-symmetry and coverage around the beam axis. The electromagnetic
calorimeter is placed beyond the inner detector, and the hadronic calorimeter is
outside the electromagnetic one. The calorimeters closest to the beam line are
housed in three cryostats, one barrel and two end-caps. The barrel cryostat con-
5The TRT contains up to 73 layers of straws interleaved with fibres (barrel) and 160 straw planes interleaved with foils (end-cap).
6Transition radiation consists of X-ray photons emitted by charged particles traversing the boundary between materials with different dielectric constants.
Figure 2.7: Overview of the ATLAS calorimeter.
tains the electromagnetic barrel calorimeter, whereas the two end-cap cryostats
each contain an electromagnetic end-cap calorimeter (EMEC), at lower pseu-
dorapidity, a hadronic end-cap calorimeter (HEC), located behind the EMEC,
and a forward calorimeter (FCal) to cover the region closest to the beam. The
hadronic barrel calorimeter, normally called tile calorimeter (TileCal), is placed
outside the electromagnetic barrel calorimeter.
The calorimeters consist of alternating layers of an absorbing material, in which
the particles produce showers, lose their energy and are finally stopped, and an
active material, in which the particle showers are measured. By this “sampling”
procedure the energy of the traversing particles can be determined.
The ATLAS calorimeter system covers the range |η| < 4.9; its segmentation is
such that several shower samplings are provided both in longitudinal and in
transverse direction. The calorimeters are designed in order to identify charged
and neutral particles and jets, and to measure their energy. From all these
measurements, the missing energy in the transverse plane (E_T^miss) can be
calculated by summing all the measured energy deposits. Missing energy can be caused
by neutrinos or possibly new physics, such as supersymmetry or models with
extra dimensions. Therefore calorimeters must provide good containment for
electromagnetic and hadronic showers, and must also limit punch-through7 into
the muon system. Hence, calorimeter depth is an important design consider-
ation. In the following sections the electromagnetic and hadronic calorimeters
are described in detail.
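The E_T^miss calculation described above, i.e. the imbalance of the vector sum of the transverse-energy deposits, can be sketched as follows (the deposit list is invented for illustration):

```python
import math

def missing_et(deposits):
    """deposits: list of (ET, phi) pairs for all calorimeter deposits.
    Returns the magnitude of the missing transverse energy."""
    ex = -sum(et * math.cos(phi) for et, phi in deposits)
    ey = -sum(et * math.sin(phi) for et, phi in deposits)
    return math.hypot(ex, ey)

# Two back-to-back 50 GeV deposits balance each other: no missing energy.
print(round(missing_et([(50.0, 0.0), (50.0, math.pi)]), 6))  # 0.0
# A single unbalanced 30 GeV deposit implies 30 GeV of missing ET.
print(round(missing_et([(30.0, 1.2)]), 6))  # 30.0
```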
7Punch-through occurs when muons produced by light hadrons, such as pions and kaons, travel through the calorimetric system, reach the spectrometer and give rise to a background signal in it.
2.2.5.1 The Electromagnetic Calorimeter
The electromagnetic calorimeter [29, 30] is divided into a barrel (|η| < 1.475)
and two end-caps (1.375 < |η| < 3.2). Each region is housed in a separate cryo-
stat, which thermally isolates the detector and keeps it at ∼ 88.5 K.
The barrel calorimeter consists of two identical half-barrels, separated by a small
gap (4 mm) at z = 0, one covers the region with z < 0 and the other covers the
region with z > 0. Each end-cap calorimeter is mechanically divided into two
coaxial wheels: an outer wheel covering the region 1.375 < |η| < 2.5, and an
inner wheel covering the region 2.5 < |η| < 3.2. In the range |η| < 1.8 the
electromagnetic calorimeter is preceded by a presampler, designed to take
into account the energy lost by electrons and photons in the material upstream
of the calorimeter.
Because of its radiation hardness and its stable response over time, the
sensing element is liquid argon (LAr), which fills the space between lead
absorbers placed parallel to the beam direction in an accordion geometry,
which provides full coverage in φ without azimuthal cracks. Each lead absorber
is sandwiched between two sheets of stainless steel to provide structural
support. Between the absorber layers there are electrodes formed by three
sheets of copper separated by thin insulating layers of polyimide. A charged
particle passing through this elementary cell ionizes the LAr; the ionization
charge is collected on the outer copper sheets and the signal is read out via
capacitive coupling on the inner one.
The barrel electromagnetic calorimeter and the precision region in the end-cap
calorimeters (1.5 < |η| < 2.5) are divided in depth into three longitudinal lay-
ers. The three layers vary in η granularity and radial depth. The innermost
layer has the finest granularity along η. It is designed for γ/π0 separation and
for precise η measurements of neutral particles. The second layer collects the
largest fraction of the energy of the electromagnetic shower. The third layer
collects only the tail of the electromagnetic shower; it is used to help measure
high-energy showers and to distinguish electromagnetic from hadronic showers,
and is therefore less segmented in η. The outermost region |η| < 1.5 of the
end-cap outer wheel and the inner wheel (2.5 < |η| < 3.2) are segmented in only
two longitudinal layers and have a coarser transverse granularity.
The total thickness of a module increases from 22 X08 to a maximum of 33
X0 in the barrel and from 24 X0 to 38 X0 in the end-caps as |η| increases. The
8The radiation length X0 is the distance over which the energy of an electron is reduced by a factor e.
electromagnetic calorimeter is designed to have an energy resolution of9
σ_E/E = 10%/√E ⊕ 0.7% (2.11)
over the energy range from 2 GeV to 5 TeV. The resolution is worse in
the end-caps than in the barrel region due to the presence of more material.
The “crack-region”, i.e. the transition region between the barrel and the end-
cap cryostats (1.37 < |η| < 1.52), is usually not used for photon identification
nor for precision measurements with electrons since the energy resolution is
significantly degraded, despite the presence of scintillators to correct for the
energy lost in the barrel cryostat flange.
2.2.5.2 The Hadronic Calorimeter
The ATLAS hadronic calorimeter is composed of three parts, which use different
detection techniques depending on the pseudorapidity region:
Tile Calorimeter (TileCal) [31] is placed in the region |η| < 1.7, directly
outside the EM calorimeter envelope, behind the LAr electromagnetic
calorimeter. It is subdivided into a central barrel (|η| < 1.0) and two
extended barrels (0.8 < |η| < 1.7). It is a sampling calorimeter using
steel as the absorber and scintillating tiles as the active material. It is
segmented in depth in three layers for a total radial depth of 7.4 λ10. The
total detector thickness at the outer edge of the tile-instrumented region
is 9.7 λ at η = 0. The tiles are oriented radially and normal to the beam
line and are staggered in depth. The two sides of each scintillating tile are read
out by wavelength-shifting fibres into two separate photomultiplier tubes.
Between the barrel and the extended barrels there is a gap of about 68
cm, which is needed for the inner detector and LAr cables, electronics and
services. This gap region is instrumented with special modules, made of
steel-scintillator sandwiches, and with thin scintillator counters, allowing
partial recovery of the energy lost in the crack regions. TileCal is designed
to have an energy resolution for jet reconstruction of [31]
σ_E/E = 50%/√E ⊕ 3%. (2.12)
Hadronic End-cap Calorimeter (HEC) [32] is a copper/liquid argon sam-
9The first term is the stochastic term and reflects the statistical fluctuations in the development of the shower (such as the number of particles and the fraction that is lost in the absorbers). The constant term represents local non-uniformities in the calorimeter response.
10The nuclear interaction length λ is the mean distance travelled by a hadronic particle before undergoing an inelastic nuclear interaction.
pling calorimeter with a flat-plate design, which covers the range 1.5 <
|η| < 3.2. The HEC, located directly behind the end-cap electromagnetic
calorimeter, shares the two liquid-argon end-cap cryostats with the
electromagnetic end-cap (EMEC) and forward (FCal) calorimeters. The
HEC consists of two independent wheels in each end-cap cryostat: a front
wheel (HEC1) and a rear wheel (HEC2), each wheel containing two longitudinal
sections, for a total of four layers per end-cap. The wheels are cylindrical,
with an outer radius of 2.03 m. The HEC1 and HEC2 modules are made of 25 and
17 copper plates, respectively. In this region the energy resolution required
for jet reconstruction is the same as in Equation (2.12). Approximately 12 λ
are required to fully contain the jets from the 14 TeV pp collisions at the LHC.
Forward Calorimeters. The forward calorimeters (FCal) are located in the
same cryostats as the end-cap calorimeters and provide coverage over
3.1 < |η| < 4.9. As the FCal modules are located at high η, at a dis-
tance of approximately 4.7 m from the interaction point, they are exposed
to high particle fluxes. This has resulted in a design with very small
liquid-argon gaps. The FCal is approximately 10 λ deep, and consists
of three modules in each end-cap made of a metal matrix with regularly
spaced longitudinal channels filled with an electrode structure consisting
of concentric rods and tubes parallel to the beam axis: the first (FCal1),
made of copper, is optimised for electromagnetic measurements, while the
other two (FCal2 and FCal3), made of tungsten, measure predominantly
the energy of hadronic interactions. The total thickness of the calorimetric
system reduces the shower leakage and the punch-through background in the
spectrometer, but the energy resolution is worse: the FCal is designed to
provide an energy resolution for hadrons of
σ_E/E = 100%/√E ⊕ 10%. (2.13)
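Equations (2.11)-(2.13) all have the form a/√E ⊕ b; a quick numerical comparison at a 100 GeV deposit (with ⊕ denoting addition in quadrature) shows how the resolution degrades from the electromagnetic calorimeter to the FCal:

```python
import math

def calo_resolution(e_gev: float, stochastic: float, constant: float) -> float:
    """Relative energy resolution sigma_E/E = stochastic/sqrt(E) (+) constant."""
    return math.hypot(stochastic / math.sqrt(e_gev), constant)

E = 100.0  # GeV
for name, a, b in [("EM, Eq. (2.11)", 0.10, 0.007),
                   ("TileCal, Eq. (2.12)", 0.50, 0.03),
                   ("FCal, Eq. (2.13)", 1.00, 0.10)]:
    print(f"{name}: {100 * calo_resolution(E, a, b):.1f}%")
```

This prints roughly 1.2%, 5.8% and 14.1%, respectively.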
2.2.6 The Magnet System
ATLAS features a unique hybrid system of four large superconducting mag-
nets. This magnetic system [33] is 22 m in diameter and 26 m in length, with a
stored energy of 1.6 GJ. The magnet system is composed of a central solenoid
(CS) and a system of three superconducting air toroids: a barrel toroid (BT)
and two end-cap toroids (ECT) arranged in a configuration so that there is
a zero magnetic field inside the calorimeter. The magnet system covers the
pseudorapidity range |η| < 3 and uses an air-core structure, i.e. without
iron, in order to minimize the multiple scattering of the muons, which
degrades the momentum measurement. All of the superconducting magnets op-
erate at 4.5 K.
The central solenoid is aligned with the beam axis and is placed between the
inner detector and the electromagnetic calorimeter. The conductor is a
composite consisting of a flat superconducting cable located in the center of an
aluminum stabiliser with rectangular cross-section. The central solenoid is de-
signed to provide a 2 T axial magnetic field with a peak of 2.6 T. The solenoid
is designed to be as thin as possible and shares with the LAr calorimeter one
common vacuum vessel in order to minimize material thickness in front of the
barrel electromagnetic calorimeter.
The barrel and end-cap toroids provide magnetic field for the barrel and
end-cap muon tracking chambers, respectively. Each toroid consists of eight
rectangular coils (kept in position by 16 support rings) assembled radially and
symmetrically around the beam axis. The coils are of a flat racetrack type
with two double-pancake windings made of 20.5 kA aluminum stabilized Nb-Ti
superconductor. The field strength varies from 0.15 T to 2.5 T, with a peak
of 3.9 T, in each barrel coil and varies from 0.2 to 3.5 T, with a peak of 4.1
T, in end-cap coils. In order to reduce the amount of material in the spec-
trometer, the barrel coils are housed in separate cryostats, while there are two
end-cap cryostats housing eight coils each, because the end-cap coils are smaller
in size. The end-cap coils are rotated by 22.5° with respect to the barrel ones
in order to provide radial overlap with the barrel toroid and to optimize the
bending power in the interface regions of both coil systems. Due to the finite
number of coils the magnetic field is not perfectly toroidal; the transition
region (1.4 < |η| < 1.6) is marked by large changes of the field integral, and
the muon momentum resolution suffers most in this region due to the uncertainty
in the bending power.
2.2.7 The Muon Spectrometer
2.2.7.1 Muon Spectrometer Design
Muons experience the weak and electromagnetic interactions, but not the strong
interaction. Therefore they rarely produce hadronic showers and, because of
their large mass compared to electrons, only infrequently produce electromagnetic
showers via bremsstrahlung. Thus, the main energy loss mechanism for muons is
ionization. As a result, muons can pass through the calorimeters with little
perturbation and reach the muon spectrometer (MS).
The muon spectrometer is the outermost subsystem of the ATLAS detector. The
muon momentum can be determined by measuring the position of the muon at
three points in space. The trajectory of the muon is curved due to the magnetic
field; the higher the momentum, the smaller the curvature. The curvature is
measured in the track fit, where the magnetic field is known in detail. However,
as a good approximation for practical applications, the sagitta is used (see Section
2.2.3). In order to be able to reconstruct the momentum with the three-point
method, the muon spectrometer is designed such that every muon with |η| < 2.7
will cross at least three detector stations with the exception of a few regions with
less coverage, for example those regions with support structures or passages for
services. When a particle traverses only two stations, the interaction point (I.P.) is taken as the
third measurement and the momentum determination is based on the difference
between the angles to the I.P.. As there is a relatively large uncertainty on the
scattering in the calorimeter, such a measurement is less precise.
The muon spectrometer is designed with the requirement of a 2-3% accuracy
Figure 2.8: Cut-away view of the ATLAS muon spectrometer [27].
on pT for muons below 100 GeV and a 10% precision for 1 TeV muons.
Given the magnet system, the sagitta will be about 0.5 mm for 1 TeV muons.
Therefore, to get a 10% error on the momentum, a 50 µm precision on the
sagitta is required. At low momentum (pT < 30 GeV), the resolution is dom-
inated by fluctuations in the energy loss of the muons traversing the material
in front of the spectrometer. Multiple scattering in the spectrometer plays an
important role in the intermediate momentum range (30 GeV < pT <200 GeV).
For pT > 300 GeV, the single-hit resolution, limited by detector characteristics,
alignment and calibration, dominates [18]. At the LHC, very high-energy (≳ 100
GeV) muons can be produced. At such high energies, the sagitta of the muon
track in the relatively small inner detector becomes too small to be accurately
measured, degrading the momentum resolution (Equation (2.8)). This makes
the muon spectrometer extremely important in detecting high-energy muons.
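The quoted 0.5 mm sagitta and the 10% goal at 1 TeV can be checked with the standard sagitta formula s = 0.3 B L² / (8 pT) (B in tesla, L in metres, pT in GeV, s in metres); the field and lever-arm values below are representative round numbers, not the actual ATLAS field map:

```python
def sagitta_m(b_tesla: float, lever_arm_m: float, pt_gev: float) -> float:
    """Track sagitta over a lever arm L in a uniform field B."""
    return 0.3 * b_tesla * lever_arm_m**2 / (8.0 * pt_gev)

# Representative toroid values (assumed for illustration): B ~ 0.5 T, L ~ 5 m.
s = sagitta_m(0.5, 5.0, 1000.0)
print(f"sagitta of a 1 TeV muon: {s * 1e3:.2f} mm")  # ~0.5 mm
print(f"relative error from a 50 um sagitta measurement: {50e-6 / s:.1%}")  # ~10%
```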
The ATLAS muon spectrometer has two main objectives: first, to provide a
standalone11, momentum-dependent trigger and, second, to provide standalone
muon reconstruction. These objectives are each fulfilled by a separate system
of detectors: high-precision tracking chambers for accurate momentum mea-
surement in the pseudorapidity range |η| < 2.7 and fast response chambers for
effective triggering in the region |η| < 2.4. Figure 2.8 shows the layout of this
spectrometer. Muon momenta down to a few GeV (∼3 GeV, due to energy loss
in the calorimeters) may be measured by the spectrometer alone. Even at the
high end of the accessible range (∼3 TeV), the stand-alone measurements still
provide adequate momentum resolution and excellent charge identification.
Precision-tracking chambers in the barrel region are located between and on the
eight coils of the superconducting barrel toroid magnet, while the end-cap
chambers are located in front of and behind the two end-cap toroid magnets. The φ-symmetry
of the toroids is reflected in the symmetric structure of the muon chamber sys-
tem, consisting of eight octants. Each octant is subdivided in the azimuthal
direction into two sectors with slightly different lateral extensions, a large and a
small sector, leading to a region of overlap in φ. This overlap of the chamber
boundaries minimizes gaps in detector coverage and also allows for the relative
alignment of adjacent sectors using tracks recorded by both a large and a small
chamber.
The chambers in the barrel are arranged in three concentric cylindrical shells
around the beam axis, while in the two end-cap regions, muon chambers form
four large wheels, perpendicular to the z-axis. Figure 2.9 shows cross-sections in
the planes transverse to, and containing, the beam axis. In the center of the
detector (|η| ≈ 0), a gap in chamber coverage has been left open to maintain
service access to the solenoid magnet, the calorimeters and the inner detector.
The size of the gap varies from sector to sector depending on the service
necessities; the biggest gaps, of 1-2 m, are located in the large sectors. Additional gaps
in the acceptance are in the feet region due to the supporting structure and in
the transition region, where barrel and end-cap parts overlap.
Because the expected rates vary with pseudorapidity, four different technolo-
gies are used to cover different η regions: Monitored Drift Tube Chambers
(MDT) and Cathode Strip Chambers (CSC) as tracking chambers, Resistive
Plate Chambers (RPC) and Thin Gap Chambers (TGC) as trigger chambers.
MDTs cover the region up to |η| = 2.7, except for the innermost end-cap layers
where their coverage is limited to |η| < 2.0. In the forward region (2 < |η| < 2.7),
CSC are used in the innermost tracking layer. The trigger system covers the
11Muons reconstructed using only the muon spectrometer tracks are called standalone muons; see Section 2.2.10.
Figure 2.9: Left: Cross-section of the barrel muon system perpendicular to the beam axis (non-bending plane), showing three concentric cylindrical layers of eight large and eight small chambers. The outer diameter is about 20 m. Right: Cross-section of the muon system in a plane containing the beam axis (bending plane). Infinite-momentum muons would propagate along straight trajectories and typically traverse three muon stations. Figures taken from [27].
pseudorapidity range |η| < 2.4 and it serves a threefold purpose: provide well-
defined pT thresholds; provide bunch crossing identification; measure the muon
coordinate in the direction orthogonal to that determined by the precision-
tracking chambers. RPCs are used in the barrel and TGCs in the end-cap
regions. In Table 2.4 the parameters of the four technologies in the muon
spectrometer are shown. The individual devices will be discussed in the
following sections.
By design, each tracking station provides a measurement error of approximately
35 µm. The alignment system, based on tracks and an optical system, contributes
an additional uncertainty of 30 µm. These individual errors are sufficiently
small to obtain the required overall precision of 50 µm.
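Since the two contributions are independent, they add in quadrature; a one-line check that the 35 µm chamber error and the 30 µm alignment error stay within the 50 µm sagitta budget:

```python
import math

# Chamber intrinsic error and alignment error, combined in quadrature (in µm).
total_error = math.hypot(35.0, 30.0)
print(f"{total_error:.1f} um")  # ~46 um, within the 50 um budget
```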
2.2.7.2 MDT
The barrel and most of the end-cap region are equipped with MDT cham-
bers for the precision measurement of muon trajectories. The chambers are
rectangular in the barrel and trapezoidal in the end-caps. An MDT is a drift
chamber formed by an aluminum tube with a diameter of ∼ 30 mm. The tube
wall functions as a cathode. The anode wire is a gold-plated tungsten-rhenium
wire with a 50 µm diameter and is positioned at the center of the tube.
As can be seen in Figure 2.10(b) a full MDT chamber consists of two groups
of tube layers, called “multi-layers”, separated by a spacer frame consisting of
Monitored Drift Tubes (MDT)
  Coverage: |η| < 2.7 (innermost layer: |η| < 2.0)
  Number of chambers: 1150
  Number of channels: 354384
  Chamber resolution (RMS): − (φ), 35 µm (z), − (time)
  Function: precision tracking
Cathode Strip Chambers (CSC)
  Coverage: 2.0 < |η| < 2.7
  Number of chambers: 32
  Number of channels: 31000
  Chamber resolution (RMS): 5 mm (φ), 40 µm (R), 7 ns (time)
  Function: precision tracking
Resistive Plate Chambers (RPC)
  Coverage: |η| < 1.05
  Number of chambers: 606
  Number of channels: 373000
  Chamber resolution (RMS): 10 mm (φ), 10 mm (z), 1.5 ns (time)
  Function: trigger, 2nd coordinate
Thin Gap Chambers (TGC)
  Coverage: 1.05 < |η| < 2.7 (2.4 for the trigger)
  Number of chambers: 3588
  Number of channels: 318112
  Chamber resolution (RMS): 3−7 mm (φ), 2−6 mm (R), 4 ns (time)
  Function: trigger, 2nd coordinate
Table 2.4: Parameters of the four sub-systems of the muon detector. The quoted spatial resolution does not include chamber-alignment uncertainties. Contributions from signal propagation and electronics need to be added to the intrinsic time resolution of each chamber type [27, 34].
(a) (b)
Figure 2.10: Figure (a): Cross-section of an MDT tube [27]. Figure (b): Mechanical structure of an MDT chamber. Three spacer bars connected by longitudinal beams form an aluminum space frame, carrying two multi-layers of three or four drift tube layers. Four optical alignment rays, two parallel and two diagonal, allow for monitoring of the internal geometry of the chamber. RO and HV designate the location of the readout electronics and high voltage supplies, respectively [35].
three lateral support beams (“cross-plates”, RO, MI and HV in Figure 2.10(b))
interconnected by two “longitudinal beams”. An MDT chamber has an internal
alignment system, which continuously measures potential deformations of the
frame. The alignment system consists of a set of four optical alignment rays,
two running parallel to the tube direction and two in the diagonal direction.
MDT chambers in the middle or outer stations of the ATLAS spectrometer are
equipped with three layers of tubes per multi-layer. The MDT chambers clos-
est to the interaction point have been equipped with four layers of tubes per
multi-layer to optimize the pattern recognition performance at high background
rates. The tubes are filled with a 93:7 Ar:CO2 gas mixture at 3 bar absolute
pressure.
The basic detection principle of an MDT is that of a drift chamber: the tubes
operate in proportional mode with a maximum drift time of ∼ 700 ns. The
drift tubes of the MDTs are aligned perpendicular to the beam axis and approx-
imately parallel to the magnetic field lines, providing z coordinate measurement
in the barrel and η(R) coordinate measurements in the end-caps. An MDT
measurement gives, instead of a precise position, the radius around the wire
at which the particle crossed, as shown in Figure 2.10(a). A single tube
measures the distance to the wire with a typical average resolution of 80 µm.
Therefore, the resolution on the central point of a track segment in a three-
(four-)tube multi-layer is 50 (40) µm; combining the two multi-layers into a
chamber yields an accuracy of 35 (30) µm. The position along the tube cannot be measured
and has to be provided by an external measurement. It can either be provided
using the information from the trigger chambers or by extrapolation of tracks
from the ID into the muon system.
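Treating the tube hits as independent measurements of the segment position, a naive 1/√N scaling roughly reproduces the numbers quoted above (the true resolution also depends on the track geometry, so this is only an order-of-magnitude sketch):

```python
import math

SINGLE_TUBE_UM = 80.0  # typical single-tube drift-radius resolution

def combined(sigma_um: float, n: int) -> float:
    """Resolution of n independent measurements of the same quantity."""
    return sigma_um / math.sqrt(n)

print(f"4-tube multi-layer: {combined(SINGLE_TUBE_UM, 4):.0f} um")            # 40 (quoted: 40)
print(f"3-tube multi-layer: {combined(SINGLE_TUBE_UM, 3):.0f} um")            # ~46 (quoted: 50)
print(f"chamber, 2 multi-layers of 4: {combined(SINGLE_TUBE_UM, 8):.0f} um")  # ~28 (quoted: 30)
```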
2.2.7.3 CSC
In the end-cap region, in the inner station and for |η| > 2, the MDT cham-
bers are replaced by CSCs. In this region, due to thermalised neutrons
coming from the calorimeter, the particle rates expected for high-luminosity
running exceed 150 Hz/cm2. Here, the CSC
technology is chosen because it combines high spatial, time and double track
resolution with high-rate capability and low neutron sensitivity. The CSCs are
segmented into large and small chambers in φ. The whole CSC system consists
of two disks with 8 chambers each. Each chamber contains four CSC planes
resulting in four independent measurements in η and φ along each track.
The CSCs are multi-wire proportional chambers with a cathode strip readout
and with a symmetric cell in which the anode-cathode spacing is equal to the
anode wire pitch. The (anode) wires are oriented in the radial direction, both
cathodes are segmented, one with the strips perpendicular to the wires provid-
ing the precision coordinate (η(R)) and the other parallel to the wires providing
the transverse coordinate (φ). A crossing muon will cause charges on several
strips. The precision coordinate is obtained by measuring the charge induced
on the segmented cathode by the avalanche formed on the anode wire. The
resolution is 60 µm per CSC plane; combining the eight measurements12,
the total chamber resolution in η is 40 µm. In the non-bending direction the
cathode segmentation is coarser leading to a resolution of 5 mm. Due to the
small gas volume and the used gas mixture of Ar:CO2=80:20, the sensitivity for
neutrons is low and the drift times are small, less than 40 ns, resulting in a time
resolution of 7 ns per plane.
2.2.7.4 RPC
In the barrel (|η| < 1.05) the muon trigger consists of RPCs. They are used
because of their good spatial and time resolution as well as adequate rate capability. The
RPCs are positioned in three concentric layers around the beam axis, referred
to as the three trigger stations, as shown in figures 2.9 and 2.12. The two in-
ner chambers (RPC1 and RPC2) sandwich the middle MDT chambers, and the
outer layer (RPC3) is assembled on the outer MDT chambers: on top of the
MDT chamber for the large sectors, and below the MDT chamber for the small
12Each crossing muon will give four independent measurements in both η and φ.
Figure 2.11: Cross-section through an RPC, where two units are joined to form a chamber. Each unit has two gas volumes supported by spacers (the distance between successive spacers is 100 mm), four resistive electrodes and four readout planes, reading the transverse and longitudinal directions. The sandwich structure (dashed) is made of paper honeycomb. The φ-strips are in the plane of the figure and the η-strips are perpendicular to it. Dimensions are given in mm [27].
sectors. No gaps in φ are present in this configuration. Each station consists
of two independent detector layers, each measuring η and φ. Therefore, a track
going through all three stations delivers six measurements in η and φ.
An RPC is a gaseous parallel electrode-plate detector with a typical spatial
resolution of 1 mm and a time resolution of 1.5 ns. The basic detecting unit consists
of a thin gas gap formed by two resistive bakelite parallel plates13, separated by
insulating spacers. The gas gap (2 mm) is filled with a gas mixture of 94.7%
tetrafluorethane (C2H2F4), 5% isobutane (C4H10) and 0.3% SF6. The chambers
are operated in the avalanche mode with a typical electric field between plates
of 4.9 kV/mm.
As can be seen in Figure 2.11, an RPC trigger chamber is made of two rectangular
detectors, called units, contiguous to each other with a small overlap to avoid
dead areas for curved tracks. Each unit consists of two such independent
detector layers, called gas volumes, which are separated by light-weight paper
honeycomb and are each read out by two orthogonal sets of metal pick-up strips
on the outer side of the plates. The η-strips are parallel to the MDT wires and
thus determine the position in the bending plane of the magnet; the φ-strips
are orthogonal to the MDT wires and measure the coordinate orthogonal to
the bending direction, so they provide the measurement of the position along the
MDT wire required for the precise calibration of the MDT tubes.
13The plates are made of phenolic-melaminic plastic laminate.
2.2.7.5 TGC
For the endcaps a slightly different trigger technology is chosen: the Thin
Gap Chambers (TGCs). They provide two functions in the end-cap muon spec-
trometer: the muon trigger capability and the determination of the second,
azimuthal coordinate (φ) to complement the measurement of the MDTs in the
bending (radial) direction. TGCs are positioned in four planes around the
beam axis, as depicted in figures 2.9 and 2.12. While the RPCs are physically
connected to an MDT counterpart, there is no such connection for the TGCs.
They are constructed as double-gap units, called doublets, and triple-gap units,
called triplets. At the end-cap middle (EM) station, one layer of TGC triplets
(TGC1, 1.05 < |η| < 2.7) is placed in front of the MDTs and two layers of dou-
blets (TGC2 and TGC3, 1.05 < |η| < 2.4) behind the MDTs. The EM TGCs
are mounted on the so-called wheels at |z| ∼ 14 m and give seven measurements
in total. The TGC1 layer provides second coordinate measurements
up to |η| = 2.7; however, since there are no coincidences in the other planes,
these measurements are not used for triggering. An additional layer of TGC
doublets (TGCI) is installed at the end-cap inner station (1.05 < |η| < 1.92).
It is located in front of the innermost tracking layer and is segmented radially
into two non-overlapping regions: end-cap (EI) and forward (FI, also known as
the small wheel). EI TGCs are mounted on support structures of the barrel
toroid coils at |z| ∼ 7 m and are only used for measuring the second coordinate.
TGCs are multi-wire proportional chambers operated in a saturated mode, with
the difference that the anode wire pitch is larger than the cathode-anode dis-
tance. The used gas mixture is CO2:n-C5H12=55:45. Position measurements
are obtained from both the pick-up strips (φ) and the wires (η). The TGCs
have a time resolution of 4 ns.
2.2.8 Trigger System
At high luminosity LHC running, the pp bunch-crossing rate reaches 40 MHz.
The resulting amount of data is far too large to be written to storage. To reduce
the total data flow without losing interesting physics events a preselection filter
was developed. The ATLAS trigger system is organized in three levels. Each
trigger level reduces the event rate by orders of magnitude. Each higher level
has more time per event available to make a more refined decision. The final
rate will be 200 Hz with an event size of about 1.3 MB.
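The rate-reduction chain quoted above can be checked with a few lines of arithmetic. This is an illustrative sketch; the numbers are taken from the text, and the helper function is not part of any ATLAS software:

```python
# Back-of-the-envelope check of the three-level trigger chain.
# Rates and event size come from the text; purely illustrative.

def rejection(rate_in_hz, rate_out_hz):
    """Rate-reduction factor achieved between two trigger levels."""
    return rate_in_hz / rate_out_hz

l1_in, l1_out = 40e6, 75e3   # 40 MHz bunch crossings -> 75 kHz after L1
l2_out = 3.5e3               # L2 output rate (Hz)
ef_out = 200.0               # Event Filter output rate (Hz)
event_size_mb = 1.3          # average event size written to storage (MB)

total_rejection = rejection(l1_in, ef_out)    # overall factor of 2e5
storage_rate_mb_s = ef_out * event_size_mb    # data volume written per second
```

The overall rejection of 2 × 10^5 corresponds to a storage bandwidth of roughly 260 MB/s at the quoted event size.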
Figure 2.12: Schematics of the muon trigger system. RPC2 and TGC3 are the reference (pivot) planes for barrel and end-cap, respectively [27].
2.2.8.1 Level-1 Trigger
The level-1 trigger (L1) is a hardware based trigger, it performs the initial
event selection based on information from the calorimeters and muon detectors.
The calorimeter selection is based on information from all the calorimeters. The
L1 Calorimeter Trigger (L1Calo) aims to identify high-ET objects such as elec-
trons and photons, jets, and τ -leptons decaying into hadrons, as well as events
with large missing transverse energy (ETmiss) and large total transverse energy. A trigger on the scalar sum
of jet transverse energies is also available. For the e/γ and τ triggers, isolation
can be required. Isolation implies that the energetic particle must have a mini-
mum angular separation from any significant energy deposit in the same trigger.
The information for each bunch-crossing used in the L1 trigger decision is the
multiplicity of hits for 4 to 16 programmable ET thresholds per object type.
The L1 muon trigger is based on signals in the muon trigger chambers: RPCs
in the barrel and TGCs in the end-caps. The trigger searches for patterns of
hits consistent with high-pT muons originating from the interaction region. The
logic provides 6 independently-programmable pT thresholds: three associated
with the low-pT trigger (threshold range approximately 6−9 GeV) and three as-
sociated with the high-pT trigger (threshold range approximately 9−35 GeV).
The information for each bunch-crossing used in the L1 trigger decision is the
multiplicity of muons for each of the pT thresholds. Muons are not double-
counted across the different thresholds.
The L1 is designed to reduce the 40 MHz rate to approximately 75 kHz, with
the possibility to upgrade to 100 kHz. The decision time (latency), which is the
time from the collision until the L1 trigger decision, must be kept as short as
possible. The L1 trigger has a latency less than 2.5 µs.
The level-1 trigger also defines the so-called Regions of Interest (RoIs). These
are detector regions in η and φ coordinates, where interesting features have
been identified, hence where the L1 trigger has identified possible trigger ob-
jects within the event. These RoIs are used by the subsequent trigger as starting
point for more refined trigger algorithms. If an event is accepted by the L1 trig-
ger, the full detector is read out and the data are passed to the level-2 trigger (L2).
Muon Trigger algorithm
The trigger in both the barrel and the end-cap regions is based on three trigger
stations each. The basic principle of the algorithm is to require a coincidence
of hits in the different trigger stations within a road, which tracks the path
of a muon from the interaction point through the detector. Each coincidence
pattern corresponds to a certain deviation from straightness, i.e. curvature of
the track, which is used as a criterion for the track to have passed a predefined
momentum threshold. The deviation from straightness is the deviation of the
slope of the track segment between two trigger chambers from the slope of a
straight line between the interaction point and the hit in a reference layer called
the pivot plane, which is the second layer in the barrel (RPC2) and the last
layer in the end-cap (TGC3), as illustrated in Figure 2.12. The width of the
road is a function of the desired cut on pT : the smaller the road, the higher the
cut on pT .
In the barrel the trigger algorithm operates in the following way: if a track
hit is generated in the second RPC doublet (the pivot plane), a search for
a corresponding hit is made in the first RPC doublet, within a road whose
center is defined by the line of conjunction of the hit in the pivot plane with
the interaction point. Only a 3-out-of-4 coincidence of the four layers of the
two doublets is required for the low-pT trigger. The high-pT algorithm also
requires 1-out-of-2 possible hits of the RPC3 doublet. The scheme of the L1
muon end-cap trigger is shown on the right hand side of Figure 2.12. A 3-out-
of-4 coincidence is required for the doublet pair planes of TGC2 and TGC3, for
both wires and strips, a 2-out-of-3 coincidence for the triplet wire planes, and
1-out-of-2 possible hits for the triplet strip planes. Trigger signals from both
doublets and the triplet are involved in identifying the high-pT candidates, while
in case of the low-pT candidates the triplet station may be omitted.
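The road-and-coincidence logic described above can be sketched in a few lines. Everything here — the straight-line geometry helper, the data layout and the numerical values — is invented for illustration; the real ATLAS trigger logic is implemented in dedicated hardware:

```python
# Toy version of the barrel low-pT coincidence: a hit in the pivot plane (RPC2)
# opens a road in the other doublet, centred on the line from the interaction
# point, and a 3-out-of-4 layer coincidence is required. Illustrative only.

def road_center(z_pivot, r_pivot, r_layer):
    """Straight line from the interaction point through the pivot-plane hit,
    evaluated at the radius of another chamber layer."""
    return z_pivot * (r_layer / r_pivot)

def low_pt_coincidence(layers, center, half_width):
    """layers: four lists of hit positions (two pivot-doublet layers plus two
    inner-doublet layers). A layer 'fires' if any of its hits falls inside the
    road; the low-pT trigger requires a 3-out-of-4 coincidence."""
    fired = sum(1 for layer in layers
                if any(abs(z - center) <= half_width for z in layer))
    return fired >= 3
```

A smaller `half_width` accepts only straighter tracks and therefore corresponds to a higher effective pT threshold, as stated in the text.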
2.2.8.2 Level-2 Trigger
The L2 trigger is a software trigger which uses the output of the L1: it
uses the RoI information on coordinates, energy, and type of signatures to limit
the amount of data which must be transferred from the detector readout. L2
selections use, at full granularity and precision, all the available detector data
within the RoIs (approximately 2% of the total event data) to further reduce
the data rate to approximately 3.5 kHz, with an event processing time of about
40 ms, averaged over all events.
2.2.8.3 Level-3 trigger
Events selected by the L2 trigger are passed on to the L3 trigger (Event
Filter (EF)) which uses the complex reconstruction algorithms also used in AT-
LAS offline event reconstruction. The L2 and the EF together are called the
High Level Trigger (HLT). EF further selects events down to a rate which can
be recorded for subsequent offline analysis. It reduces the event rate to ap-
proximately 200 Hz, with an average event processing time of order 4 s. The
HLT algorithms use the full granularity and precision of calorimeter and muon
chamber data, as well as the data from the inner detector, to refine the trigger
selections. Better information on energy deposition improves the threshold cuts,
while track reconstruction in the inner detector significantly enhances the par-
ticle identification (for example distinguishing between electrons and photons).
The decision for accepting an event is based on trigger menus. A trigger menu
is a set of one or more event characteristics (like ETmiss or a muon) with certain
thresholds. The set of trigger menus can be adjusted depending on the luminos-
ity to use the full capacity of the bandwidth. Those events that have passed the
selection criteria are sorted into data streams: electrons, muons, jets, photons,
ETmiss, τ-leptons, and B-physics. As ATLAS uses inclusive streaming, an
event can be recorded in more than one stream. In addition to the physics
streams, there are also calibration streams that are used to calibrate the detec-
tors, and express streams that are used for monitoring and for data quality
checks.
2.2.9 Electron Reconstruction and Identification
Electron reconstruction uses information from the calorimeter and the inner
detector to reject objects that fake an electron, such as photons (with or without
pair conversions), QCD jets (u/d/s-hadron decays), π0/η Dalitz decays (π0/η → e+e−γ),
charged hadrons and muons (because of the potential emission of a
bremsstrahlung photon), and to identify isolated (Z, W , t, τ or µ decays) and
non-isolated electrons (J/ψ, b-hadron or c-hadron decays).
In the moderate pT region (20-50 GeV), a jet-rejection factor exceeding 10^5 will
be needed to extract a relatively pure inclusive signal from genuine electrons
above the residual background from jets faking electrons. The required rejection
factor decreases rapidly with increasing pT to ∼ 10^3 for jets in the TeV region.
At present, two electron reconstruction algorithms in the range |η| < 2.5 have
been implemented in the ATLAS offline software, both integrated into one single
package and a common event data model [18]:
• The standard one (egammaBuilder), which is seeded from the electromag-
netic (EM) calorimeters, starts from clusters reconstructed in the calorime-
ters and then builds the identification variables based on information from
the inner detector and the EM calorimeters.
• A second algorithm (softeBuilder), which is seeded from the inner de-
tector tracks, is optimized for electrons with energies as low as a few GeV,
and selects good-quality tracks matching a relatively isolated deposition
of energy in the EM calorimeters. The identification variables are then
calculated in the same way as for the standard algorithm.
A third algorithm is dedicated to the reconstruction of forward electrons (|η| > 2.5),
where no track matching is possible because of the limited coverage
of the ID (|η| < 2.5). Thus, in contrast to the central electrons, forward electron
reconstruction can only use information from the calorimeters. Reconstructed
electrons from different algorithms are merged and the overlap between differ-
ent algorithms is removed during the AOD (Analysis Object Data) production.
The variable "author" is defined to indicate which algorithm created a certain
electron. The standard electron is defined by: author = 1 || author = 3, where
“1” means the electron comes from the egammaBuilder, and “3” means both
the egammaBuilder and the softeBuilder find the electron. In cases where
both the algorithms find the same electron, the overlap is resolved and most
parameters from the standard electrons are kept with a few exceptions.
In the standard algorithm electron and photon reconstruction begins with the
creation of a preliminary set of clusters in the EM calorimeter whose size cor-
responds to 3 × 5 cells in η × φ in the middle layer. Electron and photon
reconstruction is seeded from such clusters with ET > 2.5 GeV, using a sliding
window algorithm14 over the full acceptance of the EM calorimeter. The final
cluster size is dependent on the particle hypothesis and the region of the detec-
tor: 3 × 5 for unconverted photons in the barrel, 3 × 7 for converted photons
14A sliding window algorithm with fixed size looks for regions of approximately 0.1 × 0.1 in ∆η × ∆φ where the deposits exceed 2.5 GeV and defines the cluster position such that the energy inside the window is maximized.
and electrons in the barrel, 5× 5 in all other cases15. Then a matching track is
searched for among all reconstructed tracks which do not belong to a photon-
conversion pair reconstructed in the inner detector. The track is required to
match the cluster within a broad ∆η × ∆φ window of 0.05 × 0.10. The ratio,
E/p, of the energy of the cluster to the momentum of the track is required to
be < 10. Approximately 93% of true isolated electrons, with ET > 20 GeV and
|η| < 2.5, are selected as electron candidates. The inefficiency is mainly due to
the large amount of material in the inner detector and is therefore η-dependent.
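The fixed-size sliding-window search of footnote 14 can be sketched as a scan over a small grid of cell transverse energies. This is a toy version: the grid, the window indexing and the local-maximum tie-breaking are simplifications for illustration, not the ATLAS implementation:

```python
# Toy sliding-window cluster seeding: scan a fixed-size window over a 2D grid
# of cell ET (GeV) and keep positions where the summed ET exceeds the seed
# threshold and is a local maximum among neighbouring window positions.

def sliding_window_seeds(et_grid, win=3, threshold=2.5):
    """et_grid: 2D list of cell ET in GeV. Returns (eta_idx, phi_idx, sum_et)
    for window positions above threshold that are local maxima."""
    n_eta, n_phi = len(et_grid), len(et_grid[0])
    sums = {}
    for i in range(n_eta - win + 1):
        for j in range(n_phi - win + 1):
            sums[(i, j)] = sum(et_grid[i + a][j + b]
                               for a in range(win) for b in range(win))
    seeds = []
    for (i, j), s in sums.items():
        if s <= threshold:
            continue
        neighbours = [sums.get((i + di, j + dj), 0.0)
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)
                      if (di, dj) != (0, 0)]
        if all(s >= n for n in neighbours):
            seeds.append((i, j, s))
    return seeds
```

In the real reconstruction the seeded cluster is then rebuilt with the particle- and region-dependent sizes quoted in the text (3 × 5, 3 × 7, 5 × 5) before track matching.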
Various identification techniques can be applied to the reconstructed electron
candidates, combining calorimeter and track quantities and the TRT informa-
tion to discriminate jets and background electrons from the signal electrons.
Standard identification of high-pT electrons is based on many cuts which can all
be applied independently. Three reference sets of cuts have been defined: loose,
medium and tight, as summarised in Table 2.5 [18].
2.2.9.1 Loose cuts
This set of cuts performs a simple electron identification based only on lim-
ited information from the calorimeters. Cuts are applied on the hadronic leakage
and on shower-shape variables, derived from only the middle layer of the electro-
magnetic calorimeter (lateral shower shape and lateral shower width). This set
of cuts provides excellent identification efficiency of ∼ 88%, but low background
rejection ∼ 600 [36].
2.2.9.2 Medium cuts
This set of cuts improves the quality by adding cuts on the strips in the first
layer of the EM calorimeter and on the tracking variables:
• Strip-based cuts are effective in the rejection of π0 → γγ decays. Since the
energy-deposit pattern from π0’s is often found to have two maxima due
to π0 → γγ decay, showers are studied in a window ∆η×∆φ = 0.125×0.2
around the cell with the highest ET to look for a second maximum. If more
than two maxima are found the second highest maximum is considered.
The variables used include:
– ∆Es = Emax2 − Emin: the difference between the energy associated
with the second maximum (Emax2) and the energy reconstructed in
15In the barrel, electrons need larger clusters than photons in φ to collect bremsstrahlung photons, as the electrons bend in φ due to the solenoid magnetic field. In the end-cap, all the particles use the same window since the effect of the magnetic field is smaller. The window sizes were chosen as a compromise between the spread of the energy deposits and the noise (the inclusion of more cells increases the noise).
Loose cuts:
• Acceptance of the detector: |η| < 2.47.
• Hadronic leakage: ratio of ET in the first sampling of the hadronic calorimeter to ET of the EM cluster.
• Second layer of the EM calorimeter:
– Ratio in η of cell energies in 3 × 7 versus 7 × 7 cells (Rη).
– Ratio in φ of cell energies in 3 × 3 versus 3 × 7 cells (Rφ).
– Lateral width of the shower.
Medium cuts (include loose cuts):
• First layer of the EM calorimeter:
– Difference between the energy associated with the second largest energy deposit and the energy associated with the minimal value between the first and second maxima (∆Es).
– Second largest energy deposit normalised to the cluster energy (Rmax2).
– Total shower width (wstot).
– Shower width for three strips around the maximum strip (ws3).
– Fraction of energy outside the core of three central strips but within seven strips (Fside).
• Track quality:
– Number of hits in the pixel detector (at least one).
– Number of hits in the pixels and SCT (at least nine).
– Transverse impact parameter (< 1 mm).
Tight (isol) cuts (include medium cuts):
• Isolation: ratio of transverse energy in a cone ∆R < 0.2 to the total cluster transverse energy.
• Vertexing layer: number of hits in the vertexing layer (at least one).
• Track matching:
– ∆η between the cluster and the track (< 0.005).
– ∆φ between the cluster and the track (< 0.02).
– Ratio of the cluster energy to the track momentum (E/p).
• TRT:
– Total number of hits in the TRT.
– Ratio of the number of high-threshold hits to the total number of hits in the TRT.
Tight (TRT) cuts (include tight (isol) cuts except for isolation):
• Same TRT cuts as above, but with tighter values corresponding to about 90% efficiency for isolated electrons.
Table 2.5: Definition of variables used for loose, medium and tight electron identification cuts. The cut values are given explicitly only when they are independent of η and pT.
the strip with the minimal value, found between the first and second
maxima (Emin);
– Rmax2 = Emax2 / (1 + 9 × 10^-3 ET), where ET is the transverse energy
of the cluster in the electromagnetic calorimeter and the constant
value 9 is in units of GeV^-1;
– wstot: the shower width over the strips covering 2.5 cells of the second
layer (20 strips in the barrel for instance);
– ws3: the shower width over three strips around the one with the
maximal energy deposit;
– Fside: the fraction of energy deposited outside the shower core of
three central strips.
• The tracking variables include the number of hits in the pixels, the number
of silicon hits (pixels plus SCT) and the transverse impact parameter.
The medium cuts increase the jet rejection by a factor of 3-4 (up to 2000) with
respect to the loose cuts, while reducing the identification efficiency by ∼ 10%
(it is ∼ 77%) [36].
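The ∆Es and Rmax2 definitions above can be made concrete with a small sketch. This is illustrative Python on a 1D list of strip energies; the helper names are invented, and the real computation runs on calibrated first-layer calorimeter strips:

```python
# Sketch of the strip-layer variables: find the two leading local maxima in a
# 1D strip-energy profile, then compute Delta_Es and Rmax2 as defined in the
# text. Inputs and binning are simplified for illustration.

def local_maxima(strips):
    """Indices of strips higher than both neighbours."""
    return [i for i in range(1, len(strips) - 1)
            if strips[i] > strips[i - 1] and strips[i] > strips[i + 1]]

def strip_variables(strips, et_cluster):
    """strips: list of strip energies (GeV); et_cluster: cluster ET (GeV).
    Returns (Delta_Es, Rmax2); (0, 0) when there is no second maximum."""
    maxima = sorted(local_maxima(strips), key=lambda i: strips[i], reverse=True)
    if len(maxima) < 2:
        return 0.0, 0.0
    i1, i2 = sorted(maxima[:2])          # the two highest maxima, in strip order
    e_max2 = strips[maxima[1]]           # energy of the second maximum
    e_min = min(strips[i1:i2 + 1])       # valley between the two maxima
    delta_es = e_max2 - e_min
    r_max2 = e_max2 / (1.0 + 9e-3 * et_cluster)   # ET in GeV, 9 in GeV^-1
    return delta_es, r_max2
```

A genuine electron shower yields a single maximum (both variables vanish), while a π0 → γγ overlap produces a sizeable second peak, which is what these cuts exploit.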
2.2.9.3 Tight cuts
This set of cuts makes use of all the particle-identification tools currently
available for electrons. In addition to the cuts used in the medium set, cuts are
applied on the number of vertexing-layer hits (to reject electrons from conver-
sions), on the number of hits in the TRT, on the ratio of high-threshold hits to
the number of hits in the TRT (to reject the dominant background from charged
hadrons), on the difference between the cluster and the extrapolated track po-
sitions in η and φ , and on the ratio of cluster energy to track momentum, as
shown in Table 2.5.
Two different final selections are available within this tight category: they are
named tight (isol) and tight (TRT) and are optimised differently for isolated
and non-isolated electrons. In the case of tight (isol) cuts, an additional energy
isolation cut is applied to the cluster, using all cell energies within a cone of
∆R < 0.2 around the electron candidate. This set of cuts provides, in general,
a reasonable electron identification efficiency of ∼ 64% (but the highest isolated
electron identification) and the highest rejection against jets (∼ 10^5) [36]. The
tight (TRT) cuts do not include the additional explicit energy isolation cut, but
instead apply tighter cuts on the TRT information to further remove the back-
ground from charged hadrons.
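The energy-isolation cut of the tight (isol) selection amounts to a cone sum around the candidate. The sketch below is illustrative Python: the cell list and function names are invented, and the real computation navigates the calorimeter cells directly:

```python
import math

# Sketch of calorimetric cone isolation: sum the ET of all cells within
# Delta R < 0.2 of the candidate and normalise to the cluster ET.

def delta_r(eta1, phi1, eta2, phi2):
    """Separation in eta-phi space, with phi wrapped into [-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)

def isolation_ratio(cand_eta, cand_phi, cluster_et, cells, cone=0.2):
    """cells: list of (eta, phi, et) in GeV. Returns cone ET / cluster ET."""
    cone_et = sum(et for eta, phi, et in cells
                  if delta_r(cand_eta, cand_phi, eta, phi) < cone)
    return cone_et / cluster_et
```

An isolated electron gives a small ratio, while an electron inside a jet accumulates extra ET in the cone, which is why the cut suppresses the jet background.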
2.2.10 Muon Reconstruction and Identification
The ATLAS detector has been designed for efficient muon identification and
momentum resolutions as low as 3% for transverse momenta of pT = 200 GeV
and less than 10% up to pT = 1 TeV. This is achieved by a combination of mea-
surements from the inner detector and the muon spectrometer. For pT roughly
in the range between 30 and 200 GeV, the momentum measurements from the
inner detector and muon spectrometer may be combined to give precision better
than either alone. The inner detector dominates below this range, and the spec-
trometer above it [18]. As described in Section 2.2.8, muons with a momentum
higher than 6 GeV are triggered. However, muons with a lower momentum can
still be reconstructed in the muon spectrometer, where muons are identified and
measured with momenta ranging from 3 GeV to 3 TeV [27].
Muon reconstruction and identification is based on a combined usage of data
from three ATLAS sub-detectors: the muon spectrometer (MS), the inner detec-
tor (ID), and the calorimeter. The calorimeter, with a thickness of more than 10
λ, provides an effective absorber for hadrons, electrons and photons produced
by pp collisions at the center of the ATLAS detector. Energy measurements
in the calorimeter can aid in muon identification because of their characteristic
minimum ionizing signature and can provide a useful direct measurement of the
energy loss.
At ATLAS, four types of muons are defined to achieve high purity, efficiency and
momentum resolution: combined, stand-alone, segment tagged and calorimeter
tagged muons. The current ATLAS baseline reconstruction includes two algo-
rithms for each strategy. The algorithms are grouped into two families such
that each family includes one algorithm for each strategy. The output data
intended for use in physics analysis includes two collections of muons, one for
each family, in each processed event. The collections (and families) are referred
by the names of the corresponding combined algorithms: Staco and Muid. The
Staco collection is the current default for physics analysis [18].
Stand-Alone Muons (SA): Only the hits in the muon spectrometer are used
to reconstruct the track, see Figure 2.13(a). The standalone algorithms,
both families, start by identifying Regions of Activity, which are seeded
by the muon trigger chambers, and then employ a pattern recognition
algorithm to form local segments in each of the three muon stations in
these regions of activity. Next, the local segments are connected via a
three dimensional continuous track fit in the magnetic field to form track
candidates. Once the tracks have been found, they are extrapolated to the
beam line. The extrapolation must account for both multiple scattering
and energy loss in the calorimeter. Then at this step, despite the different
(a) Stand-Alone Muon: A track in the muon spectrometer (blue), extrapolated through the calorimeter (orange) but without a matching inner detector track (dashed line).
(b) Combined Muon: A track in the muon spectrometer (blue), extrapolated through the calorimeter (orange) and matched with a track in the inner detector (yellow).
(c) Segment Tagged Muon: An inner detector track (yellow) matched with one hit segment in the muon spectrometer (blue).
(d) Calorimeter Tagged Muon: An inner detector track (yellow) extrapolated into the calorimeter (orange) and compatible with the signature of a minimum ionizing particle.
Figure 2.13: The four types of muon candidates defined at ATLAS: stand-alone (a), combined (b), segment tagged (c) and calorimeter tagged (d).
implementations, the general procedures are essentially the same for both
families.
The Staco family algorithm is called Muonboy. On the Muid side, Moore is
used to find the tracks and MuidStandalone for the inward extrapolation.
Muonboy assigns energy loss based on the material crossed in the calorime-
ter. Muid additionally makes use of the calorimeter energy measurements
if they are significantly larger than the most likely value and the muon
appears to be isolated.
Standalone algorithms have the advantage of slightly greater |η| coverage, up
to 2.7 compared to 2.5 for the inner detector, but there are holes in
the coverage at |η| near 0 and 1.2. Very low momentum muons (around
a few GeV) may be difficult to reconstruct because they do not penetrate
to the outermost stations.
Combined Muons (CB): Muon spectrometer and inner detector perform an
independent track reconstruction. After successful combination, a joint
track is formed (see Figure 2.13(b)). Calorimeter measurements are taken
into account to reduce the fake signals of the standalone reconstruction.
Combined muons are the standard muon objects for physics analysis and
provide candidates of highest purity. The combined reconstruction covers
the range |η| < 2.5 due to the inner detector acceptance.
The Staco muon reconstruction attempts to statistically merge the two
independent measurements from the ID track and the MS track (this algorithm
is called STACO). Muid does a partial refit: it does not directly use
the measurements from the inner track, but starts from the inner track
vector and covariance matrix and adds the measurements from the outer
track. The fit accounts for the material (multiple scattering and energy
loss) and magnetic field in the calorimeter and muon spectrometer. This
latter algorithm is called MuidCombined.
Segment Tagged Muons (ST): If the hits in the muon spectrometer are not
sufficient for a proper measurement, an inner detector track is still con-
sidered a muon, if the extrapolated track can be associated with a recon-
structed muon segment (see Figure 2.13(c)). In other words, tagged muons
are produced by propagating all ID tracks with sufficient momentum out
to the MS and searching for matching segments in the inner and middle
stations of the MS. The tagged muon reconstruction mainly aims to re-
construct low-pT muon tracks. Therefore segment tagged muons are used
to recover low detector efficiencies in low-pT and badly covered η regions.
The muon tagging covers |η| < 2 only. This strategy will provide infor-
mation in detector regions where standalone reconstruction is degraded,
such as the region near η = 0 and the transition region between barrel
and end-cap (|η| ∼ 1.2).
In the Staco muon reconstruction the tagged algorithm is referred to as
MuTag. In the Muid muon reconstruction, the tagged muons are found
by the MuGirl or the MuTagIMO algorithm. MuGirl considers all inner de-
tector tracks and redoes segment finding in the region around the track.
MuTag only makes use of inner detector tracks and muon spectrometer
segments not used by Staco combined algorithm. Thus MuTag serves only
to supplement STACO while MuGirl attempts to find all muons. In the
Muid collection, the overlap between combined (MuidCombined) and seg-
ment tagged muons (MuGirl) has to be taken into account; these overlaps
are removed by creating a single muon when both have the same inner
detector track.
Calorimeter Tagged Muons (CT): A trajectory in the inner detector is
identified as a muon if the associated energy depositions in the calorime-
ters are compatible with the hypothesis of a minimum ionizing particle
(mip16). Calorimeter tagged muons are reconstructed by separate algo-
16A particle that, while traversing matter, releases the minimum ionization energy.
rithms with respect to Staco and Muid and they recover efficiency at η ∼ 0.
The standalone, the combined, and the tagged muons are merged to improve
the muon finding efficiency, and possible overlaps between different algorithms
are removed, i.e. cases where the same muon is identified by two or more
algorithms. The overlap removal requires that muons have different inner detector
tracks and merges standalone muons that are too close to one another. Closeness
is defined by η − φ separation with a default limit of 0.4 [18]. Similar to the
electron case, the variable "author" is defined to indicate the algorithm by
which a certain muon is built.
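The merging and overlap-removal logic described in this paragraph can be sketched as follows. The data layout and function names are invented for illustration; only the default ∆R limit of 0.4 is taken from the text:

```python
import math

# Toy overlap removal: muons sharing an inner-detector track are merged
# (first one kept), and standalone muons closer than Delta R = 0.4 in eta-phi
# to an already accepted muon are dropped.

def delta_r(m1, m2):
    """eta-phi separation of two muon dicts, with phi wrapping."""
    dphi = math.atan2(math.sin(m1["phi"] - m2["phi"]),
                      math.cos(m1["phi"] - m2["phi"]))
    return math.hypot(m1["eta"] - m2["eta"], dphi)

def remove_overlaps(muons, dr_limit=0.4):
    """muons: dicts with 'eta', 'phi' and an optional 'id_track' identifier."""
    accepted, seen_tracks = [], set()
    for mu in muons:
        track = mu.get("id_track")
        if track is not None:
            if track in seen_tracks:
                continue              # same ID track: same physical muon
            seen_tracks.add(track)
        elif any(delta_r(mu, acc) < dr_limit for acc in accepted):
            continue                  # standalone muon too close to an accepted one
        accepted.append(mu)
    return accepted
```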
Chapter 3
Higgs search in the decay channel
H → ZZ(∗) → 4l
The search for the Standard Model Higgs boson is a major goal of the LHC.
As already discussed in Section 1.3.2, the experimentally cleanest signature for the discovery
of the Higgs boson is its “golden” decay to four leptons (electrons and muons):
H → ZZ → 4l. The excellent energy resolution and linearity of the recon-
structed electrons and muons lead to a narrow four-lepton invariant mass peak
on top of a smooth background. At the same time Higgs analyses in four leptons
final states have a great impact on the discovery sensitivity, and if a discovery takes
place, H → 4l offers several possibilities to study the properties of the Higgs
boson.
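The narrow peak relies on reconstructing the four-lepton invariant mass from the measured lepton kinematics. In the massless-lepton approximation this reduces to the sketch below (illustrative Python, not the ATLAS reconstruction code):

```python
import math

# Sketch of the four-lepton invariant mass: build each four-vector from
# (pT, eta, phi) in the massless approximation, sum them, and take the
# Minkowski norm of the total.

def four_vector(pt, eta, phi):
    """(E, px, py, pz) of a massless particle, pt in GeV."""
    px, py = pt * math.cos(phi), pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    e = pt * math.cosh(eta)          # massless: E = |p|
    return e, px, py, pz

def invariant_mass(leptons):
    """leptons: list of (pt, eta, phi) in GeV. Returns m in GeV."""
    e, px, py, pz = map(sum, zip(*(four_vector(*l) for l in leptons)))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))
```

The excellent lepton momentum resolution quoted in the text translates directly into a sharp peak of this quantity at mH on top of the smooth ZZ continuum.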
This chapter details the first of the two physics analyses studied in this
thesis, the cut-based analysis; both analyses concern the search for a SM Higgs boson
in four-lepton final states. The present analysis was published by the ATLAS
Collaboration in [37]: “Search for the Standard Model Higgs boson in the decay
channel H → ZZ(∗) → 4l with 4.8 fb−1 of pp collisions at √s = 7 TeV” and all
details are available on the online Higgs group twiki page [38].
The chapter is organized as follows: characteristics of the signal and main
backgrounds are discussed in Section 3.1, afterwards the pileup reweighting, the
lepton reconstruction and the applied corrections are described, then Section 3.5
explains the analysis strategy (event selection and mass reconstruction). Finally
the background estimation and the results are presented.
3. Higgs search in 4l 3.1 Signal and Main Backgrounds
3.1 Signal and Main Backgrounds
The main characteristics of the signal and main backgrounds, already men-
tioned in section 1.3.2, are here summarized. The importance of four leptons
final states on Higgs searches is partially due to the high branching ratio of
H → ZZ. For Higgs masses greater than 120 GeV, decays to a pair of Z
bosons, as already underlined in section 1.3.2, are above the 1% level, becoming
the sub-leading process for mH ∼ 160 GeV and contributing roughly 1/3
of the branching fraction from 200 GeV on. The Zs then decay to charged lep-
tons, neutrinos or quarks, the latter ones being hardly accessible due to QCD
backgrounds. While both the taus and the neutrinos induce significant amounts
of missing transverse energy, events with only electrons and muons can be fully
reconstructed. From the experimental point of view, these are the cleanest sig-
natures available. The excellent transverse energy and momentum resolutions
for electrons and muons provide narrow invariant mass distributions when the
Z and Higgs bosons are reconstructed. The Higgs signal can be identified by
a peak in the four-lepton invariant mass spectrum, sitting on top of a relatively
smooth background. For mH ≥ 180 GeV, H → 4l is the “golden channel”, with
the Higgs decaying to two on-shell Z bosons.
The main and almost only background in this region is the non-resonant pro-
duction of Z boson pairs, which is nearly irreducible, possessing the same char-
acteristics as the signal. The leading diagrams for H → 4l and ZZ → 4l are
represented in figures 3.1 and 3.2(a) respectively. In both cases, each Z can
decay to electrons or muons, leading to three final states: four electrons (4e),
four muons (4µ) or two electrons and two muons (2e2µ). The last one has twice
the yield of each of the other two modes.
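The factor of two for the 2e2µ final state follows from branching-fraction combinatorics, which can be written out explicitly (illustrative Python; the numerical value of BR(Z → ll) per flavour is approximate):

```python
# Branching-fraction combinatorics behind the factor of two quoted above.
# With BR(Z -> ee) = BR(Z -> mumu) = b, the mixed final state can be assigned
# to the two Z bosons in two ways, while 4e and 4mu each have only one.

b = 0.0337                 # BR(Z -> ll) per lepton flavour, roughly 3.4%
br_4e    = b * b           # both Z bosons decay to electrons
br_4mu   = b * b           # both Z bosons decay to muons
br_2e2mu = 2 * b * b       # Z1 -> ee, Z2 -> mumu, or the other way round
```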
Figure 3.1: Main diagram for Higgs to four leptons production.
In addition to ZZ, the most important backgrounds are SM processes which generate
four real leptons with high pT , such as Zbb and tt, represented in figures
3.2(b) and 3.2(c). In both cases the dominant contribution is from two leptons
originating from the leptonic decay of the Z or the W s, and the other two from
decays of the b-quarks. Lepton production from top quark decays is illustrated
in Figure 3.3.
(a) ZZ (b) Zbb (c) tt
Figure 3.2: Some of the diagrams for the backgrounds to H → 4l searches.
Figure 3.3: Lepton production chains from top quark decays.
Despite the low fake rates, the role of events with three leptons plus a fake
one, or even two fakes, must be carefully evaluated, as their cross section can
be significantly higher than the ones with four leptons. The list of potentially
dangerous processes is thus complemented by WZ and Z + jets, with the Ws
and Zs decaying leptonically, and finally by gg2ZZ, which is the loop-induced
gluon-fusion process gg → Z(∗)(γ(∗))Z(∗)(γ(∗)) → ll l′l′. In this last process,
for Higgs masses below the Z-pair threshold, where one Z boson is produced off-shell,
the photon contribution to the background is particularly important [39].
Below mH = 2mZ , one of the Zs coming from the Higgs is off-shell and decays
to low momentum leptons, making the Zbb and tt backgrounds more harmful.
Their yields can exceed the signal yield by a few orders of magnitude
in this region, imposing additional selection cuts. Apart from the leptons from
Ws and Zs, the dominant contribution is expected from semi-leptonic decays
of heavy flavour quarks (b and c). Exploiting the hadronic activity around the
leptons and the long lifetime of hadrons containing these quarks, both track
and calorimeter isolation provide good discriminating power, complemented
by impact parameter requirements; these criteria are discussed in Section 3.5.4.
Rejection factors above 10 can be achieved with each discriminant, keeping the
backgrounds under control. For processes that do not contain vector bosons,
the rejection of the isolation cuts is much higher and thus their contribution is
negligible.
The challenges of the analysis arise mainly from two factors. On one side
is the low signal cross section: the small branching ratio for Z → ll (∼ 3.4%)
and the presence of four leptons demand the highest possible identification
efficiencies for both muons and electrons, which is particularly challenging at
low pT . On the other side, the background yields are known with large uncertainties.
3.1.1 Data and Monte Carlo Samples
3.1.1.1 Data Samples
The data used in this analysis were recorded with the ATLAS detector dur-
ing the 2011 LHC run. The data are subject to a number of quality requirements
ensuring that all essential elements of the ATLAS detector are working as
expected (see Section 3.5.1). The integrated luminosities of the data analysed
are 4.81 fb−1, 4.81 fb−1 and 4.91 fb−1 for the 4µ, 2e2µ and 4e final
states, respectively.
3.1.1.2 Monte Carlo Samples
The H → ZZ(∗) → 4l signal is modelled in the range 110 to 600 GeV using
the powheg Monte Carlo (MC) event generator [40, 41], which calculates separately
the gluon fusion and vector boson fusion production mechanisms of the Higgs
boson with matrix elements up to next-to-leading order (NLO). The Higgs boson
transverse momentum (pT ) spectrum in the gluon fusion process is reweighted in
order to include quantum chromodynamics (QCD) corrections up to NLO and
QCD soft-gluon resummations up to next-to-next-to-leading logarithm (NNLL).
powheg is interfaced to pythia [42] for showering and hadronization, which in
turn is interfaced to photos [43] for QED radiative corrections in the final-state
and to tauola [44, 45] for the simulation of τ decays. The cross sections for
Higgs boson production are derived to next-to-next-to-leading order (NNLO) in
QCD for the gluon fusion and vector boson fusion. In addition, QCD soft-gluon
resummations up to next-to-next-to-leading log (NNLL) are available for the
gluon fusion process [46], while the NLO electroweak (EW) corrections are ap-
plied to both the gluon fusion and vector boson fusion. The cross section times
the branching ratio values used for signal samples in the following are listed in
Table 3.1 [47, 38].
The simulated background samples considered in this analysis along with their
cross sections provided by the generators and their total number of events are
reported in Table 3.2. The background samples are generated in the following
ways:
mH [GeV]   MC Generator    Total Events   σ·BR [nb]
130        powheg-pythia   200000         5.89 · 10−6
150        powheg-pythia   199998         8.95 · 10−6
180        powheg-pythia   50000          4.12 · 10−6
200        powheg-pythia   50000          13.60 · 10−6
360        powheg-pythia   50000          7.08 · 10−6
400        powheg-pythia   49999          5.53 · 10−6
600        powheg-pythia   50000          0.90 · 10−6

Table 3.1: Signal samples along with their Monte Carlo generator, their total number of events and the cross section times branching ratio for the gluon fusion process in pp collisions at √s = 7 TeV.
• The irreducible ZZ(∗) → 4l background is generated using pythia. pythia
implements the qq initial state and takes into account the Z − γ interfer-
ence.
• The inclusive Z boson¹ and Zbb production is modelled using alpgen
[48]. The alpgen generator is interfaced to jimmy [49] for the simulation
of the underlying event. In the Zbb process the b quarks can also lead
to the emission of one or more partons, besides the two leptons. Overlaps
between the inclusive Z boson and Zbb samples are removed².
• For the tt production mc@nlo [50] is employed. The mc@nlo generator is
interfaced to jimmy for the simulation of the underlying event.
• The WZ background is produced with herwig [51].
• The gg2ZZ background is generated with jimmy.
Processes   MC Generator   Total Events   σ·BR [nb]       Filter Efficiency
ZZ → 4l     pythia         597958         7.3467 · 10−5   0.62
WZ          herwig         249949         1.1481 · 10−2   0.31
gg2ZZ       jimmy          65000          2.7900 · 10−6   0.60
tt          mc@nlo         14965993       1.4562 · 10−1   0.54
Table 3.2 continues on the following page.

¹The inclusive Z boson term refers to the Z + jets process with subsequent Z → l+l− decay. This process can contribute to the background if the accompanying jets are mis-identified as leptons.
²Namely, bb pairs with separation ∆R ≥ 0.4 between the jets are taken from the matrix-element calculation, whereas for ∆R < 0.4 the parton-shower jets are used.
Table 3.2 – continued from previous page

Processes                    MC Generator   Total Events   σ·BR [nb]       Filter Efficiency
Zbb
→ e+e−bb + 0p (NoFilter)     alpgen-jimmy   150000         6.5529 · 10−3   1
→ e+e−bb + 1p (NoFilter)     alpgen-jimmy   100000         2.4782 · 10−3   1
→ e+e−bb + 2p (NoFilter)     alpgen-jimmy   40000          8.8469 · 10−4   1
→ e+e−bb + 3p (NoFilter)     alpgen-jimmy   10000          3.9393 · 10−4   1
→ µ+µ−bb + 0p (NoFilter)     alpgen-jimmy   149950         6.5650 · 10−3   1
→ µ+µ−bb + 1p (NoFilter)     alpgen-jimmy   100000         2.4782 · 10−3   1
→ µ+µ−bb + 2p (NoFilter)     alpgen-jimmy   40000          8.8620 · 10−4   1
→ µ+µ−bb + 3p (NoFilter)     alpgen-jimmy   9999           3.9149 · 10−4   1
Z inclusive
→ e+e− + 0p (pT = 20 GeV)    alpgen-jimmy   6615302        6.6960 · 10−1   1
→ e+e− + 1p (pT = 20 GeV)    alpgen-jimmy   1333903        1.3452 · 10−1   1
→ e+e− + 2p (pT = 20 GeV)    alpgen-jimmy   404999         4.0706 · 10−2   1
→ e+e− + 3p (pT = 20 GeV)    alpgen-jimmy   110000         1.1262 · 10−2   1
→ e+e− + 4p (pT = 20 GeV)    alpgen-jimmy   30000          2.8447 · 10−3   1
→ e+e− + 5p (pT = 20 GeV)    alpgen-jimmy   10000          7.5691 · 10−4   1
→ µ+µ− + 0p (pT = 20 GeV)    alpgen-jimmy   6614248        6.6956 · 10−1   1
→ µ+µ− + 1p (pT = 20 GeV)    alpgen-jimmy   1334296        1.3455 · 10−1   1
→ µ+µ− + 2p (pT = 20 GeV)    alpgen-jimmy   403253         4.0642 · 10−2   1
→ µ+µ− + 3p (pT = 20 GeV)    alpgen-jimmy   110000         1.1279 · 10−2   1
→ µ+µ− + 4p (pT = 20 GeV)    alpgen-jimmy   30000          2.8370 · 10−3   1
→ µ+µ− + 5p (pT = 20 GeV)    alpgen-jimmy   10000          7.6123 · 10−4   1
→ τ+τ− + 0p (pT = 20 GeV)    alpgen-jimmy   10609203       6.6955 · 10−1   1
→ τ+τ− + 1p (pT = 20 GeV)    alpgen-jimmy   3332443        1.3466 · 10−1   1
→ τ+τ− + 2p (pT = 20 GeV)    alpgen-jimmy   1004847        4.0647 · 10−2   1
→ τ+τ− + 3p (pT = 20 GeV)    alpgen-jimmy   509847         1.1256 · 10−2   1
→ τ+τ− + 4p (pT = 20 GeV)    alpgen-jimmy   144999         2.8440 · 10−3   1
→ τ+τ− + 5p (pT = 20 GeV)    alpgen-jimmy   45000          7.5770 · 10−4   1

Table 3.2: Background samples along with their Monte Carlo generator, their total number of events, filter efficiencies and the LO cross section times branching ratio values at √s = 7 TeV, except for tt, which is NLO. Note that in this table l = e, µ, τ . Exclusive channels were used for the Zbb background, with the number of additional partons, “p”, listed above.
3.2 Pileup Reweighting
As discussed in Section 2.1, at the LHC protons will collide every 25 ns at a
design instantaneous luminosity of 10³⁴ cm−2 s−1. In each recorded event, apart
from the hard scattering interaction, on average 23 minimum bias proton-proton
interactions, varying according to a Poisson distribution, will be present. These
interactions contaminate the event of interest with additional charged tracks in
the inner detector and constitute a considerable background. The challenge for
ATLAS is then to understand which tracks and energy deposits to attribute to
which interaction. This phenomenon is called “pileup”. Pileup is distinct from
the “underlying event” in that it describes additional proton-proton interactions,
rather than additional parton interactions originating from the same
proton-proton collision.
On top of this so-called “in-time pileup”, which refers to the additional min-
imum bias collisions “piled up” in each bunch crossing, comes concern about
“out-of-time pileup”, which refers to events from successive bunch crossings.
Eventually, the LHC will carry 2808 proton bunches per orbit, making them
exceptionally close in space and time (just 25 ns apart). This is faster than the
read-out response of many of the ATLAS sub-detectors, making the detector
sensitive to events from many bunch crossings. As a result the out-of-time pile-
up occurs because the signal from the calorimeter cells is integrated over a time
window larger than the time spread between two proton collisions.
The actual LHC bunch structure, however, is more complicated than this. The
bunches are arranged in “trains” of varying length, partially dependent on the
spacing between individual bunches. In 2011, the LHC ran with 75 ns spacing
between bunches for ATLAS data periods B-D, then switched to 50 ns spacing
for periods thereafter. With the 50 ns spacing, trains extend up to 144 bunches
in length. In addition to the finite train length, there are also bunch-to-bunch
variations in intensity, so different bunches within a train have varying
luminosity. The evolution of the machine parameters over time results in variations of the
number of interactions occurring per bunch crossing and in the distance between
consecutive bunches. Figures 3.4(a) and 3.4(b) show the luminosity recorded
versus the average number of interactions per bunch crossing per group of pe-
riod and for each individual period.
The measured luminosity is multiplied by the inelastic cross section σ_pp^inelastic
to obtain 〈µ〉, the average number of interactions per bunch crossing, which is
typically used to characterize the average amount of pileup. Even though 〈µ〉
varies from bunch to bunch, details about previous bunch crossings are not
available for an event of interest, so out-of-time pileup must be estimated
on average. As 〈µ〉 comes from the luminosity measurement, it is available for
each luminosity block (LB), which is the smallest portion of data for which
the luminosity is determined. Within a LB, 〈µ〉 can either be averaged over
all bunches (〈µ〉|LB,BCID, where BCID is the bunch crossing ID) or calculated
separately for each bunch (〈µ〉|LB(BCID), averaged across the LB).
Usually, Monte Carlo samples are produced before or during a given data taking
period, so only a best guess of the data pileup conditions can be put into
the Monte Carlo. Thus, at the analysis level, there is the need to
Figure 3.4: (a) Integrated luminosity for data periods B-D, E-H, I-K and L-M versus the average number of interactions per bunch crossing. (b) Integrated luminosity per data period, B to M, versus the average number of interactions per bunch crossing. (c) Distribution of the average number of interactions per bunch crossing for MC11b and for periods B-D, E-H, I-K and L-M. Figures taken from [37].
reweight the Monte Carlo pileup conditions to those found in the recorded data.
In this analysis a reweighting dependent on the distribution 〈µ〉|LB,BCID is
applied to the Monte Carlo, and Figure 3.4(c) shows the average number of
interactions per bunch crossing for the different periods of the Monte Carlo
simulating the data periods. It has been verified that the reweighted Monte Carlo
correctly reproduces the different data periods.
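The reweighting described above amounts to a per-bin ratio of normalised ⟨µ⟩ distributions, applied event by event to the Monte Carlo. A minimal sketch of the idea, with toy Poisson distributions in place of the real ⟨µ⟩ histograms:

```python
# Sketch of pileup reweighting: each MC event receives a weight so that the
# simulated <mu> distribution matches the one in data. The histograms below
# are toy Poisson distributions, not the actual 2011 ones.
import numpy as np

bins = np.arange(0, 21)                        # <mu> bin edges 0..20
mc_mu   = np.random.default_rng(0).poisson(8.0,  100_000)
data_mu = np.random.default_rng(1).poisson(11.0, 100_000)

mc_hist,   _ = np.histogram(mc_mu,   bins=bins, density=True)
data_hist, _ = np.histogram(data_mu, bins=bins, density=True)

# Per-bin weight = data fraction / MC fraction (guarding against empty MC bins).
weights = np.divide(data_hist, mc_hist,
                    out=np.zeros_like(data_hist), where=mc_hist > 0)

# Weight assigned to each MC event according to its <mu> value.
event_weights = weights[np.clip(mc_mu, 0, len(weights) - 1)]
```

After reweighting, the weighted mean of the MC ⟨µ⟩ distribution approaches the data mean, which is the behaviour verified in the text for the real samples.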
3.3 Lepton Reconstruction and Identification
Lepton identification and reconstruction is of particular importance for the
H → 4l channel. Electron candidates consist of electromagnetic clusters to
which inner detector (ID) tracks are matched in a window between the clus-
ter position and the extrapolated track. The electron transverse energy (ET )
is computed from the cluster energy and the track direction at the interac-
tion point. The baseline electron identification in ATLAS relies on cuts using
variables that provide good separation between isolated electrons and jets, as
detailed in Section 2.2.9. These variables include calorimeter, tracker and
combined calorimeter/tracker information. Cuts on them can be applied
independently, and three new reference selections with increasing background
rejection power have been defined for Release 17 of the data, on which the
current analysis is based: loose++, medium++ and tight++. In general, the
++ menu offers a better balanced performance (efficiency/rejection) than
the standard menu (loose, medium, tight).
Shower shape variables of the first and second calorimeter layers, hadronic
leakage variables, track quality and the ∆η between the extrapolated track and the
cluster are used in the loose++ selection. The loose++ requirement thus adds
cuts with respect to the standard loose operating point, but applies them in a
looser way (standard loose cuts on the shower shape variables at the same values
as medium and tight). In particular, the variables used for the cuts are [52]:
• Shower shapes:
– el reta: the ratio in η of cell energy in 3× 7 versus 7× 7 cells (see
Table 2.5);
– rHad: the ratio between the transverse energy ET leakage in the
hadronic calorimeter and ET of the electromagnetic cluster;
– rHad1: the ratio between ET leakage in the first sampling of the
hadronic calorimeter and ET of the electromagnetic cluster;
– el weta2: the lateral width in a 3×5 window, defined as the energy-weighted
η variance of the cluster³;
– el wstot: the total shower width;
– el f1: the ratio between ET in the first sampling and the cluster
energy.
• Number of hits in the pixel detector and in the SCT: the number of pixel
hits plus the number of pixel outliers is required to be ≥ 1, while the
number of silicon hits (pixel hits plus SCT hits) plus the number of pixel
and SCT outliers is required to be ≥ 7.
• Loose track-cluster matching in η: el deltaeta1, the ∆η between the track
extrapolated to the first calorimeter sampling and the cluster, is required
to be less than 0.015.
• DEmaxs1: (el emaxs1 − el Emax2)/(el emaxs1 + el Emax2), where el emaxs1
is the maximum energy in the strips and el Emax2 is the second maximum
in the strips.
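These discriminants are simple ratios of calorimeter quantities; a hypothetical sketch (the function names mirror the variables above, but the input numbers are invented for illustration):

```python
# Hypothetical sketch of the ratio variables used in the loose++ selection.
# Input values are invented; names follow the variable list above.

def reta(e_3x7, e_7x7):
    """el_reta: cell energy in 3x7 cells over 7x7 cells (second layer)."""
    return e_3x7 / e_7x7

def rhad(et_had_leak, et_cluster):
    """rHad: hadronic-calorimeter ET leakage over the cluster ET."""
    return et_had_leak / et_cluster

def demaxs1(emax1, emax2):
    """DEmaxs1: (Emax1 - Emax2) / (Emax1 + Emax2) in the strip layer,
    with Emax1/Emax2 the highest and second-highest strip energies."""
    return (emax1 - emax2) / (emax1 + emax2)

# A genuine electron gives a narrow shower: reta close to 1, rhad close to 0
# and a single dominant strip maximum (demaxs1 close to 1).
print(reta(45.0, 48.0))    # ~0.94
print(rhad(0.3, 40.0))     # ~0.0075
print(demaxs1(20.0, 1.0))  # ~0.90
```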
The medium++ selection requires one B-layer hit (if the module is not dead) and
adds extra selections on the impact parameter of the matched track and on the
ratio of high-threshold TRT hits. The medium++ menu offers efficiencies a few
percent lower than those of medium, but with background rejections closer to
those of tight.
The tight++ selection adds requirements on E/p (ratio of the cluster energy
to the track momentum), on the ∆φ between the extrapolated track and the
cluster, and on the number of TRT hits and also checks for overlaps with re-
constructed photon conversions. No substantial gains are made in tight++ over
tight. Tight++ offers slightly better efficiency (1-2%) in most bins with slightly
better rejection.
Muons are identified, as described in Section 2.2.10, by the reconstruction of
tracks in the muon spectrometer alone (standalone), by the combined fit of
inner detector and muon spectrometer tracks (combined), by matching an inner
detector track of sufficient momentum with a reconstructed track segment in
the muon spectrometer (segment tagged), or with energy deposits in the
calorimeters compatible with the hypothesis of a minimum ionizing particle
(calorimeter tagged). Throughout this analysis, the loose++ electron selection
and the combined or segment tagged muon selection are used unless explicitly
stated otherwise.
³wη2 = √[ (∑i Ei · ηi²)/(∑i Ei) − ((∑i Ei · ηi)/(∑i Ei))² ].
3.3.1 GSF Electrons
In the standard e/γ reconstruction, all tracks in the inner detector are fitted
using the pion hypothesis; this means that the algorithm does not allow
for any energy loss along the track. As a result, the track momentum is
underestimated and the track parameters, especially in the bending plane, are
not optimal. In fact, electrons in ATLAS lose on average between 20% and
50% of their energy (depending on |η|) by the time they have left the SCT. The
bremsstrahlung emission introduces, in general, non-Gaussian contributions to
the event-by-event fluctuations of the calorimetry and tracking measurements.
By fitting electron tracks in such a way as to allow for proper modeling of the
energy loss due to bremsstrahlung, it is possible to improve the reconstructed
track parameters.
In this Higgs search analysis the Gaussian-sum filter (GSF) was used in order
to account for energy losses due to bremsstrahlung. The GSF is a non-linear
generalization of the Kalman filter⁴, which takes non-Gaussian noise into
account by modeling it as a weighted sum of Gaussian components, and therefore
acts as a weighted sum of Kalman filters operating in parallel. By allowing
for changes in the curvature of the track, the bremsstrahlung recovery
algorithm follows the track better and correctly associates more of the hits.
In this work, a dedicated algorithm (egammaBremRec) has been used in order to
re-process the electrons. This algorithm can only use the existing e/γ clusters
and the available track particles: it does not “recover” electrons but only
performs the “refit”, possibly changing the best match between track and
cluster and providing better track parameters; it is also expected to reduce
the charge misidentification rate.
3.4 Lepton Corrections
Various corrections, provided by the e/γ group for electrons and by the muon
combined performance (MCP) group for muons, are applied to the data and
Monte Carlo samples. In the following these corrections are only listed; for a
more detailed description see the documentation provided by the group experts.
• Energy scale corrections must be applied to the data only, and just for
electrons. The tool used corrects the electromagnetic cluster energy by
applying the energy scales obtained from resonances such as Z → ee and
J/ψ → ee, or from E/p studies using isolated electrons from W → eν.
The code is trivial and simply rescales the energy of the electromagnetic
cluster in certain η/φ bins using the formula Ecorr = E/(1 + scale).

⁴The Kalman filter is a recursive estimator: only the estimated state from the previous timestep and the current measurement are needed to compute the estimate for the current state.
• Since the Monte Carlo samples do not reproduce the lepton momentum
resolution in data, by default a smearing procedure is applied both to the
electron ET and to the muon pT in Monte Carlo.
In particular muon momentum smearing is performed separately on the
inner detector and muon spectrometer tracks of the muon. The momenta
of calorimeter tagged and MS-segment tagged muons are the momenta of
the associated inner detector tracks. Therefore the momentum smearing
of the inner detector momenta has to be applied to the calorimeter or seg-
ment tagged muons. The momenta of standalone muons are based on the
muon spectrometer momentum measurement. Hence the MS momentum
smearing must be applied to standalone muons. In the current analysis
the smearing correction is applied to the q/pT distribution, where q is the
charge of the lepton, instead of to the simple pT one.
• The reconstruction scale factor (SF) is the ratio of the measured
reconstruction efficiency in data to that in Monte Carlo and is used to correct
the Monte Carlo to better model the observed data. In general, this
tons. For muons, this is largely due to mismeasurements in the transition
region of the muon spectrometer, and for the electrons, this is largely due
to problems in electron identification. Therefore for electrons a recon-
struction SF, including track quality requirement, and an identification
efficiency SF are provided by the e/γ group. These two scale factors have
to be multiplied and the associated errors should be added quadratically.
It should be stressed that the energy corrections and smearing functions are
applied at the beginning of the analysis, while the scale factors are applied at the end.
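A schematic of the corrections and of the scale-factor combination described above (a sketch with invented numbers; the real corrections are provided by the e/γ and MCP group tools):

```python
# Sketch of the lepton corrections listed above: energy scale (data only),
# resolution smearing (MC only), and reco x ID scale factors (MC, at the end).
# All numerical values here are illustrative, not the official corrections.
import math
import random

def correct_electron_energy(e_data, scale):
    # Energy scale correction, applied to data only: E_corr = E / (1 + scale).
    return e_data / (1.0 + scale)

def smear_mc_energy(e_mc, resolution_fudge, rng):
    # MC-only Gaussian smearing so the simulated resolution matches data;
    # resolution_fudge is an invented relative width, not a real constant.
    return e_mc * (1.0 + rng.gauss(0.0, resolution_fudge))

def combined_scale_factor(sf_reco, err_reco, sf_id, err_id):
    # Reconstruction and identification SFs multiply; for a product, the
    # relative errors are added in quadrature.
    sf = sf_reco * sf_id
    err = sf * math.hypot(err_reco / sf_reco, err_id / sf_id)
    return sf, err

rng = random.Random(42)
e_corr = correct_electron_energy(100.0, 0.01)   # data: 100 GeV -> ~99 GeV
e_smeared = smear_mc_energy(100.0, 0.01, rng)   # MC: smeared around 100 GeV
sf, err = combined_scale_factor(0.995, 0.005, 0.98, 0.01)
print(e_corr, sf, err)
```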
3.5 Event Selection
Having described the applied corrections, the cut-based analysis strategy can
now be discussed; it consists of a sequence of cuts designed to select Higgs
events. The event selection criteria for this study can be divided into five parts:
1. the Good Run List (GRL), larError, vertex and trigger cuts application;
2. the event preselection, which includes the basic kinematic requirements
on leptons;
3. the creation of quadruplet candidates and the application of the mass
dependent criteria for the selection of a Higgs boson candidate, which are
based on the invariant mass of each of the two di-leptons from the two Z
boson decays;
4. the criteria for the rejection of reducible backgrounds, which rely on over-
lap removal and on isolation and impact parameter properties of the lep-
tons;
5. the Higgs boson mass reconstruction.
3.5.1 Preliminary Cuts
At the beginning of the analysis there are four important requirements that
the events must satisfy in order to survive:
• The ATLAS Data Quality group monitors the functionality of the detector
and the quality of the taken data online, i.e. while data taking, as well as
offline. The data are required to satisfy a number of conditions ensuring
that all essential elements of the ATLAS detector - detectors, magnets,
trigger, etc. - were performing as expected while the data were collected
during LHC collisions. If there were problems of any kind, the corre-
sponding detector sub-system would get flagged accordingly. An analysis
depending on the proper functionality of one or more particular parts of
the detector can implement a list of run periods which were flagged “good”
for these sub-systems: a GoodRunsList (GRL). In this specific analysis the
events must satisfy the 4l GRL, which is applied only to data and separately
for each final state (4e, 4µ, 2e2µ) to maximize the integrated luminosity.
• In electron and photon analyses a selection has to be applied to reject bad
quality clusters or fake clusters originating from calorimeter problems.
In particular, events with noise bursts and data integrity errors in the
LAr calorimeter can be identified with the larError flag: its value is 0
for “OK” events, 1 for events with a noise burst and 2 for events with
data integrity errors. For Release 17 it is recommended to remove all
events with larError > 1.
• The events must pass the vertex cut, which requires at least one
reconstructed vertex with at least three associated tracks.
• The events must pass the trigger cut: for the present study the lowest-pT
Single-lepton triggers
Period   B-I             J                  K                  L-M
4µ       EF_mu18_MG      EF_mu18_MG_medium  EF_mu18_MG_medium  EF_mu18_MG_medium
4e       EF_e20_medium   EF_e20_medium      EF_e22_medium      EF_e22vh_medium1
2e2µ     4µ OR 4e

Di-lepton triggers
Period   B-I             J                  K                  L-M
4µ       EF_2mu10_loose  EF_2mu10_loose     EF_2mu10_loose     EF_2mu10_loose
4e       EF_2e12_medium  EF_2e12_medium     EF_2e12T_medium    EF_2e12Tvh_medium
2e2µ     4µ OR 4e

Table 3.3: Triggers used in data. In each data taking period, the OR of single- and di-lepton triggers is used to select each signature.

MC trigger according to the data taking period
4µ     EF_mu18_MG, EF_mu18_MG_medium OR EF_2mu10_loose
4e     EF_e20_medium, EF_e22_medium, EF_e22_medium1 OR EF_2e12_medium, EF_2e12T_medium
2e2µ   4µ OR 4e

Table 3.4: Triggers used in the Monte Carlo samples.
single- or di-lepton unprescaled⁵ triggers are considered. Single-lepton
triggers with thresholds of 20 GeV or 22 GeV for electrons, depending on the
LHC instantaneous luminosity, and of 18 GeV for muons, together with di-lepton
triggers with thresholds of 12 GeV for electrons and 10 GeV for muons, were
chosen. The list of triggers used is provided in Table 3.3 for data and in
Table 3.4 for the Monte Carlo, where the trigger matches the one unprescaled
during data taking. The efficiency of these triggers on signal events, with
respect to the offline selection, is close to 100% [37].
3.5.2 Event Preselection
Events passing the trigger selection are required to satisfy additional lepton
preselection criteria [38].
An electron must have been reconstructed with author 1 or 3 (see Section 2.2.9)
and must be identified as a loose++ GSF electron, as discussed in Section 3.3.
The kinematic requirements are:
• a pseudorapidity of the electromagnetic cluster, including the crack region,
of |ηCluster| < 2.47.
⁵A prescale is a random selection of events accepted by the trigger, used in order to reduce the rate of a given trigger signature, usually to cope with the limited bandwidth of event recording.
• a transverse energy of ET > 7 GeV (ET is computed from the
electromagnetic cluster energy and the track direction);
In 2011 data and Monte Carlo, the quality of the electron object has to be
checked using the Object Quality flag: it is required that el GSF OQ & 1446 == 0.
In addition, the electrons must have a longitudinal impact parameter with
respect to the primary vertex (the z value at the point of closest approach,
z0, see Figure 3.5) of less than 10 mm, to reduce the contribution from pileup
vertices. After these cuts, among electrons sharing the same inner detector
track, only the one with the highest cluster ET is kept. Finally, electrons
sharing the same inner detector track with a muon candidate within
∆R = √(∆φ² + ∆η²) < 0.02 are removed.
The muons must be identified as tight muons⁶ for the Muid algorithm, or must
have been reconstructed with author 6 or 7 if the Staco algorithm is considered.
The muons are selected by requiring:
• a transverse impact parameter relative to the primary vertex, defined as
the reconstructed vertex with the highest ∑pT² of associated tracks among
the reconstructed vertices with at least three associated tracks, of less
than 1 mm, to reject cosmic rays (|d0| < 1 mm);
• a pseudorapidity of |η| < 2.7⁷;
• a transverse momentum of pT > 7 GeV.
In order to select “high η” muons, in the region 2.5 < |η| < 2.7, a muon
identified as standalone is required to have hits in all three stations.
Otherwise, if the muon is identified as combined or segment tagged, the
following inner detector hit requirements are applied:
• A pixel B-layer hit on the muon track, unless the extrapolated muon track
passes through an uninstrumented or dead area of the B-layer.
• The number of pixel hits plus the number of crossed dead pixel sensors
> 1.
• The number of SCT hits plus the number of crossed dead SCT sensors
≥ 6.
⁶The classification of loose and tight muons depends on the level of calorimeter and tracker isolation of the candidate. The isolation in the calorimeter is based on the cell energies in a hollow cone of 0.1 < ∆R < 0.4. The tracker isolation is defined as the scalar sum of the transverse momenta of all tracks in a cone of ∆R < 0.4 around the muon track. The energies for both calorimeter and tracker isolation are required to be less than 2.5 GeV (4 GeV) for tight (loose) muons.
⁷The η requirement has been removed completely for combined and segment-tagged muons. The acceptance is limited by the acceptance of the muon spectrometer and the inner detector.
Figure 3.5: An illustration of track parameters in the transverse (a) and longitudinal (b) planes, expressed with respect to the origin of the detector and the primary vertex. These parameters can also be expressed with respect to the point of closest approach to the interaction vertex (primary vertex) or to the beam-spot, indicated by the superscripts “PV” and “BS”, respectively. d0 is the transverse impact parameter, i.e. the distance of closest approach of the trajectory to the origin of the detector in the transverse (x − y) plane; the point of closest approach is referred to as the perigee. z0 is the longitudinal impact parameter, i.e. the z coordinate of the trajectory at the perigee. φ0 is the angle of the trajectory in the transverse plane at the perigee. The polar angle, θ, is the angle with respect to the z axis made by the trajectory.
• The number of pixel holes plus the number of SCT holes < 3.
• A successful TRT extension where expected (i.e. in the η acceptance of
the TRT). An unsuccessful extension corresponds to either no TRT hit
associated, or a set of TRT hits associated as outliers. Therefore defining
n = n_TRT^hits + n_TRT^outliers, where n_TRT^hits denotes the number of
TRT hits on the muon track and n_TRT^outliers denotes the number of TRT
outliers on the muon track, the technical recommendation is:
– if |η| < 1.9, it is required that n > 5 and n_TRT^outliers < 0.9 n;
– if |η| ≥ 1.9 and n > 5, it is required that n_TRT^outliers < 0.9 n.
Finally, these selected muons must fulfill the condition |z0| < 10 mm, a
conservative choice offering protection against discarding physics information
while keeping the pileup contribution as small as possible.
In the final stage of the event preselection described above, the event is
required to contain at least four selected leptons (4µ, 4e or 2e2µ).
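The hit requirements listed above can be expressed as a single predicate; a hypothetical sketch (the argument names are invented for illustration, not actual ntuple branches):

```python
# Hypothetical check of the inner-detector hit requirements listed above for
# combined / segment-tagged muons; argument names are invented.

def passes_id_hits(blayer_hits, blayer_dead, pix_hits, pix_dead,
                   sct_hits, sct_dead, pix_holes, sct_holes,
                   eta, n_trt_hits, n_trt_outliers):
    if blayer_hits < 1 and not blayer_dead:
        return False                 # B-layer hit unless the module is dead
    if pix_hits + pix_dead <= 1:
        return False                 # pixel hits + crossed dead sensors > 1
    if sct_hits + sct_dead < 6:
        return False                 # SCT hits + crossed dead sensors >= 6
    if pix_holes + sct_holes >= 3:
        return False                 # pixel holes + SCT holes < 3
    n = n_trt_hits + n_trt_outliers  # TRT extension where expected
    if abs(eta) < 1.9:
        return n > 5 and n_trt_outliers < 0.9 * n
    return (n_trt_outliers < 0.9 * n) if n > 5 else True

# A well-reconstructed central muon passes all requirements:
print(passes_id_hits(1, False, 3, 0, 8, 0, 0, 1, 0.5, 30, 2))  # True
```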
3.5.3 Quadruplet Candidates and Higgs Candidate Selection
After the selection cuts, the candidate quadruplets are formed by select-
ing two same flavour (SF) and opposite sign (OS) lepton pairs with at least
two of these leptons having pT > 20 GeV in order to suppress the reducible
backgrounds, because leptons from on-shell Z bosons are expected to have con-
siderable transverse momentum, while those from the reducible backgrounds are
usually softer. Within a quadruplet, the SFOS di-lepton pair with a mass m12
closest to the nominal Z-boson mass is considered the primary di-lepton, while
the second di-lepton pair of the quadruplet with a mass m34 is the sub-leading
one. The physical argument for this choice comes from the Breit-Wigner distri-
bution: the Z is more likely to be found near the pole mass and this method
selects the correct candidates in more than 90% of the cases. The analysis is
split into four final states, 2µ2µ, 2e2µ, 2µ2e and 2e2e, where the primary
di-lepton is mentioned first. For the 2e2µ and 2µ2e channels it is also required
that mee ≥ 15 GeV and mµµ ≥ 15 GeV.
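The pairing logic can be sketched as follows (a minimal illustration with massless four-vectors and invented kinematics; a real analysis uses the full reconstructed four-momenta):

```python
# Sketch of the di-lepton pairing described above: among all ways of splitting
# the four leptons into two same-flavour, opposite-sign (SFOS) pairs, the pair
# with invariant mass closest to the nominal Z mass defines m12; the other
# pair defines m34. Lepton kinematics below are invented for illustration.
import math
from itertools import combinations

M_Z = 91.1876  # GeV, nominal Z boson mass

def inv_mass(pair):
    e  = sum(l["E"]  for l in pair)
    px = sum(l["px"] for l in pair)
    py = sum(l["py"] for l in pair)
    pz = sum(l["pz"] for l in pair)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def sfos(l1, l2):
    return l1["flav"] == l2["flav"] and l1["q"] * l2["q"] < 0

def build_quadruplet(leps):
    """Return (m12, m34) with m12 from the SFOS pair closest to M_Z."""
    best = None
    for i, j in combinations(range(4), 2):
        k, m = [x for x in range(4) if x not in (i, j)]
        p12, p34 = (leps[i], leps[j]), (leps[k], leps[m])
        if not (sfos(*p12) and sfos(*p34)):
            continue
        m12, m34 = inv_mass(p12), inv_mass(p34)
        if best is None or abs(m12 - M_Z) < abs(best[0] - M_Z):
            best = (m12, m34)
    return best

def mu(e, px, py, pz, q):
    return {"E": e, "px": px, "py": py, "pz": pz, "flav": "mu", "q": q}

# A toy 4mu event: one on-shell pair (m = 91.2 GeV), one off-shell (m = 40 GeV).
leps = [mu(45.6, 0.0, 0.0,  45.6, +1), mu(45.6, 0.0, 0.0, -45.6, -1),
        mu(20.0,  20.0, 0.0, 0.0, +1), mu(20.0, -20.0, 0.0, 0.0, -1)]
m12, m34 = build_quadruplet(leps)
print(round(m12, 1), round(m34, 1))  # -> 91.2 40.0
```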
Physics analyses are usually interested in knowing whether a reconstructed
offline object caused the considered trigger chain to fire, i.e. whether it
matches a trigger object passing the trigger. It is therefore important to
determine the relationship between objects produced by the offline
reconstruction and objects produced by the trigger algorithms (L1, L2 and EF),
in order to easily map between reconstructed and trigger objects. Thus, before
the mass requirement on the leading di-lepton, it is necessary to check the
trigger matching: the Higgs candidate must match the event trigger; in
particular, at least one (for single-lepton triggers) or two (for di-lepton
triggers) of the leptons in the quadruplet are required to match the trigger object.
Once this check is passed, a mass window requirement is applied to the invariant mass of each of the two di-lepton pairs of the quadruplet. The cut values are chosen event-by-event using the reconstructed four-lepton invariant mass: m12 is required to be within 15 GeV of the nominal Z mass, while m34 is required to exceed a threshold, mthreshold, which varies as a function of the four-lepton invariant mass, m4l, and must always be below 115 GeV. A set of threshold cut values is shown in Table 3.5; the actual cut value used for any other reconstructed Higgs mass is obtained by linear interpolation between these mass points.
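The event-by-event threshold lookup can be sketched as a piecewise-linear interpolation between the reference points of Table 3.5 (a minimal illustration, not the actual ATLAS analysis code):

```python
import bisect

# Reference points from Table 3.5: m4l [GeV] -> m34 threshold [GeV]
M4L_PTS = [120.0, 130.0, 140.0, 150.0, 160.0, 165.0, 180.0, 190.0, 200.0]
THR_PTS = [15.0, 20.0, 25.0, 30.0, 30.0, 35.0, 40.0, 50.0, 60.0]

def m34_threshold(m4l):
    """Linearly interpolated m34 threshold; constant below 120 GeV and
    above 200 GeV, as in Table 3.5."""
    if m4l <= M4L_PTS[0]:
        return THR_PTS[0]
    if m4l >= M4L_PTS[-1]:
        return THR_PTS[-1]
    i = bisect.bisect_right(M4L_PTS, m4l)
    x0, x1, y0, y1 = M4L_PTS[i - 1], M4L_PTS[i], THR_PTS[i - 1], THR_PTS[i]
    return y0 + (y1 - y0) * (m4l - x0) / (x1 - x0)

def passes_m34_window(m34, m4l):
    """The sub-leading pair must satisfy m_threshold < m34 < 115 GeV."""
    return m34_threshold(m4l) < m34 < 115.0
```

For example, a candidate with m4l = 125 GeV gets a threshold of 17.5 GeV, halfway between the 120 GeV and 130 GeV reference points.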
Finally, the four leptons of the quadruplet are required to be well separated, min[∆R(li, lj)] > 0.10, a requirement that will be justified in Section 3.5.4. When more than one quadruplet is found, the one with the primary di-lepton mass closest to the nominal Z mass, and with the highest-pT leptons associated to the second Z,
m4l [GeV]         ≤ 120   130   140   150   160   165   180   190   ≥ 200
mthreshold [GeV]     15    20    25    30    30    35    40    50     60

Table 3.5: Summary of thresholds applied to m34 for reference values of m4l, the reconstructed four-lepton invariant mass. For other m4l values, the selection requirement is obtained via linear interpolation.
which corresponds to the highest off-shell Z mass, is chosen. In this way only
one lepton quadruplet is selected for each event.
3.5.4 Reducible Background Rejection
Reducible background processes require additional lepton criteria to further decrease their contributions, since their cross sections are larger than that of the Standard Model Higgs boson.
Further discrimination can be achieved since the leptons originating from Z boson decays are expected to be significantly more isolated than those originating from the leptonic decays of heavy quarks.
Leptons from Z boson decays are also expected to originate from the main
interaction point, while the leptons from b and c quarks should come from sec-
ondary displaced vertices. In the following, discriminators based on the above
discussed properties of the leptons will be studied in more detail. The discrimi-
nant variable cuts, adopted in the analysis, have been optimized by the ATLAS
Collaboration using the expected distributions for signal and backgrounds.
3.5.4.1 Lepton Isolation
In order to provide a strong suppression of the main reducible backgrounds,
first of all it has been required, as already seen, that the minimum value of ∆R
of leptons in the quadruplet satisfies: min[∆R(li, lj)] > 0.10; then calorimetric
and track-based isolation criteria have both been imposed on each muon and
electron8. Even though the two quantities are physically correlated, they carry
statistically uncorrelated information as they are measured in different parts
of the detector. Their combination can therefore improve the rejection of the
reducible backgrounds.
The track isolation discriminant is defined as the sum of the transverse mo-
menta of the inner detector tracks in a cone of radius ∆R < 0.20 around the
lepton, normalized to the lepton pT (ΣpT/pT). The summed tracks are of good quality and pass a minimum pT cut: the considered tracks have at least four silicon hits and pT > 1 GeV [37], so that no significant bias from pileup
8Although partial calorimetric isolation along the η direction is already part of the electron-id requirement, an extra calorimetric isolation is applied.
interactions is introduced into the track isolation estimate by these requirements. After having defined the requirements for inclusion of an inner
detector track in the isolation cone, it can be noted that in the Higgs analysis
each lepton is required to have a normalized track isolation less than 0.15. The
inner detector track corresponding to the lepton of interest is excluded from
the sum. Moreover, care is taken to exclude possible contributions to the lepton isolation variables originating from overlap with other leptons of the Higgs candidate quadruplet: the contribution of overlapping leptons is removed for ∆R < 0.20, i.e. the pT of the leptons entering this cone is subtracted from the isolation sum of the lepton of interest.
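The track-isolation computation described above can be sketched as follows; the dictionary field names are hypothetical, while the quality cuts are those quoted in the text (pT > 1 GeV, at least four silicon hits, cone ∆R < 0.20, overlap removal for the other quadruplet leptons):

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with delta-phi wrapped into (-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)

def track_isolation(lepton, tracks, quad_leptons, cone=0.20, cut=0.15):
    """Normalized track isolation: sum the pT of good ID tracks inside the
    cone, excluding the lepton's own track and the tracks of the other
    quadruplet leptons (overlap removal), then divide by the lepton pT.
    Returns (isolation, passes_cut)."""
    iso = 0.0
    for t in tracks:
        if t is lepton['track'] or any(t is l['track'] for l in quad_leptons):
            continue  # own track / overlap removal
        if t['pt'] < 1.0 or t['si_hits'] < 4:
            continue  # track quality: pT > 1 GeV, >= 4 silicon hits
        if delta_r(lepton['eta'], lepton['phi'], t['eta'], t['phi']) < cone:
            iso += t['pt']
    iso /= lepton['pt']
    return iso, iso < cut
```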
The calorimetric isolation is defined in a similar manner, summing the trans-
verse energy ET deposited in the calorimeter cells inside the isolation cone
around the lepton and normalizing to the muon pT or electron ET. The calorimeter cells which are expected to receive an energy deposit from the lepton itself (in the case of electrons, the cells containing the electromagnetic shower) are excluded from the sum. All leptons of the quadruplet must have a normalized calorimetric isolation, inside a cone of ∆R < 0.20, less than 0.30. The
contribution of overlapping leptons is removed for ∆R < 0.18. This strategy is
now implemented only for electrons. When multiple electrons are present in the
final state a performance degradation of the calorimetric isolation is observed
owing to events where the electromagnetic shower of an electron in the Higgs
quadruplet coincides with the isolation cone of another signal electron (or muon
in the 2e2µ channel). Thus, if the angular distance between two electrons (or
between a muon and an electron) in the Higgs candidate quadruplet is smaller
than 0.18, the ET of the former is subtracted from the total isolation energy of the latter (in the 2e2µ channel the latter is always the muon).
Besides, the lepton energy is corrected to remove pileup effects before using it
to compute the normalized calorimetric isolation contribution. Indeed at a lu-
minosity L > 1033 cm−2s−1, the pileup effect has to be considered. It is known
that the track isolation is not affected, whereas the calorimetric isolation has
been chosen in order to minimize the pileup effects [37]. The chosen cone size
of ∆R = 0.20, for both track and calorimetric isolation, is found to have the
optimum performance in the desired signal efficiency region and, in addition, it
is expected to have less effect from pileup with respect to wider cones.
In general, the EtconeXX calorimeter isolation variables are calculated by tak-
ing a simple sum of calorimeter cell energies inside of a cone of a certain radius
around the cluster barycenter (∆R < 0.XX), excluding a 5× 7 grid of cells in
(η, φ) in the center of the cone; in this analysis the Etcone20 variable is used.
There are at least two effects that modify this value in unwanted ways [53]:
• An electron will leak some of its energy outside of this central core, and
will cause the isolation energy to grow as a function of ET .
• Soft energy deposits from pileup interactions will change the isolation
energy depending on the amount of activity in the current event (in-time
pileup) as well as previous events (out-of-time pileup).
Typically, in-time pileup increases the energy of the EtconeXX variables, while
out-of-time pileup actually tends to decrease it, as is discussed below. Both in-
time pileup and out-of-time pileup tend to increase the width of the EtconeXX
distributions. In-time pileup is fairly straightforward in its effects: particles
from other interactions in the same bunch crossing leave energy deposits in
the calorimeter, and when these fall within the cone used for the EtconeXX
variable, the observed energy increases. The larger the cone size used, the more
additional energy is caught. More energy deposits contributing to the isolation
sum increase the spread and thus the measured width.
To parameterize in-time pileup, tracking-related variables are typically used, as
the tracks found should be almost entirely from the bunch crossing of interest,
in particular in this analysis the number of reconstructed primary vertices with
at least two tracks associated (NPV ) is used. This gives a direct handle on the
number of additional interactions in the current bunch crossing, but does not
provide any information on the “hardness” of those interactions. In events with
many interactions, the vertex reconstruction may suffer and vertices may be
merged, leading to an underestimation of the number of additional interactions.
For the out-of-time pileup a typical handle is the average number of interactions
per bunch crossing, 〈µ〉, which was described above. Studies, carried out by the
experts, show that the isolation energy slowly decreases as the out-of-time pileup
increases, because it cancels the in-time pileup, and the larger cone sizes exhibit
a larger dependence on out-of-time pileup. For events with high NPV and low
〈µ〉, the peak of the EtconeXX distribution is higher, because in-time pileup
wins out. Conversely, for low NPV and high 〈µ〉, EtconeXX is low as out-of-time
pileup wins out. Looking at the width, it increases with both NPV and 〈µ〉, consistent with expectations.
To make a simple, first-pass correction for pileup effects, the slopes of EtconeXX
versus NPV were taken from fits. A simple linear calorimetric energy correction
is then parameterized in NPV both for electrons and muons, because adding 〈µ〉
information improves it only marginally9:

EtconeXX NPV corrected = EtconeXX − slope · NPV    (3.1)
After the corrections, the width of the EtconeXX variable distribution is slightly
improved, and because energy is subtracted off, the peak of the distribution is
shifted to smaller values.
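The first-pass pileup correction of Eq. (3.1) amounts to a linear fit of EtconeXX versus NPV followed by a subtraction; a minimal sketch (assuming a plain least-squares fit, not the actual e/γ group fitting machinery):

```python
def fit_slope(npv_vals, etcone_vals):
    """Least-squares slope of EtconeXX versus NPV (the 1D fit in the text)."""
    n = len(npv_vals)
    mx = sum(npv_vals) / n
    my = sum(etcone_vals) / n
    num = sum((x - mx) * (y - my) for x, y in zip(npv_vals, etcone_vals))
    den = sum((x - mx) ** 2 for x in npv_vals)
    return num / den

def etcone_npv_corrected(etcone, n_pv, slope):
    """Eq. (3.1): EtconeXX_NPV-corrected = EtconeXX - slope * NPV."""
    return etcone - slope * n_pv
```

Subtracting the fitted linear term removes the average in-time-pileup contribution event by event, which shifts the peak of the distribution to lower values and slightly narrows it, as described above.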
It should be stressed that in this analysis no further correction for the electron energy leakage outside the cone is applied, because the same results were obtained using the leakage-corrected EtconeXX variables and the EtconeXX variables with no corrections applied [53].
3.5.4.2 Lepton Impact Parameter
As previously said, leptons from Zbb and tt are expected to originate from
displaced vertices, so they will have a larger impact parameter. The approach adopted to reject the reducible background is to require that the impact parameter significance, defined as the impact parameter of the lepton normalized to its measurement error (|d0|/σd0), of all four leptons in the event does not exceed a predefined value. The impact parameter is the distance of closest approach in the transverse plane, calculated with respect to
the event vertex fitted using a set of tracks reconstructed in the ID. This allows the removal of the effect of the transverse spread of the vertex position, which at the LHC is 15 µm.
by the uncertainties entering the impact parameter estimation, i.e. the intrinsic
impact parameter resolution (18 µm) and the uncertainty in the primary vertex
position (10 µm). However, it is more likely that the lowest-pT leptons originate from the decay of b quarks. For this reason the adopted approach is to apply the impact parameter significance cut only to the two lowest-pT leptons10 of the quadruplet for m4l < 190 GeV; the significance is required to be < 3.5 for muons and < 6 for electrons. The difference is explained by the emission of bremsstrahlung
photons that limits the accuracy on the electron track reconstruction, indeed,
for electrons, bremsstrahlung smears the impact parameter distribution, hence
reducing the discriminating power of this cut with respect to muons [54].
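The impact-parameter significance cut can be sketched as follows (hypothetical field names; the cut values and the m4l condition are those quoted in the text):

```python
def passes_ip_cut(quadruplet, m4l):
    """Impact-parameter significance cut, applied (for m4l < 190 GeV) only
    to the two lowest-pT leptons of the quadruplet:
    |d0|/sigma_d0 < 3.5 for muons, < 6 for electrons."""
    if m4l >= 190.0:
        return True  # no requirement applied at high mass
    soft = sorted(quadruplet, key=lambda l: l['pt'])[:2]
    cut = {'mu': 3.5, 'e': 6.0}
    return all(abs(l['d0']) / l['sigma_d0'] < cut[l['flav']] for l in soft)
```

The looser electron cut reflects the bremsstrahlung smearing of the electron impact-parameter distribution discussed above.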
Both for electrons and muons the efficiencies for the isolation cut and for
9 2D parameterized corrections, using NPV and 〈µ〉, were also considered by the e/γ group experts: EtconeXX NPV corrected = EtconeXX − p1 · NPV − p2 · µ or EtconeXX NPV corrected = EtconeXX − (p1 + p2 · µ) · NPV. For both of these, however, p2 was found to be at least an order of magnitude smaller than p1, so the 1D correction based on NPV was adopted.
10 pT is considered for muons and ET for electrons.
the impact parameter cut are above 99% [37]. These efficiencies decrease by a few percent when moving to lower-ET electrons.
3.5.5 Higgs Boson Mass Reconstruction
The final discriminating variable is the mass of the lepton quadruplet. The four-lepton Higgs boson candidate mass reconstruction proceeds after a single lepton quadruplet has been selected in the event. For low Higgs masses, below 220 GeV, the mass resolution of the Higgs candidates directly affects the sensitivity of the Higgs searches. The resolution of the di-lepton mass can be improved by
applying a Z mass constraint to the pair with a mass closest to the Z invariant
mass. For Higgs boson masses of 200 GeV and above, when both Zs are on-
shell, the Z mass constraint can be applied to both lepton pairs. However for
Higgs masses larger than 230 GeV the Higgs natural width dominates over the
detector resolution and as a result the improvement in the Higgs mass resolution
is less important for the discovery potential. One additional advantage offered
by the Z mass constraint is the reduced sensitivity of the obtained resolution
on mis-calibrations and mis-alignments of the detector.
In the current application of the constraint fit, two particles are fitted using a
single constraint:
m²ll − m²Z = 0    (3.2)
Nevertheless, in the case of the Z mass constraint there is an additional complication due to the intrinsic width of the Z boson, which is of similar magnitude to the di-lepton mass resolution. Given the Breit-Wigner distribution of the Z boson, BW(m; mZ, ΓZ), neglecting interference effects, and the di-lepton mass resolution, described by a Gaussian distribution G(mReco; m, σm) centered at the generated di-lepton mass m, with σm equal to the experimental resolution (∼ 1.7 GeV), the observed reconstructed distribution for the di-lepton mass is given by the convolution of the two distributions
f(mReco) = ∫0^∞ BW(m; mZ, ΓZ) · G(mReco; m, σm) dm.    (3.3)
For a given reconstructed di-lepton mass the most likely mass at generation
level can be estimated by maximizing the quantity
L(m;mReco, σm,mZ ,ΓZ) = BW (m;mZ ,ΓZ) ·G(mReco;m,σm), (3.4)
and this most likely mass is the one used as the constraint in the fit. This mass, divided by the reconstructed di-lepton mass, provides a scale factor used
to rescale the lepton momenta, and these rescaled momenta are used to compute the Higgs mass. Finally, the Higgs candidate is accepted if the invariant mass of the quadruplet falls within ±2σ of the input mass.
It is noted that at low masses, due to phase-space suppression, the Z line shape could exhibit a stronger tail towards lower masses than predicted by the Breit-Wigner distribution; however, this effect should not significantly affect these results, since the leading di-lepton is required to be within 15 GeV of the Z mass. This procedure does not introduce significant biases in the mean mass. When electrons are included, the distribution has a significant tail towards low values due to bremsstrahlung upstream of the calorimeter. In the case of muons there is a small component, probably due to final-state radiation from Z decays; therefore in the 4µ channel the bias introduced by the Z mass constraint is negligible. In general the shift of the mean reconstructed mass as a function of the input value is well below the percent level. Moreover, it has been shown that the Z mass constraint improves the mass resolution by 10% to 17% [18].
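The per-pair constraint of Eqs. (3.2)-(3.4) can be sketched as a one-dimensional likelihood scan; this is an illustration assuming a non-relativistic Breit-Wigner kernel and a simple grid maximization, not the actual constrained kinematic fit used in the analysis:

```python
import math

MZ, GZ = 91.1876, 2.4952  # Z pole mass and width [GeV] (PDG values)

def most_likely_mass(m_reco, sigma_m=1.7, lo=50.0, hi=130.0, n=4000):
    """Maximize L(m) = BW(m; MZ, GZ) * Gauss(m_reco; m, sigma_m), Eq. (3.4),
    by a simple grid scan (non-relativistic BW used for illustration)."""
    def logl(m):
        log_bw = -math.log((m - MZ) ** 2 + (GZ / 2.0) ** 2)
        log_g = -0.5 * ((m_reco - m) / sigma_m) ** 2
        return log_bw + log_g
    grid = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return max(grid, key=logl)

def constrained_momenta(lepton_p4s, m_reco, sigma_m=1.7):
    """Rescale the di-lepton four-momenta by m_fit / m_reco, as in the text."""
    scale = most_likely_mass(m_reco, sigma_m) / m_reco
    return [tuple(scale * c for c in p4) for p4 in lepton_p4s]
```

The fitted mass is pulled from the reconstructed value towards the Z pole by an amount governed by the ratio of σm to ΓZ, which is precisely why the constraint sharpens the four-lepton mass peak.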
The full set of cuts performed in this analysis is summarized in Table 3.6.
3.6 Background Estimation
The dominant ZZ(∗) background is estimated using MC simulation. Gen-
erated events are required to pass the complete analysis selection and the final
yield is normalized to its theoretical cross section and to the integrated lumi-
nosity.
For the Z+jets and tt processes data-driven methods are used. The estimation
of the dominant reducible backgrounds can be extracted by selecting appropri-
ate control regions where no signal is expected.
The control sample is formed by selecting events with a pair of same-flavour,
opposite-sign isolated leptons consistent with the Z boson mass, |mZ−m12| < 15
GeV, and a second same-flavour, opposite-sign lepton pair where kinematic,
but no isolation or impact parameter, requirements are applied. The Zbb back-
ground dominates the Z + µµ sample, and the Z + light jets background dom-
inates in the Z + ee sample. The heavy flavour contribution in the Z + µµ
final state is estimated by subtracting from data the light jet component ob-
tained from measurements of the rate at which other particles are misidentified
as muons. The Z + light jets contribution in the Z + ee final state is estimated
by extrapolation, using MC simulation, from a background-dominated region
defined by inverting the electron identification requirement on the transverse
Preliminary cuts (4l):
– GRL and larError ≠ 2
– Vertex cut
– Trigger cut

Event Preselection
Muons:
– Combined or segment-tagged
– author == 6 or 7 for Staco; tight for Muid
– |d0| < 1 mm
– |η| < 2.7
– pT > 7 GeV
– ID hits requirements
– |z0| < 10 mm
GSF electrons:
– author == 1 or 3
– Loose++ quality
– |ηCluster| < 2.47
– ET > 7 GeV
– Object Quality requirement ((el GSF OQ & 1446) == 0)
– |z0| < 10 mm
– e − e overlap removal
– e − µ overlap removal

Event Selection
Kinematic cuts:
– At least 4 leptons
– At least one quadruplet of 2 pairs of SFOS leptons fulfilling the following requirements
– At least 2 leptons with pT > 20 GeV
– Trigger matching
– |mZ − m12| < 15 GeV
– mthreshold < m34 < 115 GeV as in Table 3.5
Isolation cuts:
– min[∆R(li, lj)] > 0.10 for all leptons in the quadruplet
– Track isolation: pT[cone20]/pT < 0.15, overlap removal for ∆R < 0.20
– Calo isolation: µ: ET[cone20]/pT < 0.30; e: ET[cone20]/ET < 0.30, overlap removal for ∆R < 0.18
IP cuts:
– Apply IP cut to the 2 lowest-pT leptons
– For µ: |d0|/σd0 < 3.5
– For e: |d0|/σd0 < 6
– For m4l > 190 GeV no requirement applied

Table 3.6: Summary of the event selection requirements for the analysis (SFOS pairs means pairs of same flavour opposite sign and “l” stands for lepton).
shower shape of the electromagnetic energy deposit11. Then these data-driven
backgrounds are extrapolated to the signal region by applying the efficiencies
found in MC simulation.
The normalization of the tt background, which also contributes substantially in
the Z + µµ final state, is verified using a control region of events containing
an opposite-sign electron-muon pair consistent with the Z boson mass and two
additional same-flavour leptons. In this control sample the observed events are
then compared to the tt expectation from Monte Carlo. The expected num-
bers of background events, with their systematic uncertainty, obtained by the
ATLAS Collaboration studies are summarized in Table 3.7 [37, 55].
3.7 Results
With the set of cuts described above, the reducible backgrounds are kept under control as desired: the contribution of inclusive Z, Zbb, and tt is well below the irreducible ZZ component, and the survival rate of WZ and gg2ZZ events is negligible. The signal is affected by the selection to a similar extent as the ZZ background, except for the Higgs mass window cut.
In 2011, as already said, LHC delivered an integrated luminosity of 5.6 fb−1
of pp collisions at 7 TeV center of mass energy. This outstanding performance
enabled the ATLAS experiment to collect and analyse an integrated luminosity
corresponding to 4.9 fb−1 of data fulfilling all quality requirements to search for
the Standard Model Higgs boson.
The search in the channel H → ZZ(∗) → 4l, here discussed, has been per-
formed for mH hypotheses in the full 110 GeV to 600 GeV mass range using
data corresponding to an integrated luminosity of 4.8 fb−1. The number of
events observed in each final state, evaluated separately for m4l < 180 GeV and
m4l ≥ 180 GeV, are compared with the expectations for background and signal
for various mH hypotheses by the ATLAS Collaboration studies. Their values
are summarized in Table 3.7. In total 71 candidate events are selected by the
analysis: 24 4µ, 30 2e2µ , and 17 4e events, while in the same mass range 62±9
events are expected from the background processes: 18.6 ± 2.8 4µ, 29.7 ± 4.5
2e2µ and 13.4± 2.0 4e. Figure 3.6 shows the m4l spectrum with superimposed
the total expected background and the Higgs signal expected from three mass
hypotheses.
The discovery of a SM Higgs boson can be claimed once the signal is considered
statistically significant, which means that it is unlikely to be reproduced by a
11 Rη, defined in Table 2.5, is required to be < 0.7, in order to reduce the contributions to Z + ee, in which the additional electrons can originate from photons, e.g. from π0 decays, or from heavy-quark mesons decaying semi-leptonically.
                        4µ                    2e2µ                   4e
                 Low-m4l   High-m4l    Low-m4l   High-m4l    Low-m4l   High-m4l
Int. Luminosity       4.8 fb−1              4.8 fb−1              4.9 fb−1
ZZ(∗)            2.1±0.3    16.3±2.4   2.8±0.6    25.2±3.8   1.2±0.3    10.4±1.5
Z+jets and tt    0.16±0.06  0.02±0.1   1.4±0.5    0.17±0.08  1.6±0.7    0.18±0.08
Total Background 2.2±0.3    16.3±2.4   4.3±0.8    25.4±3.8   2.8±0.8    10.6±1.5
Data             3          21         3          27         2          15
mH = 130 GeV          1.00±0.17             1.22±0.21             0.43±0.08
mH = 150 GeV          2.1±0.4               2.9±0.4               1.12±0.18
mH = 200 GeV          4.9±0.7               7.7±1.0               3.1±0.4
mH = 400 GeV          2.0±0.3               3.3±0.5               1.49±0.21
mH = 600 GeV          0.34±0.04             0.62±0.10             0.30±0.06

Table 3.7: The expected numbers of background events, with their systematic uncertainty, separated into “Low-m4l” (m4l < 180 GeV) and “High-m4l” (m4l ≥ 180 GeV) regions, compared to the observed numbers of events. The expectations for a Higgs boson signal for five different mH values are also given [55].
mere fluctuation of the background. Upper limits are set on the Higgs boson
production cross section at 95% C.L., using the CLs modified frequentist for-
malism with the profile likelihood test statistic. The test statistic is evaluated
with a maximum likelihood fit of signal and background models to the observed
m4l distribution. Figure 3.7(a) shows the expected and observed 95% C.L. cross
section upper limits calculated by the ATLAS Collaboration using ensembles of
simulated pseudo-experiments as a function of mH . The SM Higgs boson is ex-
cluded at 95% C.L. in the mass ranges 134 GeV−156 GeV, 182 GeV−233 GeV,
256 GeV−265 GeV and 268 GeV−415 GeV. The expected exclusion ranges are
136 GeV−157 GeV and 184 GeV−400 GeV [55].
The significance of an excess is given by the p0-value, the probability of an upward fluctuation of the background as high as or higher than the excess observed in data. The consistency of the observed results with the background-only hypothesis, expressed as p0-values, is shown in Figure 3.7(b) over the full
mass range of the analysis. The most significant upward deviations from the
background-only hypothesis are observed for mH = 125 GeV with a local p0-
value of 1.6% (2.1σ), mH = 244 GeV with a local p0-value of 1.3% (2.2σ) and
mH = 500 GeV with a p0-value of 1.8% (2.1σ). The median expected local
p0-values in the presence of a SM Higgs boson are 10.6% (1.3σ), 0.14% (3.0σ)
and 7.1% (1.5σ) for mH = 125 GeV, 244 GeV and 500 GeV, respectively.
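The conversion between a local p0-value and a significance in units of σ, used throughout this section, is the one-sided Gaussian-tail relation p0 = 1 − Φ(Z); a minimal sketch:

```python
from statistics import NormalDist

def p0_to_significance(p0):
    """Local significance Z such that p0 = 1 - Phi(Z) (one-sided tail)."""
    return NormalDist().inv_cdf(1.0 - p0)

def significance_to_p0(z):
    """Inverse mapping: one-sided tail probability for significance z."""
    return 1.0 - NormalDist().cdf(z)
```

For instance, p0 = 1.6% corresponds to about 2.1σ, matching the value quoted above for the 125 GeV excess.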
These values do not account for the so-called “look-elsewhere effect” (LEE),
which takes into account that such an excess (or a larger one) can appear any-
where in the search range as a result of an upward fluctuation of the background.
When considering the complete mass range of this search, the global p0-value for
each of the three excesses becomes of O(50%). Thus, once the look-elsewhere
effect is considered, none of the observed local excesses is significant by itself
[55].
Figure 3.6: m4l distribution of the selected candidates, compared to the background expectation for the full mass range of the analysis (√s = 7 TeV, ∫L dt = 4.8 fb−1). The signal expectation for mH = 150, 180 and 360 GeV is also shown. The resolution of the reconstructed Higgs mass is dominated by the detector resolution at low mH values and by the natural Higgs boson width at high mH.
3.7.1 Combined Results
It is also important to consider a preliminary combination of Standard Model
Higgs searches with the ATLAS experiment, using datasets corresponding to
integrated luminosities from 4.6 fb−1 to 4.9 fb−1 of pp collisions collected at √s = 7 TeV, presented in [56]. The individual channels contributing to the combination are reported in Table 3.8, while the combined exclusion plot is shown in Figure 3.8. The Higgs boson mass ranges from 110.0 GeV to 117.5 GeV, from 118.5 GeV to 122.5 GeV and from 129 GeV to 539 GeV are excluded at the 95% confidence level, while the expected Higgs boson mass exclusion in the absence of a signal ranges from 120 GeV to 555 GeV at the 95% C.L. or higher. An
exclusion of the Standard Model Higgs boson production cross section at the
99% C.L. is reached in the regions between 130 GeV and 486 GeV. An excess
of events is observed in the H → γγ and H → ZZ(∗) → 4l channels for a Higgs
boson mass hypothesis close to mH ∼ 126 GeV. The expected sensitivity in
terms of local significance for a 126 GeV Higgs boson for both of these channels
is approximately 1.4σ, while the observed local significances of the individual
excesses are 2.8σ and 2.1σ, respectively. The combined local significance of
the observed excess is 2.5σ, where the expected significance in the presence of
a Standard Model Higgs boson with mH = 126 GeV is 2.9σ. A preliminary
estimate of the global probability for such an excess to occur anywhere in the
full explored Higgs mass domain (from 110 GeV to 600 GeV) is approximately
30%, and in the range not excluded at the 99% confidence level by the LHC
combined Higgs boson search results (from 110 GeV to 146 GeV) it amounts to
approximately 10%.
Figure 3.7: Figure (a): the expected (dashed) and observed (full line) 95% C.L. upper limits on the Standard Model Higgs boson production cross section as a function of mH, divided by the expected SM Higgs boson cross section. The dark (green) and light (yellow) bands indicate the expected limits with ±1σ and ±2σ fluctuations, respectively. Figure (b): the observed local p0, the probability that the background fluctuates to the observed number of events or higher, is shown as the solid line. The dashed curve shows the expected median local p0 for the signal hypothesis when tested at mH. The two horizontal dashed lines indicate the p0 values corresponding to local significances of 2σ and 3σ. Figures are taken from [55].
Higgs Decay    Subsequent Decay    mH range [GeV]    L [fb−1]
H → γγ         −                   110−150           4.9
H → ZZ         ll l′l′             110−600           4.8
               llνν                200−280−600       4.7
               llqq                200−300−600       4.7
H → WW         lνlν                110−300−600       4.7
               lνqq′               300−600           4.7
H → τ+τ−       ll 4ν               110−150           4.7
               l τhad 3ν           110−150           4.7
               τhadτhad 2ν         110−150           4.7
V H → bb       Z → νν              110−130           4.6
               W → lν              110−130           4.7
               Z → ll              110−130           4.7

Table 3.8: Summary of the individual channels contributing to the combination of the Standard Model Higgs searches. The central number in the three-part mass ranges indicates the transition from low-mH to high-mH optimised event selections [56].
Figure 3.8: The observed (full line) and expected (dashed line) 95% C.L. combined upper limits on the SM Higgs boson production cross section divided by the Standard Model expectation as a function of mH in the full mass range considered in this analysis. The dotted curves show the median expected limit in the absence of a signal and the green and yellow bands indicate the corresponding 68% and 95% intervals [56].
Chapter 4

Angular Analysis and TMVA
The “golden channel” for the Higgs boson search is characterized by a good separation between signal and background, thanks to the good lepton identification and momentum resolution of the ATLAS detector. The traditional
search strategy using the golden channel focuses on measuring the invariant
mass spectrum of the four leptons. However, given that four-momenta of all
decay products can be reconstructed with sufficient resolution, it is possible to
measure more than just the total invariant mass of the four leptons. In fact,
there are a total of five angles that can be measured. Obviously it would be
advantageous to incorporate all available kinematic information when searching
for the Higgs boson. Additional kinematic variables can be included in an ex-
perimental measurement by multivariate analyses.
Angular correlations of Higgs decays in the golden channel have been studied
previously to determine the spin and CP properties of the putative Higgs reso-
nance. Recent works include the computation of the angular correlations of the
final state leptons resulting from the production of a resonance (with arbitrary
spin less than or equal to two) which in turn decays, via general couplings, to a
pair of Z bosons, which subsequently decay leptonically. The present work, in-
stead of comparing angular correlations for different spin and CP assumptions
for a singly produced resonance, aims at distinguishing the SM Higgs boson
signal from the dominant irreducible background qq → ZZ(∗) → 4l using the
TMVA, a Toolkit for Multivariate Data Analysis. This can be done along two different lines: first, since the Higgs is a scalar while the ZZ background is a mixture of different spin states, the decay angles of the two Zs and of the leptons are expected to have different distributions; second, due to the different production mechanisms of the Higgs boson and of the ZZ background, the transverse momentum pT of the four-lepton system
is expected to be different in the two cases.
In this work the possibility of using a similar set of variables to study the discrimination capability of ATLAS with angular variables in the four-lepton analysis has been considered. In the following, the angular distributions based on MC samples of signal and backgrounds are presented. Two Higgs masses (130 GeV and 360 GeV) have been considered, and the analysis has been limited to the ZZ background, neglecting the smaller contribution of other surviving background events. For the ZZ background, a low mass region (110 GeV < m4l < 150 GeV) has been considered as background for the 130 GeV Higgs signal, and a high mass region (300 GeV < m4l < 420 GeV) as background for the 360 GeV Higgs signal. In the following, after a definition of the angles and a description of how to compute them, the MC samples used are mentioned and the discrimination power using decay or production angles is discussed for reconstructed events. The same discrimination power is then evaluated exploiting multivariate analysis.
4.1 Angular Analysis
4.1.1 Kinematics
As noted above, in this study events have been considered in which two Z bosons are produced from the decay of a SM Higgs boson, produced either in the gluon fusion channel or in the vector boson fusion channel. Each Z boson, which can be either on or off the mass shell, decays to a lepton (l) and an anti-lepton (l̄). Events with additional particles in the final state are not considered; thus
the transverse momentum of the 4l system is assumed to be negligible. In other
words, only exclusive ZZ(∗) → 4l processes are considered.
In these events, the final state can be completely reconstructed. In general the kinematics can be specified in terms of two production angles of the ZZ(∗) system, one of which is irrelevant; four decay angles describing ZZ(∗) → 4l; and the invariant masses of the two Zs. In hadron colliders it is also necessary to know the momentum fractions of the initial massless partons, x1 and x2, in order to compute the differential cross sections. The following section describes the convention for the angles which specify an event and how to obtain them. In particular, Lorentz-invariant definitions of all angles are provided, allowing for their determination from four-momenta reconstructed in the laboratory (Lab) frame [57].
Figure 4.1: Figure (a): the two decay planes of Zi → li l̄i, i = 1, 2. The polar angles θi shown are defined in the rest frames of Zi with respect to k̂i, while the azimuthal angles shown are in fact 2π − φ1 = −φ1 and π − φ2. Figure (b): the coordinate system in the CM frame and the definition of the production angle Θ.
4.1.1.1 Definitions of Angles
Let p1 and p2 be the momenta of the lepton pair coming from Z1, and p3 and p4 the momenta of the lepton pair from Z2, while k1,2 are the momenta of Z1,2. The notation used is such that p1 = l1, p2 = l̄1, p3 = l2, p4 = l̄2, i.e. p1 is the momentum of the lepton from the Z1 decay, p2 the momentum of the anti-lepton from the Z1 decay, etc. The momenta of the incoming partons are denoted by kq and k_q̄. The total momentum of the ZZ(∗) system is therefore P = kq + k_q̄ = k1 + k2 = p1 + p2 + p3 + p4, which satisfies P² = ŝ ≡ M². For Higgs production in the gluon fusion channel, the incoming partons are self-conjugate, kq = kgluon,1, k_q̄ = kgluon,2, and the total momentum P is the Higgs momentum.
As shown in Figure 4.1, the coordinate system in the centre-of-mass (CM) frame of the two-Z system is chosen as:

\hat{z}_{CM} = \hat{k}_1, \qquad \hat{y}_{CM} = \frac{\hat{k}_q \times \hat{k}_1}{|\hat{k}_q \times \hat{k}_1|}, \qquad \hat{x}_{CM} = \hat{y}_{CM} \times \hat{z}_{CM} = \frac{-\hat{k}_q + \hat{k}_1(\hat{k}_q \cdot \hat{k}_1)}{|\hat{k}_q \times \hat{k}_1|}. \qquad (4.1)
Furthermore, the Z1 frame is defined as the rest frame of the Z1 boson, reached by boosting the CM frame along k1, while the Z2 frame is obtained by first rotating the CM frame by π about yCM and then boosting along k2. The production angle Θ and the decay angles {θ1, θ2, φ1, φ2} are defined as follows:
• Θ is the polar angle of the momentum of the incoming quark in the CM frame, i.e. the angle between the beam direction and the direction of the two decaying Zs in the Higgs rest frame;
• θ1,2 is the polar angle of the momentum of the negative lepton (l1,2) in the Z1,2 frame, i.e. the angle between the lepton direction and the Z flight direction in the two-Z centre of mass;
• φ1,2 is the azimuthal angle of the negative lepton (l1,2) in the Z1,2 frame.
The azimuthal production angle is irrelevant and chosen to be zero. In these
definitions, the three-momenta of l1,2 in the Z1,2 frames can be written as

\vec{p}_{l_i}\big|_{Z_i\,\mathrm{frame}} = |\vec{p}_{l_i}|\,(\sin\theta_i\cos\phi_i,\ \sin\theta_i\sin\phi_i,\ \cos\theta_i), \quad i = 1, 2, \qquad (4.2)

therefore

\cos\theta_i = \frac{\vec{p}_{l_i} \cdot \vec{k}_i}{|\vec{p}_{l_i}|\,|\vec{k}_i|}, \qquad (4.3)

where \vec{k}_i is the Z three-momentum in the two-Z centre of mass, whereas the three-momentum of the incoming parton in the CM frame is

\vec{k}_q\big|_{CM\,\mathrm{frame}} = |\vec{k}_q|\,(-\sin\Theta,\ 0,\ \cos\Theta). \qquad (4.4)
In hadron colliders the CM frame of the two-Z system differs from the Lab frame, and the event as a whole is boosted along the beam axis with respect to the Lab frame, P = (P⁰, 0, 0, P^z). Also, the coordinate system in the CM frame has been chosen such that the z axis is defined by the Z1 three-momentum, rather than by the three-momentum of the incident partons, as would be natural in the Lab frame.
The total energy and momentum of the event P in the Lab frame can be used to determine the momentum fractions of the incident partons: kq = x1(Ecm, 0, 0, Ecm) and k_q̄ = x2(Ecm, 0, 0, −Ecm), where Ecm = √s/2, with √s the centre-of-mass energy of the colliding protons. From P = kq + k_q̄ it follows that ŝ = x1x2s and

k_q = \tfrac{1}{2}\,(P^0 + P^z,\ 0,\ 0,\ P^0 + P^z), \qquad (4.5)
k_{\bar q} = \tfrac{1}{2}\,(P^0 - P^z,\ 0,\ 0,\ P^z - P^0), \qquad (4.6)

which are valid in the Lab frame.
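The relations above can be checked numerically. The following sketch (hypothetical helper names, not code from this analysis) reconstructs the parton momenta and the momentum fractions from the total four-momentum P in the Lab frame, and uses ŝ = x1x2s as a consistency check against P² = M²:

```python
import math

def parton_kinematics(P, E_cm):
    """Given the 4l-system four-momentum P = (E, px, py, pz) in the Lab
    frame and the beam energy E_cm = sqrt(s)/2, return the parton
    four-momenta k_q, k_qbar (Eqs. 4.5-4.6) and the fractions x1, x2."""
    E, _, _, pz = P
    kq    = (0.5 * (E + pz), 0.0, 0.0,  0.5 * (E + pz))   # Eq. (4.5)
    kqbar = (0.5 * (E - pz), 0.0, 0.0, -0.5 * (E - pz))   # Eq. (4.6)
    x1 = kq[0] / E_cm        # from k_q    = x1 (E_cm, 0, 0,  E_cm)
    x2 = kqbar[0] / E_cm     # from k_qbar = x2 (E_cm, 0, 0, -E_cm)
    return kq, kqbar, x1, x2
```

For a resonance of mass M the product x1·x2·s should reproduce M², independently of the longitudinal boost of the event.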
4.1.1.2 Lorentz-Invariant Construction of Angles
In the CM frame, \vec{k}_1 and \vec{k}_2 are back to back and of equal magnitude, as are \vec{k}_q and \vec{k}_{\bar q}. Using P = M(1, 0, 0, 0), the energy and three-momentum of the incoming partons can be expressed as

E_q = E_{\bar q} = |\vec{k}_q| = |\vec{k}_{\bar q}| = \frac{\sqrt{\hat s}}{2}, \qquad (4.7)
as well as those of the two Zs,

E_i = \frac{P \cdot k_i}{M}, \qquad |\vec{k}_i| = \sqrt{\left(\frac{P \cdot k_1}{M}\right)^2 - m_{12}^2} \equiv \lambda_Z, \quad i = 1, 2, \qquad (4.8)

where m_{ij}^2 = (p_i + p_j)^2 = 2\,p_i \cdot p_j. Equivalently, \lambda_Z = \sqrt{(P \cdot k_2/M)^2 - m_{34}^2}. Since \cos\Theta = \hat{k}_q \cdot \hat{k}_1, by computing k_q \cdot k_1 it is simple to derive

\cos\Theta = \frac{-k_q \cdot k_1 + E_q E_1}{|\vec{k}_q|\,|\vec{k}_1|} = \frac{(k_{\bar q} - k_q) \cdot k_1}{M \lambda_Z}. \qquad (4.9)
By definition cosΘ changes sign under kq ↔ k_q̄, which is manifest in Equation (4.9). Thus when the direction of the incoming quark cannot be distinguished from that of the anti-quark, as is the case at hadron colliders, or when the incoming partons are self-conjugate, as in the Higgs gluon fusion production channel, cosΘ can only be determined up to a sign. Because Θ is only defined between 0 and π, it is not necessary to compute sinΘ.
Focusing now on θi: since k1 = p1 + p2, in the CM frame k1 = (E1 + E2, 0, 0, |\vec{p}_1 + \vec{p}_2|), where E1 and E2 now denote the lepton energies in the CM frame. Considering the boost that takes k1 from the CM frame to the rest frame of Z1, where it is (m12, 0, 0, 0), one can write

\begin{pmatrix} m_{12} \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta \\ -\gamma\beta & \gamma \end{pmatrix} \begin{pmatrix} E_1 + E_2 \\ |\vec{p}_1 + \vec{p}_2| \end{pmatrix}, \qquad (4.10)

from which

\beta = \frac{|\vec{p}_1 + \vec{p}_2|}{E_1 + E_2}, \qquad \gamma = \frac{E_1 + E_2}{m_{12}}. \qquad (4.11)

The inverse boost would then take p_1 = (m_{12}/2)(1, \sin\theta_1\cos\phi_1, \sin\theta_1\sin\phi_1, \cos\theta_1) in the Z1 rest frame to p_1 = (E_1, \vec{p}_1) in the CM frame, implying the relation

E_1 = \gamma\,\frac{m_{12}}{2}\,(1 + \beta\cos\theta_1), \qquad (4.12)

from which the expression for cos θ1 can be obtained:

\cos\theta_1 = \frac{E_1 - E_2}{|\vec{p}_1 + \vec{p}_2|} = \frac{E_1 - E_2}{|\vec{k}_1|} = \frac{1}{M \lambda_Z}\, P \cdot (p_1 - p_2). \qquad (4.13)
The last expression follows from Equation (4.8). For cos θ2, simply replace p1 and p2 by p3 and p4, respectively:

\cos\theta_2 = \frac{1}{M \lambda_Z}\, P \cdot (p_3 - p_4). \qquad (4.14)
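As a sketch of how Equations (4.8), (4.9), (4.13) and (4.14) can be evaluated directly on four-momenta (the function names are illustrative, not from the analysis code):

```python
import math

def mdot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def polar_angles(p1, p2, p3, p4, kq, kqbar):
    """Lorentz-invariant cos(Theta), cos(theta1), cos(theta2) following
    Eqs. (4.8), (4.9), (4.13), (4.14); inputs are (E, px, py, pz)."""
    k1 = [p1[i] + p2[i] for i in range(4)]
    P  = [p1[i] + p2[i] + p3[i] + p4[i] for i in range(4)]
    M  = math.sqrt(mdot(P, P))
    m12_sq = mdot(k1, k1)
    lam_Z = math.sqrt((mdot(P, k1) / M) ** 2 - m12_sq)            # Eq. (4.8)
    d_parton = [kqbar[i] - kq[i] for i in range(4)]
    cos_Theta = mdot(d_parton, k1) / (M * lam_Z)                  # Eq. (4.9)
    cos_t1 = mdot(P, [p1[i] - p2[i] for i in range(4)]) / (M * lam_Z)  # (4.13)
    cos_t2 = mdot(P, [p3[i] - p4[i] for i in range(4)]) / (M * lam_Z)  # (4.14)
    return cos_Theta, cos_t1, cos_t2
```

Being built only from Minkowski products, these expressions can be evaluated with four-momenta taken in any frame, in particular the Lab frame.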
Table 4.1: Signal samples along with the cross section times branching ratio values used for the angular distributions.

  Sample       mH [GeV]   MC Generator     Total Events   σ·BR [nb]
  ggF Signal   130        powheg-pythia    20000          5.89 · 10⁻⁶
  ggF Signal   360        powheg-pythia    50000          7.08 · 10⁻⁶
  VBF Signal   130        powheg-pythia    30000          0.481 · 10⁻⁶
  VBF Signal   360        powheg-pythia    29900          0.605 · 10⁻⁶
To compute φi, the unit normal vectors of the two decay planes are constructed,

N_1 = \frac{\vec{p}_1 \times \vec{p}_2}{|\vec{p}_1 \times \vec{p}_2|}, \qquad N_2 = \frac{\vec{p}_3 \times \vec{p}_4}{|\vec{p}_3 \times \vec{p}_4|}, \qquad (4.15)

so that

N_1 \cdot \hat{x}_{CM} = \sin\phi_1, \qquad N_1 \cdot \hat{y}_{CM} = -\cos\phi_1, \qquad (4.16)
N_2 \cdot \hat{x}_{CM} = -\sin\phi_2, \qquad N_2 \cdot \hat{y}_{CM} = -\cos\phi_2. \qquad (4.17)
A more detailed calculation of these last angles can be found in [57]. It is worth pointing out that when \vec{k}_q → −\vec{k}_q, φi → π + φi. Hence at hadron colliders, or in gluon fusion production, an event described by the angles (Θ, θ1, θ2, φ1, φ2) cannot be distinguished from one described by (π − Θ, θ1, θ2, φ1 + π, φ2 + π). Besides these angles, the angle Φ, defined as Φ ≡ π − φ1 − φ2, has also been considered in this work.
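A corresponding sketch for the azimuthal angles, using the decay-plane normals of Equation (4.15) and the CM axes of Equation (4.1); inputs are three-momenta in the CM frame, and all names are illustrative:

```python
import math

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]

def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))

def azimuthal_angles(p1, p2, p3, p4, kq, k1):
    """phi1, phi2 and Phi = pi - phi1 - phi2 from 3-momenta in the CM
    frame, via the axes of Eq. (4.1) and the normals of Eq. (4.15)."""
    z = unit(k1)
    y = unit(cross(kq, k1))
    x = cross(y, z)
    N1 = unit(cross(p1, p2))
    N2 = unit(cross(p3, p4))
    phi1 = math.atan2( dot3(N1, x), -dot3(N1, y)) % (2 * math.pi)  # Eq. (4.16)
    phi2 = math.atan2(-dot3(N2, x), -dot3(N2, y)) % (2 * math.pi)  # Eq. (4.17)
    Phi = (math.pi - phi1 - phi2) % (2 * math.pi)
    return phi1, phi2, Phi
```

Using atan2 with the sine and cosine projections of Eqs. (4.16)-(4.17) resolves each angle over its full 0 to 2π range.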
4.1.2 Angular and pT Distributions
In order to exploit the angular distributions of the four leptons in the channel H → ZZ(∗) → 4l to disentangle signal from background, the angular distributions whose theoretical study has already been presented in [57] have been studied for the reconstructed events of the Higgs search analysis. Note that no scale factors are applied to these distributions.
For the ZZ background sample the same pythia generator used in the cut-based Higgs analysis has been employed, while the Monte Carlo signal samples for Higgs masses of 130 GeV and 360 GeV, produced with powheg interfaced to pythia, have been considered (see Table 4.1).
Figure 4.2 shows normalized singly differential distributions in cosΘ, cos θ1, cos θ2, Φ, φ1 and φ2 for the gluon fusion signal at mH = √ŝ = 130 GeV and mH = √ŝ = 360 GeV, together with the corresponding low-mass (110 GeV < mH < 150 GeV) and high-mass (300 GeV < mH < 420 GeV) ZZ irreducible background. The corresponding distributions for the vector boson fusion signals are very similar and add no significant information; they are therefore reported in Appendix A. The distributions in the production angle Θ indicate that the ZZ pairs produced by the background process tend to lie in the forward region of the detector; this is especially pronounced when the invariant mass is high. In the signal case, on the contrary, the Zs are produced isotropically, as expected from the fact that the Higgs is a scalar particle (spin 0), so the signal shape has to be flat over the angular ranges. These variables are not used in the present Higgs analysis; a quantitative assessment of the improvement in sensitivity is in progress, in view of their use in the 2012 analysis.
The transverse momentum distributions for the same signal and background samples are shown in Figure 4.3.
The study presented in this section shows that adding angular and pT information to the signal search can provide additional discriminating power between signal and ZZ background. In particular the angular distributions cosΘ, cos θ1 and cos θ2 can be useful at high Higgs masses, while their discriminating power is negligible at low mass. The discriminating power is such that these variables have to be combined in a multivariate analysis. In the following section the multivariate analysis is applied to the samples at high Higgs mass, where the variables are more discriminating.
4.2 TMVA
The Toolkit for Multivariate Analysis (TMVA), the principal tool used in this work for the optimization of the Higgs signal observables, is introduced here. By applying TMVA, after finding the right observables, this work aims to separate the signal from the background, or at least to improve the signal efficiency and background rejection with respect to a cut-based Higgs search analysis. The advanced techniques applied are boosted decision trees and artificial neural networks, described in Sections 4.2.2 and 4.2.3; the results obtained are shown in Section 4.3.
4.2.1 What is TMVA
In high energy and nuclear physics, the signal which is searched for, for
example a signature of a Higgs boson, is typically overlaid by background pro-
cesses with a similar signature. Commonly used methods of classification into
(a) mH = 130 GeV    (b) mH = 360 GeV
Figure 4.2: Angular distributions for the gluon fusion Higgs signal at mH = 130 GeV and at mH = 360 GeV (blue histogram) and for the ZZ irreducible background in the mass window 110 GeV < mH < 150 GeV or 300 GeV < mH < 420 GeV, respectively (red histogram). These plots show angular distributions for reconstructed events.
(a) mH = 130 GeV    (b) mH = 360 GeV
Figure 4.3: pT distributions of reconstructed events for the gluon fusion signal at mH = 130 GeV and at mH = 360 GeV (blue histogram) compared to the same distribution for the irreducible background in the mass window 110 GeV < mH < 150 GeV or 300 GeV < mH < 420 GeV (red histogram), respectively.
signal and background events reach their limitations when the signal is very small and/or part of the information on whether an event is signal or background is hidden in poorly known correlations between the observables. The automated multivariate analysis toolkit, TMVA, provides the ability to exploit the available information from the observables efficiently. The advantage of an MVA classifier is, generally speaking, that it can achieve a better discrimination power than a simple cut-based analysis, especially in the presence of poorly discriminating variables. Such variables are usually called weak variables and are characterized by similar distributions for the signal and background samples.
Since in ATLAS a small Higgs signal must be searched for in a large data set, it is essential to extract the maximum available information from the signal characteristics. TMVA has been designed to find the best separating function between signal and background. It contains a variety of multivariate classification algorithms; for this work the boosted decision tree and the artificial neural network have been selected.
The algorithms consist of two independent phases. The first is the training, where the program learns to classify data from a finite but representative set of samples. In the second phase, the trained classification system is tested against new samples unknown to it. In this way it is possible to assess its real classification capabilities on arbitrary samples of data. After training and testing, the chosen methods are applied to the concrete classification problem they have been trained for, i.e. to data.
4.2.1.1 Evaluation of MVA Methods
Classification in TMVA [58, 59] derives from the input variables (observables) a classifier output where signal-like (background-like) events have values close to 1 (0). The mapping to the signal and background classes is done by defining all events with classifier output y > ycut as signal and all other events as background. For each trained classifier the distribution of the classifier output y for signal and background events can be inspected by the user (see Figure 4.4(a) for an example). Signal and background efficiencies are computed for a set of cuts on the classifier output: for each cut value ycut the signal efficiency εeff,signal, the purity and the background rejection (1 − εeff,background) are calculated¹.
From the sets of signal efficiencies and background rejections defined by the cuts
TMVA response for classifier: MLPBFGS
(a) (b)
Figure 4.4: (a) Example of a classification output y. Classification outputs close to 0 (1) denote a background-like (signal-like) event. Since for the training the true class of the event (signal/background) is known, the classification output is plotted for both classes independently. (b) The ROC curve, which shows the relationship between signal efficiency (εeff,signal) and background rejection (1 − εeff,background) [59].
on y, the Receiver Operating Characteristic (ROC) curve is plotted. In Figure 4.4(b) several exemplary ROC curves with different classification performances are shown. The larger the area below the curve, the better the achievable separation of signal and background. Which point on the ROC curve the user should choose as working point (i.e. which signal efficiency and which background rejection) depends on the type of analysis to be performed. For a trigger selection, a high efficiency will be chosen to prevent signal events from being discarded at too early a stage. For a signal search, the best cut is where S/√B has its maximum. Once a signal is found, the best cut for measuring the cross section is where S/√(S + B) has its maximum. Finally, for precision measurements one aims for a high purity.
¹The efficiency ε is defined as ε = (events passing the cut selection) / (total events).
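The working-point choice for a signal search can be sketched as a scan over ycut that maximizes S/√B; the helper below is illustrative (not TMVA code), with expected yields scaled by the cut efficiencies:

```python
import math

def best_cut(sig_scores, bkg_scores, n_sig_exp, n_bkg_exp, n_steps=100):
    """Scan cuts on the classifier output y in [0, 1) and return the
    (ycut, figure-of-merit) pair maximizing S/sqrt(B), where S and B are
    the expected yields after the cut."""
    best = (None, -1.0)
    for i in range(n_steps):
        ycut = i / n_steps
        eff_s = sum(y > ycut for y in sig_scores) / len(sig_scores)
        eff_b = sum(y > ycut for y in bkg_scores) / len(bkg_scores)
        S, B = n_sig_exp * eff_s, n_bkg_exp * eff_b
        if B > 0 and S / math.sqrt(B) > best[1]:
            best = (ycut, S / math.sqrt(B))
    return best
```

Replacing the figure of merit with S/√(S + B), or with the purity, gives the working points for a cross-section measurement or a precision measurement instead.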
Therefore, to ease the choice of the best MVA classifier for a particular classification problem, TMVA computes a number of benchmark quantities that assess the performance of the methods on the independent test sample. For classification these are:
• The signal efficiency at three representative background efficien-
cies obtained from a cut on the classifier output.
• The area of the background rejection versus signal efficiency function.
• The separation ⟨S²⟩ of a classifier y, defined by the integral

\langle S^2 \rangle = \frac{1}{2} \int \frac{\left(y_S(y) - y_B(y)\right)^2}{y_S(y) + y_B(y)}\, dy, \qquad (4.18)

where yS and yB are the signal and background PDFs (Probability Density Functions) of y, respectively. The separation is zero for identical signal and background shapes, and one for shapes with no overlap.
• The discrimination significance of a classifier, defined by the difference
between the classifier means for signal and background divided by the
quadratic sum of their root-mean-squares.
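The separation of Equation (4.18) can be estimated from binned classifier outputs; a minimal sketch, assuming two histograms with identical binning:

```python
def separation(h_sig, h_bkg):
    """Discrete estimate of <S^2> (Eq. 4.18) from two histograms of the
    classifier output with the same binning; each histogram is first
    normalized to unit area."""
    ns, nb = sum(h_sig), sum(h_bkg)
    s = 0.0
    for a, b in zip(h_sig, h_bkg):
        ys, yb = a / ns, b / nb          # per-bin probabilities
        if ys + yb > 0:
            s += 0.5 * (ys - yb) ** 2 / (ys + yb)
    return s
```

Identical shapes give 0 and fully disjoint shapes give 1, matching the limits quoted above.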
The results of the evaluation are printed to standard output. In addition to the
MVA response value y of a classifier, TMVA also provides the classifier’s signal
and background PDFs, yS(B). The PDFs can be used to derive classification
probabilities for individual events, or to compute the Rarity transformation.
• Classification probability: the probability for event i to be of signal type is given by

P_S(i) = \frac{f_S \cdot y_S(i)}{f_S \cdot y_S(i) + (1 - f_S) \cdot y_B(i)}, \qquad (4.19)

where fS = NS/(NS + NB) is the expected signal fraction, and NS(B) is the expected number of signal (background) events (the default is fS = 0.5).
• Rarity: the Rarity R(y) of a classifier y is given by the integral

R(y) = \int_{-\infty}^{y} y_B(y')\, dy', \qquad (4.20)

which is defined such that R(y) for background events is uniformly distributed between 0 and 1, while signal events cluster towards 1. The signal distributions can thus be directly compared among the various classifiers.
The stronger the peak towards 1, the better the discrimination. Another useful aspect of the Rarity is the possibility to directly visualize deviations of a test background (which could be physics data) from the training sample, through the appearance of non-uniformity.
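Equations (4.19) and (4.20) can be sketched as follows, with the Rarity estimated from a finite background sample (illustrative code, not TMVA's implementation):

```python
def signal_probability(ys_i, yb_i, f_s=0.5):
    """Eq. (4.19): probability that event i is of signal type, given the
    values of the signal and background PDFs at its classifier output."""
    return f_s * ys_i / (f_s * ys_i + (1.0 - f_s) * yb_i)

def rarity(y, bkg_scores):
    """Eq. (4.20) estimated from a background sample: the fraction of
    background events with classifier output below y."""
    return sum(b < y for b in bkg_scores) / len(bkg_scores)
```

By construction, feeding the background sample's own outputs through rarity() yields values spread uniformly over [0, 1], while a signal-enriched sample piles up near 1.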
4.2.1.2 Overtraining
When choosing the kind of classifier to use in an analysis, one should keep in mind that every method brings benefits but also risks. The main risk of using an MVA classifier is overtraining. In that case, the classification method specializes in differentiating the specific examples given for training, learning them by memory instead of learning the underlying rules and principles needed to classify them. Overtraining therefore leads to a seeming, artificial increase of the classification performance over the objectively achievable one when measured on the training sample, and to an effective performance decrease when measured on an independent test sample. A convenient way to detect overtraining and measure its impact is therefore to compare the performance results between the training and test samples. Such a test is performed by TMVA, with the results printed to standard output.
Overtraining occurs when a machine learning problem has too few degrees of freedom (too many model parameters adjusted to too few data points), when the capacity of the decision system is too large for the complexity of the training data, when the training set is too small or not representative of the whole input space, when the training sample events are not statistically independent (oversampling), or in general in the case of a wrong tuning of the classifier's parameters.
The sensitivity to overtraining depends on the MVA method, so when choosing the best classifier for one's purposes, one has to achieve a good discrimination power while also avoiding overtraining. Several techniques are employed to check for overtraining and reduce its impact. Overtraining is quantified using a Kolmogorov-Smirnov (KS) test: the result is the probability that the distribution obtained from the test sample could have been obtained from the training sample, which for an overtrained classifier is small.
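The train/test comparison can be sketched with a two-sample KS statistic, the maximum distance between the empirical CDFs of the training and test response distributions (a plain-Python sketch of the statistic only, without the p-value computation TMVA reports):

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum distance
    between the empirical CDFs of the two samples.  A large distance
    between the classifier responses on the training and test samples
    hints at overtraining."""
    xs = sorted(set(sample_a) | set(sample_b))
    na, nb = len(sample_a), len(sample_b)
    d = 0.0
    for x in xs:
        cdf_a = sum(v <= x for v in sample_a) / na
        cdf_b = sum(v <= x for v in sample_b) / nb
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

The statistic is 0 for identical samples and 1 for fully separated ones; converting it into the quoted probability requires the KS distribution, for which a library implementation would normally be used.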
4.2.2 Boosted Decision Tree (BDT)
A decision tree is a binary tree structured classifier similar to the one sketched
in Figure 4.5. Repeated left/right (yes/no) decisions are taken on one single vari-
able at a time until a stop criterion is fulfilled. The phase space is split this
way into many regions that are eventually classified as signal or background,
depending on the majority of training events that end up in the final leaf node.
The main issue with tree-based methods is their high variance, i.e. the possibility of obtaining very different results from a small change in the training sample; this problem is addressed by the boosting algorithms described below.

Figure 4.5: Schematic view of a decision tree. Starting from the root node, a sequence of binary splits using the discriminating variables xi is applied to the data. Each split uses the variable that at this node gives the best separation between signal and background when being cut on. The same variable may thus be used at several nodes, while others might not be used at all. The leaf nodes at the bottom end of the tree are labeled “S” for signal and “B” for background depending on the majority of events that end up in the respective nodes [58].
4.2.2.1 Boosting
Boosting is a general procedure whose application is not limited to decision
trees. The boosting of a decision tree extends the concept illustrated above from
one tree to several trees which form a forest. The classification of an event is
made on a majority vote of the classifications done by each tree in the forest.
A simple algorithm for boosting works like this: it starts by applying some
method, in this case a tree classifier, to the learning data, where each observa-
tion is assigned an equal weight. It computes the predicted classifications, and
applies weights to the observations in the learning sample that are inversely pro-
portional to the accuracy of the classification. In other words, it assigns greater
weight to those observations that were difficult to classify (where the misclas-
sification rate was high), and lower weights to those that were easy to classify
(where the misclassification rate was low). Boosting will generate a sequence
of classifiers, where each consecutive classifier in the sequence is an “expert”
in classifying observations that were not well classified by those preceding it.
Therefore the trees are derived from the same training ensemble by reweighting
(boosting) events, i.e. the same classifier is trained several times using a successively boosted training event sample; an event is then processed by all trees, which are finally combined into a single classifier given by a (weighted) average of the individual decision trees. Boosting stabilizes the response of the decision trees with respect to fluctuations in the training sample and can considerably enhance the separation performance compared to a single tree. In many cases, boosting performs best if applied to trees (classifiers) that, taken individually, have little classification power, i.e. small trees.
The boosting algorithm used in this analysis is the Gradient Boost. The function F(x) (x being the tuple of input variables) under consideration is assumed to be a weighted sum of parametrized base functions f(x; am), the so-called “weak learners”:

F(x; P) = \sum_{m=0}^{M} \beta_m f(x; a_m); \qquad P \in \{\beta_m; a_m\}_0^M. \qquad (4.21)

Each base function in this expansion corresponds to a decision tree. The boosting procedure is employed to adjust the parameters P such that the deviation between the model response F(x) and the true value y obtained from the training sample (y = 1 for signal and y = −1 for background) is minimized. The deviation is measured by the so-called loss function L(F, y). The current TMVA implementation of Gradient Boost uses the binomial log-likelihood loss

L(F, y) = \ln\left(1 + e^{-2 F(x) y}\right) \qquad (4.22)
for classification. As the boosting algorithm corresponding to this loss function
cannot be obtained in a straightforward manner, one has to resort to a steepest-
descent approach to do the minimization. This is done by calculating the current
gradient of the loss function and then growing a regression tree whose leaf values
are adjusted to match the mean value of the gradient in each region defined by
the tree structure. Iterating this procedure yields the desired set of decision trees
which minimizes the loss function. Gradient Boost is typically less susceptible
to overtraining. Its robustness can be enhanced by reducing the learning rate of
the algorithm through the Shrinkage parameter (see Table 4.2), which controls
the weight of the individual trees. A small shrinkage (0.1− 0.3) demands more
trees to be grown but can significantly improve the accuracy of the prediction
in difficult settings.
In certain settings Gradient Boost may also benefit from the introduction of
a bagging-like resampling procedure using random subsamples of the training
events for growing the trees. This is called stochastic gradient boosting and can
be enabled by selecting the UseBaggedGrad option. The sample fraction used in
each iteration can be controlled through the parameter GradBaggingFraction,
where typically the best results are obtained for values between 0.5 and 0.8.
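A minimal, self-contained sketch of the Gradient Boost idea on one input variable, with decision stumps as the weak learners and the pseudo-residuals of the loss (4.22) fitted by each new tree; this is an illustration of the algorithm, not TMVA's implementation, and the shrinkage parameter plays the role described above:

```python
import math

def loss(F, y):
    """Binomial log-likelihood loss, Eq. (4.22)."""
    return math.log(1.0 + math.exp(-2.0 * F * y))

def fit_gbt(xs, ys, n_trees=50, shrinkage=0.3):
    """1-D gradient boosting with decision stumps; labels are y = +1
    (signal) and y = -1 (background)."""
    F = [0.0] * len(xs)
    stumps = []                        # (threshold, value_left, value_right)
    thresholds = sorted(set(xs))
    for _ in range(n_trees):
        # negative gradient (pseudo-residual) of the loss at each point
        r = [2.0 * y / (1.0 + math.exp(2.0 * f * y)) for f, y in zip(F, ys)]
        best = None
        for t in thresholds:
            left  = [ri for xi, ri in zip(xs, r) if xi <  t]
            right = [ri for xi, ri in zip(xs, r) if xi >= t]
            if not left or not right:
                continue
            vl, vr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((ri - (vl if xi < t else vr)) ** 2
                      for xi, ri in zip(xs, r))
            if best is None or sse < best[0]:
                best = (sse, t, vl, vr)
        _, t, vl, vr = best
        stumps.append((t, shrinkage * vl, shrinkage * vr))   # shrink each tree
        F = [f + (shrinkage * vl if xi < t else shrinkage * vr)
             for f, xi in zip(F, xs)]
    return stumps

def predict(stumps, x):
    return sum(vl if x < t else vr for t, vl, vr in stumps)
```

Stochastic gradient boosting would additionally fit each stump on a random subsample of the training events, mirroring the UseBaggedGrad/GradBaggingFraction options described above.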
4.2.2.2 BDT Training
The training or growing of a decision tree is the process that defines the
splitting criteria for each node. The training starts with the root node, where
an initial splitting criterion for the full training sample is determined. The split
results in two subsets of training events that each go through the same algorithm
in order to determine the next splitting iteration. This procedure is repeated
until the whole tree is built. At each node, the split is determined by find-
ing the variable and corresponding cut value that provides the best separation
between signal and background. The node splitting stops once each terminal
node, called leaf, is pure signal or pure background, or once it has reached the
minimum number of events which is specified in the BDT configuration (option
nEventsMin). A standard value from the literature is:
\mathrm{nEventsMin} = \max\left(40,\ \frac{N_{\mathrm{training\ events}}}{N_{\mathrm{variables}}^2} \cdot \frac{1}{10}\right) \qquad (4.23)
A variety of separation criteria can be configured (option SeparationType in
Table 4.2) to assess the performance of a variable and a specific cut requirement.
Imagine the events are weighted with each event having weight wi. The purity
of the sample in a branch is defined by [60]
P = \frac{\sum_s w_s}{\sum_s w_s + \sum_b w_b}, \qquad (4.24)

where \sum_s runs over the signal events and \sum_b over the background events. Note that P(1 − P) is 0 if the sample is pure signal or pure background. For a given branch the Gini index is defined as

\mathrm{Gini} = \left(\sum_{i=1}^{n} w_i\right) P(1 - P), \qquad (4.25)

where n is the number of events in that branch. The criterion is to minimize

\mathrm{Gini}_{\mathrm{left\ daughter}} + \mathrm{Gini}_{\mathrm{right\ daughter}}. \qquad (4.26)

To determine the increase in quality when a node is split into two branches, one maximizes

\mathrm{Criterion} = \mathrm{Gini}_{\mathrm{father}} - \mathrm{Gini}_{\mathrm{left\ daughter}} - \mathrm{Gini}_{\mathrm{right\ daughter}}. \qquad (4.27)
133
4. Angular Analysis and TMVA 4.2 TMVA
In practice, three major measures of node impurity are used. If p is defined as the proportion of signal in a node², the three measures are:
• Gini Index, defined by p(1− p);
• Cross entropy, defined by −p ln p− (1− p) ln(1− p);
• Misclassification error, defined by 1−max(p, 1− p).
All separation criteria have a maximum where the samples are fully mixed, i.e.,
at purity p = 0.5, and fall off to zero when the sample consists of one event
class only. The three measures are similar, but the Gini Index and the Cross
entropy are differentiable, and hence more amenable to numerical optimization.
Since the splitting criterion is always a cut on a single variable, the training procedure selects the variable and cut value that optimize the increase in the separation index between the parent node and the sum of the indices of the two daughter nodes, weighted by their relative fractions of events.
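The node-splitting criterion of Equations (4.24)-(4.27) can be sketched as a scan over candidate cuts on one variable (unit event weights; the helper names are illustrative):

```python
def gini_index(weights_s, weights_b):
    """Eq. (4.25): (sum of weights) * P(1-P) for one node, with the
    purity P from Eq. (4.24)."""
    ws, wb = sum(weights_s), sum(weights_b)
    if ws + wb == 0:
        return 0.0
    P = ws / (ws + wb)
    return (ws + wb) * P * (1.0 - P)

def best_split(values_s, values_b):
    """Pick the cut on a single variable maximizing Eq. (4.27):
    Gini(parent) - Gini(left) - Gini(right)."""
    parent = gini_index([1.0] * len(values_s), [1.0] * len(values_b))
    best = (None, -1.0)
    for t in sorted(set(values_s + values_b)):
        ls = [1.0 for v in values_s if v < t]
        rs = [1.0 for v in values_s if v >= t]
        lb = [1.0 for v in values_b if v < t]
        rb = [1.0 for v in values_b if v >= t]
        gain = parent - gini_index(ls, lb) - gini_index(rs, rb)
        if gain > best[1]:
            best = (t, gain)
    return best
```

A full tree builder would repeat this scan over all input variables at each node, which is exactly the per-node optimization described above.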
The cut values are optimized by scanning over the variable range with a granularity that is set via the option nCuts. The default value nCuts=20 proved to be a good compromise between computing time and step size; finer stepping did not noticeably increase the performance of the BDTs.
At the end, the leaf nodes are classified as signal or background according to
the class the majority of events belongs to. If the option UseYesNoLeaf is set
the end-nodes are classified in the same way. If UseYesNoLeaf is set to false the
end-nodes are classified according to their purity, i.e. if a leaf has purity greater
than 1/2 (or whatever is set), then it is called a signal leaf and if the purity is
less than 1/2, it is a background leaf. The resulting tree is a decision tree.
In principle, the splitting could continue until each leaf node contains only sig-
nal or only background events, which could suggest that perfect discrimination
is achievable. However, such a decision tree would be strongly overtrained. To
avoid overtraining a decision tree must be pruned.
Pruning is the process of cutting back a tree from the bottom up after it has
been built to its maximum size. Its purpose is to remove statistically insignif-
icant nodes and thus reduce the overtraining of the tree. It has been found
to be beneficial to first grow the tree to its maximum size and then cut back,
rather than interrupting the node splitting at an earlier stage. This is because
apparently insignificant splits can nevertheless lead to good splits further down
the tree. Whereas this technique is useful for a single tree, it is not completely
clear yet if this also applies for the tree in the forest. Currently it looks as if
²p (also called the purity of the node) is given by the ratio of signal events to all events in that node. Hence pure background nodes have zero purity.
in TMVA, better results for the whole forest are often achieved when pruning
is not applied, but rather the maximal tree depth is set to a relatively small
value (3 or 4) already during the tree building phase. In particular the Gradient
Boost does not apply a pruning algorithm. In this case it is recommended to
restrict the number of nodes in the tree to values between 5 to 20 by using
option NNodesMax or the maximal allowed depth of the tree (MaxDepth option).
4.2.3 Artificial Neural Network (ANN)
Figure 4.6: Multilayer perceptron with one hidden layer [58].
A more complex estimator, able to deal also with highly correlated variables or with variables of very poor discrimination power, is the Artificial Neural Network (ANN). The ANN is composed of nodes called neurons, which are arranged in different layers and connected to each other. An artificial neural network is any simulated collection of interconnected neurons, with each neuron producing a certain response for a given set of input signals. One can therefore view the neural network as a mapping from a space of input variables x1, …, xnvar onto a one-dimensional (e.g. in the case of a signal-versus-background discrimination problem) or multi-dimensional space of output variables y1, …, ymvar. The mapping is nonlinear if at least one neuron has a nonlinear response to its input.
4.2.3.1 Network Architecture
While in principle a neural network with n neurons can have n² directional
connections, the complexity can be reduced by organizing the neurons in layers
and only allowing directional connections from one layer to the immediately next
one (see Figure 4.6). This kind of neural network is termed a multi-layer perceptron
(MLP). The first layer of a multilayer perceptron is the input layer, the
last one the output layer, and all others are hidden layers. For a classification
problem with nvar input variables and 2 output classes, the input layer consists
of a bias neuron and nvar neurons that hold the input values, x1, . . . , xnvar , and
the output layer consists of one neuron that holds the output variable, the
neural net estimator y_ANN.
Each directional connection between the output of one neuron and the input of
another has an associated weight. The output value of each neuron is multiplied
with the weight to be used as input value for the next neuron.
Each neuron has a neuron response function ρ, which maps the neuron inputs
i_1, …, i_n onto the neuron output. Often it can be separated into a
ℝⁿ ↦ ℝ synapse function κ and a ℝ ↦ ℝ neuron activation function α, so
that ρ = α ∘ κ. The functions κ and α can have the following forms:

\[
\kappa : \left(y_1^{(l)},\dots,y_n^{(l)} \,\middle|\, w_{0j}^{(l)},\dots,w_{nj}^{(l)}\right) \rightarrow
\begin{cases}
w_{0j}^{(l)} + \sum_{i=1}^{n} y_i^{(l)} w_{ij}^{(l)} & \text{Sum,}\\[2pt]
w_{0j}^{(l)} + \sum_{i=1}^{n} \left(y_i^{(l)} w_{ij}^{(l)}\right)^2 & \text{Sum of squares,}\\[2pt]
w_{0j}^{(l)} + \sum_{i=1}^{n} \left|y_i^{(l)} w_{ij}^{(l)}\right| & \text{Sum of absolutes,}
\end{cases}
\tag{4.28}
\]

\[
\alpha : x \rightarrow
\begin{cases}
x & \text{Linear,}\\[2pt]
\dfrac{1}{1+e^{-kx}} & \text{Sigmoid,}\\[2pt]
\dfrac{e^{x}-e^{-x}}{e^{x}+e^{-x}} & \text{Tanh,}\\[2pt]
e^{-x^{2}/2} & \text{Radial.}
\end{cases}
\tag{4.29}
\]
where y_i^{(l)} is the output of the i-th neuron in the l-th layer and w_{ij}^{(l)} is the
weight of the connection between the i-th neuron in the l-th layer and the j-th
neuron in the (l + 1)-th layer.
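As an illustrative sketch (not TMVA code; all function names here are invented), the response ρ = α ∘ κ with the "Sum" synapse and two of the activation functions of Eqs. (4.28)-(4.29) can be written as:

```python
import math

def kappa_sum(inputs, weights, bias):
    """'Sum' synapse function: w0j + sum_i y_i * w_ij, cf. Eq. (4.28)."""
    return bias + sum(y * w for y, w in zip(inputs, weights))

def alpha_sigmoid(x, k=1.0):
    """Sigmoid activation: 1 / (1 + e^{-kx}), cf. Eq. (4.29)."""
    return 1.0 / (1.0 + math.exp(-k * x))

def alpha_tanh(x):
    """Tanh activation: (e^x - e^-x) / (e^x + e^-x)."""
    return math.tanh(x)

def neuron_response(inputs, weights, bias, activation=alpha_tanh):
    """rho = alpha(kappa(...)): maps the neuron inputs onto its output."""
    return activation(kappa_sum(inputs, weights, bias))
```

The mapping is nonlinear whenever a nonlinear activation (sigmoid, tanh, radial) is chosen; with `activation=lambda x: x` the neuron is purely linear.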
4.2.3.2 ANN Training
There are two algorithms for adjusting the weights that optimize the clas-
sification performance of a neural network: the so-called back-propagation and
BFGS.
Back-propagation (BP)
The most common algorithm is the so-called back-propagation. It belongs to the
family of supervised learning methods, where the desired output for every input
event is known. The output of a network (here for simplicity assumed to have
a single hidden layer with a tanh activation function, and a linear activation
function in the output layer) is given by
\[
y_{\mathrm{ANN}} = \sum_{j=1}^{n_h} y_j^{(2)} w_{j1}^{(2)} = \sum_{j=1}^{n_h} \tanh\!\left(\sum_{i=1}^{n_{\mathrm{var}}} x_i w_{ij}^{(1)}\right) \cdot w_{j1}^{(2)},
\tag{4.30}
\]
where n_var and n_h are the numbers of neurons in the input layer and in the
hidden layer, respectively, w_{ij}^{(1)} is the weight between input-layer neuron i and
hidden-layer neuron j, and w_{j1}^{(2)} is the weight between hidden-layer neuron
j and the output neuron. Simple summation was used in Equation (4.30) as
synapse function κ.
During the learning process the network is supplied with N training events x_a =
(x_1, …, x_{n_var})_a, a = 1, …, N. For each training event a the neural network
output y_{ANN,a} is computed and compared to the desired output y_a ∈ {1, 0} (1
for signal events and 0 for background events). An error function E, measuring
the agreement of the network response with the desired one, is defined by

\[
E(\mathbf{x}_1,\dots,\mathbf{x}_N \,|\, \mathbf{w}) = \sum_{a=1}^{N} E_a(\mathbf{x}_a \,|\, \mathbf{w}) = \sum_{a=1}^{N} \frac{1}{2}\left(y_{\mathrm{ANN},a} - y_a\right)^2,
\tag{4.31}
\]
where w denotes the ensemble of adjustable weights in the network. The set
of weights that minimizes the error function can be found using the method
of steepest or gradient descent, provided that the neuron response function is
differentiable with respect to the input weights. Starting from a random set of
weights w^{(ρ)}, the weights are updated by moving a small distance in w-space
in the direction −∇_w E, where E decreases most rapidly:

\[
\mathbf{w}^{(\rho+1)} = \mathbf{w}^{(\rho)} - \eta \nabla_{\mathbf{w}} E,
\tag{4.32}
\]

where η is a positive number called the learning rate, chosen small enough to
avoid serious overtraining of the network. The weights connected with the output
layer are updated by

\[
\Delta w_{j1}^{(2)} = -\eta \sum_{a=1}^{N} \frac{\partial E_a}{\partial w_{j1}^{(2)}} = -\eta \sum_{a=1}^{N} \left(y_{\mathrm{ANN},a} - y_a\right) y_{j,a}^{(2)},
\tag{4.33}
\]
and the weights connected with the hidden layer are updated by

\[
\Delta w_{ij}^{(1)} = -\eta \sum_{a=1}^{N} \frac{\partial E_a}{\partial w_{ij}^{(1)}} = -\eta \sum_{a=1}^{N} \left(y_{\mathrm{ANN},a} - y_a\right) \left(1 - \left(y_{j,a}^{(2)}\right)^2\right) w_{j1}^{(2)} x_{i,a},
\tag{4.34}
\]

where the identity tanh′ x = 1 − tanh² x has been used. This method of training the
network is denoted bulk learning, since the sum of errors of all training events is
used to update the weights. An alternative, which is the one implemented in
TMVA, is the so-called online learning, where the weights are updated at
each event. The weight updates are obtained from Equations (4.33) and (4.34)
by removing the event summations.
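The online-learning update for the single-hidden-layer network of Equation (4.30) can be sketched as follows. This is a hedged illustration, not the TMVA implementation; names like `train_event` are invented, and the correct tanh derivative 1 − tanh² is used in the hidden-layer gradient:

```python
import math
import random

def forward(x, w1, w2):
    """y_ANN = sum_j tanh(sum_i x_i w1[i][j]) * w2[j], cf. Eq. (4.30)."""
    hidden = [math.tanh(sum(xi * w1[i][j] for i, xi in enumerate(x)))
              for j in range(len(w2))]
    return hidden, sum(h * w for h, w in zip(hidden, w2))

def train_event(x, target, w1, w2, eta=0.05):
    """One online update: steepest descent on E_a = (y - target)^2 / 2."""
    hidden, y = forward(x, w1, w2)
    delta = y - target                               # dE_a / dy
    for j, h in enumerate(hidden):
        grad_hidden = delta * (1.0 - h * h) * w2[j]  # tanh' = 1 - tanh^2
        for i, xi in enumerate(x):
            w1[i][j] -= eta * grad_hidden * xi       # hidden-layer update, cf. Eq. (4.34)
        w2[j] -= eta * delta * h                     # output-layer update, cf. Eq. (4.33)
    return 0.5 * delta * delta
```

In bulk learning one would instead accumulate the gradients over all N events before applying a single update per epoch.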
BFGS
The Broyden-Fletcher-Goldfarb-Shanno (BFGS) method differs from back-
propagation in its use of second derivatives of the error function to adapt
the synapse weights, through an algorithm composed of four main steps.
1. Two vectors, D and Y, are calculated. The vector of weight changes D
represents the evolution from one iteration of the algorithm (k − 1)
to the next (k); each synapse weight corresponds to one element of the
vector. The vector Y is the vector of gradient errors:
\[
D_i^{(k)} = w_i^{(k)} - w_i^{(k-1)},
\tag{4.35}
\]
\[
Y_i^{(k)} = g_i^{(k)} - g_i^{(k-1)},
\tag{4.36}
\]
where i is the synapse index, g_i is the i-th synapse gradient, w_i is the
weight of the i-th synapse, and k denotes the iteration counter.
2. Approximate the inverse of the Hessian matrix, H⁻¹, at iteration k by

\[
H^{-1(k)} = \frac{D \cdot D^{T} \cdot \left(1 + Y^{T} \cdot H^{-1(k-1)} \cdot Y\right)}{Y^{T} \cdot D} - D \cdot Y^{T} \cdot H + H \cdot Y \cdot D^{T} + H^{-1(k-1)},
\tag{4.37}
\]

where the superscripts (k) are implicit for D and Y.
3. Estimate the vector of weight changes by D^{(k)} = -H^{-1(k)} \cdot Y^{(k)}.
4. Compute a new vector of weights by applying a line search algorithm. In
the line search the error function is locally approximated by a parabola.
The algorithm evaluates the second derivatives and determines the point
where the minimum of the parabola is expected. The total error is eval-
uated for this point. The algorithm then evaluates points along the line
defined by the direction of the gradient in weights space to find the ab-
solute minimum. The weights at the minimum are used for the next
iteration.
The learning rate can be set with the option Tau, which is the line-search
step size. The learning parameter, which defines by how much the weights
are changed in one epoch along the line where the minimum is suspected,
is multiplied by the learning rate as long as the training error of the
neural net with the changed weights lies below the one with unchanged
weights. If the training error of the changed neural net is already larger
for the initial learning parameter, it is divided by the learning rate until
the training error becomes smaller. The iterative and approximate calculation
of H^{-1(k)} becomes less accurate with an increasing number of iterations.
The matrix is therefore reset to the unit matrix every ResetStep steps.
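The four steps above can be illustrated with a small self-contained sketch. Note that this uses the standard textbook form of the inverse-Hessian BFGS update, arranged slightly differently from Equation (4.37), and an exact line search on an invented 2D quadratic stands in for TMVA's Tau-driven search:

```python
# Hedged sketch: BFGS on f(w) = w^T A w / 2 (so the gradient is g = A w).

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bfgs_update(Hinv, D, Y):
    """Textbook inverse-Hessian update from D = w_k - w_{k-1} and
    Y = g_k - g_{k-1}, cf. Eqs. (4.35)-(4.36)."""
    n, rho = len(D), 1.0 / dot(Y, D)
    I = [[float(i == j) for j in range(n)] for i in range(n)]
    A = [[I[i][j] - rho * D[i] * Y[j] for j in range(n)] for i in range(n)]
    B = [[I[i][j] - rho * Y[i] * D[j] for j in range(n)] for i in range(n)]
    AH = [[dot(A[i], [Hinv[k][j] for k in range(n)]) for j in range(n)] for i in range(n)]
    AHB = [[dot(AH[i], [B[k][j] for k in range(n)]) for j in range(n)] for i in range(n)]
    return [[AHB[i][j] + rho * D[i] * D[j] for j in range(n)] for i in range(n)]

def bfgs_minimize(A, w, steps=10):
    """Minimize the quadratic: form D and Y, update H^-1, step, line-search."""
    Hinv = [[float(i == j) for j in range(len(w))] for i in range(len(w))]
    g = mat_vec(A, w)
    for _ in range(steps):
        if dot(g, g) < 1e-20:
            break
        d = [-x for x in mat_vec(Hinv, g)]           # search direction
        alpha = -dot(g, d) / dot(d, mat_vec(A, d))   # exact minimum along d
        w_new = [wi + alpha * di for wi, di in zip(w, d)]
        g_new = mat_vec(A, w_new)
        D = [a - b for a, b in zip(w_new, w)]
        Y = [a - b for a, b in zip(g_new, g)]
        Hinv = bfgs_update(Hinv, D, Y)
        w, g = w_new, g_new
    return w
```

On a quadratic error surface this converges in a number of iterations equal to the dimension, which is why the curvature information makes BFGS attractive compared to plain gradient descent.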
4.2.4 Optimization of the MVA methods
In order to improve the performance of a multivariate analysis, classifiers in
general have tuning parameters to optimize the separation between signal and
background candidates. First of all it is necessary to fix the MC samples and
the variables that one intends to use; thereafter one can proceed to the tuning
of the parameters.
4.2.5 Monte Carlo Samples
The tuning of the MVA methods' parameters, and thus the following analysis,
has been performed using:

• for the signal:

– 7856 'mc11b' gluon fusion (gg) signal selected events at mH = 360
GeV (from a collection holding 12024 selected events);

– 4703 'mc11b' vector boson fusion (VBF) signal selected events at
mH = 360 GeV (from a collection holding 7207 selected events);

• for the ZZ background: 26938 'mc11b' ZZ background selected events
(from a collection holding 28938 selected events), of which only the events
in the mass window 300 GeV < mH < 420 GeV have been considered.

The signal and background samples from which these are drawn are the
same already mentioned in Table 4.1 and used for the Higgs cut-based analysis,
respectively. The signal file is created by merging the two signal sub-samples
above, rescaled according to their corresponding cross sections; in turn the
background file is created by rescaling the sub-sample with its cross section.
4.2.6 Input Variables
Figure 4.7: Input variables for the multivariate analysis: (a) the invariant mass mH, (b) the angular variable cosΘ, (c) the transverse momentum pT (signal versus background distributions).
The idea of analyzing all possible information coming from an experiment
in order not to miss anything may sound tempting. However, the inclusion
of trivial or correlated variables can introduce noise into the system without
actually providing new information, thus not improving the performance. The
observables used for the optimization are the ones that turn out to be the most
discriminating at high mass; as can be seen from Figure 4.2, the natural choice
is:

1. the invariant mass of the four-lepton selected event (mH);

2. the angular variable cosΘ;

3. the transverse momentum of the four-lepton selected event (pT).

These variables are not strongly correlated, as can be seen from Figure 4.8, and
the optimization of the MVA methods is carried out adding the variables one
at a time, starting from the mass.
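The linear correlation coefficients shown in Figure 4.8 are the usual Pearson coefficients expressed in percent; a minimal sketch (on invented toy samples, not the analysis data):

```python
import math

def linear_correlation(x, y):
    """Pearson linear correlation coefficient of two samples, in percent."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return 100.0 * cov / (sx * sy)
```

Values close to 0, such as those in Figure 4.8, indicate that each variable carries largely independent discriminating information.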
Figure 4.8: Variable correlation matrices for signal (a) and background (b). Linear correlation coefficients in %: for the signal, cosΘ-mH = −2, cosΘ-pT = 1, mH-pT = 1; for the background, cosΘ-mH = −3, mH-pT = 5.
4.2.7 Tuning Parameters for the implemented BDTG
In this work a Gradient Boosted Decision Tree (BDTG) has been imple-
mented, taking into account the considerations expressed in Section 4.2.2 the
main features of this classifier have been chosen and their optimal values are
summarized in Table 4.2.
Since the purity in the leaf nodes is sensitive to overtraining and therefore
typically overestimated, the end nodes are classified as signal or background
according to the class the majority of events belongs to. The separation
criterion used is the Gini index, defined by p · (1 − p), as already described in
Section 4.2.2. It is adopted because it is the default separation criterion and
tests have revealed no significant performance disparity between the most
important separation criteria. Moreover, since no pruning algorithm is applied
by the Gradient Boost decision tree, the maximum depth of the tree is simply
reduced to three levels.
The choice of the maximum dimension of a tree should take into account two
considerations: a very large tree might overfit the data, while a small tree might
not capture the important structure. Tree size is a tuning parameter governing
the model's complexity. The number of trees and the maximum number of
nodes have been chosen in such a way as to minimize overtraining and to
maximize the ROC curve area, in order to obtain the best performance.
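For concreteness, the Gini-index separation used for node splitting can be sketched as follows (illustrative code, not TMVA's implementation):

```python
def gini(n_sig, n_bkg):
    """Gini index p(1-p), with p the node purity (signal / all events)."""
    n = n_sig + n_bkg
    if n == 0:
        return 0.0
    p = n_sig / n
    return p * (1.0 - p)

def separation_gain(parent, left, right):
    """Decrease of the event-weighted Gini index achieved by a split;
    each argument is a (n_sig, n_bkg) pair."""
    n_p, n_l, n_r = sum(parent), sum(left), sum(right)
    return gini(*parent) - (n_l * gini(*left) + n_r * gini(*right)) / n_p
```

A split into two pure daughters yields the maximal gain, while a split that leaves both daughters at the parent's purity yields zero gain and would not be selected.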
4.2.7.1 mH
Using the Monte Carlo samples quoted above and, as input variable, only the
invariant mass of the four-lepton selected event mH (see Figure 4.7(a)), the
curves in Figure 4.9 are obtained by setting NNodesMax to 5³ or to 6 and varying
the number of trees. The BDTG response when the number of nodes increases
has also been investigated, but above 6 nodes the situation remains unchanged.
These curves are very similar, so in order to choose the BDTG parameters
that assure the best performance, the plots in Figure 4.10 have been produced.
The performance difference is more evident in these graphs, which show that,
since a too small number of nodes or trees is meaningless and an excessive
number of nodes or trees increases the overtraining problems, the best results
are obtained for NTrees = 800 and NNodesMax = 6.
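The ROC curves and their integrals compared throughout this section can be built from classifier-output histograms; a minimal sketch (with invented toy histograms, not the thesis samples):

```python
def roc_points(sig_hist, bkg_hist):
    """Scan a cut over the classifier output (higher bins = more signal-like)
    and return (signal efficiency, background rejection) pairs."""
    s_tot, b_tot = sum(sig_hist), sum(bkg_hist)
    points = []
    for cut in range(len(sig_hist) + 1):
        eff_s = sum(sig_hist[cut:]) / s_tot
        rej_b = 1.0 - sum(bkg_hist[cut:]) / b_tot
        points.append((eff_s, rej_b))
    return points

def roc_area(points):
    """Trapezoidal integral of rejection vs. efficiency (cut-scan order)."""
    pts = points[::-1]  # efficiency ascending as the cut is relaxed
    return sum(0.5 * (y0 + y1) * (x1 - x0)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
```

A perfect classifier gives an area of 1, a random one 0.5, which is why the area is a convenient single number for ranking the parameter settings.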
Figure 4.9: ROC curves produced by the BDTG, varying the number of trees (from 30 to 1100) and fixing NNodesMax to 5 (a) or 6 (b).
4.2.7.2 mH and cosΘ
Using as input variables the mass of the four leptons selected event (mH)
and the angular variable cosΘ, that is the most discriminating at high mass
³The performance results for NNodesMax = 4 and NNodesMax = 5 are the same.
Figure 4.10: Significance and ROC curve integral values produced by the BDTG, using only the invariant mass mH as input variable, varying the number of trees and fixing NNodesMax to 5 or 6.
among all the angular variables described in Section 4.1, the same study as
above has been carried out, varying the maximum number of nodes from 4 to 14
and the number of trees from 30 to 1100. As before, the ROC curves are not
useful for making a choice of the parameters since they largely overlap; but,
analysing Figure 4.11, it has been concluded that the setting NNodesMax = 6
and NTrees = 900 is the best choice.
Figure 4.11: Significance (a) and ROC curve integral (b) values produced by the BDTG, using mH and cosΘ as input variables, varying the number of trees and fixing NNodesMax to 5, 6, 8, 10, 12 or 14.
4.2.7.3 mH , cosΘ and pT
If one adds the transverse momentum as input variable to the previous ones,
the overtraining worsens; the plots in Figure 4.12 are obtained varying the
maximum number of nodes from 4 to 14 and the number of trees from 30 to
1100. Since increasing the number of trees and nodes enhances the overtraining,
a good choice of the parameters, maximizing the ROC curve area while
minimizing the overtraining effect, turns out to be NNodesMax = 5 and
NTrees = 600.
Figure 4.12: Significance (a) and ROC curve integral (b) values produced by the BDTG, using mH, cosΘ and pT as input variables, varying the number of trees and fixing NNodesMax to 5, 6, 8, 10, 12 or 14.
Finally, the BDTG’s features and its chosen tuning parameters are all sum-
marized in Table 4.2.
Option                Value                          Description
BoostType             Grad                           Boosting type for the trees in the forest.
UseBaggedGrad         True                           Use only a random subsample of all events for growing the trees in each iteration.
GradBaggingFraction   0.5                            Fraction of events to be used in each iteration.
Shrinkage             0.1                            Learning rate for the GradBoost algorithm.
UseYesNoLeaf          True                           Use signal/background categories, not purity, as classification of the leaf node.
SeparationType        GiniIndex                      Separation criterion for node splitting.
nEventsMin            max(40, NEvtsTrain/NVar^2/10)  Minimum number of events required in a leaf node (default uses the given formula).
nCuts                 20                             Number of steps during node-cut optimization.
MaxDepth              3                              Maximum allowed depth of the decision tree.

                      mH     mH and cosΘ    mH, cosΘ and pT
NTrees                800    900            600                Number of trees in the forest.
NNodesMax             6      6              5                  Maximum number of nodes in a tree.

Table 4.2: BDTG configuration.
4.2.8 Tuning Parameters for the implemented ANNs
In this work three Artificial Neural Networks (ANNs) have been implemented;
they are termed MLP, MLPBNN and MLPBFGS.
The TMVA implementation of ANNs supports random and importance event
sampling. With event sampling enabled, only a fraction (set by the option
Sampling) of the training events is used for the training of the ANN. Values
in the interval [0, 1] are possible. Setting the option SamplingImportance to 1,
the events are selected purely at random; only for a value below 1 does the
probability for the same events to be sampled again depend on the training
performance achieved for classification. In that case, if for a given set of events
the training leads to a decrease of the error on the test sample, the probability
for those events to be selected again is multiplied by the factor given in
SamplingImportance and thus decreases. In the case of an increased error on
the test sample, the probability for the events to be selected again is divided
by the factor SamplingImportance and thus increases. The probability for an
event to be selected is constrained to the interval [0, 1].
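The probability update just described can be sketched as (an illustrative sketch of the mechanism, not TMVA's code; the function name is invented):

```python
def update_sampling_prob(prob, improved, sampling_importance):
    """Update an event's selection probability after an epoch:
    multiplied by the factor (< 1) on improvement, divided on
    degradation, and clamped to [0, 1]."""
    if sampling_importance >= 1.0:
        return prob  # SamplingImportance = 1 means purely random selection
    p = prob * sampling_importance if improved else prob / sampling_importance
    return min(max(p, 0.0), 1.0)
```

Events whose inclusion helped the test error are thus gradually down-weighted, concentrating the training on the events that are still poorly classified.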
Event sampling is performed until the fraction specified by the option Sampling-
Epoch of the total number of epochs (NCycles) has been reached. Afterwards,
all available training events are used for the training. Event sampling can be
turned on and off for training and testing events individually with the options
SamplingTraining and SamplingTesting.
Since it is typically not known beforehand how many epochs are necessary to
achieve a sufficiently good training of the neural network, a convergence test
can be activated by setting ConvergenceTests to a value above 0. This value
denotes the number of subsequent convergence tests which have to fail (i.e. no
improvement of the estimator larger than ConvergenceImprove) to consider
the training to be complete. Convergence tests are performed at the same time
as overtraining tests. The test frequency is given by the parameter TestRate.
Finally it is recommended to set the option VarTransform = Norm, such that
the input is normalized to the interval [−1, 1].
The common configuration options of the three ANNs are summarized
in Table 4.3. The three ANNs differ from each other in the training method
applied to the net, in the neuron activation function α, in the number of
training cycles (NCycles) and in the specification of the hidden-layer architecture
(HiddenLayers). As illustrated in Table 4.4, whereas the type of function α
and of the training method are fixed a priori, the last two parameters have
been chosen for each variable set by attempting to maximize the ROC curve
area while keeping the overtraining under control.
In order to configure the three ANNs, the signal sample at mH = 360 GeV
and the irreducible background sample in the mass window 300 GeV < mH <
420 GeV have been used.
Focusing now on the number of hidden layers: this parameter defines the network
architecture by setting the number of neurons per layer in the network
and the number of hidden layers. The selection of the number of layers was
based on the Weierstrass theorem, which states: “For a Multilayer perceptron
a single hidden layer is sufficient to approximate a given continuous correlation
Option               Value        Description
VarTransform         Norm         Variable transformation performed before training.
TestRate             5            Test for overtraining performed at each #th epoch.
NeuronInputType      sum          Neuron input function type.
ConvergenceTests     3            Number of steps (without improvement) required for convergence.
UseRegulator         True         Use regulator to avoid overtraining.
LearningRate         0.02         ANN learning rate parameter.
DecayRate            0.01         Decay rate for learning parameter.
Sampling             1            Only a fraction "Sampling" of (randomly selected) events is trained in each epoch.
SamplingEpoch        1            Sampling is used for the first "SamplingEpoch" epochs; afterwards, all events are taken for training.
SamplingImportance   1            Sampling probabilities are multiplied by "SamplingImportance" after epochs which improve the test estimator, else divided.
SamplingTraining     True         The training sample is sampled.
SamplingTesting      False        The testing sample is sampled.
ResetStep            50           How often BFGS should reset history.
Tau                  3            Line-search step size.
BPMode               sequential   Back-propagation learning mode.
ConvergenceImprove   1e-30        Minimum improvement which counts as improvement (< 0 means the automatic convergence check is turned off).
UpdateLimit          10000        Maximum number of regulator updates.

Table 4.3: Common configuration options of the three ANNs.
ANN        Option                       Value
MLP        Training method              BP
           Neuron activation function   tanh
                                        mH     mH and cosΘ    mH, cosΘ and pT
           NCycles                      200    100            100
           HiddenLayers                 N+5    N+7            N+6
MLPBFGS    Training method              BFGS
           Neuron activation function   tanh
           NCycles                      600    600            800
           HiddenLayers                 N+8    N+9            N+7
MLPBNN     Training method              BFGS
           Neuron activation function   sigmoid
           NCycles                      400    1100           600
           HiddenLayers                 N+8    N+7            N+9

Table 4.4: Features distinguishing the three ANNs.
function to any precision, given an arbitrary large number of neurons in the
hidden layer. If the available computing power and size of the training data
sample are sufficient, one can thus raise the number of neurons in the hidden
layer until the optimal performance is reached” [58].
Whereas a too small number of neurons is not effective, an excessive number of
neurons and hidden layers slows the process while creating overtraining problems,
hence decreasing the performance of the system on the test sample.
Therefore, after choosing the input variable set, a neural network with only one
hidden layer is first assumed, formed by a number of neurons varied from N + 1
to N + 10, where N is the number of input variables. Then the number of
neurons in the hidden layer is fixed and the number of training cycles is varied;
for each value of HiddenLayers, the value of NCycles is chosen up to which the
ROC curve still increases, i.e. increasing the number of epochs beyond this value
brings no improvement in performance, only a regularity of the results. Besides,
because a too small number of cycles is not operative, a number of at least 100
is always chosen, and the increase in performance has always been weighed
against the increase in overtraining, finding a compromise.
4.2.8.1 mH
Figure 4.13 illustrates the best ROC curves obtained for different settings of
the three neural networks, using as input variable only the invariant mass of
the four-lepton selected event (mH). Comparing these curves and at the same
time inspecting the outputs of the networks in these configurations, it has been
concluded that the best performance is obtained setting HiddenLayers = N + 5
and NCycles = 200 for MLP; HiddenLayers = N + 8 and NCycles = 600 for
MLPBFGS; and HiddenLayers = N + 8 and NCycles = 400 for MLPBNN.
4.2.8.2 mH and cosΘ
Figure 4.14 illustrates the best ROC curves obtained for different settings
of the three neural networks, using as input variables the invariant mass of
the four-lepton selected event (mH) and the angular variable cosΘ. In this
case it is more evident that the best performance is obtained setting
HiddenLayers = N + 7 and NCycles = 100 for MLP; HiddenLayers = N + 9 and
NCycles = 600 for MLPBFGS; and HiddenLayers = N + 7 and NCycles = 1100
for MLPBNN.
Figure 4.13: ROC curves produced by MLP (a), MLPBFGS (b) and MLPBNN (c), varying the number of training cycles and the number of neurons in the single hidden layer (N is the number of input variables), using as input variable the invariant mass mH.
Figure 4.14: ROC curves produced by MLP (a), MLPBFGS (b) and MLPBNN (c), varying the number of training cycles and the number of neurons in the single hidden layer (N is the number of input variables), using as input variables the invariant mass mH and the angular variable cosΘ.
4.2.8.3 mH , cosΘ and pT
Figure 4.15 illustrates the best ROC curves obtained for different settings
of the three neural networks, using as input variables the angular variable
cosΘ of the incoming quark, the invariant mass mH and the transverse
momentum pT of the four-lepton selected event. This time the setting
HiddenLayers = N + 6 and NCycles = 100 has been chosen for MLP;
HiddenLayers = N + 7 and NCycles = 800 for MLPBFGS; and
HiddenLayers = N + 9 and NCycles = 600 for MLPBNN.
Figure 4.15: ROC curves produced by MLP (a), MLPBFGS (b) and MLPBNN (c), varying the number of training cycles and the number of neurons in the single hidden layer (N is the number of input variables), using as input variables the invariant mass mH, the angular variable cosΘ and the transverse momentum pT.
4.2.9 Comparing MVA Methods Performance
Comparing the performance of the MVA methods for the three different sets
of variables, the plots in Figure 4.16 are obtained. In Table 4.5 the values of
the signal efficiency corresponding to a background efficiency of 0.30, of the
ROC curve area, of the separation between signal and background and of the
significance, quantities already explained in Section 4.2.1, are summarized to
ease the evaluation of the different techniques. The methods are ranked from
best to worst by the area of the signal-efficiency-versus-purity curve.
The variable ranking is different for the four methods. Considering mH and
cosΘ, for the neural networks the best-ranked variable is the mass, whereas for
BDTG it is cosΘ. Taking all three variables as the input set, the MLPBNN
and MLPBFGS methods rank the invariant mass on top, followed by the
transverse momentum and then the angular variable; for MLP, pT is the best,
with mH and cosΘ classified next, respectively; finally, for the decision tree
classifier the ranking is cosΘ, mH and pT. The neural networks implement a
variable ranking that uses the sum of the squared weights of the connections
between the variable's neuron in the input layer and the first hidden layer. The
importance I_i of the input variable i is given by

\[
I_i = \bar{x}_i^2 \sum_{j=1}^{n_h} \left(w_{ij}^{(1)}\right)^2, \qquad i = 1, \dots, n_{\mathrm{var}},
\tag{4.38}
\]
where x̄_i is the sample mean of input variable i and w_{ij}^{(1)} is the weight between
input-layer neuron i and hidden-layer neuron j. The ranking of the BDTG
input variables is derived by counting how often the variables are used to split
decision tree nodes, and by weighting each split occurrence by the squared
separation gain it has achieved and by the number of events in the node. This
measure of the variable importance can be used for a single decision tree as well
as for a forest [58]. The importance values of the variables are reported in
Table 4.6.
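Both ranking measures can be sketched compactly (illustrative code; Eq. (4.38) for the networks, separation-gain-weighted split counting for the BDT; all data names are invented):

```python
def ann_importance(x_mean, w1, i):
    """Network ranking, Eq. (4.38): I_i = x̄_i² · Σ_j (w¹_ij)²."""
    return x_mean[i] ** 2 * sum(w1[i][j] ** 2 for j in range(len(w1[i])))

def bdt_importance(splits, n_var):
    """BDT ranking: each split is a (variable index, separation gain,
    events in node) triple; occurrences are weighted by gain² · n_events
    and the result is normalized to unit sum."""
    imp = [0.0] * n_var
    for var, gain, n_events in splits:
        imp[var] += gain ** 2 * n_events
    total = sum(imp)
    return [v / total for v in imp] if total else imp
```

Note that the network measure depends on the variable's scale through x̄_i², which is one reason the normalization VarTransform = Norm matters before comparing importances.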
4.3 Results
After having optimized the advanced techniques of the MVA toolkit, it is
possible to compare the ROC curves obtained applying the multivariate analysis
with the one obtained with a cut-based analysis. In the latter case the signal
efficiency and the background rejection have been computed by simply cutting
on the signal and background invariant-mass histograms coming from the
application of the Higgs analysis described in the previous chapter. It is evident
from Figure 4.17 that when using a multivariate analysis there is always an
improvement for the MLPBFGS and MLPBNN neural networks with respect to
the cut-based analysis. This improvement becomes much more visible when
increasing the number of input variables, especially for the decision tree
classifier, which on the other
mH
MVA method   Signal efficiency (error)   ROC     Separation   Significance
MLPBFGS      0.746(05)                   0.790   0.262        0.813
MLPBNN       0.734(05)                   0.780   0.251        0.791
MLP          0.706(05)                   0.766   0.225        0.707
BDTG         0.653(05)                   0.728   0.166        0.589

mH and cosΘ
MVA method   Signal efficiency (error)   ROC     Separation   Significance
MLPBNN       0.803(04)                   0.825   0.323        0.944
MLPBFGS      0.804(04)                   0.824   0.320        0.938
BDTG         0.788(05)                   0.818   0.310        0.882
MLP          0.678(05)                   0.725   0.179        0.567

mH, cosΘ and pT
MVA method   Signal efficiency (error)   ROC     Separation   Significance
BDTG         0.869(04)                   0.870   0.417        1.148
MLPBFGS      0.841(04)                   0.856   0.382        1.117
MLPBNN       0.835(04)                   0.852   0.371        1.088
MLP          0.753(05)                   0.787   0.252        0.842

Table 4.5: Evaluation results for the implemented MVA methods, ranked by best signal efficiency and purity (area); the top method is the best ranked. The signal efficiencies reported correspond to a background efficiency of 0.30.
(a) mH and cosΘ

  MLPBNN:    mH 1.456·10^2,     cosΘ 2.457·10^-4
  MLPBFGS:   mH 3.721·10,       cosΘ 8.420·10^-5
  MLP:       mH 4.360,          cosΘ 9.499·10^-5
  BDTG:      cosΘ 5.677·10^-1,  mH 4.323·10^-1

(b) mH, cosΘ and pT

  BDTG:      cosΘ 4.141·10^-1,  mH 3.368·10^-1,   pT 2.491·10^-1
  MLPBFGS:   mH 3.413·10,       pT 9.309,         cosΘ 9.622·10^-5
  MLPBNN:    mH 9.382·10,       pT 6.949·10,      cosΘ 1.537·10^-4
  MLP:       pT 1.189·10,       mH 3.595,         cosΘ 8.150·10^-5

Table 4.6: Variable ranking (for each method the variables are listed in order of
decreasing importance).
Figure 4.16: ROC curves (background rejection versus signal efficiency) for the
different optimized MVA methods, using as input variables mH (a); mH and
cosΘ (b); mH, cosΘ and pT (c).
hand does not perform very well when using only one input variable. The MLP
net does not reach the same performance when mH and cosΘ are considered
together, so it is not appropriate for the Higgs search in this channel.
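The cut-based ROC curve used in this comparison can be sketched as follows; this is a minimal illustration assuming the signal and background invariant-mass histograms are available as lists of bin contents (a one-sided cut is scanned here for simplicity, the same logic applies to a mass-window cut):

```python
def roc_from_histograms(sig_counts, bkg_counts):
    """Scan a one-sided cut over the histogram bins and return, for each
    threshold, (signal efficiency, background rejection). Sketch of the
    cut-based ROC construction with hypothetical bin contents."""
    s_tot, b_tot = sum(sig_counts), sum(bkg_counts)
    points = []
    for k in range(len(sig_counts)):
        sig_eff = sum(sig_counts[k:]) / s_tot      # signal kept above the cut
        bkg_eff = sum(bkg_counts[k:]) / b_tot      # background kept above it
        points.append((sig_eff, 1.0 - bkg_eff))    # rejection = 1 - efficiency
    return points

# Hypothetical histograms: a peaked signal over a smoothly falling background
sig = [1, 5, 20, 50, 20, 5, 1]
bkg = [40, 30, 25, 20, 15, 10, 5]
roc = roc_from_histograms(sig, bkg)
```

Each MVA method produces the analogous curve by cutting on its output distribution instead of the invariant mass.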
The next step is to choose the optimal cut on the MVA output, i.e. the one
yielding the maximum value of S/√B (the so-called working point), where S and
B are the numbers of signal and background events passing the cut, respectively.
The graphs in Figure 4.18 show the value of this ratio as a function of the
applied cut. These plots are consistent with the MVA outputs shown in Figures
4.20, 4.21 and 4.22 for the input variable sets mH; mH and cosΘ; and mH, cosΘ
and pT, respectively. The optimal cut values corresponding to the maximum of
S/√B are summarized in Table 4.7. For completeness, Figure 4.19 also shows
the S/√(S+B) quantity for each cut value.
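The working-point determination described above can be sketched as follows; the cut grid and the signal and background counts are hypothetical (in the analysis they come from the MVA output distributions scaled to the expected yields):

```python
import math

def best_working_point(cuts, sig_pass, bkg_pass):
    """Return (cut, S/sqrt(B), S/sqrt(S+B)) at the cut maximizing S/sqrt(B).
    sig_pass[k] and bkg_pass[k] are the signal and background events passing
    cuts[k]. Illustrative sketch with hypothetical numbers."""
    best = max(range(len(cuts)),
               key=lambda k: sig_pass[k] / math.sqrt(bkg_pass[k]))
    s, b = sig_pass[best], bkg_pass[best]
    return cuts[best], s / math.sqrt(b), s / math.sqrt(s + b)

cuts = [-1.0, -0.5, 0.0, 0.5, 1.0]            # candidate cuts on the MVA output
sig_pass = [10.0, 9.0, 7.0, 4.0, 1.0]         # hypothetical S passing each cut
bkg_pass = [400.0, 250.0, 100.0, 30.0, 5.0]   # hypothetical B passing each cut
cut, s_sqrt_b, s_sqrt_sb = best_working_point(cuts, sig_pass, bkg_pass)
# Here S/sqrt(B) peaks at cut = 0.5: a tighter cut loses proportionally more
# signal than background, a looser one lets too much background through.
```
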
As reported in Table 4.8, it can be noted that the application of the multivariate
mH
  MVA method   Cut Value   S/√B    S/√(S+B)   Signal Eff.   Bkg. Eff.
  MLPBFGS      0.821       0.532   0.283      0.691         0.246
  MLPBNN       0.767       0.512   0.281      0.715         0.278
  MLP          0.879       0.492   0.272      0.682         0.273
  BDTG         0.485       0.464   0.260      0.603         0.249

mH and cosΘ
  MVA method   Cut Value   S/√B    S/√(S+B)   Signal Eff.   Bkg. Eff.
  MLPBNN       0.795       0.451   0.214      0.666         0.177
  MLPBFGS      0.902       0.509   0.210      0.474         0.090
  BDTG         0.704       0.406   0.194      0.650         0.177
  MLP          0.663       0.379   0.218      0.692         0.310

mH, cosΘ and pT
  MVA method   Cut Value   S/√B    S/√(S+B)   Signal Eff.   Bkg. Eff.
  BDTG         0.975       0.671   0.089      0.120         0.002
  MLPBFGS      0.939       0.467   0.143      0.380         0.037
  MLPBNN       0.959       0.348   0.110      0.364         0.037
  MLP          0.871       0.449   0.231      0.650         0.212

mH Cut-Based
                           S/√B    S/√(S+B)   Signal Eff.   Bkg. Eff.
                           0.165   0.156      0.617         0.254

Table 4.7: The optimal cut values that an MVA output should exceed to obtain
the maximum value of S/√B, with the corresponding signal and background
efficiencies and the S/√B and S/√(S+B) quantities. The top method is the best
ranked for each input variable set. The same sensitivity values are also reported
for the cut-based analysis.
analysis leads to an improvement in the maximum value of S/√B with respect
to the cut-based analysis. The latter value is obtained from the numbers of
signal and background events in the mass window 343 GeV < mH < 377 GeV.
It is worth stressing that the cut-based S/√B value and the ones reported in Table
  MVA method   TMVA S/√B   mH Cut-Based S/√B   Relative Improvement (TMVA/Cut-Based)

mH
  MLPBFGS      0.532       0.165               3.2
  MLPBNN       0.512       0.165               3.1
  MLP          0.492       0.165               3.0
  BDTG         0.464       0.165               2.8

mH and cosΘ
  MLPBNN       0.451       0.165               2.7
  MLPBFGS      0.509       0.165               3.1
  BDTG         0.406       0.165               2.5
  MLP          0.379       0.165               2.3

mH, cosΘ and pT
  BDTG         0.671       0.165               4.1
  MLPBFGS      0.467       0.165               2.8
  MLPBNN       0.348       0.165               2.1
  MLP          0.449       0.165               2.7

Table 4.8: Comparison between the MVA methods and the cut-based analysis
in terms of the maximum of S/√B.
4.7 are derived in two different ways. A direct comparison between them is
therefore not possible, or at least not sufficient, to claim the success of the MVA
application to the standard analysis. This is particularly evident from Figure
4.17, where the ROC curves with the better S/√B are not also the best in terms
of area. The values in Table 4.7 are nevertheless useful to understand the
behaviour of the MVA methods when the input variable set is varied. For
example, for the MLP method the addition of cosΘ to mH increases the signal
efficiency with respect to the invariant mass alone, but the background efficiency
also grows, reducing the background rejection, so both S/√B and S/√(S+B)
decrease. With the complete set of variables, on the other hand, even though
the signal and background efficiencies both decrease with respect to mH alone,
S/√B and S/√(S+B) rise, because the signal efficiency is reduced less than the
background one. This neural network therefore shows a deterioration in
performance, evident from Table 4.5 and from Figure 4.17(d), when only the
angular variable is added to the invariant mass. The reason can be attributed
to the training method adopted here, which may not be appropriate to describe
the complexity of the training data for this input variable set.
The MLPBNN and MLPBFGS neural networks instead show a similar and
regular behaviour: the signal and background efficiencies, S/√B and S/√(S+B)
all decrease as variables are added. This observation is reflected in the trend of
the curves in Figures 4.18(b), 4.18(c), 4.19(b) and 4.19(c): the greater the
number of variables, the lower the curve. So even if, according
to Table 4.5, their performance seems to increase when variables are added, the
best improvement with respect to the cut-based analysis is obtained by including
only the invariant mass as discriminant variable. We can therefore conclude
that these networks attribute a greater importance to mH.
Finally, for the BDTG method S/√B decreases when only cosΘ is added, but
rises sharply when all three variables are considered, while S/√(S+B) keeps
decreasing. The best gain with respect to the standard strategy is therefore
obtained by applying the decision tree to the complete set of input variables.
This is also evident from the comparison of the evaluation results for the MVA
methods reported in Table 4.5.
Figure 4.17: ROC curves (background rejection versus signal efficiency) for the
(a) BDTG, (b) MLPBFGS, (c) MLPBNN and (d) MLP methods, varying the
input variable set and including the cut-based analysis.
Figure 4.18: S/√B as a function of the cut applied on the output of the (a)
BDTG, (b) MLPBFGS, (c) MLPBNN and (d) MLP methods, using as input
variables mH (black points), mH and cosΘ (red points), and mH, cosΘ and pT
(violet points).
Figure 4.19: S/√(S+B) as a function of the cut applied on the output of the
(a) BDTG, (b) MLPBFGS, (c) MLPBNN and (d) MLP methods, using as input
variables mH (black points), mH and cosΘ (red points), and mH, cosΘ and pT
(violet points).
Figure 4.20: Output distributions of the MVA methods using mH as input
variable, for the signal and background training and test samples (overtraining
check). Kolmogorov-Smirnov signal (background) probabilities: (a) BDTG
0.536 (0.999), (b) MLPBFGS 0.979 (0.644), (c) MLPBNN 0.856 (0.918), (d)
MLP 0.61 (0.901).
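The overtraining check quoted in the caption compares the classifier response on the training and test samples with a Kolmogorov-Smirnov test. A minimal sketch of the underlying statistic, applied to hypothetical response samples (this is not TMVA's implementation, which additionally converts the distance into the probabilities quoted in the plots):

```python
import bisect
import random

def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov distance: the maximum separation
    between the empirical CDFs of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    grid = a + b
    cdf_a = [bisect.bisect_right(a, x) / len(a) for x in grid]
    cdf_b = [bisect.bisect_right(b, x) / len(b) for x in grid]
    return max(abs(pa - pb) for pa, pb in zip(cdf_a, cdf_b))

random.seed(0)
# Hypothetical classifier responses: training and test samples drawn from the
# same Gaussian, so only a small KS distance (no overtraining) is expected.
train_response = [random.gauss(0.7, 0.1) for _ in range(1000)]
test_response = [random.gauss(0.7, 0.1) for _ in range(1000)]
distance = ks_distance(train_response, test_response)
```

A large distance (equivalently, a small KS probability) between the training and test response shapes would be a symptom of overtraining.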
Figure 4.21: Output distributions of the MVA methods using mH and cosΘ as
input variables, for the signal and background training and test samples
(overtraining check). Kolmogorov-Smirnov signal (background) probabilities:
(a) BDTG 0.88 (0.969), (b) MLPBFGS 1 (0.804), (c) MLPBNN 1 (0.817), (d)
MLP 0.2 (0.978).
Figure 4.22: Output distributions of the MVA methods using mH, cosΘ and pT
as input variables, for the signal and background training and test samples
(overtraining check). Kolmogorov-Smirnov signal (background) probabilities:
(a) BDTG 0.989 (0.969), (b) MLPBFGS 0.88 (0.947), (c) MLPBNN 0.937
(0.999), (d) MLP 0.962 (0.959).
Conclusions
The discovery or exclusion of the Higgs boson, the only unobserved particle
of the Standard Model, is one of the main goals of the LHC. In the Standard
Model the Higgs boson can decay to pairs of fermions or bosons. The decay
H → ZZ(∗) → 4l, where l indicates an electron or a muon, is known as the
“Golden channel” at the ATLAS experiment: it provides high sensitivity for the
discovery, since a narrow four-lepton invariant mass peak stands on top of a
smooth background. This channel has been the subject of this thesis.
After a presentation of the Higgs mechanism and a description of the ATLAS
detector in Chapters 1 and 2, Chapter 3 illustrates in detail the standard
cut-based Higgs search analysis in the decay channel H → ZZ(∗) → 4l that I
have performed. This analysis has been performed for Higgs boson mass (mH)
hypotheses in the full 110 GeV to 600 GeV mass range using data collected by
the ATLAS experiment in 2011, corresponding to an integrated luminosity of
4.8 fb−1. In total 71 candidate events are selected by the analysis, while in the
same mass range ∼ 62 events are expected from the background processes. The
SM Higgs boson is excluded at 95% C.L. in the mass ranges 134 GeV−156 GeV,
182 GeV−233 GeV, 256 GeV−265 GeV and 266 GeV−415 GeV.
This thesis further aimed to investigate the possibility of integrating the
standard Higgs search analysis with a multivariate one, in order to better
discriminate between signal and background events. In Chapter 4 an angular
distribution study has therefore been carried out on the Monte Carlo signal and
background selected events, in order to find discriminant variables other than
the invariant mass. The angular variable cosΘ of the momentum of the incoming
quark in the CM frame, together with the Higgs transverse momentum and
invariant mass, have been selected as input variables for the multivariate
analysis, applied to a Higgs signal sample at mH = 360 GeV.
TMVA, the Toolkit for Multivariate Data Analysis, provides a good set of tools
with the benefit of a common interface to a still-growing number of methods.
Nevertheless, the user still has to check the classifier results carefully and spend
time optimizing the settings of each classifier. In addition, a sufficient number
of candidates of known event class (signal/background), dedicated exclusively
to training and testing, must be available in order not to suffer from systematic
effects.
In this thesis I have tested four MVA methods, a boosted decision tree and
three artificial neural networks; the tuning of their parameters to optimize the
separation between the signal and background candidates has been described
in Chapter 4. The curves of background rejection versus signal efficiency have
then been compared to the one derived from the cut-based analysis, showing a
growing improvement as the number of input variables increases. The two
strategies analyse the signal and background distributions in two different ways,
therefore a direct comparison between the corresponding S/√B values is not
possible, or at least is not sufficient, to claim the success of the MVA application
to the standard analysis. However, this comparison suggests that the best gain
with respect to the cut-based analysis is obtained by applying the decision tree
to the complete set of input variables.
I have carried out this study in the high mass range, in which the SM Higgs
boson has already been excluded, because here the considered variables, in
particular the angular one, are more discriminating. This study is therefore a
preliminary investigation of the potential of such an analysis, starting from a
region in which the separation between signal and background is not too low;
the knowledge gained about the MVA tools can then be extrapolated to the low
mass region, to help the Higgs search in the not yet excluded range.
From my thesis work it can be concluded that the application of a multivariate
analysis to the Higgs search could enhance the discovery sensitivity. At low
mass, however, the angular variables are less discriminating, so they should be
complemented with other variables. The tuning of the MVA methods could
then turn out to be more difficult, increasing the complexity of the neural
network or decision tree structure needed to optimize their performance, and
hence the computing time and the risk of overtraining.
The future prospects are thus a new H → ZZ(∗) → 4l optimization study for a
low mass Higgs, followed by the application of the MVA methods to new data.
In the coming months this channel, together with the H → γγ channel, will
indeed play a central role in the Higgs discovery or exclusion.
Appendix A
Vector Boson Fusion Angular Distributions
Figure A.1: Angular distributions (cosΘ, cosθ1, cosθ2, Φ, φ1, φ2) for the vector
boson fusion Higgs signal at mH = 130 GeV (brown histogram) and for the ZZ
irreducible background in the mass window 130 GeV < mH < 150 GeV (red
histogram). These plots show angular distributions for reconstructed events.
Figure A.2: Angular distributions (cosΘ, cosθ1, cosθ2, Φ, φ1, φ2) for the vector
boson fusion Higgs signal at mH = 360 GeV (brown histogram) and for the ZZ
irreducible background in the mass window 300 GeV < mH < 420 GeV (red
histogram). These plots show angular distributions for reconstructed events.
Bibliography
[1] K. Nakamura et al. “Particle Data Group”. J. Phys. G 37, 075021, 2010.
[2] L. Anchordoqui and F. Halzen. Lessons in Particle Physics. University of
Wisconsin, 2009. arXiv:0906.1271v2.
[3] A. Signer. The Standard Model. University of Durham, 2002.
[4] W. Greiner. Relativistic Quantum Mechanics Wave Equations. Springer,
third edition.
[5] S. Dawson. Introduction to electroweak symmetry breaking, 1999.
arXiv:hep-ph/9901280v1.
[6] T. Hambey and K. Riesselmann. SM Higgs mass bounds from theory. DESY
97-152 D0-TH 97/18, 1997.
[7] M. Gomez-Bock, M. Mondragon, M. Muhlleitner, R. Noriega-Papaqui,
I. Pedraza, M. Spira, and P. M. Zerwas. “Electroweak symmetry breaking
and Higgs physics: basic concepts”. Journal of Physics: Conference Series
18:74–135, 2005. arXiv:hep-ph/0712.2419v1.
[8] L. Reina. TASI 2004 lecture notes on Higgs boson physics, 2005. arXiv:hep-
ph/0512377v1.
[9] M. Spira and P. M. Zerwas. Electroweak symmetry breaking and Higgs
physics. CERN-TH/97-379 DESY 97-261, 1998. arXiv:hep-ph/9803257v2.
[10] The TEVNPH Working Group for the CDF and DØ Collaborations. Com-
bined CDF and DØ Search for Standard Model Higgs Boson Production with
up to 10 fb−1 of Data. FERMILAB-CONF-12-065-E, CDF Note 10806, DØ
Note 6303, 2012.
[11] H. Flacher, M. Goebel, J. Haller, A. Hocker, K. Monig, and J. Stelzer.
“Revisiting the global electroweak fit of the Standard Model and beyond
with Gfitter”. The European Physical Journal C 60, 543, 2009.
arXiv:0811.0009v4.
[12] M. Baak, M. Goebel, J. Haller, A. Hocker, D. Ludwig, K. Monig, M. Schott,
and J. Stelzer. “Updated status of the global electroweak fit and constraints
on new physics ”. Submitted to the European Physical Journal C, 2011.
arXiv:1107.0975v1.
[13] U. Aglietti et al. Tevatron for LHC report: Higgs. FERMILAB-CONF-06-
467-E-T, 2007. arXiv:hep-ph/0612172v2.
[14] A Generic Fitter Project for HEP Model Testing. URL http://gfitter.
desy.de/.
[15] M. Spira. QCD effects in Higgs physics. CERN-TH/97-68, 1997. arXiv:hep-
ph/9705337v2.
[16] A. Djouadi. The anatomy of electro-weak symmetry breaking. Tome I: The
Higgs boson in the Standard Model. LPT-Orsay-05-17, 2005. arXiv:hep-
ph/0503172v2.
[17] LHC Higgs Cross Section Working Group, S. Dittmaier, C. Mariotti,
G. Passarino, and R. Tanaka (Eds.). Handbook of LHC Higgs cross sec-
tions: 1. Inclusive observables. CERN-2011-002 (CERN,Geneva), 2011.
arXiv:hep-ph/1101.0593v3.
[18] The ATLAS Collaboration. Expected performance of the ATLAS exper-
iment: Detector, Trigger and Physics. CERN-OPEN-2008-020, CERN,
Geneva, 2008.
[19] A. Djouadi, J. Kalinowski, and M. Spira. HDECAY: a program for Higgs
boson decays in the Standard Model and its supersymmetric exstension,
1997. arXiv:hep-ph/9704448v1.
[20] V. Buscher and K. Jakobs. “Higgs boson searches at hadron colliders”. In-
ternational Journal of Modern Physics Letters A Vol. 20, 2005. arXiv:hep-
ph/0504099v1.
[21] The ATLAS Collaboration. ATLAS detector and physics performance.
Technical design report, vol. 2. CERN-LHCC-99-15, 1999.
[22] E. Lyndon (ed.) and B. Philip (ed.). “LHC machine”. J. Instrum. 3,
(S08001), 2008.
[23] J. Stirling. Available at. URL http://projects.hepforge.org/mstwpdf/
plots/plots.html.
[24] Daniel Fournier. Performance of the LHC, ATLAS and CMS in 2011. HCP,
2011.
[25] M. Benedikt, P. Collier, V. Mertens, J. Poole, and K. Schindl. LHC
Design Report. 3. The LHC injector chain, 2004.
[26] Available at. URL http://lhc-machine-outreach.web.cern.ch/
lhc-machine-outreach/lhc_in_pictures.htm.
[27] G Aad, N Groot, F Filthaut, and D Froidevaux. “The ATLAS experiment
at the CERN Large Hadron Collider”. J. Instrum. 3, (S08003), 2008.
[28] The ATLAS detector available at. URL http://www.atlas.ch/detector.
html.
[29] The ATLAS Collaboration. ATLAS liquid-argon calorimeter: Technical
Design Report. ATLAS-TDR-002, CERN-LHCC-96-041, CERN, Geneva,
1996.
[30] The ATLAS Collaboration. “Drift Time Measurement in the ATLAS
Liquid Argon Electromagnetic Calorimeter using Cosmic Muons”. The
European Physical Journal C - Particles and Fields, 70:755–785, DOI
10.1140/epjc/s10052-010-1403-6, 2010.
[31] The ATLAS Collaboration. ATLAS tile calorimeter: Technical Design Re-
port. ATLAS-TDR-003, CERN-LHC-96-042, CERN, Geneva, 1996.
[32] D. M. Gingrich et al. “Construction, assembly and testing of the ATLAS
hadronic end-cap calorimeter”. J.Instrum. 2, (P05005), 2007.
[33] The ATLAS magnet available at. URL http://www.atlas.ch/magnet.
html.
[34] Muon system layout (Parameter Books). Available at. URL http://atlas.
web.cern.ch/Atlas/GROUPS/MUON/layout/parameter_book.html.
[35] S. Palestini. The muon spectrometer of the ATLAS experiment. CERN,
Geneva, 2002.
[36] O. Fedin. Reconstruction and identification of photons and electrons with
the ATLAS detector. Petersburg Nuclear Physics Institute, St. Petersburg,
188300, Russia, 2009.
[37] C. Anastopoulos et al. Search for the Standard Model Higgs boson in the
decay channel H → ZZ(∗) → 4l with 4.8 fb−1 of pp collisions at √s = 7
TeV. ATLAS Internal Note, December 20, 2011. Draft version.
[38] Search for the SM Higgs boson in the decay channel H → ZZ(∗) → 4l,
Winter 2012. URL https://twiki.cern.ch/twiki/bin/viewauth/
AtlasProtected/HiggsZZllllWinter2012.
[39] T. Binoth, N. Kauer, and P. Mertsch. Gluon-induced QCD Corrections to
pp → ZZ → l l̄ l′ l̄′, 2008. arXiv:hep-ph/0807.0024v1.
[40] S. Alioli, P. Nason, C.Oleari, and E. Re. “NLO Higgs boson production via
gluon fusion matched with shower in POWHEG”. JHEP 0904:002, 2009.
arXiv:hep-ph/0812.0578v2.
[41] P. Nason and C.Oleari. “NLO Higgs boson production via vector-boson fu-
sion matched with shower in POWHEG”. JHEP 1002:037, 2010. arXiv:hep-
ph/0911.5299v2.
[42] T. Sjostrand, S. Mrenna, and P. Z. Skands. “PYTHIA 6.4 Physics and
Manual ”. JHEP 0605:026, 2006. arXiv:hep-ph/0603175v2.
[43] P. Golonka and Z. Was. “PHOTOS Monte Carlo: a precision tool for
QED corrections in Z and W decays”. Eur. Phys. J. C 45:97-107, 2005.
arXiv:hep-ph/0506026v2.
[44] Z. Was and P. Golonka. “TAUOLA as tau Monte Carlo for future applica-
tions”. Presented at International workshop on Tau Lepton Physics, 2004.
arXiv:hep-ph/0411377v1.
[45] Z. Was. “TAUOLA for simulation of tau decay and production: perspec-
tives for precision low energy and LHC applications”. Presented at Interna-
tional workshop on Tau Lepton Physics, 2011. arXiv:hep-ph/1101.1652v1.
[46] S. Catani, D. de Florian, M. Grazzini, and P. Nason. “Soft-gluon resum-
mation for Higgs boson production at hadron colliders”. JHEP 0307:028,
2003. arXiv:hep-ph/0306211v1.
[47] The ATLAS Collaboration. Search for the Standard Model Higgs boson in
the decay channel H → ZZ(∗) → 4l with 4.8 fb−1 of pp collisions at √s = 7
TeV. ATLAS-CONF-2011-162, December 13, 2011.
[48] M. L. Mangano, M. Moretti, F. Piccinini, R. Pittau, and A. D. Polosa.
“ALPGEN, a generator for hard multiparton processes in hadronic colli-
sions”. JHEP 0307:001, 2003. arXiv:hep-ph/0206293v2.
[49] J. M. Butterworth, J. R. Forshaw, and M. H. Seymour. “Multiparton
Interactions in Photoproduction at HERA”. Z. Phys. C 72:637-646, 1996.
arXiv:hep-ph/960137v1.
[50] S. Frixione, P. Nason, and B. R. Webber. “Matching NLO QCD and parton
showers in heavy flavour production”. JHEP 0308:007, 2003. arXiv:hep-
ph/0305252v2.
[51] G. Corcella et al. “HERWIG 6.5: an event generator for Hadron Emission
Reactions With Interfering Gluons (including supersymmetric processes)”.
JHEP 0101:010, 2002. arXiv:hep-ph/011363v3.
[52] TechnicalitiesForMedium1. URL https://twiki.cern.ch/twiki/bin/
viewauth/AtlasProtected/TechnicalitiesForMedium1.
[53] M. Hance, D.Olivito, and H. H. Williams. Performance studies for
e/gamma calorimeter isolation. University of Pennsylvania, Lawrence
Berkeley National Laboratory, 2011.
[54] C. Anastopoulos et al. ATLAS sensitivity prospects for the Standard Model
Higgs boson in the decay channel H → ZZ(∗) → 4l at √s = 10 and 7 TeV.
ATL-PHYS-INT-2010-062, 2010.
[55] The ATLAS Collaboration. “Search for the Standard Model Higgs boson
in the decay channel H → ZZ(∗) → 4l with 4.8 fb−1 of pp collisions at
√s = 7 TeV with ATLAS”. Preprint submitted to Phys. Lett. B, March 2,
2012. CERN-PH-EP-2012-014. arXiv:hep-ex/1202.1415v3.
[56] The ATLAS Collaboration. “An update to the combined search for the
Standard Model Higgs boson with the ATLAS detector at the LHC using up
to 4.9 fb−1 of pp collision data at √s = 7 TeV”. ATLAS-COM-PHYS-2012-019,
March 6, 2012.
[57] J. S. Gainer, K. Kumar, I. Low, and R. Vega-Morales. “Improving the sensi-
tivity of Higgs boson searches in the golden channel”. International Journal
of Modern Physics Letters A Vol. 20, 2011. arXiv:hep-ph/1108.2274v2.
[58] A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E. von Toerne, and
H. Voss. TMVA 4-Toolkit for Multivariate Data Analysis with ROOT, 2009.
arXiv:physics/0703039v5.
[59] P. Speckmayer, A. Hocker, J. Stelzer, and H. Voss. “The Toolkit for Mul-
tivariate Data Analysis, TMVA 4”. Journal of Physics: Conference Series
219, 2010. 032057.
[60] B. P. Roe, H. Yang, J. Zhu, Y. Liu, I. Stancu, and G. McGregor. “Boosted
decision trees as an alternative to artificial neural networks for particle
identification”. Nuclear Instruments and Methods in Physics Research A
543:577-584, 2005.
[61] Cern brochure, 2009. URL http://cdsweb.cern.ch/record/1165534/
files/CERN-Brochure-2009-003-Eng.pdf.
[62] E. Bruning, P. Collier, P. Lebrun, S. Myers, R. Ostojic, J. Poole, and
P. Proudlock. LHC Design Report. 1. The LHC main ring. Editorial Board,
CERN, 2004.
[63] E. Bruning, P. Collier, P. Lebrun, S. Myers, R. Ostojic, J.Poole, and
P.Proudlock. LHC Design Report. 2. The LHC infrastructure and general
services. Editorial Board, CERN, 2004.
[64] LHC homepage. URL http://lhc.web.cern.ch/lhc/.
[65] LHC collision rate. URL http://lhc-machine-outreach.web.cern.ch/
lhc-machine-outreach/collisions.htm.
[66] The ATLAS Collaboration. ATLAS muon spectrometer: Technical Design
Report. ATLAS-TDR-010, CERN-LHCC-97-022, CERN, Geneva, 1997.
[67] R. C. Fernow. Introduction to experimental particle physics. Cambridge
University Press, New York, 1986.
[68] The Snowmass Working Group on Precision Electroweak Measurements.
Present and future electroweak precision measurements and the indirect de-
termination of the mass of the Higgs boson. FERMILAB-CONF-02/010-T,
2002. arXiv:hep-ph/0202001v1.
[69] S. Gentile. Search for Higgs boson with the ATLAS detector.
[70] I. Tsukerman on behalf of CMS and ATLAS collaborations. Discovery
potential at the LHC: channels relevant for SM Higgs. ITEP, Moscow,
Russia, 2008. arXiv:hep-ph/0812.1458v1.
I wish to thank Professor Anna Di Ciaccio for giving me the opportunity to
work in such a stimulating and innovative field of research, and for the great
availability and kindness she has shown me.
Heartfelt thanks go to Doctor Andrea Di Simone, whose help has been
fundamental to the success of this work. I also wish to thank Doctor Luca
Mazzaferro, always ready to solve pressing problems so as not to slow down the
thesis work, and always present with constant support and interest in the study
I carried out.
I especially thank my fellow students, Andrea, Damiano, Giulio, Lorenzo,
Ludovico and Mattia, with whom I have shared these five years and who
brightened the days spent together at university.
A special thank-you goes to my two dearest friends, Flavia and Ilaria, at my
side on every occasion of both study and leisure, who often sacrificed themselves
as my outlets in these last months. I thank Flavia for her cheerfulness and
light-heartedness, contagious even in the most difficult moments, never letting
you lose sight of the goal of so much hard work and indeed contributing to its
achievement with greater serenity. To Ilaria I owe the realization, as well as the
writing, of this thesis: without her unfailing advice and suggestions and her
daily presence it would not have been possible.
Finally, I wish to thank with affection my whole family, for their constant
encouragement and support. In particular, my most sincere thanks go to my
parents, my brother and my sister, always ready to help me in any situation in
order to ease my studies, present every day with understanding and much
patience, in addition to an unfailing, and perhaps not always deserved, daily
dose of cuddles.