Università degli Studi di Roma Tor Vergata
FACOLTÀ DI SCIENZE MATEMATICHE, FISICHE E NATURALI
Corso di Laurea Magistrale in Fisica Nucleare e Subnucleare
Search for the Standard Model Higgs boson
in the four leptons decay channel with multivariate analysis
at the ATLAS experiment
Candidato:
Cristina Papaleo
Relatore:
Anna Di Ciaccio
Anno Accademico 2010-2011
Università degli Studi di Roma Tor Vergata
FACOLTÀ DI SCIENZE MATEMATICHE, FISICHE E NATURALI
Corso di Laurea Magistrale in Fisica Nucleare e Subnucleare
Ricerca del bosone di Higgs del Modello Standard
nel canale di decadimento a quattro leptoni
applicando l’analisi multivariata
nell’ambito dell’esperimento ATLAS
Candidato:
Cristina Papaleo
Relatore:
Anna Di Ciaccio
Anno Accademico 2010-2011
Contents
Introduction 1
Introduzione 3
1 The Standard Model and The Higgs Boson 6
1.1 The Standard Model (SM) . . . . . . . . . . . . . . . . . . . . . . 6
1.1.1 Particles and Interactions . . . . . . . . . . . . . . . . . . 6
1.1.2 The Gauge Field Theories . . . . . . . . . . . . . . . . . . 8
1.1.3 The Higgs Mechanism . . . . . . . . . . . . . . . . . . . . 12
1.2 The SM Higgs Boson . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.2.1 Theoretical Constraints on the Higgs Mass . . . . . . . . 19
1.2.2 Experimental Limits on the Higgs Mass . . . . . . . . . . 26
1.3 Higgs at the LHC . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
1.3.1 Production Mechanisms . . . . . . . . . . . . . . . . . . . 33
1.3.2 Decay Modes . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.3.3 Discovery Potential . . . . . . . . . . . . . . . . . . . . . . 45
1.3.4 The Higgs Mass and Total Decay Width . . . . . . . . . . 46
2 LHC and the ATLAS Detector 49
2.1 The Large Hadron Collider . . . . . . . . . . . . . . . . . . . . . 49
2.1.1 Architectural Overview . . . . . . . . . . . . . . . . . . . 51
2.2 ATLAS Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.2.1 Geometry and Definitions . . . . . . . . . . . . . . . . . . 55
2.2.2 Physics Requirements . . . . . . . . . . . . . . . . . . . . 56
2.2.3 ATLAS Detector Overview . . . . . . . . . . . . . . . . . 57
2.2.4 The Inner Detector (ID) . . . . . . . . . . . . . . . . . . . 58
2.2.5 The Calorimeters . . . . . . . . . . . . . . . . . . . . . . . 63
2.2.6 The Magnet System . . . . . . . . . . . . . . . . . . . . . 67
2.2.7 The Muon Spectrometer . . . . . . . . . . . . . . . . . . . 68
2.2.8 Trigger System . . . . . . . . . . . . . . . . . . . . . . . . 76
2.2.9 Electron Reconstruction and Identification . . . . . . . . . 79
2.2.10 Muon Reconstruction and Identification . . . . . . . . . . 84
3 Higgs search in the decay channel H → ZZ(∗) → 4l 89
3.1 Signal and Main Backgrounds . . . . . . . . . . . . . . . . . . . . 90
3.1.1 Data and Monte Carlo Samples . . . . . . . . . . . . . . . 92
3.2 Pileup Reweighting . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.3 Lepton Reconstruction and Identification . . . . . . . . . . . . . 97
3.3.1 GSF Electrons . . . . . . . . . . . . . . . . . . . . . . . . 99
3.4 Lepton Corrections . . . . . . . . . . . . . . . . . . . . . . . . . . 99
3.5 Event Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
3.5.1 Preliminary Cuts . . . . . . . . . . . . . . . . . . . . . . . 101
3.5.2 Event Preselection . . . . . . . . . . . . . . . . . . . . . . 102
3.5.3 Quadruplet Candidates and Higgs Candidate Selection . . 105
3.5.4 Reducible Background Rejection . . . . . . . . . . . . . . 106
3.5.5 Higgs Boson Mass Reconstruction . . . . . . . . . . . . . 110
3.6 Background Estimation . . . . . . . . . . . . . . . . . . . . . . . 111
3.7 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
3.7.1 Combined Results . . . . . . . . . . . . . . . . . . . . . . 115
4 Angular Analysis and TMVA 119
4.1 Angular Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.1.1 Kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.1.2 Angular and pT Distributions . . . . . . . . . . . . . . . . 124
4.2 TMVA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.2.1 What is TMVA . . . . . . . . . . . . . . . . . . . . . . . . 125
4.2.2 Boosted Decision Tree (BDT) . . . . . . . . . . . . . . . . 130
4.2.3 Artificial Neural Network (ANN) . . . . . . . . . . . . . . 135
4.2.4 Optimization of the MVA methods . . . . . . . . . . . . . 139
4.2.5 Monte Carlo Samples . . . . . . . . . . . . . . . . . . . . 139
4.2.6 Input Variables . . . . . . . . . . . . . . . . . . . . . . . . 140
4.2.7 Tuning Parameters for the implemented BDTG . . . . . . 141
4.2.8 Tuning Parameters for the implemented ANNs . . . . . . 144
4.2.9 Comparing MVA Methods Performance . . . . . . . . . . 150
4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Conclusions 163
A Vector Boson Fusion Angular Distributions 166
Bibliography 169
Introduction
The Standard Model of elementary particles is the theory that describes
three of the four fundamental interactions (the strong, weak and electromagnetic) in a coherent framework. It was developed over the last century and has proven extremely successful, providing an excellent description of all the phenomena observed in particle physics up to the energies explored by LEP (the Large Electron-Positron collider) and the Tevatron. Nevertheless,
the origin of particle masses remains an open question. The electroweak symmetry breaking mechanism provides an elegant answer. However, it also predicts a yet-undiscovered particle, the Higgs boson. The Higgs boson is searched for
through its direct production and its virtual effects on electroweak observables,
but so far physicists have only been able to set limits on the mass of this particle
since no signal has been observed. This quest is very important and the future
of particle physics will be driven by either the discovery or the exclusion of the
SM Higgs boson. In case of discovery, the Higgs boson mass can give hints about the scale at which new physics occurs; in case of exclusion, deep theoretical work
will be needed to find an alternative explanation of the experimental successes
of the SM.
In 2009 the Large Hadron Collider (LHC) started to provide proton-proton col-
lision data at the highest energy ever reached. This machine has been designed
to provide the ultimate answer about the Higgs boson existence because it will
be able to explore the whole mass range, from 114.4 GeV up to 1 TeV, where
the Higgs boson is expected to be.
This thesis describes the work dedicated to the search of the Higgs boson in
the ATLAS experiment at LHC. Since the Higgs mass is unknown, many mass
channels need to be studied; I have studied the channel H → ZZ(∗) → 4l, where
l denotes an electron or a muon. The presence of a real Z provides two high-pT leptons in the final state, together with two other leptons coming from the virtual Z. A mass constraint can be applied to both lepton pairs. The application of kinematic cuts on the lepton pT, isolation requirements for the leptons in the final state, and impact parameter requirements provide sufficient background rejection. This cut-based analysis has been implemented and its results have been validated against the ones obtained by the ATLAS Collaboration; I have then integrated it with a multivariate analysis (MVA).
For this purpose, I first carried out an angular analysis to find additional variables, besides the invariant mass of the selected four-lepton events, that can potentially discriminate between signal and background processes. I have selected the angular variable cos Θ of the momentum of the incoming quark in the CM frame, together with the Higgs transverse momentum and invariant mass, as input variables for the multivariate analysis, applied to a Higgs signal sample with mH = 360 GeV. The signal at high mass has
been considered as the benchmark case because in this range the discriminating
power of the chosen variables is greater. Then, using these observables, a search optimization has been performed by applying advanced techniques such as artificial neural networks and boosted decision trees. Therefore this
thesis will describe the application of the traditional cut-based analysis and of
the MVA to the Higgs search in the four leptons channel and will compare the
signal efficiency and background rejection for the two analyses.
In summary this thesis is organized as follows: in order to provide the nec-
essary preliminary notions, in the first chapter an introduction to the Stan-
dard Model and the SM electroweak symmetry breaking is given along with our
present knowledge about the SM Higgs boson and the latest results from the
direct and the indirect searches. The second chapter presents the main features
of the LHC accelerator and the ATLAS detector. Chapter 3 then describes the cut-based Higgs search analysis and the results that I have obtained by applying it to the data collected by ATLAS in 2011. The subsequent chapter explains the multivariate analysis tool and illustrates in detail the tuning of the multivariate methods selected to optimize the sensitivity for finding the Higgs boson in the considered channel. Finally, the optimization results for the different methods are discussed. Concluding remarks and perspectives are presented in a dedicated chapter.
Introduzione
Il Modello Standard è la teoria che descrive tre delle quattro interazioni fondamentali (forte, debole ed elettromagnetica) in un quadro coerente. È stato sviluppato durante l'ultimo secolo e si è dimostrato estremamente valido fornendo un'eccellente descrizione di tutti i fenomeni osservati in fisica delle particelle fino alle energie esplorate da LEP e Tevatron. Tuttavia l'origine della massa delle particelle rimane ancora una questione aperta. Il meccanismo di rottura della simmetria elettrodebole fornisce un'elegante soluzione. Tale meccanismo però predice anche l'esistenza di una particella non ancora scoperta, il bosone di Higgs. La ricerca di tale bosone avviene attraverso la sua produzione diretta e attraverso misure indirette, a partire dagli effetti virtuali che il bosone di Higgs induce sugli osservabili elettrodeboli. Finora i fisici sono stati in grado solo di fissare dei limiti sulla massa di questa particella poiché nessun segnale è stato osservato. Questa ricerca è molto importante e il futuro della fisica particellare verrà determinato dalla scoperta o dall'esclusione del bosone di Higgs predetto dal Modello Standard. In caso di scoperta, la massa del bosone di Higgs aiuta ad individuare la scala di energia alla quale appare nuova fisica; in caso di esclusione invece un profondo lavoro teorico sarà necessario per trovare una spiegazione alternativa ai successi sperimentali del Modello Standard.
Nel 2009 l'acceleratore LHC ha iniziato a fornire eventi di collisioni protone-protone alla più alta energia mai raggiunta. Questa macchina è stata costruita al fine di fornire la risposta definitiva sull'esistenza del bosone di Higgs, poiché sarà in grado di esplorare l'intero intervallo di massa, che va da 114.4 GeV ad 1 TeV, in cui ci si aspetta di trovarlo.
Questa tesi descrive il lavoro dedicato alla ricerca del bosone di Higgs condotta nell'ambito dell'esperimento ATLAS ad LHC. Poiché la massa del bosone di Higgs non è nota, molti sono i canali di ricerca che devono essere indagati; io ho studiato il canale H → ZZ(∗) → 4l, dove con l si indica un elettrone o un muone. La presenza di una Z reale produce due leptoni di alto pT nello stato
finale insieme ad altri due leptoni provenienti dalla Z virtuale. Un vincolo sulla massa può essere posto per entrambe le coppie leptoniche. L'applicazione di tagli sul pT dei leptoni, richieste di isolamento per i leptoni nello stato finale e le condizioni poste sul parametro di impatto forniscono una reiezione del fondo sufficiente. Questa analisi, basata sull'applicazione di alcuni tagli (cut-based), è stata implementata e i suoi risultati sono stati convalidati con quelli ottenuti dalla Collaborazione ATLAS; ho quindi poi integrato questa analisi con una multivariata (MVA).
A tal scopo, dapprima ho effettuato un'analisi angolare per trovare ulteriori variabili, oltre la massa invariante degli eventi a quattro leptoni selezionati, che potenzialmente potessero contribuire alla separazione tra segnale e processi di fondo. Ho selezionato la variabile angolare cos Θ dell'impulso del quark incidente nel sistema di riferimento del centro di massa, insieme al momento trasverso dell'Higgs e alla sua massa invariante, come variabili di input per applicare un'analisi multivariata ad un campione di segnale, corrispondente ad un Higgs di massa mH = 360 GeV. È stato considerato il segnale ad alta massa come caso di riferimento perché in questa regione il potere discriminante delle variabili scelte è maggiore. Usando quindi gli osservabili così trovati, è stata eseguita una ricerca di ottimizzazione applicando tecniche avanzate come le reti neurali artificiali e gli alberi decisionali (boosted decision tree). Pertanto in questo lavoro di tesi si presenterà l'applicazione della tradizionale analisi cut-based e di MVA alla ricerca dell'Higgs nel canale di decadimento in quattro leptoni e si confronterà l'efficienza di segnale e la reiezione del fondo per le due analisi.
In breve questa tesi è organizzata come segue: al fine di fornire le nozioni preliminari necessarie, nel primo capitolo un'introduzione al Modello Standard e alla rottura della simmetria elettrodebole viene data insieme alle attuali conoscenze del bosone di Higgs del Modello Standard e agli ultimi risultati delle ricerche dirette e indirette. Il secondo capitolo presenta le principali caratteristiche dell'acceleratore LHC e del rivelatore ATLAS. Poi, il Capitolo 3 descrive l'analisi cut-based per la ricerca dell'Higgs ed i risultati ottenuti applicandola ai dati raccolti da ATLAS nel 2011. Il successivo capitolo illustra lo strumento di analisi multivariata e come sono stati scelti i parametri caratterizzanti i metodi selezionati al fine di ottimizzare la sensibilità per trovare il bosone di Higgs nel canale considerato. Infine, i risultati di ottimizzazione per diversi metodi sono discussi. Osservazioni conclusive e prospettive future sono delineate in un capitolo a parte.
Chapter 1
The Standard Model and The Higgs Boson
The Standard Model (SM) is a successful theory incorporating the present
understanding of fundamental particles and their interactions. The SM is perturbative at sufficiently high energies and renormalizable thanks to its gauge-invariant formulation. It accommodates essentially all the known experimental facts and precision measurements performed at high-energy particle colliders over the last decades. However it remains “incomplete”: the existence of dark
matter and the gravitational interaction are not described; the mechanism for
electroweak symmetry breaking that gives masses to the particles is not iden-
tified, and the associated particle, the Higgs boson, has not been observed yet.
Therefore it is crucial to prove the Higgs existence and the validity of the theory
or completely exclude it over the entire allowed mass range.
This chapter briefly describes the Standard Model and its key ingredients with
some attention to the mechanism which predicts the existence of the Higgs par-
ticle and describes the origin of the masses of the fundamental particles. Then
it will discuss the Higgs boson production, decay modes and the limits on its
mass.
1.1 The Standard Model (SM)
1.1.1 Particles and Interactions
In our current understanding, the physical world is composed of a few fundamental building blocks, which are collectively called matter, and is shaped
by their interactions, which are collectively called forces. In the Standard
Model particles can be divided in two categories: fermions and bosons [1].
The fermions are the building blocks of ordinary matter. By definition they have half-integer spin, and the fundamental fermions in nature, and therefore in the SM, all have spin 1/2. Fermions obey Fermi-Dirac statistics and the Pauli exclusion principle. The fermions can be grouped into leptons (l) and quarks (q), as shown in Table 1.1. Quarks have an electric charge that is a fraction of the electron's charge and carry a color charge. They do not exist as free particles but combine to form particles called hadrons. Leptons are typically divided into charged and neutral ones. The latter are referred to as neutrinos; they interact with matter only weakly and can transform into one another as they propagate through matter. Moreover, both leptons and quarks can be organized in three symmetric sets of particles, called families, with increasing mass. Each family includes two quarks and two leptons.
There are four kinds of interactions between the fermions: the gravitational, the electromagnetic, the weak and the strong.

  Quarks
    Up-type:    Q = +2/3, T3,L = +1/2, T3,R = 0, YL = +1/3, YR = +4/3
    Down-type:  Q = −1/3, T3,L = −1/2, T3,R = 0, YL = +1/3, YR = −2/3
    Generation I:    u  1.7 − 3.3 MeV            d  4.1 − 5.8 MeV
    Generation II:   c  1.18 − 1.34 GeV          s  80 − 130 MeV
    Generation III:  t  173.1 ± 0.6 ± 1.1 GeV    b  4.13 − 4.37 GeV

  Leptons
    Charged:    Q = −1, T3,L = −1/2, T3,R = 0, YL = −1, YR = −2
    Neutral:    Q = 0,  T3,L = +1/2, T3,R = 0, YL = −1, YR = 0
    Generation I:    e  0.511 MeV    νe < 3 eV
    Generation II:   µ  106 MeV      νµ < 19 keV
    Generation III:  τ  1.78 GeV     ντ < 18.2 MeV

Table 1.1: Fermions of the SM. The properties of these particles are expressed in terms of their mass, their charge (Q), their weak isospin (T3 is its third component) and their hypercharge (Y). These quantum numbers are related by Q = T3 + Y/2. The table also distinguishes the left-handed and right-handed components of the fermionic fields (indicated by the subscripts L and R) [1].

Each interaction is mediated by
one or more massive or massless spin-1 particles, summarized in Table 1.2 along
with the interaction’s range. The integer spin particles obey the Bose-Einstein
statistics and are referred to as bosons. Gravitation, too, should be mediated by a boson, the graviton, although with spin 2 rather than 1, but there is still no evidence of its existence. The last boson which appears in Table 1.2 is the Higgs boson. This particle has never been observed and its search is the topic of this thesis. Here it is only mentioned that, unlike the other bosons, it is not the mediator of a force; SM predictions expect it to be scalar, i.e. spin 0, and neutral. It appears as a consequence
of the Higgs mechanism after the spontaneous electroweak symmetry breaking
(see Section 1.1.3). Through this mechanism, as a consequence of the interac-
tion with the Higgs field, the vector bosons and the fermions acquire mass.
It is noted that in the following the natural system of units ℏ = c = 1 will be used.
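The relation Q = T3 + Y/2 quoted in the caption of Table 1.1 can be checked numerically. The sketch below is purely illustrative (it is not part of the thesis analysis) and uses the left-handed quantum numbers from the table:

```python
# Check of Q = T3 + Y/2 for the left-handed fermion quantum numbers
# listed in Table 1.1 (illustrative sketch, not part of the thesis).
from fractions import Fraction as F

# (name, electric charge Q, weak isospin T3 (left-handed), hypercharge Y (left-handed))
fermions = [
    ("up-type quark",   F(2, 3),  F(1, 2),  F(1, 3)),
    ("down-type quark", F(-1, 3), F(-1, 2), F(1, 3)),
    ("charged lepton",  F(-1),    F(-1, 2), F(-1)),
    ("neutrino",        F(0),     F(1, 2),  F(-1)),
]

for name, Q, T3, Y in fermions:
    assert Q == T3 + Y / 2, name
    print(f"{name:>15}: Q = {Q} = {T3} + ({Y})/2")
```

Exact rational arithmetic (`fractions.Fraction`) avoids any floating-point ambiguity in the 1/3 and 2/3 charges.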
  Force            Relative strength   Range [m]   Boson          Spin   Mass [GeV]
  Strong           1                   10⁻¹⁵       8 gluons (g)   1      0
  Electromagnetic  10⁻²                ∞           photon (γ)     1      0
  Weak             10⁻²                10⁻¹³       W±             1      80.399 ± 0.023
                                                   Z              1      91.1876 ± 0.0021
  Gravitational    10⁻⁴⁰               ∞           graviton (?)   2      0
  —                —                   —           Higgs (H)      0      ?

Table 1.2: The particle interactions with their carrier particles, whose mass and spin are reported. The relative strength and effective range of the forces are also shown [1]. The gravitational force is not described by the SM. The graviton and the Higgs particle have not yet been observed experimentally.
1.1.2 The Gauge Field Theories
The Standard Model has been developed from models proposed in the 1960s by Glashow, Weinberg and Salam. The Standard Model description of particles
and forces in nature is based on the mathematical language of the Quantum
Field Theory (QFT), where particles are excitations of fundamental fields which
are functions of, or extend in, space and time. Particle dynamics are described
by a Lagrangian density L, simply referred to as the Lagrangian hereafter¹.
Fermions are represented mathematically by matter fields, while the interactions between them are represented by gauge fields that operate on the matter fields. For a given Lagrangian description of a system, gauge invariance means that L is invariant under local symmetry transformations². Any such symmetry corresponds to a conservation law and vice versa (Noether's theorem): for instance, if invariance under temporal translations is required, conservation of energy is obtained, whereas spatial translation invariance implies conservation of momentum.
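Noether's theorem can be illustrated with a minimal symbolic sketch (an illustration with sympy, not part of the thesis, using a one-dimensional harmonic oscillator rather than a field theory): time-translation invariance of the Lagrangian implies that the associated Noether charge, the energy, is conserved on the equation of motion.

```python
import sympy as sp

t, m, k = sp.symbols('t m k', positive=True)
x = sp.Function('x')

# Lagrangian of a 1-D particle in a harmonic potential: L = T - V
L = m * sp.diff(x(t), t)**2 / 2 - k * x(t)**2 / 2

# Noether charge of time-translation invariance: the energy E = x' * dL/dx' - L
E = sp.diff(x(t), t) * sp.diff(L, sp.diff(x(t), t)) - L

# Euler-Lagrange equation of motion: m x'' = -k x
xdd = sp.solve(sp.Eq(m * sp.diff(x(t), t, 2), -k * x(t)), sp.diff(x(t), t, 2))[0]

# dE/dt vanishes once the equation of motion is imposed
dEdt_on_shell = sp.diff(E, t).subs(sp.diff(x(t), t, 2), xdd)
assert sp.simplify(dEdt_on_shell) == 0
```

The same logic carries over to field theories, with the Lagrangian density replacing L and conserved currents replacing conserved charges.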
The SM is a renormalizable quantum field theory which provides a unified ap-
proach for the description of the electromagnetic, weak and strong interactions.
The Standard Model is based on the gauge symmetry SU(3)C×SU(2)L×U(1)Y .
¹ In the following we will not use the Lagrangian L to describe the system, but the Lagrangian density L, which is related to L by L = ∫ L d³x. For simplicity, L will be called the Lagrangian as well.
² These symmetries are said to be global if they are the same at any point in the Universe. In the SM local symmetries are imposed as well as global ones. These stricter requirements imply that certain local (position-dependent) transformations should leave all physical quantities conserved locally, apart from the globally preserved ones.
SU(3)C is the group of color symmetry, described within the frame of Quantum
Chromo Dynamics (QCD), SU(2)L the one of the weak isospin symmetry and
U(1)Y the one for the hypercharge symmetry. The symmetry SU(2)L ×U(1)Y ,
that represents the unified weak and electromagnetic interaction, is broken spon-
taneously by the Higgs mechanism (SU(2)L × U(1)Y → U(1)Q) [2, 3].
1.1.2.1 Quantum Electrodynamics (QED)
The Lagrangian of the massless electromagnetic field Aµ interacting with a
spin-1/2 field ψ of bare mass m is
L = −(1/4) FµνFµν + ψ̄(iγµDµ − m)ψ   (1.1)

where ψ̄ = ψ†γ0 and γµ are the 4 × 4 Dirac matrices satisfying the anticommutation relation {γµ, γν} = 2gµν, with gµν being the metric tensor. The
electromagnetic field tensor is defined as
Fµν = ∂µAν − ∂νAµ (1.2)
The Lagrangian in (1.1) is obtained by requiring that the Dirac Lagrangian of the free spin-1/2 particle

L = ψ̄(iγµ∂µ − m)ψ   (1.3)

becomes invariant under local U(1)Q transformations³ of the form:

U(x) = exp (−ieQα(x))   (1.4)

where e is the unit electric charge and Q is the charge operator⁴.
The Lagrangian invariance can be maintained with the addition of a spin-1 field Aµ, called a gauge boson. Under this transformation ψ and Aµ change as:

ψ(x) → ψ′(x) = ψ(x) exp (−ieQα(x))   (1.5)
Aµ(x) → A′µ(x) = Aµ(x) + ∂µα(x)   (1.6)
For (1.3) to be invariant under (1.4), the covariant derivative Dµ

Dµ = ∂µ + ieAµQ   (1.7)

³ U(1) is the group of all complex numbers with modulus one.
⁴ Qψ = qf ψ, where qf = 1 for the electrons.
has to be introduced. Therefore the new Lagrangian can be written as (1.1). It is characterized by a term representing the original electron field (ψ̄(iγµ∂µ − m)ψ), composed of the fermionic kinetic term (iψ̄γµ∂µψ) and the fermionic mass term (mψ̄ψ). The term eψ̄γµAµQψ describes the interaction between the vector field Aµ and the electromagnetic current; the strength of the interaction is proportional to the value of the constant e. The new field Aµ is thus the photon field, and the interaction term appearing in the Lagrangian due to the local gauge invariance describes the electromagnetic interactions mediated by photons. Finally, the first term, −(1/4)FµνFµν, is the kinetic term of the photon.
The respective conserved current is the electric current:
jµ = eψ̄γµQψ   (1.8)
The photon is massless: a mass term of the form m²AµAµ cannot be introduced, as it would break gauge invariance.
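These two statements can be checked symbolically. The sketch below (an illustration with sympy, not from the thesis) verifies that Fµν of eq. (1.2) is unchanged under the gauge transformation (1.6), while a naive mass term of the form m²ΣµAµAµ is not:

```python
import sympy as sp

t, x1, x2, x3 = sp.symbols('t x1 x2 x3')
X = (t, x1, x2, x3)

alpha = sp.Function('alpha')(*X)                           # arbitrary gauge function α(x)
A = [sp.Function('A%d' % mu)(*X) for mu in range(4)]       # photon field Aµ
Ap = [A[mu] + sp.diff(alpha, X[mu]) for mu in range(4)]    # transformed field A'µ, eq. (1.6)

def F(Af, mu, nu):
    """Field-strength tensor Fµν = ∂µAν − ∂νAµ, eq. (1.2)."""
    return sp.diff(Af[nu], X[mu]) - sp.diff(Af[mu], X[nu])

# Fµν is gauge invariant: the mixed partial derivatives of α cancel
assert all(sp.simplify(F(Ap, mu, nu) - F(A, mu, nu)) == 0
           for mu in range(4) for nu in range(4))

# A naive mass term ~ ΣµAµAµ is NOT invariant (metric signs are
# ignored here; they do not change the conclusion)
assert sp.simplify(sum(Ap[mu]**2 - A[mu]**2 for mu in range(4))) != 0
```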
1.1.2.2 Quantum Chromodynamics (QCD)
The strong interactions are described by a local non-abelian gauge theory,
in which SU(3)C is the gauge group and gluons are the gauge bosons. The
corresponding Lagrangian is:
L = −(1/4) Gµνa Gaµν + q̄j(iγµ Dµjk − Mjk) qk   (1.9)
where Mjk is the quark mass matrix. The Latin indices refer to color and
assume values a = 1,2,. . . 8 for the eight gluons and j,k =1,2,3 for the three
quarks. The gluon field tensor is defined as:
Gµνa = ∂µGνa − ∂νGµa − gs fabc Gµb Gνc   (1.10)
here Gµa are the gluon fields, gs is the strong coupling and fabc are the structure
constants of the SU(3) group. The covariant derivative acting on the quark
fields is:
Dµjk = δjk ∂µ + i gs (Ta)jk Gµa   (1.11)
where Ta are the generators of the SU(3) group, defined by the commutation relation [Ta, Tb] = i fabc Tc. In particular Ta = λa/2, where the λa are the eight Gell-Mann matrices: hermitian, traceless 3 × 3 matrices.
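These properties can be checked numerically. The following sketch (illustrative, not part of the thesis) builds the Gell-Mann matrices in the standard convention, verifies that they are hermitian and traceless, and extracts the structure constants fabc from the commutation relation:

```python
import numpy as np

# The eight Gell-Mann matrices λa (standard convention)
l = np.zeros((8, 3, 3), dtype=complex)
l[0][0, 1] = l[0][1, 0] = 1
l[1][0, 1] = -1j; l[1][1, 0] = 1j
l[2][0, 0] = 1; l[2][1, 1] = -1
l[3][0, 2] = l[3][2, 0] = 1
l[4][0, 2] = -1j; l[4][2, 0] = 1j
l[5][1, 2] = l[5][2, 1] = 1
l[6][1, 2] = -1j; l[6][2, 1] = 1j
l[7][0, 0] = l[7][1, 1] = 1 / np.sqrt(3); l[7][2, 2] = -2 / np.sqrt(3)

T = l / 2                                   # generators Ta = λa/2

# hermitian and traceless, as stated in the text
assert all(np.allclose(m, m.conj().T) for m in l)
assert all(abs(np.trace(m)) < 1e-12 for m in l)

# structure constants from fabc = -2i Tr([Ta, Tb] Tc), using Tr(Ta Tb) = δab/2
f = np.zeros((8, 8, 8))
for a in range(8):
    for b in range(8):
        comm = T[a] @ T[b] - T[b] @ T[a]
        for c in range(8):
            f[a, b, c] = np.real(-2j * np.trace(comm @ T[c]))

# e.g. f123 = 1 and f458 = sqrt(3)/2 in this normalization (0-based indices below)
assert np.isclose(f[0, 1, 2], 1.0)
assert np.isclose(f[3, 4, 7], np.sqrt(3) / 2)
```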
The second term of (1.9) contains a quark-gluon interaction vertex, while the
first term contains three and four gluon couplings. These self-interactions of the
gluons, which have no analog in QED where the photon is electrically neutral,
are a consequence of the fact that gluons also carry color charge due to the
non-abelian nature of the group.
Gluons are required to be massless, since the presence of a mass term for the gauge fields would break the gauge invariance of the Lagrangian.
1.1.2.3 Weak Interactions and Electroweak Unification
The fermions are grouped into left-handed and right-handed fields:

ψL = PLψ = (1/2)(1 − γ5)ψ   (1.12)
ψR = PRψ = (1/2)(1 + γ5)ψ   (1.13)

where PL,R are the chirality⁵ operators and γ5 = iγ0γ1γ2γ3 [4].
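That PL and PR are indeed projectors onto the two chirality eigenspaces can be verified numerically. The sketch below (illustrative, using the Dirac representation of the γ matrices) checks the projector algebra:

```python
import numpy as np

I2 = np.eye(2)
Z2 = np.zeros((2, 2))
s = [np.array([[0, 1], [1, 0]]),
     np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]])]                      # Pauli matrices

# Dirac matrices in the Dirac representation
g0 = np.block([[I2, Z2], [Z2, -I2]])
g = [np.block([[Z2, si], [-si, Z2]]) for si in s]      # γ1, γ2, γ3
g5 = 1j * g0 @ g[0] @ g[1] @ g[2]                      # γ5 = iγ0γ1γ2γ3

I4 = np.eye(4)
PL = (I4 - g5) / 2
PR = (I4 + g5) / 2

# PL and PR are orthogonal projectors that sum to the identity
assert np.allclose(PL @ PL, PL) and np.allclose(PR @ PR, PR)
assert np.allclose(PL @ PR, np.zeros((4, 4)))
assert np.allclose(PL + PR, I4)
```

The checks only rely on γ5² = 1, so they hold in any representation of the Dirac algebra.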
Only left-handed particles and right-handed antiparticles participate in the weak
interaction. The left-handed fermions are SU(2)⁶ doublets⁷,

ψL = (νe, e−)L, (νµ, µ−)L, (ντ, τ−)L, (u, d′)L, (c, s′)L, (t, b′)L   (1.14)

while the right-handed fermions are singlets

ψR = (e−)R, (µ−)R, ..., (u)R, (d)R, ...   (1.15)
As a consequence of these couplings the SU(2) symmetry is denoted as SU(2)L.
The electromagnetic and weak interactions are unified in the Glashow, Salam and Weinberg (GSW) theory, the Electroweak (EW) theory. The simplest unification of the parity-violating weak force and the parity-conserving electromagnetic force is the SU(2)L × U(1)Y gauge theory.
Local gauge invariance under SU(2) transformations requires the introduction of three massless spin-1 gauge bosons Wµi, i = 1, 2, 3. The conserved quantity is called weak isospin (Ta, with a = 1, 2, 3). An additional U(1) symmetry is added to include the electromagnetic interaction in the EW theory. It is an independent gauge symmetry of the weak hypercharge (Y), which is specified by the formula Q = T3 + Y/2, where Q is the electric charge and T3 the third component of the weak isospin. This symmetry is denoted as U(1)Y and it requires an additional spin-1 gauge boson Bµ. The U(1)Y gauge boson
⁵ Chirality is a property of the field defined by the operator γ5, which is formed, as shown in the text, by the product of the Dirac matrices, so that it anti-commutes with all the others. For massless particles chirality coincides with helicity: fermions with right-handed (left-handed) helicity are those whose spin points in the same (opposite) direction as the momentum. For antifermions this convention is reversed.
⁶ SU(2) is the group of special unitary 2 × 2 matrices. A unitary matrix satisfies T†a = T⁻¹a, where T†a is the hermitian conjugate matrix. The generators of SU(2) are Ta = τa/2, where the τa are the Pauli matrices defined in weak isospin space.
⁷ d′, s′, b′ are the eigenstates of the weak interactions.
couples to both the right-handed and the left-handed components⁸.
The gauge invariant Lagrangian describing the electroweak interactions is

L = −(1/4) Waµν Wµνa − (1/4) Bµν Bµν + ψ̄ iγµDµ ψ   (1.18)

where the field tensors Wiµν and Bµν are defined as:

Wiµν = ∂µWiν − ∂νWiµ − g εijk Wjµ Wkν   (1.19)
Bµν = ∂µBν − ∂νBµ   (1.20)
Additionally, cubic and quartic self-couplings of the Wiµ fields have been introduced. The covariant derivative is

Dµ = ∂µ + ig Waµ Ta + i(g′/2) Bµ Y   (1.21)
where g is the SU(2) coupling constant and g′ the U(1) coupling constant. The interaction between the fermions and the gauge fields is

Lint = −ψ̄L γµ [ (g/2) Waµ τa + (g′/2) Bµ Y ] ψL − ψ̄R γµ [ (g′/2) Bµ Y ] ψR   (1.22)
It should be noted that, for local gauge invariance to be preserved, no mass terms for the fermions or the gauge bosons (m²BµBµ and m²WµWµ) can be introduced in the Lagrangian.
1.1.3 The Higgs Mechanism
The Standard Model, i.e. the SU(3)C × SU(2)L × U(1)Y theory, is the
combination of the electroweak theory and QCD. The symmetry of SU(2)L × U(1)Y, i.e. the invariance under a local gauge transformation, requires the presence of massless gauge bosons in the EW theory. This conflicts with the experimental measurements of the W± and Z gauge bosons, according to which their masses are large and cannot be neglected (see Table 1.2). A solution has been proposed by F. Englert, R. Brout, P. Higgs and, independently, G. Guralnik, C. R. Hagen and T. Kibble. They conjectured that the massless gauge bosons of the weak interaction acquire their mass through the interaction with a scalar field (the
⁸ Transformations of the left-handed and right-handed components under SU(2)L × U(1)Y:

• Under an SU(2)L transformation:
  ψL(x) → ψ′L(x) = exp(ig αa(x) τa/2) ψL(x),   ψR(x) → ψ′R(x) = ψR(x)   (1.16)

• Under a U(1)Y transformation:
  ψL(x) → ψ′L(x) = exp(ig′ β(x) Y/2) ψL(x),   ψR(x) → ψ′R(x) = exp(ig′ β(x) Y/2) ψR(x)   (1.17)
Higgs Field), resulting in a single massless gauge boson (the photon) and three
massive gauge bosons (W± and Z). This is possible because the Higgs field has
a potential function which allows degenerate vacuum solutions with a non-zero
vacuum expectation value [5].
In the context of the SU(2)L × U(1)Y symmetry, the Higgs mechanism is implemented through an additional SU(2)L doublet of complex scalar fields (four real scalar fields)

φ = (φ+, φ0) = (1/√2) (φ1 + iφ2, φ3 + iφ4),   (1.23)
the self-interaction of which leads to the spontaneous electroweak symmetry
breaking⁹. The quantum numbers of these fields are summarized in Table 1.3.
The Higgs sector of the Lagrangian is
        T     T3     Y/2   Q
  φ+   1/2   +1/2   1/2    1
  φ0   1/2   −1/2   1/2    0

Table 1.3: The quantum numbers of the complex scalar fields of the SU(2)L doublet φ.
LH = (Dµφ)†(Dµφ)− V (φ) (1.24)
where the most general renormalizable form of the scalar potential is
V (φ) = µ2φ†φ+ λ(φ†φ)2 = µ2|φ|2 + λ|φ|4 (1.25)
The potential is chosen to be an even function of the scalar field, i.e. V(φ) = V(−φ), so that the Lagrangian is invariant under the parity transformation φ → −φ. The potential is parametrized by λ and µ. The parameter λ, the strength of the quartic self-coupling of the scalar field (the φ⁴ term), is required to be positive so that the energy is bounded from below; this requirement ensures the existence of stable ground states. Two qualitatively different cases, corresponding to manifest or spontaneously broken symmetry, may be distinguished depending on the sign of the coefficient µ².
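The two regimes can be illustrated with a short numerical sketch (not from the thesis; λ = 0.5 and the grid are arbitrary choices) that locates the minimum of V as a function of |φ|²:

```python
import numpy as np

def V(phi2, mu2, lam=0.5):
    """Higgs potential V = µ²|φ|² + λ|φ|⁴ as a function of |φ|², cf. eq. (1.25)."""
    return mu2 * phi2 + lam * phi2**2

phi2 = np.linspace(0, 4, 100001)   # grid of |φ|² values

# µ² > 0: unique minimum at φ = 0 (unbroken symmetry)
assert phi2[np.argmin(V(phi2, mu2=+1.0))] == 0

# µ² < 0: minimum shifted to |φ|² = −µ²/(2λ), a non-zero vacuum expectation value
mu2, lam = -1.0, 0.5
phi2_min = phi2[np.argmin(V(phi2, mu2, lam))]
assert np.isclose(phi2_min, -mu2 / (2 * lam), atol=1e-3)   # = v²/2 = 1 here
```

For µ² < 0 the grid minimum reproduces the analytic stationary point −µ²/2λ of the text.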
If µ2 > 0, the potential has a unique minimum at φ = 0 that corresponds to
the ground state, i.e. the vacuum. In terms of a quantum field theory, where
⁹ One might break this symmetry by simply introducing by hand a mass term for the gauge bosons, which violates the symmetry; however, such a procedure would destroy the renormalizability of the theory. A more elegant way to break the symmetry, called “spontaneous symmetry breaking”, is used instead. In this scenario the gauge invariant Lagrangian is maintained, while the state of lowest energy, which is interpreted as the vacuum state, is not gauge invariant. There is an infinite number of states, each of which has the same ground-state energy, and nature chooses one as the state of “true” vacuum.
1. The SM and The Higgs Boson 1.1 The Standard Model (SM)
Figure 1.1: Illustration of the Higgs potential for a scalar field φ = φ1 + iφ2, with (a) µ² > 0 and (b) µ² < 0.
φ is an operator, the precise statement is that the operator φ has zero vacuum
expectation value (vev), i.e. 〈φ〉0 = 〈0|φ|0〉 = 0. The vacuum obeys the re-
flection symmetry of the Lagrangian. In this case, aside from the φ4 term the
Lagrangian is just the Lagrangian for a charged scalar particle of mass µ and
massless gauge bosons.
If µ2 < 0, the Lagrangian has a mass term of the wrong sign for the field φ and
the minimum energy is not at φ = 0. The potential adopts a shape known as
the “Mexican hat”, with a maximum at φ = 0, as can be seen in Figure 1.1(b).
The vacuum expectation value is obtained by looking at the stationary points
of L:
\[
\frac{\partial \mathcal{L}}{\partial(\phi^\dagger\phi)} = 0
\;\Rightarrow\;
\phi_0^2 = \phi^\dagger\phi \equiv \frac{1}{2}\left(\phi_1^2+\phi_2^2+\phi_3^2+\phi_4^2\right)
= -\frac{\mu^2}{2\lambda} \equiv \frac{v^2}{2} \neq 0 \tag{1.26}
\]
where v is referred to as the vacuum expectation value of the scalar field. It must
be noted that it is not zero. The values of (Reφ+, Imφ+, Reφ0, Imφ0) can range
over the surface of a 4-dimensional sphere of radius v, such that v2 = −µ2/λ
and φ†φ = |φ⁺|² + |φ⁰|². This implies that the Lagrangian of φ is invariant under
rotations of this 4-dimensional sphere. The minimum of this potential no longer
corresponds to a unique value of φ but there is an infinite number of states
with the same lowest energy: the vacuum is degenerate. The location of the
minima, φ0, satisfies
\[
\phi_0 = e^{i\theta}\sqrt{-\frac{\mu^2}{2\lambda}} \tag{1.27}
\]
where 0 ≤ θ ≤ 2π is the angle around the axis of the potential, V (φ). Choosing
one of the non-zero ground states φ0 for µ2 < 0 spontaneously breaks the
SU(2)L × U(1)Y symmetry down to U(1)Q. The Lagrangian is still invariant, but
the ground state is no longer symmetric under SU(2)L × U(1)Y. This is illustrated
in Figure 1.1: the potential in Figure 1.1(b) is symmetric under rotations but
any minimum chosen is not. Usually, θ = 0 is used to fix one vacuum state:
φ0(θ = 0) ≡ φvacuum. The direction of the minimum in the SU(2)L × U(1)Y
space is not determined since the potential depends only on the combination
φ†φ. Without loss of generality, the Higgs field is now fixed such that the vacuum
expectation value of φ is defined to be a real parameter in the φ⁰ direction, i.e.
φ1 = φ2 = φ4 = 0, φ3² = −µ²/λ:
\[
\phi_0 = \langle 0|\phi|0\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v \end{pmatrix} \tag{1.28}
\]
φ⁺ is chosen to be zero because the vacuum state has to be neutral in order to
break the SU(2)L × U(1)Y symmetry while preserving U(1)Q. Acting with the charge
operator Q on φ0 and using the properties of the Higgs field, this leads to:
\[
Q\phi_0 = \left(T_3 + \frac{Y}{2}\right)\phi_0
= \left(-\frac{1}{2} + \frac{1}{2}\right)\phi_0 = 0 \tag{1.29}
\]
Since Qφ0 = 0, as shown above, a U(1)Q transformation leaves the vacuum invariant:
\[
\phi_0 \to \phi_0' = \exp\left(-ie\alpha(x)Q\right)\phi_0
\simeq \left(1 - ie\alpha Q\right)\phi_0 = \phi_0 \tag{1.30}
\]
Thus this choice for the vacuum state breaks both SU(2)L and U(1)Y but not
U(1)Q. In this way the vacuum stays neutral but it carries a hypercharge and
an isospin so that it couples to weak bosons.
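As a quick numerical illustration (not part of the original derivation), one can verify that for µ² < 0 the potential V = µ²|φ|² + λ|φ|⁴ of (1.25) is minimized at |φ|² = −µ²/(2λ) = v²/2, as stated in (1.26); the values of µ² and λ below are arbitrary toy inputs.

```python
# Toy numerical check of Eq. (1.26): for mu^2 < 0 the potential
# V = mu2*|phi|^2 + lam*|phi|^4 is minimised at |phi|^2 = -mu2/(2*lam) = v^2/2.
# The parameter values are arbitrary illustrative choices.
mu2 = -1.0   # mu^2 < 0: spontaneously broken phase
lam = 0.5    # quartic coupling, required to be positive

def V(phi2):
    """Potential as a function of |phi|^2."""
    return mu2 * phi2 + lam * phi2 ** 2

# coarse scan over |phi|^2 to locate the minimum numerically
phi2_min = min(range(5000), key=lambda k: V(0.001 * k)) * 0.001
v2 = -mu2 / lam                      # v^2 from Eq. (1.26)
print(phi2_min, v2 / 2)              # numerical and analytic minima agree
```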
The physical content of the theory is revealed by a perturbative expansion of the
Lagrangian around the ground state: expanding φ(x) about this particular vacuum,
one can parametrize excitations from the ground state by
\[
\phi = \frac{1}{\sqrt{2}}\, e^{\,i\xi_a(x)\tau^a/2v}
\begin{pmatrix} 0 \\ v + H(x) \end{pmatrix} \tag{1.31}
\]
where the real fields ξ1, ξ2, ξ3 and H have a zero vacuum expectation value. This
gives rise to a scalar field H(x), the massive Higgs field, which describes radial
excitations from the ground state changing the potential energy, and to three
massless scalar fields ξa(x), the Goldstone bosons, corresponding to angular
excitations without potential energy change. These three massless scalar bosons
correspond to the three broken symmetry generators. These phase factors,
and thus the Goldstone bosons, can be eliminated by a local SU(2)L gauge
transformation with α(x) = ξ(x)/v; this gauge choice is referred to as the unitary
gauge, leading to the following parametrization of the scalar field:
\[
\phi = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + H(x) \end{pmatrix} \tag{1.32}
\]
Here the degrees of freedom represented by the Goldstone bosons are absorbed
(“eaten up”) by the vector particles W± and Z, giving them an additional degree
of freedom, a longitudinal polarization; thus the vector bosons acquire mass.
Only massive particles, which travel below the speed of light, can have longitudinal
degrees of freedom; the photon has only transverse polarizations10.
Therefore the ξa(x) disappear from the Lagrangian and reappear as the longitu-
dinal component of the massive gauge bosons. Since Qφ = 0, the ground state
is still symmetric under U(1)Q and the photon will remain massless.
1.1.3.1 Massive Gauge Bosons
The coupling of φ to the gauge bosons takes place through the covariant
derivative Dµ. By expanding around the ground state of φ, i.e. introducing the
ansatz (1.32) into the Lagrangian of the electroweak theory (the Higgs sector is
expressed in (1.24)), it is now straightforward to see how the Higgs mechanism
generates masses for W± and Z bosons. Evaluating the resulting kinetic term
(Dµφ)†(Dµφ) at the vacuum expectation value φ0:
\[
(D_\mu\phi_0)^\dagger(D^\mu\phi_0)
= \left| \left( \partial_\mu + ig W^a_\mu \frac{\tau_a}{2}
+ ig' \frac{1}{2} B_\mu Y \right) \phi_0 \right|^2 \tag{1.33}
\]
10It is instructive to count the degrees of freedom after the spontaneous symmetry breaking has occurred. The starting point is a Lagrangian with a complex scalar SU(2)L doublet φ and four massless vector bosons. Counting degrees of freedom gives four from the scalars and eight from the vector bosons, for a total of twelve. Through the Higgs mechanism the Lagrangian is transformed into one with a real scalar, three massive vector bosons and one massless vector boson. The massless vector boson is of course to be identified with the photon and the single remaining scalar with the Higgs boson. Counting degrees of freedom again gives one from the Higgs, two from the photon and nine from the massive vector bosons, again adding up to twelve.
The relevant terms are (it should be noted that φ has hypercharge Y = 1):
\[
\begin{aligned}
\Delta\mathcal{L}
&= \frac{1}{2}\,(0\;\; v)\left(\frac{1}{2}gW^j_\mu\tau_j + \frac{1}{2}g'B_\mu\right)
\left(\frac{1}{2}gW^{k\mu}\tau_k + \frac{1}{2}g'B^\mu\right)\begin{pmatrix}0\\ v\end{pmatrix} \\
&= \frac{1}{8}\,(0\;\; v)\begin{pmatrix}
gW^3_\mu + g'B_\mu & g(W^1_\mu - iW^2_\mu) \\
g(W^1_\mu + iW^2_\mu) & -gW^3_\mu + g'B_\mu
\end{pmatrix}^2 \begin{pmatrix}0\\ v\end{pmatrix} \\
&= \frac{1}{8}v^2\left[ g^2 (W^1_\mu - iW^2_\mu)(W^{1\mu} + iW^{2\mu})
+ (-gW^3_\mu + g'B_\mu)^2 \right] \\
&= \frac{1}{8}v^2 g^2\left[ (W^1_\mu)^2 + (W^2_\mu)^2 \right]
+ \frac{1}{8}v^2 \left(g'B_\mu - gW^3_\mu\right)\left(g'B^\mu - gW^{3\mu}\right) \\
&= \left(\frac{vg}{2}\right)^2 W^+_\mu W^{-\mu}
+ \frac{1}{8}v^2\,(W^3_\mu \;\; B_\mu)
\begin{pmatrix} g^2 & -gg' \\ -gg' & g'^2 \end{pmatrix}
\begin{pmatrix} W^{3\mu} \\ B^\mu \end{pmatrix}
\end{aligned} \tag{1.34}
\]
where the charged gauge bosons W± have already acquired mass and a mixing
between W³µ and Bµ is observed. The corresponding mass eigenstates for the
neutral gauge bosons are obtained by diagonalizing the mass matrix:
\[
W^\pm_\mu = \frac{1}{\sqrt{2}}\left(W^1_\mu \mp i W^2_\mu\right)
\quad \text{with} \quad m_W = \frac{1}{2}gv \tag{1.35}
\]
\[
Z_\mu = \frac{gW^3_\mu - g'B_\mu}{\sqrt{g^2 + g'^2}}
\quad \text{with} \quad m_Z = \frac{1}{2}v\sqrt{g^2 + g'^2} \tag{1.36}
\]
\[
A_\mu = \frac{g'W^3_\mu + gB_\mu}{\sqrt{g^2 + g'^2}}
\quad \text{with} \quad m_A = 0 \tag{1.37}
\]
The spontaneous symmetry breaking rotates the four SU(2)L × U(1)Y gauge
bosons to their mass eigenstates by means of the gauge interaction term of the
Higgs fields: {W¹µ, W²µ} → {W⁺µ, W⁻µ} and {W³µ, Bµ} → {Aµ, Zµ}.
W± is associated with the charged current processes, A with the electromagnetic
currents and Z with the neutral currents. Once Aµ is recognized as the photon,
the three couplings previously described are related to each other by
\[
e = \frac{gg'}{\sqrt{g^2 + g'^2}} \tag{1.38}
\]
Usually the ratio between g and g′ is expressed through the weak mixing angle
θW, also known as the Weinberg angle:
\[
\tan\theta_W = \frac{g'}{g} \tag{1.39}
\]
It is important to notice that θW is a free parameter of the model since it is the
ratio of two coupling constants related to independent symmetry groups. Given
θW, with sin θW = g′/√(g² + g′²), all gauge couplings are determined by the electric
charge:
e = g sin θW = g′ cos θW (1.40)
and thus the electroweak unification is achieved. In terms of θW, the photon
and Z boson fields are
\[
\begin{pmatrix} Z_\mu \\ A_\mu \end{pmatrix}
= \begin{pmatrix} \cos\theta_W & -\sin\theta_W \\ \sin\theta_W & \cos\theta_W \end{pmatrix}
\begin{pmatrix} W^3_\mu \\ B_\mu \end{pmatrix} \tag{1.41}
\]
and the relation
\[
\frac{m_W}{m_Z} = \cos\theta_W \tag{1.42}
\]
for the masses of the gauge bosons is predicted.
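Relation (1.42) can be checked numerically against the measured boson masses; the mass values below are illustrative PDG-style inputs, not taken from the text.

```python
# Tree-level check of Eq. (1.42): m_W / m_Z = cos(theta_W).
# The measured masses are illustrative inputs (assumptions, not from the text).
m_W = 80.38   # GeV
m_Z = 91.19   # GeV

cos_theta_W = m_W / m_Z
sin2_theta_W = 1.0 - cos_theta_W ** 2   # the corresponding weak mixing angle

print(f"cos(theta_W)   = {cos_theta_W:.4f}")
print(f"sin^2(theta_W) = {sin2_theta_W:.4f}")
```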
By inserting (1.32) into the expression (1.25) for the Higgs potential V (φ), the
mass term −µ2H2 for the Higgs field H appears, implying the existence of a
new physical particle, the Higgs boson as already said, with mass
\[
m_H = \sqrt{-2\mu^2} = \sqrt{2\lambda}\, v \tag{1.43}
\]
It can be noticed that this model contains essentially two coupling constants, g and
g′, related to the symmetry SU(2)L × U(1)Y, and two parameters of the Higgs
potential, µ and λ. Usually they are traded for the observables α, the
fine structure constant, GF, the Fermi constant, mZ, the Z boson mass, and
mH, the mass of the Higgs boson, for which the relations are summarized here:
\[
\alpha = \frac{g^2 g'^2}{4\pi (g^2 + g'^2)} = \frac{g^2 \sin^2\theta_W}{4\pi} \tag{1.44}
\]
\[
m_Z = \frac{1}{2} v \sqrt{g^2 + g'^2} \tag{1.45}
\]
\[
G_F = \frac{1}{\sqrt{2}\, v^2} \tag{1.46}
\]
\[
m_H = \sqrt{-2\mu^2} = \sqrt{2\lambda}\, v \tag{1.47}
\]
GF is the strength of the weak interaction in the effective and point-like de-
scription of weak interactions formulated by Fermi. The parameter v can be
determined from the measurement of the muon lifetime in the weak charged
current decay µ → e ν̄e νµ. The interaction strength for muon decay is measured
very precisely to be GF = 1.16637(1) × 10⁻⁵ GeV⁻² [1], giving the value of the
vacuum expectation value
\[
v = \left(\sqrt{2}\, G_F\right)^{-1/2} \approx 246\ \text{GeV} \tag{1.48}
\]
This value sets the scale of the electroweak symmetry breaking11, but it is not
predicted by the SM. The relation between GF and the vev v comes from
\[
\frac{G_F}{\sqrt{2}} = \frac{g^2}{8 m_W^2} = \frac{1}{2 v^2} \tag{1.50}
\]
which is a comparison between the Fermi theory and the charged current in the
limit of highly massive gauge bosons. In fact the muon decay at the leading
order can be described by the propagation of a W boson but Fermi showed
that this can be simplified to one vertex with the coupling constant GF. Once
the values of α, GF and mZ are known, the mass of the W boson can be predicted
from (1.35) and (1.36): at lowest order mW = mZ cos θW ≈ 80 GeV, which has
been confirmed experimentally by the discovery of the W and Z at the SppS
(Super Proton-Antiproton Synchrotron) and by precise measurements
of mW and mZ at LEP (Large Electron Positron Collider). The v parameter
can be experimentally determined but there is no way to measure the value of
λ before a discovery of the Higgs.
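The two numbers quoted above can be reproduced directly; a minimal sketch, with GF from the text and illustrative values for mZ and sin²θW (the latter two are assumptions, not from the text):

```python
import math

G_F = 1.16637e-5       # Fermi constant, GeV^-2 (from the muon lifetime)
m_Z = 91.19            # GeV, illustrative input
sin2_theta_W = 0.223   # illustrative on-shell value, an assumption

# Eq. (1.48): vacuum expectation value of the Higgs field
v = (math.sqrt(2.0) * G_F) ** -0.5
print(f"v   = {v:.1f} GeV")    # ~246 GeV

# Lowest-order prediction m_W = m_Z cos(theta_W) from (1.35)-(1.36)
m_W = m_Z * math.sqrt(1.0 - sin2_theta_W)
print(f"m_W = {m_W:.1f} GeV")  # ~80 GeV
```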
1.2 The SM Higgs Boson
1.2.1 Theoretical Constraints on the Higgs Mass
Despite the prediction of a Higgs boson, the Standard Model does not provide
a value for its mass. Although the Higgs boson mass is a free parameter of the
theory, constraints on the possible mass values can be derived using theoretical
arguments regarding the energy regime in which the perturbative expansion of
the Standard Model is valid. These arguments rest on very
reasonable considerations, but they cannot provide stringent limits since they
depend on the absence of new physics up to a cut-off energy scale. As will be seen,
this means that a range of masses can be set that is valid as long as no virtual
effects of new physics enter in the calculation of the Higgs boson mass. These
arguments are: the unitarity in longitudinal scattering amplitudes; the pertur-
bativity of the Higgs self-coupling and the stability of the electroweak vacuum
[5, 6, 7, 8, 9].
1.2.1.1 Unitarity
A major deficit of Fermi’s theory of weak interactions was the violation of
unitarity at the electroweak scale √s ∼ GF^(−1/2), due to the assumption of
point interactions.
11By the relation
\[
m_W = \frac{gv}{2} = \frac{g}{2\sqrt{2\lambda}}\, m_H \tag{1.49}
\]
the Higgs mass is seen to set the electroweak scale.
The introduction of the massive intermediate vector bosons pushed
this problem to higher energies. However, owing to the high energy behaviour
of their longitudinal polarisation:
\[
\varepsilon^\mu_L = \left( \frac{|\vec{p}\,|}{m_V},\, 0,\, 0,\, \frac{E}{m_V} \right)
\xrightarrow{\;E \gg m_V\;} \frac{p^\mu}{m_V} \tag{1.51}
\]
unitarity violation is still expected in processes involving the longitudinal com-
ponents of the vector bosons. Indeed, in the Standard Model vector bosons
are predicted to have self-interactions; the SM calculation of the scattering
amplitude of longitudinal gauge bosons VLVL → VLVL, where V = W±, Z,
leads to the conclusion that, if virtual effects of the Higgs boson or of new physics
are not included, this amplitude grows proportionally to the center of mass
energy of the scattering. This behaviour violates unitarity, which means
that at some energy this process would have a probability of occurring greater
than one, which is unphysical.
Figure 1.2: Divergent WW cross section graphs and their cancellation. The upper three diagrams violate unitarity starting from √s ≈ 1.2 TeV. These unitarity violations are cancelled by the lower two diagrams involving Higgs boson exchange.
A characteristic example of such a process is W⁺L W⁻L scattering at high
energies. In the following it will be shown that, without an additional interaction,
the cross section of longitudinal WW scattering, given by the first three graphs in
Figure 1.2, would diverge and violate unitarity bounds from √s = 1.2 TeV
onwards. The Higgs mechanism cancels the divergences of the cross section of
the longitudinal degrees of freedom of the W bosons by destructive interference
of the last two graphs of Figure 1.2 with the first three. Thus the cross
section does not diverge and no violation of unitarity occurs. This mechanism
only works if the Higgs boson is not too heavy, otherwise it would not contribute
enough to the scattering amplitudes before unitarity is violated.
Considering the W⁺L W⁻L → W⁺L W⁻L process at tree level, and neglecting the
diagrams with Higgs boson exchange in the s and t channels, the remaining
processes are:
1. The exchange in s and t channels of a Z boson.
2. The exchange in s and t channels of a photon.
3. The direct coupling in one vertex of the four W bosons.
The amplitudes of these processes can be computed; each of them can be
parametrized in the following way:
\[
A = \alpha \left( \frac{p}{m_W} \right)^4 + \beta \left( \frac{p}{m_W} \right)^2 + \gamma \tag{1.52}
\]
Adding up all amplitudes, it results that the α term of process 3 cancels all
the others, whereas the β term is cancelled only when the Higgs-exchange
processes in the s and t channels are included. Expanding the amplitude in
partial waves, one obtains
\[
A = 16\pi \sum_L (2L+1)\, a_L(s)\, P_L(\cos\theta) \tag{1.53}
\]
The unitarity condition of the scattering matrix is expressed by
\[
|a_L(s)| < 1 \;\Rightarrow\; |\mathrm{Re}(a_L(s))| \leq \frac{1}{2} \tag{1.54}
\]
Then, ignoring the contributions of the Higgs diagrams (or equivalently if
mH² ≫ s), and considering only the s wave, the amplitude turns out to be:
\[
a_0(s) = -\frac{s}{32\pi v^2} \tag{1.55}
\]
leading to the bound √s ≲ 1.7 TeV. Considering more constraining channels:
\[
a_0(s) = -\frac{g^2 s}{64\pi m_W^2} = -\frac{G_F s}{8\pi\sqrt{2}} \tag{1.56}
\]
and the following bound
\[
s \leq \frac{4\pi\sqrt{2}}{G_F} \approx (1.2\ \text{TeV})^2 \tag{1.57}
\]
on the center of mass energy √s can be derived for the validity of a theory of
weakly coupled massive gauge bosons. The increase of the scattering amplitude
can be damped by Higgs boson exchange: including these diagrams, the β term
is cancelled and the s wave amplitude in the high center of mass energy limit,
s ≫ mH², is:
\[
a_0(s) = -\frac{m_H^2 G_F}{4\pi\sqrt{2}} \tag{1.58}
\]
This puts an upper limit on the Higgs boson mass as follows:
\[
m_H \leq \sqrt{\frac{2\pi\sqrt{2}}{G_F}} \approx 870\ \text{GeV} \tag{1.59}
\]
This bound can be improved by requiring unitarity in all the relevant scattering
amplitudes, like W⁺L W⁻L → ZL ZL. In this case the obtained upper bound
is mH < 710 GeV [8]. Within the canonical formulation of the SM, internal
consistency therefore requires mH < 1 TeV. If there is no Higgs boson or its
mass is greater than 710 GeV, an additional strong force acting on the W
bosons is needed to cancel the divergences. It means that the couplings in the
W and Z boson sector become so large that the whole concept of the Higgs
mechanism as a perturbative expansion around the vacuum expectation value
breaks down. Then there must be a critical energy scale (Λ) above which new
physics appears to ensure unitarity and this scale should be around 1 or 2 TeV,
as extracted in (1.57). In other words, if energies up to 2 TeV can be explored,
it will be possible either to discover the Higgs boson or to exclude it, reaching
the limit where the SM fails. Luckily, this is now possible thanks to
the collisions produced at the Large Hadron Collider that will be described in
the next chapter.
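The unitarity bounds (1.57) and (1.59) follow from simple arithmetic with GF; a minimal sketch:

```python
import math

G_F = 1.16637e-5  # Fermi constant, GeV^-2

# Eq. (1.57): energy where WW scattering without a Higgs saturates unitarity
sqrt_s_max = math.sqrt(4.0 * math.pi * math.sqrt(2.0) / G_F)
print(f"sqrt(s) bound: {sqrt_s_max / 1000.0:.2f} TeV")  # ~1.2 TeV

# Eq. (1.59): corresponding upper limit on the Higgs boson mass
m_H_max = math.sqrt(2.0 * math.pi * math.sqrt(2.0) / G_F)
print(f"m_H bound: {m_H_max:.0f} GeV")                  # ~870 GeV
```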
1.2.1.2 Triviality
Figure 1.3: Diagrams generating the evolution of the Higgs self-interaction λ [9].
Quite restrictive bounds on the Higgs mass depend on the energy scale Λ
up to which the Standard Model is valid, i.e. the scale up to which no new
interactions and no new particles should appear. These bounds are derived from
the evolution of the quartic Higgs self-coupling λ with the energy E determined
by the quantum fluctuations,
\[
\frac{d\lambda}{dt} = \frac{1}{16\pi^2}\left[ 12\lambda^2 + 12\lambda g_t^2 - 12 g_t^4
- \frac{3}{2}\lambda\left(3g^2 + g'^2\right)
+ \frac{3}{16}\left(2g^4 + (g^2 + g'^2)^2\right) \right] \tag{1.60}
\]
with t = log(E²/v²), where the electroweak symmetry breaking scale v has
been used as the reference point, gt is the energy-dependent top-Higgs Yukawa
coupling12, and g and g′ are the electroweak gauge couplings. The three main con-
tributions to the evolution of λ are depicted in Figure 1.3. Due to their different
spin statistics, the Higgs loop gives rise to an indefinite increase of the quartic
coupling, while the top quark loop drives the coupling to smaller values,
eventually even below zero.
If λ is large, for moderate top masses the contribution from Higgs loops
dominates over the top loops. Then the first term in (1.60),
\[
\frac{d\lambda}{dt} \approx \frac{3}{4\pi^2}\lambda^2 \tag{1.61}
\]
dominates, leading to
\[
\lambda(E^2) = \frac{\lambda(v^2)}{1 - \dfrac{3\lambda(v^2)}{4\pi^2} \log\dfrac{E^2}{v^2}} \tag{1.62}
\]
Hence, λ(E²) grows to infinity as the energy E increases and tends to zero as the
energy decreases. Without the λ(φ†φ)² interaction, however, the theory would become
a non-interacting theory at low energy, termed a trivial theory. Were λ instead to
grow to infinity, no well-defined theory would exist, since the Higgs potential in
Figure 1.1 would be reduced to an infinitesimally thin band with a vev equal to
0 and infinitely strong interactions. In particular, it follows that λ(E²) becomes
infinite at the Landau pole, corresponding to the energy
\[
\Lambda^2 = v^2 \exp\left( \frac{4\pi^2}{3\lambda} \right)
= v^2 \exp\left( \frac{8\pi^2 v^2}{3 m_H^2} \right) \tag{1.63}
\]
It is therefore required that the quartic coupling λ is finite up to a large scale
Λ where no new physics appears,
\[
\frac{1}{\lambda(\Lambda)} > 0 \quad \text{i.e.} \quad 0 < \lambda(\Lambda) < \infty \tag{1.64}
\]
which, with mH = √(2λ) v, leads to the upper bound on the Higgs mass:
\[
m_H^2 < \frac{8\pi^2 v^2}{3 \log(\Lambda^2/v^2)} \tag{1.65}
\]
This mass bound is related logarithmically to the energy Λ up to which the
Standard Model is assumed to be valid. The Renormalization Group equation
(1.60) can be used to establish the energy domain in which the SM is valid as a
12gt = −mt/v.
function of the Higgs mass. From Equation (1.63) it can be seen that for large
cut-off scales, the Higgs mass should be small, while for small cut-off scales the
Higgs boson mass can be rather heavy.
The maximal value of mH for the minimal cut-off Λ ∼ 1 TeV is ∼750 GeV. The
caveat in this estimate is that for large λ perturbation theory
breaks down and the constraint is not valid. However, lattice simulations
of gauge theories, properly including non-perturbative effects, provide an upper
bound of the Higgs mass of mH < 700 GeV, in agreement with the above
calculation [7].
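The triviality bound (1.65) can be evaluated numerically; with v = 246 GeV it reproduces the ∼750 GeV figure for Λ ∼ 1 TeV quoted above (the Planck-scale value comes out near 145 GeV here because this simple formula omits the top and gauge loop corrections discussed below):

```python
import math

v = 246.0  # GeV, electroweak vacuum expectation value

def m_H_triviality_bound(cutoff):
    """Upper bound of Eq. (1.65): m_H^2 < 8 pi^2 v^2 / (3 log(Lambda^2/v^2))."""
    log_term = math.log(cutoff ** 2 / v ** 2)
    return math.sqrt(8.0 * math.pi ** 2 * v ** 2 / (3.0 * log_term))

print(f"Lambda = 1 TeV     -> m_H < {m_H_triviality_bound(1e3):.0f} GeV")
print(f"Lambda = 10^19 GeV -> m_H < {m_H_triviality_bound(1e19):.0f} GeV")
```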
If the cut-off energy is at the Planck scale, around 10^19 GeV, then the Higgs
boson mass should be small, mH < 190 GeV. The lower this energy is set, the
looser the upper constraint on the Higgs boson mass. However, in the Higgs
boson mass calculation the contribution from top and gauge boson loops cannot
be neglected. If these corrections are included and it is required that the theory
is perturbative (i.e. λ is finite) below a given energy, then an upper limit on
mH as a function of the top quark mass can be set. For mt = 175 ± 6 GeV,
mH < 180 ± 4 ± 5 GeV if Λ = 10^19 GeV [6]. Figure 1.4 shows the upper limits
that prevent the self-interaction to become infinite. In the picture there are also
lower limits that will be explained in the following subsection.
1.2.1.3 Vacuum Stability
In the previous subsection only the Higgs boson self-interaction has been
included in the running of the quartic coupling, which is a good approximation
in the region where the coupling is large. For completeness the contributions
from gauge bosons and fermions should be also included. Nevertheless, out of
all the fermions only the top quark contributes significantly, as can be seen
from (1.60). For small values of the coupling λ the top quark contribution can
dominate:
\[
\frac{d\lambda}{dt} \approx \frac{1}{16\pi^2}\left[ -12 g_t^4
+ \frac{3}{16}\left(2g^4 + (g^2+g'^2)^2\right) \right] \tag{1.66}
\]
driving λ(E²) to a negative value for large gt. In this case the scalar potential
has no minimum, there is no ground state and the vacuum is not stable; no
stable spontaneous symmetry breaking occurs. Equation (1.66) is easily
solved to find
\[
\lambda(\Lambda) = \lambda(v) + \frac{1}{16\pi^2}\left[ -12 g_t^4
+ \frac{3}{16}\left(2g^4 + (g^2+g'^2)^2\right) \right]
\log\left(\frac{\Lambda^2}{v^2}\right) \tag{1.67}
\]
The requirement to have a scalar potential which is bounded from below at all
scales Λ, and therefore keeping λ(Λ) > 0, results in a lower limit for the Higgs
boson mass which depends on the cut-off scale:
\[
m_H^2 > \frac{v^2}{8\pi^2}\left[ 12 g_t^4
- \frac{3}{16}\left(2g^4 + (g^2+g'^2)^2\right) \right]
\log\left(\frac{\Lambda^2}{v^2}\right) \tag{1.68}
\]
At one-loop order this lower limit is:
\[
\Lambda = 1\ \text{TeV} \;\Rightarrow\; m_H > 70\ \text{GeV} \tag{1.69}
\]
\[
\Lambda = 10^{19}\ \text{GeV} \;\Rightarrow\; m_H > 130\ \text{GeV} \tag{1.70}
\]
Figure 1.4: Theoretical bounds on the SM Higgs mass as a function of the cut-off scale. It is assumed that the SM is a valid theory up to the scale Λ. The upper solid area indicates the sum of theoretical uncertainties in the mH upper bound for mt = 175 GeV. The upper edge corresponds to Higgs masses for which the SM Higgs sector ceases to be meaningful at scale Λ, and the lower edge indicates a value of mH for which perturbation theory is certainly expected to be reliable at scale Λ. The lower solid area represents the theoretical uncertainties in the mH lower bounds derived from stability requirements using mt = 175 GeV and αs(mZ) = 0.118, where αs is the strong coupling constant [6].
1.2.1.4 Combining the Theoretical Arguments
Therefore it is seen that when λ is small (a light Higgs boson) radiative
corrections from the top quark and gauge couplings become important and lead
to a lower limit on the Higgs boson mass from the requirement of vacuum sta-
bility, λ(Λ) > 0. If λ is large (a heavy Higgs boson) then triviality arguments,
(1/λ(Λ) > 0), lead to an upper bound on the Higgs mass. The allowed region
for the Higgs mass from these considerations is shown in Figure 1.4 as a function
of the scale up to which the Standard Model is expected to be valid and no new
physics is required, Λ. The width of the bands corresponds to the theoretical
uncertainties due to the truncation of the perturbative expansion and to the
experimental uncertainties of the input parameters. It is known that Λ ≤ MP
because at energies above the Planck scale MP ≈ 10^19 GeV quantum gravitational
effects become significant and the Standard Model must be replaced by a
more fundamental theory which incorporates gravity. According to Figure 1.4,
i.e. considering mt = 175 GeV, if the Standard Model is valid up to the Planck
scale, the Higgs mass is bounded to the mass range [9]:
130 GeV . mH . 190 GeV (1.72)
On the other hand, if a Higgs boson with a higher mass is found, it will
imply a breakdown of the Standard Model at a lower energy scale. At the scale
Λ = 1 TeV, the Higgs boson can lie in a broader mass region:
55 GeV . mH . 700 GeV (1.73)
1.2.2 Experimental Limits on the Higgs Mass
Experimental limits for the Higgs mass are deduced both from direct [1, 10]
and indirect searches [5, 7, 8, 11, 12].
1.2.2.1 Direct Searches
The first direct search was carried out at LEP. The LEP collider operated
in two phases: in the first (LEP I) the center of mass energy was close to
mZ , while in the second (LEP II) the energy was gradually increased from
189 GeV to 209 GeV. The principal mechanism for producing the SM Higgs
boson in e+e− collisions at LEP energies was Higgs-strahlung in the s channel,
e+e− → Z(∗) → HZ, where the electron and the positron annihilate producing
a virtual vector boson which becomes real emitting the Higgs boson. The Z
boson in the final state is either virtual (LEP I), or on mass shell (LEP II).
The Standard Model Higgs searches at LEP13 were concentrated in four final
state topologies:
• four-jet topology: H → bb and Z → qq;
• τ -lepton production: H → τ+τ− and Z → qq or H → bb and Z → τ+τ−;
13At LEP I, only the modes Z → l+l− and Z → νν were used because the backgrounds in the other channels were prohibitive. For the data collected at LEP II, all decay modes were used.
• missing energy14, mainly in the process H → bb and Z → νν;
• leptonic final states: H → bb and Z → e+e−, µ+µ−.
The combination of the data of the four LEP experiments (ALEPH, DELPHI, L3
and OPAL) set a lower bound of mH > 114.4 GeV at 95% confidence level
(C.L.) [1].
Another accelerator where direct searches have been carried out was the Tevatron
at Fermi National Accelerator Laboratory, a proton-antiproton collider running
at √s = 1.96 TeV until it shut down on September 30, 2011.
Figure 1.5: Higgs boson production cross sections (fb) at the Tevatron (√s = 1.96 TeV) for the most relevant production mechanisms as a function of the Higgs boson mass [13].
The decays mode of the Higgs boson and its production in hadron colliders will
be discussed in Section 1.3. However, the most important Higgs production
mechanisms at Tevatron, whose cross sections are visible in Figure 1.5, were:
• gluon fusion: gg → H;
• associated production with a vector boson (Higgs-strahlung): qq →W±H
or ZH;
• vector boson fusion: qq → qqH, where the quarks radiate weak gauge
bosons that fuse to form the H.
14If a particle traverses the detector without leaving a trace, the total sum of momenta is found to be non-zero. The energy corresponding to the particle is said to be “missing”.
For masses less than about 135 GeV, the most promising discovery channel
was the associated production with H → bb. The final states analyzed for
the vector bosons were: W± → l±ν (l = e, µ, τ), Z → νν and Z → l+l−
(l = e, µ). In addition, both Tevatron collaborations (CDF and DØ) searched
for the SM H → τ⁺τ⁻ decay, where the gg → H, WH, ZH and vector boson fusion
production processes were considered. Another search channel was the SM
process H → γγ, making use of all production modes. The CDF Collaboration
searched also for WH + ZH → jjbb processes, making use of signal events in
which the vector boson decays to jets.
In the high mass region, above 135 GeV, studies were focused on the
gg → H → WW(*) → l⁺νl⁻ν channel15 (l = e, µ, τ). Nevertheless, WH production,
ZH production and vector boson fusion qqH production contribute additional
signal in this channel, which was used in the searches.
The sensitivity reached by combining the results from all the analyses carried out
by the two experiments at the Tevatron, CDF and DØ, enabled new limits to be
set on the SM Higgs boson mass in 2011. Analyzing an integrated luminosity from
4.3 fb⁻¹ up to 10.0 fb⁻¹ (CDF) and up to 9.7 fb⁻¹ (DØ) at √s = 1.96 TeV, the ratios
of the 95% C.L. expected and observed limit to the SM cross section are shown
in Figure 1.6 for the combined CDF and DØ analyses. The regions of Higgs
boson masses excluded at the 95% C.L. are 147 GeV < mH < 179 GeV and
100 GeV < mH < 106 GeV. The expected exclusion region, given the current
sensitivity, is 141 GeV < mH < 184 GeV and 100 GeV < mH < 119 GeV
(masses below mH < 100 GeV were not studied). There is an excess of data
events with respect to the background estimation in the mass range 115 GeV
< mH < 135 GeV. At mH = 120 GeV, the p-value for a background fluctuation
to produce this excess is ∼ 3.5 × 10−3, corresponding to a local significance of
2.7σ. The global significance for such an excess anywhere in the full mass range
is approximately 2.2σ [10].
1.2.2.2 Indirect Searches
Indirect evidence for a light Higgs boson can be derived from the high-
precision measurements of electroweak observables at LEP and elsewhere. In-
deed, the fact that the Standard Model is renormalizable only after including the
top and Higgs particles in the loop corrections indicates that the electroweak
observables are sensitive to the masses of these particles. The Higgs boson en-
ters into one-loop radiative corrections in the Standard Model and so precision
electroweak measurements can bound the Higgs boson mass. For example, the
Fermi coupling can be rewritten in terms of the weak coupling and the W mass;
15The star indicates that below the H → W⁺W⁻ threshold, one of the W± bosons is virtual.
[Figure 1.6: Tevatron Run II preliminary combination, L ≤ 10 fb⁻¹ (February 27, 2012); 95% C.L. limit/SM versus mH (GeV/c²), with the LEP, Tevatron, ATLAS and CMS exclusion regions indicated.]
Figure 1.6: Observed and expected (median, for the background-only hy-pothesis) 95% C.L. upper limits on the ratios to the SM cross section, asfunctions of the Higgs boson mass for the combined CDF and DØ analyses.The limits are expressed as a multiple of the SM prediction for test masses(every 5 GeV/c2) for which both experiments have performed dedicatedsearches in different channels. The points are joined by straight lines forbetter readability. The bands indicate the 68% and 95% probability regionswhere the limits can fluctuate, in the absence of signal [10].
at lowest order, GF/√2 = g²/(8mW²). After substituting the electromagnetic
coupling α, the electroweak mixing angle and the Z mass for the weak coupling
and the W mass, this relation can be rewritten as
\[
\frac{G_F}{\sqrt{2}} = \frac{2\pi\alpha}{\sin^2 2\theta_W\, m_Z^2}
\left[ 1 + \Delta r_\alpha + \Delta r_t + \Delta r_H \right] \tag{1.74}
\]
The ∆ terms take account of the radiative corrections: ∆rα describes the shift
Figure 1.7: One-loop diagrams for the contributions to the W mass involving the Higgs boson (a) and the top quark (b).
in the electromagnetic coupling α if evaluated at the scale mZ instead of zero-
momentum; ∆rt denotes the top (and bottom) quark contributions to the W
and Z masses, which are quadratic in the top mass. Finally, ∆rH accounts for
the virtual Higgs contributions to the masses; the Higgs boson, for instance,
enters in the one-loop corrections to the W mass, illustrated in Figure 1.7. This
latter term depends only logarithmically on the Higgs mass at leading order [7]:
\[
\Delta r_H = \frac{G_F m_W^2}{8\sqrt{2}\,\pi^2} \cdot \frac{11}{3}
\left[ \log\frac{m_H^2}{m_W^2} - \frac{5}{6} \right]
\qquad (m_H^2 \gg m_W^2) \tag{1.75}
\]
Since the dependence on the Higgs boson mass is only logarithmic, the limits
derived on the Higgs boson from this method are relatively weak. In contrast,
the top quark contributes quadratically to many electroweak observables. It is
proved that at one-loop all electroweak parameters have at most a logarithmic
dependence on mH . This fact has been glorified by the name of the “screening
theorem”. Since in general electroweak radiative corrections involving the Higgs
boson take the form [5],
\[
g^2\left( \log\frac{m_H}{m_W} + g^2 \frac{m_H^2}{m_W^2} + \dots \right) \tag{1.76}
\]
the quadratic effects in the Higgs mass are always screened by two additional
powers of g relative to the lower order logarithmic effects and so radiative cor-
rections involving the Higgs boson can never be large, being dominated by the
logarithmic term. Although the sensitivity on the Higgs mass is only logarith-
mic, the increasing precision in the measurement of the electroweak observables
allows one to derive interesting estimates and constraints on the Higgs mass. The
Higgs boson mass can be extracted indirectly from precision fits of all the mea-
sured electroweak observables, within the fit uncertainty. This is actually one
of the most important results that can be obtained from precision tests of the
Standard Model and greatly illustrates the predictivity of the Standard Model
itself. The two observables that are most sensitive to the Higgs boson mass
are the W boson mass and the effective leptonic weak mixing angle sin2 θleff16.
Even if the precision of mW is better compared to sin2 θleff , the latter has a
more pronounced dependence on mH .
Stringent limits on the Higgs boson mass are set by a global fit, performed historically by the LEP Electroweak Working Group and more recently with the Gfitter toolkit. The SM predictions for the electroweak precision observables measured by the LEP, SLC (Stanford Linear Collider), and Tevatron experiments are fully implemented in the Gfitter software [11, 12]. The list of floating fit SM parameters is17: mZ, mH, mc, mb, mt, ∆α(5)had(m²Z) and αs(m²Z), where only the last parameter is kept fully unconstrained, allowing an independent measurement. In particular, for the running quark masses mc and mb, which are the MS renormalized masses of the c and b quarks, the world average values are used as input data. The combined top-quark mass is taken from the Tevatron Electroweak Working Group.
The fit is performed by minimizing the test statistic χ², which quantifies the difference between the measurements and the SM predictions. The results are produced either ignoring (“standard fit” or “blue-band”) or including (“complete fit”) the direct searches at LEP, at the Tevatron and at the LHC. The Gfitter values, using the Tevatron results from July/August 2011, are the following:
Electroweak precision data only:  mH = 95 +30/−24 [+74/−43] GeV,   (1.78)
Including direct searches:  mH = 125 +8/−10 [+21/−11] GeV,   (1.79)
where the errors are quoted at the 1σ [2σ] level.

16 sin²θleff is a particular renormalization prescription of

sin²θW = 1 − m²W/m²Z   (1.77)

which can be taken as the tree-level definition of the Weinberg angle.

17 The masses of leptons and light quarks are fixed to their world average values; GF has been determined through the measurement of the µ lifetime, giving GF = 1.16637(1)·10⁻⁵ GeV⁻². The leptonic and top-quark vacuum polarisation contributions to the running of the electromagnetic coupling are precisely known or small. Only the hadronic contribution of the five lighter quarks, ∆α(5)had(m²Z), adds significant uncertainties and replaces the electromagnetic coupling at the Z peak, α(m²Z), as a floating parameter in the fit.

The results, very similar to those obtained by the LEP group, are illustrated in Figure 1.8, which shows ∆χ² ≡ χ² − χ²min of the global least-squares fit of the Standard Model predictions to
the electroweak data as a function of the Higgs boson mass. The preferred
value for its mass corresponds to the minimum of the curve. The data clearly
favours a low mass Higgs boson. It may be concluded from these numbers that
Figure 1.8: ∆χ² as a function of mH for the “standard” fit. The solid (dashed) line gives the results when including (ignoring) theoretical errors [14].
the canonical formulation of the Standard Model including the existence of a
light Higgs boson is compatible with the electroweak data. However, alternative
mechanisms cannot be ruled out if the system is opened up to contributions from
physics beyond the Standard Model.
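The logic of the ∆χ² scan above can be sketched with a deliberately simplified toy model: a single pseudo-observable with an invented logarithmic dependence on mH, scanned to extract the preferred mass and the 1σ band. This is only a cartoon of what the Gfitter fit does, with made-up numbers:

```python
import math

# Toy pseudo-observable O(m_H) = a + b*log(m_H/100 GeV); all numbers invented.
a, b = 80.385, -0.060          # toy prediction parameters (GeV)
o_meas, sigma = 80.376, 0.015  # toy "measurement" and its uncertainty (GeV)

def chi2(m_h):
    prediction = a + b * math.log(m_h / 100.0)
    return ((o_meas - prediction) / sigma) ** 2

masses = list(range(50, 1001))                 # scan 50..1000 GeV
curve = [(m, chi2(float(m))) for m in masses]
chi2_min = min(c for _, c in curve)
best_m = min(curve, key=lambda mc: mc[1])[0]
one_sigma = [m for m, c in curve if c - chi2_min <= 1.0]  # Delta chi^2 <= 1 band
print(best_m, one_sigma[0], one_sigma[-1])
```

The asymmetric band reflects the logarithmic dependence on mH, the same feature that makes the real curve in Figure 1.8 steep at low mass and shallow at high mass.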
1.3 Higgs at the LHC
In the Standard Model once the Higgs boson mass is fixed, all the properties
of the particle are uniquely determined. The SM Higgs couplings to fundamental
fermions are proportional to the fermion masses, and the couplings to bosons
are proportional to the squares of the boson masses. In particular, its couplings
to gauge bosons, Higgs bosons and fermions are given by [1]:
gHff = mf/v ,   gHVV = 2m²V/v ,   gHHVV = 2m²V/v² ,
gHHH = 3m²H/v   and   gHHHH = 3m²H/v²   (1.80)
where V = W± or Z. These are s-wave couplings, which are even under parity and charge conjugation, corresponding to the JPC = 0++ assignment of the Higgs spin and parity quantum numbers.
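To make the proportionalities in (1.80) concrete, the following sketch (my own illustration; v = 246 GeV and approximate particle masses are assumed, with an illustrative mH = 125 GeV) evaluates the couplings numerically:

```python
# Numerical values of the couplings in Eq. (1.80), with v = 246 GeV.
# Particle masses are approximate and used for illustration only.
v = 246.0                        # vacuum expectation value, GeV
m_w, m_z = 80.4, 91.2            # gauge boson masses, GeV
m_top, m_b = 172.5, 4.18         # quark masses, GeV
m_h = 125.0                      # an illustrative Higgs mass, GeV

g_Hff  = {"top": m_top / v, "b": m_b / v}            # Yukawa couplings
g_HVV  = {"W": 2 * m_w**2 / v, "Z": 2 * m_z**2 / v}
g_HHVV = {"W": 2 * m_w**2 / v**2, "Z": 2 * m_z**2 / v**2}
g_HHH  = 3 * m_h**2 / v          # trilinear Higgs self-coupling
g_HHHH = 3 * m_h**2 / v**2       # quartic Higgs self-coupling

print(g_Hff, g_HVV, g_HHH)
```

The hierarchy g_Hff(top) ≫ g_Hff(b) is what makes the top loop dominate the Hgg and (together with the W) the Hγγ couplings discussed below.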
In the next sections it will be argued that in Higgs boson production and decay
processes, the dominant mechanisms involve the coupling of the H to the W±,
Z and/or the third generation quarks and leptons. The Higgs boson coupling
to gluons, Hgg, is induced at leading order by a one-loop graph in which the
H couples mostly to a virtual tt pair. Likewise, the Higgs boson coupling to
photons, Hγγ, is also generated via loops, although in this case the one-loop
graph with a virtual W+W− pair provides the dominant contribution.
1.3.1 Production Mechanisms
A Higgs particle can be created from the fusion of beam particle constituents,
from the fusion of heavy particles produced at the collision, or it can be radiated
off a massive virtual particle. In Figure 1.9 the first process (A) is referred to
as direct production, whereas the other processes (B-E) are referred to as as-
sociated production. The distinction is important: the final state of associated
production will also contain the signatures of the two quarks or the massive
vector boson that radiated the Higgs. Their experimental signatures can be
used as a label in the search for Higgs events.
As illustrated by the diagrams in Figure 1.9, the four main production mecha-
nisms for the Standard Model Higgs boson at the LHC are [7, 15, 13, 16]:
• gluon fusion: gg → H;
• vector boson fusion (VBF): qq → qq +H;
• associated production with W/Z (or Higgs-strahlung off W, Z): qq → W/Z + H;
• associated production with heavy quarks (or Higgs bremsstrahlung off
heavy quarks): gg, qq → QQ+H.
In addition to these main mechanisms, one could add bottom fusion bb→ H+X
and Higgs production in association with single top qb → qtH. In Figure 1.10
the fully inclusive Higgs boson production cross sections for √s = 7 TeV at
the LHC (for a description of the experimental apparatus see Chapter 2) are
summarized as a function of the Higgs mass.
For the 7 TeV proton-proton collider, the dominant process for Higgs production is gluon fusion. The Higgs boson does not couple directly to the gluons, as gluons are massless. Instead, the coupling is mediated by a quark loop, most often a top loop and, to a lesser extent, a bottom loop, as the Higgs-quark coupling is proportional to the quark mass.
The Higgs production cross section from weak vector boson fusion is typically
Figure 1.9: Feynman diagrams of the main Higgs production processes: gluon fusion (A), W+W− and ZZ fusion (B), W± and Z Higgs-strahlung (C), qq fusion (D), and Higgs bremsstrahlung off a top or bottom quark (E).
an order of magnitude lower than gluon fusion. At higher Higgs masses, how-
ever, it becomes important, because of the decrease in cross section for gluon
fusion. It will be competitive with the dominant gluon fusion mechanism for
large masses, mH ∼ 1 TeV. In this reaction two quarks radiate two vector bosons which annihilate, producing the Higgs boson. Moreover, the peculiar final state with two hard jets in the forward and backward regions of the detector (coming from the fragmentation of the two quarks after the vector boson radiation), with a large rapidity gap between them, provides a distinctive signature that can be efficiently used to disentangle the signal from the background18.
Note also that the main contribution to the cross section is due to the W+W−
fusion channel, σ(W+W− → H) ∼ 3σ(ZZ → H) at the LHC, a consequence of
the fact that the W boson couplings to fermions are larger than those of the Z
boson.
Besides the fact that the gluon fusion and the VBF are important for the high
cross sections, they are also complementary because the former is specified by
the Yukawa Higgs boson couplings to fermions, while the latter is fixed by the
gauge Higgs boson couplings to vector bosons. Since the Yukawa and the gauge
sectors are not really connected, it is important to explore both these production processes in order to understand the role of the Higgs boson in the SM Lagrangian.

18 Since it is a pure electroweak process, there are no color fields connecting the two quarks. The result is that gluons cannot be emitted in the central part of the detector but are mostly radiated collinearly with respect to the interacting quarks, while the Higgs boson decays in the central part of the detector. A Central Jet Veto is therefore an effective cut to discriminate between signal and backgrounds.
The Higgs boson production in association with a vector boson (W, Z) is called Higgs-strahlung. The Higgs-strahlung production processes have even lower cross sections. The production in association with a W± or a Z boson is important in the intermediate mass region (mH < 2mZ); its cross section is about one to two orders of magnitude smaller than the gluon fusion cross section for Higgs masses mH < 200 GeV. Its cross section falls rapidly with increasing mH, but the associated W± or Z can decay into leptons, which allow for effective tags.
The Higgs production in association with heavy quarks, which are mostly top quarks, is less important because its cross section is about five times smaller than that for W±H or ZH for mH < 200 GeV; for Higgs masses above 500 GeV it rises above the W±H/ZH cross sections, but it is still far below the gluon fusion cross section. However, for light Higgs bosons, ttH is expected to be the only channel where H → bb is observable. So although the largest cross section over the whole mass range is that of gg → H production, it is often convenient to consider the associated production channels.
Considering that the maximum Higgs boson production cross section for low
Higgs masses is σH ≈ 30 pb and the total pp cross section at the LHC is
σtot ≈ 110 mb, a major challenge of the LHC experiments becomes clear: com-
pared to other pp reactions, the Higgs boson signal is suppressed by ten orders
of magnitude. Extreme care has to be taken to understand and reject the back-
ground processes.
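The quoted suppression factor can be checked with a one-line computation (using the cross-section values given above):

```python
import math

# Cross sections quoted in the text: sigma_H ~ 30 pb, sigma_tot ~ 110 mb.
PB_PER_MB = 1e9                       # 1 mb = 10^9 pb
sigma_h   = 30.0                      # pb
sigma_tot = 110.0 * PB_PER_MB         # pb

ratio  = sigma_h / sigma_tot          # fraction of pp events producing a Higgs
orders = -math.log10(ratio)           # suppression in orders of magnitude
print(ratio, orders)
```

The ratio of a few times 10⁻¹⁰ is indeed a suppression of about ten orders of magnitude.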
[Figure: σ(pp → H+X) [pb] as a function of MH at √s = 7 TeV (LHC HIGGS XS WG 2010); curves: pp → H (NNLO+NNLL QCD + NLO EW), pp → qqH (NNLO QCD + NLO EW), pp → WH (NNLO QCD + NLO EW), pp → ZH (NNLO QCD + NLO EW), pp → ttH (NLO QCD).]
Figure 1.10: Higgs boson production cross sections (pb) at the LHC, √s = 7 TeV, for different production channels as a function of the Higgs mass [17].
1.3.2 Decay Modes
In the Standard Model the possible Higgs boson decay channels are to pairs of fermions or bosons; the most important decay modes are shown in Figure 1.11. Like the production, the decay of the Higgs depends on the mass of the boson, and the Branching Ratios (BRs19), which are the probabilities of decay into specific final states, are shown for the various Higgs decay channels in Figure 1.12 as a function of the Higgs mass. The BRs are known at next-to-leading order (NLO), including both QCD and electroweak corrections. Besides the production modes, the branching ratio is the second main feature that has to be considered in order to find the best channels in which the Higgs boson might be detected. However, one cannot simply choose the final states with the highest branching ratios, since it is necessary to consider whether such signal events can actually be recognized among the backgrounds. There are two kinds of background: non-signal events with the same experimental signature (“irreducible” background) and misidentifications by the detector (“reducible” background). The latter is set by the quality of the detector and is therefore controllable to a certain extent. The former may be reduced by applying constraints on the experimental signature in the final analysis. In Table 1.4 the main decay channels considered by ATLAS and their reducible and irreducible backgrounds are listed [15, 16, 18].
The Higgs boson coupling to a particle is proportional to the mass of that
Figure 1.11: Tree level (first diagrams from left) and loop processes (last two diagrams from left) for the Higgs boson decays.
particle, as seen in (1.80), thus its main decay channels are to the most massive
particles, insofar as the decay is kinematically allowed. So certain decay possi-
bilities open up only if mH is larger than the kinematical threshold as can be
seen in Figure 1.12. Note in particular the opening of the WW channel, which
causes a dip in the branching ratio of the ZZ(∗) channel and the subsequent
19 The Branching Ratio of an initial state |i〉 into a final state |f〉 is defined as the ratio between the partial decay width |i〉 → |f〉 and the total decay width of |i〉:

BR(|i〉 → |f〉) = Γ|i〉→|f〉 / Γtot   (1.81)
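The definition in (1.81) can be illustrated with a toy set of partial widths (the numbers below are invented purely to show the normalization):

```python
# BR(|i> -> |f>) = Gamma(i -> f) / Gamma_tot, as in Eq. (1.81).
# The partial widths below (in MeV) are invented toy numbers.
partial = {"bb": 2.3, "WW*": 0.9, "gg": 0.35, "tautau": 0.25,
           "cc": 0.12, "ZZ*": 0.10}
gamma_tot = sum(partial.values())      # total width = sum of partial widths

br = {mode: g / gamma_tot for mode, g in partial.items()}
print(gamma_tot, br["bb"])             # BRs sum to unity by construction
```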
Figure 1.12: The main branching ratios BR(H) of the Standard ModelHiggs decay channels [19].
Decay channel             | Irreducible background          | Reducible background
WH, H → bb                | WZ → lνbb, Wbb → lνbb           | tt → WWbb, Wj
ttH, H → bb               | ttZ, ttbb                       | ttjj, Wjjjj
(WH, ttH, ZH), H → γγ     | qq → γγ, gg → γγ, qg → qγ → γγ  | jj, γj, Z → ee
H → τ+τ−                  | Z → τ+τ− + jets                 | Z → ll + jets (l = e, µ), W → lν + jets, tt (+jets), WW/WZ/ZZ + jets, QCD di-jets
H → ZZ(∗) → 4l            | ZZ(∗), Zγ(∗)                    | tt → 4l + X, Zbb → 4l
H → WW(∗) → lνlν          | WW(∗), WZ/ZZ → lνlν + X         | tt → WWbb → lνlν + X, Wt → WWb → lνlν + X
H → ZZ → 4l               | Z(γ(∗))Z(γ(∗)) → 4l             |
H → ZZ → llνν             | ZZ                              | Zjj, WZ/ZZ, tt
H → WW → lνjj             | WW → lνjj                       | Wjj, tt → lνjjbb

Table 1.4: Higgs decay channels searched for by ATLAS and their irreducible and reducible backgrounds.
opening of the ZZ channel which re-establishes the decay into both WW and
ZZ.
Since the pole masses of the gauge bosons and fermions are known (the electron and light quark masses are too small to be relevant), all the partial widths for the Higgs decays into these particles can be predicted. The decay widths into massive gauge bosons V = W, Z and into fermions are directly proportional to the HVV and Hff couplings, respectively. At leading order the partial decay widths are given by the expressions [16]:
Γ(H → ff) = Nc (GF mH m²f)/(4√2 π) · [1 − 4m²f/m²H]^(3/2) ;   (1.82)

Γ(H → VV) = (GF m³H)/(16√2 π) · δV √(1 − 4x) (1 − 4x + 12x²) ,   x = m²V/m²H ,   (1.83)

where Nc is the color factor, Nc = 3 (1) for quarks (leptons), δW = 2 and δZ = 1.
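A quick numerical sketch of (1.82) and (1.83) (my own illustration, with approximate masses; on-shell decays only) reproduces the hierarchy of widths visible in Figure 1.12:

```python
import math

G_F = 1.16637e-5                   # Fermi constant, GeV^-2
m_w, m_z, m_b = 80.4, 91.2, 4.18   # approximate masses in GeV

def gamma_ff(m_h, m_f, n_c):
    """Leading-order Gamma(H -> f fbar) of Eq. (1.82), in GeV."""
    beta2 = 1.0 - 4.0 * m_f**2 / m_h**2
    return n_c * G_F * m_h * m_f**2 / (4.0 * math.sqrt(2.0) * math.pi) * beta2**1.5

def gamma_vv(m_h, m_v, delta_v):
    """Leading-order Gamma(H -> VV) of Eq. (1.83) for on-shell V, in GeV."""
    x = m_v**2 / m_h**2
    return (G_F * m_h**3 / (16.0 * math.sqrt(2.0) * math.pi) * delta_v
            * math.sqrt(1.0 - 4.0 * x) * (1.0 - 4.0 * x + 12.0 * x**2))

m_h = 200.0                        # an illustrative heavy Higgs mass
g_bb = gamma_ff(m_h, m_b, 3)       # a few MeV
g_ww = gamma_vv(m_h, m_w, 2.0)     # about 1 GeV
g_zz = gamma_vv(m_h, m_z, 1.0)     # about 0.4 GeV
print(g_bb, g_ww, g_zz)
```

At mH = 200 GeV the gauge boson widths are of order 1 GeV while Γ(H → bb) is only a few MeV, which is why the fermionic BRs collapse once the WW and ZZ channels open.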
The massive decay particles usually have short lifetimes themselves, thus only their decay products are detected. The signatures of these decay products are roughly divided into four categories: charged tracks (e.g. electrons, muons), jets
(e.g. quarks, hadrons), electromagnetic showers (e.g. photons, electrons), and
missing transverse energy (e.g. neutrinos). With the exception of the H → γγ
channel, the search for Higgs boson events starts with the reconstruction of
the intermediate particles from such signatures. The Higgs boson itself is then
reconstructed as a mass resonance, either from the final decay products or from
the reconstructed intermediates. The most promising channels are those where
the final state of the Higgs event stands out clearly against the huge background
of soft (hadronic) events. This limits the ability to discover the Higgs in several
of the existing channels.
The varying production cross sections and decay branching ratios of the Higgs result in different approaches to a direct search at ATLAS20, which will eventually be able to detect a SM Higgs boson over the full kinematic range between the LEP limit and a theoretical upper limit of 1 TeV. To discuss the Higgs decays,
it is useful to split the mass range from 100 GeV to 1 TeV in three regions:
• the “low mass” range: mH ≲ 130 GeV,
• the “intermediate mass” range: 130 GeV ≲ mH ≲ 180 GeV,
• the “high mass” range: 180 GeV ≲ mH ≲ 1 TeV.
20 ATLAS is one of the experiments installed at the LHC; its detector will be described in Chapter 2.
1.3.2.1 Low Mass Region
In the “low mass” range (mH ≲ 130 GeV), the most important decay channel for the Higgs, compatible with the accessible phase space, is H → bb (BR ∼ 75−50% for mH = 115−130 GeV, see Figure 1.12). Then there is
a set of final states with branching ratios one order of magnitude smaller than
bb, which are τ+τ− (BR∼ 7−5%), gg (BR∼ 7%) and cc (BR∼ 3−2%). Finally
the last ones with a probability of a few per mille are γγ and Zγ. As already
mentioned, decays to massless particles, gluons and photons, proceed through a
virtual loop of heavy fermions and/or gauge bosons with the major contribution
coming from the top quarks in the gluon channel and the W boson in case of
photons. Among all these final states not all of them can be used.
H → bb has the highest branching ratio at low masses, but it suffers from huge QCD backgrounds (the main background of this channel is the continuum tt production). The detection of the Higgs boson is therefore not feasible in the inclusive production; associated production can be used to gain additional rejection. This can be obtained by considering final states such as ttH, WH, ZH and exploiting the leptons coming from the decays of gauge bosons and top quarks21. As listed in Table 1.4, the main irreducible backgrounds to this channel for the associated production WH (searched for in the final state lνbb) are WZ → lνbb and Wbb. The reducible backgrounds come largely from events with a bb pair in the final state, mainly tt → WWbb, and from events in which jets are misidentified as b-jets, mainly W + jet. For the ttH production channel, the complexity of the final state (lνjjbbbb, in which the lepton pair and the jet pair come from the decays of the W s coming from the top quarks, one bb pair comes from the top quarks and the other from the Higgs) allows one to reduce the backgrounds, due to W + jets and QCD multijet production; for instance, four b-jets are required in the event. The irreducible backgrounds are due to ttbb and ttZ.
The search for the H → bb channel requires an efficient selection of b-jets22 among the jets, for the reconstruction of their invariant mass, and a good identification of the leptons. Experimentally, high-pT leptons (pT > 20 GeV) are searched for in the final state as the channel signature. This decay poses further problems in the reconstruction: the Higgs mass has to be reconstructed from two jets, with difficulties coming from the invisible energy of escaping neutrinos and from the energy lost outside the jet cone. As a result the mass peak will be wide. The ttH channel
also suffers from the combinatorial problem of selecting the correct combination of b-jets.

21 The HZ production is already suppressed in comparison with the HW production; taking into account the factor-of-three lower branching ratio for leptonic decays, the HZ production mode is of limited interest, as the rate will be low.

22 The method, called b-tagging, is based either on the long lifetime of the b-quarks, which causes secondary vertices, or on the high rate of leptons in B meson decays. While HW will in general have two b-quarks in the final state, Htt will have four, as t → Wb.
The H → γγ decay channel, despite its very low branching ratio of O(10−3), is one of the most important channels in the low mass range. The Higgs boson does
not couple directly to photons, which are massless particles. The H → γγ decay
is however possible through a W± boson or top quark loop, where the Higgs
couples to this virtual W± boson or top quark which in turn couples to the two
photons.
The trigger requires two isolated electromagnetic clusters. This decay mode suffers
less from background processes than the H → bb decay in associated produc-
tion23 and it is therefore the preferred channel in this mass range. Indeed, it
has a very clear signature with two isolated very energetic photons in the final
state forming a narrow invariant mass peak. The associated resolution on the
Higgs mass is around 1− 2 GeV.
This decay mode is particularly interesting for the case of the associated produc-
tion. In fact, if the associated W± or Z boson decays leptonically, the (isolated)
lepton can be used to find the decay vertex, yielding a better mass resolution
than what is possible for the direct production.
The irreducible background comes from the direct production of γγ (together
with W , Z or tt in the case of associated production), while the main reducible
background comes from jj and jγ final states where the jets have been misidentified as photons. This can occur especially if the jet is composed of one leading π0 and a number of soft hadrons. The rejection of these jets requires high angular granularity to distinguish the two photons coming from the π0 decay. Reducible backgrounds also include Z → e+e− events, in which electrons are misidentified as photons. These events are only a problem if mH ≈ mZ, which is ruled out by the LEP results. A photon that traverses material may be absorbed and subsequently knock a high-energy electron out of the material. If this happens before the photon is detected, then the electron is observed instead. Similarly, a high-energy electron may lose a sizable fraction of its energy by emitting a bremsstrahlung photon. Both processes lead to misidentification of photons.
It is the reducible background that puts strict constraints on the detector. For
a general purpose detector, the ability to reconstruct photon conversions is
required, as well as a good electron identification. In this region the Higgs
natural width is negligible, as a result the sensitivity is heavily affected by the
di-photon mass resolution, which calls for excellent reconstruction of the energy
and the direction of the photons. This is obtained by combining the high-granularity presampler and the first strip layer of the ATLAS electromagnetic calorimeters; a good separation of photons from jets is in fact required, along with a good resolution in the energy and in the opening angle of the photons.

23 The same decay can occur through direct production or through the associated production of WH, ZH or ttH, in which the final state contains jets generated from gauge boson or top quark decays.
The H → τ+τ− decay mode has the second highest branching fraction at low masses.
The final states depend on the tau decays: 42% of the time both taus decay
hadronically, in 46% of the cases one goes to hadrons and the other one to an
electron or a muon, and the remaining 12% are fully leptonic modes. The fully
hadronic channels are very challenging, while semi-leptonic and leptonic ones
(H → τ+τ− → ντντ νl−νl+, where l = e, µ) can more easily lead to a discovery
at LHC using the vector boson fusion production mode. The tau products in
the central region and tagging jets from VBF in the forward part of the detector
with a rapidity gap containing little hadronic activity are a rare topology for
QCD events. The rapidity gap arises from the lack of color flow between the
initial interacting particles. A central jet veto is therefore included allowing ef-
fective discrimination against backgrounds. Neutrinos in the final state prevent
the full reconstruction of the event and the Higgs mass is calculated using the
collinear approximation, which assumes that the visible decay products of the τ follow the same direction as the τ itself. This is a good approximation since mH/2 ≫ mτ and hence the taus are highly boosted. Resolutions of around 30% on mH are obtained.
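The collinear approximation described above can be sketched as follows (my own minimal implementation, not ATLAS code; four-vectors are simple (px, py, pz, E) tuples):

```python
import math

def collinear_mass(p1, p2, met):
    """
    Di-tau invariant mass in the collinear approximation (a sketch of the
    method described in the text). p1, p2 are the visible tau decay
    products as (px, py, pz, E); met is (mex, mey).
    """
    # Solve met = (1/x1 - 1)*p1_T + (1/x2 - 1)*p2_T for the visible
    # momentum fractions x1, x2 (a 2x2 linear system in the transverse plane).
    det = p1[0] * p2[1] - p1[1] * p2[0]
    x1 = 1.0 / (1.0 + (met[0] * p2[1] - met[1] * p2[0]) / det)
    x2 = 1.0 / (1.0 + (p1[0] * met[1] - p1[1] * met[0]) / det)
    # Visible invariant mass, then scale up by the momentum fractions
    e = p1[3] + p2[3]
    px, py, pz = (p1[i] + p2[i] for i in range(3))
    m_vis = math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
    return m_vis / math.sqrt(x1 * x2)   # m_tautau ~ m_vis / sqrt(x1*x2)

# Toy event: massless taus carrying 60% and 80% visible momentum fractions.
p1, p2 = (30.0, 0.0, 0.0, 30.0), (-24.0, 32.0, 0.0, 40.0)
met = (14.0, 8.0)
print(collinear_mass(p1, p2, met))
```

In this exactly collinear, massless toy event the method recovers the true di-tau mass; in real events the finite MET resolution smears it, giving the ∼30% resolution quoted above.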
The main sources of background are the Z + jets and tt production; the secondary backgrounds include W + jets and single top production.
The main background after the selection cuts is Z → τ+τ−, with an invariant
mass peak close to the signal for mH . 130 GeV.
In conclusion, the bb decay mode is typically overwhelmed by the QCD background, while cc and gg are never considered because, firstly, if the hadronic final states could be detected they would be dominated by bb and, secondly, there are no proper “tagging” algorithms for c quarks and gluons as there are for b quarks. Therefore at the LHC the two main decay modes explored in the low mass range are τ+τ− and γγ. Figure 1.13 shows the achievable discovery sensitivity for these channels at 30 fb−1; the importance of ttH in the region around the LEP limit is evident from the plot.
1.3.2.2 Intermediate Mass Region
If 130 GeV ≲ mH ≲ 2mZ, the Higgs boson decays mostly into vector bosons: WW(∗) or ZZ(∗), with one virtual gauge boson (denoted by the “∗” superscript) below the 2mV (V = W, Z) kinematical threshold.
The main decay mode of the Z boson is hadronic (∼ 70%), typically resulting in
two jets. A large fraction of the leptonic decays are to two neutrinos (∼ 20%),
which are invisible. Decays to pairs of electrons, muons, or τs make up ∼ 10%
of the total. The W± boson decays mainly to hadrons as well (∼ 68%). The re-
maining leptonic decays are not to pairs, but always to a neutrino and a charged
lepton because of charge and lepton number conservation [1]. The neutrinos will
result in missing transverse energy.
Apart from ZZ(∗) and WW(∗), the only other decay mode which survives is
the bb decay which has a branching ratio that drops from 50% at mH ∼ 130
GeV to the level of a few percent for mH ∼ 2mW . The WW (∗) decay starts to
dominate at mH ∼ 130 GeV and becomes gradually overwhelming, in particular for 2mW ≲ mH ≲ 2mZ where the W boson is real while the Z boson is still virtual, strongly suppressing the H → ZZ(∗) mode and leading to a WW branching ratio of almost 100% [16]. As can be seen in Figure 1.12, around 160
GeV, the threshold for the production of a pair of real W bosons, all BRs into fermions
and even into ZZ drop.
Therefore at these Higgs boson masses H → WW is the dominant decay mode,
with both W bosons decaying as W → lν. It has the highest discovery potential (see Figure 1.13)24.
This is a consequence of the large branching ratios and rather clean signature.
The presence of two high-pT isolated leptons and large missing transverse energy
provides efficient trigger and great reduction against QCD processes. Unfortu-
nately, the involvement of neutrinos in the leptonic decays of the W bosons implies that no narrow invariant mass peak can be reconstructed.
The irreducible backgrounds are due to the production of the continuum W+W− → l+νl−ν25 and of ZZ → l+l−νν. The reducible backgrounds come from tt (the two top quarks decay into a pair of W bosons and two b-jets), Wt, Wbb, bb and W + jets, in which a jet is misidentified as an electron (lepton). The dominant backgrounds are in particular WW and tt (as shown in Table 1.4), with real leptons and neutrinos in the final state; they can be distinguished from the signal using the jet activity and the angle between the leptons. For the signal, the charged leptons tend to go together, given the scalar nature of the Higgs and the chirality of the neutrinos. The backgrounds in which the lepton pair comes from a Z are instead rejected by requiring that the invariant mass of the di-lepton system is less than 80 GeV.
All the production modes can be explored, but an exclusive VBF analysis can
further improve signal over background ratios. The use of the vector boson
fusion production features, with the two hard jets in the forward and backward
regions usually identified with a specific forward tagging algorithm, and the lack
of hadronic activity in the central region, indicated by applying a veto on jets with transverse energy above some threshold, allows rejection of the reducible background.

24 In the decay H → W+W− → l+νl−ν plus two jets from the VBF production, the significance for 30 fb−1 of integrated luminosity is above 5σ over the full intermediate mass range.

25 The WW continuum is of the same order as the WW signal.
Up to mH ∼ 150 GeV and above mH ∼ 170 GeV the Higgs boson decay into two Z bosons, each of which decays into two leptons, is important. Between these two values, as already mentioned, the opening of H → WW into on-shell W bosons causes a drop in the H → ZZ BR, and thus reduces its possible contribution to the discovery (see Figure 1.13). As the Higgs boson is not massive
enough, one of the Z bosons will be virtual. The most promising channels are
those where the Z bosons decay into electrons or muons. Although τ leptons
can be used, the reconstruction of a Higgs mass peak from Z → ττ decays is
difficult and relatively inefficient, because of the presence of neutrinos which
escape detection. Furthermore, the efficiency of identifying a τ -pair is rather
low. Therefore, only the decays into muons and electrons, and the backgrounds
that affect these channels, are to be considered.
The H → ZZ(∗) → 4l channel is rather clean and it is therefore referred to as
the “Golden Channel” at ATLAS. The excellent energy resolution and linearity
of the reconstructed electrons and muons leads to a narrow 4-leptons invariant
mass peak on top of a smooth background.
The presence of a real Z provides two high-pT leptons in the final state, together with two other leptons coming from the virtual Z. A mass constraint can be applied to both lepton pairs. The analysis then requires from the detector high-quality lepton identification, lepton trajectory reconstruction, and good lepton momentum resolution. The former two are needed to find the four-lepton signature, and the latter is important in order to reconstruct the intermediate (real) Z bosons, allowing one to reduce the irreducible background coming from the ZZ(∗) and Zγ(∗) continua, followed by purely leptonic decays in the first case and also by pair production in the latter. The most important reducible backgrounds are tt (tt → Wb + Wb → lν + lνc + lν + lνc) and Zbb (Zbb → llll) events that result in a four-lepton final state. The former dominates because of the large top production cross section; the latter is harder to reject because of the real Z in the experimental signature, but isolation requirements on the leptons in the final state, an efficient ability to identify b-jets and impact parameter requirements26 provide a sufficient background rejection.
In this channel the Higgs mass can be fully reconstructed; for mH < 220 GeV, where the Higgs is a narrow resonance, the mass resolution is therefore a driving factor for the discovery potential. H → 4l analyses will be discussed in detail in Chapter 3.
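The reconstruction of the Higgs as a four-lepton mass resonance amounts to summing four-vectors; a minimal sketch with an invented toy event (lepton pairs at an on-shell-Z-like and a lighter Z∗-like mass, all numbers illustrative):

```python
import math

def inv_mass(particles):
    """Invariant mass of a list of (px, py, pz, E) four-vectors, in GeV."""
    px, py, pz, e = (sum(p[i] for p in particles) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

# Toy event: two back-to-back massless lepton pairs, chosen so that the
# pair masses come out at round numbers (80 and 60 GeV).
leptons = [(40.0, 0.0, 0.0, 40.0), (-40.0, 0.0, 0.0, 40.0),
           (0.0, 30.0, 0.0, 30.0), (0.0, -30.0, 0.0, 30.0)]

m12 = inv_mass(leptons[:2])   # first lepton pair (Z-like)
m34 = inv_mass(leptons[2:])   # second lepton pair (Z*-like)
m4l = inv_mass(leptons)       # the reconstructed Higgs candidate mass
print(m12, m34, m4l)
```

The mass constraint on the lepton pairs mentioned above corresponds to requiring m12 (and, where possible, m34) to be compatible with mZ before accepting the 4l candidate.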
26 The impact parameter has average values for the signal that are lower than those for the background.
1.3.2.3 High Mass Region
In the “high mass” range (180 GeV . mH . 1 TeV) the Higgs boson decays
exclusively into the massive gauge boson channels with a branching ratio of
∼ 2/3 for WW and ∼ 1/3 for ZZ final states, slightly above the ZZ threshold
[16]. While the latter involves two identical particles (ZZ), the former includes
two different ones (W±W∓), two bosons with opposite charge, which justifies
the factor 2 between them. Then even at the ZZ kinematic threshold, the
WW final state is still more probable than ZZ because the neutral current
(NC) coupling is smaller than the charged current (CC) one. Finally around
mH & 350 GeV also the decay in tt is allowed, but its branching ratio remains
smaller than WW and ZZ ones (see Figure 1.12). In particular for high Higgs
masses: the H → tt branching ratio is at the level of 20% slightly above the
2mt threshold and starts decreasing for mH ∼ 500 GeV to reach a level below
10% at mH ∼ 800 GeV. The reason is that while the H → tt partial decay
width grows as mH , the partial decay width into (longitudinal) gauge bosons
increases as m3H (see 1.82 and 1.83). Then the decay H → tt is not a Higgs
boson discovery channel at LHC.
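The competition between these two scalings can be sketched numerically. In the Python toy below the overall normalisations are arbitrary placeholders; only the mH versus mH^3 dependence is taken from the text:

```python
# Leading-order scalings quoted in the text (Eqs. 1.82-1.83):
#   Gamma(H -> tt) grows like m_H   (far above the 2*m_t threshold),
#   Gamma(H -> VV) grows like m_H^3 (longitudinal gauge bosons).
# The coefficients c_tt and c_vv are arbitrary: only the mass
# dependence of the ratio matters for the argument.

def br_tt(m_h, c_tt=1.0, c_vv=1.0):
    """Toy branching ratio of H -> tt against H -> WW/ZZ."""
    gamma_tt = c_tt * m_h         # ~ m_H
    gamma_vv = c_vv * m_h ** 3    # ~ m_H^3
    return gamma_tt / (gamma_tt + gamma_vv)

# The ratio of the two partial widths shrinks like 1/m_H^2,
# so BR(H -> tt) decreases monotonically as m_H grows:
assert br_tt(800.0) < br_tt(500.0) < br_tt(400.0)
```

Whatever the normalisations, the tt fraction is driven down by the cubic growth of the gauge-boson widths, which is the point made above.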
In the mass region 180 GeV ≲ mH ≲ 700 GeV, the H → ZZ → 4l decay
provides a powerful, almost background-free, channel for the Higgs discovery.
Since leptons coming from real Z boson decays have a high pT, the ZZ and
Zγ backgrounds can be rejected with an efficient pT cut on the reconstructed
Z.
If mH ≳ 700 GeV the discovery in this channel is limited by the decrease of the
inclusive Higgs production cross section and by the increase of the Higgs width:
a larger signal width increases the irreducible continuum background, and the
large width of a heavy Higgs also makes it impossible to observe a mass peak.
Full coverage of the theoretically allowed mH range is therefore obtained
by looking at channels with larger rates, even though they suffer from large QCD
backgrounds, such as:
• H → ZZ → l+l−νν;
• H →W+W− → lνjj;
• H → ZZ → lljj.
The first channel has a branching ratio six times larger than that of the four-lepton
channel27. It has a large missing transverse energy signature, which can be
exploited if the detector provides a complete measurement of the energy flow,
without holes. The leptons in these channels provide additional handles for experimental
27Since the branching ratio of the Z into neutrinos is larger than that into charged leptons.
identification. The H → W+W− → lνjj channel has a fourfold larger rate than the
process with a double leptonic decay. The last channel dominates over
the purely leptonic signature because of the large branching ratio for the
hadronic Z decay.
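The factor of six quoted for the first channel can be checked from the Z branching ratios; the sketch below uses rounded PDG values, which are not quoted in the text:

```python
# Approximate Z branching ratios (rounded PDG values, assumed here):
br_z_nunu = 0.20    # Z -> nu nubar, summed over the 3 neutrino flavours
br_z_ll = 0.067     # Z -> l+ l-, summed over e and mu only

# In H -> ZZ -> l+ l- nu nubar either Z can be the one decaying
# invisibly, which gives the combinatorial factor of 2:
ratio = 2 * br_z_nunu / br_z_ll
print(f"BR(ll nunu) / BR(4l) ~ {ratio:.1f}")   # ~ 6
```

The result, close to 6, matches the rate advantage claimed for the l+l−νν final state over the four-lepton one.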
The main sources of background for these channels consist of the WW, WZ,
and ZZ continua, which are of the same order of magnitude as the signal, as well
as tt, Zjj, and W±jj events, which are three orders of magnitude larger. The
reconstruction of the intermediate vector bosons and the subsequent application
of a mass window allow for an effective rejection of the latter three background
processes. The intermediate bosons can have a very large transverse momentum
(pT > 350 GeV/c) because of the high mass of the Higgs boson, and the same is
then true for the jets and the leptons that result from their decay. The jets will be
confined to a small region because of the Lorentz boost and, with a sufficiently
high-granularity measurement of the energy flow to disentangle the two jets,
provide an additional signature to reject background. Finally, if the Higgs boson
is produced through gauge boson fusion, there will be a color suppression
in the central region (no central jets other than those coming from the W/Z
decay) and one or two forward jets with a large transverse momentum (one in
the positive η hemisphere, the other in the negative one). These jets are the
result of the quarks left over from the fusion (see Figure 1.9). These kinematic
features make it possible in particular to reduce the tt background, which is
characterized by a large number of central jets.
1.3.3 Discovery Potential
Figure 1.13: The significance for the SM Higgs boson discovery in various channels at ATLAS as a function of mH, considering 30 fb−1 of collected data over the full mass region at √s = 14 TeV [20].
The observation of a signal with a significance of 5 standard deviations (5σ),
defined according to the estimator S/√B, where S (B) is the expected number
of signal (background) events, can be claimed as a discovery of the Higgs boson.
Figure 1.13 shows the sensitivity for the Higgs discovery in units of S/√B for
the individual channels as well as for the combination of the various channels,
assuming an integrated luminosity of 30 fb−1 at √s = 14 TeV. In this evaluation
no K-factors, i.e. higher-order QCD corrections to the Higgs production cross
sections and distributions, given by the ratio between the NLO and LO total
cross sections, K = σNLO/σLO, have been included28. Note that this plot does
not reflect the current situation: by the end of 2012 ∼ 20 fb−1 should be
collected by the LHC running at √s = 8 TeV, and the luminosity collected so far is
∼ 5.61 fb−1.
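The counting estimator and the role of the K-factors can be illustrated with a short sketch; the signal and background yields below are hypothetical numbers chosen for illustration only:

```python
import math

def significance(s, b):
    """Simple counting significance S / sqrt(B), as used in Figure 1.13."""
    return s / math.sqrt(b)

def significance_with_k(s, b, k_s, k_b):
    """Rescaling both yields by K-factors multiplies S/sqrt(B)
    by K_S / sqrt(K_B) (footnote 28)."""
    return significance(k_s * s, k_b * b)

# Hypothetical yields, for illustration only:
s, b = 20.0, 10.0
assert significance(s, b) > 5.0   # would qualify as a "5 sigma" discovery

# Neglecting K-factors is conservative whenever K_S > sqrt(K_B):
assert significance_with_k(s, b, 2.0, 1.3) > significance(s, b)
```

This makes footnote 28 explicit: including K-factors with K_S > √K_B can only increase the quoted significance.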
1.3.4 The Higgs Mass and Total Decay Width
After the detection of the Higgs boson, the LHC experiments may, with higher
integrated luminosity, also look into some of its properties, such as its mass and width.
An integrated luminosity of 300 fb−1 at √s = 14 TeV is assumed in the
following [16, 21, 20].
The Higgs mass can be measured with very good accuracy. In the range
mH ≲ 400 GeV, where the total width is not too large, a relative precision of
∆mH/mH ∼ 0.1% can be achieved in the channel H → ZZ(∗) → 4l. In the low
Higgs mass range, a slight improvement can be obtained by reconstructing the
sharp H → γγ peak. In the range mH ≳ 400 GeV the precision starts to deteriorate,
because the Higgs boson width becomes large and because the smaller
production rates increase the statistical error. However, a precision of the
order of 1% can still be achieved for mH ∼ 700 GeV if theoretical errors, such
as width effects, are not taken into account.
The total decay width of the Standard Model Higgs boson is shown in Figure
1.14. For low masses, below 130 GeV, the Higgs boson is a very narrow resonance
with ΓH ≲ 10 MeV, but once the real and virtual gauge boson decay channels
open, the width grows rapidly, reaching ∼ 1 GeV slightly above the ZZ
threshold. For larger Higgs masses, mH ≳ 500 GeV, the Higgs boson becomes
“obese”: its decay width is comparable to its mass because of the longitudinal
gauge boson contributions in the decays H → WW, ZZ. For mH ∼ 1 TeV one
has a total decay width of ΓH ∼ 700 GeV, resulting in a very broad resonant
structure. The resonance is no longer visible in the invariant mass plots, since
it is too spread out, which makes a discovery through a mass peak impossible.
So the Higgs boson signal cannot
28This is a conservative assumption, provided the K-factor for the signal process of interest is larger than the square root of the K-factor for the corresponding background process.
be detected in an invariant mass fit, but only in an event-counting experiment. In
this case the lifetime of the Higgs boson is so short that it would be improper
to call it a “particle”: it is better to regard the Higgs boson as an intermediate
state that enhances the cross sections or opens new final-state channels
in the scattering of two particles. However, as previously discussed, such masses
(mH ≳ 500 GeV) are highly disfavoured by electroweak precision data.
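The disappearance of the peak can be illustrated with a relativistic Breit-Wigner line shape; the "contrast" criterion below is a toy construction, not a quantity defined in the text:

```python
def rel_breit_wigner(m, m_h, gamma):
    """Relativistic Breit-Wigner line shape (unnormalised) as a
    function of the invariant mass m."""
    return 1.0 / ((m ** 2 - m_h ** 2) ** 2 + (m_h * gamma) ** 2)

def peak_contrast(m_h, gamma):
    """Peak height relative to the value at m = 1.5 * m_H, a crude
    measure of how much the resonance stands out."""
    return rel_breit_wigner(m_h, m_h, gamma) / rel_breit_wigner(1.5 * m_h, m_h, gamma)

# Narrow Higgs (mH = 200 GeV, Gamma ~ 2 GeV): enormous contrast.
assert peak_contrast(200.0, 2.0) > 1e4
# "Obese" Higgs (mH ~ 1 TeV, Gamma_H ~ 700 GeV): contrast of order
# unity, i.e. no visible peak above the continuum.
assert peak_contrast(1000.0, 700.0) < 10
```

With ΓH comparable to mH the line shape is essentially flat over the accessible mass range, which is why only an event-counting analysis remains possible.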
The Higgs boson width can be experimentally obtained from a measurement
of the width of the reconstructed Higgs peak, after unfolding the contribution
of the detector resolution. Using H → ZZ → 4l, this direct measurement is only
possible for Higgs masses larger than 200 GeV, above which the intrinsic
width of the resonance becomes comparable to or larger than the experimental
mass resolution, which is typically of the order of a few GeV. While the precision
is rather poor near this mass value, approximately 60%, it improves to
the level of ∼ 5% around mH ∼ 400 GeV29 and stays almost constant up to
mH ∼ 700 GeV. Below mH ≃ 200 GeV the width is too small to be resolved
experimentally and can only be determined indirectly.
Figure 1.14: SM Higgs total decay width in GeV as a function of mH [19].
29For the higher masses the intrinsic width becomes larger and its contribution to the total resolution dominates compared to the detector resolution.
Chapter 2
LHC and the ATLAS Detector
The Large Hadron Collider (LHC) [22] is currently the world’s largest hadron
collider, with protons accelerated in a 27 km circumference synchrotron and
colliding at energies never reached before. The LHC is designed to collide pp
pairs at a center of mass energy (√s) of 14 TeV at a peak luminosity of
10^34 cm^−2 s^−1, and is currently colliding at a center of mass energy of 7 TeV. The
ATLAS (acronym for A Toroidal LHC ApparatuS) detector is one of the two
general purpose detectors positioned on the synchrotron ring, aimed at detecting
new rare physics, in particular the Higgs boson.
This chapter presents an overview of the LHC in Section 2.1 and of the ATLAS
detector in Section 2.2, with a focus on the ATLAS subdetectors used to measure
the energy and momentum of leptons, jets and missing ET (the missing transverse
energy, the neutrino signature). A brief description of the ATLAS trigger system
and of the lepton reconstruction is also presented in Sections 2.2.8, 2.2.9 and 2.2.10
respectively.
2.1 The Large Hadron Collider
The LHC is installed in a circular tunnel, previously occupied by LEP (the Large
Electron Positron collider), at the European Organization for Nuclear Research
(CERN1) near the French-Swiss border, north-west of Geneva. The tunnel
is about 27 km long and lies about 100 m underground.
The LHC is a high energy, high luminosity collider, aimed at discovering new
physics. The principal “modus operandi” is delivering as high a luminosity as
possible, expanding the statistical reach to very rare events, while delivering as
high an energy as possible, enabling the production of rare physics at higher
1 Conseil Européen pour la Recherche Nucléaire.
2. LHC and the ATLAS Detector 2.1 The Large Hadron Collider
probability. It is a proton-proton collider: protons are chosen as collision objects
to enable both factors. The use of protons rather than electrons allows a
higher collision energy by avoiding large energy losses through synchrotron radiation.
The choice of proton-proton instead of proton-antiproton collisions is due to
the limited production capability of antiprotons.
Figure 2.1: The cross section of the physics processes as a function of the center of mass energy at a hadron collider. The dashed lines correspond to the Tevatron (1.96 TeV) and LHC (7 and 14 TeV) collision energies [23].
For accelerators like the LHC, the most important parameters are the beam
energy and the number of interesting collisions, since the production rate of a
particular process varies with these two quantities (see Figure 2.1). The number
of collisions is quantified by the machine luminosity, defined as
L = Nb² nb frev γr F / (4π εn β∗),    (2.1)
where Nb is the number of particles per bunch, nb is the number of bunches
per beam, frev is the revolution frequency, γr is the relativistic factor, εn is the
normalized transverse beam emittance, β∗ is the beta function at the collision
point, and F is the geometric luminosity reduction factor due to the crossing
angle at the interaction point.
The number of events per second generated in the LHC collisions is given by
Nevent = Lσevent, (2.2)
where σevent is the cross section for the process under study.
The LHC is designed to collide proton beams at a center of mass energy of
14 TeV and an instantaneous luminosity of L = 10^34 cm^−2 s^−1, which will extend
the frontiers of particle physics. The nominal numbers of bunches and of protons per
bunch are nb = 2808 and Nb = 1.15·10^11 respectively. The revolution frequency
of a bunch of protons is 11.245 kHz and beam crossings are 25 ns apart.
Considering only the inelastic cross section, which is about 60 mbarn at √s = 7
TeV, the inelastic event rate at nominal luminosity (NInEvent) is
NInEvent = 10^34 · 60 · 10^−3 · 10^−24 = 600 million/s.    (2.3)
The average crossing rate (R) is given by
R = nb frev = 2808 · 11245 = 31.6 MHz,    (2.4)
so the number of inelastic events per crossing at nominal luminosity (NCevent)
turns out to be
NCevent = NInEvent / R ≃ 19.    (2.5)
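Equations (2.1)-(2.5) can be checked numerically. The sketch below takes the nominal parameters from Table 2.1 and derives γr from the proton mass (0.938 GeV, a value not quoted in the text):

```python
import math

def luminosity(n_b, nb, f_rev, gamma_r, eps_n, beta_star, F=1.0):
    """Machine luminosity, Eq. (2.1); SI inputs give L in m^-2 s^-1."""
    return n_b ** 2 * nb * f_rev * gamma_r * F / (4 * math.pi * eps_n * beta_star)

# Nominal LHC parameters from Table 2.1 (gamma_r = E / m_p assumed):
L = luminosity(n_b=1.15e11, nb=2808, f_rev=11245,
               gamma_r=7000 / 0.938, eps_n=3.75e-6, beta_star=0.55)
L_cm = L / 1e4                      # m^-2 s^-1  ->  cm^-2 s^-1
# Without the geometric reduction factor F this lands close to 10^34:
assert 1.0e34 < L_cm < 1.3e34

# Event-rate bookkeeping of Eqs. (2.2)-(2.5):
sigma_inel = 60e-27                 # cm^2 (~60 mbarn)
n_inel = 1e34 * sigma_inel          # inelastic events per second
per_crossing = n_inel / (2808 * 11245)
assert round(per_crossing) == 19    # ~19 pile-up events per crossing
```

The small excess over 10^34 in the first check is absorbed by the crossing-angle factor F < 1 of Eq. (2.1).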
The LHC also has the capacity to collide heavy ions, in particular lead nuclei,
at 2.76 TeV per nucleon, at a design luminosity of 10^27 cm^−2 s^−1.
When the bunch spacing is 75 ns or less, the beams are brought into collision
at a crossing angle to reduce beam-beam interactions at points other than the
nominal collision point.
The particle energy is mainly limited by the magnetic field of the bending
dipoles that keep the beam in a circular orbit. The magnets can produce a
magnetic dipole field of 8.33 T, as required at a beam energy of 7 TeV.
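The quoted field and beam energy are consistent with the standard bending relation p[GeV] ≈ 0.3 B[T] ρ[m]; the effective bending radius used below (~2804 m) is an approximate figure assumed for illustration, not taken from the text:

```python
# p[GeV] ~ 0.3 * B[T] * rho[m] for a singly charged particle on a
# circular orbit. The effective dipole bending radius of the LHC is
# assumed here to be ~2804 m (the tunnel is larger, since only part
# of the ring is filled with dipoles).
B = 8.33            # T, nominal dipole field
rho = 2804.0        # m, assumed effective bending radius
p = 0.3 * B * rho   # GeV
print(f"maximum beam momentum ~ {p / 1000:.1f} TeV")
```

The result is close to 7 TeV, matching the design beam energy quoted above.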
The specific parameters for the LHC at the latest and nominal operation are
summarized in Table 2.1.
2.1.1 Architectural Overview
As a particle-particle collider, the LHC needs two rings with counter-rotating
beams, unlike particle-antiparticle colliders, which can have both beams
Parameter                   Late 2011                   Nominal
Beam energy                 3.5 TeV                     7 TeV
Instantaneous luminosity    3.6·10^33 cm^−2 s^−1        10^34 cm^−2 s^−1
Bunch spacing               50 ns                       25 ns
Particles per bunch         1.1·10^11                   1.15·10^11
Bunches per beam            1380                        2808
Crossing angle              120 µrad                    285 µrad
β∗                          1 m                         0.55 m
εn                          1.9-2.3 µm                  3.75 µm

Table 2.1: The LHC parameters for the 2011 operation and their design values [22, 24]. The quoted value for the instantaneous luminosity is the peak value.
sharing the same phase space in a single ring. The limited space in the tunnel
led to the “two-in-one” superconducting magnet design, which accommodates
the windings for the two beam channels in a common cold mass and cryostat,
with the magnetic flux circulating in opposite senses through the two channels.
The LHC synchrotron uses a total of about 1600 bending and focusing
magnets: 1232 identical dipole magnets, which keep the particles in their nearly
circular orbits, and 392 identical quadrupole magnets, which focus the beams. The
dipoles are placed in the curved sections of the LHC ring and the quadrupoles
in the straight sections. All of these magnets use superconducting
niobium-titanium (Nb-Ti) cables and operate at a low temperature of 1.9 K.
As shown in Figure 2.2(a), the LHC is not a perfect circle: it is made of eight
arcs and eight straight sections.
Figure 2.2: Figure (a): layout of the LHC ring with the four interaction points [25]. Figure (b): the LHC injection scheme [26].
An insertion consists of a long straight section
plus two (one at each end) transition regions, the so-called “dispersion suppressors”.
The exact layout of the straight section depends on the specific use of
the insertion: physics (beam collisions within an experiment), injection, beam
dumping, beam cleaning. A sector is defined as the part of the machine between
two insertion points. An octant starts from the middle of an arc and ends in
the middle of the following arc and thus spans a full insertion.
Four insertions are used as experimental insertions, where six experiments are
installed: ALICE (A Large Ion Collider Experiment), ATLAS, CMS (Compact
Muon Solenoid), LHCb (Large Hadron Collider beauty), LHCf (Large Hadron
Collider forward) and TOTEM (TOTal Elastic and diffractive cross section
Measurement). ALICE, ATLAS, CMS and LHCb are installed in four huge
underground caverns built around the four collision points of the LHC beams,
in particular, ATLAS and CMS are located at the two highest-luminosity
interaction points, diametrically opposite each other. TOTEM is installed close to the
CMS interaction point and LHCf is installed near ATLAS.
• ATLAS and CMS are two general purpose detectors designed to cover the
widest possible range of physics at the LHC, from the search for Higgs
boson to supersymmetry (SUSY) and extra dimensions.
• ALICE is a detector specialized in analysing lead-ion collisions. It will
study the properties of quark-gluon plasma2.
• LHCb is designed mainly for the study of B-physics. Specialized study
of the slight asymmetry between matter and antimatter present in the
decays of B hadrons may lead to the discovery of new physics.
• LHCf is a small experiment that is constructed to measure neutral parti-
cles produced very close to the direction of the beams at the LHC. The
motivation is to test models used to estimate the primary energy of the
ultra high-energy cosmic rays.
• TOTEM is also a small experiment dedicated to the measurement of the
total proton-proton cross section with a luminosity-independent method.
2.1.1.1 Acceleration Chain
Before entering the LHC main ring, protons need to go through a pre-
acceleration chain, called the LHC injector chain, which increases their energy
to 450 GeV. The pre-acceleration chain involves a linear accelerator (LINAC)
and three smaller synchrotron rings. Figure 2.2(b) shows the injection scheme
of LHC.
2A state of matter where quarks and gluons, under conditions of very high temperaturesand densities, are no longer confined inside hadrons. Such a state of matter probably existedjust after the Big Bang, before particles such as protons and neutrons were formed.
2. LHC and the ATLAS Detector 2.2 ATLAS Detector
Protons are produced by stripping the electrons from hydrogen atoms taken
from a simple bottle of hydrogen gas, and are pre-accelerated in a radio-frequency
(RF) cavity to 750 keV. After this, they are injected into LINAC2, which
increases their energy to 50 MeV. Afterwards, they are injected step by step into
the Proton Synchrotron Booster (PSB), the Proton Synchrotron (PS), and the
Super Proton Synchrotron (SPS). The protons stay in each of the three rings
until they reach the targeted energy: they leave the PSB at 1.4 GeV, leave the
PS at 25 GeV, and in the SPS reach 450 GeV, after which they are transferred
to the LHC synchrotron, where they are accelerated for about 20 minutes to
their nominal energy of 7 TeV. In Figure 2.2(b) the corresponding proton
velocity at the end of each acceleration step is indicated.
Once the desired energy is reached, the beams are kept at that energy for several
hours (from 10 to 20 hours) and squeezed (compressed to be as thin as possible)
so that the collisions at the experiments reach the best possible collision
rate (luminosity). The beams are then brought into collision and the physics of
the collisions is recorded by the experiments listed above. Once the beams are
depleted, the remains are dumped and the current in the magnets is ramped
down; then the whole cycle restarts from the beginning.
On September 19th, 2008, the LHC suffered a quench incident during
commissioning of the final LHC sector (sector 3-4) for operation at a beam energy of
5 TeV, resulting in a large helium leak into the tunnel and serious mechanical
damage to 24 dipole magnets and 5 quadrupole magnets. After a year-long
shutdown, during which replacement magnets were installed, the damage was
repaired and the quench monitoring system was improved, the LHC resumed
operation in late 2009. In December 2009, the first proton-proton collisions at a
center of mass energy of 900 GeV were delivered by the LHC. At the end of March
2010, the first collisions at √s = 7 TeV were recorded. The LHC ran at 7 TeV
until the end of 2011; it has been decided that it will run at 8 TeV until the end
of 2012, after which there will be a 1.5-year shutdown to make improvements.
In 2014 the LHC is expected to come online at 14 TeV.
2.2 ATLAS Detector
ATLAS (A Toroidal LHC Apparatus) [27, 28] is one of the four experiments
approved to run at the LHC, and it has been designed to be a general-purpose
detector, meaning it should be versatile enough to detect physics signals with a
wide range of signatures. The ATLAS detector is searching for new discoveries
in the head-on collisions of protons of extraordinarily high energy. ATLAS will
learn about the basic forces that have shaped our Universe since the beginning
of time and that will determine its fate. Among the possible unknowns are the
origin of mass, extra dimensions of space, unification of fundamental forces, and
evidence for dark matter candidates in the Universe.
2.2.1 Geometry and Definitions
In the following discussion, a right-handed Cartesian coordinate system is
defined by placing the origin at the interaction point (I.P.) in the middle of the
detector. The z-axis is defined along the beam direction and the x−y plane
is transverse to the beam direction. The positive x-axis is defined as pointing
from the interaction point to the center of the LHC ring and the positive y-axis
is defined as pointing vertically upwards. Side A of the detector is defined as
that with positive z and side C as that with negative z.
Customarily a cylindrical coordinate system is used as well; it is defined by R,
φ and θ.
• R is the radial distance from the beam axis (R = √(x² + y²));
• φ is the azimuthal angle, measured around the beam axis in the plane
perpendicular to z (−π < φ < π);
• θ is the polar angle, formed by the direction of the emitted particle with
the z-axis and measured from the beam axis (0 < θ < π).
At collider experiments it is quite common to introduce the pseudorapidity
variable η in place of θ. The pseudorapidity is related to θ by
η = − ln tan(θ/2),    (2.6)
and has the great advantage of transforming additively under Lorentz boosts
along the beam axis, so that the difference ∆η = η2 − η1 is a relativistic invariant.
In the case of massive objects such as jets, the rapidity
y = (1/2) ln[(E + pz)/(E − pz)]    (2.7)
is used instead, as the pseudorapidity approximates the rapidity only when the
mass can be neglected with respect to the energy.
The transverse momentum pT, the transverse energy ET, and the missing
transverse energy ETmiss are defined in the x−y plane unless stated otherwise.
Additional parameters needed to fully describe the track of a particle are the
transverse impact parameter d0, the distance of the track’s point of closest
approach to the beam axis in the transverse plane, and z0, the longitudinal
coordinate of this same point (see Section 3.5.2). The spatial separation of two
particle tracks is expressed in terms of ∆R, defined in pseudorapidity-azimuthal
angle space as
∆R = √(∆η² + ∆φ²).
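The definitions above translate directly into small helper functions; the φ-wrapping convention in `delta_r` is an implementation choice, not something specified in the text:

```python
import math

def pseudorapidity(theta):
    """Eq. (2.6): eta = -ln tan(theta / 2)."""
    return -math.log(math.tan(theta / 2))

def rapidity(e, pz):
    """Eq. (2.7): y = (1/2) ln[(E + pz) / (E - pz)]."""
    return 0.5 * math.log((e + pz) / (e - pz))

def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation, with the phi difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

# For a massless particle E = |p|, so rapidity and pseudorapidity agree:
theta, p = 0.4, 50.0
eta = pseudorapidity(theta)
y = rapidity(p, p * math.cos(theta))
assert abs(eta - y) < 1e-9

# Tracks at phi = +3 and phi = -3 are actually close in angle:
assert delta_r(0.0, 3.0, 0.0, -3.0) < 1.0
```

The final check is exactly the statement that η approximates y only up to mass effects: for a massless particle the two coincide identically.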
2.2.2 Physics Requirements
The main goals of the ATLAS detector are to perform precision tests of QCD
and electroweak interactions and to search for the Higgs boson and new physics. In
particular, the search for the SM Higgs boson has been used as a benchmark
for the design of the ATLAS detector. Since the dominant decay channel of the
Higgs boson is unknown, due to its unknown mass, the detector has to be able to
cope with all possible decay scenarios. These benchmark requirements, combined
with the high luminosity, high beam energy and high background production at
the LHC, have been translated into the following set of design requirements:
• Because of the very high luminosity and large particle flux, the detectors
need fast, radiation-hard electronics and sensor elements. In addition,
high granularity is needed to handle the large number of particles and to
reduce the influence of overlapping events.
• Large acceptance in pseudorapidity with almost full azimuthal angle cov-
erage is required. It ensures no high momentum particle can escape de-
tection.
• Good charged-particle momentum resolution and reconstruction efficiency
in the inner tracker are essential. For offline tagging of τ -leptons and b-
jets, vertex detectors close to the interaction region are required to observe
secondary vertices.
• Excellent electromagnetic (EM) calorimetry is needed for electron and
photon identification and measurements as well as a full coverage hadronic
calorimetry for accurate jet and missing transverse energy measurements.
• Good muon identification and momentum resolution over a wide range of
momenta and the ability to determine unambiguously the charge of high
pT muons are fundamental requirements.
• Highly efficient triggering on low transverse momentum objects with suffi-
cient background rejection is a prerequisite to achieve an acceptable trigger
rate for most physics processes of interest.
2.2.3 ATLAS Detector Overview
Figure 2.3: Schematic ATLAS layout: the main subdetectors are depicted [27].
The ATLAS detector is housed in a hall about 100 m underground at “Point 1”
of the LHC ring, the interaction point closest to the Meyrin site. ATLAS covers
almost the whole solid angle with its onion-like structure. It is cylindrical,
weighs approximately 7000 tons, is 44 m long and 25 m high, and is nominally
forward-backward symmetric with respect to the interaction point. The ATLAS
detector is divided into three longitudinal regions, one central and two lateral;
sub-detectors in the central part are named with the barrel prefix, the others
with the extended-barrel or end-cap prefixes.
In Figure 2.3 ATLAS and its sub-detectors are depicted: in the central part,
near the beam line, sits the innermost sub-system, the tracker, embedded in a
2 T solenoidal magnetic field; the solenoid is the structure around the tracker
and holds the electromagnetic calorimeter, which in turn is surrounded by the
hadronic calorimeter. Around everything there are 8 giant coils providing the
toroidal magnetic field, whose purpose is to bend the escaping muons, measured
by the external muon chambers. The ATLAS detector is thus formed by six
sub-systems: the magnet system, the inner detector (ID), the calorimeters, the
muon spectrometer, the trigger system and the data acquisition system. Each
of these systems is formed by different parts, distinguished by the technologies
used according to their function.
Table 2.2 summarizes the general goals for the ATLAS sub-detectors and
their coverage. For the calorimeter, the requirement is on the energy resolution
σE/E; for the inner detector and the muon spectrometer, the requirement is on
Figure 2.4: Sagitta S in a three-point measurement. l is the distance between the outer measurements A and B. The sagitta S of the circular trajectory with curvature radius ρ through the points A, D, B is defined as the shortest distance CD from this trajectory to the middle point C on the line connecting A and B.
the momentum resolution σpT/pT, which can be expressed as
σpT/pT ∼ σS/S ∼ pT σS / (l² B),    (2.8)
where B is the magnetic field strength, l is the length of the arc of the track
(determined by the size of the tracking detector) and S is the sagitta3, defined
as the maximum deviation of a circle from a straight line; see Figure 2.4. It is
determined from the measurement of the muon track position in three successive
chamber stations. The precision track position measurement is performed
in the bending plane of the toroidal magnetic field. This means measuring with
high precision the z coordinate of the track points in the barrel and the R
coordinate in the x−y plane in the end-caps. Note that the sagitta is larger, and
can be measured with higher relative accuracy, when the distance l between the
outer measurements A and B in Figure 2.4 is larger.
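A quick sketch of Eq. (2.9) shows why a large lever arm matters; the field and arm-length values below are illustrative assumptions, and the 0.3 is the usual GeV/T/m unit-conversion factor implicit in the formula:

```python
# Sagitta of a track measured at three stations (Figure 2.4), using
# p_T[GeV] ~ 0.3 * B[T] * rho[m] for the curvature radius.
def sagitta(p_t, B, l):
    """S = l^2 / (8 * rho) for a track of transverse momentum p_t [GeV],
    field B [T] and lever arm l [m]; returns S in metres."""
    rho = p_t / (0.3 * B)        # curvature radius in metres
    return l ** 2 / (8 * rho)

# A 1 TeV muon over a ~5 m lever arm in a ~0.5 T average toroid field
# (illustrative numbers, not detector specifications) bends by ~0.5 mm:
s = sagitta(1000.0, 0.5, 5.0)
print(f"sagitta ~ {s * 1e3:.2f} mm")
```

A sub-millimetre sagitta measured to ~10% requires ~50 µm chamber precision, which motivates the emphasis on precision tracking in the bending plane.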
In the following sections, descriptions will be given for the three major subde-
tector systems (inner detector, calorimeters and muon spectrometer) and for
the magnet system, followed by an introduction to the ATLAS trigger system
and lepton reconstruction.
The Inner Detector (ID), shown in Figure 2.5, is the heart of the ATLAS
detector, with a length of 6.2 m and a diameter of 2.1 m. It is confined internally
by the beam pipe, in which the protons travel, and externally by the
superconducting central solenoid, which provides a nominal magnetic field of 2 T along
the z-axis (see Section 2.2.6). Due to its close vicinity to the interaction point
(I.P.), a high-granularity detector is required. It is designed to provide hermetic
and robust pattern recognition, excellent momentum resolution4 and both
primary and secondary vertex measurements for charged tracks above a given pT
threshold of ∼ 0.1 GeV and within the pseudorapidity range |η| < 2.5. It also
provides electron identification over |η| < 2.0 and a wide range of energies
(between 0.5 GeV and 150 GeV).
3The sagitta is linked to the transverse momentum pT of the muon: pT = l² B / (8S). (2.9)

Subdetector                         Required resolution            η coverage        η coverage
                                                                   (measurement)     (trigger)
Tracking                            σpT/pT = 0.05% pT ⊕ 1%         ±2.5              —
EM calorimetry                      σE/E = 10%/√E ⊕ 0.7%           ±3.2              ±2.5
Hadronic calorimetry (jets):
  barrel and end-cap                σE/E = 50%/√E ⊕ 3%             ±3.2              ±3.2
  forward                           σE/E = 100%/√E ⊕ 10%           3.1 < |η| < 4.9   3.1 < |η| < 4.9
Muon spectrometer                   σpT/pT = 10% at pT = 1 TeV     ±2.7              ±2.4

Table 2.2: Required resolution and coverage of the main ATLAS sub-systems. Note that for high-pT muons the muon spectrometer performance is independent of the inner detector system. The units for E and pT are GeV.
Figure 2.5: Cut-away view of the ATLAS inner detector [27].
The ID consists of three independent but complementary sub-detectors, with
the highest precision in the innermost layers, nearest the interaction point:
• Closest to the interaction point a semiconductor pixel detector providing
3-dimensional space points and secondary vertex reconstruction;
• In the middle, a silicon strip detector (SCT, “Semiconductor Tracker”),
which provides 3-dimensional space points;
• Surrounding the other two, a straw tracker (TRT, “Transition Radiation
Tracker”), providing measurements in the bending plane and particle iden-
tification.
Each of these subdetector systems is separated into cylindrical barrel sections
with active detector elements perpendicular to the radial direction, and end-
cap sections with detector elements perpendicular to the beam; this design
optimizes the resolution perpendicular to the particle path and minimizes the
total material that particles pass through. A particle from I.P. traversing the
complete inner detector will cross on average at least 3 pixel layers, 4 SCT strip
layers and about 36 TRT tubes, see Figure 2.6(a). The main parameters of the
ID subdetectors are summarized in Table 2.3.
The inner detector will give a typical momentum resolution of
σpT/pT = 0.05% · pT ⊕ 1%.    (2.10)
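The ⊕ in Eq. (2.10) denotes a sum in quadrature of the two terms; a minimal sketch:

```python
import math

def tracking_resolution(p_t):
    """Eq. (2.10): sigma_pT / pT = 0.05% * pT (+) 1%, where (+) is a
    sum in quadrature; p_t in GeV, returns the relative resolution."""
    return math.hypot(0.0005 * p_t, 0.01)

# At low pT the constant 1% term dominates; at high pT the term linear
# in pT (from the finite sagitta precision) takes over:
assert abs(tracking_resolution(10.0) - 0.0112) < 1e-3   # ~1.1% at 10 GeV
assert tracking_resolution(500.0) > 0.25                # ~25% at 500 GeV
```

The crossover between the two regimes sits around pT ∼ 20 GeV, where the two contributions are equal.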
The high-radiation environment imposes stringent conditions on the inner de-
tector sensors, on-detector electronics, mechanical structure and services. Over
the ten-year design lifetime of the experiment, the pixel inner vertexing layer
(B-layer) must be replaced after approximately three years of operation at de-
sign luminosity. In order to minimize dark current noise as radiation builds up
in the detection material, the silicon sensors must be kept at low temperature
approximately from −5 to −10 ℃. In contrast, the TRT is designed to operate
at room temperature.
4 The magnetic field bends the charged particles, allowing the momentum to be measured from the curvature of the tracks.
Figure 2.6: Figure (a): drawing showing the sensors and structural elements traversed by a charged track of pT = 10 GeV in the barrel inner detector (η = 0.3). The track traverses successively the beryllium beam-pipe, the 3 cylindrical silicon-pixel layers, the 4 cylindrical double layers (one axial and one with a stereo angle of 40 mrad) of barrel silicon-microstrip sensors (SCT), and approximately 36 axial straws contained in the barrel transition-radiation tracker modules. Figure (b): plan view of a quarter-section of the ATLAS inner detector showing each of the major detector elements with its active dimensions and envelopes. The labels PP1, PPB1 and PPF1 indicate the patch-panels for the ID services. Figures are taken from [27].
System   Position                          Resolution σ [µm]    Channels [10^6]   η coverage
Pixels   1 removable barrel layer          Rφ = 10, z = 115     13.2              ±2.5
Pixels   2 barrel layers                   Rφ = 10, z = 115     54                ±1.7
Pixels   2×3 end-cap disks                 Rφ = 10, R = 115     26.4              1.7-2.5
SCT      4 barrel layers                   Rφ = 17, z = 580     3.2               ±1.4
SCT      2×9 end-cap disks                 Rφ = 17, R = 580     3.0               1.4-2.5
TRT      73 axial barrel straw planes      130 (per straw)      0.1               ±0.7
TRT      160 radial end-cap straw planes   130 (per straw)      0.32              0.7-2.5

Table 2.3: Main parameters of the inner detector [27].
2.2.4.1 The Pixel Detector
The pixel detector consists of three concentric cylindrical layers in the barrel
around the beam axis, at radii of 50.5 mm, 88.5 mm and 122.5 mm from the center
of ATLAS, so that all tracks in the region |η| < 1.9 pass through all three layers,
and of three disks perpendicular to the beam axis in each end-cap, at distances of
495 mm, 580 mm and 650 mm, positioned in such a way that the region of three-
layer coverage is extended to |η| < 2.5, as seen in Figure 2.6(b). Each layer
is made of overlapping, identical silicon sensors mounted on modules that are
segmented in R − φ and z into small rectangles, the pixels. The nominal pixel
size is 50×400 µm². Each charged particle hits a cluster of sensors, and the
amount of collected charge is used to determine the cluster’s center. The
resolution in the barrel is 10 µm (R−φ) and 115 µm (z), and in the disks it is
10 µm (R−φ) and 115 µm (R). The innermost barrel layer, called the B-layer,
is only 50.5 mm away from the beam line and enhances the ability to identify
secondary vertices for b-tagging.
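The charge-weighted centroid determination described above can be sketched as follows; the cluster data and the helper name are invented for illustration, and real pixel clustering also applies calibrations not shown here:

```python
def cluster_centroid(hits):
    """Charge-weighted centroid of a pixel cluster.
    hits: list of (column, row, collected_charge) tuples."""
    total = sum(q for _, _, q in hits)
    col = sum(c * q for c, _, q in hits) / total
    row = sum(r * q for _, r, q in hits) / total
    return col, row

# Hypothetical two-pixel cluster sharing its charge 3:1 between columns 10 and 11.
col, row = cluster_centroid([(10, 5, 3000), (11, 5, 1000)])
print(col, row)  # 10.25 5.0
```

Interpolating between pixels in this way is what allows a position resolution well below the 50 µm pixel pitch.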
2.2.4.2 The SCT detector
Like the pixel detector, the SCT uses silicon sensors, here segmented
into strips. The silicon microstrip tracker consists of four concentric
barrel layers, and each of its end-caps has a total of nine disk layers; the
geometry can be seen in Figure 2.6(b). Each layer carries a double layer of
strips glued back-to-back, with the back-side strips rotated by 40 mrad relative
to the front-side strips, so that when a charged particle passes through, the
information on which strips on each side collected charge can be combined. The
average width of the strips, the strip pitch, is 80 µm, which results in an
intrinsic point resolution of about 23 µm per single-side measurement in the
coordinate perpendicular to the strip direction. The intrinsic accuracies per
module in the barrel are 17 µm (R − φ) and 580 µm (z), and in the disks they are
17 µm (R − φ) and 580 µm (R).
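How the 40 mrad stereo rotation recovers the coordinate along the strips can be illustrated with a toy geometry (a small-angle sketch with invented numbers, not the actual SCT reconstruction):

```python
import math

STEREO_ANGLE = 0.040  # rad: back-side strips rotated by 40 mrad

def along_strip(x_axial: float, u_stereo: float) -> float:
    """Recover the coordinate z along the strip from the axial measurement x
    and the stereo-side measurement u = x*cos(a) + z*sin(a)."""
    return (u_stereo - x_axial * math.cos(STEREO_ANGLE)) / math.sin(STEREO_ANGLE)

# Toy hit at x = 12.0 mm across the strips and z = 80.0 mm along them.
x, z = 12.0, 80.0
u = x * math.cos(STEREO_ANGLE) + z * math.sin(STEREO_ANGLE)
print(round(along_strip(x, u), 6))  # 80.0
```

Because the stereo angle is small, the error on the transverse measurements enters z amplified by roughly 1/sin(40 mrad) ≈ 25, which is why the quoted 580 µm z resolution is so much coarser than the 17 µm transverse one.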
2.2.4.3 The TRT detector
The outermost component of the inner detector, the TRT, is a straw tracker
combined with transition-radiation detection for electron identification. The
basic detecting unit of the TRT is a polyimide drift tube with a 4 mm diameter.
The straws are installed in three cylindrical layers of barrel TRT modules and
20×2 disks of end-cap modules. Each module contains multiple layers of straws5.
The central anode wire, 31 µm in diameter, is made of tungsten plated with gold.
The cathode, on the inside of the tube itself, is made of aluminum protected by
a layer of graphite-polyimide. The standard gas mixture is 70% Xe, 27% CO2,
3% O2; the gas is ionized by passing particles, with an electron collection time
of 48 ns. The drift time of the ionization electrons to the wire is measured to
compute the distance from the wire at which the particle crossed the tube.
Polypropylene/polyethylene foils are installed in the space between the layers
of straws as radiators, which cause high-energy electrons (momentum above 2
GeV) to produce significant numbers of transition-radiation photons6. The Xe
absorbs these low-energy photons, producing a significantly amplified signal; the
front-end electronics have a high-threshold discriminator to detect these signals,
allowing electrons to be identified as tracks with a significant number of
high-threshold hits. There are in fact two discriminators: one with a low
threshold to detect minimum-ionizing particles and the other with a high
threshold to detect transition radiation.
With a small average distance between the straws, the TRT provides a large
number of tracking points (typically 36) per track. The relatively low resolution
per tracking point (130 µm) is compensated by the large number of measurements
and by the larger size of the detector (Equation (2.8)).
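The two-threshold electron identification can be sketched as a cut on the fraction of high-threshold (HT) hits along a track; the 10% cut value and the hit patterns below are purely illustrative, not the ATLAS selection:

```python
def ht_fraction(hits):
    """Fraction of TRT hits passing the high-threshold discriminator.
    hits: list of booleans, True = high-threshold (transition-radiation) hit."""
    return sum(hits) / len(hits)

def is_electron_like(hits, cut=0.10):  # the cut value is a hypothetical example
    return ht_fraction(hits) > cut

# A track typically crosses ~36 straws; electrons yield many more HT hits.
electron_track = [True] * 8 + [False] * 28  # 8/36, about 22% HT hits
pion_track = [True] * 1 + [False] * 35      # 1/36, about 3% HT hits
print(is_electron_like(electron_track), is_electron_like(pion_track))  # True False
```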
2.2.5 The Calorimeters
The ATLAS calorimeters [29, 30] are sub-detectors used to measure the
energy of particles. They comprise an electromagnetic calorimeter and a
hadronic one, since different materials are needed for the measurement of
electrons and photons on the one hand and of hadrons on the other. A cut-away
view of the ATLAS calorimeters is shown in Figure 2.7. Each part consists of detectors
with full φ-symmetry and coverage around the beam axis. The electromagnetic
calorimeter is placed beyond the inner detector, and the hadronic calorimeter is
outside the electromagnetic one. The calorimeters closest to the beam line are
housed in three cryostats, one barrel and two end-caps. The barrel cryostat con-
5The TRT contains up to 73 layers of straws interleaved with fibres (barrel) and 160 straw planes interleaved with foils (end-cap).
6Transition radiation consists of X-ray photons emitted by charged particles traversing the boundary between materials with different dielectric constants.
Figure 2.7: Overview of the ATLAS calorimeter.
tains the electromagnetic barrel calorimeter, whereas the two end-cap cryostats
each contain an electromagnetic end-cap calorimeter (EMEC), at lower pseu-
dorapidity, a hadronic end-cap calorimeter (HEC), located behind the EMEC,
and a forward calorimeter (FCal) to cover the region closest to the beam. The
hadronic barrel calorimeter, normally called tile calorimeter (TileCal), is placed
outside the electromagnetic barrel calorimeter.
The calorimeters consist of alternating layers of an absorbing material, in which
the particles produce showers, lose their energy and are finally stopped, and an
active material, in which the particle showers are measured. By this “sampling”
procedure the energy of the traversing particles can be determined.
The ATLAS calorimeter system covers the range |η| < 4.9; its segmentation is
such that several shower samplings are provided both in longitudinal and in
transverse direction. The calorimeters are designed in order to identify charged
and neutral particles and jets, and to measure their energy. From all these
measurements, the missing energy in the transverse plane (E_T^miss) can be
calculated by summing all the measured energy deposits. Missing energy can be caused
by neutrinos or possibly new physics, such as supersymmetry or models with
extra dimensions. Therefore calorimeters must provide good containment for
electromagnetic and hadronic showers, and must also limit punch-through7 into
the muon system. Hence, calorimeter depth is an important design consider-
ation. In the following sections the electromagnetic and hadronic calorimeters
are described in detail.
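The E_T^miss calculation described above, i.e. the imbalance of the vector sum of the transverse-energy deposits, can be sketched as follows (the deposit list is invented for illustration):

```python
import math

def missing_et(deposits):
    """deposits: list of (ET, phi) pairs for all calorimeter deposits.
    Returns the magnitude of the missing transverse energy."""
    ex = -sum(et * math.cos(phi) for et, phi in deposits)
    ey = -sum(et * math.sin(phi) for et, phi in deposits)
    return math.hypot(ex, ey)

# Two back-to-back 50 GeV deposits balance each other: no missing energy.
print(round(missing_et([(50.0, 0.0), (50.0, math.pi)]), 6))  # 0.0
# A single unbalanced 30 GeV deposit implies 30 GeV of missing ET.
print(round(missing_et([(30.0, 1.2)]), 6))  # 30.0
```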
7Punch-through occurs when muons produced by light hadrons, such as pions and kaons, travel through the calorimetric system, reach the spectrometer and give rise to a background signal in it.
2.2.5.1 The Electromagnetic Calorimeter
The electromagnetic calorimeter [29, 30] is divided into a barrel (|η| < 1.475)
and two end-caps (1.375 < |η| < 3.2). Each region is housed in a separate cryo-
stat, which thermally isolates the detector and keeps it at ∼ 88.5 K.
The barrel calorimeter consists of two identical half-barrels, separated by a small
gap (4 mm) at z = 0, one covers the region with z < 0 and the other covers the
region with z > 0. Each end-cap calorimeter is mechanically divided into two
coaxial wheels: an outer wheel covering the region 1.375 < |η| < 2.5, and an
inner wheel covering the region 2.5 < |η| < 3.2. In the range |η| < 1.8 the
electromagnetic calorimeter is preceded by a presampler, designed to take
into account the energy lost by electrons and photons in the material upstream
of the calorimeter.
Because of its radiation hardness and its stable response over time, the
sensing element is liquid argon (LAr), which fills the space between lead
absorbers placed parallel to the beam direction in an accordion geometry,
which provides full coverage in φ without azimuthal cracks. Each lead absorber
is sandwiched between two sheets of stainless steel to provide structural
support. Between the absorber layers there are electrodes formed by three
sheets of copper separated by thin insulating layers of polyimide. A charged
particle passing through this elementary cell ionizes the LAr; the ionization
charge is collected on the outer copper sheets and the signal is read out via
capacitive coupling on the inner one.
The barrel electromagnetic calorimeter and the precision region in the end-cap
calorimeters (1.5 < |η| < 2.5) are divided in depth into three longitudinal lay-
ers. The three layers vary in η granularity and radial depth. The innermost
layer has the finest granularity along η. It is designed for γ/π0 separation and
for precise η measurements of neutral particles. The second layer collects the
largest fraction of the energy of the electromagnetic shower. The third layer
collects only the tail of the electromagnetic shower; it is used to help measure
high-energy showers and to distinguish electromagnetic from hadronic showers,
and is therefore less segmented in η. The outermost region |η| < 1.5 of the
end-cap outer wheel and the inner wheel (2.5 < |η| < 3.2) are segmented in only
two longitudinal layers and have a coarser transverse granularity.
The total thickness of a module increases from 22 X08 to a maximum of 33
X0 in the barrel and from 24 X0 to 38 X0 in the end-caps as |η| increases. The
8The radiation length X0 is the distance over which the energy of an electron is reduced by a factor e.
electromagnetic calorimeter is designed to have an energy resolution of9
σ_E/E = 10%/√E ⊕ 0.7% (2.11)
over the energy range from 2 GeV to 5 TeV. The resolution is worse in
the end-caps than in the barrel region due to the presence of more material.
The “crack-region”, i.e. the transition region between the barrel and the end-
cap cryostats (1.37 < |η| < 1.52), is usually not used for photon identification
nor for precision measurements with electrons since the energy resolution is
significantly degraded, despite the presence of scintillators to correct for the
energy lost in the barrel cryostat flange.
2.2.5.2 The Hadronic Calorimeter
The ATLAS hadronic calorimeter is composed of three parts, which use different
detection techniques depending on the pseudorapidity region:
Tile Calorimeter (TileCal) [31] is placed in the region |η| < 1.7, directly
outside the EM calorimeter envelope, behind the LAr electromagnetic
calorimeter. It is subdivided into a central barrel (|η| < 1.0) and two
extended barrels (0.8 < |η| < 1.7). It is a sampling calorimeter using
steel as the absorber and scintillating tiles as the active material. It is
segmented in depth in three layers for a total radial depth of 7.4 λ10. The
total detector thickness at the outer edge of the tile-instrumented region
is 9.7 λ at η = 0. The tiles are oriented radially and normal to the beam
line and are staggered in depth. The two sides of each scintillating tile are read
out by wavelength-shifting fibres into two separate photomultiplier tubes.
Between the barrel and the extended barrels there is a gap of about 68
cm, which is needed for the inner detector and LAr cables, electronics and
services. This gap region is instrumented with special modules, made of
steel-scintillator sandwiches, and with thin scintillator counters, allowing
partial recovery of the energy lost in the crack regions. TileCal is designed
to have an energy resolution for jet reconstruction of [31]
σ_E/E = 50%/√E ⊕ 3%. (2.12)
Hadronic End-cap Calorimeter (HEC) [32] is a copper/liquid argon sam-
9The first term is the stochastic term and reflects the statistical fluctuations in the development of the shower (such as the number of particles and the fraction that is lost in the absorbers). The constant term represents local non-uniformities in the calorimeter response.
10The nuclear interaction length λ is the mean distance travelled by a hadronic particle before undergoing an inelastic nuclear interaction.
pling calorimeter with a flat-plate design, which covers the range 1.5 <
|η| < 3.2. The HEC, located directly behind the end-cap electromagnetic
calorimeter, shares the two liquid-argon end-cap cryostats with the
electromagnetic end-cap (EMEC) and forward (FCal) calorimeters. The
HEC consists of two independent wheels in each end-cap cryostat: a front
wheel (HEC1) and a rear wheel (HEC2), each wheel containing two longitudinal
sections, for a total of four layers per end-cap. The wheels are cylindrical,
with an outer radius of 2.03 m. The HEC1 and HEC2 modules are made of 25 and
17 copper plates, respectively. In this region the energy resolution required
for jet reconstruction is the same as in Equation (2.12). Approximately 12 λ
are required to fully contain the jets from the 14 TeV pp collisions at the LHC.
Forward Calorimeters. The forward calorimeters (FCal) are located in the
same cryostats as the end-cap calorimeters and provide coverage over
3.1 < |η| < 4.9. As the FCal modules are located at high η, at a dis-
tance of approximately 4.7 m from the interaction point, they are exposed
to high particle fluxes. This has resulted in a design with very small
liquid-argon gaps. The FCal is approximately 10 λ deep, and consists
of three modules in each end-cap made of a metal matrix with regularly
spaced longitudinal channels filled with an electrode structure consisting
of concentric rods and tubes parallel to the beam axis: the first (FCal1),
made of copper, is optimised for electromagnetic measurements, while the
other two (FCal2 and FCal3), made of tungsten, measure predominantly
the energy of hadronic interactions. The total thickness of the calorimetric
system reduces the shower leakage and the punch-through background in the
spectrometer, but the energy resolution is worse: the FCal is designed to
provide an energy resolution for hadrons of
σ_E/E = 100%/√E ⊕ 10%. (2.13)
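Equations (2.11)-(2.13) all have the form a/√E ⊕ b; a quick numerical comparison at a 100 GeV deposit (with ⊕ denoting addition in quadrature) shows how the resolution degrades from the electromagnetic calorimeter to the FCal:

```python
import math

def calo_resolution(e_gev: float, stochastic: float, constant: float) -> float:
    """Relative energy resolution sigma_E/E = stochastic/sqrt(E) (+) constant."""
    return math.hypot(stochastic / math.sqrt(e_gev), constant)

E = 100.0  # GeV
for name, a, b in [("EM, Eq. (2.11)", 0.10, 0.007),
                   ("TileCal, Eq. (2.12)", 0.50, 0.03),
                   ("FCal, Eq. (2.13)", 1.00, 0.10)]:
    print(f"{name}: {100 * calo_resolution(E, a, b):.1f}%")
```

This prints roughly 1.2%, 5.8% and 14.1%, respectively.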
2.2.6 The Magnet System
ATLAS features a unique hybrid system of four large superconducting mag-
nets. This magnetic system [33] is 22 m in diameter and 26 m in length, with a
stored energy of 1.6 GJ. The magnet system is composed of a central solenoid
(CS) and a system of three superconducting air toroids: a barrel toroid (BT)
and two end-cap toroids (ECT) arranged in a configuration so that there is
a zero magnetic field inside the calorimeter. The magnet system covers the
pseudorapidity range |η| < 3 and uses an air-core structure, i.e. without
iron, in order to minimize the multiple scattering of the muons, which
degrades the momentum measurement. All of the superconducting magnets op-
erate at 4.5 K.
The central solenoid is aligned with the beam axis and is placed between the
inner detector and the electromagnetic calorimeter. The conductor is a
composite consisting of a flat superconducting cable located in the center of an
aluminum stabiliser with rectangular cross-section. The central solenoid is de-
signed to provide a 2 T axial magnetic field with a peak of 2.6 T. The solenoid
is designed to be as thin as possible and shares with the LAr calorimeter one
common vacuum vessel in order to minimize material thickness in front of the
barrel electromagnetic calorimeter.
The barrel and end-cap toroids provide magnetic field for the barrel and
end-cap muon tracking chambers, respectively. Each toroid consists of eight
rectangular coils (kept in position by 16 support rings) assembled radially and
symmetrically around the beam axis. The coils are of a flat racetrack type
with two double-pancake windings made of 20.5 kA aluminum stabilized Nb-Ti
superconductor. The field strength varies from 0.15 T to 2.5 T, with a peak
of 3.9 T, in each barrel coil and varies from 0.2 to 3.5 T, with a peak of 4.1
T, in end-cap coils. In order to reduce the amount of material in the spec-
trometer, the barrel coils are housed in separate cryostats, while there are two
end-cap cryostats housing eight coils each, because the end-cap coils are smaller
in size. The end-cap coils are rotated by 22.5° with respect to the barrel ones
in order to provide radial overlap with the barrel toroid and to optimize the
bending power in the interface regions of both coil systems. Due to the finite
number of coils the magnetic field is not perfectly toroidal; the transition
region (1.4 < |η| < 1.6) is marked by large changes of the field integral, and
the muon momentum resolution suffers most in this region due to the uncertainty
in the bending power.
2.2.7 The Muon Spectrometer
2.2.7.1 Muon Spectrometer Design
Muons experience the weak and electromagnetic interactions, but not the strong
interaction. Therefore they rarely produce hadronic showers and, because of
their large mass compared to electrons, only infrequently produce electromagnetic
showers via bremsstrahlung. Thus, the main energy loss mechanism for muons is
ionization. As a result, muons can pass through the calorimeters with little
perturbation and reach the muon spectrometer (MS).
The muon spectrometer is the outermost subsystem of the ATLAS detector. The
muon momentum can be determined by measuring the position of the muon at
three points in space. The trajectory of the muon is curved due to the magnetic
field; the higher the momentum, the smaller the curvature. The curvature is
measured in the track fit, where the magnetic field is known in detail. However,
as a good approximation for practical applications, the sagitta is used (see Section
2.2.3). In order to be able to reconstruct the momentum with the three-point
method, the muon spectrometer is designed such that every muon with |η| < 2.7
will cross at least three detector stations with the exception of a few regions with
less coverage, for example those regions with support structures or passages for
services. When a particle traverses only two stations, the interaction point (I.P.) is taken as the
third measurement and the momentum determination is based on the difference
between the angles to the I.P.. As there is a relatively large uncertainty on the
scattering in the calorimeter, such a measurement is less precise.
The muon spectrometer is designed with the requirement of a 2-3% accuracy
Figure 2.8: Cut-away view of the ATLAS muon spectrometer [27].
on pT for muons below 100 GeV and a 10% precision for 1 TeV muons.
Given the magnet system, the sagitta will be about 0.5 mm for 1 TeV muons.
Therefore, to get a 10% error on the momentum, a 50 µm precision on the
sagitta is required. At low momentum (pT < 30 GeV), the resolution is dom-
inated by fluctuations in the energy loss of the muons traversing the material
in front of the spectrometer. Multiple scattering in the spectrometer plays an
important role in the intermediate momentum range (30 GeV < pT <200 GeV).
For pT > 300 GeV, the single-hit resolution, limited by detector characteristics,
alignment and calibration, dominates [18]. At the LHC, very high-energy (≳ 100
GeV) muons can be produced. At such high energies, the sagitta of the muon
track in the relatively small inner detector becomes too small to be accurately
measured, degrading the momentum resolution (Equation (2.8)). This makes
the muon spectrometer extremely important in detecting high-energy muons.
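The quoted 0.5 mm sagitta and the 10% goal at 1 TeV can be checked with the standard sagitta formula s = 0.3 B L² / (8 pT) (B in tesla, L in metres, pT in GeV, s in metres); the field and lever-arm values below are representative round numbers, not the actual ATLAS field map:

```python
def sagitta_m(b_tesla: float, lever_arm_m: float, pt_gev: float) -> float:
    """Track sagitta over a lever arm L in a uniform field B."""
    return 0.3 * b_tesla * lever_arm_m**2 / (8.0 * pt_gev)

# Representative toroid values (assumed for illustration): B ~ 0.5 T, L ~ 5 m.
s = sagitta_m(0.5, 5.0, 1000.0)
print(f"sagitta of a 1 TeV muon: {s * 1e3:.2f} mm")  # ~0.5 mm
print(f"relative error from a 50 um sagitta measurement: {50e-6 / s:.1%}")  # ~10%
```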
The ATLAS muon spectrometer has two main objectives: first, to provide a
standalone11, momentum-dependent trigger and, second, to provide standalone
muon reconstruction. These objectives are each fulfilled by a separate system
of detectors: high-precision tracking chambers for accurate momentum mea-
surement in the pseudorapidity range |η| < 2.7 and fast response chambers for
effective triggering in the region |η| < 2.4. Figure 2.8 shows the layout of this
spectrometer. Muon momenta down to a few GeV (∼3 GeV, due to energy loss
in the calorimeters) may be measured by the spectrometer alone. Even at the
high end of the accessible range (∼3 TeV), the stand-alone measurements still
provide adequate momentum resolution and excellent charge identification.
Precision-tracking chambers in the barrel region are located between and on the
eight coils of the superconducting barrel toroid magnet, while the end-cap
chambers are located in front of and behind the two end-cap toroid magnets. The φ-symmetry
of the toroids is reflected in the symmetric structure of the muon chamber sys-
tem, consisting of eight octants. Each octant is subdivided in the azimuthal
direction into two sectors with slightly different lateral extensions, a large and a
small sector, leading to a region of overlap in φ. This overlap of the chamber
boundaries minimizes gaps in detector coverage and also allows for the relative
alignment of adjacent sectors using tracks recorded by both a large and a small
chamber.
The chambers in the barrel are arranged in three concentric cylindrical shells
around the beam axis, while in the two end-cap regions, muon chambers form
four large wheels, perpendicular to the z-axis. Figure 2.9 shows cross-sections in
the planes transverse to, and containing, the beam axis. In the center of the
detector (|η| ≈ 0), a gap in chamber coverage has been left open to maintain
service access to the solenoid magnet, the calorimeters and the inner detector.
The size of the gap varies from sector to sector depending on the service
necessities; the biggest gaps, of 1-2 m, are located in the large sectors. Additional gaps
in the acceptance are in the feet region due to the supporting structure and in
the transition region, where barrel and end-cap parts overlap.
Because the expected rates vary with pseudorapidity, four different technolo-
gies are used to cover different η regions: Monitored Drift Tube Chambers
(MDT) and Cathode Strip Chambers (CSC) as tracking chambers, Resistive
Plate Chambers (RPC) and Thin Gap Chambers (TGC) as trigger chambers.
MDTs cover the region up to |η| = 2.7, except for the innermost end-cap layers
where their coverage is limited to |η| < 2.0. In the forward region (2 < |η| < 2.7),
CSC are used in the innermost tracking layer. The trigger system covers the
11Muons reconstructed using only the muon spectrometer tracks are called standalone muons; see Section 2.2.10.
Figure 2.9: Left: Cross-section of the barrel muon system perpendicular to the beam axis (non-bending plane), showing three concentric cylindrical layers of eight large and eight small chambers. The outer diameter is about 20 m. Right: Cross-section of the muon system in a plane containing the beam axis (bending plane). Infinite-momentum muons would propagate along straight trajectories and typically traverse three muon stations. Figures taken from [27].
pseudorapidity range |η| < 2.4 and it serves a threefold purpose: provide well-
defined pT thresholds; provide bunch crossing identification; measure the muon
coordinate in the direction orthogonal to that determined by the precision-
tracking chambers. RPCs are used in the barrel and TGCs in the end-cap
regions. In Table 2.4 the parameters of the four technologies in the muon
spectrometer are shown. The individual devices will be discussed in the
following sections.
By design, each tracking station provides a measurement error of approximately
35 µm. The alignment system, based on tracks and an optical system, contributes
an additional uncertainty of 30 µm. These individual errors are sufficiently
small to obtain the required overall precision of 50 µm.
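Since the two contributions are independent, they add in quadrature; a one-line check that the 35 µm chamber error and the 30 µm alignment error stay within the 50 µm sagitta budget:

```python
import math

# Chamber intrinsic error and alignment error, combined in quadrature (in µm).
total_error = math.hypot(35.0, 30.0)
print(f"{total_error:.1f} um")  # ~46 um, within the 50 um budget
```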
2.2.7.2 MDT
The barrel and most of the end-cap region are equipped with MDT cham-
bers for the precision measurement of muon trajectories. The chambers are
rectangular in the barrel and trapezoidal in the end-caps. An MDT is a drift
chamber formed by an aluminum tube with a diameter of ∼ 30 mm. The tube
wall functions as a cathode. The anode wire is a gold-plated tungsten-rhenium
wire with a 50 µm diameter and is positioned at the center of the tube.
As can be seen in Figure 2.10(b) a full MDT chamber consists of two groups
of tube layers, called “multi-layers”, separated by a spacer frame consisting of
Monitored Drift Tubes (MDT)
  Coverage: |η| < 2.7 (innermost layer: |η| < 2.0)
  Number of chambers: 1150
  Number of channels: 354384
  Chamber resolution (RMS): − (φ), 35 µm (z), − (time)
  Function: precision tracking
Cathode Strip Chambers (CSC)
  Coverage: 2.0 < |η| < 2.7
  Number of chambers: 32
  Number of channels: 31000
  Chamber resolution (RMS): 5 mm (φ), 40 µm (R), 7 ns (time)
  Function: precision tracking
Resistive Plate Chambers (RPC)
  Coverage: |η| < 1.05
  Number of chambers: 606
  Number of channels: 373000
  Chamber resolution (RMS): 10 mm (φ), 10 mm (z), 1.5 ns (time)
  Function: trigger, 2nd coordinate
Thin Gap Chambers (TGC)
  Coverage: 1.05 < |η| < 2.7 (2.4 for the trigger)
  Number of chambers: 3588
  Number of channels: 318112
  Chamber resolution (RMS): 3−7 mm (φ), 2−6 mm (R), 4 ns (time)
  Function: trigger, 2nd coordinate
Table 2.4: Parameters of the four sub-systems of the muon detector. The quoted spatial resolution does not include chamber-alignment uncertainties. Contributions from signal propagation and electronics need to be added to the intrinsic time resolution of each chamber type [27, 34].
(a) (b)
Figure 2.10: Figure (a): Cross-section of an MDT tube [27]. Figure (b): Mechanical structure of an MDT chamber. Three spacer bars connected by longitudinal beams form an aluminum space frame, carrying two multi-layers of three or four drift tube layers. Four optical alignment rays, two parallel and two diagonal, allow for monitoring of the internal geometry of the chamber. RO and HV designate the location of the readout electronics and high voltage supplies, respectively [35].
three lateral support beams (“cross-plates”, RO, MI and HV in Figure 2.10(b))
interconnected by two “longitudinal beams”. An MDT chamber has an internal
alignment system, which continuously measures potential deformations of the
frame. The alignment system consists of a set of four optical alignment rays,
two running parallel to the tube direction and two in the diagonal direction.
MDT chambers in the middle or outer stations of the ATLAS spectrometer are
equipped with three layers of tubes per multi-layer. The MDT chambers clos-
est to the interaction point have been equipped with four layers of tubes per
multi-layer to optimize the pattern recognition performance at high background
rates. The tubes are filled with a 93:7 Ar:CO2 gas mixture at 3 bar absolute
pressure.
The basic detection principle of an MDT is that of a drift chamber: the tubes
operate in proportional mode with a maximum drift time of ∼ 700 ns. The
drift tubes of the MDTs are aligned perpendicular to the beam axis and approx-
imately parallel to the magnetic field lines, providing z coordinate measurement
in the barrel and η(R) coordinate measurements in the end-caps. An MDT
measurement gives, instead of a precise position, the radius around the wire
at which the particle crossed, as shown in Figure 2.10(a). A single tube
measures the distance to the wire with a typical average resolution of 80 µm.
Therefore, the resolution on the central point of a track segment in a three-
(four-)tube multi-layer is 50 (40) µm; combining the two multi-layers into a
chamber yields an accuracy of 35 (30) µm. The position along the tube cannot be measured
and has to be provided by an external measurement. It can either be provided
using the information from the trigger chambers or by extrapolation of tracks
from the ID into the muon system.
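Treating the tube hits as independent measurements of the segment position, a naive 1/√N scaling roughly reproduces the numbers quoted above (the true resolution also depends on the track geometry, so this is only an order-of-magnitude sketch):

```python
import math

SINGLE_TUBE_UM = 80.0  # typical single-tube drift-radius resolution

def combined(sigma_um: float, n: int) -> float:
    """Resolution of n independent measurements of the same quantity."""
    return sigma_um / math.sqrt(n)

print(f"4-tube multi-layer: {combined(SINGLE_TUBE_UM, 4):.0f} um")            # 40 (quoted: 40)
print(f"3-tube multi-layer: {combined(SINGLE_TUBE_UM, 3):.0f} um")            # ~46 (quoted: 50)
print(f"chamber, 2 multi-layers of 4: {combined(SINGLE_TUBE_UM, 8):.0f} um")  # ~28 (quoted: 30)
```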
2.2.7.3 CSC
In the end-cap region, in the inner station and for |η| > 2, the MDT cham-
bers are replaced by CSCs. In this region, due to thermalised neutrons
coming from the calorimeter, the particle rates expected for high-luminosity
running exceed 150 Hz/cm2. Here, the CSC
technology is chosen because it combines high spatial, time and double track
resolution with high-rate capability and low neutron sensitivity. The CSCs are
segmented into large and small chambers in φ. The whole CSC system consists
of two disks with 8 chambers each. Each chamber contains four CSC planes
resulting in four independent measurements in η and φ along each track.
The CSCs are multi-wire proportional chambers with a cathode strip readout
and with a symmetric cell in which the anode-cathode spacing is equal to the
anode wire pitch. The (anode) wires are oriented in the radial direction, both
cathodes are segmented, one with the strips perpendicular to the wires provid-
ing the precision coordinate (η(R)) and the other parallel to the wires providing
the transverse coordinate (φ). A crossing muon will cause charges on several
strips. The precision coordinate is obtained by measuring the charge induced
on the segmented cathode by the avalanche formed on the anode wire. The
resolution is 60 µm per CSC plane; combining the eight measurements12,
the total chamber resolution in η is 40 µm. In the non-bending direction the
cathode segmentation is coarser leading to a resolution of 5 mm. Due to the
small gas volume and the used gas mixture of Ar:CO2=80:20, the sensitivity for
neutrons is low and the drift times are small, less than 40 ns, resulting in a time
resolution of 7 ns per plane.
2.2.7.4 RPC
In the barrel (|η| < 1.05) the muon trigger consists of RPCs. They are used
because of their good spatial and time resolution as well as adequate rate capability. The
RPCs are positioned in three concentric layers around the beam axis, referred
to as the three trigger stations, as shown in figures 2.9 and 2.12. The two in-
ner chambers (RPC1 and RPC2) sandwich the middle MDT chambers, and the
outer layer (RPC3) is assembled on the outer MDT chambers: on top of the
MDT chamber for the large sectors, and below the MDT chamber for the small
12Each crossing muon will give four independent measurements in both η and φ.
Figure 2.11: Cross-section through an RPC, where two units are joined to form a chamber. Each unit has two gas volumes supported by spacers (the distance between successive spacers is 100 mm), four resistive electrodes and four readout planes, reading the transverse and longitudinal directions. The sandwich structure (dashed) is made of paper honeycomb. The φ-strips are in the plane of the figure and the η-strips are perpendicular to it. Dimensions are given in mm [27].
sectors. No gaps in φ are present in this configuration. Each station consists
of two independent detector layers, each measuring η and φ. Therefore, a track
going through all three stations delivers six measurements in η and φ.
An RPC is a gaseous parallel electrode-plate detector with a typical spatial
resolution of 1 mm and a time resolution of 1.5 ns. The basic detecting unit consists
of a thin gas gap formed by two resistive bakelite parallel plates13, separated by
insulating spacers. The gas gap (2 mm) is filled with a gas mixture of 94.7%
tetrafluorethane (C2H2F4), 5% isobutane (C4H10) and 0.3% SF6. The chambers
are operated in the avalanche mode with a typical electric field between plates
of 4.9 kV/mm.
As can be seen in Figure 2.11, an RPC trigger chamber is made of two rectangular
detectors, called units, contiguous to each other with a small overlap to avoid
dead areas for curved tracks. Each unit consists of two such independent
detector layers, called gas volumes, which are separated by light-weight paper
honeycomb and are each read out by two orthogonal sets of metal pick-up strips
on the outer side of the plates. The η-strips are parallel to the MDT wires and
thus determine the position in the bending plane of the magnet; the φ-strips
are orthogonal to the MDT wires and measure the coordinate orthogonal to
the bending direction, so they provide the measurement of the position along the
MDT wire required for the precise calibration of the MDT tubes.
13The plates are made of phenolic-melaminic plastic laminate.
2.2.7.5 TGC
For the endcaps a slightly different trigger technology is chosen: the Thin
Gap Chambers (TGCs). They provide two functions in the end-cap muon spec-
trometer: the muon trigger capability and the determination of the second,
azimuthal coordinate (φ) to complement the measurement of the MDTs in the
bending (radial) direction. TGCs are positioned in four planes around the
beam axis, as depicted in figures 2.9 and 2.12. While the RPCs are physically
connected to an MDT counterpart, there is no such connection for the TGCs.
They are constructed as double-gap units, called doublets, and triple-gap units,
called triplets. At the end-cap middle (EM) station, one layer of TGC triplets
(TGC1, 1.05 < |η| < 2.7) is placed in front of the MDTs and two layers of dou-
blets (TGC2 and TGC3, 1.05 < |η| < 2.4) behind the MDTs. The EM TGCs
are mounted on the so-called wheels at |z| ∼ 14 m and give seven measurements
in total. The TGC1 layer provides second coordinate measurements
up to |η| = 2.7; however, since there are no coincidences in the other planes,
these measurements are not used for triggering. An additional layer of TGC
doublets (TGCI) is installed at the end-cap inner station (1.05 < |η| < 1.92).
It is located in front of the innermost tracking layer and is segmented radially
into two non-overlapping regions: end-cap (EI) and forward (FI, also known as
the small wheel). EI TGCs are mounted on support structures of the barrel
toroid coils at |z| ∼ 7 m and are only used for measuring the second coordinate.
TGCs are multi-wire proportional chambers operated in a saturated mode, with
the difference that the anode wire pitch is larger than the cathode-anode dis-
tance. The used gas mixture is CO2:n-C5H12=55:45. Position measurements
are obtained from both the pick-up strips (φ) and the wires (η). The TGCs
have a time resolution of 4 ns.
2.2.8 Trigger System
At high luminosity LHC running, the pp bunch-crossing rate reaches 40 MHz.
The resulting amount of data is far too large to be written to storage. To reduce
the total data flow without losing interesting physics events a preselection filter
was developed. The ATLAS trigger system is organized in three levels. Each
trigger level reduces the event rate by orders of magnitude. Each higher level
has more time per event available to make a more refined decision. The final
rate will be 200 Hz with an event size of about 1.3 MB.
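The rate-reduction chain quoted above can be checked with a few lines of arithmetic. This is an illustrative sketch; the numbers are taken from the text, and the helper function is not part of any ATLAS software:

```python
# Back-of-the-envelope check of the three-level trigger chain.
# Rates and event size come from the text; purely illustrative.

def rejection(rate_in_hz, rate_out_hz):
    """Rate-reduction factor achieved between two trigger levels."""
    return rate_in_hz / rate_out_hz

l1_in, l1_out = 40e6, 75e3   # 40 MHz bunch crossings -> 75 kHz after L1
l2_out = 3.5e3               # L2 output rate (Hz)
ef_out = 200.0               # Event Filter output rate (Hz)
event_size_mb = 1.3          # average event size written to storage (MB)

total_rejection = rejection(l1_in, ef_out)    # overall factor of 2e5
storage_rate_mb_s = ef_out * event_size_mb    # data volume written per second
```

The overall rejection of 2 × 10^5 corresponds to a storage bandwidth of roughly 260 MB/s at the quoted event size.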
Figure 2.12: Schematics of the muon trigger system. RPC2 and TGC3 are the reference (pivot) planes for barrel and end-cap, respectively [27].
2.2.8.1 Level-1 Trigger
The level-1 trigger (L1) is a hardware based trigger, it performs the initial
event selection based on information from the calorimeters and muon detectors.
The calorimeter selection is based on information from all the calorimeters. The
L1 Calorimeter Trigger (L1Calo) aims to identify high-ET objects such as elec-
trons and photons, jets, and τ -leptons decaying into hadrons, as well as events
with large missing transverse energy (ETmiss) and large total transverse energy. A trigger on the scalar sum
of jet transverse energies is also available. For the e/γ and τ triggers, isolation
can be required. Isolation implies that the energetic particle must have a mini-
mum angular separation from any significant energy deposit in the same trigger.
The information for each bunch-crossing used in the L1 trigger decision is the
multiplicity of hits for 4 to 16 programmable ET thresholds per object type.
The L1 muon trigger is based on signals in the muon trigger chambers: RPCs
in the barrel and TGCs in the end-caps. The trigger searches for patterns of
hits consistent with high-pT muons originating from the interaction region. The
logic provides 6 independently-programmable pT thresholds: three associated
with the low-pT trigger (threshold range approximately 6−9 GeV) and three as-
sociated with the high-pT trigger (threshold range approximately 9−35 GeV).
The information for each bunch-crossing used in the L1 trigger decision is the
multiplicity of muons for each of the pT thresholds. Muons are not double-
counted across the different thresholds.
The L1 is designed to reduce the 40 MHz rate to approximately 75 kHz, with
the possibility to upgrade to 100 kHz. The decision time (latency), which is the
time from the collision until the L1 trigger decision, must be kept as short as
possible. The L1 trigger has a latency less than 2.5 µs.
The level-1 trigger also defines the so-called Regions of Interest (RoIs). These
are detector regions in η and φ coordinates, where interesting features have
been identified, hence where the L1 trigger has identified possible trigger ob-
jects within the event. These RoIs are used by the subsequent trigger as starting
point for more refined trigger algorithms. If an event is accepted by the L1 trig-
ger, the full detector is read out and the data are passed to the level-2 trigger (L2).
Muon Trigger algorithm
The trigger in both the barrel and the end-cap regions is based on three trigger
stations each. The basic principle of the algorithm is to require a coincidence
of hits in the different trigger stations within a road, which tracks the path
of a muon from the interaction point through the detector. Each coincidence
pattern corresponds to a certain deviation from straightness, i.e. curvature of
the track, which is used as a criterion for the track to have passed a predefined
momentum threshold. The deviation from straightness is the deviation of the
slope of the track segment between two trigger chambers from the slope of a
straight line between the interaction point and the hit in a reference layer called
the pivot plane, which is the second layer in the barrel (RPC2) and the last
layer in the end-cap (TGC3), as illustrated in Figure 2.12. The width of the
road is a function of the desired cut on pT : the smaller the road, the higher the
cut on pT .
In the barrel the trigger algorithm operates in the following way: if a track
hit is generated in the second RPC doublet (the pivot plane), a search for
a corresponding hit is made in the first RPC doublet, within a road whose
center is defined by the line of conjunction of the hit in the pivot plane with
the interaction point. Only a 3-out-of-4 coincidence of the four layers of the
two doublets is required for the low-pT trigger. The high-pT algorithm also
requires 1-out-of-2 possible hits of the RPC3 doublet. The scheme of the L1
muon end-cap trigger is shown on the right hand side of Figure 2.12. A 3-out-
of-4 coincidence is required for the doublet pair planes of TGC2 and TGC3, for
both wires and strips, a 2-out-of-3 coincidence for the triplet wire planes, and
1-out-of-2 possible hits for the triplet strip planes. Trigger signals from both
doublets and the triplet are involved in identifying the high-pT candidates, while
in case of the low-pT candidates the triplet station may be omitted.
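The road-and-coincidence logic described above can be sketched in a few lines. Everything here — the straight-line geometry helper, the data layout and the numerical values — is invented for illustration; the real ATLAS trigger logic is implemented in dedicated hardware:

```python
# Toy version of the barrel low-pT coincidence: a hit in the pivot plane (RPC2)
# opens a road in the other doublet, centred on the line from the interaction
# point, and a 3-out-of-4 layer coincidence is required. Illustrative only.

def road_center(z_pivot, r_pivot, r_layer):
    """Straight line from the interaction point through the pivot-plane hit,
    evaluated at the radius of another chamber layer."""
    return z_pivot * (r_layer / r_pivot)

def low_pt_coincidence(layers, center, half_width):
    """layers: four lists of hit positions (two pivot-doublet layers plus two
    inner-doublet layers). A layer 'fires' if any of its hits falls inside the
    road; the low-pT trigger requires a 3-out-of-4 coincidence."""
    fired = sum(1 for layer in layers
                if any(abs(z - center) <= half_width for z in layer))
    return fired >= 3
```

A smaller `half_width` accepts only straighter tracks and therefore corresponds to a higher effective pT threshold, as stated in the text.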
2.2.8.2 Level-2 Trigger
The L2 trigger is a software trigger which uses the output of the L1: it
uses the RoI information on coordinates, energy, and type of signatures to limit
the amount of data which must be transferred from the detector readout. L2
selections use, at full granularity and precision, all the available detector data
within the RoIs (approximately 2% of the total event data) to further reduce
the data rate to approximately 3.5 kHz, with an event processing time of about
40 ms, averaged over all events.
2.2.8.3 Level-3 trigger
Events selected by the L2 trigger are passed on to the L3 trigger (Event
Filter (EF)) which uses the complex reconstruction algorithms also used in AT-
LAS offline event reconstruction. The L2 and the EF together are called the
High Level Trigger (HLT). EF further selects events down to a rate which can
be recorded for subsequent offline analysis. It reduces the event rate to ap-
proximately 200 Hz, with an average event processing time of order 4 s. The
HLT algorithms use the full granularity and precision of calorimeter and muon
chamber data, as well as the data from the inner detector, to refine the trigger
selections. Better information on energy deposition improves the threshold cuts,
while track reconstruction in the inner detector significantly enhances the par-
ticle identification (for example distinguishing between electrons and photons).
The decision for accepting an event is based on trigger menus. A trigger menu
is a set of one or more event characteristics (like ETmiss or a muon) with certain
thresholds. The set of trigger menus can be adjusted depending on the luminos-
ity to use the full capacity of the bandwidth. Those events that have passed the
selection criteria are sorted into data streams: electrons, muons, jets, photons,
ETmiss, τ-leptons, and B-physics. As ATLAS uses inclusive streaming, an
event can be recorded in more than one stream. In addition to the physics
streams, there are also calibration streams that are used to calibrate the detec-
tors, and express streams that are used for monitoring and for data quality
checks.
2.2.9 Electron Reconstruction and Identification
Electron reconstruction uses information from the calorimeter and the inner
detector to reject objects that fake an electron, such as photons (with or without
pair conversions), QCD jets (u/d/s-hadron decays), π0/η Dalitz decays (π0/η → e+e−γ),
charged hadrons and muons (because of the potential emission of a
bremsstrahlung photon), and to identify isolated (Z, W , t, τ or µ decays) and
non-isolated electrons (J/ψ, b-hadron or c-hadron decays).
In the moderate pT region (20-50 GeV), a jet-rejection factor exceeding 10^5 will
be needed to extract a relatively pure inclusive signal from genuine electrons
above the residual background from jets faking electrons. The required rejection
factor decreases rapidly with increasing pT to ∼ 10^3 for jets in the TeV region.
At present, two electron reconstruction algorithms in the range |η| < 2.5 have
been implemented in the ATLAS offline software, both integrated into one single
package and a common event data model [18]:
• The standard one (egammaBuilder), which is seeded from the electromag-
netic (EM) calorimeters, starts from clusters reconstructed in the calorime-
ters and then builds the identification variables based on information from
the inner detector and the EM calorimeters.
• A second algorithm (softeBuilder), which is seeded from the inner de-
tector tracks, is optimized for electrons with energies as low as a few GeV,
and selects good-quality tracks matching a relatively isolated deposition
of energy in the EM calorimeters. The identification variables are then
calculated in the same way as for the standard algorithm.
A third algorithm is dedicated to the reconstruction of forward electrons (|η| > 2.5),
where no track matching is possible because of the limited coverage
of the ID (|η| < 2.5). Thus, in contrast to the central electrons, forward electron
reconstruction can only use information from the calorimeters. Reconstructed
electrons from different algorithms are merged and the overlap between differ-
ent algorithms is removed during the AOD (Analysis Object Data) production.
The variable "author" is defined to indicate which algorithm created a certain
electron. The standard electron is defined by: author = 1 || author = 3, where
“1” means the electron comes from the egammaBuilder, and “3” means both
the egammaBuilder and the softeBuilder find the electron. In cases where
both the algorithms find the same electron, the overlap is resolved and most
parameters from the standard electrons are kept with a few exceptions.
In the standard algorithm electron and photon reconstruction begins with the
creation of a preliminary set of clusters in the EM calorimeter whose size cor-
responds to 3 × 5 cells in η × φ in the middle layer. Electron and photon
reconstruction is seeded from such clusters with ET > 2.5 GeV, using a sliding
window algorithm14 over the full acceptance of the EM calorimeter. The final
cluster size is dependent on the particle hypothesis and the region of the detec-
tor: 3 × 5 for unconverted photons in the barrel, 3 × 7 for converted photons
14A sliding window algorithm with fixed size looks for regions of approximately 0.1 × 0.1 in ∆η × ∆φ where the deposits exceed 2.5 GeV and defines the cluster position such that the energy inside the window is maximized.
and electrons in the barrel, 5× 5 in all other cases15. Then a matching track is
searched for among all reconstructed tracks which do not belong to a photon-
conversion pair reconstructed in the inner detector. The track is required to
match the cluster within a broad ∆η × ∆φ window of 0.05 × 0.10. The ratio,
E/p, of the energy of the cluster to the momentum of the track is required to
be < 10. Approximately 93% of true isolated electrons, with ET > 20 GeV and
|η| < 2.5, are selected as electron candidates. The inefficiency is mainly due to
the large amount of material in the inner detector and is therefore η-dependent.
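The fixed-size sliding-window search of footnote 14 can be sketched as a scan over a small grid of cell transverse energies. This is a toy version: the grid, the window indexing and the local-maximum tie-breaking are simplifications for illustration, not the ATLAS implementation:

```python
# Toy sliding-window cluster seeding: scan a fixed-size window over a 2D grid
# of cell ET (GeV) and keep positions where the summed ET exceeds the seed
# threshold and is a local maximum among neighbouring window positions.

def sliding_window_seeds(et_grid, win=3, threshold=2.5):
    """et_grid: 2D list of cell ET in GeV. Returns (eta_idx, phi_idx, sum_et)
    for window positions above threshold that are local maxima."""
    n_eta, n_phi = len(et_grid), len(et_grid[0])
    sums = {}
    for i in range(n_eta - win + 1):
        for j in range(n_phi - win + 1):
            sums[(i, j)] = sum(et_grid[i + a][j + b]
                               for a in range(win) for b in range(win))
    seeds = []
    for (i, j), s in sums.items():
        if s <= threshold:
            continue
        neighbours = [sums.get((i + di, j + dj), 0.0)
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)
                      if (di, dj) != (0, 0)]
        if all(s >= n for n in neighbours):
            seeds.append((i, j, s))
    return seeds
```

In the real reconstruction the seeded cluster is then rebuilt with the particle- and region-dependent sizes quoted in the text (3 × 5, 3 × 7, 5 × 5) before track matching.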
Various identification techniques can be applied to the reconstructed electron
candidates, combining calorimeter and track quantities and the TRT informa-
tion to discriminate jets and background electrons from the signal electrons.
Standard identification of high-pT electrons is based on many cuts which can all
be applied independently. Three reference sets of cuts have been defined: loose,
medium and tight, as summarised in Table 2.5 [18].
2.2.9.1 Loose cuts
This set of cuts performs a simple electron identification based only on lim-
ited information from the calorimeters. Cuts are applied on the hadronic leakage
and on shower-shape variables, derived from only the middle layer of the electro-
magnetic calorimeter (lateral shower shape and lateral shower width). This set
of cuts provides excellent identification efficiency of ∼ 88%, but low background
rejection ∼ 600 [36].
2.2.9.2 Medium cuts
This set of cuts improves the quality by adding cuts on the strips in the first
layer of the EM calorimeter and on the tracking variables:
• Strip-based cuts are effective in the rejection of π0 → γγ decays. Since the
energy-deposit pattern from π0’s is often found to have two maxima due
to π0 → γγ decay, showers are studied in a window ∆η×∆φ = 0.125×0.2
around the cell with the highest ET to look for a second maximum. If more
than two maxima are found the second highest maximum is considered.
The variables used include:
– ∆Es = Emax2 − Emin: the difference between the energy associated
with the second maximum (Emax2) and the energy reconstructed in
15In the barrel, electrons need larger clusters than photons in φ to collect bremsstrahlung photons, as the electrons bend in φ due to the solenoid magnetic field. In the end-cap, all the particles use the same window since the effect of the magnetic field is smaller. The window sizes were chosen as a compromise between the spread of the energy deposits and the noise (the inclusion of more cells increases the noise).
Loose cuts:
• Acceptance of the detector: |η| < 2.47.
• Hadronic leakage: ratio of ET in the first sampling of the hadronic calorimeter to ET of the EM cluster.
• Second layer of the EM calorimeter:
– Ratio in η of cell energies in 3 × 7 versus 7 × 7 cells (Rη).
– Ratio in φ of cell energies in 3 × 3 versus 3 × 7 cells (Rφ).
– Lateral width of the shower.
Medium cuts (include loose cuts):
• First layer of the EM calorimeter:
– Difference between the energy associated with the second largest energy deposit and the energy associated with the minimal value between the first and second maxima (∆Es).
– Second largest energy deposit normalised to the cluster energy (Rmax2).
– Total shower width (wstot).
– Shower width for three strips around the maximum strip (ws3).
– Fraction of energy outside the core of three central strips but within seven strips (Fside).
• Track quality:
– Number of hits in the pixel detector (at least one).
– Number of hits in the pixels and SCT (at least nine).
– Transverse impact parameter (< 1 mm).
Tight (isol) cuts (include medium cuts):
• Isolation: ratio of transverse energy in a cone ∆R < 0.2 to the total cluster transverse energy.
• Vertexing layer: number of hits in the vertexing layer (at least one).
• Track matching:
– ∆η between the cluster and the track (< 0.005).
– ∆φ between the cluster and the track (< 0.02).
– Ratio of the cluster energy to the track momentum (E/p).
• TRT:
– Total number of hits in the TRT.
– Ratio of the number of high-threshold hits to the total number of hits in the TRT.
Tight (TRT) cuts (include tight (isol) cuts except for isolation):
• Same TRT cuts as above, but with tighter values corresponding to about 90% efficiency for isolated electrons.
Table 2.5: Definition of variables used for loose, medium and tight electron identification cuts. The cut values are given explicitly only when they are independent of η and pT.
the strip with the minimal value, found between the first and second
maxima (Emin);
– Rmax2 = Emax2 / (1 + 9 × 10^-3 ET), where ET is the transverse energy
of the cluster in the electromagnetic calorimeter and the constant
value 9 is in units of GeV^-1;
– wstot: the shower width over the strips covering 2.5 cells of the second
layer (20 strips in the barrel for instance);
– ws3: the shower width over three strips around the one with the
maximal energy deposit;
– Fside: the fraction of energy deposited outside the shower core of
three central strips.
• The tracking variables include the number of hits in the pixels, the number
of silicon hits (pixels plus SCT) and the transverse impact parameter.
The medium cuts increase the jet rejection by a factor of 3-4 (up to 2000) with
respect to the loose cuts, while reducing the identification efficiency by ∼ 10%
(it is ∼ 77%) [36].
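The ∆Es and Rmax2 definitions above can be made concrete with a small sketch. This is illustrative Python on a 1D list of strip energies; the helper names are invented, and the real computation runs on calibrated first-layer calorimeter strips:

```python
# Sketch of the strip-layer variables: find the two leading local maxima in a
# 1D strip-energy profile, then compute Delta_Es and Rmax2 as defined in the
# text. Inputs and binning are simplified for illustration.

def local_maxima(strips):
    """Indices of strips higher than both neighbours."""
    return [i for i in range(1, len(strips) - 1)
            if strips[i] > strips[i - 1] and strips[i] > strips[i + 1]]

def strip_variables(strips, et_cluster):
    """strips: list of strip energies (GeV); et_cluster: cluster ET (GeV).
    Returns (Delta_Es, Rmax2); (0, 0) when there is no second maximum."""
    maxima = sorted(local_maxima(strips), key=lambda i: strips[i], reverse=True)
    if len(maxima) < 2:
        return 0.0, 0.0
    i1, i2 = sorted(maxima[:2])          # the two highest maxima, in strip order
    e_max2 = strips[maxima[1]]           # energy of the second maximum
    e_min = min(strips[i1:i2 + 1])       # valley between the two maxima
    delta_es = e_max2 - e_min
    r_max2 = e_max2 / (1.0 + 9e-3 * et_cluster)   # ET in GeV, 9 in GeV^-1
    return delta_es, r_max2
```

A genuine electron shower yields a single maximum (both variables vanish), while a π0 → γγ overlap produces a sizeable second peak, which is what these cuts exploit.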
2.2.9.3 Tight cuts
This set of cuts makes use of all the particle-identification tools currently
available for electrons. In addition to the cuts used in the medium set, cuts are
applied on the number of vertexing-layer hits (to reject electrons from conver-
sions), on the number of hits in the TRT, on the ratio of high-threshold hits to
the number of hits in the TRT (to reject the dominant background from charged
hadrons), on the difference between the cluster and the extrapolated track po-
sitions in η and φ , and on the ratio of cluster energy to track momentum, as
shown in Table 2.5.
Two different final selections are available within this tight category: they are
named tight (isol) and tight (TRT) and are optimised differently for isolated
and non-isolated electrons. In the case of tight (isol) cuts, an additional energy
isolation cut is applied to the cluster, using all cell energies within a cone of
∆R < 0.2 around the electron candidate. This set of cuts provides, in general,
a reasonable electron identification efficiency of ∼ 64% (but the highest isolated
electron identification) and the highest rejection against jets (∼ 10^5) [36]. The
tight (TRT) cuts do not include the additional explicit energy isolation cut, but
instead apply tighter cuts on the TRT information to further remove the back-
ground from charged hadrons.
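The energy-isolation cut of the tight (isol) selection amounts to a cone sum around the candidate. The sketch below is illustrative Python: the cell list and function names are invented, and the real computation navigates the calorimeter cells directly:

```python
import math

# Sketch of calorimetric cone isolation: sum the ET of all cells within
# Delta R < 0.2 of the candidate and normalise to the cluster ET.

def delta_r(eta1, phi1, eta2, phi2):
    """Separation in eta-phi space, with phi wrapped into [-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)

def isolation_ratio(cand_eta, cand_phi, cluster_et, cells, cone=0.2):
    """cells: list of (eta, phi, et) in GeV. Returns cone ET / cluster ET."""
    cone_et = sum(et for eta, phi, et in cells
                  if delta_r(cand_eta, cand_phi, eta, phi) < cone)
    return cone_et / cluster_et
```

An isolated electron gives a small ratio, while an electron inside a jet accumulates extra ET in the cone, which is why the cut suppresses the jet background.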
2.2.10 Muon Reconstruction and Identification
The ATLAS detector has been designed for efficient muon identification and
momentum resolutions as low as 3% for transverse momenta of pT = 200 GeV
and less than 10% up to pT = 1 TeV. This is achieved by a combination of mea-
surements from the inner detector and the muon spectrometer. For pT roughly
in the range between 30 and 200 GeV, the momentum measurements from the
inner detector and muon spectrometer may be combined to give precision better
than either alone. The inner detector dominates below this range, and the spec-
trometer above it [18]. As described in Section 2.2.8, muons with a momentum
higher than 6 GeV are triggered. However, muons with a lower momentum can
still be reconstructed in the muon spectrometer, where muons are identified and
measured with momenta ranging from 3 GeV to 3 TeV [27].
Muon reconstruction and identification is based on a combined usage of data
from three ATLAS sub-detectors: the muon spectrometer (MS), the inner detec-
tor (ID), and the calorimeter. The calorimeter, with a thickness of more than 10
λ, provides an effective absorber for hadrons, electrons and photons produced
by pp collisions at the center of the ATLAS detector. Energy measurements
in the calorimeter can aid in muon identification because of their characteristic
minimum ionizing signature and can provide a useful direct measurement of the
energy loss.
At ATLAS, four types of muons are defined to achieve high purity, efficiency and
momentum resolution: combined, stand-alone, segment tagged and calorimeter
tagged muons. The current ATLAS baseline reconstruction includes two algo-
rithms for each strategy. The algorithms are grouped into two families such
that each family includes one algorithm for each strategy. The output data
intended for use in physics analysis includes two collections of muons, one for
each family, in each processed event. The collections (and families) are referred
by the names of the corresponding combined algorithms: Staco and Muid. The
Staco collection is the current default for physics analysis [18].
Stand-Alone Muons (SA): Only the hits in the muon spectrometer are used
to reconstruct the track, see Figure 2.13(a). The standalone algorithms,
both families, start by identifying Regions of Activity, which are seeded
by the muon trigger chambers, and then employ a pattern recognition
algorithm to form local segments in each of the three muon stations in
these regions of activity. Next, the local segments are connected via a
three dimensional continuous track fit in the magnetic field to form track
candidates. Once the tracks have been found, they are extrapolated to the
beam line. The extrapolation must account for both multiple scattering
and energy loss in the calorimeter. Then at this step, despite the different
(a) Stand-Alone Muon: A track in the muon spectrometer (blue), extrapolated through the calorimeter (orange) but without a matching inner detector track (dashed line).
(b) Combined Muon: A track in the muon spectrometer (blue), extrapolated through the calorimeter (orange) and matched with a track in the inner detector (yellow).
(c) Segment Tagged Muon: An inner detector track (yellow) matched with one hit segment in the muon spectrometer (blue).
(d) Calorimeter Tagged Muon: An inner detector track (yellow) extrapolated into the calorimeter (orange) and compatible with the signature of a minimum ionizing particle.
Figure 2.13: The four types of muon candidates defined at ATLAS: stand-alone (a), combined (b), segment tagged (c) and calorimeter tagged (d).
implementations, the general procedures are essentially the same for both
families.
The Staco family algorithm is called Muonboy. On the Muid side, Moore is
used to find the tracks and MuidStandalone for the inward extrapolation.
Muonboy assigns energy loss based on the material crossed in the calorime-
ter. Muid additionally makes use of the calorimeter energy measurements
if they are significantly larger than the most likely value and the muon
appears to be isolated.
Standalone algorithms have the advantage of slightly greater |η| coverage, up
to 2.7 compared to 2.5 for the inner detector, but there are holes in
the coverage at |η| near 0 and 1.2. Very low momentum muons (around
a few GeV) may be difficult to reconstruct because they do not penetrate
to the outermost stations.
Combined Muons (CB): Muon spectrometer and inner detector perform an
independent track reconstruction. After successful combination, a joint
track is formed (see Figure 2.13(b)). Calorimeter measurements are taken
into account to reduce the fake signals of the standalone reconstruction.
Combined muons are the standard muon objects for physics analysis and
provide candidates of highest purity. The combined reconstruction covers
the range |η| < 2.5 due to the inner detector acceptance.
The Staco muon reconstruction attempts to statistically merge the two
independent measurements from the ID track and the MS track (this algorithm
is called STACO). Muid does a partial refit: it does not directly use
the measurements from the inner track, but starts from the inner track
vector and covariance matrix and adds the measurements from the outer
track. The fit accounts for the material (multiple scattering and energy
loss) and magnetic field in the calorimeter and muon spectrometer. This
latter algorithm is called MuidCombined.
Segment Tagged Muons (ST): If the hits in the muon spectrometer are not
sufficient for a proper measurement, an inner detector track is still con-
sidered a muon, if the extrapolated track can be associated with a recon-
structed muon segment (see Figure 2.13(c)). In other words, tagged muons
are produced by propagating all ID tracks with sufficient momentum out
to the MS and searching for matching segments in the inner and middle
stations of the MS. The tagged muon reconstruction mainly aims to re-
construct low-pT muon tracks. Therefore segment tagged muons are used
to recover low detector efficiencies in low-pT and badly covered η regions.
The muon tagging covers |η| < 2 only. This strategy will provide infor-
mation in detector regions where standalone reconstruction is degraded,
such as the region near η = 0 and the transition region between barrel
and end-cap (|η| ∼ 1.2).
In the Staco muon reconstruction the tagged algorithm is referred to as
MuTag. In the Muid muon reconstruction, the tagged muons are found
by the MuGirl or the MuTagIMO algorithm. MuGirl considers all inner de-
tector tracks and redoes segment finding in the region around the track.
MuTag only makes use of inner detector tracks and muon spectrometer
segments not used by Staco combined algorithm. Thus MuTag serves only
to supplement STACO while MuGirl attempts to find all muons. In the
Muid collection, the overlap between combined (MuidCombined) and seg-
ment tagged muons (MuGirl) has to be taken into account; these overlaps
are removed by creating a single muon when both have the same inner
detector track.
Calorimeter Tagged Muons (CT): A trajectory in the inner detector is
identified as a muon if the associated energy depositions in the calorime-
ters are compatible with the hypothesis of a minimum ionizing particle
(mip16). Calorimeter tagged muons are reconstructed by separate algo-
16A particle that, while traversing matter, releases the minimum ionization energy.
rithms with respect to Staco and Muid and they recover efficiency at η ∼ 0.
The standalone, the combined, and the tagged muons are merged to improve
the muon finding efficiency, and possible overlaps between different algorithms
are removed, i.e. cases where the same muon is identified by two or more
algorithms. The overlap removal requires that muons have different inner detector
tracks and merges standalone muons that are too close to one another. Closeness
is defined by η − φ separation with a default limit of 0.4 [18]. Similar to the
electron case, the variable "author" is defined to indicate the algorithm by
which a certain muon is built.
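The merging and overlap-removal logic described in this paragraph can be sketched as follows. The data layout and function names are invented for illustration; only the default ∆R limit of 0.4 is taken from the text:

```python
import math

# Toy overlap removal: muons sharing an inner-detector track are merged
# (first one kept), and standalone muons closer than Delta R = 0.4 in eta-phi
# to an already accepted muon are dropped.

def delta_r(m1, m2):
    """eta-phi separation of two muon dicts, with phi wrapping."""
    dphi = math.atan2(math.sin(m1["phi"] - m2["phi"]),
                      math.cos(m1["phi"] - m2["phi"]))
    return math.hypot(m1["eta"] - m2["eta"], dphi)

def remove_overlaps(muons, dr_limit=0.4):
    """muons: dicts with 'eta', 'phi' and an optional 'id_track' identifier."""
    accepted, seen_tracks = [], set()
    for mu in muons:
        track = mu.get("id_track")
        if track is not None:
            if track in seen_tracks:
                continue              # same ID track: same physical muon
            seen_tracks.add(track)
        elif any(delta_r(mu, acc) < dr_limit for acc in accepted):
            continue                  # standalone muon too close to an accepted one
        accepted.append(mu)
    return accepted
```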
Chapter 3
Higgs search in the decay channel
H → ZZ(∗) → 4l
The search for the Standard Model Higgs boson is a major goal of the LHC.
As already discussed in Section 1.3.2, the experimentally cleanest signature for the discovery
of the Higgs boson is its “golden” decay to four leptons (electrons and muons):
H → ZZ → 4l. The excellent energy resolution and linearity of the recon-
structed electrons and muons lead to a narrow four-lepton invariant mass peak
on top of a smooth background. At the same time Higgs analyses in four leptons
final states have a great impact on the discovery sensitivity, and if a discovery takes
place, H → 4l offers several possibilities to study the properties of the Higgs
boson.
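The narrow peak relies on reconstructing the four-lepton invariant mass from the measured lepton kinematics. In the massless-lepton approximation this reduces to the sketch below (illustrative Python, not the ATLAS reconstruction code):

```python
import math

# Sketch of the four-lepton invariant mass: build each four-vector from
# (pT, eta, phi) in the massless approximation, sum them, and take the
# Minkowski norm of the total.

def four_vector(pt, eta, phi):
    """(E, px, py, pz) of a massless particle, pt in GeV."""
    px, py = pt * math.cos(phi), pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    e = pt * math.cosh(eta)          # massless: E = |p|
    return e, px, py, pz

def invariant_mass(leptons):
    """leptons: list of (pt, eta, phi) in GeV. Returns m in GeV."""
    e, px, py, pz = map(sum, zip(*(four_vector(*l) for l in leptons)))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))
```

The excellent lepton momentum resolution quoted in the text translates directly into a sharp peak of this quantity at mH on top of the smooth ZZ continuum.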
This chapter details the first of the two physics analyses studied in this
thesis, the cut-based analysis; both analyses concern the search for a SM Higgs boson
in four-lepton final states. The present analysis was published by the ATLAS
Collaboration in [37]: “Search for the Standard Model Higgs boson in the decay
channel H → ZZ(∗) → 4l with 4.8 fb−1 of pp collisions at √s = 7 TeV” and all
details are available on the online Higgs group twiki page [38].
The chapter is organized as follows: characteristics of the signal and main
backgrounds are discussed in Section 3.1, afterwards the pileup reweighting, the
lepton reconstruction and the applied corrections are described, then Section 3.5
explains the analysis strategy (event selection and mass reconstruction). Finally
the background estimation and the results are presented.
3. Higgs search in 4l 3.1 Signal and Main Backgrounds
3.1 Signal and Main Backgrounds
The main characteristics of the signal and main backgrounds, already men-
tioned in section 1.3.2, are here summarized. The importance of four leptons
final states on Higgs searches is partially due to the high branching ratio of
H → ZZ. For Higgs masses greater than 120 GeV, decays to a pair of Z
bosons, as already underlined in section 1.3.2, are above the 1% level, becoming
the sub-leading process for mH ∼ 160 GeV and contributing roughly 1/3
of the branching fraction from 200 GeV on. The Zs then decay to charged lep-
tons, neutrinos or quarks, the latter ones being hardly accessible due to QCD
backgrounds. While both the taus and the neutrinos induce significant amounts
of missing transverse energy, events with only electrons and muons can be fully
reconstructed. From the experimental point of view, these are the cleanest sig-
natures available. The excellent transverse energy and momentum resolutions
for electrons and muons provide narrow invariant mass distributions when the
Z and Higgs bosons are reconstructed. The Higgs signal can be identified by
a peak in the four-lepton invariant mass spectrum, sitting on top of a relatively
smooth background. For mH ≥ 180 GeV, H → 4l is the “golden channel”, with
the Higgs decaying to two on-shell Z bosons.
The main and almost only background in this region is the non-resonant pro-
duction of Z boson pairs, which is nearly irreducible, possessing the same char-
acteristics as the signal. The leading diagrams for H → 4l and ZZ → 4l are
represented in figures 3.1 and 3.2(a) respectively. In both cases, each Z can
decay to electrons or muons, leading to three final states: four electrons (4e),
four muons (4µ) or two electrons and two muons (2e2µ). The last one has twice
the yield of each of the other two modes.
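The factor of two for the 2e2µ final state follows from branching-fraction combinatorics, which can be written out explicitly (illustrative Python; the numerical value of BR(Z → ll) per flavour is approximate):

```python
# Branching-fraction combinatorics behind the factor of two quoted above.
# With BR(Z -> ee) = BR(Z -> mumu) = b, the mixed final state can be assigned
# to the two Z bosons in two ways, while 4e and 4mu each have only one.

b = 0.0337                 # BR(Z -> ll) per lepton flavour, roughly 3.4%
br_4e    = b * b           # both Z bosons decay to electrons
br_4mu   = b * b           # both Z bosons decay to muons
br_2e2mu = 2 * b * b       # Z1 -> ee, Z2 -> mumu, or the other way round
```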
Figure 3.1: Main diagram for Higgs to four leptons production.
In addition to ZZ, the most important backgrounds are SM processes which generate
four real leptons with high pT , such as Zbb and tt, represented in figures
3.2(b) and 3.2(c). In both cases the dominant contribution is from two leptons
originating from the leptonic decay of the Z or the W s, and the other two from
decays of the b-quarks. Lepton production from top quark decays is illustrated
in Figure 3.3.
(a) ZZ (b) Zbb (c) tt
Figure 3.2: Some of the diagrams for the backgrounds to H → 4l searches.
Figure 3.3: Lepton production chains from top quark decays.
Despite the low fake rates, the role of events with three leptons plus a fake
one, or even two fakes, must be carefully evaluated, as their cross section can
be significantly higher than the ones with four leptons. The list of potentially
dangerous processes is thus complemented by WZ and Z + jets, with the Ws
and Zs decaying leptonically, and finally by gg2ZZ, which is the loop-induced
gluon-fusion process gg → Z(∗)(γ(∗))Z(∗)(γ(∗)) → ll l′l′. In this last process,
for Higgs masses below the Z-pair threshold, where one Z boson is produced off-shell,
the photon contribution to the background is particularly important [39].
Below mH = 2mZ , one of the Zs coming from the Higgs is off-shell and decays
to low momentum leptons, making the Zbb and tt backgrounds more harmful.
Their yields can exceed the signal yield by a few orders of magnitude
in this region, imposing additional selection cuts. Apart from the leptons from
Ws and Zs, the dominant contribution is expected from semi-leptonic decays
of heavy flavour quarks (b and c). Exploiting the hadronic activity around the
leptons and the long lifetime of hadrons containing these quarks, both track
and calorimeter isolation provide good discriminating power, complemented
by impact parameter requirements; these criteria are discussed in Section 3.5.4.
Rejection factors above 10 can be achieved with each discriminant, keeping the
backgrounds under control. For processes that do not contain vector bosons,
the rejection of the isolation cuts is much higher and thus their contribution is
negligible.
The challenges of the analysis arise mainly from two factors. On one side
is the low signal cross section: the small branching ratio for Z → ll (∼ 3.4%)
and the presence of four leptons demand the highest possible identification
efficiencies for both muons and electrons, which is particularly challenging at
low pT . On the other side, the background yields are known with large uncertainties.
3.1.1 Data and Monte Carlo Samples
3.1.1.1 Data Samples
The data used in this analysis were recorded with the ATLAS detector dur-
ing the 2011 LHC run. The data are subject to a number of quality requirements
ensuring that all essential elements of the ATLAS detector are working as
expected (see Section 3.5.1). The integrated luminosities of the data analysed
are 4.81 fb−1, 4.81 fb−1 and 4.91 fb−1 for the 4µ, 2e2µ and 4e final
states, respectively.
3.1.1.2 Monte Carlo Samples
The H → ZZ(∗) → 4l signal is modelled in the range 110 to 600 GeV using
the powheg Monte Carlo (MC) event generator [40, 41], which calculates separately
the gluon fusion and vector boson fusion production mechanisms of the Higgs
boson with matrix elements up to next-to-leading order (NLO). The Higgs boson
transverse momentum (pT ) spectrum in the gluon fusion process is reweighted in
order to include quantum chromodynamics (QCD) corrections up to NLO and
QCD soft-gluon resummations up to next-to-next-to-leading logarithm (NNLL).
powheg is interfaced to pythia [42] for showering and hadronization, which in
turn is interfaced to photos [43] for QED radiative corrections in the final-state
and to tauola [44, 45] for the simulation of τ decays. The cross sections for
Higgs boson production are derived to next-to-next-to-leading order (NNLO) in
QCD for the gluon fusion and vector boson fusion. In addition, QCD soft-gluon
resummations up to next-to-next-to-leading log (NNLL) are available for the
gluon fusion process [46], while the NLO electroweak (EW) corrections are ap-
plied to both the gluon fusion and vector boson fusion. The cross section times
the branching ratio values used for signal samples in the following are listed in
Table 3.1 [47, 38].
The simulated background samples considered in this analysis along with their
cross sections provided by the generators and their total number of events are
reported in Table 3.2. The background samples are generated in the following
ways:
mH [GeV]   MC Generator    Total Events   σ·BR [nb]
130        powheg-pythia   200000         5.89 · 10−6
150        powheg-pythia   199998         8.95 · 10−6
180        powheg-pythia   50000          4.12 · 10−6
200        powheg-pythia   50000          13.60 · 10−6
360        powheg-pythia   50000          7.08 · 10−6
400        powheg-pythia   49999          5.53 · 10−6
600        powheg-pythia   50000          0.90 · 10−6

Table 3.1: Signal samples along with their Monte Carlo generator, their total number of events and the cross section times branching ratio for the gluon fusion process in pp collisions at √s = 7 TeV.
• The irreducible ZZ(∗) → 4l background is generated using pythia. pythia
implements the qq initial state and takes into account the Z − γ interfer-
ence.
• The inclusive Z boson¹ and Zbb production is modelled using alpgen
[48]. The alpgen generator is interfaced to jimmy [49] for the simulation
of the underlying event. In the Zbb process the b quarks can also lead
to the emission of one or more partons, besides the two leptons. Overlaps
between the inclusive Z boson and Zbb samples are removed².
• For the tt production mc@nlo [50] is employed. The mc@nlo generator is
interfaced to jimmy for the simulation of the underlying event.
• The WZ background is produced with herwig [51].
• The gg2ZZ background is generated with jimmy.
Processes   MC Generator   Total Events   σ·BR [nb]       Filter Efficiency
ZZ → 4l     pythia         597958         7.3467 · 10−5   0.62
WZ          herwig         249949         1.1481 · 10−2   0.31
gg2ZZ       jimmy          65000          2.7900 · 10−6   0.60
tt          mc@nlo         14965993       1.4562 · 10−1   0.54
Table 3.2 continues on the following page.

¹The inclusive Z boson term refers to the Z + jets process with subsequent Z → l+l− decay. This process can contribute to the background if the accompanying jets are mis-identified as leptons.
²Namely, bb pairs with separation ∆R ≥ 0.4 between the jets are taken from the matrix-element calculation, whereas for ∆R < 0.4 the parton-shower jets are used.
Table 3.2 – continued from previous page

Processes                    MC Generator   Total Events   σ·BR [nb]       Filter Efficiency
Zbb
→ e+e−bb + 0p (NoFilter)     alpgen-jimmy   150000         6.5529 · 10−3   1
→ e+e−bb + 1p (NoFilter)     alpgen-jimmy   100000         2.4782 · 10−3   1
→ e+e−bb + 2p (NoFilter)     alpgen-jimmy   40000          8.8469 · 10−4   1
→ e+e−bb + 3p (NoFilter)     alpgen-jimmy   10000          3.9393 · 10−4   1
→ µ+µ−bb + 0p (NoFilter)     alpgen-jimmy   149950         6.5650 · 10−3   1
→ µ+µ−bb + 1p (NoFilter)     alpgen-jimmy   100000         2.4782 · 10−3   1
→ µ+µ−bb + 2p (NoFilter)     alpgen-jimmy   40000          8.8620 · 10−4   1
→ µ+µ−bb + 3p (NoFilter)     alpgen-jimmy   9999           3.9149 · 10−4   1
Z inclusive
→ e+e− + 0p (pT = 20 GeV)    alpgen-jimmy   6615302        6.6960 · 10−1   1
→ e+e− + 1p (pT = 20 GeV)    alpgen-jimmy   1333903        1.3452 · 10−1   1
→ e+e− + 2p (pT = 20 GeV)    alpgen-jimmy   404999         4.0706 · 10−2   1
→ e+e− + 3p (pT = 20 GeV)    alpgen-jimmy   110000         1.1262 · 10−2   1
→ e+e− + 4p (pT = 20 GeV)    alpgen-jimmy   30000          2.8447 · 10−3   1
→ e+e− + 5p (pT = 20 GeV)    alpgen-jimmy   10000          7.5691 · 10−4   1
→ µ+µ− + 0p (pT = 20 GeV)    alpgen-jimmy   6614248        6.6956 · 10−1   1
→ µ+µ− + 1p (pT = 20 GeV)    alpgen-jimmy   1334296        1.3455 · 10−1   1
→ µ+µ− + 2p (pT = 20 GeV)    alpgen-jimmy   403253         4.0642 · 10−2   1
→ µ+µ− + 3p (pT = 20 GeV)    alpgen-jimmy   110000         1.1279 · 10−2   1
→ µ+µ− + 4p (pT = 20 GeV)    alpgen-jimmy   30000          2.8370 · 10−3   1
→ µ+µ− + 5p (pT = 20 GeV)    alpgen-jimmy   10000          7.6123 · 10−4   1
→ τ+τ− + 0p (pT = 20 GeV)    alpgen-jimmy   10609203       6.6955 · 10−1   1
→ τ+τ− + 1p (pT = 20 GeV)    alpgen-jimmy   3332443        1.3466 · 10−1   1
→ τ+τ− + 2p (pT = 20 GeV)    alpgen-jimmy   1004847        4.0647 · 10−2   1
→ τ+τ− + 3p (pT = 20 GeV)    alpgen-jimmy   509847         1.1256 · 10−2   1
→ τ+τ− + 4p (pT = 20 GeV)    alpgen-jimmy   144999         2.8440 · 10−3   1
→ τ+τ− + 5p (pT = 20 GeV)    alpgen-jimmy   45000          7.5770 · 10−4   1

Table 3.2: Background samples along with their Monte Carlo generator, their total number of events, filter efficiencies and the LO cross section times branching ratio values at √s = 7 TeV, except for tt, which is NLO. Note that in this table l = e, µ, τ . Exclusive channels were used for the Zbb background, with the number of additional partons, “p”, listed above.
3.2 Pileup Reweighting
As discussed in Section 2.1, at the LHC protons will collide every 25 ns at a
design instantaneous luminosity of 10³⁴ cm−2 s−1. In each recorded event, apart
from the hard scattering interaction, on average 23 minimum bias proton-proton
interactions, varying according to a Poisson distribution, will be present. These
interactions contaminate the event of interest with additional charged tracks in
the inner detector and constitute a considerable background. The challenge for
ATLAS is then to understand which tracks and energy deposits to attribute to
which interaction. This phenomenon is called “pileup”. Pileup is distinct from
the “underlying event” in that it describes additional proton-proton interactions,
rather than additional parton interactions originating from the same
proton-proton collision.
On top of this so-called “in-time pileup”, which refers to the additional min-
imum bias collisions “piled up” in each bunch crossing, comes concern about
“out-of-time pileup”, which refers to events from successive bunch crossings.
Eventually, the LHC will carry 2808 proton bunches per orbit, making them
exceptionally close in space and time (just 25 ns apart). This is faster than the
read-out response of many of the ATLAS sub-detectors, making the detector
sensitive to events from many bunch crossings. As a result the out-of-time pile-
up occurs because the signal from the calorimeter cells is integrated over a time
window larger than the time spread between two proton collisions.
The actual LHC bunch structure, however, is more complicated than this. The
bunches are arranged in “trains” of varying length, partially dependent on the
spacing between individual bunches. In 2011, the LHC ran with 75 ns spacing
between bunches for ATLAS data periods B-D, then switched to 50 ns spacing
for periods thereafter. With the 50 ns spacing, trains extend up to 144 bunches
in length. In addition to the finite train length, there are also bunch-to-bunch
variations in intensity, so different bunches within a train have varying
luminosity. The evolution of the machine parameters over time results in variations of the
number of interactions occurring per bunch crossing and in the distance between
consecutive bunches. Figures 3.4(a) and 3.4(b) show the luminosity recorded
versus the average number of interactions per bunch crossing per group of pe-
riod and for each individual period.
The measured luminosity is multiplied by the inelastic cross section σ_pp^inelastic
to obtain 〈µ〉, the average number of interactions per bunch crossing, which is
typically used to characterize the average amount of pileup. Even though 〈µ〉
varies from bunch to bunch, details about previous bunch crossings are not
available for an event of interest, so out-of-time pileup must be estimated
on average. As 〈µ〉 comes from the luminosity measurement, it is available for
each luminosity block (LB), which is the smallest portion of data for which
the luminosity is determined. Within a LB, 〈µ〉 can either be averaged over
all bunches (〈µ〉|LB,BCID, where BCID is the bunch crossing ID) or calculated
separately for each bunch (〈µ〉|LB(BCID), averaged across the LB).
Usually, Monte Carlo samples are produced before or during a given data taking
period, so only a best guess of the data pileup conditions can be put into
the Monte Carlo. Thus, at the analysis level, there is the need to
Figure 3.4: (a) Integrated luminosity for data periods B-D, E-H, I-K and L-M versus the average number of interactions per bunch crossing. (b) Integrated luminosity per data period, B to M, versus the average number of interactions per bunch crossing. (c) Distribution of the average number of interactions per bunch crossing for MC11b and for periods B-D, E-H, I-K and L-M. Figures taken from [37].
reweight the Monte Carlo pileup conditions to those found in the recorded data.
In this analysis a reweighting dependent on the distribution 〈µ〉|LB,BCID is
applied to the Monte Carlo, and Figure 3.4(c) shows the average number of
interactions per bunch crossing for the different periods of the Monte Carlo
simulating the data periods. It has been verified that the reweighted Monte Carlo
correctly reproduces the different data periods.
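The reweighting described above amounts to a per-bin ratio of normalised ⟨µ⟩ distributions, applied event by event to the Monte Carlo. A minimal sketch of the idea, with toy Poisson distributions in place of the real ⟨µ⟩ histograms:

```python
# Sketch of pileup reweighting: each MC event receives a weight so that the
# simulated <mu> distribution matches the one in data. The histograms below
# are toy Poisson distributions, not the actual 2011 ones.
import numpy as np

bins = np.arange(0, 21)                        # <mu> bin edges 0..20
mc_mu   = np.random.default_rng(0).poisson(8.0,  100_000)
data_mu = np.random.default_rng(1).poisson(11.0, 100_000)

mc_hist,   _ = np.histogram(mc_mu,   bins=bins, density=True)
data_hist, _ = np.histogram(data_mu, bins=bins, density=True)

# Per-bin weight = data fraction / MC fraction (guarding against empty MC bins).
weights = np.divide(data_hist, mc_hist,
                    out=np.zeros_like(data_hist), where=mc_hist > 0)

# Weight assigned to each MC event according to its <mu> value.
event_weights = weights[np.clip(mc_mu, 0, len(weights) - 1)]
```

After reweighting, the weighted mean of the MC ⟨µ⟩ distribution approaches the data mean, which is the behaviour verified in the text for the real samples.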
3.3 Lepton Reconstruction and Identification
Lepton identification and reconstruction is of particular importance for the
H → 4l channel. Electron candidates consist of electromagnetic clusters to
which inner detector (ID) tracks are matched in a window between the clus-
ter position and the extrapolated track. The electron transverse energy (ET )
is computed from the cluster energy and the track direction at the interac-
tion point. The baseline electron identification in ATLAS relies on cuts using
variables that provide good separation between isolated electrons and jets, as
detailed in Section 2.2.9. These variables include calorimeter, tracker and
combined calorimeter/tracker information. Cuts on them can be applied
independently, and three new reference selections with increasing background
rejection power have been defined for Release 17 of the data, on which the
current analysis is based: loose++, medium++ and tight++. In general, the
++ menu offers a better balanced performance (efficiency/rejection) than
the standard menu (loose, medium, tight).
Shower shape variables of the first and second calorimeter layers, hadronic
leakage variables, track quality and the ∆η between the extrapolated track and the
cluster are used in the loose++ selection. The loose++ requirement thus adds
cuts with respect to the standard loose operating point, but applies them in a
looser way (standard loose cuts on the shower shape variables at the same values
as medium and tight). In particular, the variables used for the cuts are [52]:
• Shower shapes:
– el reta: the ratio in η of cell energy in 3× 7 versus 7× 7 cells (see
Table 2.5);
– rHad: the ratio between the transverse energy ET leakage in the
hadronic calorimeter and ET of the electromagnetic cluster;
– rHad1: the ratio between ET leakage in the first sampling of the
hadronic calorimeter and ET of the electromagnetic cluster;
– el weta2: the lateral width in a 3×5 window, defined as the energy-weighted
η variance of the cluster³;
– el wstot: the total shower width;
– el f1: the ratio between ET in the first sampling and the cluster
energy.
• Number of hits in the pixel detector and in the SCT: the number of pixel
hits plus the number of pixel outliers is required to be ≥ 1, while the
number of silicon hits (pixel hits plus SCT hits) plus the number of pixel
and SCT outliers is required to be ≥ 7.
• Loose track-cluster matching in η: el deltaeta1, the ∆η between the track
extrapolated to the first calorimeter sampling and the cluster, is required
to be less than 0.015.
• DEmaxs1: (el emaxs1 − el Emax2)/(el emaxs1 + el Emax2), where el emaxs1
is the maximum energy in the strips and el Emax2 is the second maximum
in the strips.
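These discriminants are simple ratios of calorimeter quantities; a hypothetical sketch (the function names mirror the variables above, but the input numbers are invented for illustration):

```python
# Hypothetical sketch of the ratio variables used in the loose++ selection.
# Input values are invented; names follow the variable list above.

def reta(e_3x7, e_7x7):
    """el_reta: cell energy in 3x7 cells over 7x7 cells (second layer)."""
    return e_3x7 / e_7x7

def rhad(et_had_leak, et_cluster):
    """rHad: hadronic-calorimeter ET leakage over the cluster ET."""
    return et_had_leak / et_cluster

def demaxs1(emax1, emax2):
    """DEmaxs1: (Emax1 - Emax2) / (Emax1 + Emax2) in the strip layer,
    with Emax1/Emax2 the highest and second-highest strip energies."""
    return (emax1 - emax2) / (emax1 + emax2)

# A genuine electron gives a narrow shower: reta close to 1, rhad close to 0
# and a single dominant strip maximum (demaxs1 close to 1).
print(reta(45.0, 48.0))    # ~0.94
print(rhad(0.3, 40.0))     # ~0.0075
print(demaxs1(20.0, 1.0))  # ~0.90
```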
The medium++ selection requires one B-layer hit (if the module is not dead) and
adds extra selections on the impact parameter of the matched track and on the
ratio of high-threshold TRT hits. The medium++ menu offers efficiencies a few
percent lower than those of medium, but with background rejections closer to
those of tight.
The tight++ selection adds requirements on E/p (ratio of the cluster energy
to the track momentum), on the ∆φ between the extrapolated track and the
cluster, and on the number of TRT hits and also checks for overlaps with re-
constructed photon conversions. No substantial gains are made in tight++ over
tight. Tight++ offers slightly better efficiency (1-2%) in most bins with slightly
better rejection.
Muons are identified, as described in Section 2.2.10, by the reconstruction of
tracks in the muon spectrometer alone (standalone), by the combined fit of
inner detector and muon spectrometer tracks (combined), by matching an inner
detector track of sufficient momentum with a reconstructed track segment in
the muon spectrometer (segment tagged), or with energy deposits in the
calorimeters compatible with the hypothesis of a minimum ionizing particle
(calorimeter tagged). Throughout this analysis, the loose++ electron selection
and the combined or segment tagged muon selection are used unless explicitly
stated otherwise.
³wη2 = √[ (∑i Ei · ηi²)/(∑i Ei) − ((∑i Ei · ηi)/(∑i Ei))² ].
3.3.1 GSF Electrons
In the standard e/γ reconstruction, all tracks in the inner detector are fitted
using the pion hypothesis; this means that the algorithm does not allow
for any energy loss along the track. As a result, the track momentum is
underestimated and the track parameters, especially in the bending plane, are
not optimal. In fact, electrons in ATLAS lose on average between 20% and
50% of their energy (depending on |η|) by the time they have left the SCT. The
bremsstrahlung emission introduces, in general, non-Gaussian contributions to
the event-by-event fluctuations of the calorimetry and tracking measurements.
By fitting electron tracks in such a way as to allow for proper modeling of the
energy loss due to bremsstrahlung, it is possible to improve the reconstructed
track parameters.
In this Higgs search analysis the Gaussian-sum filter (GSF) was used in order
to account for energy losses due to bremsstrahlung. The GSF is a non-linear
generalization of the Kalman filter⁴, which takes non-Gaussian noise into
account by modeling it as a weighted sum of Gaussian components, and therefore
acts as a weighted sum of Kalman filters operating in parallel. By allowing
for changes in the curvature of the track, the bremsstrahlung recovery
algorithm follows the track better and correctly associates more of the hits.
In this work, a dedicated algorithm (egammaBremRec) has been used in order to
re-process the electrons. This algorithm can only use the existing e/γ clusters
and the available track particles: it does not “recover” electrons but only
performs the “refit”, possibly changing the best match between track and
cluster and providing better track parameters; it is also expected to reduce
the charge misidentification rate.
3.4 Lepton Corrections
Various corrections, provided by the e/γ group for electrons and by the muon
combined performance (MCP) group for muons, are applied to the data and
Monte Carlo samples. In the following these corrections are only listed; for a
more detailed description see the documentation provided by the group experts.
• Energy scale corrections must be applied to the data only, and just for
electrons. The tool used corrects the electromagnetic cluster energy by
applying the energy scales obtained from resonances such as Z → ee and
J/ψ → ee, or from E/p studies using isolated electrons from W → eν.
The code is trivial and simply rescales the energy of the electromagnetic
cluster in certain η/φ bins using the formula Ecorr = E/(1 + scale).

⁴The Kalman filter is a recursive estimator: only the estimated state from the previous timestep and the current measurement are needed to compute the estimate for the current state.
• Since the Monte Carlo samples do not reproduce the lepton momentum
resolution in data, by default a smearing procedure is applied both to the
electron ET and to the muon pT in Monte Carlo.
In particular muon momentum smearing is performed separately on the
inner detector and muon spectrometer tracks of the muon. The momenta
of calorimeter tagged and MS-segment tagged muons are the momenta of
the associated inner detector tracks. Therefore the momentum smearing
of the inner detector momenta has to be applied to the calorimeter or seg-
ment tagged muons. The momenta of standalone muons are based on the
muon spectrometer momentum measurement. Hence the MS momentum
smearing must be applied to standalone muons. In the current analysis
the smearing correction is applied to the q/pT distribution, where q is the
charge of the lepton, instead of to the simple pT one.
• The reconstruction scale factor (SF) is the ratio of the measured
reconstruction efficiency in data to that in Monte Carlo and is used to correct
the Monte Carlo to better model the observed data. In general, this
tons. For muons, this is largely due to mismeasurements in the transition
region of the muon spectrometer, and for the electrons, this is largely due
to problems in electron identification. Therefore for electrons a recon-
struction SF, including track quality requirement, and an identification
efficiency SF are provided by the e/γ group. These two scale factors have
to be multiplied and the associated errors should be added quadratically.
It should be stressed that the energy corrections and smearing functions are
applied at the beginning of the analysis, while the scale factors are applied at the end.
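A schematic of the corrections and of the scale-factor combination described above (a sketch with invented numbers; the real corrections are provided by the e/γ and MCP group tools):

```python
# Sketch of the lepton corrections listed above: energy scale (data only),
# resolution smearing (MC only), and reco x ID scale factors (MC, at the end).
# All numerical values here are illustrative, not the official corrections.
import math
import random

def correct_electron_energy(e_data, scale):
    # Energy scale correction, applied to data only: E_corr = E / (1 + scale).
    return e_data / (1.0 + scale)

def smear_mc_energy(e_mc, resolution_fudge, rng):
    # MC-only Gaussian smearing so the simulated resolution matches data;
    # resolution_fudge is an invented relative width, not a real constant.
    return e_mc * (1.0 + rng.gauss(0.0, resolution_fudge))

def combined_scale_factor(sf_reco, err_reco, sf_id, err_id):
    # Reconstruction and identification SFs multiply; for a product, the
    # relative errors are added in quadrature.
    sf = sf_reco * sf_id
    err = sf * math.hypot(err_reco / sf_reco, err_id / sf_id)
    return sf, err

rng = random.Random(42)
e_corr = correct_electron_energy(100.0, 0.01)   # data: 100 GeV -> ~99 GeV
e_smeared = smear_mc_energy(100.0, 0.01, rng)   # MC: smeared around 100 GeV
sf, err = combined_scale_factor(0.995, 0.005, 0.98, 0.01)
print(e_corr, sf, err)
```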
3.5 Event Selection
Having described the applied corrections, the cut-based analysis strategy can
now be discussed; it consists of a sequence of cuts designed to select Higgs
events. The event selection criteria for this study can be divided into five parts:
1. the Good Run List (GRL), larError, vertex and trigger cuts application;
2. the event preselection, which includes the basic kinematic requirements
on leptons;
3. the creation of quadruplet candidates and the application of the mass
dependent criteria for the selection of a Higgs boson candidate, which are
based on the invariant mass of each of the two di-leptons from the two Z
boson decays;
4. the criteria for the rejection of reducible backgrounds, which rely on over-
lap removal and on isolation and impact parameter properties of the lep-
tons;
5. the Higgs boson mass reconstruction.
3.5.1 Preliminary Cuts
At the beginning of the analysis there are four important requirements that
the events must satisfy in order to survive:
• The ATLAS Data Quality group monitors the functionality of the detector
and the quality of the taken data online, i.e. while data taking, as well as
offline. The data are required to satisfy a number of conditions ensuring
that all essential elements of the ATLAS detector - detectors, magnets,
trigger, etc. - were performing as expected while the data were collected
during LHC collisions. If there were problems of any kind, the corre-
sponding detector sub-system would get flagged accordingly. An analysis
depending on the proper functionality of one or more particular parts of
the detector can implement a list of run periods which were flagged “good”
for these sub-systems: a GoodRunsList (GRL). In this specific analysis the
events must satisfy the 4l GRL, which is applied only to data and separately
for each final state (4e, 4µ, 2e2µ) to maximize the integrated luminosity.
• In electron and photon analyses a selection has to be applied to reject bad
quality clusters or fake clusters originating from calorimeter problems.
In particular, events with noise bursts and data integrity errors in the
LAr calorimeter can be identified with the larError flag: its value is 0
for “OK” events, 1 for events with a noise burst and 2 for events with
data integrity errors. For Release 17 it is recommended to remove all
events with larError > 1.
• The events must pass the vertex cut, which requires at least one
reconstructed vertex with at least three associated tracks.
• The events must pass the trigger cut: for the present study the lowest-pT
Single-lepton triggers
Period   B-I             J                  K                  L-M
4µ       EF_mu18_MG      EF_mu18_MG_medium  EF_mu18_MG_medium  EF_mu18_MG_medium
4e       EF_e20_medium   EF_e20_medium      EF_e22_medium      EF_e22vh_medium1
2e2µ     4µ OR 4e

Di-lepton triggers
Period   B-I             J                  K                  L-M
4µ       EF_2mu10_loose  EF_2mu10_loose     EF_2mu10_loose     EF_2mu10_loose
4e       EF_2e12_medium  EF_2e12_medium     EF_2e12T_medium    EF_2e12Tvh_medium
2e2µ     4µ OR 4e

Table 3.3: Triggers used in data. In each data taking period, the OR of single- and di-lepton triggers is used to select each signature.

MC trigger according to the data taking period
4µ     EF_mu18_MG, EF_mu18_MG_medium OR EF_2mu10_loose
4e     EF_e20_medium, EF_e22_medium, EF_e22_medium1 OR EF_2e12_medium, EF_2e12T_medium
2e2µ   4µ OR 4e

Table 3.4: Triggers used in the Monte Carlo samples.
single- or di-lepton unprescaled⁵ triggers are considered. Single-lepton
triggers with thresholds of 20 GeV or 22 GeV for electrons, depending on the
LHC instantaneous luminosity, and of 18 GeV for muons, together with di-lepton
triggers with thresholds of 12 GeV for electrons and 10 GeV for muons, were
chosen. The list of triggers used is provided in Table 3.3 for data and in
Table 3.4 for the Monte Carlo, where the trigger matches the one unprescaled
during data taking. The efficiency of these triggers on signal events, with
respect to the offline selection, is close to 100% [37].
3.5.2 Event Preselection
Events passing the trigger selection are required to satisfy additional lepton
preselection criteria [38].
An electron must have been reconstructed with author 1 or 3 (see Section 2.2.9)
and must be identified as a loose++ GSF electron, as discussed in Section 3.3.
The kinematic requirements are:
• a pseudorapidity of the electromagnetic cluster, including the crack region,
of |ηCluster| < 2.47.
⁵A prescale is a random selection of events accepted by the trigger, used in order to reduce the rate of a given trigger signature, usually to cope with the limited bandwidth of event recording.
• a transverse energy of ET > 7 GeV (ET is computed from the
electromagnetic cluster energy and the track direction);
In 2011 data and Monte Carlo, the quality of the electron object has to be
checked using the Object Quality flag: it is required that el GSF OQ & 1446 == 0.
In addition, the electrons must have a longitudinal impact parameter with
respect to the primary vertex (the z value at the point of closest approach,
z0, see Figure 3.5) of less than 10 mm, to reduce the contribution from pileup
vertices. After these cuts, among electrons sharing the same inner detector
track, only the one with the highest cluster ET is kept. Finally, electrons
sharing the same inner detector track with a muon candidate within
∆R = √(∆φ² + ∆η²) < 0.02 are removed.
The muons must be identified as tight muons⁶ for the Muid algorithm, or must
have been reconstructed with author 6 or 7 if the Staco algorithm is considered.
The muons are selected by requiring:
• a transverse impact parameter relative to the primary vertex, defined as
the reconstructed vertex with the highest ∑pT² of associated tracks among
the reconstructed vertices with at least three associated tracks, of less
than 1 mm, to reject cosmic rays (|d0| < 1 mm);
• a pseudorapidity of |η| < 2.7⁷;
• a transverse momentum of pT > 7 GeV.
In order to select “high η” muons, in the region 2.5 < |η| < 2.7, a muon
identified as standalone is required to have hits in all three stations.
Otherwise, if the muon is identified as combined or segment tagged, the
following inner detector hit requirements are applied:
• A pixel B-layer hit on the muon track, unless the extrapolated muon track
passes through an uninstrumented or dead area of the B-layer.
• The number of pixel hits plus the number of crossed dead pixel sensors
> 1.
• The number of SCT hits plus the number of crossed dead SCT sensors
≥ 6.
⁶The classification of loose and tight muons depends on the level of calorimeter and tracker isolation of the candidate. The isolation in the calorimeter is based on the cell energies in a hollow cone of 0.1 < ∆R < 0.4. The tracker isolation is defined as the scalar sum of the transverse momenta of all tracks in a cone of ∆R < 0.4 around the muon track. The energies for both calorimeter and tracker isolation are required to be less than 2.5 GeV (4 GeV) for tight (loose) muons.
⁷The η requirement has been removed completely for combined and segment-tagged muons. The acceptance is limited by the acceptance of the muon spectrometer and the inner detector.
Figure 3.5: An illustration of track parameters in the transverse (a) and longitudinal (b) planes, expressed with respect to the origin of the detector and the primary vertex. These parameters can also be expressed with respect to the point of closest approach to the interaction vertex (primary vertex) or to the beam-spot, indicated by the superscripts “PV” and “BS”, respectively. d0 is the transverse impact parameter, i.e. the distance of closest approach of the trajectory to the origin of the detector in the transverse (x − y) plane; the point of closest approach is referred to as the perigee. z0 is the longitudinal impact parameter, i.e. the z coordinate of the trajectory at the perigee. φ0 is the angle of the trajectory in the transverse plane at the perigee. The polar angle, θ, is the angle with respect to the z axis made by the trajectory.
• The number of pixel holes plus the number of SCT holes < 3.
• A successful TRT extension where expected (i.e. in the η acceptance of
the TRT). An unsuccessful extension corresponds to either no TRT hit
associated, or a set of TRT hits associated as outliers. Therefore defining
n = n_TRT^hits + n_TRT^outliers, where n_TRT^hits denotes the number of
TRT hits on the muon track and n_TRT^outliers denotes the number of TRT
outliers on the muon track, the technical recommendation is:
– if |η| < 1.9, it is required that n > 5 and n_TRT^outliers < 0.9 n;
– if |η| ≥ 1.9 and n > 5, it is required that n_TRT^outliers < 0.9 n.
Finally, these selected muons must fulfill the condition |z0| < 10 mm, a
conservative choice offering protection against discarding physics information
while keeping the pileup contribution as small as possible.
In the final stage of the event preselection described above, the event is
required to contain at least four selected leptons (4µ, 4e or 2e2µ).
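The hit requirements listed above can be expressed as a single predicate; a hypothetical sketch (the argument names are invented for illustration, not actual ntuple branches):

```python
# Hypothetical check of the inner-detector hit requirements listed above for
# combined / segment-tagged muons; argument names are invented.

def passes_id_hits(blayer_hits, blayer_dead, pix_hits, pix_dead,
                   sct_hits, sct_dead, pix_holes, sct_holes,
                   eta, n_trt_hits, n_trt_outliers):
    if blayer_hits < 1 and not blayer_dead:
        return False                 # B-layer hit unless the module is dead
    if pix_hits + pix_dead <= 1:
        return False                 # pixel hits + crossed dead sensors > 1
    if sct_hits + sct_dead < 6:
        return False                 # SCT hits + crossed dead sensors >= 6
    if pix_holes + sct_holes >= 3:
        return False                 # pixel holes + SCT holes < 3
    n = n_trt_hits + n_trt_outliers  # TRT extension where expected
    if abs(eta) < 1.9:
        return n > 5 and n_trt_outliers < 0.9 * n
    return (n_trt_outliers < 0.9 * n) if n > 5 else True

# A well-reconstructed central muon passes all requirements:
print(passes_id_hits(1, False, 3, 0, 8, 0, 0, 1, 0.5, 30, 2))  # True
```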
3.5.3 Quadruplet Candidates and Higgs Candidate Selection
After the selection cuts, the candidate quadruplets are formed by select-
ing two same flavour (SF) and opposite sign (OS) lepton pairs with at least
two of these leptons having pT > 20 GeV in order to suppress the reducible
backgrounds, because leptons from on-shell Z bosons are expected to have con-
siderable transverse momentum, while those from the reducible backgrounds are
usually softer. Within a quadruplet, the SFOS di-lepton pair with a mass m12
closest to the nominal Z-boson mass is considered the primary di-lepton, while
the second di-lepton pair of the quadruplet with a mass m34 is the sub-leading
one. The physical argument for this choice comes from the Breit-Wigner distri-
bution: the Z is more likely to be found near the pole mass and this method
selects the correct candidates in more than 90% of the cases. The analysis is
split into four final states, 2µ2µ, 2e2µ, 2µ2e and 2e2e, where the primary
di-lepton is mentioned first. For the 2e2µ and 2µ2e channels it is also required
that mee ≥ 15 GeV and mµµ ≥ 15 GeV.
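The pairing logic can be sketched as follows (a minimal illustration with massless four-vectors and invented kinematics; a real analysis uses the full reconstructed four-momenta):

```python
# Sketch of the di-lepton pairing described above: among all ways of splitting
# the four leptons into two same-flavour, opposite-sign (SFOS) pairs, the pair
# with invariant mass closest to the nominal Z mass defines m12; the other
# pair defines m34. Lepton kinematics below are invented for illustration.
import math
from itertools import combinations

M_Z = 91.1876  # GeV, nominal Z boson mass

def inv_mass(pair):
    e  = sum(l["E"]  for l in pair)
    px = sum(l["px"] for l in pair)
    py = sum(l["py"] for l in pair)
    pz = sum(l["pz"] for l in pair)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def sfos(l1, l2):
    return l1["flav"] == l2["flav"] and l1["q"] * l2["q"] < 0

def build_quadruplet(leps):
    """Return (m12, m34) with m12 from the SFOS pair closest to M_Z."""
    best = None
    for i, j in combinations(range(4), 2):
        k, m = [x for x in range(4) if x not in (i, j)]
        p12, p34 = (leps[i], leps[j]), (leps[k], leps[m])
        if not (sfos(*p12) and sfos(*p34)):
            continue
        m12, m34 = inv_mass(p12), inv_mass(p34)
        if best is None or abs(m12 - M_Z) < abs(best[0] - M_Z):
            best = (m12, m34)
    return best

def mu(e, px, py, pz, q):
    return {"E": e, "px": px, "py": py, "pz": pz, "flav": "mu", "q": q}

# A toy 4mu event: one on-shell pair (m = 91.2 GeV), one off-shell (m = 40 GeV).
leps = [mu(45.6, 0.0, 0.0,  45.6, +1), mu(45.6, 0.0, 0.0, -45.6, -1),
        mu(20.0,  20.0, 0.0, 0.0, +1), mu(20.0, -20.0, 0.0, 0.0, -1)]
m12, m34 = build_quadruplet(leps)
print(round(m12, 1), round(m34, 1))  # -> 91.2 40.0
```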
Physics analyses are usually interested in knowing whether a reconstructed
offline object caused the considered trigger chain to fire, i.e. whether it
matches a trigger object passing the trigger. It is therefore important to
determine the relationship between objects produced by the offline
reconstruction and objects produced by the trigger algorithms (L1, L2 and EF),
in order to easily map between reconstructed and trigger objects. Thus, before
the mass requirement on the leading di-lepton, it is necessary to check the
trigger matching: the Higgs candidate must match the event trigger; in
particular, at least one (for single-lepton triggers) or two (for di-lepton
triggers) of the leptons in the quadruplet are required to match the trigger object.
Once this check is passed, a mass window requirement is applied to the invariant mass of each of the two di-lepton pairs of the quadruplet. The cut values are chosen event-by-event using the reconstructed four-lepton invariant mass: m12 is required to be within 15 GeV of the nominal Z mass, while m34 is required to exceed a threshold, mthreshold, which varies as a function of the four-lepton invariant mass, m4l, and must always be below 115 GeV. A set of threshold cut values is shown in Table 3.5; the actual cut value used for any other reconstructed Higgs mass is obtained by linear interpolation between these mass points.
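The event-by-event threshold lookup can be sketched as a piecewise-linear interpolation between the reference points of Table 3.5 (a minimal illustration, not the actual ATLAS analysis code):

```python
import bisect

# Reference points from Table 3.5: m4l [GeV] -> m34 threshold [GeV]
M4L_PTS = [120.0, 130.0, 140.0, 150.0, 160.0, 165.0, 180.0, 190.0, 200.0]
THR_PTS = [15.0, 20.0, 25.0, 30.0, 30.0, 35.0, 40.0, 50.0, 60.0]

def m34_threshold(m4l):
    """Linearly interpolated m34 threshold; constant below 120 GeV and
    above 200 GeV, as in Table 3.5."""
    if m4l <= M4L_PTS[0]:
        return THR_PTS[0]
    if m4l >= M4L_PTS[-1]:
        return THR_PTS[-1]
    i = bisect.bisect_right(M4L_PTS, m4l)
    x0, x1, y0, y1 = M4L_PTS[i - 1], M4L_PTS[i], THR_PTS[i - 1], THR_PTS[i]
    return y0 + (y1 - y0) * (m4l - x0) / (x1 - x0)

def passes_m34_window(m34, m4l):
    """The sub-leading pair must satisfy m_threshold < m34 < 115 GeV."""
    return m34_threshold(m4l) < m34 < 115.0
```

For example, a candidate with m4l = 125 GeV gets a threshold of 17.5 GeV, halfway between the 120 GeV and 130 GeV reference points.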
Finally, the four leptons of the quadruplet are required to be well separated, min[∆R(li, lj)] > 0.10, a requirement that will be justified in Section 3.5.4. When more than one quadruplet is found, the one with the primary di-lepton mass closest to the nominal Z mass, and with the highest-pT leptons associated to the second Z,
m4l [GeV]         ≤ 120   130   140   150   160   165   180   190   ≥ 200
mthreshold [GeV]     15    20    25    30    30    35    40    50     60

Table 3.5: Summary of thresholds applied to m34 for reference values of m4l, the reconstructed four-lepton invariant mass. For other m4l values, the selection requirement is obtained via linear interpolation.
which corresponds to the highest off-shell Z mass, is chosen. In this way only
one lepton quadruplet is selected for each event.
3.5.4 Reducible Background Rejection
Reducible background processes require additional lepton criteria to further decrease their contributions, since their cross sections are larger than that of the Standard Model Higgs boson.
Further discrimination can be achieved since the leptons originating from Z boson decays are expected to be significantly more isolated than those originating from the leptonic decays of heavy quarks.
Leptons from Z boson decays are also expected to originate from the main
interaction point, while the leptons from b and c quarks should come from sec-
ondary displaced vertices. In the following, discriminators based on the above
discussed properties of the leptons will be studied in more detail. The discrimi-
nant variable cuts, adopted in the analysis, have been optimized by the ATLAS
Collaboration using the expected distributions for signal and backgrounds.
3.5.4.1 Lepton Isolation
In order to provide a strong suppression of the main reducible backgrounds,
first of all it has been required, as already seen, that the minimum value of ∆R
of leptons in the quadruplet satisfies: min[∆R(li, lj)] > 0.10; then calorimetric
and track-based isolation criteria have both been imposed on each muon and
electron8. Even though the two quantities are physically correlated, they carry
statistically uncorrelated information as they are measured in different parts
of the detector. Their combination can therefore improve the rejection of the
reducible backgrounds.
The track isolation discriminant is defined as the sum of the transverse mo-
menta of the inner detector tracks in a cone of radius ∆R < 0.20 around the
lepton, normalized to the lepton pT (ΣpT/pT). The summed tracks are of good quality and pass a minimum pT cut: the considered tracks have at least four silicon hits and pT > 1 GeV [37], so that no significant bias from pileup
8Although partial calorimetric isolation along the η direction is already part of the electron-id requirement, an extra calorimetric isolation is applied.
interactions is introduced into the track isolation estimate by these requirements. After having defined the requirements for inclusion of an inner
detector track in the isolation cone, it can be noted that in the Higgs analysis
each lepton is required to have a normalized track isolation less than 0.15. The
inner detector track corresponding to the lepton of interest is excluded from
the sum. Moreover, care is taken to exclude possible contributions to the lepton isolation variables originating from overlap with other leptons of the Higgs candidate quadruplet: the contribution of overlapping leptons is removed for ∆R < 0.20, i.e. the pT of the leptons entering this cone is subtracted from the isolation sum of the lepton of interest.
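The track-isolation computation described above can be sketched as follows; the dictionary field names are hypothetical, while the quality cuts are those quoted in the text (pT > 1 GeV, at least four silicon hits, cone ∆R < 0.20, overlap removal for the other quadruplet leptons):

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with delta-phi wrapped into (-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)

def track_isolation(lepton, tracks, quad_leptons, cone=0.20, cut=0.15):
    """Normalized track isolation: sum the pT of good ID tracks inside the
    cone, excluding the lepton's own track and the tracks of the other
    quadruplet leptons (overlap removal), then divide by the lepton pT.
    Returns (isolation, passes_cut)."""
    iso = 0.0
    for t in tracks:
        if t is lepton['track'] or any(t is l['track'] for l in quad_leptons):
            continue  # own track / overlap removal
        if t['pt'] < 1.0 or t['si_hits'] < 4:
            continue  # track quality: pT > 1 GeV, >= 4 silicon hits
        if delta_r(lepton['eta'], lepton['phi'], t['eta'], t['phi']) < cone:
            iso += t['pt']
    iso /= lepton['pt']
    return iso, iso < cut
```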
The calorimetric isolation is defined in a similar manner, summing the trans-
verse energy ET deposited in the calorimeter cells inside the isolation cone
around the lepton and normalizing to the muon pT or electron ET. The calorimeter cells which are expected to receive an energy deposit from the lepton itself (in the case of electrons, the cells containing the electromagnetic shower) are excluded from the sum. All leptons of the quadruplet must have a normalized calorimetric isolation, inside a cone of ∆R < 0.20, less than 0.30. The
contribution of overlapping leptons is removed for ∆R < 0.18. This strategy is
now implemented only for electrons. When multiple electrons are present in the
final state a performance degradation of the calorimetric isolation is observed
owing to events where the electromagnetic shower of an electron in the Higgs
quadruplet coincides with the isolation cone of another signal electron (or muon
in the 2e2µ channel). Thus, if the angular distance between two electrons (or
between a muon and an electron) in the Higgs candidate quadruplet is smaller
than 0.18, the ET of the former is subtracted from the total isolation energy of the latter (in the 2e2µ channel the latter is always the muon).
Besides, the lepton energy is corrected to remove pileup effects before using it
to compute the normalized calorimetric isolation contribution. Indeed at a lu-
minosity L > 1033 cm−2s−1, the pileup effect has to be considered. It is known
that the track isolation is not affected, whereas the calorimetric isolation has
been chosen in order to minimize the pileup effects [37]. The chosen cone size
of ∆R = 0.20, for both track and calorimetric isolation, is found to have the
optimum performance in the desired signal efficiency region and, in addition, it
is expected to have less effect from pileup with respect to wider cones.
In general, the EtconeXX calorimeter isolation variables are calculated by tak-
ing a simple sum of calorimeter cell energies inside of a cone of a certain radius
around the cluster barycenter (∆R < 0.XX), excluding a 5× 7 grid of cells in
(η, φ) in the center of the cone; in this analysis the Etcone20 variable is used.
There are at least two effects that modify this value in unwanted ways [53]:
• An electron will leak some of its energy outside of this central core, and
will cause the isolation energy to grow as a function of ET .
• Soft energy deposits from pileup interactions will change the isolation
energy depending on the amount of activity in the current event (in-time
pileup) as well as previous events (out-of-time pileup).
Typically, in-time pileup increases the energy of the EtconeXX variables, while
out-of-time pileup actually tends to decrease it, as is discussed below. Both in-
time pileup and out-of-time pileup tend to increase the width of the EtconeXX
distributions. In-time pileup is fairly straightforward in its effects: particles
from other interactions in the same bunch crossing leave energy deposits in
the calorimeter, and when these fall within the cone used for the EtconeXX
variable, the observed energy increases. The larger the cone size used, the more
additional energy is caught. More energy deposits contributing to the isolation
sum increase the spread and thus the measured width.
To parameterize in-time pileup, tracking-related variables are typically used, as
the tracks found should be almost entirely from the bunch crossing of interest,
in particular in this analysis the number of reconstructed primary vertices with
at least two tracks associated (NPV ) is used. This gives a direct handle on the
number of additional interactions in the current bunch crossing, but does not
provide any information on the “hardness” of those interactions. In events with
many interactions, the vertex reconstruction may suffer and vertices may be
merged, leading to an underestimation of the number of additional interactions.
For the out-of-time pileup a typical handle is the average number of interactions
per bunch crossing, 〈µ〉, which was described above. Studies, carried out by the
experts, show that the isolation energy slowly decreases as the out-of-time pileup
increases, because it cancels the in-time pileup, and the larger cone sizes exhibit
a larger dependence on out-of-time pileup. For events with high NPV and low
〈µ〉, the peak of the EtconeXX distribution is higher, because in-time pileup
wins out. Conversely, for low NPV and high 〈µ〉, EtconeXX is low as out-of-time
pileup wins out. Looking at the width, it increases with both NPV and 〈µ〉, consistent with expectations.
To make a simple, first-pass correction for pileup effects, the slopes of EtconeXX
versus NPV were taken from fits. A simple linear calorimetric energy correction
is then parameterized in NPV both for electrons and muons, because adding 〈µ〉
information improves it only marginally9:

EtconeXX NPV corrected = EtconeXX − slope · NPV    (3.1)
After the corrections, the width of the EtconeXX variable distribution is slightly
improved, and because energy is subtracted off, the peak of the distribution is
shifted to smaller values.
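The first-pass pileup correction of Eq. (3.1) amounts to a linear fit of EtconeXX versus NPV followed by a subtraction; a minimal sketch (assuming a plain least-squares fit, not the actual e/γ group fitting machinery):

```python
def fit_slope(npv_vals, etcone_vals):
    """Least-squares slope of EtconeXX versus NPV (the 1D fit in the text)."""
    n = len(npv_vals)
    mx = sum(npv_vals) / n
    my = sum(etcone_vals) / n
    num = sum((x - mx) * (y - my) for x, y in zip(npv_vals, etcone_vals))
    den = sum((x - mx) ** 2 for x in npv_vals)
    return num / den

def etcone_npv_corrected(etcone, n_pv, slope):
    """Eq. (3.1): EtconeXX_NPV-corrected = EtconeXX - slope * NPV."""
    return etcone - slope * n_pv
```

Subtracting the fitted linear term removes the average in-time-pileup contribution event by event, which shifts the peak of the distribution to lower values and slightly narrows it, as described above.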
It should be stressed that in this analysis no further correction for the electron energy leakage outside the cone is applied, because the same results were obtained using the leakage-corrected EtconeXX variables and the EtconeXX variables with no corrections applied [53].
3.5.4.2 Lepton Impact Parameter
As previously said, leptons from Zbb and tt are expected to originate from
displaced vertices, so they will have a larger impact parameter. The approach adopted to reject the reducible background is to require that the impact parameter significance, defined as the impact parameter of the lepton normalized to its measurement error (|d0|/σd0), of all four leptons in the event does not exceed a predefined value. The impact parameter is the distance of closest approach in the transverse plane, calculated with respect to
the event vertex fitted using a set of tracks reconstructed in the ID. This allows the removal of the effect of the transverse spread of the vertex position, which at the LHC is 15 µm.
by the uncertainties entering the impact parameter estimation, i.e. the intrinsic
impact parameter resolution (18 µm) and the uncertainty in the primary vertex
position (10 µm). However, it is more likely that the lowest-pT leptons originate from the decay of b quarks. For this reason the adopted approach is to apply the impact parameter significance cut only to the two lowest-pT leptons10 of the quadruplet for m4l < 190 GeV; the significance is required to be < 3.5 for muons and < 6 for electrons. The difference is explained by the emission of bremsstrahlung
photons that limits the accuracy on the electron track reconstruction, indeed,
for electrons, bremsstrahlung smears the impact parameter distribution, hence
reducing the discriminating power of this cut with respect to muons [54].
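The impact-parameter significance cut can be sketched as follows (hypothetical field names; the cut values and the m4l condition are those quoted in the text):

```python
def passes_ip_cut(quadruplet, m4l):
    """Impact-parameter significance cut, applied (for m4l < 190 GeV) only
    to the two lowest-pT leptons of the quadruplet:
    |d0|/sigma_d0 < 3.5 for muons, < 6 for electrons."""
    if m4l >= 190.0:
        return True  # no requirement applied at high mass
    soft = sorted(quadruplet, key=lambda l: l['pt'])[:2]
    cut = {'mu': 3.5, 'e': 6.0}
    return all(abs(l['d0']) / l['sigma_d0'] < cut[l['flav']] for l in soft)
```

The looser electron cut reflects the bremsstrahlung smearing of the electron impact-parameter distribution discussed above.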
Both for electrons and muons the efficiencies for the isolation cut and for
9 2D parameterized corrections, using NPV and 〈µ〉, were also considered by the e/γ group experts: EtconeXX NPV corrected = EtconeXX − p1 · NPV − p2 · µ or EtconeXX NPV corrected = EtconeXX − (p1 + p2 · µ) · NPV. For both of these, however, p2 was found to be at least an order of magnitude smaller than p1, so the 1D correction based on NPV was adopted.
10 pT is considered for muons and ET for electrons.
the impact parameter cut are above 99% [37]. These efficiencies decrease by a few percent when moving to lower-ET electrons.
3.5.5 Higgs Boson Mass Reconstruction
The final discriminating variable is the mass of the lepton quadruplet. The four-lepton Higgs boson candidate mass reconstruction proceeds after a single lepton quadruplet has been selected in the event. For low Higgs masses, below 220 GeV, the mass resolution of the Higgs candidates directly affects the sensitivity of the Higgs searches. The resolution of the di-lepton mass can be improved by
applying a Z mass constraint to the pair with a mass closest to the Z invariant
mass. For Higgs boson masses of 200 GeV and above, when both Zs are on-
shell, the Z mass constraint can be applied to both lepton pairs. However for
Higgs masses larger than 230 GeV the Higgs natural width dominates over the
detector resolution and as a result the improvement in the Higgs mass resolution
is less important for the discovery potential. One additional advantage offered
by the Z mass constraint is the reduced sensitivity of the obtained resolution
on mis-calibrations and mis-alignments of the detector.
In the current application of the constraint fit, two particles are fitted using a
single constraint:
m²ll − m²Z = 0    (3.2)
Nevertheless, in the case of the Z mass constraint there is an additional complication due to the intrinsic width of the Z boson, which is of similar magnitude to the di-lepton mass resolution. Given the Breit-Wigner distribution of the Z boson, BW(m; mZ, ΓZ), neglecting interference effects, and the di-lepton mass resolution, described by a Gaussian distribution G(mReco; m, σm) centered at the generated di-lepton mass m, with σm equal to the experimental resolution (∼ 1.7 GeV), the observed reconstructed distribution for the di-lepton mass is given by the convolution of the two distributions
f(mReco) = ∫0^∞ BW(m; mZ, ΓZ) · G(mReco; m, σm) dm.    (3.3)
For a given reconstructed di-lepton mass the most likely mass at generation
level can be estimated by maximizing the quantity
L(m;mReco, σm,mZ ,ΓZ) = BW (m;mZ ,ΓZ) ·G(mReco;m,σm), (3.4)
and this most likely mass is the one used as the constraint in the fit. This mass, divided by the reconstructed di-lepton mass, provides a scale factor used
to rescale the lepton momenta, and these rescaled momenta are used to compute the Higgs mass. Finally, the Higgs candidate is accepted if the invariant mass of the quadruplet falls within ±2σ of the input mass.
It is noted that at low masses, due to phase-space suppression, the Z line shape could exhibit a stronger tail towards lower masses than predicted by the Breit-Wigner distribution; however, this effect should not significantly affect these results, since the leading di-lepton is required to be within 15 GeV of the Z mass. This procedure does not introduce significant biases in the mean mass. When electrons are included, the distribution has a significant tail towards low values due to bremsstrahlung upstream of the calorimeter. In the case of muons there is a small component, probably due to final-state radiation from Z decays; therefore in the 4µ channel the bias introduced by the Z mass constraint is negligible. In general the shift of the mean reconstructed mass as a function of the input value is well below the percent level. Moreover, it has been shown that the Z mass constraint improves the mass resolution by 10% to 17% [18].
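The per-pair constraint of Eqs. (3.2)-(3.4) can be sketched as a one-dimensional likelihood scan; this is an illustration assuming a non-relativistic Breit-Wigner kernel and a simple grid maximization, not the actual constrained kinematic fit used in the analysis:

```python
import math

MZ, GZ = 91.1876, 2.4952  # Z pole mass and width [GeV] (PDG values)

def most_likely_mass(m_reco, sigma_m=1.7, lo=50.0, hi=130.0, n=4000):
    """Maximize L(m) = BW(m; MZ, GZ) * Gauss(m_reco; m, sigma_m), Eq. (3.4),
    by a simple grid scan (non-relativistic BW used for illustration)."""
    def logl(m):
        log_bw = -math.log((m - MZ) ** 2 + (GZ / 2.0) ** 2)
        log_g = -0.5 * ((m_reco - m) / sigma_m) ** 2
        return log_bw + log_g
    grid = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return max(grid, key=logl)

def constrained_momenta(lepton_p4s, m_reco, sigma_m=1.7):
    """Rescale the di-lepton four-momenta by m_fit / m_reco, as in the text."""
    scale = most_likely_mass(m_reco, sigma_m) / m_reco
    return [tuple(scale * c for c in p4) for p4 in lepton_p4s]
```

The fitted mass is pulled from the reconstructed value towards the Z pole by an amount governed by the ratio of σm to ΓZ, which is precisely why the constraint sharpens the four-lepton mass peak.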
The full set of cuts performed in this analysis is summarized in Table 3.6.
3.6 Background Estimation
The dominant ZZ(∗) background is estimated using MC simulation. Gen-
erated events are required to pass the complete analysis selection and the final
yield is normalized to its theoretical cross section and to the integrated lumi-
nosity.
For the Z+jets and tt processes data-driven methods are used. The estimation
of the dominant reducible backgrounds can be extracted by selecting appropri-
ate control regions where no signal is expected.
The control sample is formed by selecting events with a pair of same-flavour,
opposite-sign isolated leptons consistent with the Z boson mass, |mZ−m12| < 15
GeV, and a second same-flavour, opposite-sign lepton pair where kinematic,
but no isolation or impact parameter, requirements are applied. The Zbb back-
ground dominates the Z + µµ sample, and the Z + light jets background dom-
inates in the Z + ee sample. The heavy flavour contribution in the Z + µµ
final state is estimated by subtracting from data the light jet component ob-
tained from measurements of the rate at which other particles are misidentified
as muons. The Z + light jets contribution in the Z + ee final state is estimated
by extrapolation, using MC simulation, from a background-dominated region
defined by inverting the electron identification requirement on the transverse
Preliminary cuts (4l):
– GRL and larError ≠ 2
– Vertex cut
– Trigger cut

Event Preselection
Muons:
– Combined or segment-tagged
– author == 6 or 7 for Staco; tight for Muid
– |d0| < 1 mm
– |η| < 2.7
– pT > 7 GeV
– ID hits requirements
– |z0| < 10 mm
GSF electrons:
– author == 1 or 3
– Loose++ quality
– |ηCluster| < 2.47
– ET > 7 GeV
– Object Quality requirement ((el GSF OQ & 1446) == 0)
– |z0| < 10 mm
– e − e overlap removal
– e − µ overlap removal

Event Selection
Kinematic cuts:
– At least 4 leptons
– At least one quadruplet of 2 pairs of SFOS leptons fulfilling the following requirements
– At least 2 leptons with pT > 20 GeV
– Trigger matching
– |mZ − m12| < 15 GeV
– mthreshold < m34 < 115 GeV as in Table 3.5
Isolation cuts:
– min[∆R(li, lj)] > 0.10 for all leptons in the quadruplet
– Track isolation: pT[cone20]/pT < 0.15, overlap removal for ∆R < 0.20
– Calo isolation: µ: ET[cone20]/pT < 0.30; e: ET[cone20]/ET < 0.30, overlap removal for ∆R < 0.18
IP cuts:
– Apply IP cut to the 2 lowest-pT leptons
– For µ: |d0|/σd0 < 3.5
– For e: |d0|/σd0 < 6
– For m4l > 190 GeV no requirement applied

Table 3.6: Summary of the event selection requirements for the analysis (SFOS pairs means pairs of same flavour opposite sign and “l” stands for lepton).
shower shape of the electromagnetic energy deposit11. Then these data-driven
backgrounds are extrapolated to the signal region by applying the efficiencies
found in MC simulation.
The normalization of the tt background, which also contributes substantially in
the Z + µµ final state, is verified using a control region of events containing
an opposite-sign electron-muon pair consistent with the Z boson mass and two
additional same-flavour leptons. In this control sample the observed events are
then compared to the tt expectation from Monte Carlo. The expected num-
bers of background events, with their systematic uncertainty, obtained by the
ATLAS Collaboration studies are summarized in Table 3.7 [37, 55].
3.7 Results
With the set of cuts described above, the reducible backgrounds are kept under control as desired: the contribution of inclusive Z, Zbb, and tt is well below the irreducible ZZ component, and the survival rate of WZ and gg2ZZ events is negligible. The signal is affected by the selection to a similar extent as the ZZ background, except for the Higgs mass window cut.
In 2011, as already said, LHC delivered an integrated luminosity of 5.6 fb−1
of pp collisions at 7 TeV center of mass energy. This outstanding performance
enabled the ATLAS experiment to collect and analyse an integrated luminosity
corresponding to 4.9 fb−1 of data fulfilling all quality requirements to search for
the Standard Model Higgs boson.
The search in the channel H → ZZ(∗) → 4l, here discussed, has been per-
formed for mH hypotheses in the full 110 GeV to 600 GeV mass range using
data corresponding to an integrated luminosity of 4.8 fb−1. The number of
events observed in each final state, evaluated separately for m4l < 180 GeV and
m4l ≥ 180 GeV, are compared with the expectations for background and signal
for various mH hypotheses by the ATLAS Collaboration studies. Their values
are summarized in Table 3.7. In total 71 candidate events are selected by the
analysis: 24 4µ, 30 2e2µ , and 17 4e events, while in the same mass range 62±9
events are expected from the background processes: 18.6 ± 2.8 4µ, 29.7 ± 4.5
2e2µ and 13.4± 2.0 4e. Figure 3.6 shows the m4l spectrum with superimposed
the total expected background and the Higgs signal expected from three mass
hypotheses.
The discovery of a SM Higgs boson can be claimed once the signal is considered
statistically significant, which means that it is unlikely to be reproduced by a
11 Rη, defined in Table 2.5, is required to be < 0.7, in order to reduce the contributions to Z + ee, in which the additional electrons can originate from photons, e.g. from π0 decays, or from heavy-quark mesons decaying semi-leptonically.
                        4µ                    2e2µ                   4e
                 Low-m4l   High-m4l    Low-m4l   High-m4l    Low-m4l   High-m4l
Int. Luminosity       4.8 fb−1              4.8 fb−1              4.9 fb−1
ZZ(∗)            2.1±0.3    16.3±2.4   2.8±0.6    25.2±3.8   1.2±0.3    10.4±1.5
Z+jets and tt    0.16±0.06  0.02±0.1   1.4±0.5    0.17±0.08  1.6±0.7    0.18±0.08
Total Background 2.2±0.3    16.3±2.4   4.3±0.8    25.4±3.8   2.8±0.8    10.6±1.5
Data             3          21         3          27         2          15
mH = 130 GeV          1.00±0.17             1.22±0.21             0.43±0.08
mH = 150 GeV          2.1±0.4               2.9±0.4               1.12±0.18
mH = 200 GeV          4.9±0.7               7.7±1.0               3.1±0.4
mH = 400 GeV          2.0±0.3               3.3±0.5               1.49±0.21
mH = 600 GeV          0.34±0.04             0.62±0.10             0.30±0.06

Table 3.7: The expected numbers of background events, with their systematic uncertainty, separated into “Low-m4l” (m4l < 180 GeV) and “High-m4l” (m4l ≥ 180 GeV) regions, compared to the observed numbers of events. The expectations for a Higgs boson signal for five different mH values are also given [55].
mere fluctuation of the background. Upper limits are set on the Higgs boson
production cross section at 95% C.L., using the CLs modified frequentist for-
malism with the profile likelihood test statistic. The test statistic is evaluated
with a maximum likelihood fit of signal and background models to the observed
m4l distribution. Figure 3.7(a) shows the expected and observed 95% C.L. cross
section upper limits calculated by the ATLAS Collaboration using ensembles of
simulated pseudo-experiments as a function of mH . The SM Higgs boson is ex-
cluded at 95% C.L. in the mass ranges 134 GeV−156 GeV, 182 GeV−233 GeV,
256 GeV−265 GeV and 268 GeV−415 GeV. The expected exclusion ranges are
136 GeV−157 GeV and 184 GeV−400 GeV [55].
The significance of an excess is given by the p0-value, the probability of an upward fluctuation of the background as high as or higher than the excess observed in data. The consistency of the observed results with the background-only hypothesis, expressed as p0-values, is shown in Figure 3.7(b) over the full
mass range of the analysis. The most significant upward deviations from the
background-only hypothesis are observed for mH = 125 GeV with a local p0-
value of 1.6% (2.1σ), mH = 244 GeV with a local p0-value of 1.3% (2.2σ) and
mH = 500 GeV with a p0-value of 1.8% (2.1σ). The median expected local
p0-values in the presence of a SM Higgs boson are 10.6% (1.3σ), 0.14% (3.0σ)
and 7.1% (1.5σ) for mH = 125 GeV, 244 GeV and 500 GeV, respectively.
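The conversion between a local p0-value and a significance in units of σ, used throughout this section, is the one-sided Gaussian-tail relation p0 = 1 − Φ(Z); a minimal sketch:

```python
from statistics import NormalDist

def p0_to_significance(p0):
    """Local significance Z such that p0 = 1 - Phi(Z) (one-sided tail)."""
    return NormalDist().inv_cdf(1.0 - p0)

def significance_to_p0(z):
    """Inverse mapping: one-sided tail probability for significance z."""
    return 1.0 - NormalDist().cdf(z)
```

For instance, p0 = 1.6% corresponds to about 2.1σ, matching the value quoted above for the 125 GeV excess.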
These values do not account for the so-called “look-elsewhere effect” (LEE),
which takes into account that such an excess (or a larger one) can appear any-
where in the search range as a result of an upward fluctuation of the background.
When considering the complete mass range of this search, the global p0-value for
each of the three excesses becomes of O(50%). Thus, once the look-elsewhere
effect is considered, none of the observed local excesses is significant by itself
[55].
Figure 3.6: m4l distribution of the selected candidates, compared to the background expectation for the full mass range of the analysis (√s = 7 TeV, ∫L dt = 4.8 fb−1). The signal expectation for mH = 150, 180 and 360 GeV is also shown. The resolution of the reconstructed Higgs mass is dominated by the detector resolution at low mH values and by the natural Higgs boson width at high mH.
3.7.1 Combined Results
It is also important to consider a preliminary combination of Standard Model
Higgs searches with the ATLAS experiment, using datasets corresponding to
integrated luminosities from 4.6 fb−1 to 4.9 fb−1 of pp collisions collected at √s = 7 TeV, presented in [56]. The individual channels contributing to the combination are reported in Table 3.8, while the combined exclusion plot is shown in Figure 3.8. The Higgs boson mass ranges from 110.0 GeV to 117.5 GeV, from 118.5 GeV to 122.5 GeV and from 129 GeV to 539 GeV are excluded at the 95% confidence level, while the expected Higgs boson mass exclusion in the absence of a signal ranges from 120 GeV to 555 GeV at the 95% C.L. or higher. An
exclusion of the Standard Model Higgs boson production cross section at the
99% C.L. is reached in the regions between 130 GeV and 486 GeV. An excess
of events is observed in the H → γγ and H → ZZ(∗) → 4l channels for a Higgs
boson mass hypothesis close to mH ∼ 126 GeV. The expected sensitivity in
terms of local significance for a 126 GeV Higgs boson for both of these channels
is approximately 1.4σ, while the observed local significances of the individual
excesses are 2.8σ and 2.1σ, respectively. The combined local significance of
the observed excess is 2.5σ, where the expected significance in the presence of
a Standard Model Higgs boson with mH = 126 GeV is 2.9σ. A preliminary
estimate of the global probability for such an excess to occur anywhere in the
full explored Higgs mass domain (from 110 GeV to 600 GeV) is approximately
30%, and in the range not excluded at the 99% confidence level by the LHC
combined Higgs boson search results (from 110 GeV to 146 GeV) it amounts to
approximately 10%.
Figure 3.7: Figure (a): the expected (dashed) and observed (full line) 95% C.L. upper limits on the Standard Model Higgs boson production cross section as a function of mH, divided by the expected SM Higgs boson cross section. The dark (green) and light (yellow) bands indicate the expected limits with ±1σ and ±2σ fluctuations, respectively. Figure (b): the observed local p0, the probability that the background fluctuates to the observed number of events or higher, is shown as the solid line. The dashed curve shows the expected median local p0 for the signal hypothesis when tested at mH. The two horizontal dashed lines indicate the p0 values corresponding to local significances of 2σ and 3σ. Figures are taken from [55].
Higgs Decay    Subsequent Decay    mH range [GeV]    L [fb−1]
H → γγ         −                   110−150           4.9
H → ZZ         ll l′l′             110−600           4.8
               llνν                200−280−600       4.7
               llqq                200−300−600       4.7
H → WW         lνlν                110−300−600       4.7
               lνqq′               300−600           4.7
H → τ+τ−       ll 4ν               110−150           4.7
               l τhad 3ν           110−150           4.7
               τhadτhad 2ν         110−150           4.7
V H → bb       Z → νν              110−130           4.6
               W → lν              110−130           4.7
               Z → ll              110−130           4.7

Table 3.8: Summary of the individual channels contributing to the combination of the Standard Model Higgs searches. The central number in the three-part mass ranges indicates the transition from low-mH to high-mH optimised event selections [56].
Figure 3.8: The observed (full line) and expected (dashed line) 95% C.L. combined upper limits on the SM Higgs boson production cross section divided by the Standard Model expectation as a function of mH in the full mass range considered in this analysis. The dotted curves show the median expected limit in the absence of a signal and the green and yellow bands indicate the corresponding 68% and 95% intervals [56].
Chapter 4

Angular Analysis and TMVA
The “golden channel” for the Higgs boson search is characterized by a good separation between signal and background, thanks to the good lepton identification and momentum resolution of the ATLAS detector. The traditional
search strategy using the golden channel focuses on measuring the invariant
mass spectrum of the four leptons. However, given that four-momenta of all
decay products can be reconstructed with sufficient resolution, it is possible to
measure more than just the total invariant mass of the four leptons. In fact,
there are a total of five angles that can be measured. Obviously it would be
advantageous to incorporate all available kinematic information when searching
for the Higgs boson. Additional kinematic variables can be included in an ex-
perimental measurement by multivariate analyses.
Angular correlations of Higgs decays in the golden channel have been studied
previously to determine the spin and CP properties of the putative Higgs reso-
nance. Recent works include the computation of the angular correlations of the
final state leptons resulting from the production of a resonance (with arbitrary
spin less than or equal to two) which in turn decays, via general couplings, to a
pair of Z bosons, which subsequently decay leptonically. The present work, in-
stead of comparing angular correlations for different spin and CP assumptions
for a singly produced resonance, aims at distinguishing the SM Higgs boson
signal from the dominant irreducible background qq → ZZ(∗) → 4l using the
TMVA, a Toolkit for Multivariate Data Analysis. This can be done along two different lines: first, since the Higgs is a scalar while the ZZ background is a mixture of different spin states, the decay angles of the two Zs and of the leptons are expected to have different distributions; second, due to the different production mechanisms of the Higgs boson and of the ZZ background, the transverse momentum pT of the four-lepton system
is expected to be different in the two cases.
In this work the possibility of using a similar set of variables to study the discrimination capability of ATLAS with angular variables in the four-lepton analysis has been considered. In the following, the angular distributions based on MC samples of signal and backgrounds are presented. Two Higgs masses (130 GeV and 360 GeV) have been considered, and the analysis has been limited to the ZZ background, neglecting the smaller contribution of other surviving background events. For the ZZ background, a low mass region (110 GeV < m4l < 150 GeV) has been considered as background for the 130 GeV Higgs signal, and a high mass region (300 GeV < m4l < 420 GeV) as background for the 360 GeV Higgs signal. In the following, after a definition of the angles and a description of how to compute them, the MC samples used are mentioned and the discrimination power using decay or production angles is discussed for reconstructed events. The same discrimination power is then evaluated exploiting multivariate analysis.
4.1 Angular Analysis
4.1.1 Kinematics
As noted above, in this study events have been considered in which two Z bosons are produced from the decay of a SM Higgs boson, produced either in the gluon fusion channel or in the vector boson fusion channel. Each Z boson, which can be either on or off the mass shell, decays to a lepton (l) and an anti-lepton (l̄). Events with additional particles in the final state are not considered; thus
the transverse momentum of the 4l system is assumed to be negligible. In other
words, only exclusive ZZ(∗) → 4l processes are considered.
In these events, the final state can be completely reconstructed. In general the kinematics can be specified in terms of two production angles of the ZZ(∗) system, one of which is irrelevant; four decay angles describing ZZ(∗) → 4l; and the invariant masses of the two Zs. In hadron colliders it is also necessary to know the momentum fractions of the initial massless partons, x1 and x2, in order to compute the differential cross sections. The following section describes the convention for the angles which specify an event and how to obtain them. In particular, Lorentz-invariant definitions of all angles are provided, allowing for their determination from four-momenta reconstructed in the laboratory (Lab) frame [57].
Figure 4.1: Figure (a): the two decay planes of Zi → li l̄i, i = 1, 2. The polar angles θi shown are defined in the rest frames of Zi with respect to k̂i, while the azimuthal angles shown are in fact 2π − φ1 = −φ1 and π − φ2. Figure (b): the coordinate system in the CM frame and the definition of the production angle Θ.
4.1.1.1 Definitions of Angles
Let p1 and p2 be the momenta of the lepton pair coming from Z1, and p3 and p4 the momenta of the lepton pair from Z2, while k1,2 are the momenta of Z1,2. The notation used is such that p1 = l1, p2 = l̄1, p3 = l2, p4 = l̄2, i.e. p1 is the momentum of the lepton from the Z1 decay, p2 the momentum of the anti-lepton from the Z1 decay, etc. The momenta of the incoming partons are denoted by kq and k_q̄. The total momentum of the ZZ(∗) system is therefore P = kq + k_q̄ = k1 + k2 = p1 + p2 + p3 + p4, which satisfies P² = ŝ ≡ M². For Higgs production in the gluon fusion channel, the incoming partons are self-conjugate, kq = kgluon,1, k_q̄ = kgluon,2, and the total momentum P is the Higgs momentum.
As shown in Figure 4.1, the coordinate system in the centre-of-mass (CM) frame of the two-Z system is chosen as:

\hat{z}_{CM} = \hat{k}_1, \qquad \hat{y}_{CM} = \frac{\hat{k}_q \times \hat{k}_1}{|\hat{k}_q \times \hat{k}_1|}, \qquad \hat{x}_{CM} = \hat{y}_{CM} \times \hat{z}_{CM} = \frac{-\hat{k}_q + \hat{k}_1(\hat{k}_q \cdot \hat{k}_1)}{|\hat{k}_q \times \hat{k}_1|}. \qquad (4.1)
Furthermore, the Z1 frame is defined as the rest frame of the Z1 boson, reached by boosting the CM frame along k1, while the Z2 frame is obtained by first rotating the CM frame by π about yCM and then boosting along k2. The production angle Θ and the decay angles {θ1, θ2, φ1, φ2} are defined as follows:
• Θ is the polar angle of the momentum of the incoming quark in the CM frame, i.e. the angle between the beam direction and the direction of the two decaying Zs in the Higgs rest frame;
• θ1,2 is the polar angle of the momentum of the negative lepton (l1,2) in the Z1,2 frame, i.e. the angle between the lepton direction and the Z flight direction in the two-Z centre of mass;
• φ1,2 is the azimuthal angle of the negative lepton (l1,2) in the Z1,2 frame.
The azimuthal production angle is irrelevant and chosen to be zero. In these
definitions, the three-momenta of l1,2 in the Z1,2 frames can be written as

\vec{p}_{l_i}\big|_{Z_i\,\mathrm{frame}} = |\vec{p}_{l_i}|\,(\sin\theta_i\cos\phi_i,\ \sin\theta_i\sin\phi_i,\ \cos\theta_i), \quad i = 1, 2, \qquad (4.2)

therefore

\cos\theta_i = \frac{\vec{p}_{l_i} \cdot \vec{k}_i}{|\vec{p}_{l_i}|\,|\vec{k}_i|}, \qquad (4.3)

where \vec{k}_i is the Z three-momentum in the two-Z centre of mass, whereas the three-momentum of the incoming parton in the CM frame is

\vec{k}_q\big|_{CM\,\mathrm{frame}} = |\vec{k}_q|\,(-\sin\Theta,\ 0,\ \cos\Theta). \qquad (4.4)
In hadron colliders the CM frame of the two-Z system differs from the Lab frame, and the event as a whole is boosted along the beam axis with respect to the Lab frame, P = (P⁰, 0, 0, P^z). Also, the coordinate system in the CM frame has been chosen such that the z axis is defined by the Z1 three-momentum, rather than by the three-momentum of the incident partons, as would be natural in the Lab frame.
The total energy and momentum of the event P in the Lab frame can be used to determine the momentum fractions of the incident partons: kq = x1(Ecm, 0, 0, Ecm) and k_q̄ = x2(Ecm, 0, 0, −Ecm), where Ecm = √s/2, with √s the centre-of-mass energy of the colliding protons. From P = kq + k_q̄ it follows that ŝ = x1x2s and

k_q = \tfrac{1}{2}\,(P^0 + P^z,\ 0,\ 0,\ P^0 + P^z), \qquad (4.5)
k_{\bar q} = \tfrac{1}{2}\,(P^0 - P^z,\ 0,\ 0,\ P^z - P^0), \qquad (4.6)

which are valid in the Lab frame.
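The relations above can be checked numerically. The following sketch (hypothetical helper names, not code from this analysis) reconstructs the parton momenta and the momentum fractions from the total four-momentum P in the Lab frame, and uses ŝ = x1x2s as a consistency check against P² = M²:

```python
import math

def parton_kinematics(P, E_cm):
    """Given the 4l-system four-momentum P = (E, px, py, pz) in the Lab
    frame and the beam energy E_cm = sqrt(s)/2, return the parton
    four-momenta k_q, k_qbar (Eqs. 4.5-4.6) and the fractions x1, x2."""
    E, _, _, pz = P
    kq    = (0.5 * (E + pz), 0.0, 0.0,  0.5 * (E + pz))   # Eq. (4.5)
    kqbar = (0.5 * (E - pz), 0.0, 0.0, -0.5 * (E - pz))   # Eq. (4.6)
    x1 = kq[0] / E_cm        # from k_q    = x1 (E_cm, 0, 0,  E_cm)
    x2 = kqbar[0] / E_cm     # from k_qbar = x2 (E_cm, 0, 0, -E_cm)
    return kq, kqbar, x1, x2
```

For a resonance of mass M the product x1·x2·s should reproduce M², independently of the longitudinal boost of the event.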
4.1.1.2 Lorentz-Invariant Construction of Angles
In the CM frame, \vec{k}_1 and \vec{k}_2 are back to back and of equal magnitude, as are \vec{k}_q and \vec{k}_{\bar q}. Using P = M(1, 0, 0, 0), the energy and three-momentum of the incoming partons can be expressed as

E_q = E_{\bar q} = |\vec{k}_q| = |\vec{k}_{\bar q}| = \frac{\sqrt{\hat s}}{2}, \qquad (4.7)
as well as those of the two Zs,

E_i = \frac{P \cdot k_i}{M}, \qquad |\vec{k}_i| = \sqrt{\left(\frac{P \cdot k_1}{M}\right)^2 - m_{12}^2} \equiv \lambda_Z, \quad i = 1, 2, \qquad (4.8)

where m_{ij}^2 = (p_i + p_j)^2 = 2\,p_i \cdot p_j. Equivalently, \lambda_Z = \sqrt{(P \cdot k_2/M)^2 - m_{34}^2}. Since \cos\Theta = \hat{k}_q \cdot \hat{k}_1, by computing k_q \cdot k_1 it is simple to derive

\cos\Theta = \frac{-k_q \cdot k_1 + E_q E_1}{|\vec{k}_q|\,|\vec{k}_1|} = \frac{(k_{\bar q} - k_q) \cdot k_1}{M \lambda_Z}. \qquad (4.9)
By definition cosΘ changes sign under kq ↔ k_q̄, which is manifest in Equation (4.9). Thus when the direction of the incoming quark cannot be distinguished from that of the anti-quark, as is the case at hadron colliders, or when the incoming partons are self-conjugate, as in the Higgs gluon fusion production channel, cosΘ can only be determined up to a sign. Because Θ is only defined between 0 and π, it is not necessary to compute sinΘ.
Focusing now on θi: since k1 = p1 + p2, in the CM frame k1 = (E1 + E2, 0, 0, |\vec{p}_1 + \vec{p}_2|), where E1 and E2 now denote the lepton energies in the CM frame. Considering the boost that takes k1 from the CM frame to the rest frame of Z1, where it is (m12, 0, 0, 0), one can write

\begin{pmatrix} m_{12} \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta \\ -\gamma\beta & \gamma \end{pmatrix} \begin{pmatrix} E_1 + E_2 \\ |\vec{p}_1 + \vec{p}_2| \end{pmatrix}, \qquad (4.10)

from which

\beta = \frac{|\vec{p}_1 + \vec{p}_2|}{E_1 + E_2}, \qquad \gamma = \frac{E_1 + E_2}{m_{12}}. \qquad (4.11)

The inverse boost would then take p_1 = (m_{12}/2)(1, \sin\theta_1\cos\phi_1, \sin\theta_1\sin\phi_1, \cos\theta_1) in the Z1 rest frame to p_1 = (E_1, \vec{p}_1) in the CM frame, implying the relation

E_1 = \gamma\,\frac{m_{12}}{2}\,(1 + \beta\cos\theta_1), \qquad (4.12)

from which the expression for cos θ1 can be obtained:

\cos\theta_1 = \frac{E_1 - E_2}{|\vec{p}_1 + \vec{p}_2|} = \frac{E_1 - E_2}{|\vec{k}_1|} = \frac{1}{M \lambda_Z}\, P \cdot (p_1 - p_2). \qquad (4.13)
The last expression follows from Equation (4.8). For cos θ2, simply replace p1 and p2 by p3 and p4, respectively:

\cos\theta_2 = \frac{1}{M \lambda_Z}\, P \cdot (p_3 - p_4). \qquad (4.14)
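As a sketch of how Equations (4.8), (4.9), (4.13) and (4.14) can be evaluated directly on four-momenta (the function names are illustrative, not from the analysis code):

```python
import math

def mdot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def polar_angles(p1, p2, p3, p4, kq, kqbar):
    """Lorentz-invariant cos(Theta), cos(theta1), cos(theta2) following
    Eqs. (4.8), (4.9), (4.13), (4.14); inputs are (E, px, py, pz)."""
    k1 = [p1[i] + p2[i] for i in range(4)]
    P  = [p1[i] + p2[i] + p3[i] + p4[i] for i in range(4)]
    M  = math.sqrt(mdot(P, P))
    m12_sq = mdot(k1, k1)
    lam_Z = math.sqrt((mdot(P, k1) / M) ** 2 - m12_sq)            # Eq. (4.8)
    d_parton = [kqbar[i] - kq[i] for i in range(4)]
    cos_Theta = mdot(d_parton, k1) / (M * lam_Z)                  # Eq. (4.9)
    cos_t1 = mdot(P, [p1[i] - p2[i] for i in range(4)]) / (M * lam_Z)  # (4.13)
    cos_t2 = mdot(P, [p3[i] - p4[i] for i in range(4)]) / (M * lam_Z)  # (4.14)
    return cos_Theta, cos_t1, cos_t2
```

Being built only from Minkowski products, these expressions can be evaluated with four-momenta taken in any frame, in particular the Lab frame.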
Table 4.1: Signal samples along with the cross section times branching ratio values used for the angular distributions.

  Sample       mH [GeV]   MC Generator     Total Events   σ·BR [nb]
  ggF Signal   130        powheg-pythia    20000          5.89 · 10⁻⁶
  ggF Signal   360        powheg-pythia    50000          7.08 · 10⁻⁶
  VBF Signal   130        powheg-pythia    30000          0.481 · 10⁻⁶
  VBF Signal   360        powheg-pythia    29900          0.605 · 10⁻⁶
To compute φi, the unit normal vectors of the two decay planes are constructed,

N_1 = \frac{\vec{p}_1 \times \vec{p}_2}{|\vec{p}_1 \times \vec{p}_2|}, \qquad N_2 = \frac{\vec{p}_3 \times \vec{p}_4}{|\vec{p}_3 \times \vec{p}_4|}, \qquad (4.15)

so that

N_1 \cdot \hat{x}_{CM} = \sin\phi_1, \qquad N_1 \cdot \hat{y}_{CM} = -\cos\phi_1, \qquad (4.16)
N_2 \cdot \hat{x}_{CM} = -\sin\phi_2, \qquad N_2 \cdot \hat{y}_{CM} = -\cos\phi_2. \qquad (4.17)
A more detailed calculation of these last angles can be found in [57]. It is worth pointing out that when \vec{k}_q → −\vec{k}_q, φi → π + φi. Hence at hadron colliders, or in gluon fusion production, an event described by the angles (Θ, θ1, θ2, φ1, φ2) cannot be distinguished from one described by (π − Θ, θ1, θ2, φ1 + π, φ2 + π). Besides these angles, the angle Φ, defined as Φ ≡ π − φ1 − φ2, has also been considered in this work.
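A corresponding sketch for the azimuthal angles, using the decay-plane normals of Equation (4.15) and the CM axes of Equation (4.1); inputs are three-momenta in the CM frame, and all names are illustrative:

```python
import math

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]

def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))

def azimuthal_angles(p1, p2, p3, p4, kq, k1):
    """phi1, phi2 and Phi = pi - phi1 - phi2 from 3-momenta in the CM
    frame, via the axes of Eq. (4.1) and the normals of Eq. (4.15)."""
    z = unit(k1)
    y = unit(cross(kq, k1))
    x = cross(y, z)
    N1 = unit(cross(p1, p2))
    N2 = unit(cross(p3, p4))
    phi1 = math.atan2( dot3(N1, x), -dot3(N1, y)) % (2 * math.pi)  # Eq. (4.16)
    phi2 = math.atan2(-dot3(N2, x), -dot3(N2, y)) % (2 * math.pi)  # Eq. (4.17)
    Phi = (math.pi - phi1 - phi2) % (2 * math.pi)
    return phi1, phi2, Phi
```

Using atan2 with the sine and cosine projections of Eqs. (4.16)-(4.17) resolves each angle over its full 0 to 2π range.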
4.1.2 Angular and pT Distributions
In order to exploit the angular distributions of the four leptons in the channel H → ZZ(∗) → 4l to disentangle signal from background, the angular distributions whose theoretical study has already been presented in [57] have been studied for the reconstructed events of the Higgs search analysis. Note that no scale factors are applied to these distributions.
For the ZZ background sample the same pythia generator used in the cut-based Higgs analysis has been employed, while the Monte Carlo signal samples for Higgs masses of 130 GeV and 360 GeV, produced with powheg interfaced to pythia, have been considered (see Table 4.1).
Figure 4.2 shows normalized singly differential distributions in cosΘ, cos θ1, cos θ2, Φ, φ1 and φ2 for the gluon fusion signal at mH = √ŝ = 130 GeV and mH = √ŝ = 360 GeV, together with the corresponding low-mass (110 GeV < mH < 150 GeV) and high-mass (300 GeV < mH < 420 GeV) ZZ irreducible background. The corresponding distributions for the vector boson fusion signals are very similar and add no significant information; they are therefore reported in Appendix A. The distributions in the production angle Θ indicate that the ZZ pairs produced by the background process tend to lie in the forward region of the detector; this is especially pronounced when the invariant mass is high. In the signal case, on the contrary, the Zs are produced isotropically, as expected from the fact that the Higgs is a scalar particle (spin 0), so the signal shape has to be flat over the angular ranges. These variables are not used in the present Higgs analysis; a quantitative assessment of the improvement in sensitivity is in progress, in view of their use in the 2012 analysis.
The transverse momentum distributions for the same signal and background samples are shown in Figure 4.3.
The study presented in this section shows that adding angular and pT information to the signal search can provide additional discriminating power between signal and ZZ background. In particular the angular distributions cosΘ, cos θ1 and cos θ2 can be useful at high Higgs masses, while their discriminating power is negligible at low mass. The discriminating power is such that these variables have to be combined in a multivariate analysis. In the following section the multivariate analysis is applied to the samples at high Higgs mass, where the variables are more discriminating.
4.2 TMVA
The Toolkit for Multivariate Analysis (TMVA), the principal tool used in this work for the optimization of the Higgs signal observables, is introduced here. By applying TMVA, after finding the right observables, this work aims to separate the signal from the background, or at least to improve the signal efficiency and background rejection with respect to a cut-based Higgs search analysis. The advanced techniques applied are boosted decision trees and artificial neural networks, described in Sections 4.2.2 and 4.2.3; the results obtained are shown in Section 4.3.
4.2.1 What is TMVA
In high energy and nuclear physics, the signal which is searched for, for
example a signature of a Higgs boson, is typically overlaid by background pro-
cesses with a similar signature. Commonly used methods of classification into
(a) mH = 130 GeV    (b) mH = 360 GeV
Figure 4.2: Angular distributions for the gluon fusion Higgs signal at mH = 130 GeV and at mH = 360 GeV (blue histogram) and for the ZZ irreducible background in the mass window 110 GeV < mH < 150 GeV or 300 GeV < mH < 420 GeV, respectively (red histogram). These plots show angular distributions for reconstructed events.
(a) mH = 130 GeV    (b) mH = 360 GeV
Figure 4.3: pT distributions of reconstructed events for the gluon fusion signal at mH = 130 GeV and at mH = 360 GeV (blue histogram) compared to the same distribution for the irreducible background in the mass window 110 GeV < mH < 150 GeV or 300 GeV < mH < 420 GeV (red histogram), respectively.
signal and background events reach their limitations when the signal is very small and/or part of the information on whether an event is signal or background is hidden in poorly known correlations between the observables. The automated multivariate analysis toolkit, TMVA, provides the ability to exploit the available information from the observables efficiently. The advantage of an MVA classifier is, generally speaking, that it can achieve a better discrimination power than a simple cut-based analysis, especially in the presence of poorly discriminating variables. Such variables are usually called weak variables and are characterized by similar distributions for the signal and background samples.
Since in ATLAS a small Higgs signal must be searched for in a large data set, it is essential to extract the maximum available information from the signal characteristics. TMVA has been designed to find the best separating function between signal and background. It contains a variety of multivariate classification algorithms; for this work the boosted decision tree and the artificial neural network have been selected.
The algorithms consist of two independent phases. The first is the training, where the program learns to classify data from a finite but representative set of samples. In the second phase, the trained classification system is tested against new samples unknown to it. In this way it is possible to assess its real classification capabilities on arbitrary samples of data. After training and testing, the chosen methods are applied to the concrete classification problem they have been trained for, i.e. to data.
4.2.1.1 Evaluation of MVA Methods
Classification in TMVA [58, 59] derives from the input variables (observables) a classifier output where signal-like (background-like) events have values close to 1 (0). The mapping to the signal and background classes is done by defining all events with classifier output y > ycut as signal and all other events as background. For each trained classifier the distribution of the classifier output y for signal and background events can be inspected by the user (see Figure 4.4(a) for an example). Signal and background efficiencies are computed for a set of cuts on the classifier output: for each cut value ycut the signal efficiency εeff,signal, the purity and the background rejection (1 − εeff,background) are calculated¹.
From the sets of signal efficiencies and background rejections defined by the cuts
TMVA response for classifier: MLPBFGS
(a) (b)
Figure 4.4: (a) Example of a classification output y. Classification outputs close to 0 (1) denote a background-like (signal-like) event. Since for the training the true class of the event (signal/background) is known, the classification output is plotted for both classes independently. (b) The ROC curve, which shows the relationship between signal efficiency (εeff,signal) and background rejection (1 − εeff,background) [59].
on y, the Receiver Operating Characteristic (ROC) curve is plotted. In Figure 4.4(b) several exemplary ROC curves with different classification performances are shown. The larger the area below the curve, the better the achievable separation of signal and background. Which point on the ROC curve the user should choose as working point (i.e. which signal efficiency and which background rejection) depends on the type of analysis to be performed. For a trigger selection, a high efficiency will be chosen to prevent signal events from being discarded at too early a stage. For a signal search, the best cut is where S/√B has its maximum. Once a signal is found, the best cut for measuring the cross section is where S/√(S + B) has its maximum. Finally, for precision measurements one aims for a high purity.
¹The efficiency ε is defined as ε = (events passing the cut selection) / (total events).
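The working-point choice for a signal search can be sketched as a scan over ycut that maximizes S/√B; the helper below is illustrative (not TMVA code), with expected yields scaled by the cut efficiencies:

```python
import math

def best_cut(sig_scores, bkg_scores, n_sig_exp, n_bkg_exp, n_steps=100):
    """Scan cuts on the classifier output y in [0, 1) and return the
    (ycut, figure-of-merit) pair maximizing S/sqrt(B), where S and B are
    the expected yields after the cut."""
    best = (None, -1.0)
    for i in range(n_steps):
        ycut = i / n_steps
        eff_s = sum(y > ycut for y in sig_scores) / len(sig_scores)
        eff_b = sum(y > ycut for y in bkg_scores) / len(bkg_scores)
        S, B = n_sig_exp * eff_s, n_bkg_exp * eff_b
        if B > 0 and S / math.sqrt(B) > best[1]:
            best = (ycut, S / math.sqrt(B))
    return best
```

Replacing the figure of merit with S/√(S + B), or with the purity, gives the working points for a cross-section measurement or a precision measurement instead.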
Therefore, to ease the choice of the best MVA classifier for a particular classification problem, TMVA computes a number of benchmark quantities that assess the performance of the methods on the independent test sample. For classification these are:
• The signal efficiency at three representative background efficien-
cies obtained from a cut on the classifier output.
• The area of the background rejection versus signal efficiency function.
• The separation ⟨S²⟩ of a classifier y, defined by the integral

\langle S^2 \rangle = \frac{1}{2} \int \frac{\left(y_S(y) - y_B(y)\right)^2}{y_S(y) + y_B(y)}\, dy, \qquad (4.18)

where yS and yB are the signal and background PDFs (Probability Density Functions) of y, respectively. The separation is zero for identical signal and background shapes, and one for shapes with no overlap.
• The discrimination significance of a classifier, defined by the difference
between the classifier means for signal and background divided by the
quadratic sum of their root-mean-squares.
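The separation of Equation (4.18) can be estimated from binned classifier outputs; a minimal sketch, assuming two histograms with identical binning:

```python
def separation(h_sig, h_bkg):
    """Discrete estimate of <S^2> (Eq. 4.18) from two histograms of the
    classifier output with the same binning; each histogram is first
    normalized to unit area."""
    ns, nb = sum(h_sig), sum(h_bkg)
    s = 0.0
    for a, b in zip(h_sig, h_bkg):
        ys, yb = a / ns, b / nb          # per-bin probabilities
        if ys + yb > 0:
            s += 0.5 * (ys - yb) ** 2 / (ys + yb)
    return s
```

Identical shapes give 0 and fully disjoint shapes give 1, matching the limits quoted above.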
The results of the evaluation are printed to standard output. In addition to the
MVA response value y of a classifier, TMVA also provides the classifier’s signal
and background PDFs, yS(B). The PDFs can be used to derive classification
probabilities for individual events, or to compute the Rarity transformation.
• Classification probability: the probability for event i to be of signal type is given by

P_S(i) = \frac{f_S \cdot y_S(i)}{f_S \cdot y_S(i) + (1 - f_S) \cdot y_B(i)}, \qquad (4.19)

where fS = NS/(NS + NB) is the expected signal fraction, and NS(B) is the expected number of signal (background) events (the default is fS = 0.5).
• Rarity: the Rarity R(y) of a classifier y is given by the integral

R(y) = \int_{-\infty}^{y} y_B(y')\, dy', \qquad (4.20)

which is defined such that R(y) for background events is uniformly distributed between 0 and 1, while signal events cluster towards 1. The signal distributions can thus be directly compared among the various classifiers.
The stronger the peak towards 1, the better the discrimination. Another useful aspect of the Rarity is the possibility to directly visualize deviations of a test background (which could be physics data) from the training sample, through the appearance of non-uniformity.
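Equations (4.19) and (4.20) can be sketched as follows, with the Rarity estimated from a finite background sample (illustrative code, not TMVA's implementation):

```python
def signal_probability(ys_i, yb_i, f_s=0.5):
    """Eq. (4.19): probability that event i is of signal type, given the
    values of the signal and background PDFs at its classifier output."""
    return f_s * ys_i / (f_s * ys_i + (1.0 - f_s) * yb_i)

def rarity(y, bkg_scores):
    """Eq. (4.20) estimated from a background sample: the fraction of
    background events with classifier output below y."""
    return sum(b < y for b in bkg_scores) / len(bkg_scores)
```

By construction, feeding the background sample's own outputs through rarity() yields values spread uniformly over [0, 1], while a signal-enriched sample piles up near 1.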
4.2.1.2 Overtraining
When choosing the kind of classifier to use in an analysis, one should keep in mind that every method brings benefits but also risks. The main risk of using an MVA classifier is overtraining. In that case, the classification method specializes in differentiating the specific examples given for training, learning them by memory instead of learning the underlying rules and principles needed to classify them. Overtraining therefore leads to a seeming, artificial increase of the classification performance over the objectively achievable one when measured on the training sample, and to an effective performance decrease when measured on an independent test sample. A convenient way to detect overtraining and measure its impact is therefore to compare the performance results between the training and test samples. Such a test is performed by TMVA, with the results printed to standard output.
Overtraining occurs when a machine learning problem has too few degrees of freedom (too many model parameters adjusted to too few data points), when the capacity of the decision system is too large for the complexity of the training data, when the training set is too small or not representative of the whole input space, when the training sample events are not statistically independent (oversampling), or in general in the case of a wrong tuning of the classifier's parameters.
The sensitivity to overtraining depends on the MVA method, so when choosing the best classifier for one's purposes, one has to achieve a good discrimination power while also avoiding overtraining. Several techniques are employed to check for overtraining and reduce its impact. Overtraining is quantified using a Kolmogorov-Smirnov (KS) test: the result is the probability that the distribution obtained from the test sample could have been obtained from the training sample, which for an overtrained classifier is small.
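The train/test comparison can be sketched with a two-sample KS statistic, the maximum distance between the empirical CDFs of the training and test response distributions (a plain-Python sketch of the statistic only, without the p-value computation TMVA reports):

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum distance
    between the empirical CDFs of the two samples.  A large distance
    between the classifier responses on the training and test samples
    hints at overtraining."""
    xs = sorted(set(sample_a) | set(sample_b))
    na, nb = len(sample_a), len(sample_b)
    d = 0.0
    for x in xs:
        cdf_a = sum(v <= x for v in sample_a) / na
        cdf_b = sum(v <= x for v in sample_b) / nb
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

The statistic is 0 for identical samples and 1 for fully separated ones; converting it into the quoted probability requires the KS distribution, for which a library implementation would normally be used.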
4.2.2 Boosted Decision Tree (BDT)
A decision tree is a binary tree structured classifier similar to the one sketched
in Figure 4.5. Repeated left/right (yes/no) decisions are taken on one single vari-
able at a time until a stop criterion is fulfilled. The phase space is split this
way into many regions that are eventually classified as signal or background,
depending on the majority of training events that end up in the final leaf node.
The main issue with tree-based methods is their high variance, i.e. the possibility of obtaining very different results from a small change in the training sample; this problem is addressed by the boosting algorithms described below.

Figure 4.5: Schematic view of a decision tree. Starting from the root node, a sequence of binary splits using the discriminating variables xi is applied to the data. Each split uses the variable that at this node gives the best separation between signal and background when being cut on. The same variable may thus be used at several nodes, while others might not be used at all. The leaf nodes at the bottom end of the tree are labeled “S” for signal and “B” for background depending on the majority of events that end up in the respective nodes [58].
4.2.2.1 Boosting
Boosting is a general procedure whose application is not limited to decision
trees. The boosting of a decision tree extends the concept illustrated above from
one tree to several trees which form a forest. The classification of an event is
made on a majority vote of the classifications done by each tree in the forest.
A simple algorithm for boosting works like this: it starts by applying some
method, in this case a tree classifier, to the learning data, where each observa-
tion is assigned an equal weight. It computes the predicted classifications, and
applies weights to the observations in the learning sample that are inversely pro-
portional to the accuracy of the classification. In other words, it assigns greater
weight to those observations that were difficult to classify (where the misclas-
sification rate was high), and lower weights to those that were easy to classify
(where the misclassification rate was low). Boosting will generate a sequence
of classifiers, where each consecutive classifier in the sequence is an “expert”
in classifying observations that were not well classified by those preceding it.
Therefore the trees are derived from the same training ensemble by reweighting
(boosting) events, i.e. the same classifier is trained several times using a successively boosted training event sample; an event is then processed by all trees, which are finally combined into a single classifier given by a (weighted) average of the individual decision trees. Boosting stabilizes the response of the decision trees with respect to fluctuations in the training sample and can considerably enhance the separation performance compared to a single tree. In many cases, boosting performs best if applied to trees (classifiers) that, taken individually, have little classification power, i.e. small trees.
The boosting algorithm used in this analysis is the Gradient Boost. The function F(x) (x being the tuple of input variables) under consideration is assumed to be a weighted sum of parametrized base functions f(x; am), the so-called “weak learners”:

F(x; P) = \sum_{m=0}^{M} \beta_m f(x; a_m); \qquad P \in \{\beta_m; a_m\}_0^M. \qquad (4.21)

Each base function in this expansion corresponds to a decision tree. The boosting procedure is employed to adjust the parameters P such that the deviation between the model response F(x) and the true value y obtained from the training sample (y = 1 for signal and y = −1 for background) is minimized. The deviation is measured by the so-called loss function L(F, y). The current TMVA implementation of Gradient Boost uses the binomial log-likelihood loss

L(F, y) = \ln\left(1 + e^{-2 F(x) y}\right) \qquad (4.22)
for classification. As the boosting algorithm corresponding to this loss function
cannot be obtained in a straightforward manner, one has to resort to a steepest-
descent approach to do the minimization. This is done by calculating the current
gradient of the loss function and then growing a regression tree whose leaf values
are adjusted to match the mean value of the gradient in each region defined by
the tree structure. Iterating this procedure yields the desired set of decision trees
which minimizes the loss function. Gradient Boost is typically less susceptible
to overtraining. Its robustness can be enhanced by reducing the learning rate of
the algorithm through the Shrinkage parameter (see Table 4.2), which controls
the weight of the individual trees. A small shrinkage (0.1− 0.3) demands more
trees to be grown but can significantly improve the accuracy of the prediction
in difficult settings.
In certain settings Gradient Boost may also benefit from the introduction of
a bagging-like resampling procedure using random subsamples of the training
events for growing the trees. This is called stochastic gradient boosting and can
be enabled by selecting the UseBaggedGrad option. The sample fraction used in
each iteration can be controlled through the parameter GradBaggingFraction,
where typically the best results are obtained for values between 0.5 and 0.8.
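A minimal, self-contained sketch of the Gradient Boost idea on one input variable, with decision stumps as the weak learners and the pseudo-residuals of the loss (4.22) fitted by each new tree; this is an illustration of the algorithm, not TMVA's implementation, and the shrinkage parameter plays the role described above:

```python
import math

def loss(F, y):
    """Binomial log-likelihood loss, Eq. (4.22)."""
    return math.log(1.0 + math.exp(-2.0 * F * y))

def fit_gbt(xs, ys, n_trees=50, shrinkage=0.3):
    """1-D gradient boosting with decision stumps; labels are y = +1
    (signal) and y = -1 (background)."""
    F = [0.0] * len(xs)
    stumps = []                        # (threshold, value_left, value_right)
    thresholds = sorted(set(xs))
    for _ in range(n_trees):
        # negative gradient (pseudo-residual) of the loss at each point
        r = [2.0 * y / (1.0 + math.exp(2.0 * f * y)) for f, y in zip(F, ys)]
        best = None
        for t in thresholds:
            left  = [ri for xi, ri in zip(xs, r) if xi <  t]
            right = [ri for xi, ri in zip(xs, r) if xi >= t]
            if not left or not right:
                continue
            vl, vr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((ri - (vl if xi < t else vr)) ** 2
                      for xi, ri in zip(xs, r))
            if best is None or sse < best[0]:
                best = (sse, t, vl, vr)
        _, t, vl, vr = best
        stumps.append((t, shrinkage * vl, shrinkage * vr))   # shrink each tree
        F = [f + (shrinkage * vl if xi < t else shrinkage * vr)
             for f, xi in zip(F, xs)]
    return stumps

def predict(stumps, x):
    return sum(vl if x < t else vr for t, vl, vr in stumps)
```

Stochastic gradient boosting would additionally fit each stump on a random subsample of the training events, mirroring the UseBaggedGrad/GradBaggingFraction options described above.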
4.2.2.2 BDT Training
The training or growing of a decision tree is the process that defines the
splitting criteria for each node. The training starts with the root node, where
an initial splitting criterion for the full training sample is determined. The split
results in two subsets of training events that each go through the same algorithm
in order to determine the next splitting iteration. This procedure is repeated
until the whole tree is built. At each node, the split is determined by find-
ing the variable and corresponding cut value that provides the best separation
between signal and background. The node splitting stops once each terminal
node, called leaf, is pure signal or pure background, or once it has reached the
minimum number of events which is specified in the BDT configuration (option
nEventsMin). A standard value from the literature is:
\mathrm{nEventsMin} = \max\left(40,\ \frac{N_{\mathrm{training\ events}}}{N_{\mathrm{variables}}^2} \cdot \frac{1}{10}\right) \qquad (4.23)
A variety of separation criteria can be configured (option SeparationType in
Table 4.2) to assess the performance of a variable and a specific cut requirement.
Imagine the events are weighted with each event having weight wi. The purity
of the sample in a branch is defined by [60]
P = \frac{\sum_s w_s}{\sum_s w_s + \sum_b w_b}, \qquad (4.24)

where \sum_s runs over the signal events and \sum_b over the background events. Note that P(1 − P) is 0 if the sample is pure signal or pure background. For a given branch the Gini index is defined as

\mathrm{Gini} = \left(\sum_{i=1}^{n} w_i\right) P(1 - P), \qquad (4.25)

where n is the number of events in that branch. The criterion is to minimize

\mathrm{Gini}_{\mathrm{left\ daughter}} + \mathrm{Gini}_{\mathrm{right\ daughter}}. \qquad (4.26)

To determine the increase in quality when a node is split into two branches, one maximizes

\mathrm{Criterion} = \mathrm{Gini}_{\mathrm{father}} - \mathrm{Gini}_{\mathrm{left\ daughter}} - \mathrm{Gini}_{\mathrm{right\ daughter}}. \qquad (4.27)
133
4. Angular Analysis and TMVA 4.2 TMVA
In practice, three major measures of node impurity are used. If p is defined as the proportion of signal in a node², the three measures are:
• Gini Index, defined by p(1− p);
• Cross entropy, defined by −p ln p− (1− p) ln(1− p);
• Misclassification error, defined by 1−max(p, 1− p).
All separation criteria have a maximum where the samples are fully mixed, i.e.,
at purity p = 0.5, and fall off to zero when the sample consists of one event
class only. The three measures are similar, but the Gini Index and the Cross
entropy are differentiable, and hence more amenable to numerical optimization.
Since the splitting criterion is always a cut on a single variable, the training procedure selects the variable and cut value that optimize the increase in the separation index between the parent node and the sum of the indices of the two daughter nodes, weighted by their relative fractions of events.
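The node-splitting criterion of Equations (4.24)-(4.27) can be sketched as a scan over candidate cuts on one variable (unit event weights; the helper names are illustrative):

```python
def gini_index(weights_s, weights_b):
    """Eq. (4.25): (sum of weights) * P(1-P) for one node, with the
    purity P from Eq. (4.24)."""
    ws, wb = sum(weights_s), sum(weights_b)
    if ws + wb == 0:
        return 0.0
    P = ws / (ws + wb)
    return (ws + wb) * P * (1.0 - P)

def best_split(values_s, values_b):
    """Pick the cut on a single variable maximizing Eq. (4.27):
    Gini(parent) - Gini(left) - Gini(right)."""
    parent = gini_index([1.0] * len(values_s), [1.0] * len(values_b))
    best = (None, -1.0)
    for t in sorted(set(values_s + values_b)):
        ls = [1.0 for v in values_s if v < t]
        rs = [1.0 for v in values_s if v >= t]
        lb = [1.0 for v in values_b if v < t]
        rb = [1.0 for v in values_b if v >= t]
        gain = parent - gini_index(ls, lb) - gini_index(rs, rb)
        if gain > best[1]:
            best = (t, gain)
    return best
```

A full tree builder would repeat this scan over all input variables at each node, which is exactly the per-node optimization described above.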
The cut values are optimized by scanning over the variable range with a granularity that is set via the option nCuts. The default value nCuts=20 proved to be a good compromise between computing time and step size; finer stepping did not noticeably increase the performance of the BDTs.
At the end, the leaf nodes are classified as signal or background according to
the class the majority of events belongs to. If the option UseYesNoLeaf is set
the end-nodes are classified in the same way. If UseYesNoLeaf is set to false the
end-nodes are classified according to their purity, i.e. if a leaf has purity greater
than 1/2 (or whatever is set), then it is called a signal leaf and if the purity is
less than 1/2, it is a background leaf. The resulting tree is a decision tree.
In principle, the splitting could continue until each leaf node contains only sig-
nal or only background events, which could suggest that perfect discrimination
is achievable. However, such a decision tree would be strongly overtrained. To
avoid overtraining a decision tree must be pruned.
Pruning is the process of cutting back a tree from the bottom up after it has
been built to its maximum size. Its purpose is to remove statistically insignif-
icant nodes and thus reduce the overtraining of the tree. It has been found
to be beneficial to first grow the tree to its maximum size and then cut back,
rather than interrupting the node splitting at an earlier stage. This is because
apparently insignificant splits can nevertheless lead to good splits further down
the tree. Whereas this technique is useful for a single tree, it is not completely
clear yet if this also applies for the tree in the forest. Currently it looks as if
²p (also called the purity of the node) is given by the ratio of signal events to all events in that node. Hence pure background nodes have zero purity.
in TMVA, better results for the whole forest are often achieved when pruning
is not applied, but rather the maximal tree depth is set to a relatively small
value (3 or 4) already during the tree building phase. In particular the Gradient
Boost does not apply a pruning algorithm. In this case it is recommended to
restrict the number of nodes in the tree to values between 5 to 20 by using
option NNodesMax or the maximal allowed depth of the tree (MaxDepth option).
4.2.3 Artificial Neural Network (ANN)
Figure 4.6: Multilayer perceptron with one hidden layer [58].
A more complex estimator, able to deal also with highly correlated variables or with variables of very poor discrimination power, is the Artificial Neural Network (ANN). The ANN is composed of nodes called neurons, which are arranged in different layers and connected to each other. An artificial neural network is any simulated collection of interconnected neurons, with each neuron producing a certain response for a given set of input signals. One can therefore view the neural network as a mapping from a space of input variables x1, …, xnvar onto a one-dimensional (e.g. in the case of a signal-versus-background discrimination problem) or multi-dimensional space of output variables y1, …, ymvar. The mapping is nonlinear if at least one neuron has a nonlinear response to its input.
4.2.3.1 Network Architecture
While in principle a neural network with n neurons can have n² directional
connections, the complexity can be reduced by organizing the neurons in layers
and only allowing directional connections from one layer to the immediately next
one (see Figure 4.6). This kind of neural network is termed a multi-layer perceptron
(MLP). The first layer of a multilayer perceptron is the input layer, the
last one the output layer, and all others are hidden layers. For a classification
problem with nvar input variables and 2 output classes, the input layer consists
of a bias neuron and nvar neurons that hold the input values, x1, . . . , xnvar , and
the output layer consists of one neuron that holds the output variable, the
neural net estimator y_ANN.
Each directional connection between the output of one neuron and the input of
another has an associated weight. The output value of each neuron is multiplied
with the weight to be used as input value for the next neuron.
Each neuron has a neuron response function ρ, which maps the neuron inputs
i_1, …, i_n onto the neuron output. Often it can be separated into a
ℝⁿ ↦ ℝ synapse function κ and a ℝ ↦ ℝ neuron activation function α, so
that ρ = α ∘ κ. The functions κ and α can have the following forms:

\[
\kappa : \left(y_1^{(l)},\dots,y_n^{(l)} \,\middle|\, w_{0j}^{(l)},\dots,w_{nj}^{(l)}\right) \rightarrow
\begin{cases}
w_{0j}^{(l)} + \sum_{i=1}^{n} y_i^{(l)} w_{ij}^{(l)} & \text{Sum,}\\[2pt]
w_{0j}^{(l)} + \sum_{i=1}^{n} \left(y_i^{(l)} w_{ij}^{(l)}\right)^2 & \text{Sum of squares,}\\[2pt]
w_{0j}^{(l)} + \sum_{i=1}^{n} \left|y_i^{(l)} w_{ij}^{(l)}\right| & \text{Sum of absolutes,}
\end{cases}
\tag{4.28}
\]

\[
\alpha : x \rightarrow
\begin{cases}
x & \text{Linear,}\\[2pt]
\dfrac{1}{1+e^{-kx}} & \text{Sigmoid,}\\[2pt]
\dfrac{e^{x}-e^{-x}}{e^{x}+e^{-x}} & \text{Tanh,}\\[2pt]
e^{-x^{2}/2} & \text{Radial.}
\end{cases}
\tag{4.29}
\]
where y_i^{(l)} is the output of the i-th neuron in the l-th layer and w_{ij}^{(l)} is the
weight of the connection between the i-th neuron in the l-th layer and the j-th
neuron in the (l + 1)-th layer.
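As an illustrative sketch (not TMVA code; all function names here are invented), the response ρ = α ∘ κ with the "Sum" synapse and two of the activation functions of Eqs. (4.28)-(4.29) can be written as:

```python
import math

def kappa_sum(inputs, weights, bias):
    """'Sum' synapse function: w0j + sum_i y_i * w_ij, cf. Eq. (4.28)."""
    return bias + sum(y * w for y, w in zip(inputs, weights))

def alpha_sigmoid(x, k=1.0):
    """Sigmoid activation: 1 / (1 + e^{-kx}), cf. Eq. (4.29)."""
    return 1.0 / (1.0 + math.exp(-k * x))

def alpha_tanh(x):
    """Tanh activation: (e^x - e^-x) / (e^x + e^-x)."""
    return math.tanh(x)

def neuron_response(inputs, weights, bias, activation=alpha_tanh):
    """rho = alpha(kappa(...)): maps the neuron inputs onto its output."""
    return activation(kappa_sum(inputs, weights, bias))
```

The mapping is nonlinear whenever a nonlinear activation (sigmoid, tanh, radial) is chosen; with `activation=lambda x: x` the neuron is purely linear.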
4.2.3.2 ANN Training
There are two algorithms for adjusting the weights that optimize the clas-
sification performance of a neural network: the so-called back-propagation and
BFGS.
Back-propagation (BP)
The most common algorithm is the so-called back-propagation. It belongs to the
family of supervised learning methods, where the desired output for every input
event is known. The output of a network (here for simplicity assumed to have
a single hidden layer with a tanh activation function, and a linear activation
function in the output layer) is given by
\[
y_{\mathrm{ANN}} = \sum_{j=1}^{n_h} y_j^{(2)} w_{j1}^{(2)} = \sum_{j=1}^{n_h} \tanh\!\left(\sum_{i=1}^{n_{\mathrm{var}}} x_i w_{ij}^{(1)}\right) \cdot w_{j1}^{(2)},
\tag{4.30}
\]
where n_var and n_h are the numbers of neurons in the input layer and in the
hidden layer, respectively, w_{ij}^{(1)} is the weight between input-layer neuron i and
hidden-layer neuron j, and w_{j1}^{(2)} is the weight between hidden-layer neuron
j and the output neuron. Simple summation was used in Equation (4.30) as
synapse function κ.
During the learning process the network is supplied with N training events x_a =
(x_1, …, x_{n_var})_a, a = 1, …, N. For each training event a the neural network
output y_{ANN,a} is computed and compared to the desired output y_a ∈ {1, 0} (1
for signal events and 0 for background events). An error function E, measuring
the agreement of the network response with the desired one, is defined by

\[
E(\mathbf{x}_1,\dots,\mathbf{x}_N \,|\, \mathbf{w}) = \sum_{a=1}^{N} E_a(\mathbf{x}_a \,|\, \mathbf{w}) = \sum_{a=1}^{N} \frac{1}{2}\left(y_{\mathrm{ANN},a} - y_a\right)^2,
\tag{4.31}
\]
where w denotes the ensemble of adjustable weights in the network. The set
of weights that minimizes the error function can be found using the method
of steepest or gradient descent, provided that the neuron response function is
differentiable with respect to the input weights. Starting from a random set of
weights w^{(ρ)}, the weights are updated by moving a small distance in w-space
in the direction −∇_w E, where E decreases most rapidly:

\[
\mathbf{w}^{(\rho+1)} = \mathbf{w}^{(\rho)} - \eta \nabla_{\mathbf{w}} E,
\tag{4.32}
\]

where η is a positive number called the learning rate, chosen small enough to
avoid serious overtraining of the network. The weights connected with the output
layer are updated by

\[
\Delta w_{j1}^{(2)} = -\eta \sum_{a=1}^{N} \frac{\partial E_a}{\partial w_{j1}^{(2)}} = -\eta \sum_{a=1}^{N} \left(y_{\mathrm{ANN},a} - y_a\right) y_{j,a}^{(2)},
\tag{4.33}
\]
and the weights connected with the hidden layer are updated by

\[
\Delta w_{ij}^{(1)} = -\eta \sum_{a=1}^{N} \frac{\partial E_a}{\partial w_{ij}^{(1)}} = -\eta \sum_{a=1}^{N} \left(y_{\mathrm{ANN},a} - y_a\right) \left(1 - \left(y_{j,a}^{(2)}\right)^2\right) w_{j1}^{(2)} x_{i,a},
\tag{4.34}
\]

where the identity tanh′ x = 1 − tanh² x has been used. This method of training the
network is denoted bulk learning, since the sum of errors of all training events is
used to update the weights. An alternative, which is the one implemented in
TMVA, is the so-called online learning, where the weights are updated at
each event. The weight updates are obtained from Equations (4.33) and (4.34)
by removing the event summations.
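The online-learning update for the single-hidden-layer network of Equation (4.30) can be sketched as follows. This is a hedged illustration, not the TMVA implementation; names like `train_event` are invented, and the correct tanh derivative 1 − tanh² is used in the hidden-layer gradient:

```python
import math
import random

def forward(x, w1, w2):
    """y_ANN = sum_j tanh(sum_i x_i w1[i][j]) * w2[j], cf. Eq. (4.30)."""
    hidden = [math.tanh(sum(xi * w1[i][j] for i, xi in enumerate(x)))
              for j in range(len(w2))]
    return hidden, sum(h * w for h, w in zip(hidden, w2))

def train_event(x, target, w1, w2, eta=0.05):
    """One online update: steepest descent on E_a = (y - target)^2 / 2."""
    hidden, y = forward(x, w1, w2)
    delta = y - target                               # dE_a / dy
    for j, h in enumerate(hidden):
        grad_hidden = delta * (1.0 - h * h) * w2[j]  # tanh' = 1 - tanh^2
        for i, xi in enumerate(x):
            w1[i][j] -= eta * grad_hidden * xi       # hidden-layer update, cf. Eq. (4.34)
        w2[j] -= eta * delta * h                     # output-layer update, cf. Eq. (4.33)
    return 0.5 * delta * delta
```

In bulk learning one would instead accumulate the gradients over all N events before applying a single update per epoch.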
BFGS
The Broyden-Fletcher-Goldfarb-Shanno (BFGS) method differs from back-
propagation in its use of second derivatives of the error function to adapt
the synapse weights, through an algorithm composed of four main steps.
1. Two vectors, D and Y, are calculated. The vector of weight changes D
represents the evolution from one iteration of the algorithm (k − 1)
to the next (k); each synapse weight corresponds to one element of the
vector. The vector Y is the vector of gradient errors:
\[
D_i^{(k)} = w_i^{(k)} - w_i^{(k-1)},
\tag{4.35}
\]
\[
Y_i^{(k)} = g_i^{(k)} - g_i^{(k-1)},
\tag{4.36}
\]
where i is the synapse index, g_i is the i-th synapse gradient, w_i is the
weight of the i-th synapse, and k denotes the iteration counter.
2. Approximate the inverse of the Hessian matrix, H⁻¹, at iteration k by

\[
H^{-1(k)} = \frac{D \cdot D^{T} \cdot \left(1 + Y^{T} \cdot H^{-1(k-1)} \cdot Y\right)}{Y^{T} \cdot D} - D \cdot Y^{T} \cdot H + H \cdot Y \cdot D^{T} + H^{-1(k-1)},
\tag{4.37}
\]

where the superscripts (k) are implicit for D and Y.
3. Estimate the vector of weight changes by D^{(k)} = -H^{-1(k)} \cdot Y^{(k)}.
4. Compute a new vector of weights by applying a line search algorithm. In
the line search the error function is locally approximated by a parabola.
The algorithm evaluates the second derivatives and determines the point
where the minimum of the parabola is expected. The total error is eval-
uated for this point. The algorithm then evaluates points along the line
defined by the direction of the gradient in weights space to find the ab-
solute minimum. The weights at the minimum are used for the next
iteration.
The learning rate can be set with the option Tau, which is the line-search
step size. The learning parameter, which defines by how much the weights
are changed in one epoch along the line where the minimum is suspected,
is multiplied by the learning rate as long as the training error of the
neural net with the changed weights lies below the one with unchanged
weights. If the training error of the changed neural net is already larger
for the initial learning parameter, it is divided by the learning rate until
the training error becomes smaller. The iterative and approximate calculation
of H^{-1(k)} becomes less accurate with an increasing number of iterations.
The matrix is therefore reset to the unit matrix every ResetStep steps.
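The four steps above can be illustrated with a small self-contained sketch. Note that this uses the standard textbook form of the inverse-Hessian BFGS update, arranged slightly differently from Equation (4.37), and an exact line search on an invented 2D quadratic stands in for TMVA's Tau-driven search:

```python
# Hedged sketch: BFGS on f(w) = w^T A w / 2 (so the gradient is g = A w).

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bfgs_update(Hinv, D, Y):
    """Textbook inverse-Hessian update from D = w_k - w_{k-1} and
    Y = g_k - g_{k-1}, cf. Eqs. (4.35)-(4.36)."""
    n, rho = len(D), 1.0 / dot(Y, D)
    I = [[float(i == j) for j in range(n)] for i in range(n)]
    A = [[I[i][j] - rho * D[i] * Y[j] for j in range(n)] for i in range(n)]
    B = [[I[i][j] - rho * Y[i] * D[j] for j in range(n)] for i in range(n)]
    AH = [[dot(A[i], [Hinv[k][j] for k in range(n)]) for j in range(n)] for i in range(n)]
    AHB = [[dot(AH[i], [B[k][j] for k in range(n)]) for j in range(n)] for i in range(n)]
    return [[AHB[i][j] + rho * D[i] * D[j] for j in range(n)] for i in range(n)]

def bfgs_minimize(A, w, steps=10):
    """Minimize the quadratic: form D and Y, update H^-1, step, line-search."""
    Hinv = [[float(i == j) for j in range(len(w))] for i in range(len(w))]
    g = mat_vec(A, w)
    for _ in range(steps):
        if dot(g, g) < 1e-20:
            break
        d = [-x for x in mat_vec(Hinv, g)]           # search direction
        alpha = -dot(g, d) / dot(d, mat_vec(A, d))   # exact minimum along d
        w_new = [wi + alpha * di for wi, di in zip(w, d)]
        g_new = mat_vec(A, w_new)
        D = [a - b for a, b in zip(w_new, w)]
        Y = [a - b for a, b in zip(g_new, g)]
        Hinv = bfgs_update(Hinv, D, Y)
        w, g = w_new, g_new
    return w
```

On a quadratic error surface this converges in a number of iterations equal to the dimension, which is why the curvature information makes BFGS attractive compared to plain gradient descent.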
4.2.4 Optimization of the MVA methods
In order to improve the performance of a multivariate analysis, classifiers in
general have tuning parameters to optimize the separation between signal and
background candidates. First of all it is necessary to fix the MC samples and
the variables that one intends to use; thereafter one can proceed to the tuning
of the parameters.
4.2.5 Monte Carlo Samples
The tuning of the MVA methods' parameters, and thus the following analysis,
has been performed using:

• for the signal:

– 7856 'mc11b' gluon fusion (gg) signal selected events at mH = 360
GeV (from a collection holding 12024 selected events);

– 4703 'mc11b' vector boson fusion (VBF) signal selected events at
mH = 360 GeV (from a collection holding 7207 selected events);

• for the ZZ background: 26938 'mc11b' ZZ background selected events
(from a collection holding 28938 selected events), of which only the events
in the mass window 300 GeV < mH < 420 GeV have been considered.

The signal and background samples from which these are drawn are the
same already mentioned in Table 4.1 and used for the Higgs cut-based analysis,
respectively. The signal file is created by merging the two signal sub-samples
above, rescaled according to their corresponding cross sections; in turn the
background file is created by rescaling the sub-sample with its cross section.
4.2.6 Input Variables
Figure 4.7: Input variables for the multivariate analysis: (a) the invariant mass mH, (b) the angular variable cosΘ, (c) the transverse momentum pT (signal versus background distributions).
The idea of analyzing all possible information coming from an experiment
in order not to miss anything may sound tempting. However, the inclusion
of trivial or correlated variables can introduce noise into the system without
actually providing new information, thus not improving the performance. The
observables used for the optimization are the ones that turn out to be the most
discriminating at high mass; as can be seen from Figure 4.2, the natural choice
is:

1. the invariant mass of the four-lepton selected event (mH);

2. the angular variable cosΘ;

3. the transverse momentum of the four-lepton selected event (pT).

These variables are not strongly correlated, as can be seen from Figure 4.8, and
the optimization of the MVA methods is carried out adding the variables one
at a time, starting from the mass.
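The linear correlation coefficients shown in Figure 4.8 are the usual Pearson coefficients expressed in percent; a minimal sketch (on invented toy samples, not the analysis data):

```python
import math

def linear_correlation(x, y):
    """Pearson linear correlation coefficient of two samples, in percent."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return 100.0 * cov / (sx * sy)
```

Values close to 0, such as those in Figure 4.8, indicate that each variable carries largely independent discriminating information.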
Figure 4.8: Variable correlation matrices for signal (a) and background (b). Linear correlation coefficients in %: for the signal, cosΘ-mH = −2, cosΘ-pT = 1, mH-pT = 1; for the background, cosΘ-mH = −3, mH-pT = 5.
4.2.7 Tuning Parameters for the implemented BDTG
In this work a Gradient Boosted Decision Tree (BDTG) has been imple-
mented, taking into account the considerations expressed in Section 4.2.2 the
main features of this classifier have been chosen and their optimal values are
summarized in Table 4.2.
Since the purity in the leaf nodes is sensitive to overtraining and therefore
typically overestimated, the end nodes are classified as signal or background
according to the class the majority of events belongs to. The separation
criterion used is the Gini index, defined by p · (1 − p), as already described in
Section 4.2.2. It is adopted because it is the default separation criterion and
tests have revealed no significant performance disparity between the most
important separation criteria. Moreover, since no pruning algorithm is applied
by the Gradient Boost decision tree, the maximum depth of the tree is simply
reduced to three levels.
The choice of the maximum dimension of a tree should take into account two
considerations: a very large tree might overfit the data, while a small tree might
not capture the important structure. Tree size is a tuning parameter governing
the model's complexity. The number of trees and the maximum number of
nodes have been chosen in such a way as to minimize overtraining and to
maximize the ROC curve area, in order to obtain the best performance.
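For concreteness, the Gini-index separation used for node splitting can be sketched as follows (illustrative code, not TMVA's implementation):

```python
def gini(n_sig, n_bkg):
    """Gini index p(1-p), with p the node purity (signal / all events)."""
    n = n_sig + n_bkg
    if n == 0:
        return 0.0
    p = n_sig / n
    return p * (1.0 - p)

def separation_gain(parent, left, right):
    """Decrease of the event-weighted Gini index achieved by a split;
    each argument is a (n_sig, n_bkg) pair."""
    n_p, n_l, n_r = sum(parent), sum(left), sum(right)
    return gini(*parent) - (n_l * gini(*left) + n_r * gini(*right)) / n_p
```

A split into two pure daughters yields the maximal gain, while a split that leaves both daughters at the parent's purity yields zero gain and would not be selected.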
4.2.7.1 mH
Using the Monte Carlo samples quoted above and, as input variable, only the
invariant mass of the four-lepton selected event mH (see Figure 4.7(a)), the
curves in Figure 4.9 are obtained by setting NNodesMax to 5³ or to 6 and varying
the number of trees. The BDTG response when the number of nodes increases
has also been investigated, but above 6 nodes the situation remains unchanged.
These curves are very similar, so in order to choose the BDTG parameters
that assure the best performance, the plots in Figure 4.10 have been produced.
The performance difference is more evident in these graphs, which show that,
since a too small number of nodes or trees is meaningless and an excessive
number of nodes or trees increases the overtraining problems, the best results
are obtained for NTrees = 800 and NNodesMax = 6.
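The ROC curves and their integrals compared throughout this section can be built from classifier-output histograms; a minimal sketch (with invented toy histograms, not the thesis samples):

```python
def roc_points(sig_hist, bkg_hist):
    """Scan a cut over the classifier output (higher bins = more signal-like)
    and return (signal efficiency, background rejection) pairs."""
    s_tot, b_tot = sum(sig_hist), sum(bkg_hist)
    points = []
    for cut in range(len(sig_hist) + 1):
        eff_s = sum(sig_hist[cut:]) / s_tot
        rej_b = 1.0 - sum(bkg_hist[cut:]) / b_tot
        points.append((eff_s, rej_b))
    return points

def roc_area(points):
    """Trapezoidal integral of rejection vs. efficiency (cut-scan order)."""
    pts = points[::-1]  # efficiency ascending as the cut is relaxed
    return sum(0.5 * (y0 + y1) * (x1 - x0)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
```

A perfect classifier gives an area of 1, a random one 0.5, which is why the area is a convenient single number for ranking the parameter settings.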
Figure 4.9: ROC curves produced by the BDTG, varying the number of trees (from 30 to 1100) and fixing NNodesMax to 5 (a) or 6 (b).
4.2.7.2 mH and cosΘ
Using as input variables the mass of the four leptons selected event (mH)
and the angular variable cosΘ, that is the most discriminating at high mass
³The performance results for NNodesMax = 4 and NNodesMax = 5 are the same.
Figure 4.10: Significance and ROC curve integral values produced by the BDTG, using only the invariant mass mH as input variable, varying the number of trees and fixing NNodesMax to 5 or 6.
among all the angular variables described in Section 4.1, the same study as
above has been carried out, varying the maximum number of nodes from 4 to 14
and the number of trees from 30 to 1100. As before, the ROC curves are not
useful for making a choice of the parameters since they largely overlap; but,
analysing Figure 4.11, it has been concluded that the setting NNodesMax = 6
and NTrees = 900 is the best choice.
Figure 4.11: Significance (a) and ROC curve integral (b) values produced by the BDTG, using mH and cosΘ as input variables, varying the number of trees and fixing NNodesMax to 5, 6, 8, 10, 12 or 14.
4.2.7.3 mH , cosΘ and pT
If one adds the transverse momentum as input variable to the previous ones,
the overtraining worsens; the plots in Figure 4.12 are obtained varying the
maximum number of nodes from 4 to 14 and the number of trees from 30 to
1100. Since increasing the number of trees and nodes enhances the overtraining,
a good choice of the parameters, maximizing the ROC curve area while
minimizing the overtraining effect, turns out to be NNodesMax = 5 and
NTrees = 600.
Figure 4.12: Significance (a) and ROC curve integral (b) values produced by the BDTG, using mH, cosΘ and pT as input variables, varying the number of trees and fixing NNodesMax to 5, 6, 8, 10, 12 or 14.
Finally, the BDTG’s features and its chosen tuning parameters are all sum-
marized in Table 4.2.
Option                Value                          Description
BoostType             Grad                           Boosting type for the trees in the forest.
UseBaggedGrad         True                           Use only a random subsample of all events for growing the trees in each iteration.
GradBaggingFraction   0.5                            Fraction of events to be used in each iteration.
Shrinkage             0.1                            Learning rate for the GradBoost algorithm.
UseYesNoLeaf          True                           Use signal/background categories, not purity, as classification of the leaf node.
SeparationType        GiniIndex                      Separation criterion for node splitting.
nEventsMin            max(40, NEvtsTrain/NVar^2/10)  Minimum number of events required in a leaf node (default uses the given formula).
nCuts                 20                             Number of steps during node-cut optimization.
MaxDepth              3                              Maximum allowed depth of the decision tree.

                      mH     mH and cosΘ    mH, cosΘ and pT
NTrees                800    900            600                Number of trees in the forest.
NNodesMax             6      6              5                  Maximum number of nodes in a tree.

Table 4.2: BDTG configuration.
4.2.8 Tuning Parameters for the implemented ANNs
In this work three Artificial Neural Networks (ANNs) have been implemented;
they are termed MLP, MLPBNN and MLPBFGS.
The TMVA implementation of ANNs supports random and importance event
sampling. With event sampling enabled, only a fraction (set by the option
Sampling) of the training events is used for the training of the ANN. Values
in the interval [0, 1] are possible. Setting the option SamplingImportance to 1,
the events are selected purely at random; only for a value below 1 does the
probability for the same events to be sampled again depend on the training
performance achieved for classification. In that case, if for a given set of events
the training leads to a decrease of the error on the test sample, the probability
for those events to be selected again is multiplied by the factor given in
SamplingImportance and thus decreases. In the case of an increased error on
the test sample, the probability for the events to be selected again is divided
by the factor SamplingImportance and thus increases. The probability for an
event to be selected is constrained to the interval [0, 1].
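The probability update just described can be sketched as (an illustrative sketch of the mechanism, not TMVA's code; the function name is invented):

```python
def update_sampling_prob(prob, improved, sampling_importance):
    """Update an event's selection probability after an epoch:
    multiplied by the factor (< 1) on improvement, divided on
    degradation, and clamped to [0, 1]."""
    if sampling_importance >= 1.0:
        return prob  # SamplingImportance = 1 means purely random selection
    p = prob * sampling_importance if improved else prob / sampling_importance
    return min(max(p, 0.0), 1.0)
```

Events whose inclusion helped the test error are thus gradually down-weighted, concentrating the training on the events that are still poorly classified.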
Event sampling is performed until the fraction specified by the option Sampling-
Epoch of the total number of epochs (NCycles) has been reached. Afterwards,
all available training events are used for the training. Event sampling can be
turned on and off for training and testing events individually with the options
SamplingTraining and SamplingTesting.
Since it is typically not known beforehand how many epochs are necessary to
achieve a sufficiently good training of the neural network, a convergence test
can be activated by setting ConvergenceTests to a value above 0. This value
denotes the number of subsequent convergence tests which have to fail (i.e. no
improvement of the estimator larger than ConvergenceImprove) to consider
the training to be complete. Convergence tests are performed at the same time
as overtraining tests. The test frequency is given by the parameter TestRate.
Finally it is recommended to set the option VarTransform = Norm, such that
the input is normalized to the interval [−1, 1].
The common configuration options of the three ANNs are summarized
in Table 4.3. The three ANNs differ from each other in the training method
applied to the net, in the neuron activation function α, in the number of
training cycles (NCycles) and in the specification of the hidden-layer architecture
(HiddenLayers). As illustrated in Table 4.4, whereas the type of function α
and of the training method are fixed a priori, the last two parameters have
been chosen for each variable set by attempting to maximize the ROC curve
area while keeping the overtraining under control.
In order to configure the three ANNs, the signal sample at mH = 360 GeV
and the irreducible background sample in the mass window 300 GeV < mH <
420 GeV have been used.
Focusing now on the number of hidden layers: this parameter defines the network
architecture by setting the number of neurons per layer in the network
and the number of hidden layers. The selection of the number of layers was
based on the Weierstrass theorem, which states: “For a Multilayer perceptron
a single hidden layer is sufficient to approximate a given continuous correlation
Option               Value        Description
VarTransform         Norm         Variable transformation performed before training.
TestRate             5            Test for overtraining performed at each #th epoch.
NeuronInputType      sum          Neuron input function type.
ConvergenceTests     3            Number of steps (without improvement) required for convergence.
UseRegulator         True         Use regulator to avoid overtraining.
LearningRate         0.02         ANN learning rate parameter.
DecayRate            0.01         Decay rate for learning parameter.
Sampling             1            Only a fraction "Sampling" of (randomly selected) events is trained in each epoch.
SamplingEpoch        1            Sampling is used for the first "SamplingEpoch" epochs; afterwards, all events are taken for training.
SamplingImportance   1            Sampling probabilities are multiplied by "SamplingImportance" after epochs which improve the test estimator, else divided.
SamplingTraining     True         The training sample is sampled.
SamplingTesting      False        The testing sample is sampled.
ResetStep            50           How often BFGS should reset history.
Tau                  3            Line-search step size.
BPMode               sequential   Back-propagation learning mode.
ConvergenceImprove   1e-30        Minimum improvement which counts as improvement (< 0 means the automatic convergence check is turned off).
UpdateLimit          10000        Maximum number of regulator updates.

Table 4.3: Common configuration options of the three ANNs.
ANN        Option                       Value
MLP        Training method              BP
           Neuron activation function   tanh
                                        mH     mH and cosΘ    mH, cosΘ and pT
           NCycles                      200    100            100
           HiddenLayers                 N+5    N+7            N+6
MLPBFGS    Training method              BFGS
           Neuron activation function   tanh
           NCycles                      600    600            800
           HiddenLayers                 N+8    N+9            N+7
MLPBNN     Training method              BFGS
           Neuron activation function   sigmoid
           NCycles                      400    1100           600
           HiddenLayers                 N+8    N+7            N+9

Table 4.4: Features distinguishing the three ANNs.
function to any precision, given an arbitrary large number of neurons in the
hidden layer. If the available computing power and size of the training data
sample are sufficient, one can thus raise the number of neurons in the hidden
layer until the optimal performance is reached” [58].
Whereas a too small number of neurons is not effective, an excessive number of
neurons and hidden layers slows the process while creating overtraining problems,
hence decreasing the performance of the system on the test sample.
Therefore, after choosing the input variable set, a neural network with only one
hidden layer is first assumed, formed by a number of neurons varied from N + 1
to N + 10, where N is the number of input variables. Then the number of
neurons in the hidden layer is fixed and the number of training cycles is varied;
for each value of HiddenLayers, the value of NCycles is chosen up to which the
ROC curve still increases, i.e. increasing the number of epochs beyond this value
brings no improvement in performance, only a regularity of the results. Besides,
because a too small number of cycles is not operative, a number of at least 100
is always chosen, and the increase in performance has always been weighed
against the increase in overtraining, finding a compromise.
4.2.8.1 mH
Figure 4.13 illustrates the best ROC curves obtained for different settings of
the three neural networks, using as input variable only the invariant mass of
the four-lepton selected event (mH). Comparing these curves and at the same
time inspecting the outputs of the networks in these configurations, it has been
concluded that the best performance is obtained setting HiddenLayers = N + 5
and NCycles = 200 for MLP; HiddenLayers = N + 8 and NCycles = 600 for
MLPBFGS; and HiddenLayers = N + 8 and NCycles = 400 for MLPBNN.
4.2.8.2 mH and cosΘ
Figure 4.14 illustrates the best ROC curves obtained for different settings
of the three neural networks, using as input variables the invariant mass of
the four-lepton selected event (mH) and the angular variable cosΘ. In this
case it is more evident that the best performance is obtained setting
HiddenLayers = N + 7 and NCycles = 100 for MLP; HiddenLayers = N + 9 and
NCycles = 600 for MLPBFGS; and HiddenLayers = N + 7 and NCycles = 1100
for MLPBNN.
Figure 4.13: ROC curves produced by MLP (a), MLPBFGS (b) and MLPBNN (c), varying the number of training cycles and the number of neurons in the single hidden layer (N is the number of input variables), using as input variable the invariant mass mH.
Figure 4.14: ROC curves produced by MLP (a), MLPBFGS (b) and MLPBNN (c), varying the number of training cycles and the number of neurons in the single hidden layer (N is the number of input variables), using as input variables the invariant mass mH and the angular variable cosΘ.
4.2.8.3 mH , cosΘ and pT
Figure 4.15 illustrates the best ROC curves obtained for different settings
of the three neural networks, using as input variables the angular variable
cosΘ of the incoming quark, the invariant mass mH and the transverse
momentum pT of the four-lepton selected event. This time the setting
HiddenLayers = N + 6 and NCycles = 100 has been chosen for MLP;
HiddenLayers = N + 7 and NCycles = 800 for MLPBFGS; and
HiddenLayers = N + 9 and NCycles = 600 for MLPBNN.
Figure 4.15: ROC curves produced by MLP (a), MLPBFGS (b) and MLPBNN (c), varying the number of training cycles and the number of neurons in the single hidden layer (N is the number of input variables), using as input variables the invariant mass mH, the angular variable cosΘ and the transverse momentum pT.
4.2.9 Comparing MVA Methods Performance
Comparing the performance of the MVA methods for the three different sets
of variables, the plots in Figure 4.16 are obtained. In Table 4.5 the values of
the signal efficiency corresponding to a background efficiency of 0.30, of the
ROC curve area, of the separation between signal and background and of the
significance, quantities already explained in Section 4.2.1, are summarized to
ease the evaluation of the different techniques. The methods are ranked from
best to worst by the area of the signal-efficiency-versus-purity curve.
The variable ranking is different for the four methods. Considering mH and
cosΘ, for the neural networks the best-ranked variable is the mass, whereas for
BDTG it is cosΘ. Taking all three variables as the input set, the MLPBNN
and MLPBFGS methods rank the invariant mass on top, followed by the
transverse momentum and then the angular variable; for MLP, pT is the best,
with mH and cosΘ classified next, respectively; finally, for the decision tree
classifier the ranking is cosΘ, mH and pT. The neural networks implement a
variable ranking that uses the sum of the squared weights of the connections
between the variable's neuron in the input layer and the first hidden layer. The
importance I_i of the input variable i is given by

\[
I_i = \bar{x}_i^2 \sum_{j=1}^{n_h} \left(w_{ij}^{(1)}\right)^2, \qquad i = 1, \dots, n_{\mathrm{var}},
\tag{4.38}
\]
where x̄_i is the sample mean of input variable i and w_{ij}^{(1)} is the weight between
input-layer neuron i and hidden-layer neuron j. The ranking of the BDTG
input variables is derived by counting how often the variables are used to split
decision tree nodes, and by weighting each split occurrence by the squared
separation gain it has achieved and by the number of events in the node. This
measure of the variable importance can be used for a single decision tree as well
as for a forest [58]. The importance values of the variables are reported in
Table 4.6.
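Both ranking measures can be sketched compactly (illustrative code; Eq. (4.38) for the networks, separation-gain-weighted split counting for the BDT; all data names are invented):

```python
def ann_importance(x_mean, w1, i):
    """Network ranking, Eq. (4.38): I_i = x̄_i² · Σ_j (w¹_ij)²."""
    return x_mean[i] ** 2 * sum(w1[i][j] ** 2 for j in range(len(w1[i])))

def bdt_importance(splits, n_var):
    """BDT ranking: each split is a (variable index, separation gain,
    events in node) triple; occurrences are weighted by gain² · n_events
    and the result is normalized to unit sum."""
    imp = [0.0] * n_var
    for var, gain, n_events in splits:
        imp[var] += gain ** 2 * n_events
    total = sum(imp)
    return [v / total for v in imp] if total else imp
```

Note that the network measure depends on the variable's scale through x̄_i², which is one reason the normalization VarTransform = Norm matters before comparing importances.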
4.3 Results
After having optimized the advanced techniques of the MVA toolkit, it is
possible to compare the ROC curves obtained applying the multivariate analysis
with the one obtained with a cut-based analysis. In the latter case the signal
efficiency and the background rejection have been computed by simply cutting
on the signal and background invariant-mass histograms coming from the
application of the Higgs analysis described in the previous chapter. It is evident
from Figure 4.17 that when using a multivariate analysis there is always an
improvement for the MLPBFGS and MLPBNN neural networks with respect to
the cut-based analysis. This improvement becomes much more visible when
increasing the number of input variables, especially for the decision tree
classifier, which on the other
mH
MVA method   Signal efficiency (error)   ROC     Separation   Significance
MLPBFGS      0.746(05)                   0.790   0.262        0.813
MLPBNN       0.734(05)                   0.780   0.251        0.791
MLP          0.706(05)                   0.766   0.225        0.707
BDTG         0.653(05)                   0.728   0.166        0.589

mH and cosΘ
MVA method   Signal efficiency (error)   ROC     Separation   Significance
MLPBNN       0.803(04)                   0.825   0.323        0.944
MLPBFGS      0.804(04)                   0.824   0.320        0.938
BDTG         0.788(05)                   0.818   0.310        0.882
MLP          0.678(05)                   0.725   0.179        0.567

mH, cosΘ and pT
MVA method   Signal efficiency (error)   ROC     Separation   Significance
BDTG         0.869(04)                   0.870   0.417        1.148
MLPBFGS      0.841(04)                   0.856   0.382        1.117
MLPBNN       0.835(04)                   0.852   0.371        1.088
MLP          0.753(05)                   0.787   0.252        0.842

Table 4.5: Evaluation results for the implemented MVA methods, ranked by best signal efficiency and purity (area); the top method is the best ranked. The signal efficiencies reported correspond to a background efficiency of 0.30.
(a) mH and cosΘ

  MLPBNN:    mH 1.456·10^2,     cosΘ 2.457·10^-4
  MLPBFGS:   mH 3.721·10,       cosΘ 8.420·10^-5
  MLP:       mH 4.360,          cosΘ 9.499·10^-5
  BDTG:      cosΘ 5.677·10^-1,  mH 4.323·10^-1

(b) mH, cosΘ and pT

  BDTG:      cosΘ 4.141·10^-1,  mH 3.368·10^-1,   pT 2.491·10^-1
  MLPBFGS:   mH 3.413·10,       pT 9.309,         cosΘ 9.622·10^-5
  MLPBNN:    mH 9.382·10,       pT 6.949·10,      cosΘ 1.537·10^-4
  MLP:       pT 1.189·10,       mH 3.595,         cosΘ 8.150·10^-5

Table 4.6: Variable ranking (for each method the variables are listed in order of
decreasing importance).
Figure 4.16: ROC curves (background rejection versus signal efficiency) for the
different optimized MVA methods, using as input variables mH (a); mH and
cosΘ (b); mH, cosΘ and pT (c).
hand does not perform very well when using only one input variable. The MLP
net does not reach the same performance when mH and cosΘ are considered
together, so it is not appropriate for the Higgs search in this channel.
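The cut-based ROC curve used in this comparison can be sketched as follows; this is a minimal illustration assuming the signal and background invariant-mass histograms are available as lists of bin contents (a one-sided cut is scanned here for simplicity, the same logic applies to a mass-window cut):

```python
def roc_from_histograms(sig_counts, bkg_counts):
    """Scan a one-sided cut over the histogram bins and return, for each
    threshold, (signal efficiency, background rejection). Sketch of the
    cut-based ROC construction with hypothetical bin contents."""
    s_tot, b_tot = sum(sig_counts), sum(bkg_counts)
    points = []
    for k in range(len(sig_counts)):
        sig_eff = sum(sig_counts[k:]) / s_tot      # signal kept above the cut
        bkg_eff = sum(bkg_counts[k:]) / b_tot      # background kept above it
        points.append((sig_eff, 1.0 - bkg_eff))    # rejection = 1 - efficiency
    return points

# Hypothetical histograms: a peaked signal over a smoothly falling background
sig = [1, 5, 20, 50, 20, 5, 1]
bkg = [40, 30, 25, 20, 15, 10, 5]
roc = roc_from_histograms(sig, bkg)
```

Each MVA method produces the analogous curve by cutting on its output distribution instead of the invariant mass.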
The next step is to choose the optimal cut on the MVA output, i.e. the one
yielding the maximum value of S/√B (the so-called working point), where S and
B are the numbers of signal and background events passing the cut, respectively.
The graphs in Figure 4.18 show the value of this ratio as a function of the
applied cut. These plots are consistent with the MVA outputs shown in Figures
4.20, 4.21 and 4.22 for the input variable sets mH; mH and cosΘ; and mH, cosΘ
and pT, respectively. The optimal cut values corresponding to the maximum of
S/√B are summarized in Table 4.7. For completeness, Figure 4.19 also shows
the S/√(S+B) quantity for each cut value.
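The working-point determination described above can be sketched as follows; the cut grid and the signal and background counts are hypothetical (in the analysis they come from the MVA output distributions scaled to the expected yields):

```python
import math

def best_working_point(cuts, sig_pass, bkg_pass):
    """Return (cut, S/sqrt(B), S/sqrt(S+B)) at the cut maximizing S/sqrt(B).
    sig_pass[k] and bkg_pass[k] are the signal and background events passing
    cuts[k]. Illustrative sketch with hypothetical numbers."""
    best = max(range(len(cuts)),
               key=lambda k: sig_pass[k] / math.sqrt(bkg_pass[k]))
    s, b = sig_pass[best], bkg_pass[best]
    return cuts[best], s / math.sqrt(b), s / math.sqrt(s + b)

cuts = [-1.0, -0.5, 0.0, 0.5, 1.0]            # candidate cuts on the MVA output
sig_pass = [10.0, 9.0, 7.0, 4.0, 1.0]         # hypothetical S passing each cut
bkg_pass = [400.0, 250.0, 100.0, 30.0, 5.0]   # hypothetical B passing each cut
cut, s_sqrt_b, s_sqrt_sb = best_working_point(cuts, sig_pass, bkg_pass)
# Here S/sqrt(B) peaks at cut = 0.5: a tighter cut loses proportionally more
# signal than background, a looser one lets too much background through.
```
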
As reported in Table 4.8, it can be noted that the application of the multivariate
mH
  MVA method   Cut Value   S/√B    S/√(S+B)   Signal Eff.   Bkg. Eff.
  MLPBFGS      0.821       0.532   0.283      0.691         0.246
  MLPBNN       0.767       0.512   0.281      0.715         0.278
  MLP          0.879       0.492   0.272      0.682         0.273
  BDTG         0.485       0.464   0.260      0.603         0.249

mH and cosΘ
  MVA method   Cut Value   S/√B    S/√(S+B)   Signal Eff.   Bkg. Eff.
  MLPBNN       0.795       0.451   0.214      0.666         0.177
  MLPBFGS      0.902       0.509   0.210      0.474         0.090
  BDTG         0.704       0.406   0.194      0.650         0.177
  MLP          0.663       0.379   0.218      0.692         0.310

mH, cosΘ and pT
  MVA method   Cut Value   S/√B    S/√(S+B)   Signal Eff.   Bkg. Eff.
  BDTG         0.975       0.671   0.089      0.120         0.002
  MLPBFGS      0.939       0.467   0.143      0.380         0.037
  MLPBNN       0.959       0.348   0.110      0.364         0.037
  MLP          0.871       0.449   0.231      0.650         0.212

mH Cut-Based
                           S/√B    S/√(S+B)   Signal Eff.   Bkg. Eff.
                           0.165   0.156      0.617         0.254

Table 4.7: The optimal cut values that an MVA output should exceed to obtain
the maximum value of S/√B, with the corresponding signal and background
efficiencies and the S/√B and S/√(S+B) quantities. The top method is the best
ranked for each input variable set. The same sensitivity values are also reported
for the cut-based analysis.
analysis leads to an improvement in the maximum value of S/√B with respect
to the cut-based analysis. The latter value is obtained from the numbers of
signal and background events in the mass window 343 GeV < mH < 377 GeV.
It is worth stressing that the cut-based S/√B value and the ones reported in Table
  MVA method   TMVA S/√B   mH Cut-Based S/√B   Relative Improvement (TMVA/Cut-Based)

mH
  MLPBFGS      0.532       0.165               3.2
  MLPBNN       0.512       0.165               3.1
  MLP          0.492       0.165               3.0
  BDTG         0.464       0.165               2.8

mH and cosΘ
  MLPBNN       0.451       0.165               2.7
  MLPBFGS      0.509       0.165               3.1
  BDTG         0.406       0.165               2.5
  MLP          0.379       0.165               2.3

mH, cosΘ and pT
  BDTG         0.671       0.165               4.1
  MLPBFGS      0.467       0.165               2.8
  MLPBNN       0.348       0.165               2.1
  MLP          0.449       0.165               2.7

Table 4.8: Comparison between the MVA methods and the cut-based analysis
in terms of the maximum of S/√B.
4.7 are derived in two different ways. A direct comparison between them is
therefore not possible, or at least not sufficient, to claim the success of the MVA
application to the standard analysis. This is particularly evident from Figure
4.17, where the ROC curves with the better S/√B are not also the best in terms
of area. The values in Table 4.7 are nevertheless useful to understand the
behaviour of the MVA methods when the input variable set is varied. For
example, for the MLP method the addition of cosΘ to mH increases the signal
efficiency with respect to the invariant mass alone, but the background efficiency
also grows, reducing the background rejection, so both S/√B and S/√(S+B)
decrease. With the complete set of variables, on the other hand, even though
the signal and background efficiencies both decrease with respect to mH alone,
S/√B and S/√(S+B) rise, because the signal efficiency is reduced less than the
background one. This neural network therefore shows a deterioration in
performance, evident from Table 4.5 and from Figure 4.17(d), when only the
angular variable is added to the invariant mass. The reason can be attributed
to the training method adopted here, which may not be appropriate to describe
the complexity of the training data for this input variable set.
The MLPBNN and MLPBFGS neural networks instead show a similar and
regular behaviour: the signal and background efficiencies, S/√B and S/√(S+B)
all decrease as variables are added. This observation is reflected in the trend of
the curves in Figures 4.18(b), 4.18(c), 4.19(b) and 4.19(c): the greater the
number of variables, the lower the curve. So even if, according
to Table 4.5, their performance seems to increase when variables are added, the
best improvement with respect to the cut-based analysis is obtained by including
only the invariant mass as discriminant variable. We can therefore conclude
that these networks attribute a greater importance to mH.
Finally, for the BDTG method S/√B decreases when only cosΘ is added, but
rises sharply when all three variables are considered, while S/√(S+B) keeps
decreasing. The best gain with respect to the standard strategy is therefore
obtained by applying the decision tree to the complete set of input variables.
This is also evident from the comparison of the evaluation results for the MVA
methods reported in Table 4.5.
Figure 4.17: ROC curves (background rejection versus signal efficiency) for the
(a) BDTG, (b) MLPBFGS, (c) MLPBNN and (d) MLP methods, varying the
input variable set and including the cut-based analysis.
Figure 4.18: S/√B as a function of the cut applied on the output of the (a)
BDTG, (b) MLPBFGS, (c) MLPBNN and (d) MLP methods, using as input
variables mH (black points), mH and cosΘ (red points), and mH, cosΘ and pT
(violet points).
Figure 4.19: S/√(S+B) as a function of the cut applied on the output of the
(a) BDTG, (b) MLPBFGS, (c) MLPBNN and (d) MLP methods, using as input
variables mH (black points), mH and cosΘ (red points), and mH, cosΘ and pT
(violet points).
Figure 4.20: Output distributions of the MVA methods using mH as input
variable, for the signal and background training and test samples (overtraining
check). Kolmogorov-Smirnov signal (background) probabilities: (a) BDTG
0.536 (0.999), (b) MLPBFGS 0.979 (0.644), (c) MLPBNN 0.856 (0.918), (d)
MLP 0.61 (0.901).
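The overtraining check quoted in the caption compares the classifier response on the training and test samples with a Kolmogorov-Smirnov test. A minimal sketch of the underlying statistic, applied to hypothetical response samples (this is not TMVA's implementation, which additionally converts the distance into the probabilities quoted in the plots):

```python
import bisect
import random

def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov distance: the maximum separation
    between the empirical CDFs of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    grid = a + b
    cdf_a = [bisect.bisect_right(a, x) / len(a) for x in grid]
    cdf_b = [bisect.bisect_right(b, x) / len(b) for x in grid]
    return max(abs(pa - pb) for pa, pb in zip(cdf_a, cdf_b))

random.seed(0)
# Hypothetical classifier responses: training and test samples drawn from the
# same Gaussian, so only a small KS distance (no overtraining) is expected.
train_response = [random.gauss(0.7, 0.1) for _ in range(1000)]
test_response = [random.gauss(0.7, 0.1) for _ in range(1000)]
distance = ks_distance(train_response, test_response)
```

A large distance (equivalently, a small KS probability) between the training and test response shapes would be a symptom of overtraining.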
Figure 4.21: Output distributions of the MVA methods using mH and cosΘ as
input variables, for the signal and background training and test samples
(overtraining check). Kolmogorov-Smirnov signal (background) probabilities:
(a) BDTG 0.88 (0.969), (b) MLPBFGS 1 (0.804), (c) MLPBNN 1 (0.817), (d)
MLP 0.2 (0.978).
Figure 4.22: Output distributions of the MVA methods using mH, cosΘ and pT
as input variables, for the signal and background training and test samples
(overtraining check). Kolmogorov-Smirnov signal (background) probabilities:
(a) BDTG 0.989 (0.969), (b) MLPBFGS 0.88 (0.947), (c) MLPBNN 0.937
(0.999), (d) MLP 0.962 (0.959).
Conclusions
The discovery or exclusion of the Higgs boson, the only unobserved particle
of the Standard Model, is one of the main goals of the LHC. In the Standard
Model the Higgs boson can decay to pairs of fermions or bosons. The decay
H → ZZ(∗) → 4l, where l indicates an electron or a muon, is known as the
“Golden channel” at the ATLAS experiment: it provides high sensitivity for the
discovery, since a narrow four-lepton invariant mass peak stands on top of a
smooth background. This channel has been the subject of this thesis.
After a presentation of the Higgs mechanism and a description of the ATLAS
detector in Chapters 1 and 2, Chapter 3 illustrates in detail the standard
cut-based Higgs search analysis in the decay channel H → ZZ(∗) → 4l that I
have performed. This analysis has been performed for Higgs boson mass (mH)
hypotheses in the full 110 GeV to 600 GeV mass range using data collected by
the ATLAS experiment in 2011, corresponding to an integrated luminosity of
4.8 fb−1. In total 71 candidate events are selected by the analysis, while in the
same mass range ∼ 62 events are expected from the background processes. The
SM Higgs boson is excluded at 95% C.L. in the mass ranges 134 GeV−156 GeV,
182 GeV−233 GeV, 256 GeV−265 GeV and 266 GeV−415 GeV.
This thesis further aimed to investigate the possibility of integrating the
standard Higgs search analysis with a multivariate one, in order to better
discriminate between signal and background events. In Chapter 4 an angular
distribution study has therefore been carried out on the Monte Carlo signal and
background selected events, in order to find discriminant variables other than
the invariant mass. The angular variable cosΘ of the momentum of the incoming
quark in the CM frame, together with the Higgs transverse momentum and
invariant mass, have been selected as input variables for the multivariate
analysis, applied to a Higgs signal sample at mH = 360 GeV.
TMVA, the Toolkit for Multivariate Data Analysis, provides a good set of tools
with the benefit of a common interface to a still-growing number of methods.
Nevertheless, the user still has to check the classifier results carefully and spend
time optimizing the settings of each classifier. In addition, a sufficient number
of candidates of known event class (signal/background), dedicated exclusively
to training and testing, must be available in order not to suffer from systematic
effects.
In this thesis I have tested four MVA methods, a boosted decision tree and
three artificial neural networks; the tuning of their parameters to optimize the
separation between the signal and background candidates has been described
in Chapter 4. The curves of background rejection versus signal efficiency have
then been compared to the one derived from the cut-based analysis, showing a
growing improvement as the number of input variables increases. The two
strategies analyse the signal and background distributions in two different ways,
therefore a direct comparison between the corresponding S/√B values is not
possible, or at least is not sufficient, to claim the success of the MVA application
to the standard analysis. However, this comparison suggests that the best gain
with respect to the cut-based analysis is obtained by applying the decision tree
to the complete set of input variables.
I have carried out this study in the high mass range, in which the SM Higgs
boson has already been excluded, because here the considered variables, in
particular the angular one, are more discriminating. This study is therefore a
preliminary investigation of the potential of such an analysis, starting from a
region in which the separation between signal and background is not too low;
the knowledge gained about the MVA tools can then be extrapolated to the low
mass region, to help the Higgs search in the not yet excluded range.
From my thesis work it can be concluded that the application of a multivariate
analysis to the Higgs search could enhance the discovery sensitivity. At low
mass, however, the angular variables are less discriminating, so they should be
complemented with other variables. The tuning of the MVA methods could
then turn out to be more difficult, increasing the complexity of the neural
network or decision tree structure needed to optimize their performance, and
hence the computing time and the risk of overtraining.
The future prospects are thus a new H → ZZ(∗) → 4l optimization study for a
low mass Higgs, followed by the application of the MVA methods to new data.
In the coming months this channel, together with the H → γγ channel, will
indeed play a central role in the Higgs discovery or exclusion.
Appendix A
Vector Boson Fusion Angular Distributions
Figure A.1: Angular distributions (cosΘ, cosθ1, cosθ2, Φ, φ1, φ2) for the vector
boson fusion Higgs signal at mH = 130 GeV (brown histogram) and for the ZZ
irreducible background in the mass window 130 GeV < mH < 150 GeV (red
histogram). These plots show angular distributions for reconstructed events.
Figure A.2: Angular distributions (cosΘ, cosθ1, cosθ2, Φ, φ1, φ2) for the vector
boson fusion Higgs signal at mH = 360 GeV (brown histogram) and for the ZZ
irreducible background in the mass window 300 GeV < mH < 420 GeV (red
histogram). These plots show angular distributions for reconstructed events.
Bibliography
[1] K. Nakamura et al. “Particle Data Group”. J. Phys. G 37, 075021, 2010.
[2] L. Anchordoqui and F. Halzen. Lessons in Particle Physics. University of
Wisconsin, 2009. arXiv:0906.1271v2.
[3] A. Signer. The Standard Model. University of Durham, 2002.
[4] W. Greiner. Relativistic Quantum Mechanics Wave Equations. Springer,
third edition.
[5] S. Dawson. Introduction to electroweak symmetry breaking, 1999.
arXiv:hep-ph/9901280v1.
[6] T. Hambey and K. Riesselmann. SM Higgs mass bounds from theory. DESY
97-152 D0-TH 97/18, 1997.
[7] M. Gomez-Bock, M. Mondragon, M. Muhlleitner, R. Noriega-Papaqui,
I. Pedraza, M. Spira, and P. M. Zerwas. “Electroweak symmetry breaking
and Higgs physics: basic concepts”. Journal of Physics: Conference Series
18:74–135, 2005. arXiv:hep-ph/0712.2419v1.
[8] L. Reina. TASI 2004 lecture notes on Higgs boson physics, 2005. arXiv:hep-
ph/0512377v1.
[9] M. Spira and P. M. Zerwas. Electroweak symmetry breaking and Higgs
physics. CERN-TH/97-379 DESY 97-261, 1998. arXiv:hep-ph/9803257v2.
[10] The TEVNPH Working Group for the CDF and DØ Collaborations. Com-
bined CDF and DØ Search for Standard Model Higgs Boson Production with
up to 10 fb−1 of Data. FERMILAB-CONF-12-065-E, CDF Note 10806, DØ
Note 6303, 2012.
[11] H. Flacher, M. Goebel, J. Haller, A. Hocker, K. Monig, and J. Stelzer.
“Revisiting the global electroweak fit of the Standard Model and beyond
with Gfitter”. The European Physical Journal C 60, 543, 2009.
arXiv:0811.0009v4.
[12] M. Baak, M. Goebel, J. Haller, A. Hocker, D. Ludwig, K. Monig, M. Schott,
and J. Stelzer. “Updated status of the global electroweak fit and constraints
on new physics ”. Submitted to the European Physical Journal C, 2011.
arXiv:1107.0975v1.
[13] U. Aglietti et al. Tevatron for LHC report: Higgs. FERMILAB-CONF-06-
467-E-T, 2007. arXiv:hep-ph/0612172v2.
[14] A Generic Fitter Project for HEP Model Testing. URL http://gfitter.
desy.de/.
[15] M. Spira. QCD effects in Higgs physics. CERN-TH/97-68, 1997. arXiv:hep-
ph/9705337v2.
[16] A. Djouadi. The anatomy of electro-weak symmetry breaking. Tome I: The
Higgs boson in the Standard Model. LPT-Orsay-05-17, 2005. arXiv:hep-
ph/0503172v2.
[17] LHC Higgs Cross Section Working Group, S. Dittmaier, C. Mariotti,
G. Passarino, and R. Tanaka (Eds.). Handbook of LHC Higgs cross sec-
tions: 1. Inclusive observables. CERN-2011-002 (CERN,Geneva), 2011.
arXiv:hep-ph/1101.0593v3.
[18] The ATLAS Collaboration. Expected performance of the ATLAS exper-
iment: Detector, Trigger and Physics. CERN-OPEN-2008-020, CERN,
Geneva, 2008.
[19] A. Djouadi, J. Kalinowski, and M. Spira. HDECAY: a program for Higgs
boson decays in the Standard Model and its supersymmetric exstension,
1997. arXiv:hep-ph/9704448v1.
[20] V. Buscher and K. Jakobs. “Higgs boson searches at hadron colliders”. In-
ternational Journal of Modern Physics Letters A Vol. 20, 2005. arXiv:hep-
ph/0504099v1.
[21] The ATLAS Collaboration. ATLAS detector and physics performance.
Technical design report, vol. 2. CERN-LHCC-99-15, 1999.
[22] E. Lyndon (ed.) and B. Philip (ed.). “LHC machine”. J. Instrum. 3,
(S08001), 2008.
[23] J. Stirling. Available at. URL http://projects.hepforge.org/mstwpdf/
plots/plots.html.
[24] Daniel Fournier. Performance of the LHC, ATLAS and CMS in 2011. HCP,
2011.
[25] M. Benedikt, P. Collier, V. Mertens, J. Poole, and K. Schindl. LHC
Design Report. 3. The LHC injector chain, 2004.
[26] Available at. URL http://lhc-machine-outreach.web.cern.ch/
lhc-machine-outreach/lhc_in_pictures.htm.
[27] G Aad, N Groot, F Filthaut, and D Froidevaux. “The ATLAS experiment
at the CERN Large Hadron Collider”. J. Instrum. 3, (S08003), 2008.
[28] The ATLAS detector available at. URL http://www.atlas.ch/detector.
html.
[29] The ATLAS Collaboration. ATLAS liquid-argon calorimeter: Technical
Design Report. ATLAS-TDR-002, CERN-LHCC-96-041, CERN, Geneva,
1996.
[30] The ATLAS Collaboration. “Drift Time Measurement in the ATLAS
Liquid Argon Electromagnetic Calorimeter using Cosmic Muons”. The
European Physical Journal C - Particles and Fields, 70:755–785, DOI
10.1140/epjc/s10052-010-1403-6, 2010.
[31] The ATLAS Collaboration. ATLAS tile calorimeter: Technical Design Re-
port. ATLAS-TDR-003, CERN-LHC-96-042, CERN, Geneva, 1996.
[32] D. M. Gingrich et al. “Construction, assembly and testing of the ATLAS
hadronic end-cap calorimeter”. J.Instrum. 2, (P05005), 2007.
[33] The ATLAS magnet available at. URL http://www.atlas.ch/magnet.
html.
[34] Muon system layout (Parameter Books). Available at. URL http://atlas.
web.cern.ch/Atlas/GROUPS/MUON/layout/parameter_book.html.
[35] S. Palestini. The muon spectrometer of the ATLAS experiment. CERN,
Geneva, 2002.
[36] O. Fedin. Reconstruction and identification of photons and electrons with
the ATLAS detector. Petersburg Nuclear Physics Institute, St. Petersburg,
188300, Russia, 2009.
[37] C. Anastopoulos et al. Search for the Standard Model Higgs boson in the
decay channel H → ZZ(∗) → 4l with 4.8 fb−1 of pp collisions at √s = 7
TeV. ATLAS Internal Note, December 20, 2011. Draft version.
[38] Search for the SM Higgs boson in the decay channel H → ZZ(∗) → 4l,
Winter 2012. URL https://twiki.cern.ch/twiki/bin/viewauth/
AtlasProtected/HiggsZZllllWinter2012.
[39] T. Binoth, N. Kauer, and P. Mertsch. Gluon-induced QCD Corrections to
pp → ZZ → l l̄ l′ l̄′, 2008. arXiv:hep-ph/0807.0024v1.
[40] S. Alioli, P. Nason, C.Oleari, and E. Re. “NLO Higgs boson production via
gluon fusion matched with shower in POWHEG”. JHEP 0904:002, 2009.
arXiv:hep-ph/0812.0578v2.
[41] P. Nason and C.Oleari. “NLO Higgs boson production via vector-boson fu-
sion matched with shower in POWHEG”. JHEP 1002:037, 2010. arXiv:hep-
ph/0911.5299v2.
[42] T. Sjostrand, S. Mrenna, and P. Z. Skands. “PYTHIA 6.4 Physics and
Manual ”. JHEP 0605:026, 2006. arXiv:hep-ph/0603175v2.
[43] P. Golonka and Z. Was. “PHOTOS Monte Carlo: a precision tool for
QED corrections in Z and W decays”. Eur. Phys. J. C 45:97-107, 2005.
arXiv:hep-ph/0506026v2.
[44] Z. Was and P. Golonka. “TAUOLA as tau Monte Carlo for future applica-
tions”. Presented at International workshop on Tau Lepton Physics, 2004.
arXiv:hep-ph/0411377v1.
[45] Z. Was. “TAUOLA for simulation of tau decay and production: perspec-
tives for precision low energy and LHC applications”. Presented at Interna-
tional workshop on Tau Lepton Physics, 2011. arXiv:hep-ph/1101.1652v1.
[46] S. Catani, D. de Florian, M. Grazzini, and P. Nason. “Soft-gluon resum-
mation for Higgs boson production at hadron colliders”. JHEP 0307:028,
2003. arXiv:hep-ph/0306211v1.
[47] The ATLAS Collaboration. Search for the Standard Model Higgs boson in
the decay channel H → ZZ(∗) → 4l with 4.8 fb−1 of pp collisions at √s = 7
TeV. ATLAS-CONF-2011-162, December 13, 2011.
[48] M. L. Mangano, M. Moretti, F. Piccinini, R. Pittau, and A. D. Polosa.
“ALPGEN, a generator for hard multiparton processes in hadronic colli-
sions”. JHEP 0307:001, 2003. arXiv:hep-ph/0206293v2.
[49] J. M. Butterworth, J. R. Forshaw, and M. H. Seymour. “Multiparton
Interactions in Photoproduction at HERA”. Z. Phys. C 72:637-646, 1996.
arXiv:hep-ph/960137v1.
[50] S. Frixione, P. Nason, and B. R. Webber. “Matching NLO QCD and parton
showers in heavy flavour production”. JHEP 0308:007, 2003. arXiv:hep-
ph/0305252v2.
[51] G. Corcella et al. “HERWIG 6.5: an event generator for Hadron Emission
Reactions With Interfering Gluons (including supersymmetric processes)”.
JHEP 0101:010, 2002. arXiv:hep-ph/011363v3.
[52] TechnicalitiesForMedium1. URL https://twiki.cern.ch/twiki/bin/
viewauth/AtlasProtected/TechnicalitiesForMedium1.
[53] M. Hance, D.Olivito, and H. H. Williams. Performance studies for
e/gamma calorimeter isolation. University of Pennsylvania, Lawrence
Berkeley National Laboratory, 2011.
[54] C. Anastopoulos et al. ATLAS sensitivity prospects for the Standard Model
Higgs boson in the decay channel H → ZZ(∗) → 4l at √s = 10 and 7 TeV.
ATL-PHYS-INT-2010-062, 2010.
[55] The ATLAS Collaboration. “Search for the Standard Model Higgs boson
in the decay channel H → ZZ(∗) → 4l with 4.8 fb−1 of pp collisions at
√s = 7 TeV with ATLAS”. Preprint submitted to Phys. Lett. B, March 2,
2012. CERN-PH-EP-2012-014. arXiv:hep-ex/1202.1415v3.
[56] The ATLAS Collaboration. “An update to the combined search for the
Standard Model Higgs boson with the ATLAS detector at the LHC using up
to 4.9 fb−1 of pp collision data at √s = 7 TeV”. ATLAS-COM-PHYS-2012-019,
March 6, 2012.
[57] J. S. Gainer, K. Kumar, I. Low, and R. Vega-Morales. “Improving the sensi-
tivity of Higgs boson searches in the golden channel”. International Journal
of Modern Physics Letters A Vol. 20, 2011. arXiv:hep-ph/1108.2274v2.
[58] A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E. von Toerne, and
H. Voss. TMVA 4-Toolkit for Multivariate Data Analysis with ROOT, 2009.
arXiv:physics/0703039v5.
[59] P. Speckmayer, A. Hocker, J. Stelzer, and H. Voss. “The Toolkit for Mul-
tivariate Data Analysis, TMVA 4”. Journal of Physics: Conference Series
219, 2010. 032057.
[60] B. P. Roe, H. Yang, J. Zhu, Y. Liu, I. Stancu, and G. McGregor. “Boosted
decision trees as an alternative to artificial neural networks for particle
identification”. Nuclear Instruments and Methods in Physics Research A
543:577-584, 2005.
[61] Cern brochure, 2009. URL http://cdsweb.cern.ch/record/1165534/
files/CERN-Brochure-2009-003-Eng.pdf.
[62] E. Bruning, P. Collier, P. Lebrun, S. Myers, R. Ostojic, J. Poole, and
P. Proudlock. LHC Design Report. 1. The LHC main ring. Editorial Board,
CERN, 2004.
[63] E. Bruning, P. Collier, P. Lebrun, S. Myers, R. Ostojic, J.Poole, and
P.Proudlock. LHC Design Report. 2. The LHC infrastructure and general
services. Editorial Board, CERN, 2004.
[64] LHC homepage. URL http://lhc.web.cern.ch/lhc/.
[65] LHC collision rate. URL http://lhc-machine-outreach.web.cern.ch/
lhc-machine-outreach/collisions.htm.
[66] The ATLAS Collaboration. ATLAS muon spectrometer: Technical Design
Report. ATLAS-TDR-010, CERN-LHCC-97-022, CERN, Geneva, 1997.
[67] R. C. Fernow. Introduction to experimental particle physics. Cambridge
University Press, New York, 1986.
[68] The Snowmass Working Group on Precision Electroweak Measurements.
Present and future electroweak precision measurements and the indirect de-
termination of the mass of the Higgs boson. FERMILAB-CONF-02/010-T,
2002. arXiv:hep-ph/0202001v1.
[69] S. Gentile. Search for Higgs boson with the ATLAS detector.
[70] I. Tsukerman on behalf of CMS and ATLAS collaborations. Discovery
potential at the LHC: channels relevant for SM Higgs. ITEP, Moscow,
Russia, 2008. arXiv:hep-ph/0812.1458v1.
I wish to thank Professor Anna Di Ciaccio for giving me the opportunity to
work in such a stimulating and innovative field of research, and for the great
availability and kindness she has shown me.
Heartfelt thanks go to Doctor Andrea Di Simone, whose help has been
fundamental to the success of this work. I also wish to thank Doctor Luca
Mazzaferro, always ready to solve pressing problems so as not to slow down the
thesis work, and always present with constant support and interest in the study
I carried out.
I especially thank my fellow students, Andrea, Damiano, Giulio, Lorenzo,
Ludovico and Mattia, with whom I have shared these five years and who
brightened the days spent together at university.
A special thank-you goes to my two dearest friends, Flavia and Ilaria, at my
side on every occasion of both study and leisure, who often sacrificed themselves
as my outlets in these last months. I thank Flavia for her cheerfulness and
light-heartedness, contagious even in the most difficult moments, never letting
you lose sight of the goal of so much hard work and indeed contributing to its
achievement with greater serenity. To Ilaria I owe the realization, as well as the
writing, of this thesis: without her unfailing advice and suggestions and her
daily presence it would not have been possible.
Finally, I wish to thank with affection my whole family, for their constant
encouragement and support. In particular, my most sincere thanks go to my
parents, my brother and my sister, always ready to help me in any situation in
order to ease my studies, present every day with understanding and much
patience, in addition to an unfailing, and perhaps not always deserved, daily
dose of cuddles.