
Scalar top pair production in ATLAS

Sander van Til

23rd August 2007

Master Thesis

National Institute for Nuclear Physics and High Energy Physics

University of Amsterdam

Supervisor:

Paul de Jong

Abstract

In this thesis, stop-pair creation in the SU3 mSUGRA model has been investigated. This includes looking at the decay modes and their branching fractions. Aside from this, the discrimination potential of these stop-pair events has been studied, with semi-leptonic top-pair events and other SU3 SuSy events as background. Also, the discovery potential of the full SU3 SuSy signal has been researched, again with semi-leptonic top pairs as background.

Contents

1 Introduction

2 The Standard Model and its shortcomings

3 Supersymmetry
  3.1 What is supersymmetry?
  3.2 What does SuSy solve?
  3.3 mSUGRA

4 The Experiment
  4.1 The Large Hadron Collider
  4.2 The ATLAS detector

5 Frameworks
  5.1 Samples and TVModularAnalysis
  5.2 Toolkit for MultiVariate Analysis (TMVA)

6 Kinematics
  6.1 Discriminating variables
  6.2 SuSy Kinematics
  6.3 Top Kinematics
  6.4 Stop Kinematics
  6.5 Jet and Lepton trigger conditions
  6.6 B-tagging

7 SuSy Discovery Potential
  7.1 Signal optimization using ‘Histogram Division’
    7.1.1 Optimizing with single variables
    7.1.2 Optimizing with double variables
  7.2 Signal optimization using TMVA

8 Stop signal versus the rest
  8.1 Final states
  8.2 Signal optimization using ‘Histogram Division’
  8.3 Signal optimization using TMVA
    8.3.1 Results when training the backgrounds individually
  8.4 Analysis without demanding a lepton
    8.4.1 Results when training the backgrounds individually

9 Conclusions

References

1 Introduction

The present description of elementary particles is given to good precision by the Standard Model. Among other things, this model predicts particle properties and interactions that are in agreement with observations made at experimental setups like the LEP accelerator in Geneva and the Tevatron in Chicago. Although the Standard Model has booked big successes, like the unification of electromagnetism and the weak nuclear force into a single theory (which still has to be experimentally confirmed by the discovery of the Higgs boson), the theory is far from complete. One of the main problems is the quadratic mass divergence of the Higgs boson, making its mass go to infinity. Besides this problem, there are still unexplained phenomena, like why the top quark is so heavy compared to all other particles in the Standard Model.

Aside from this, the dream and belief of many physicists that the whole of nature can be described by a single theoretical framework has been strengthened by the ‘electroweak unification’. To achieve a Grand Unified Theory (GUT), the strong nuclear force has to be unified with the electroweak theory, and this does not seem to be possible in the Standard Model in its present form. A bigger problem is that, for a ‘Theory Of Everything’ (TOE), gravity has to be included as well, which is fully beyond the reach of the Standard Model. This means the Standard Model cannot be a TOE, and physicists are doing extensive research into theories ‘beyond the Standard Model’.

One of the candidates for a GUT is supersymmetry (SuSy), which is, to most physicists, the best candidate for the next step towards a grand unified theory. Introducing SuSy in the Standard Model provides solutions to several of its main problems and can theoretically lead to a unification of the strong nuclear force coupling constant with the other two coupling constants. SuSy is a complex theory, and often the choice is made to start off with a highly restricted version of SuSy, called mSUGRA. It is a theory that becomes relevant at high energy scales, although the exact energy scale where it becomes relevant is not fully known, and there has been no experimental proof of SuSy in past and present experiments.

The LHC, now under construction at CERN, Geneva, is an accelerator that will operate at the frontier of experimental physics and will access the highest energy regions so far (up to a centre-of-mass energy of 14 TeV). This highly energetic region is a scale where SuSy could reside, and searches for experimental proof of SuSy are being prepared as we speak. One of the detectors, ATLAS, is a good candidate for the discovery of SuSy (if SuSy resides in the reachable energy scales).

In this research, an analysis is done on simulated supersymmetric events in the ATLAS detector. An analysis has been written to investigate the SuSy discovery potential at the SU3 mSUGRA phase space point. Aside from this, a more inclusive investigation has been done, looking at a specific supersymmetric process called ‘scalar top pair creation’. This stop pair production is a rare process and shows some resemblance to its Standard Model counterpart, top pair creation. It is useful to gather as much information as possible on top quarks (and their scalar partners), since the top quark mass is important in fixing the masses of other particles in the Standard Model. The significance of the stop signal is studied, together with the branching fractions and topologies of stop decay. It will become clear that certain stop decays are suitable for top quark reconstruction.


2 The Standard Model and its shortcomings

The Standard Model [1–3] is a theoretical framework that describes elementary particles and their interactions. It has proven to be extremely accurate, making theoretical predictions which are in good agreement with experimental observations. The theory is not complete, though, since it does not include the theory of gravity. As far as we can tell, all of nature abides by the laws of four fundamental forces, namely the electromagnetic force, the strong and the weak nuclear forces, and the gravitational force. These forces are mediated by their own particle(s), called force carriers or gauge bosons. The four forces are:

• Electromagnetism. This is the most commonly known force. It acts on particles that have electric charge. The boson associated with the electromagnetic interaction is the photon; it is a massless particle and the force therefore has an infinite range. Electromagnetism is responsible for all electric and magnetic phenomena, like current and light.

• Weak force. This force is responsible for radioactive processes like radioactive decay and for nuclear fusion. The force-mediating bosons are the charged W particles (+1 and −1) and the neutral Z particle. Since these force carriers are very heavy (O(100 GeV)), the range of this force is very small.

• Strong force. This force keeps the quarks together in e.g. protons and neutrons. It acts only on particles with ‘color’ charge (quarks), through bosons called gluons.

• Gravity. This force affects all particles that have a finite amount of energy. Theoretically, the force is mediated by the graviton and has an infinite range.

These forces act on ‘regular’ particles, out of which all matter is built. All visible matter (including ourselves) is built up out of three particles: the up quark, the down quark (which have charge +2/3 and −1/3, respectively) and the electron (a ‘lepton’, which has charge −1). For instance, a proton consists of two up quarks and one down quark, and hydrogen consists of this proton with one electron surrounding it. But with time and better, more accurate experimental set-ups, more particles were observed. During the 20th century, physicists discovered more types of quarks and more types of leptons. The third quark discovered was the strange quark; it has charge −1/3 and is therefore categorized as a ‘down-type’ quark. It differs from the down quark only in mass (it is heavier). Later, the charm quark (up-type), the bottom quark (down-type) and, finally, the top quark were discovered, each being, in order of discovery, heavier than its ‘same-type brothers’. The same thing happened with the leptons: the electron turned out to have two heavier ‘brothers’, namely the muon and the tau lepton (or tauon). Aside from these charged leptons, each of the charged leptons has a neutral, nearly massless partner, named the ‘neutrino’ by Fermi. So far, there has been no indication of a fourth family of quarks or leptons. It has been shown by the LEP experiments that there are no more than three types of neutrinos, making it even more plausible that there are only three generations of particles. All particles have a so-called anti-particle, which has the same mass but opposite charge. In figure 1, an overview of all ‘building blocks’ is shown.

Mathematically, the Standard Model is based on the SU(3)_C × SU(2)_L × U(1)_Y gauge group, a product of different gauge groups. The U(1) gauge symmetry belongs to electromagnetism and means that the theory is gauge invariant under U(1) transformations (which are phase transformations). The SU(2)_L factor governs the weak force and is a symmetry of weak isospin. In quantum field theory, one can project a field into two chirality states, left-handed and right-handed. In the SU(2) symmetry group, the left-handed part is projected out, to keep the Standard Model free of anomalies. The SU(3)_C symmetry belongs to the strong force sector.


Figure 1: The particles of the Standard Model.

Electroweak unification

As said, the Standard Model incorporates only the first three of these forces. In the 1960s it was shown by Glashow, Weinberg and Salam that electromagnetism and the weak force are unified at high energies [4, 5]. This means the forces were one and the same in an earlier, hotter universe, and all bosons were massless at that point. This symmetry of forces appears to have been broken somewhere along the evolution of the universe, since the W and Z bosons are so heavy while the photon is massless. A possible scenario for this breaking of symmetry is given by introducing an additional field: the Higgs field. The Higgs interaction potential changes in shape through this spontaneous symmetry breaking and acquires a non-zero vacuum expectation value (vev). It is believed that all particles acquire mass by interacting with this Higgs field (or rather: with its vev); since the field couples to only three of the four gauge bosons, this provides a solution to the riddle of why the W and Z bosons are so heavy while the photon is massless. The stronger a particle couples to the Higgs field, the heavier it is. Quantizations of this field are called Higgs bosons. It is this boson that physicists are looking for at the LHC as proof that the ‘Higgs mechanism’ is right. This unified ‘electroweak’ force came with mathematical inconsistencies¹, which were solved by Veltman and ’t Hooft. But again there are mathematical problems: this time, the mass of the Higgs boson becomes infinite when looking at loop diagrams². This is one of the biggest problems of the Standard Model of the past decades, and although there are several theories which solve this (at least partially), there has been no experimental proof of the existence of these ‘beyond the Standard Model’ models and/or particles. Also, there is the still unexplained observation of the huge mass differences between quarks (e.g. Mtop/Mup ∼ O(10⁵)).

One of the theories that resolves some of the main issues in the Standard Model is supersymmetry. In section 3 it is explained what supersymmetry is, why it is a likely candidate to describe our universe, and how it fixes the Standard Model.

¹ These inconsistencies are often called ‘anomalies’. In this case, probabilities of certain reactions gave infinities, where one expects a number between 0 and 1. The theory is said to be non-renormalizable.

² More specifically, there is a quadratic mass term in the lagrangian which diverges.


3 Supersymmetry

3.1 What is supersymmetry?

Supersymmetry [4, 6–11] is a symmetry that relates particles of different spin³. More accurately: all half-integer spin particles (fermions) have an integer spin partner (a boson), and the other way around. All fermions are spin-1/2 particles, and they all have a supersymmetric partner with spin 0 (called a ‘scalar’ particle). All bosons, which have spin 1, spin 0 or (in the case of the hypothetical graviton) spin 2, have supersymmetric partners with half-integer spin (all have spin 1/2, except the gravitino, which has spin 3/2). When this symmetry was introduced in the 1970s⁴, physicists soon found out that none of the known bosons could be ‘superpartners’ of the fermions. This meant all Standard Model particles needed their own superpartner.

For example, the top quark is a fermion; its SuSy partner is the ‘stop’ quark (short for ‘scalar top’), which has spin 0. The partners of the W and Z bosons are called ‘wino’ and ‘zino’. So the Standard Model particle spectrum (with its fermions and gauge bosons) is doubled by adding SuSy with its ‘sfermions’ and gauginos. There is, however, no a priori reason to project out the left-handed states of the SU(2) group as is done in the SM; therefore all SuSy sfermions occur in both left and right states, and the SM particle spectrum is actually more than doubled. Note that, since sfermions are scalars, the ‘left’ and ‘right’ states are not chirality states, but merely the superpartners of the two ‘handed’ particles of the SM. The gauginos are not pure eigenstates, but mix to form mass eigenstates. The neutral gauginos (the zino, the bino⁵ and the two neutral higgsinos) mix to form ‘neutralinos’. The charged gauginos (the wino and the charged higgsino) mix to form ‘charginos’. In most models, the mass difference between the left- and right-handed stop squark (and the sbottom squark as well) is small, so these two states mix to form the mass eigenstates t̃1 and t̃2. Firstly, several problems for which SuSy provides an answer are discussed. After that, mSUGRA, the most commonly used model of SuSy, is explained.

3.2 What does SuSy solve?

Supersymmetric GUTs

Physicists like to believe that, in the end, there is one all-encompassing theory which describes all interactions in the universe. At first sight this goal looks far-fetched, since all forces seem to act by the laws of their own, distinct theories. Nevertheless, the first step towards unification was made by J.C. Maxwell in the 19th century, who unified the theories of electricity and magnetism into a single theory of ‘electromagnetism’. With the discovery of electroweak symmetry, the merging of electromagnetism and the weak interaction into one theory, there was, again, a clear hint of the existence of a unified field theory. The merging of these theories is obtained through a phenomenon called ‘the running of the coupling constants’. Each theory has a certain constant which describes how ‘hard’ the force acts on particles that are able to feel it. This is where the names ‘weak’ and ‘strong’ force come from. These coupling constants change or ‘run’ when the energy is increased. This is because, when probing an interaction at higher and higher precision (i.e. higher frequencies, i.e. shorter time scales), Heisenberg’s uncertainty relation allows the creation of ‘virtual’, intermediate particles. These newly introduced interactions renormalize the coupling constant, making it dependent on the energy (since these interactions are only seen at smaller distances, which correspond to higher frequencies, which are a measure of the energy). So, in the electroweak unification, the coupling constants of two theories come together at a certain energy scale (Mweak), making them indistinguishable.

³ Like the Standard Model, it is an SU(3) × SU(2) × U(1) group.
⁴ Initially, SuSy was introduced by string theorists, who needed an additional symmetry for string theory to work.
⁵ This corresponds to the field responsible for the photino, the superpartner of the photon.

After this discovery physicists started to look for an even more encompassing theory: a Grand Unified Theory (GUT). Over the years, several theories have been developed that try to merge the strong force with the electroweak force. It turns out that in almost all theories there is no way to let the coupling constants of the strong, electromagnetic and weak forces come together. One of the few theories that does achieve this is SuSy (see figure 2).

Figure 2: Left: the inverse coupling constants in the SM. Right: the inverse coupling constants when SuSy is introduced.
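The behaviour in figure 2 can be sketched numerically with the standard one-loop renormalization group evolution, 1/α_i(Q) = 1/α_i(MZ) − b_i/(2π) ln(Q/MZ). The sketch below is illustrative and not part of the thesis analysis; the input values at MZ and the one-loop coefficients b_i are the commonly quoted ones.

```python
import math

MZ = 91.2                       # Z boson mass in GeV
inv_alpha = [59.0, 29.6, 8.5]   # 1/alpha_1 (GUT-normalized), 1/alpha_2, 1/alpha_3 at MZ
b_sm = [41/10, -19/6, -7]       # one-loop beta coefficients in the Standard Model
b_mssm = [33/5, 1, -3]          # one-loop beta coefficients in the MSSM

def run(inv_a0, b, Q):
    """Evolve the inverse couplings from MZ up to scale Q (one loop)."""
    return [a - bi / (2 * math.pi) * math.log(Q / MZ) for a, bi in zip(inv_a0, b)]

Q_gut = 2e16  # a typical GUT scale in GeV
sm = run(inv_alpha, b_sm, Q_gut)
mssm = run(inv_alpha, b_mssm, Q_gut)

def spread(xs):
    return max(xs) - min(xs)

print(spread(sm) > 5)    # True: the SM couplings miss each other
print(spread(mssm) < 1)  # True: the MSSM couplings (nearly) meet
```

With the SuSy particle content, all three inverse couplings land within about 0.1 of each other near 2·10¹⁶ GeV, while in the SM they stay almost 9 units apart.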

Hierarchy problem

In quantum field theory, the probability of a certain process occurring is calculated by taking the sum over all possible ways to end up with the wanted initial and final state. This is known as the Feynman path integral formulation. Now, when computing the interactions of the Higgs terms in the lagrangian, there are contributions to the ‘mass-squared’ term that are quadratically divergent. This means that the Higgs boson mass will blow up to the highest energy scale possible, the Planck mass scale (Mpl). If the Standard Model is to represent a ‘natural’ theory, the Higgs mass has to remain small (O(100 GeV)), and thus a cancellation of these diverging terms is needed. It has been shown that the quadratic divergence of the Higgs boson mass can be canceled by introducing a set of scalar partners of the SM fermions. In figure 3, the upper diagram causes the Higgs mass to blow up; the lower diagram exactly cancels this infinite contribution and keeps the Higgs mass ‘natural’. Note that not only the top quark but all fermions contribute to the blow-up, and all corresponding scalar partners cancel this contribution.

This cancellation occurs through two radiative correction terms in the lagrangian, which are opposite in sign [12]. The radiative correction which causes the divergence of the Higgs mass is of the form

∆m²_H = −(|λf|²/8π²) Λ²_UV + . . . ,

whereas the mass-squared term obtained by introducing scalar partners has the form

∆m²_H = (λS/16π²) Λ²_UV + . . . .

The Λ_UV term is the ultraviolet momentum cutoff scale, which is used to regulate the loop integral. By using the relation λS = |λf|², which can be set in unbroken SuSy theory and is kept after SuSy breaking by introducing ‘soft’ SuSy breaking terms, the two terms cancel each other. The factor 2 difference is accounted for, since one introduces two super-doublets, giving two times this term. As one can see, these cancelling terms do not depend, at least to first order, on the individual masses of the fermions and their scalar partners.
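As a toy illustration (not from the thesis), the cancellation can be checked numerically: with λS = |λf|², the fermion-loop and scalar-loop contributions are equal and opposite, whatever the cutoff. The coupling and cutoff values below are arbitrary.

```python
import math

lam_f = 0.99            # fermion Yukawa coupling (arbitrary illustrative value)
lam_S = abs(lam_f)**2   # the SuSy relation lambda_S = |lambda_f|^2
cutoff = 1e19           # UV cutoff Lambda_UV in GeV (arbitrary, Planck-scale)

# Fermion loop: quadratically divergent, negative contribution
delta_fermion = -abs(lam_f)**2 / (8 * math.pi**2) * cutoff**2

# Scalar partners: positive contribution, half the size each, two of them
delta_scalar = 2 * (lam_S / (16 * math.pi**2)) * cutoff**2

total = delta_fermion + delta_scalar
print(total)  # 0.0: the quadratic divergences cancel exactly
```

Breaking the relation λS = |λf|² (for instance by changing only `lam_S`) leaves a residual cutoff-squared term, which is exactly the naturalness problem described above.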

6

3 SUPERSYMMETRY

Figure 3: Two diagrams that cancel each other. Upper: the top quark loop. Lower: the stop squark tadpole.

R-Parity & Dark Matter

Since it is always allowed to introduce gauge-invariant, renormalizable terms into the lagrangian, one could easily insert a term which e.g. violates baryon (B) or lepton (L) number. This is a problem, since there has been no experimental observation of B or L violation; the most important proof of this is the non-observation of proton decay. For this reason, an ad-hoc symmetry is introduced in SuSy, called R-parity [13]. R-parity is a symmetry which avoids baryon and lepton number violation in supersymmetric theories. It is a fully conserved quantum number, defined as R = (−1)^(3(B−L)+2s). All supersymmetric particles (‘sparticles’) are R-odd (R = −1) and all Standard Model particles are R-even (R = +1). Imposing conservation of this symmetry has some consequences, namely:

• Sparticles always decay into a state which contains an odd number of sparticles

• The lightest supersymmetric particle (LSP) is absolutely stable

The LSP is, according to most theories, a neutral particle and interacts very weakly with ordinary matter. This, together with the fact that it has to be stable, makes it a very good candidate for dark matter, which is required in cosmology to describe astronomical observations.
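The parity assignments follow directly from the definition above. As a small illustration (not part of the thesis), one can evaluate R = (−1)^(3(B−L)+2s) for a few standard quantum-number assignments:

```python
from fractions import Fraction

def r_parity(B, L, s):
    """R = (-1)^(3(B-L)+2s), computed with exact rational arithmetic."""
    exponent = 3 * (Fraction(B) - Fraction(L)) + 2 * Fraction(s)
    return (-1) ** (int(exponent) % 2)

third, half = Fraction(1, 3), Fraction(1, 2)

# Standard Model particles are R-even (+1):
print(r_parity(B=third, L=0, s=half))  # quark       -> 1
print(r_parity(B=0, L=1, s=half))      # electron    -> 1
print(r_parity(B=0, L=0, s=1))         # gauge boson -> 1

# Superpartners (same B and L, spin shifted by 1/2) are R-odd (-1):
print(r_parity(B=third, L=0, s=0))     # squark    -> -1
print(r_parity(B=0, L=1, s=0))         # selectron -> -1
print(r_parity(B=0, L=0, s=half))      # gaugino   -> -1
```

Shifting the spin by 1/2 at fixed B and L always flips the sign, which is why every sparticle is R-odd and every decay chain must end in a stable LSP.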

3.3 mSUGRA

To get rid of the hierarchy problem while leaving the structure of the Standard Model unchanged, physicists extended the SM with SuSy in a minimal way. The resulting model is known as the ‘Minimal Supersymmetric Standard Model’ or MSSM. This extension adds all SuSy particles, and all couplings between them, to the SM parameters. Although this is the most obvious way to proceed, one is left with about 120 unknown parameters, introduced by e.g. the soft SuSy breaking terms in the lagrangian. Each of these parameters needs to be adjusted individually. This is an almost impossible task, and even then most parameter choices lead to unacceptable predictions, like flavor-changing neutral currents.

The easiest solution is to constrain the MSSM even further by using a model called ‘mSUGRA’ (minimal SUper GRAvity) [14]. Supergravity is a theory which combines supersymmetry and general relativity. mSUGRA uses only a minimal framework to describe supergravity, hence


the ‘minimal’ in the name. mSUGRA is, by far, the most widely used model, because of its predictive power. In this model one assumes that there is a universal scalar mass (m0), a universal gaugino mass (m1/2) and a universal trilinear coupling (A0) at the GUT energy scale. The universality of the gaugino masses is actually a result of the merging of the coupling constants, which is one of the main reasons to consider mSUGRA. The useful aspect of the mSUGRA parameterization of supersymmetry breaking is that it results in only 5 free parameters, instead of 120, namely:

• m0: Common scalar mass. All scalar particles have the same mass before the breaking of SuSy.

• m1/2: Common gaugino mass. All gauge particles in the SuSy model have the same mass before the breaking of SuSy.

• A0: Common trilinear coupling. A0 determines the scalar couplings of the squarks and sleptons (Au, Ad, Al).

• tan β: Ratio of the vacuum expectation values of the Higgs doublets: tan β = ⟨HU⟩/⟨HD⟩.⁶

• sign(µ): The sign of the Higgs mass parameter µ.⁷

Aside from this advantage, in various mSUGRA models the electroweak symmetry breaking seems to occur naturally. This is visible when, for example, looking at the third-generation interactions (t, b, τ), where M²_h decreases faster than the other mass terms due to the Yukawa couplings. It has been proven that around the electroweak scale M²_h turns negative, and this naturally generates electroweak symmetry breaking. Another important parameter in mSUGRA is the top quark mass, since the masses of the SuSy particles depend on it, as can be seen from e.g. the stop squark mass matrix (Eq. 1). The off-diagonal terms are proportional to mqAq and produce mixing in the stop sector. Generally, the mixing terms for the other squarks are negligible, since the masses of the corresponding quarks are low.

m²_t̃ = ( m²_Q + m²_t + m²_Z (1/2 − (2/3) sin²ΘW) cos 2β        mt (At + µ cot β)
         mt (At + µ cot β)                                      m²_U + m²_t + (2/3) m²_Z sin²ΘW cos 2β )    (1)
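To make Eq. 1 concrete, the sketch below diagonalizes the matrix numerically for one illustrative set of soft parameters (the values are assumptions chosen for the example, not the SU3 spectrum): the off-diagonal term mt(At + µ cot β) splits the spectrum into a lighter t̃1 and a heavier t̃2.

```python
import math

mt, mZ = 175.0, 91.2            # top and Z masses in GeV
sin2_thW = 0.231                # sin^2(theta_W)
tan_beta = 6.0
beta = math.atan(tan_beta)
mQ2, mU2 = 450.0**2, 420.0**2   # soft mass-squared terms (assumed values)
At, mu = -400.0, 350.0          # trilinear coupling and mu parameter (assumed)

# Entries of the symmetric 2x2 stop mass matrix (Eq. 1)
m11 = mQ2 + mt**2 + mZ**2 * (0.5 - (2/3) * sin2_thW) * math.cos(2 * beta)
m22 = mU2 + mt**2 + (2/3) * mZ**2 * sin2_thW * math.cos(2 * beta)
m12 = mt * (At + mu / tan_beta)  # cot(beta) = 1/tan(beta)

# Closed-form eigenvalues of a symmetric 2x2 matrix
avg = (m11 + m22) / 2
split = math.hypot((m11 - m22) / 2, m12)
m_stop1 = math.sqrt(avg - split)  # lighter mass eigenstate, stop-1
m_stop2 = math.sqrt(avg + split)  # heavier mass eigenstate, stop-2
print(m_stop1 < m_stop2)  # True: mixing pushes the eigenstates apart
```

Because m12 scales with mt, this splitting is sizeable only for the stop (and, to a lesser extent, the sbottom); for the light squarks the off-diagonal term is negligible, as stated above.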

mSUGRA parameter space

When all scalar particles and all gauginos have universal masses at the GUT scale, it is clear that setting these parameters plays an important role in the evolution of the masses of the individual particles. When calculating the resulting masses for given m0 and m1/2, one can easily see whether this still corresponds to a possible, realistic universe. For example, when the gaugino mass is large and the scalar mass is not, this will not, theoretically, lead to the electroweak symmetry breaking which is needed for the Standard Model to describe the universe. When one chooses a small gaugino mass and a large scalar mass, the LSP will not be a neutral particle. If a chargino or a ‘stau’ were the LSP and, therefore, a candidate for dark matter, it would have been observed in certain phenomena, since charged particles always feel the electromagnetic force, leaving visible evidence of their existence. Aside from these theoretical motivations, there are regions in ‘m0–m1/2’ space which can be excluded by previous experiments, like LEP and the Tevatron. Certain values could, for example, result in a model with a Higgs mass smaller than 114 GeV, which is excluded by LEP. Another bound, which is likely to exclude a large part of this parameter space, is the fact that many mass choices would lead to an

⁶ A second Higgs doublet is necessary because two Higgsinos are needed to keep the model free of anomalies. Another important reason is to have different Yukawa couplings between down- and up-type quarks, responsible for the quarks’ masses.

⁷ The µ² term appears when introducing the Higgs potential. The sign of µ is not fixed and influences some processes.


LSP which has couplings that lead to an excess of LSP production. Too large a number of heavy LSPs leads to the universe being too curved, and sometimes even closed, which is not in agreement with our observations. These results on the composition of energy and matter in the universe were obtained with the WMAP satellite [15]. This constraint, however, is less strong, since there are many cosmological models which could incorporate heavy dark matter through the introduction of cosmological constants.

In figure 4 the m0–m1/2 parameter space is shown on the left side. The grey regions are excluded by theory: SuSy being located here would either result in a charged LSP or in the absence of electroweak symmetry breaking. The blue region is excluded by previous experiments. The green regions represent models which are likely/possible descriptions of our universe. The bar on the right side of the left plot shows the χ² corresponding to the shown colors; it is a measure of the probability of a certain point, with the minimum χ² providing the best fit. The right-side plot in figure 4 shows the number of observable Higgs particles as a function of the mass of one of the Higgs particles (mA) and the mSUGRA parameter tan β. The area under the black line is excluded by the results from the LEP experiment.

Figure 4: Left: the m0–m1/2 space. Right: the tan β–mA space.

The SU3 point

The point used in this analysis is called SU3 and is located in the green region at the outer left of the left graph in figure 4. This point is outside the reach of previous experiments and still produces a sufficiently large LSP mass to be a dark matter candidate, in agreement with the WMAP observations. It is a point which is widely used to study SuSy candidates/phenomena, because it is an ‘easy’ point, which allows a quick discovery once the LHC starts running⁸. Nevertheless, the SU3 point is

⁸ Only if nature is so kind as to have SuSy located in this region, of course.


an arbitrary point in this phase space, and other points are studied as well. In this thesis, however, the focus is solely on the SU3 point. The parameters that fix the SU3 point are the following:

• m0 = 100 GeV

• m1/2 = 300 GeV

• A0 = −300 GeV

• tan β = 6

• µ > 0

As one can see on the right side of figure 4, choosing tan β ∼ 6, with mA fixed at ∼ 500 GeV, results in a model with only one observable Higgs. The value of mA is directly related to the energy scale where electroweak symmetry breaking takes place, which in this model is taken to be O(500 GeV). The model depends only weakly on A0. In figure 5 the full mass spectrum of the SU3 model is shown. The difference in masses of the sfermions compared to the fermions (or rather the inverted mass hierarchy) is produced by the renormalization group equations (RGEs) which describe the running of the coupling constants.

Figure 5: The sfermion and gaugino masses in the SU3 model.


4 The Experiment

4.1 The Large Hadron Collider

The Large Hadron Collider (LHC) is a proton-proton collider at CERN on the border of Switzerland and France, near Geneva. At this moment it is still under construction in the same 27 kilometer circumference tunnel where LEP used to be, and will become operational somewhere halfway through 2008. The maximum centre-of-mass energy at which this accelerator will produce collisions is 14 TeV, the highest ever achieved. Although the Tevatron in Chicago was the first collider to enter the TeV energy range, the LHC is the first accelerator that allows the study of physics at 14 TeV. The other reason why it is marked as a ‘discovery’ collider is the fact that it has a luminosity which has never been reached before. The main goals of the LHC include checks on the Standard Model, looking for confirmations and/or contradictions of the present theory, like the discovery of the Higgs boson, and looking for signatures of new physics. Promising new theories include supersymmetry and extra dimensions, and it is expected that at least a hint of these new theories’ predictions will be observed. It is yet unclear what is to be expected, but it is at least certain that the LHC will provide some answers, and very likely that it will raise even more questions.

The acceleration process is performed in several steps, starting with the linear accelerator LINAC2. This injects 50 MeV protons into the Proton Synchrotron Booster (PSB), which in its turn accelerates the protons up to 1.4 GeV and injects them into the Proton Synchrotron (PS). The PS increases the proton energy up to 26 GeV before injecting the protons into the Super Proton Synchrotron (SPS), which is used to further increase the protons’ energy to 450 GeV. The SPS feeds the LHC with these protons, where they finally get accelerated up to an energy of 7 TeV each.

The acceleration in the LHC is performed by Radio Frequency Cavities (RFCs). The bending of the protons is done with 1232 superconducting dipole magnets, each generating an 8 Tesla magnetic field. Just before entering an interaction point, the proton bunches are focused by multiple quadrupole magnets. The collisions of the two proton beams will take place in the four interaction points, where four big detectors are located. The time between successive bunches is of the order of 25 ns, which corresponds to 40 million collisions per second (= 40 MHz). In the first three years the LHC will operate at a luminosity of 10³³ cm⁻²s⁻¹, which corresponds to an integrated luminosity L of 10 fb⁻¹ per year, assuming the LHC will run 10⁷ s per year. In the years after that, the luminosity will be increased to 10³⁴ cm⁻²s⁻¹, corresponding to an integrated luminosity of 100 fb⁻¹ per year. In total, the expected amount of data is estimated to be around 300 fb⁻¹.
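The integrated luminosities quoted above follow directly from the instantaneous luminosity and the assumed running time; a quick cross-check (the function name is ours, and the only extra input is the unit conversion 1 fb⁻¹ = 10³⁹ cm⁻²):

```python
# Integrated luminosity L = (instantaneous luminosity) x (running time).
# Unit conversion: 1 fb^-1 corresponds to 1e39 cm^-2.

def integrated_lumi_fb(inst_lumi_cm2s, seconds):
    """Integrated luminosity in fb^-1 for a given instantaneous luminosity."""
    return inst_lumi_cm2s * seconds / 1e39

year = 1e7  # assumed effective running time of 10^7 s per year

low_phase  = integrated_lumi_fb(1e33, year)  # first three years
high_phase = integrated_lumi_fb(1e34, year)  # after the luminosity upgrade

print(low_phase, high_phase)  # 10.0 and 100.0 fb^-1 per year
```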

In total six experiments are being constructed along the LHC, including two multi-purpose detectors, ATLAS and CMS, and four more specialized detectors: LHCb, ALICE, TOTEM and LHCf. ATLAS and CMS are designed to study various aspects in several fields of physics. LHCb is designed to study CP violation in the B meson system. ALICE will study strong interaction phenomena, like the formation of quark-gluon plasmas. This will be done most effectively when, at a later stage, lead ions are accelerated instead of protons. The smaller experiments, TOTEM and LHCf, are used to study the total pp cross-section and aspects of particle emission along the beampipe, respectively.


4 THE EXPERIMENT

Figure 6: Overview of the accelerator complex and its detectors. Protons pass from LINAC2 through the PS (Proton Synchrotron) and the SPS (Super Proton Synchrotron) into the LHC (Large Hadron Collider), where the detectors ATLAS, CMS, ALICE and LHCb are located.


4.2 The ATLAS detector

Figure 7: An artistic impression of the ATLAS detector. Yellow: Inner detector - Green: Electromagnetic calorimeter - Orange: Hadronic calorimeter - Grey: Magnets - Blue: Muon spectrometer

ATLAS is the biggest of the four detectors and is a general purpose detector [16–18]. It is 45 metres long and 25 metres in diameter. It weighs 7000 tonnes and consists of several layers, with different types of detectors, which all have their own purpose. Figure 7 shows an artistic impression of ATLAS.

The ATLAS detector is a 4π detector, measuring in all spatial directions. From the interaction point in the centre outward, there are the inner detector, the electromagnetic and hadronic calorimeters and, on the outside, the muon spectrometer. Around the inner detector there is a superconducting solenoid, which generates a homogeneous 2 Tesla magnetic field, parallel to the beampipe. The most impressive part of ATLAS is the barrel toroid, which is located outside the hadronic calorimeter. Together with the end-cap toroids it generates a magnetic field perpendicular to the beampipe (following the circular shape of ATLAS in figure 7).

The inner detector

The inner detector is located directly around the interaction point. Its main purpose is to reconstruct the tracks of charged particles in the high-density region closest to the interaction point. This has to be done to a very high precision to distinguish all the different tracks, and therefore a very high detector resolution is required. The solenoid surrounding the inner detector is used to identify the charge of the particles, by looking at the direction of bending, and the momentum of the particles, by looking at the amount of curvature of the charged tracks. The inner detector consists of three parts: the silicon pixel detector, the Semi Conductor Tracker (SCT) and the Transition Radiation Tracker (TRT). In figure 8 the inner detector is shown.


Figure 8: A view of the ATLAS inner detector, showing the beam pipe, the pixel detector, the SCT, the TRT, the barrel patch panels and services. Note that the outer TRT endcaps have never been built.

The silicon pixel detector is the innermost part of the inner detector. It consists of three layers of pixel detectors in the barrel region and three discs in the endcap region. In total it contains 1744 pixel modules, with on average 47,000 pixels per module. The size of a pixel is 50 by 400 µm. It has about 80 million readout channels, which is about half of the total number of readout channels of ATLAS. A big challenge is that the pixel detector is required to be very radiation hard, since being only four centimetres away from the interaction point means a large exposure to radiation.

After the silicon pixel detector, there is the Semi Conductor Tracker. It consists of silicon strip detectors and its function is similar to that of the pixel detector. The SCT consists of four double-layer silicon strips in the barrel region and nine double-layer silicon discs per end-cap. Although it has 'only' 6.1 million readout channels in total, the SCT is the most important part for track reconstruction in the plane perpendicular to the beam.

The Transition Radiation Tracker is the outer part of the inner detector. It consists of a straw tracker. The straws are filled with a gas mixture9, which ionizes when a charged particle goes through. The straws in the barrel region are positioned parallel to the beam pipe, which makes the measurement of trajectories perpendicular to the beampipe (in the transverse plane) possible. The straws in the end-cap regions are oriented perpendicular to the beampipe. In addition to the tracking, the xenon gas in the straws is sensitive to transition radiation (photons), which is produced in the radiator material in between the straws. The amount of transition radiation is a measure of the relativistic factor10 of the particle. In this way, a distinction between e.g. electrons and pions is possible, since electrons are much lighter and produce many more photons than pions.

The calorimeters

After the inner detector and the solenoid, there are the calorimeters. From the inside out, these are the electromagnetic calorimeter (ECAL) and the hadronic calorimeter (HCAL). The calorimeters measure particle energies by absorbing them. Both calorimeters are sampling calorimeters, which means that they consist of layers of absorbing material and sampling material. The absorbing materials are of high density, which ensures that particles interact with the material. The sampling material measures the shape of the resulting particle shower and the fraction of the energy deposited, which can be used to reconstruct the initial particle's energy. A schematic overview is drawn in figure 9.

9 70% Xe, 27% CO2 and 3% O2
10 γ = E/m

Figure 9: An overview of the calorimeters and the inner detector.

The electromagnetic calorimeter absorbs the energy that particles lose through electromagnetic interactions. For example, an electron interacts with the material after a certain radiation length and emits photons in the process (bremsstrahlung). In this way the electron loses its energy and the emitted photons start a chain reaction of pair creation and ionization. This results in a shower of electromagnetic particles. Electrons, positrons and photons deposit their energy here. The ECAL measures the energy deposition to a high precision in both energy and direction11. The absorbing materials in the ECAL are lead and stainless steel. The sampling material is liquid argon, which is kept cool at a temperature of 90 K by a cryostat surrounding the ECAL. The ECAL has an accordion shape, in which absorbing and sampling materials alternate in a layered way.

The hadronic calorimeter absorbs the energy that particles coming through the ECAL deposit via strong-force interactions. The precision of the HCAL is far less than that of the ECAL, in both energy resolution and direction. In the barrel of the HCAL, the absorbing material is iron and the particle showers are sampled by scintillators. It has a layered tile structure, in which absorber and sampling material alternate. The HCAL end-cap sampling material is liquid argon.

11 The resolution of the ECAL on energy is 11.5%/√E ± 0.5% and the resolution on polar direction is 50 mrad/√E (E in GeV)


The muon spectrometer

The muon spectrometer is the outermost part of ATLAS and is, by far, the largest part in volume. The two purposes of the spectrometer are an independent muon trigger and a high-quality muon reconstruction. The muon spectrometer is an important part of ATLAS, since muons play a key role in a number of interesting physics processes and they are not stopped by the calorimeters.

Since muons do not easily deposit their energy, it has to be measured in a different way. The huge barrel toroid is designed to make this measurement possible. The muons which travel through this magnetic field curve with a magnitude that depends on the momentum they possess. With ATLAS' toroidal field, the curvature is in the η direction (parallel to the beampipe), so it is in this direction that the position has to be measured very accurately. Aside from the momentum, the charge can be measured, by looking at the direction of curvature.

As can be seen in figure 10, a muon traverses on average three layers in the barrel region. These layers all consist of chambers with monitored drift tubes, or MDT chambers. These MDT chambers are positioned in the φ direction, perpendicular to the beampipe. An MDT is an aluminium tube with a thin tungsten wire in the centre. All MDTs are filled with a carbon dioxide-argon gas mixture, which ionises when a charged particle, i.e. a muon, goes through. A signal is measured by the current flowing through the wire, due to the high voltage applied to the wire. Each layer of the muon spectrometer has MDT chambers of different sizes, with different numbers of MDT layers. Because of their orientation, the MDT chambers cannot perform a precise measurement in the φ direction. For this reason, Resistive Plate Chambers (RPCs) are located on various MDT chambers in the middle and outer layers. Together with Thin Gap Chambers (TGCs), the RPCs provide the muon trigger for ATLAS.

Triggering and data acquisition (DAQ)

When the LHC runs at high luminosity, collisions will take place at a rate of 40 MHz. The amount of data it takes to store a full event is of the order of 1 MB. This would result in 40 TB of data production per second. This is not feasible: neither the dataflow nor the storage can handle such amounts of data. Luckily, only a fraction of the collisions contain interesting physics events. Most of the collisions are not head-on and are merely a scattering of the two protons. To reduce the total data flow, and only keep the interesting processes, a selection filter has been developed, called 'triggering'. The ATLAS triggering scheme is organized in three different levels, each of them refining the previous selection.

First there is the level-1 trigger (LVL1). It is a hardware-based trigger and uses data from the calorimeters and the muon trigger chambers. For example, in the muon trigger, the RPCs and TGCs make an estimate of the muon's PT and decide whether it is high enough to pass, or fire, the trigger. In this stage, the frequency of the data flow is reduced from 40 MHz to 75 kHz. The LVL1 specifies a 'Region of Interest' (RoI) and hands it to the second-level trigger (LVL2). LVL2 is a software trigger, which uses the output (RoI) of the level-1 trigger as input. It uses data from the inner detector as well, to decide whether an event should pass to the third-level trigger. The level-2 trigger reduces the 75 kHz data flow to about 1 kHz. The Event Filter trigger (EF) is again a software trigger, using all data and complex reconstruction algorithms to fully reconstruct the events. The maximum output of the EF trigger is about 100 Hz, which corresponds to about 100 MB per second. The output of the EF trigger is written to mass storage in the form of raw data. From this raw data, ESDs and AODs (see section 5.1) are formed with reconstruction algorithms, ready for further analysis with offline software.
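The rate and data-volume reductions of the trigger chain can be tallied directly; the rates below are the design numbers quoted in this section, combined with the ~1 MB event size mentioned earlier:

```python
# Trigger chain rate reduction: 40 MHz -> 75 kHz (LVL1) -> ~1 kHz (LVL2) -> ~100 Hz (EF).
event_size_mb = 1.0  # approximate size of one full event, as quoted in the text

rates_hz = {"collision": 40e6, "LVL1": 75e3, "LVL2": 1e3, "EF": 100.0}

for stage, rate in rates_hz.items():
    # Data volume in TB/s = rate [Hz] * event size [MB] / 1e6
    print(f"{stage:9s} {rate:12.0f} Hz -> {rate * event_size_mb / 1e6:.4f} TB/s")

# The raw collision rate would mean 40 TB/s; after the Event Filter
# only ~100 MB/s is written to mass storage.
```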


Figure 10: Transverse view of the muon spectrometer, showing the end-cap toroid, the barrel toroid coils, the calorimeters, the MDT chambers, the resistive plate chambers and the inner detector.


5 Frameworks

5.1 Samples and TVModularAnalysis

Because the beam collisions of the LHC won't start until 2008, event simulations have to be used for the optimisation of reconstruction algorithms and physics analyses. Important steps in this are event generation, detector simulation, detector response, reconstruction and physics analysis.

Software structure

Simulating events starts with the event generator, which simulates the collisions. An event generator calculates the particles created in a collision, with their four-momenta. There are several event generators available. In this analysis the background (ttbar) samples were generated by the specialized generator MC@NLO, which uses HERWIG for the hadronization processes. The SuSy samples were generated by Isasugra (which is a library of ISAJET 7.71), again using HERWIG to simulate hadronization. The detector simulation is performed by GEANT4 and simulates the passage of particles through the detector and the interaction of particles with the various detector parts and the magnetic fields. The detector response simulates the responses of the detector, making, for example, the area in the simulated detector corresponding to the inner detector sensitive to charged particles, leaving hits and measuring energy deposits. Together, the event generation, detector simulation and detector response build up a realistic simulation of an event. At this point, there is nothing more than hits and energy depositions in all parts of the detector. With this 'raw data', reconstruction can be done. Various reconstruction algorithms have been developed to perform e.g. energy measurements, track fitting and vertex determination. The output of this reconstruction is written to file types called ESDs (Event Summary Data) and AODs (Analysis Object Data). The ESD contains the original event generation chain and the full reconstruction of an event and can be used for detailed reconstruction studies. The AOD is used for physics analysis. An AOD contains only information like particles and their energy and momentum, and is therefore much smaller in size than an ESD.

TopView and TVModularAnalysis

TopView is an analysis program developed at CERN, specialised for top physics. It uses the ATHENA framework, which is derived from LHCb's GAUDI. TopView reads in AODs and performs an analysis using tools and services specified by the user. For example, one can insert a tool that tries to reconstruct the W mass from all possible combinations of jets. As output, TopView gives ntuples, which are filled in a tree-like structure12 with the results of the analysis next to the original AOD data of the particles.13

The analysis described in this thesis is done with an analysis framework called 'TVModularAnalysis'. It is a modular analysis program and takes ntuples to analyse, instead of AODs (like TopView does). In fact, it takes the ntuples that are the output of TopView, using TopView as an 'ntuple dumper' and not as an analysis framework. This is done to ensure reproducibility of the analysis by third parties (using TopView as a benchmark). The results from TVModularAnalysis can be analyzed using the ROOT framework [19], developed at CERN.

12 The trees are filled with branches (e.g. 'electrons'), containing leaves (e.g. electron PT, direction in φ, etc.)
13 Though this is fully customisable, fitting the user's desired output.


Truth information of simulated events

The created TopView ntuples contain, among other trees, a TruthAll0 tree. In this tree all information is stored on how the event generator (in this case MC@NLO, HERWIG and ISAJET) built up an event. The event generator uses theoretical values of e.g. branching ratios to make a realistic simulation of a high-energy event. All particles created during an event are stored in a 'particle container', together with their momenta and decay behaviour (like the number of decay products), which makes it possible to look at the 'mother-daughter' relationships of particles. For example, when a W boson is created, one can look for the number of decay products (or daughters) of this W and find, e.g. when the W decayed hadronically, two daughters: two light quarks. Looping over the full range of the particles, one can reassemble the full decay chain of an event.

To identify all particles in the container, they are labeled by a 'PdgId', a number appointed to every particle type by the 'Particle Data Group' and meant to identify 'truth' particles. Looping over all particles in the container, looking at the mother-daughter relations and using these labels, one can build a decay chain on parton level (for example, a top quark is labeled with the number 6 and it has two 'daughters': a 5 (bottom quark) and a 24 (a W boson)). Associated charges (the distinction between particles and anti-particles) are identified by a minus sign in front of the number. Of all created particles in an event, all kinematic information is stored as well, in the form of momentum four-vectors. These properties allow one to look at the energy distributions of various particles and jets in an event or to do, combined with reconstruction data, a resolution study.
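The mother-daughter bookkeeping described above can be sketched with a toy particle record; the Particle class and the hard-coded t → b W⁺ → b u d̄ example below are illustrative stand-ins, not the actual TopView ntuple interface:

```python
# Sketch of walking a truth-particle container via mother-daughter links,
# using PdgIds: 6 = top, 5 = bottom, 24 = W+, 2 = u, -1 = dbar.
from dataclasses import dataclass, field

@dataclass
class Particle:
    pdg_id: int                 # Particle Data Group code (sign encodes charge)
    daughters: list = field(default_factory=list)

# t -> b W+, with the W+ decaying hadronically into two light quarks
w_plus = Particle(24, [Particle(2), Particle(-1)])
top = Particle(6, [Particle(5), w_plus])

def decay_chain(p, depth=0):
    """Recursively collect the decay chain on parton level, indented by depth."""
    lines = ["  " * depth + str(p.pdg_id)]
    for d in p.daughters:
        lines += decay_chain(d, depth + 1)
    return lines

print("\n".join(decay_chain(top)))
```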

When looking at specific processes in detail, it is wise to do a full truth reconstruction of the decay chain of the process of interest. In the first place because one can look at the event topology, and secondly because one can use this truth reconstruction to do a matching study between true and reconstructed particle tracks and jets.

Full reconstruction of simulated events

The AODs and TopView ntuples contain a tree (FullReco0) which is filled with simulated detector responses. Using a very accurate description of the ATLAS geometry, the event generator, together with the detector simulation, detector response and reconstruction, has computed all interactions of all created and decaying particles with the different types of detectors and all types of material. In this way a realistic reconstruction of a pp collision can be made. The information used to do this includes particle properties like decay length (which depends on the particle's mass and momentum) and detailed information on all material in the detector, even including non-detector construction material. Using these properties together with the detailed description of ATLAS, the event is simulated very realistically. The newest release of simulated events is even more realistic, since it uses a misaligned ATLAS geometry instead of a perfect 'blueprint' design, which will never be realized due to the bending of material caused by gravity (sagging under its own weight) and the Lorentz forces that arise from the large magnetic fields of the barrel toroid and solenoid.

This tree contains all reconstructed particles and jets, as ATLAS will reconstruct them in real life. Electrons and muons are identified and measured when passing through the inner detector and ECAL (and the MDTs in the case of the muons). Of course the detector doesn't know which particles pass through; only tracks and energy depositions are measured. To be able to identify a set of hits etc. as a certain particle, intelligent 'pattern recognition' software has to be written. Charged tracks through the inner detector, followed by a large, narrowly distributed energy deposition in the ECAL and hardly to no energy deposition in the HCAL, will be, with a certain likelihood, recognized as an electron. Muons are identified by charged tracks through the inner detector and ECAL, followed by multiple hits in MDT chambers. In general a muon loses only 2 to 3 GeV in the calorimeters. The charges of the leptons are reconstructed from the bending directions in the magnetic fields. The momentum of high-energy muons is reconstructed from the amount of bending in the toroidal magnetic field (by calculating the sagitta), since a muon hardly deposits its energy in the detector.

The quarks and gluons in events are not stable14 and interact rapidly, creating a process of hadronization: a showering of hadronic particles resulting in a dense area of hits/tracks and energy depositions. These showers have a cone-like distribution throughout the inner detector and calorimeters and are called jets. The hits in the inner detector are used in track reconstruction, and clustered energy depositions in the calorimeters are measured. These responses are used in jet reconstruction, with various jet reconstruction algorithms. The multiple algorithms for jet reconstruction differ in e.g. the size of the cone (i.e. the opening angle in φ and η15: (∆R)² = (∆φ)² + (∆η)²) in which the energy must be deposited to still be counted as part of the jet. Cone sizes of 0.4 and 0.7 are the most commonly used, and in this analysis 0.4 is used. Together the inner detector and the calorimeters make track reconstruction of the jet possible, which allows one to retrace the origin of the jet. In most cases this is the primary vertex, the place of the initial collision. However, there are jets which are created away from this point; this phenomenon will be discussed in section 6.6.
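As an illustration of the cone criterion, a minimal ∆R membership check (the function name and numerical values are made up for the example; real cone algorithms additionally handle seed finding, splitting and merging):

```python
# Cone membership test used conceptually by cone jet algorithms:
# a calorimeter deposit belongs to a jet if DeltaR to the jet axis
# is below the cone size (0.4 in this analysis).
import math

def delta_r(eta1, phi1, eta2, phi2):
    """DeltaR = sqrt(deta^2 + dphi^2), with phi wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    deta = eta1 - eta2
    return math.hypot(deta, dphi)

jet_axis = (0.5, 1.0)  # (eta, phi) of the jet axis (made-up values)
cluster = (0.7, 1.2)   # a nearby calorimeter deposit

in_cone = delta_r(*cluster, *jet_axis) < 0.4
print(in_cone)  # True: DeltaR ~ 0.28 is inside the 0.4 cone
```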

5.2 Toolkit for MultiVariate Analysis (TMVA)

TMVA [20] is a ROOT-based toolkit for signal discrimination studies using multiple variables. It is a collection of different analysis methods (or classifiers), from which the user can manually choose methods to work with, by means of turning flags on and off. TMVA takes distinct signal and (multiple) background sources in the form of trees, which are filled with the different variables wanted for the discrimination study.

The signal discrimination occurs in three steps: training, testing and evaluation of the sample. First, the user can specify the number of events over which the classifier should be trained. In this training, classifier-specific weights and likelihoods are calculated and stored in 'weight files', which are used in deciding whether an event is signal or background. When these weights have been computed for the given number of training events, these optimised values are tested on a user-specified number of test events. The test results can be accessed through the graphical user interface in the evaluation phase. Aside from the signal likelihood distributions that have been calculated, it is possible to look at e.g. the original input variable distributions, correlations between these variables, efficiencies and (when used) a graphical representation of the neural network structure. In this analysis, only three methods have been used and looked at in more detail. These three methods are described below.

Rectangular Cut Optimisation (RCO)

RCO is the simplest method for signal optimisation. It gives a binary answer for an event (either 'signal' or 'background') in the testing phase. It minimizes the background efficiency for a target signal efficiency. TMVA allows the user to choose from three different RCO algorithms, of which the 'Genetic Algorithm' is used in this analysis. The Genetic Algorithm is based on biological models where a group of variables (or a group of genomes in biology) is tested with fitness functions. According to the resulting outcomes, a selection of combinations of variables survives and others 'die out' (in analogy to evolution in biology). For a given group of cutting variables the signal and background efficiencies are computed by counting the training events that pass the cuts and dividing these numbers by the original sample sizes. Since finding the highest rejection of the background at a given signal efficiency is done by a random 'hit-miss' technique, the values of the resulting cuts may differ every time the optimisation algorithm is run.

14 This is because quarks cannot exist alone, due to colour charge restrictions.
15 η is a convenient measure of the angle in the plane of the beampipe, defined as η = −ln(tan(θ/2))
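The counting-based efficiency described here is straightforward to sketch; the event dictionaries and the missing-energy cut below are made-up toy values, not the cuts used in this analysis:

```python
# Counting-based efficiency, as used to score a set of rectangular cuts:
# efficiency = (events passing all cuts) / (original sample size).

def efficiency(events, cuts):
    """events: list of dicts of variable values; cuts: {var: (low, high)}."""
    passed = sum(
        all(lo <= ev[var] <= hi for var, (lo, hi) in cuts.items())
        for ev in events
    )
    return passed / len(events)

signal = [{"met": 250.0}, {"met": 320.0}, {"met": 90.0}]
background = [{"met": 40.0}, {"met": 60.0}, {"met": 280.0}, {"met": 30.0}]
cuts = {"met": (200.0, float("inf"))}  # toy missing-energy cut in GeV

print(efficiency(signal, cuts), efficiency(background, cuts))
```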

MultiLayer Perceptron Artificial Neural Network (MLP ANN)

An artificial neural network is a structure in which cut variables are defined as 'input neurons'. All input neurons are connected to a (user-specified) number of 'neurons' in a 'hidden layer'. All neuron connections to a hidden-layer neuron are weighted and summed in the hidden layer. These weighted sums are combined to create an output neuron, which is an estimator. The neural network can be seen as a mapping of N input neurons to M output neuron values, by the use of a neuron response function. In this case M = 1, which returns a likelihood of an event being either 'signal' or 'background'.16 This neuron response function can be built up in various ways, to be decided by the user. The number of hidden layers and the number of neurons per hidden layer can be chosen as well. The term 'multilayer perceptron' means that weights are only calculated between layers, and not between neurons in the same layer. By training over a number of events, all weights are adjusted to optimize the signal discrimination process.

In this analysis six input neurons, nine neurons in one hidden layer and one output neuron are used to optimize the signal. The number of training cycles is set to various values (see sections 7 and 8). The neuron response function is composed of a 'sigmoid' activation function and regular summing as a 'synapsis' function. The combination of the synapsis function and the activation function produces a mapping of the input layer neurons to the output neuron. The combination of a regularly summed synapsis function and a sigmoid activation function leads to the following mapping: y_MLP = Σ(j=1..n_hid) (1/(1 + e^−K)) w(2)_j, where the w(2)_j are the interneuron weights between the hidden layer and the output neuron and n_hid is the number of neurons in the hidden layer. Here K = Σ(i=1..n_var) w(1)_i x_i, where the w(1)_i are the weights between the input neurons and the hidden layer and x_i is the value of a particular variable. In table 1 an overview of all settings is shown.
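This mapping can be sketched in a few lines. Note that in a standard single-hidden-layer MLP each hidden neuron j forms its own weighted sum K_j of the inputs, which is what this sketch does; the weights are arbitrary placeholders (in reality they come from training), and the dimensions echo the 6-9-1 setup used in this analysis:

```python
# Sketch of a single-hidden-layer MLP: each hidden neuron applies a sigmoid
# to its own weighted input sum K_j; the output is a weighted sum of the
# hidden-neuron activations.
import math

def sigmoid(k):
    return 1.0 / (1.0 + math.exp(-k))

def mlp_output(x, w1, w2):
    """x: input variables; w1[j][i]: input->hidden weights; w2[j]: hidden->output."""
    hidden = [sigmoid(sum(wji * xi for wji, xi in zip(wj, x))) for wj in w1]
    return sum(w2j * hj for w2j, hj in zip(w2, hidden))

x = [0.1] * 6                          # 6 input variables (placeholder values)
w1 = [[0.5] * 6 for _ in range(9)]     # 9 hidden neurons (placeholder weights)
w2 = [0.1] * 9                         # hidden -> single output neuron

print(mlp_output(x, w1, w2))
```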

Options           Values                          Description
NCycles           3000                            Number of training cycles.
HiddenLayers      "N−1, N−2, ..."                 Network architecture specification.
Normalise         True                            Normalised input variable flag.
NeuronType        sigmoid, linear, tanh, radial   Neuron activation function.
NeuronInputType   sum, sqsum, abssum              Neuron synapsis function.

Table 1: The configurable settings for the MLP.

Boosted Decision Tree (BDT)

A decision tree works in a similar way to the rectangular cut optimisation method. Decisions are repeatedly made on one variable at a time, until a stop criterion is reached. Each time a decision is made, an event is classified as either 'signal-like' or 'background-like'. A possible extension of a decision tree is made by creating several decision trees by reweighting events from the same training sample. This process of creating a 'forest of trees' is called 'boosting', hence the name 'boosted decision tree'. The difference with the RCO is that with a BDT one can have multiple regions (or hypercubes) where decisions can be made, whereas the RCO only has one region in this 'variable phase space'. The BDT has a lot of parameters which the user can set to one's liking; initially, the default settings have been used. The default settings consist of a maximum of 400 decision trees with a minimum of 20 events in a tree (this value is varied in section 8, though). The number of steps before the cut in a tree is decided is set to 20, and the decision method is set to 'GiniIndex' (which decides on a signal-to-background ratio criterion). The default 'prune' settings are used: the 'CostComplexity' method with a strength of 4.5. Pruning is used to reduce insignificant cuts. This is more complex than it seems, because cut ensembles that initially seem insignificant can produce good cuts at a later point, further down the tree. The type of boosting used is 'AdaBoost'. This method computes a weight α for all misclassified events, which is used in the decision making of the next tree. The output is a sum over all trees in the forest, computed by y_BDT = Σ(i∈forest) ln(α_i) h(x_i), where h(x_i) is either +1 or −1, depending on the classification (signal or background), and x is an input variable. An overview of the settings is presented in table 2.

16 These values are complements, so when the signal likelihood is e.g. A, the background likelihood is 1 − A.

Options            Values                                     Description
nTrees             400                                        Number of trees in the forest.
BoostType          AdaBoost, Bagging                          Boosting type for tree building.
SeparationType     GiniIndex, MisClassificationError,         Separation criterion applied for
                   CrossEntropy, SDivSqrtSPlusB               the node splitting.
nEventsMin         10                                         Minimum number of events in a node
                                                              where further splitting is stopped.
nCuts              20                                         Number of steps in the scan to
                                                              optimise the cut at a node.
UseYesNoLeaf       True                                       Use Yes/No decision from leaf node.
UseWeightedTrees   True                                       Use a weighted majority vote of all
                                                              trees in the forest.
PruneMethod        CostComplexity, ExpectedError, NoPruning   Pruning method.
PruneStrength      4.5                                        Amount of pruning.

Table 2: The configurable settings for the BDT.


Figure 11: Left: A graphical representation of the MLP neural network. Right: A graphical representation of a BDT. Every blob represents a decision and the classification of a subsample as either signal, background or 'undecided', in which case there will be an additional cut.



6 Kinematics

The analysis presented in this thesis is done with ∼3.8 fb⁻¹ of simulated data, which corresponds roughly to half a year of data taking at low luminosity. The total number of SU3 SuSy events is around 74,000. For the much larger ttbar background, ∼550,000 events are run over and artificially scaled afterwards to correct for the differing cross-sections.
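The scaling mentioned here amounts to a per-event weight w = σ·L/N_generated; a quick sketch with the cross-section and sample size quoted in this chapter (the helper name is ours):

```python
# Scaling a generated sample to a target integrated luminosity:
# per-event weight w = sigma * L / N_generated.
# Unit bookkeeping: 1 pb * 1 fb^-1 = 1000 events.

def event_weight(sigma_pb, lumi_fb, n_generated):
    return sigma_pb * 1e3 * lumi_fb / n_generated

# Numbers from this chapter: 460 pb non-full-hadronic ttbar,
# ~550,000 generated events, scaled to the ~3.8 fb^-1 of the SuSy sample.
w_ttbar = event_weight(460.0, 3.8, 550_000)
print(w_ttbar)  # each generated ttbar event counts ~3.2 times
```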

To do a discriminating study, it is crucial to analyze the events on different aspects. In what aspects does the desired signal differ from its unwanted background and, more importantly, why? One of the first things done in this analysis is to look at the decay structure of SuSy (and stop pair creation in more detail). The analysis has several steps. First the SuSy discovery potential is investigated by attempting to optimize the SuSy signal. Besides this, the specific SuSy signal of stop pair creation is investigated, to see if this 'ttbar-like' signal can be extracted from its two heavily dominating backgrounds: ttbar and 'non-stop SuSy'. Finally, the plan was to try to reconstruct top quark masses from 'stop events', to see if this would result in a top mass peak reconstruction, but with deviations in e.g. the missing energy distributions. This part could not be completed due to lack of time and is not presented in this thesis.

Background

The first part of the analysis is to find out whether the SuSy signal will be visible above the dominating SM backgrounds. The main backgrounds include top pair production (ttbar) and 'QCD-jets' events. In general, SuSy events are multi-jet events with a large missing energy signature (particles that escape the detector without interacting, see 6.1) and, possibly, a hard lepton. In this analysis, the only backgrounds used are the semi-leptonic (SL) and fully leptonic (FL) ttbar samples. This is justified by the fact that in this analysis a 'lepton trigger' is used (see section 6.5), which would cut out most of the fully hadronic (i.e. QCD and FH-ttbar) events anyway. Even without the lepton trigger, the QCD background is reduced by the large missing energy demand.

6.1 Discriminating variables

The theoretical SU3 SuSy cross-section is about 19.3 pb, while the non-fully-hadronic ttbar samples used have a cross-section of 460 pb. Discovery of SuSy lies in finding deviations from expected distributions. For example, the missing transverse energy, ETmiss, in ttbar samples is only due to the creation of one (or two) neutrinos in the W decay(s). The missing energy can be calculated from momentum conservation: before the collision there are only the two protons, with momentum in the z-direction, so the total momentum in the x-y (transverse) plane is zero. By adding up the momenta of all particles and jets after the collision, one can compute from the apparent non-conservation of transverse momentum how much momentum is missing. With the conservation of R-parity in SuSy events, there will always be at least two LSPs per event escaping the detector unnoticed, and since their mass is of the order of 100 GeV, the amount of missing energy is very often much larger than in SM-type events. Aside from large missing energy, SuSy events are in general more energetic than SM events and have a larger jet multiplicity. This is exploited by taking the sum of the transverse momenta of the four most energetic 'good' jets (jets passing the criteria described in 6.5), Ptjetsum = Σ_{i=1}^{4} PT^{jet,i}. This jet-PT sum and the missing transverse energy are used together to calculate the 'effective mass', Meff = ETmiss + Σ_{i=1}^{4} PT^{jet,i}, which is used to define the 'fatness' of an event. Another variable used to discriminate ttbar from other types of events is the 'transverse mass' of the W: assuming the event is semi-leptonic, the invariant transverse mass of the missing transverse energy (assumed to come from the neutrino) and the PT of the most energetic lepton in the event adds up to at most the mass of the W. Later, when b-tagging is included, one more variable is introduced, mainly to reduce the SuSy background to the stop-pair events: the sum of the PT of the most energetic b-jet and the most energetic non-b (light-quark, 'lq') jet. In total, the following variables are used:

• Missing transverse energy, ETmiss

• The sum of the transverse momenta of the four most energetic jets, Ptjetsum = Σ_{i=1}^{4} PT^{jet,i}

• Effective mass, Meff = ETmiss + Σ_{i=1}^{4} PT^{jet,i}

• Transverse mass, MT = √((ET,lepton + ETmiss)² − (PT,lepton + PTmiss)²), using the most energetic lepton; the PT sum is a vector sum in the transverse plane

• The sum of the PT of the most energetic b-jet and the most energetic non-b (lq) jet, ΣPT(highest b-jet, highest lq-jet)
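As an illustration (not the thesis' actual ROOT-based code), these variables could be computed from event quantities as follows; the input structure and names here are hypothetical:

```python
import math

def event_variables(jet_pts, leptons, met_x, met_y):
    """Compute the discriminating variables for one event.

    jet_pts : transverse momenta (GeV) of the 'good' jets
    leptons : list of (pt, px, py, et) tuples for reconstructed leptons (GeV)
    met_x, met_y : components of the missing transverse momentum (GeV)
    """
    etmiss = math.hypot(met_x, met_y)                  # missing transverse energy
    ptjetsum = sum(sorted(jet_pts, reverse=True)[:4])  # 4 hardest jets
    masseff = etmiss + ptjetsum                        # effective mass
    # Transverse mass of the hardest lepton with the missing energy;
    # the PT sum in the formula is a vector sum in the transverse plane.
    pt, px, py, et = max(leptons, key=lambda l: l[0])
    mt2 = (et + etmiss) ** 2 - (px + met_x) ** 2 - (py + met_y) ** 2
    transmass = math.sqrt(max(mt2, 0.0))
    return etmiss, ptjetsum, masseff, transmass
```

For a semi-leptonic W decay the transverse mass distribution ends at the W mass, which is what makes it useful against ttbar.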

6.2 SuSy Kinematics

SuSy events are mainly created through gluon fusion, quark fusion and quark-gluon scattering. The decay chain of a SuSy event generally starts with the creation of one or two highly energetic gluinos, or a gluino-squark pair with the squark being either a scalar up or a scalar down quark, which are the heaviest squarks. Another possibility is the formation of squark pairs. From there on, squarks decay through chargino-quark or neutralino-quark pair creation, adding highly energetic jets to the event through the hadronisation of the quarks. When the gaugino involved in the decay chain is either the χ±1 or the χ02, further decay is dominated by stau-tau pair creation or tau - tau sneutrino (or stau - tau neutrino) decay. The dominance of tau leptons in the decay channels is mainly due to the high value of tan β at the SU3 (bulk region) point. The created staus decay to their SM partners and the LSP. In general, the entire system has a large boost in the z-direction.

Figure 12: An example of a decay chain in a SuSy event. The χ±1 and χ02 decay mainly to tau leptons.

6.3 Top Kinematics

The main SM background to stop (and SuSy in general) is ttbar decay. Top pairs are created via gluon fusion and quark fusion. The decay channels of the top have been studied extensively because the process is important: the top quark mass plays an important role in the calculation of, e.g., the Higgs mass and production, and it influences the masses of particles in most SuSy models.


Figure 13: The hadronic and leptonic decay modes of the top quark.

6.4 Stop Kinematics

The SU3 SuSy cross section is 19.3 pb, and in ∼ 5% of these SuSy processes a scalar top pair (t̃t̃) is produced, corresponding to a cross section of ∼ 1 pb. Since the stop is the lightest squark, one can deduce that many channels are kinematically closed: decays via the gauginos χ±2 and χ03,4 are not allowed. A truth reconstruction of the stop decay chain does indeed confirm this. The resulting decay modes and branching fractions (BF) for individual stop decays and stop pair decays are listed in table 3. Note that the presented decay chains are generalized to one charge mode, but charge-conjugate modes are grouped, so t̃ → tχ01 is the sum of the shown decay mode and its charge-conjugate mode. The main decay diagrams are shown in figure 14.

Decay mode      BF
t̃ → t χ01      ∼ 24.2%
t̃ → t χ02      ∼ 9.3%
t̃ → t χ03      0%
t̃ → t χ04      0%
t̃ → b χ+1      ∼ 66.5%
t̃ → b χ+2      0%

Decay mode              BF
t̃t̃ → t t χ01 χ01      ∼ 6.6%
t̃t̃ → t t χ02 χ01      ∼ 4.7%
t̃t̃ → t t χ02 χ02      ∼ 0.9%
t̃t̃ → b t χ+1 χ01      ∼ 30.6%
t̃t̃ → b t χ+1 χ02      ∼ 12.4%
t̃t̃ → b b χ+1 χ−1      ∼ 44.7%

Table 3: The upper table shows the individual stop decay modes and their branching fractions. The lower table shows the stop pair decay modes and corresponding branching fractions, which are products of the individual branching fractions.
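As a check, the pair branching fractions should follow from the individual ones as products, with a combinatorial factor 2 when the two stops decay differently. A small sketch using the table values (the minor deviations from the table, e.g. 0.665² ≈ 44.2% vs. the quoted 44.7%, are presumably due to rounding of the individual fractions):

```python
# Individual stop branching fractions, copied from table 3
bf = {"t chi01": 0.242, "t chi02": 0.093, "b chi+1": 0.665}

def pair_bf(mode_a, mode_b):
    """Branching fraction for a stop pair decaying via mode_a and mode_b.
    A factor 2 accounts for the two assignments when the modes differ."""
    factor = 1.0 if mode_a == mode_b else 2.0
    return factor * bf[mode_a] * bf[mode_b]

print(round(pair_bf("t chi02", "t chi01"), 3))  # → 0.045, cf. ~4.7% in table 3
print(round(pair_bf("b chi+1", "t chi02"), 3))  # → 0.124, cf. ~12.4% in table 3
```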

6.5 Jet and Lepton trigger conditions

Figure 14: The three main stop decays.

When data is collected from collisions in ATLAS, it has to be categorized into a 'type' of event, using characteristics that would be visible in the wanted events. One of the general trigger conditions is to demand at least one 'hard' lepton (a lepton with high momentum). Here, only electrons and muons are considered as leptons; ATLAS has a tau-lepton trigger as well, but it is not yet well developed and falls outside the scope of this thesis. Aside from this, there is a minimal requirement of 4 jets in an event17. Note that this is not a trigger, but a good starting-point cut, since SuSy events in general have a high jet multiplicity. These four jets should carry a substantial amount of energy, to make sure they are not low-energy quarks or gluons that were radiated off. Together, these two initial conditions ensure that the proton collision was inelastic and that some interesting physics is going on: when, e.g., two protons only graze each other, there is no large jet multiplicity, and single hard leptons only appear in hadronic collisions where interactions through the weak mediators have taken place. Aside from the minimum energy requirements, an additional demand is made: jets and leptons must have |η| smaller than 2.5, to avoid them being too close to the beampipe, where they would be only partially detected or poorly reconstructed due to a less dense environment of detection material. The requirement on |η| makes sure that all jets and leptons taken into account are well inside the optimal range of the detector for reconstruction. So the following initial demands have been set:

• ≥ 1 lepton with PT ≥ 15 GeV and |η| ≤ 2.5, to trigger the event.

• ≥ 4 jets, each with PT ≥ 20 GeV and |η| ≤ 2.5.
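Sketched in code (with a hypothetical input format, not the thesis' actual ntuple structure), the preselection amounts to:

```python
def passes_preselection(leptons, jets):
    """Initial event selection: at least one lepton with PT >= 15 GeV and at
    least four jets with PT >= 20 GeV, all within |eta| <= 2.5.

    leptons, jets : lists of (pt_gev, eta) tuples.
    """
    good_leptons = [(pt, eta) for pt, eta in leptons
                    if pt >= 15.0 and abs(eta) <= 2.5]
    good_jets = [(pt, eta) for pt, eta in jets
                 if pt >= 20.0 and abs(eta) <= 2.5]
    return len(good_leptons) >= 1 and len(good_jets) >= 4
```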

Further analysis is done only with the subset of events that survive these initial cut criteria. At this point no distinction between stop and other SuSy events is made; 'SuSy' includes both. In figures 15 and 16, the multiplicities of jets and leptons per event are shown. It is clear from the jet plots that, on average, SuSy events have a larger number of jets per event. This is because SuSy events are generally higher in energy and have a more complex decay chain, which results in a larger number of highly energetic radiated gluons, creating 'hard' hadronic jets. The demanded lepton appears to be a very hard cut, especially in the SuSy case, but it is important for reducing highly dominant 'lepton-less' backgrounds like fully hadronic ttbar and especially QCD multijet events. Although this would deserve an intensive study, it falls outside the scope of this thesis. It is important to note, however, that in the ttbar events at least one lepton is expected (in SL ttbar decay).

In table 4 the results of the initial lepton and jet cuts are shown. The ttbar sample has been scaled by a factor of 3.204 to match its cross-section. The initial SU3 sample size corresponds to 3.8 fb−1 of data. Whenever the term 'SuSy' is used, this refers to SuSy data from the SU3 mSUGRA model.

17 Note that variables like the transverse mass and Σ_{i=1}^{4} PT^{jet,i} could not have been defined in the way they are, had it not been for these lepton and jet constraints.


[Histograms: number of jets per event; ttbar mean 3.924 (RMS 1.456), SuSy mean 4.333 (RMS 1.873).]

Figure 15: The number of jets per event in ttbar and SuSy events, with PT ≥ 20 GeV and |η| ≤ 2.5. Both samples are run over 74,000 events.

[Histograms: electrons per event, ttbar mean 0.1653 (RMS 0.3813), SuSy mean 0.07686 (RMS 0.2919); muons per event, ttbar mean 0.2225 (RMS 0.4332), SuSy mean 0.1218 (RMS 0.3808).]

Figure 16: The number of electrons and muons per event in ttbar and SuSy events, with PT ≥ 15 GeV and |η| ≤ 2.5. Both samples are run over 74,000 events.

6.6 B-tagging

As discussed in section 5.1, created quarks hadronize to form jets: because of confinement, which demands that only colourless particles exist, quarks interact right away with other quarks. In the case of the lighter quarks (u, d, s), mesons (like pions and kaons) and baryons (like lambdas) are formed, which decay immediately, starting a hadronic decay chain that results in a jet.

Sample          # events left   Fraction of events
Stop            429             10.40%
non-stop SuSy   7,690           10.38%
ttbar           333,985         18.94%

Table 4: Surviving events in the SuSy and ttbar samples after the jet and lepton cuts of section 6.5. The SuSy sample includes 429 stop events. It corresponds to 3.8 fb−1 of SU3 SuSy data and only semi- and fully leptonic ttbar events; the ttbar sample has been scaled to the corresponding cross-section.

The heavier bottom quark, however, binds right away with a lighter quark to form a B-meson. This meson has a lifetime of roughly 1.5 ps, and since the event is often boosted, the B-meson has a high momentum and moves away from the interaction point. When it finally decays and starts hadronizing, it has traveled a distance of the order of several millimeters. When doing jet reconstruction by performing track fits in a cone dense with hits and energy depositions, it is possible to reconstruct the origin of the jet, and to find that this origin is not the 'primary vertex' but lies away from the interaction point, at a so-called 'secondary vertex'. This is a difficult task, which requires a very high-resolution tracker and very good track reconstruction software. But by using this 'secondary vertex reconstruction' and impact parameters18, it is possible to identify, with a certain likelihood, whether a jet originated from a b-quark instead of a lighter quark. This is known as b-tagging and is a crucial tool in various event analyses.

In analyzing ttbar events, b-tagging is often used because, from the decay chain, one knows for sure that there must be at least two jets coming from the b-quarks. Also in stop pair creation one finds that there must be at least two b-jets, created either via top decay or directly from the stop. For 'non-stop' SuSy the number of b-jets cannot be deduced from the decay chain, and these events do not necessarily contain b-jets. For this reason, b-tagging has not been used in the signal optimization of SuSy, but it will be used in the stop optimization, especially to discriminate stop from its SuSy background.

Tagging efficiency

As described, b-tagging is not straightforward and has a large number of uncertainties in all the different steps of the tagging process. One expects, e.g., wrongly reconstructed tracks to make the jet reconstruction ambiguous through wrongly calculated impact parameters and secondary vertices. These uncertainties have a large impact on the b-tagging process and strongly affect the tagging efficiency. The likelihood of a jet being a b-jet is represented in a variable called the b-tagging weight. It has a distribution between −10 and 25 and is calculated using the most common impact parameter and secondary vertex reconstructor. This weight can be varied to alter the b-tagging definition. To see the consequences of demanding two b-jets in an event, the tagging efficiency and the tagging purity are calculated. These variables are defined as follows:

• The tagging purity is the fraction of jets with a tagging weight larger than a user-specified value x that lie within a distance ∆R = 0.5 of a 'true' b-quark (which can be retrieved from the 'truth' tree, discussed in section 5.1):
Pbtag = (# jets with weight > x and ∆R < 0.5 to a true b-quark) / (# jets with weight > x)

• The tagging efficiency is the fraction of jets within ∆R = 0.5 of a 'true' b-quark that have a tagging weight larger than x:
εbtag = (# jets with weight > x and ∆R < 0.5 to a true b-quark) / (# jets with ∆R < 0.5 to a true b-quark)

18 Impact parameters measure the distance from a fitted track to the line of motion of the incoming particle; criteria can, e.g., be set on the distance of a track to the interaction point.
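In code, the two definitions might look as follows (the input is a hypothetical list of truth-matched jets, not the actual analysis ntuple):

```python
def btag_performance(jets, weight_cut):
    """Tag efficiency and purity for a b-tag weight cut.

    jets : list of (tag_weight, matched) pairs, where matched is True if the
    jet lies within Delta R = 0.5 of a 'true' b-quark.
    """
    tagged = [j for j in jets if j[0] > weight_cut]
    matched = [j for j in jets if j[1]]
    tagged_and_matched = [j for j in tagged if j[1]]
    purity = len(tagged_and_matched) / len(tagged) if tagged else 0.0
    efficiency = len(tagged_and_matched) / len(matched) if matched else 0.0
    return efficiency, purity
```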

The resulting efficiencies and purities are presented in table 5 (for the 'standard' b-tagging definition, i.e. weight > 3). Table 6 shows the efficiency and purity for two other b-tagging definitions (weight > 5 and weight > 8). Again, the initial number of events corresponds to 3.8 fb−1 of data, with the lepton and jet cuts already applied.

Sample   tag efficiency   tag purity   # events left
Stop     0.646            0.471        195
SuSy     0.629            0.214        2,504
ttbar    0.729            0.895        155,211

Table 5: The tag efficiency and purity for all samples. The last column shows the number of events that survive the demand of two b-tags (ttbar has been scaled). Jets are tagged as b-jets when the tagging weight is larger than 3.

         weight=5                      weight=8
Sample   tag efficiency   tag purity   tag efficiency   tag purity
Stop     0.585            0.500        0.493            0.543
SuSy     0.566            0.238        0.458            0.266
ttbar    0.657            0.935        0.541            0.971

Table 6: Efficiency and purity for b-tagging, using different values for the weight cut.

A higher weight cut results in a slightly larger fraction of correctly identified b-jets (checked by matching with true b-quarks, i.e. a higher purity), but cuts heavily into the tagging efficiency and thus the total number of surviving events. In this case, however, a high efficiency is preferred over a high purity, since it cannot be afforded to throw away many events of the already rare stop pair production. In the rest of this analysis the 'standard' tagging weight is kept, since it has both an acceptable efficiency and an acceptable purity; doing more would require an intensive study of b-tagging, which is not the topic of this thesis.


7 SuSy Discovery Potential

To 'discover' SuSy, the comparison of signal and background has to be put in a quantitative form, so that it can be interpreted easily. One way to do this is to define a quantity called the significance:

S = (# signal events) / √(# background events)   (2)

The significance compares the amount of signal to the statistical error on the background. So when S is, e.g., smaller than 1, there is no way to distinguish the signal from the background, because it falls within the statistical error of the background. The larger S becomes, the more unlikely it is that the excess is a statistical fluctuation of the background, and the less likely that it has a cause other than a signal superimposed on that background. Note that the significance increases with the amount of data taken. The significance after 3.8 fb−1 of data taking and after the initial cuts can be calculated using table 4 and equation 2: S = 7690/√333,985 = 13.3 (no b-tagging is used here). So even without further cuts, one could claim a clear SuSy discovery, assuming full knowledge of all backgrounds. In fact, if this were the case, only ∼ 32 pb−1 of data would be needed to claim a discovery (conventionally, a signal significance of 5). Systematic errors are not considered in this thesis, mainly due to lack of time.
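Equation 2 applied to the numbers of table 4 gives the quoted significance:

```python
import math

def significance(n_signal, n_background):
    """Signal significance S = N_s / sqrt(N_b), equation (2)."""
    return n_signal / math.sqrt(n_background)

# Non-stop SuSy signal vs. scaled ttbar background after the initial cuts
print(round(significance(7690, 333985), 1))  # → 13.3
```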

7.1 Signal optimization using ‘Histogram Division’

Of course, one would like to find the largest possible significance, to make the signal stand out in the distributions. Another reason this signal optimization is done on SuSy is to get a good grip on how it is done most efficiently, which is helpful when looking at much rarer events, like stop pair decay. The optimization is done in two ways. The first is creating 'efficiency' histograms of the cutting variables for both the signal and the background sample, with the efficiency defined as:

ε = (# events left) / (total # events)   (3)

The second method of optimizing the signal uses various methods of the TMVA program (section 5.2). Figure 17 shows the five variables used in the discrimination studies. As expected, SuSy is in general more energetic in most respects, such as missing energy and jet energy.

These distributions can be put in a different form, so that the plots show the fraction of remaining events as a function of a lower cut on that distribution; this is shown in figure 18. The efficiency distributions are drawn with a resolution of 1 GeV. Note that in these plots no scaling has been done yet.


[Histogram panels: Missing Transverse Energy (Etmiss); Sum of the PT of the 4 most energetic jets (Ptjetsum); Transverse mass of the most energetic lepton and Etmiss (Transmass); Effective mass (Masseff); Sum of the most energetic b-jet and non-b-jet (highestBPsumjetpt).]

Figure 17: Distributions of the cutting variables. Both samples are run over 74,000 events. It is clear that SuSy events are often more energetic than ttbar events.

7.1.1 Optimizing with single variables

The scaling is done in a macro which reads in the files containing the efficiency plots. The background (ttbar) histograms are first scaled bin-wise by a factor of 3.2, and then the square root of every bin is taken, in the same bin-wise manner. Note that this is a valid approach, since each bin contains all events that survive a lower cut at the value on the x-axis. The signal histogram is then divided by this 'square root of the background'. The resulting plot gives the significance as a function of the cut on the variable (shown in figure 19).
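With plain lists standing in for the ROOT histograms, the bin-wise procedure described above can be sketched as:

```python
import math

def significance_histogram(sig_bins, bkg_bins, bkg_scale=3.2):
    """Bin-wise significance from 'surviving events' histograms.

    Bin i of each input holds the number of events surviving a lower cut at
    the i-th x value. The background is scaled bin-wise by bkg_scale (the
    ttbar cross-section factor), square-rooted, and divided into the signal.
    """
    result = []
    for s, b in zip(sig_bins, bkg_bins):
        scaled = b * bkg_scale
        result.append(s / math.sqrt(scaled) if scaled > 0 else 0.0)
    return result
```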

As one can see from table 7, cutting on these variables results in a higher significance. Although the discriminating power of each individual variable is significant, one should expect that combining the variables results in an even higher significance. However, finding the optimal combination of cut values is not straightforward: simply using all five variables and cutting at the values presented in table 7 is not likely to result in the maximum significance, since the variables are very likely to be correlated (though this research is not a correlation study). One could go one step further than the individual cuts and try combinations of two variables, plotted in a '3-D' significance plot. This is, however, the furthest one can go with this 'visual' method; plotting more than two variables at once is not possible.

[Efficiency curves as a function of the lower cut, for Etmiss, Ptjetsum, Transmass, Masseff and highestBPsumjetpt.]

Figure 18: Efficiencies of the cutting variables.

7.1.2 Optimizing with double variables

With five different variables, there are 10 possible combinations of two variables to be plotted together; in total this would result in 13 different plots. Even when using two pairs of variables, a maximum significance cannot be guaranteed with this method. Although this has been looked into, and the resulting significances seemed to be somewhat higher than in the individual-cut case, the results are not plotted here; extending this method to more variables proves to be very difficult, if not impossible. With the opportunity of using TMVA, the optimization can be done in a more organized way, with more advanced methods; therefore, further analysis with 'histogram division' has been abandoned. The individual-cut results will, however, be used as a check.

[Significance as a function of the lower cut, for Etmiss, Ptjetsum, Transmass, Masseff and highestBPsumjetpt.]

Figure 19: The resulting individual significance distributions of the various cutting variables.

7.2 Signal optimization using TMVA

As discussed in section 5.2, three methods are used in this analysis. First there is the Rectangular Cut Optimisation (RCO), because it is closest to the 'histogram division' approach, with the cut values obtained in a more advanced way. Then the MLP neural network and the Boosted Decision Tree methods are used, which are more advanced, but less transparent, methods of signal discrimination.


Variable            Max. S   Cut value (GeV)   # SuSy events left   # ttbar events left
Etmiss              ∼ 88     ∼ 325             2172                 ∼ 603
Ptjetsum            ∼ 33     ∼ 550             3720                 ∼ 13188
Transmass           ∼ 31     ∼ 160             2173                 ∼ 4694
Masseff             ∼ 50     ∼ 850             3856                 ∼ 5844
highestBPsumjetpt   ∼ 21     ∼ 400             3592                 ∼ 14844

Table 7: Surviving events in the SuSy and ttbar samples using the histogram method. The ttbar sample has been scaled.

Results from RCO

Both the signal and the background sample are trained over 5000 events and tested over the remaining events in both samples. Note, however, that in the case of the RCO the testing phase only consists of giving the calculated cuts at a signal efficiency of 0.7; the file produced by the training is used, and the 'test result' is not. First, the RCO is used to reproduce the individual-cut case of the histogram method, to check that the two methods give the same results. As table 8 shows, the individual cuts in both methods result in roughly the same cut values. Although the RCO method returns upper cut values as well, it is clear that the impact of the upper values presented in table 8 is not significant. The minor differences in significance are due to the limited resolution of the histogram method. The RCO produces a weight file which yields cuts for a large collection of signal and background efficiencies. A macro has been developed that reads in the weight file, transforms these efficiencies into significances, and outputs the cut values corresponding to the highest significance.
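A macro of the kind described, turning (signal efficiency, background efficiency) working points from the weight file into significances, could look like this; the sample sizes are taken from table 4, and the working points below are made-up placeholders:

```python
import math

N_SIG = 8119     # SuSy events after the initial cuts (table 4)
N_BKG = 333985   # scaled ttbar events after the initial cuts (table 4)

def max_significance(working_points):
    """working_points : list of (eff_signal, eff_background) pairs, as read
    from the RCO weight file. Returns the best point and its significance."""
    def sig(point):
        eff_s, eff_b = point
        n_b = eff_b * N_BKG
        return eff_s * N_SIG / math.sqrt(n_b) if n_b > 0 else 0.0
    best = max(working_points, key=sig)
    return best, sig(best)

best, s = max_significance([(0.9, 0.1), (0.5, 0.01), (0.2, 0.0004)])
```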

Variable            Lower cut (GeV)   Upper cut (GeV)   Significance
Etmiss              323.022           1799.58           91.385
Ptjetsum            559.745           2893.16           32.2175
Transmass           144.847           909.477           34.9643
Masseff             786.948           3481.37           49.3129
highestBPsumjetpt   462.309           2172.64           37.4751

Table 8: The RCO's individual cut values.

When putting all these individually calculated values back together in TVModularAnalysis, the result is that all events are cut away: neither the ttbar background nor the SuSy signal survives this cut ensemble. When the RCO is instead used to find an optimised set of cut values using the variables together, it produces the results shown in table 9.

Variable            Lower cut (GeV)   Upper cut (GeV)
Etmiss              292               1062
Ptjetsum            279               3396
Transmass           106               536
Masseff             362               2807
highestBPsumjetpt   196               1614

S        εs (rem. evts.)   εb (rem. evts.)
124.13   0.195 (1500)      0.0004 (146)

Table 9: The RCO's optimized set of cut values.

Having run the RCO method several times, it becomes clear that, even though the cut values may differ substantially from run to run, the resulting maximum significance is always around 124 when run over a data set corresponding to roughly 3.8 fb−1. So combining different variables in a clever way increases the significance compared to individual cuts. The last row of table 9 gives the efficiencies for both signal and background. Finally, figure 20 shows so-called 'N−1' plots: the full distribution of each individual variable when the RCO cuts on the other four variables are applied. In this way the discriminating power of the various variables is clearly visible. One can conclude from these plots that the missing energy and the transverse mass are the most important discriminators.


[Panels: etmiss {ptjetsum>279 && transmass>106 && masseff>362 && highestBPsumjetpt>196}; ptjetsum {etmiss>292 && transmass>106 && masseff>362 && highestBPsumjetpt>196}; transmass {etmiss>292 && ptjetsum>279 && masseff>362 && highestBPsumjetpt>196}; masseff {etmiss>292 && ptjetsum>279 && transmass>106 && highestBPsumjetpt>196}; highestBPsumjetpt {etmiss>292 && ptjetsum>279 && transmass>106 && masseff>362}.]

Figure 20: The 'N−1' plots of all variables. These are the full distributions, with the cuts on the other four variables applied.

Results from MLP

The MLP neural network has been configured with the five discriminating variables and a 'bias' as input neurons. These are connected to 9 hidden-layer neurons, which in turn are connected to one output neuron. There are no connections in the MLP other than between adjacent layers. The MLP is trained with a set of 3000 signal events and 72,000 background events. This gives the discriminating properties shown in figure 21; in the left plot the entries are scaled by the total number of events in the sample.
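The forward pass of such a network can be sketched with numpy; the architecture (5 variables plus a bias input, 9 hidden neurons, 1 output) follows the text, but the weights here are random placeholders rather than the trained TMVA network, and the sigmoid activations are a common choice that may differ from the exact TMVA configuration:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
W1 = rng.normal(size=(9, 6))  # (5 inputs + 1 bias) -> 9 hidden neurons
W2 = rng.normal(size=(1, 9))  # 9 hidden neurons -> 1 output neuron

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_output(x):
    """Forward pass for one event; x holds the five (normalised)
    discriminating variables."""
    a = np.append(np.asarray(x, dtype=float), 1.0)  # append the bias input
    hidden = sigmoid(W1 @ a)                        # hidden-layer activations
    return float(sigmoid(W2 @ hidden)[0])           # single output neuron
```

A cut on this output then plays the same role as the rectangular cuts: events with a value above the chosen threshold are classified as signal.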

[Left panel: normalized MLP output for signal and background. Right panel: signal efficiency, background efficiency, signal purity, and signal efficiency×purity as a function of the cut on the MLP output; the purity curves use an equal number of signal and background events before cutting.]

Figure 21: Left: the normalized MLP output of the signal (SuSy) and background (ttbar). Right: the efficiencies of both the signal and the background as a function of the MLP output. Signal and background have been trained over 3000 and 72,000 events, respectively.

TMVA calculates that the distribution is 'signal-like' for MLP values larger than 0.257, where the signal quality (efficiency × purity) is highest. Assuming this is the best cut value for signal discrimination, the efficiencies can be read off from the right-hand plot in figure 21: a signal efficiency εs of ∼ 0.92 and a background efficiency εb of ∼ 0.10. Taking the numbers of events of both samples corresponding to 3.8 fb−1 of data, the resulting signal significance is of the order S = 43.48. However, a look at the distributions in figure 21 makes clear that this cannot be the optimal cut: cutting at higher values is likely to yield higher significances. In table 10 the significances and efficiencies are summarized. As one can see, the discrimination power is slightly higher than that of the RCO method, though it should be kept in mind that, with the low number of background events left and the scaling factor, the value of the significance is not very stable.

MLP cut value   signal efficiency   background efficiency   Significance
0.257           0.92                0.0800                  43.48
0.600           0.72                0.0170                  73.81
0.800           0.55                0.0070                  87.86
0.900           0.42                0.0021                  122.51
1.000           0.23                0.0005                  137.49

Table 10: The MLP cut values and corresponding efficiencies and significances.

Results from BDT

The standard settings of the BDT are used. These include the number of trees (400), the minimum number of events in a node before the splitting stops (20), and the number of steps used to find the optimal cut in a node (20). Aside from these values, the separation criterion is set to 'GiniIndex', a measure based on the signal-to-background ratio. Reading off a signal efficiency εs of ∼ 0.92 and a background efficiency εb of ∼ 0.08, and taking the numbers of events of both samples corresponding to 3.8 fb−1 of data, the resulting signal significance is again of the order S = 43.48. Looking at both plots in figure 22 and cutting at higher values gives the results presented in table 11.
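The 'GiniIndex' criterion is, in its usual form, the node impurity p(1−p) with p the signal purity of the node; a split is chosen to maximise the decrease in impurity. A minimal illustration (not the TMVA implementation itself):

```python
def gini(n_sig, n_bkg):
    """Gini index p * (1 - p) of a node, with p the signal purity."""
    total = n_sig + n_bkg
    if total == 0:
        return 0.0
    p = n_sig / total
    return p * (1.0 - p)

def split_gain(parent, left, right):
    """Decrease in event-weighted Gini impurity for a candidate split.
    parent, left, right : (n_sig, n_bkg) pairs; left + right = parent."""
    n_p, n_l, n_r = sum(parent), sum(left), sum(right)
    return gini(*parent) - (n_l / n_p) * gini(*left) - (n_r / n_p) * gini(*right)

# A perfect split removes all impurity:
print(split_gain((50, 50), (50, 0), (0, 50)))  # → 0.25
```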



Figure 22: left: The normalized output of the signal (SuSy) and background (ttbar) of the BDT. right: The efficiencies of both the signal and the background as a function of the BDT output. Signal and background have been trained over 3000 and 72,000 events, respectively.

BDT cut value   signal efficiency   background efficiency   Significance

−0.406          0.920               0.0800                   43.48
 0.200          0.790               0.0185                   77.63
 0.300          0.745               0.0140                   84.67
 0.400          0.675               0.0070                  107.84
 0.600          0.260               0.0006                  141.88

Table 11: BDT cutting values and corresponding efficiencies and significances

Conclusion

The results obtained from these three methods may lead to the conclusion that the MLP and BDT both yield better results. Although in this case the difference in obtained significances is not that large (RCO: 124.13 vs. MLP: 137.49 and BDT: 141.88), it must be said that the latter, more sophisticated methods are probably not exploited to their full potential, as these results have been obtained with the standard settings. Also, the signal efficiency at which these results are obtained is of the same order, ∼ 0.23, in all three methods. All methods will be used again when discriminating stop from all other backgrounds, to see how they perform when the signal sample becomes even smaller compared to the background.

Of course, these results rest on a number of assumptions whose consequences are unknown. For example, the systematic uncertainties should be studied, and experimentally the detector efficiency and resolution should be mapped in the first stage of beam collisions (the measurement of the missing energy, for instance, is highly dependent on this).


8 Stop signal versus the rest

The stop analysis is more extensive than the SuSy part, since the goal is to do some inclusive analysis of the stop pair channel. The first part of this analysis is to obtain an optimized cut ensemble, again comparing the two different methods. Once an optimal configuration is achieved, the next step is to look a bit more at the final states. In addition to the branching fractions presented in section 6.4, the topology of the final state is studied, which can be used when looking at ttbar-like stop-pair decays.

Note that in trying to discriminate stop from ttbar and non-stop SuSy, b-tagging is taken into account. The starting point is therefore the same initial triggers as before, with the difference that the demand of four jets has been altered slightly to demanding at least two b-tagged jets and at least two 'non-b-tagged' jets. With these criteria, one starts with the variable distributions shown in figure 23 (the ttbar background is not shown) and the event ensemble shown in table 12.
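The modified preselection can be sketched as a simple filter. The event representation here (a list of jets as hypothetical `(pt, is_btagged)` pairs) is an illustrative assumption, not the actual AOD data structures used in the analysis:

```python
def passes_jet_selection(jets, min_bjets=2, min_light_jets=2):
    """Keep events with at least two b-tagged jets and at least two
    non-b-tagged jets, as in the modified trigger demand."""
    n_b = sum(1 for _, is_b in jets if is_b)
    n_light = sum(1 for _, is_b in jets if not is_b)
    return n_b >= min_bjets and n_light >= min_light_jets

# Example event: two b-jets plus two light jets passes the selection.
event = [(120.0, True), (95.0, True), (60.0, False), (40.0, False)]
print(passes_jet_selection(event))  # → True
```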

Sample    # events left   Fraction of events

Stop            195       6.23%
SuSy          2,504       3.53%
ttbar       155,211       8.80%

Table 12: Surviving events in the stop, SuSy and ttbar samples after (b-)jet and lepton cuts. The ttbarsample has been scaled.

8.1 Final states

An important part in the reconstruction of aspects of stop pair decays (in our case ttbar-like signals) is the final states of the several decay types. Knowing the different decay types and their branching fractions, the expected number of leptons and jets in the final state can be deduced. Assuming that a W boson, or a charged tau lepton from a chargino decay, decays leptonically, and that in the case of both stops going to a χ02 at least one of them creates a charged lepton, the final state requirements in table 13 are presented, together with the fractions of events with the specified decay chain that actually meet the requirements.

Decay mode                Final state topology      Fraction

t̃t̃ → t t χ01 χ01         ≥ 1l, 4j or ≥ 2l, 4j      21.6%
t̃t̃ → t t χ02 χ01         ≥ 3l, 4j or ≥ 2l, 6j       2.7%
t̃t̃ → t t χ02 χ02         ≥ 5l, 4j or ≥ 4l, 6j       4.5%
t̃t̃ → b t χ+1 χ01         ≥ 1l, 4j or ≥ 2l, 2j      17.0%
t̃t̃ → b t χ+1 χ02         ≥ 3l, 4j or ≥ 2l, 6j       3.7%
t̃t̃ → b b χ+1 χ−1         ≥ 1l, 4j or ≥ 2l, 2j      16.0%

Table 13: The minimum final state topologies for the various decay trees. The fractions show that the minimum number of leptons and jets expected in an event is often not retrieved.

Figure 23: The variable distributions of the stop signal and the 'non-stop' SuSy signal. The stop pair events are clearly less energetic than the 'SuSy background'.

Again, the main contribution to these low fractions comes from the low number of leptons in an event. Loosening the demand on the number of leptons and/or jets does not improve these fractions, and in some decays loosening the number of leptons results in no lepton condition at all, contradicting the lepton trigger. The first listed decay has, relatively, the best fraction of satisfying topologies and is the one that most closely resembles SM ttbar decay. Due to shortage of time the ttbar analysis is left out, since no presentable results were obtained yet.

8.2 Signal optimization using ‘Histogram Division’

Again looking at individual cutting values first, to determine the discriminating power of the different variables, the following distributions are found. Note that in the case of the stop the histogram method will very likely not work optimally, since it only suggests one cutting value, namely the lower-bound cut. In the case of the stop, however, one would like to introduce an upper bound as well, to discriminate against the more energetic non-stop SuSy background.
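The lower-bound scan underlying these significance curves can be sketched as follows; the samples here are toy numbers chosen for illustration, not the thesis' distributions:

```python
import math

def best_lower_cut(signal_vals, background_vals, cut_values):
    """Scan lower-bound cuts on one variable and return the cut that
    maximizes S = s / sqrt(b) on the surviving events."""
    best = (None, 0.0)
    for cut in cut_values:
        s = sum(1 for v in signal_vals if v >= cut)
        b = sum(1 for v in background_vals if v >= cut)
        if b == 0:
            continue  # skip the unstable zero-background region
        sig = s / math.sqrt(b)
        if sig > best[1]:
            best = (cut, sig)
    return best

sig_sample = [180, 220, 260, 300, 340]   # toy "Etmiss"-like values
bkg_sample = [40, 60, 80, 100, 120, 160, 200, 240]
print(best_lower_cut(sig_sample, bkg_sample, range(0, 400, 20)))
```

As the text notes, such a one-sided scan cannot reject the *more* energetic non-stop SuSy background; that would require scanning an upper bound as well.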


Figure 24: The resulting individual significance distributions of the various cutting variables for the stop sample vs. the 'full background' (non-stop SuSy and scaled ttbar) at 3.8 fb−1

Using these optimal individual values together, the maximal significance gained is 0.77, which is not optimal at all, since one might as well use just the 'Etmiss' cut and end up with a significance that is twice as high. Instead of guessing the right combination of parameter values, the RCO method of TMVA is used to try and find an optimized combination of variables; so far this has proven to be more successful (in the SuSy-ttbar case). Aside from this method, the MLP and BDT are used again as well, to see if they give better results in situations where signal discrimination seems to be harder.

8.3 Signal optimization using TMVA

Results from RCO

The signal is trained over 195 events and the background (SuSy and ttbar) over ∼ 157,000 events. Again, the testing phase of the RCO is not used and the weight file produced in the training phase is


Variable            Max. S   Cut Value (in GeV)   # events left (stops / SuSy / ttbar)

Etmiss              ∼ 1.53   ∼ 170                117 / 1689 / ∼ 4260
Ptjetsum            ∼ 0.7    ∼ 425                116 / 2035 / ∼ 25228
Transmass           ∼ 0.91   ∼ 130                 77 / 1129 / ∼ 6116
Masseff             ∼ 1.0    ∼ 700                 95 / 1756 / ∼ 7140
highestBPsumjetpt   ∼ 0.70   ∼ 330                 89 / 1670 / ∼ 14232
all 5               ∼ 0.77   -                     21 /  619 / ∼ 119

Table 14: Surviving events in the stop, non-stop SuSy and scaled ttbar samples when looking at individual significances, using the histogram method.

read in by a macro, which presents the cuts that deliver the highest significance.

Variable            Lower Cut Value (in GeV)   Upper Cut Value

Etmiss              169.80                     1264.79
Ptjetsum            197.96                     1675.97
Transmass            94.930                     296.752
Masseff             419.82                     2768.41
highestBPsumjetpt   141.51                     1127.52

S       εs (rem. evts.)   εb (rem. evts.)
1.666   0.348 (68)        0.0094 (∼ 1665)

Table 15: RCO's optimized set of cutting variables for discriminating stops from the 'full background' (non-stop SuSy and scaled ttbar) at 3.8 fb−1

Using the RCO method reveals a maximum significance of 1.66 with 3.8 fb−1. Extrapolating this, with the assumption that the total amount of data taken will be 300 fb−1, a maximal significance of S ≈ 14.5 can be reached.
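The extrapolation relies on both the signal and background yields growing linearly with integrated luminosity, so that S = s/√b grows as √L. A minimal sketch of the arithmetic:

```python
import math

def extrapolate_significance(s_now, lumi_now, lumi_future):
    """Scale a significance S = s/sqrt(b) to a larger integrated
    luminosity: both s and b grow linearly with L, so S grows as sqrt(L)."""
    return s_now * math.sqrt(lumi_future / lumi_now)

# Scale the RCO result from 3.8 fb^-1 to the assumed full 300 fb^-1.
s_300 = extrapolate_significance(1.666, 3.8, 300.0)
print(round(s_300, 1))  # → 14.8, of the order of the quoted S ≈ 14.5
```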

Results from MLP

With only 195 stop pair events in total, it is already clear that proper training of the MLP (or BDT, for that matter) is out of the question. In this attempt, the stop pair sample has been trained over 100 events and tested over 95 events. The background consists of the scaled ttbar sample and the 'rest' of the SuSy sample. The testing and training of the background happen over 80,880 (1284 SuSy and 79,596 ttbar) and 76,835 (1220 SuSy and 75,615 ttbar) events, respectively. These numbers are based on keeping the relative cross sections in order. As can be seen in figure 25 and table 16, a maximum significance of S = 1.798 can be achieved, which is slightly higher than the significance achieved with the RCO method.


Figure 25: left: The normalized output of the signal (stop) and background (SuSy+ttbar) of the MLP. right: The efficiencies of both the signal and the background as a function of the MLP variable. The signal has been trained over 100 events and tested over 95 events. The full background has been trained and tested over a number of events corresponding to the ratio of the cross-sections.

MLP cut value   signal efficiency   background efficiency   Significance

0.002           0.865               0.160                   1.053
0.006           0.681               0.043                   1.563
0.008           0.560               0.023                   1.798
0.009           0.487               0.022                   1.585
0.010           0.430               0.018                   1.560
0.016           0.125               0.008                   0.680

Table 16: MLP cutting values and corresponding efficiencies and significances

Results from BDT

The BDT has been run over the same sample composition as described in the previous section, and with the same settings as the BDT run of the SuSy vs. ttbar case. The results are shown in figure 26. It is clear that there are not enough statistics here to use this method. The minimum number of events in a node before it ceases splitting is 20, which might be a bit high, since there are only 95 signal events to test on. However, lowering this value to 3 does not make the results any better.

8.3.1 Results when training the backgrounds individually

Because the stop pair signal has backgrounds that are both less and more energetic, the idea of training two MLPs for the different backgrounds is used: one MLP is trained to discriminate the stop signal from ttbar, and the other to discriminate stops from the SuSy background. In both the stop-SuSy and stop-ttbar case, the stop signal has been trained and tested over 100 and 95 events, respectively. The SuSy background is trained over 1284 and tested over 1220 events (to keep a correct signal-background fraction). The ttbar background is trained over 79,596 and tested over 75,615 events. The results of this training are shown in figures 27 and 28. There are now two different MLP outputs in which a cut can be made, and one can use both to impose constraints and select the regions of the MLP distributions where stops dominate. For the stop-SuSy case, however, this is not a very clear region.


Figure 26: left: The normalized output of the signal (stop) and background (SuSy+ttbar) of the BDT. right: The efficiencies of both the signal and the background as a function of the BDT variable. The signal has been trained over 100 events and tested over 95 events. The full background has been trained and tested over a number of events corresponding to the ratio of the cross-sections.

Now, when looping over a full event sample (including stop, SuSy and ttbar, correctly scaled), one would expect the fraction of stop events surviving both MLP cuts to be higher than the fraction of background surviving the two cuts. In table 17 the results are shown for various cuts in both MLP distributions.
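The two-classifier selection can be sketched as a window cut on both MLP outputs per event. The `(mlp_ttbar, mlp_susy)` pairs below are hypothetical values for illustration, not the analysed samples:

```python
import math

def count_passing(events, ttbar_window, susy_window):
    """Count events whose stop-vs-ttbar and stop-vs-SuSy MLP outputs
    both fall inside the chosen (lower, upper) windows."""
    lo1, hi1 = ttbar_window
    lo2, hi2 = susy_window
    return sum(1 for t, s in events if lo1 < t < hi1 and lo2 < s < hi2)

stops = [(0.025, 0.15), (0.030, 0.12), (0.004, 0.20)]
background = [(0.001, 0.05), (0.006, 0.10), (0.002, 0.18)]

n_s = count_passing(stops, (0.02, 0.071), (0.11, 0.231))
n_b = count_passing(background, (0.02, 0.071), (0.11, 0.231))
print(n_s, n_b, n_s / math.sqrt(n_b) if n_b else None)
```

In the actual analysis it turns out (table 17) that the stop-SuSy window adds nothing, so the best option reduces to a lower-bound cut on the stop-ttbar output alone.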


Figure 27: The output and efficiencies of training the MLP for stop vs. ttbar events.

Looking at table 17, it is clear that the stop-SuSy MLP isn't contributing to maximizing the significance. When only counting events that pass the stop-ttbar MLP cut, the maximum significance (S ≈ 1.64) is reached at a lower-bound cut of ∼ 0.016 and no upper-bound cut. Obviously, this is not as high as the result obtained with the single MLP trained on the full background. Since, again, this might be due to a shortage of training events, it will be tried once more when the lepton trigger demand is dropped to gain some stop pair events.



Figure 28: The output and efficiencies of training the MLP for stop vs. SuSy events.

ttbar MLP cut value   SuSy MLP cut value   # Stops left   # Background left   Significance

> 0.005, < 0.071      > 0.11, < 0.231      125            21651               1.14
> 0.01, < 0.071       > 0.11, < 0.231      104             8298               0.84
> 0.02, < 0.071       > 0.11, < 0.231       60             2463               1.21
> 0.02, < 0.061       > 0.11, < 0.231       57             2368               1.17
> 0.02, < 0.061       > 0.13, < 0.231       36             1618               1.17
> 0.02                > 0.11, < 0.231       77             2940               1.42
> 0.018               > 0.11, < 0.231       85             3433               1.45
> 0.02                -                    101             4090               1.58
> 0.018               -                    110             4642               1.61
> 0.0175              -                    113             4880               1.62
> 0.017               -                    114             5075               1.60
> 0.016               -                    121             5455               1.64

Table 17: The results of a selection of cuts on the MLP distributions. It is clear that the stop-SuSy MLP does not improve the significance at all. The best option would be to cut only on the values calculated with the stop-ttbar MLP.

In the case of the BDT, training the backgrounds individually still does not make the results such that they can be analysed (the results are presented in figures 29 and 30).



Figure 29: The output and efficiencies of training the BDT for stop vs. ttbar events.


Figure 30: The output and efficiencies of training the BDT for stop vs. SuSy events.

8.4 Analysis without demanding a lepton

The results obtained so far are very difficult to interpret, mainly because there are not enough statistics to, e.g., properly train and test the stop events or to do a proper ttbar analysis. Because no more SU3 data are available and there is no time to produce them, the decision is made to drop the 'hard lepton' condition (this gives back ∼ 90% of the SuSy events) in order to gather some statistics. The transverse mass variable returns 0 when there are no leptons in the event. The number of available stop events is now 900. The number of background events is scaled according to the ratio in which the signal and background samples occurred when the lepton cut was made (so, with 195 stops becoming 900 stops, the 2504 SuSy events become 2504 × (900/195) = 11557 events). The numbers of events used are presented in table 18.

Since the number of stop events has increased by a factor of 4.6, the expected maximum significance (compared to the previous situation with only 195 stop events) is √4.6 × 1.798 ≈ 3.8. Any higher significance would therefore mean an improvement due to the larger number of training and testing events. The values in table 19 have been calculated for the more intensively trained MLP.
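The rescaling of the samples and the expected-significance estimate above can be checked numerically (the inputs 195, 900, 2504 and S = 1.798 are taken from the text):

```python
import math

# Factor by which the stop sample grows when the hard-lepton demand
# is dropped (numbers from the text).
scale = 900 / 195            # ≈ 4.6
n_susy = round(2504 * scale)
print(n_susy)                # → 11557, as in table 18

# For a fixed signal/background composition, S = s/sqrt(b) scales as
# sqrt(N), so the expected maximum is sqrt(scale) times the earlier 1.798.
s_expected = math.sqrt(scale) * 1.798
print(round(s_expected, 2))  # close to the ≈ 3.8 quoted in the text
```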


Sample   # events

Stop          900
SuSy       11,557
ttbar     716,298

Table 18: The number of events when the lepton trigger is left out. The numbers of SuSy and ttbar events are multiplied by a factor of ∼ 4.6, the factor by which the number of stop events has increased. The ttbar sample has been scaled.


Figure 31: left: The normalized output of the signal (stop) and background (SuSy+ttbar) of the MLP. right: The efficiencies of both the signal and the background as a function of the MLP variable. The signal has been trained over 462 events and tested over 438 events. The full background has been trained and tested over a number of events corresponding to the ratio of the cross-sections.

MLP cut value   signal efficiency   background efficiency   Significance

0.002           0.850               0.120                   2.606
0.006           0.600               0.040                   3.187
0.008           0.520               0.026                   3.426
0.010           0.450               0.018                   3.563
0.012           0.340               0.014                   3.052
0.014           0.270               0.010                   2.868

Table 19: MLP cutting values and corresponding efficiencies and significances

It is clear from these results that the maximum found is slightly lower than in the less trained MLP. This means that more intensive training does not provide better signal discrimination. Although the maximum is lower, it is believed, however, that this maximum significance is more accurate due to running over a larger number of events.

The BDT also does not seem to benefit from a larger number of events. The results it produces are not much better than those presented in figure 26. The reason why the BDT does not improve is not understood; varying the settings of the BDT does not seem to help.


8.4.1 Results when training the backgrounds individually

To check whether a more properly trained MLP helps to increase the stop signal significance, the MLPs are trained with the individual backgrounds again. Results of the training are presented in figures 32 and 33. In table 20 a selection of MLP cutting values is presented. The maximum achieved significance is again of the order of what would be expected from a more proper training of the MLP. It is clear that the stop-SuSy MLP does not contribute to the optimization of the stop signal: the discriminating variables of the two signals are too alike to be separated properly by the MLP.


Figure 32: The output and efficiencies of training the MLP for stop vs. ttbar events when trained over the larger number of events.


Figure 33: The output and efficiencies of training the MLP for stop vs. SuSy events when trained over the larger number of events.


ttbar MLP cut value   SuSy MLP cut value   # Stops left   # Background left   Significance

> 0.005               > 0.05, < 0.13       249             24869              1.58
> 0.005               > 0.02, < 0.13       481            392328              2.42
> 0.005               > 0.01               549             42368              2.67
> 0.005               > 0.03               406             35330              2.16
> 0.005               -                    723             50220              3.22
> 0.01                > 0.01               420             17578              3.16
> 0.01                -                    592             24898              3.75
> 0.012               -                    533             20435              3.73
> 0.015               -                    473             16009              3.73
> 0.020               -                    372             11418              3.48
> 0.022               -                    342             10167              3.39

Table 20: The results of a selection of cuts on the MLP distributions. Again, the stop-SuSy MLP does not improve the significance at all. The best option would be to cut only on the values calculated with the stop-ttbar MLP.


9 Conclusions

The goal of this research was to look at stop-pair decays and compare them to top-pair decays. This has not been fully achieved, mainly because the signal optimization process used up most of the time. The first part, researching 'stop properties', was successful: it is clear how the stops decay and with which branching fractions. The decay topologies have also been looked at, and it is shown how often these topologies are actually retrieved from the data.

A small b-tagging study has also been done, and although the conclusion was that the standard settings would be used, this was not a foregone conclusion, so the study served as a check. Even though the b-tagging did not prove to be more useful, it certainly was educational to have a more in-depth look at it.

Optimizing signals against specific backgrounds has proven not to be an easy job. Although the first method presented (the histogram method) was the most straightforward one, it was clear that it would hardly be possible to keep doing this for more than a few discriminating variables. At a late stage, however, the TMVA package of ROOT became available, and the optimization study could be done with more advanced techniques. Using the MLP and BDT methods yielded some results, and though the expectation was to gain a better significance, the signal discrimination was not much better. The BDT method did not seem to work in the stop signal optimization, probably because the BDT needs a lot of events. Especially discriminating stop from 'non-stop' SuSy has proven to be difficult. It must be said, however, that TMVA has more methods than the ones used in this research, and the methods that were used could probably produce better results with different settings; for this research there was no time to look extensively into all of them. Another option to improve the signal discrimination with the used methods might be to introduce other or additional discriminating parameters.


References

[1] S.L. Glashow, Partial Symmetries of Weak Interactions, Nucl. Phys. 22(1961), 579.

[2] S.A. Weinberg, A model of Leptons, Phys. Rev. Lett. 19(1967), 1264.

[3] A. Salam, Weak and Electromagnetic Interaction, in 'Elementary Particle Theory', p. 367, Stockholm, 1968, edited by Almquist and Wiksell.

[4] M.E. Peskin and D.V. Schroeder, An Introduction to Quantum Field Theory, Perseus Books,1995.

[5] D. Griffiths, Introduction to Elementary Particles, Wiley, 1987.

[6] V.A. Gol’fand and E.P. Likhtman, Extensions of the Algebra of Poincare Group Generators andViolation of P Invariance, JETP Lett. 13(1971), 323.

[7] D. Volkov and V.P. Akulov, Possible Universal Neutrino Interactions, JETP Lett. 16(1972), 438.

[8] J. Wess and B. Zumino, Supergauge Transformations in Four Dimensions, Nucl. Phys. B70 (1974), 139.

[9] S. Dawson, Susy and such, arXiv:hep-ph/9612229v2.

[10] Hitoshi Murayama, Supersymmetry phenomenology, arXiv:hep-ph/0002232v2.

[11] Manuel Drees, An introduction to supersymmetry, arXiv:hep-ph/9611409v1.

[12] Stephen P. Martin, A Supersymmetry Primer, arXiv:hep-ph/9709356v4.

[13] D. Fayet, Spontaneously Broken Supersymmetric Theories of Weak Electromagnetic and StrongInteractions, Phys. Lett. B69(1977), 489.

[14] A.H. Chamseddine et al., Local Supersymmetric Grand Unification, Phys. Rev. Lett. 49(1982),970.

[15] A. Melchiorri et al., Current Constraints on Cosmological Parameters from Microwave Background Anisotropies, arXiv:astro-ph/0302361.

[16] N. van Eldik, The ATLAS muon spectrometer: calibration and pattern recognition, Ph.D. thesis,University of Amsterdam, 2007.

[17] T. Cornelissen, Track Fitting in the ATLAS experiment, Ph.D. thesis, University of Amsterdam,2006.

[18] ATLAS Collaboration, ATLAS Detector and Physics Technical Design Report, CERN/LHCC/99-14&15, 1999.

[19] R. Brun and F. Rademakers, ROOT - An Object Oriented Data Analysis Framework, Nuclear Instruments and Methods 389 (1997).

[20] A. Hocker, J. Stelzer, F. Tegenfeldt, H. Voss, and K. Voss, TMVA: Toolkit for Multivariate DataAnalysis with ROOT, Users Guide, arXiv physics/0703039.
