W/Z Tagging With UFO Jets · 7/23/2020 · • UFO jets are going to be the new baseline for...


W/Z Tagging With UFO Jets

Xiang Chen([email protected], advisor: Liang Li)

Christof Sauer

Technical supervisor: Chris Malena Delitzsch

CPG meeting

Thursday, July 23, 2020


Introduction

• UFO jets are going to be the new baseline for large-R jets, so new taggers need to be defined to identify boosted, hadronically decaying objects for physics analyses.

• UFO (Unified Flow Objects) = PFOs + TCCs

[Figure: UFO jets combine Particle Flow Objects (PFOs, labelled PFlow), optimized for low Pt, with Track-CaloClusters (TCCs), optimized for high Pt.]

Figure source: https://indico.cern.ch/event/750283/contributions/3154797/attachments/1721841/2780124/20180925_MLB_OopsAllTCCs.pdf


Plan

1. Compare the performance of a variety of substructure variables (in combination with a simple cut on the jet mass) to identify the best discriminating substructure variables for the new jet collection.

2. Develop a simple three-variable tagger based on #1.

3. Develop a simple mass-decorrelated tagger.

4. Develop an advanced tagger from combinations of several inputs (BDT or DNN).


Signal and Background Samples

▪ Signal: W’ -> WZ -> qqqq (two large-R jets, one containing the Z boson decay and the other the W boson decay)

▪ Background: dijets

▪ Several jet collections with different grooming algorithms are studied:

AntiKt10UFOCSSKSoftDropBeta100Zcut10Jets (VanillaSD with beta = 1.0 and z_cut = 0.1)

AntiKt10UFOCSSKRecursiveSoftDropBeta100Zcut5NinfJets (RecursiveSD with beta = 1.0, z_cut = 0.05 and N = infinity)

AntiKt10UFOCSSKBottomUpSoftDropBeta100Zcut5Jets (BottomUpSD with beta = 1.0, z_cut = 0.05)

AntiKt10UFOCSSKTrimmedPtFrac5SmallR20Jets (UFO_Trimming)

▪ Soft drop references:

https://arxiv.org/abs/1402.2657

https://arxiv.org/pdf/1804.03657.pdf

▪ Signal cuts: W boson

fjet_truthJet_pt/1000. > 200

TMath::Abs(fjet_truth_dRmatched_particle_flavor) == 24

TMath::Abs(fjet_truthJet_eta) < 2

TMath::Abs(fjet_truth_dRmatched_particle_dR) < 0.7

fjet_truthJet_GhostBHadronsFinalCount == 0

fjet_truthJet_m > 50000.

fjet_truthJet_m < 100000.

▪ Signal cuts: Z boson

fjet_truthJet_pt/1000. > 200

TMath::Abs(fjet_truth_dRmatched_particle_flavor) == 23

TMath::Abs(fjet_truthJet_eta) < 2

TMath::Abs(fjet_truth_dRmatched_particle_dR) < 0.75

fjet_truthJet_GhostBHadronsFinalCount == 0

fjet_truthJet_m > 60000.

fjet_truthJet_m < 110000.
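The W and Z signal cuts differ only in the matched particle ID, the dR threshold, and the mass window, so they can be written as one parameterized selection. The following is a minimal sketch in Python, assuming each jet is a plain dict keyed by the branch names used in the cuts. The dict layout, the `SELECTIONS` table, and the `passes_truth_selection` name are illustrative and not taken from the analysis code.

```python
# Hypothetical helper mirroring the truth-level signal cuts listed on the slide.
# Branch names come from the slide; the dict-based jet layout is an assumption.
SELECTIONS = {
    # boson: (|PDG ID|, max dR to the matched particle, truth mass window in MeV)
    "W": (24, 0.7, (50000.0, 100000.0)),
    "Z": (23, 0.75, (60000.0, 110000.0)),
}


def passes_truth_selection(jet, boson):
    """Return True if the jet passes the signal cuts for the given boson."""
    pdg_id, max_dr, (m_lo, m_hi) = SELECTIONS[boson]
    return (
        jet["fjet_truthJet_pt"] / 1000.0 > 200.0  # truth pT above 200 GeV
        and abs(jet["fjet_truth_dRmatched_particle_flavor"]) == pdg_id
        and abs(jet["fjet_truthJet_eta"]) < 2.0
        and abs(jet["fjet_truth_dRmatched_particle_dR"]) < max_dr
        and jet["fjet_truthJet_GhostBHadronsFinalCount"] == 0
        and m_lo < jet["fjet_truthJet_m"] < m_hi
    )


# Toy jets with made-up values, for illustration only.
w_jet = {
    "fjet_truthJet_pt": 450000.0,
    "fjet_truth_dRmatched_particle_flavor": -24,
    "fjet_truthJet_eta": 1.2,
    "fjet_truth_dRmatched_particle_dR": 0.1,
    "fjet_truthJet_GhostBHadronsFinalCount": 0,
    "fjet_truthJet_m": 82000.0,
}
z_jet = dict(w_jet, fjet_truth_dRmatched_particle_flavor=23,
             fjet_truth_dRmatched_particle_dR=0.72, fjet_truthJet_m=91000.0)
```

Keeping the thresholds in one table makes the differences between the two selections (dR < 0.7 versus 0.75, and the two mass windows) easy to see and to change.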


Three-Variable W Tagger: Preliminary Result

Left figure: comparison of the two-variable tagger against the three-variable tagger

Right figure: performance of different jet collections with the three-variable tagger

▪ The three-variable tagger improves the performance

▪ BottomUp Soft Drop performs best among the UFO collections


Three-Variable W Tagger: Preliminary Result

Different cut combinations are compared in the same jet collection (working point = 50%)

• mass + D2 + Ntrk500 performs best

• mass + D2 + Tau21 and mass + D2 + KtDR show little improvement over the two-variable tagger


Z tagging: two-variable tagger results at the 50% working point

[Figure panels labelled W and Z]

Z tagging: three-variable tagger results at the 50% working point

[Figure panels labelled Z, Z, and W]

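Mass-decorrelated tagger

▪ Substructure observables are correlated with the jet mass, and MVA taggers exploit this correlation for resonance classification.

▪ After tagging, the background mass distribution is shaped like the signal, so signal and background become indistinguishable in mass and a resonance search is impossible.

▪ The Designing Decorrelated Taggers (DDT) method is used to reduce the dependence of the observable on the mass.

[Figure panels: BottomUp UFO, LCTopo]

The jet scaling variable is defined as

$\rho^{\mathrm{DDT}} = \log\left(\frac{m^2}{p_T \cdot 1\,\mathrm{GeV}}\right)$

and the decorrelated variable is

$\mathrm{Var}^{\mathrm{DDT}} = \mathrm{Var} - a \cdot (\rho^{\mathrm{DDT}} - \mathrm{const})$

But what about a non-linear dependence?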
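The linear DDT transformation can be illustrated with a short sketch. The slide does not say how the slope a and the constant are obtained, so the code below assumes a plain least-squares fit of the variable against ρDDT on background jets and a hypothetical reference value `RHO_REF`. The function names and toy numbers are illustrative only.

```python
import math


def rho_ddt(mass_gev, pt_gev):
    """rho^DDT = log(m^2 / (pT * 1 GeV)), with m and pT given in GeV."""
    return math.log(mass_gev ** 2 / (pt_gev * 1.0))


def fit_slope(rhos, values):
    """Least-squares slope of values versus rhos (assumed way to obtain a)."""
    n = len(rhos)
    mean_r = sum(rhos) / n
    mean_v = sum(values) / n
    cov = sum((r - mean_r) * (v - mean_v) for r, v in zip(rhos, values))
    var = sum((r - mean_r) ** 2 for r in rhos)
    return cov / var


def ddt_transform(value, rho, slope, rho_ref):
    """Var^DDT = Var - a * (rho^DDT - const)."""
    return value - slope * (rho - rho_ref)


# Toy background: (mass, pT) in GeV, with a variable that rises linearly in rho.
toy = [(80.0, 300.0), (60.0, 500.0), (120.0, 800.0), (90.0, 1500.0)]
rhos = [rho_ddt(m, pt) for m, pt in toy]
values = [2.0 + 0.3 * r for r in rhos]

RHO_REF = 1.5  # hypothetical choice for "const"
slope = fit_slope(rhos, values)
flattened = [ddt_transform(v, r, slope, RHO_REF) for v, r in zip(values, rhos)]
```

On this toy input the transformed variable no longer depends on ρDDT. A dependence that is not linear in ρDDT would leave a residual, which is the motivation for the question on the slide.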


K-NN D2 Tagger

▪ Samples are split into two sets: training and testing

▪ Both contain signal and background:

training: 1E6 signal, 1E6 background

testing: 8.5E5 signal, 1E7 background (the signal statistics may not be sufficient)

separated by the testing weight

▪ Cuts: fjet_mass: 50-300 GeV

Pt: 200-2000 GeV

• Measure the X-th percentile of the background substructure distribution in bins of (ρ, pT)

• $D_2^{k\text{-NN}} = D_2 - D_2^{(7\%)}$ is the new decorrelated variable
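The two steps above (percentile per (ρ, pT) bin, then subtraction) can be sketched as follows. This is an illustration under stated assumptions, not the tagger's implementation: the binning, the distance scaling `scales`, the value of k, and every function name are hypothetical, and event weights are ignored.

```python
import bisect
import math


def find_bin(x, edges):
    """Index of the bin containing x, or None if x lies outside the edges."""
    i = bisect.bisect_right(edges, x) - 1
    return i if 0 <= i < len(edges) - 1 else None


def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def build_profile(jets, rho_edges, pt_edges, q=7.0):
    """List of ((rho, pT) bin centre, q-th percentile of background D2)."""
    bins = {}
    for rho, pt, d2 in jets:
        i, j = find_bin(rho, rho_edges), find_bin(pt, pt_edges)
        if i is not None and j is not None:
            bins.setdefault((i, j), []).append(d2)
    profile = []
    for (i, j), vals in bins.items():
        centre = (0.5 * (rho_edges[i] + rho_edges[i + 1]),
                  0.5 * (pt_edges[j] + pt_edges[j + 1]))
        profile.append((centre, percentile(vals, q)))
    return profile


def knn_percentile(profile, rho, pt, scales, k):
    """k-NN regression: average percentile of the k nearest bin centres."""
    def distance(centre):
        return math.hypot((centre[0] - rho) / scales[0],
                          (centre[1] - pt) / scales[1])
    nearest = sorted(profile, key=lambda entry: distance(entry[0]))[:k]
    return sum(p for _, p in nearest) / len(nearest)


def d2_knn(d2, rho, pt, profile, scales, k=3):
    """D2^{k-NN} = D2 - (k-NN estimate of the background percentile)."""
    return d2 - knn_percentile(profile, rho, pt, scales, k)


# Toy background: two rho bins, one pT bin, D2 values 0..100 in each bin.
rho_edges = [0.0, 2.0, 4.0]
pt_edges = [200.0, 2000.0]
background = [(rho, 500.0, float(d2)) for rho in (1.0, 3.0) for d2 in range(101)]
profile = build_profile(background, rho_edges, pt_edges)
value = d2_knn(10.0, 2.5, 600.0, profile, scales=(4.0, 1800.0), k=2)
```

Dividing each coordinate by a scale before taking the distance keeps ρ (order one) and pT (hundreds of GeV) comparable; any other normalization would serve equally well in a sketch like this.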


K-NN fitting: Preliminary Result

[Figure: K-NN fitting]

ROC Curve

▪ After the cut fjet_m in [50, 300] GeV, the ROC curve of $D_2^{k\text{-NN}}$ is slightly lower than that of D2 over the Pt range 200-2000 GeV.

▪ In the high-Pt range (above 1000 GeV), however, $D_2^{k\text{-NN}}$ performs better (shown on the following slides).
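For reference, a ROC curve for a cut-based tagger can be built by scanning the cut value and recording the signal and background efficiencies. The sketch below assumes an upper cut (a jet is tagged when the variable is at or below the cut, as is usual for D2) and unweighted events. The function names and toy values are illustrative only.

```python
def roc_points(signal, background):
    """Scan upper cuts and return (signal efficiency, background efficiency) pairs."""
    cuts = sorted(set(signal) | set(background))
    points = []
    for cut in cuts:
        eff_s = sum(1 for x in signal if x <= cut) / len(signal)
        eff_b = sum(1 for x in background if x <= cut) / len(background)
        points.append((eff_s, eff_b))
    return points


def rejection_at(points, target_eff):
    """Background rejection (1 / background efficiency) at the first scanned
    cut whose signal efficiency reaches target_eff, or None if none does."""
    for eff_s, eff_b in points:
        if eff_s >= target_eff:
            return float("inf") if eff_b == 0 else 1.0 / eff_b
    return None


# Toy values of a D2-like variable; signal tends to sit at lower values.
toy_signal = [0.5, 1.0, 1.5, 2.0]
toy_background = [0.8, 2.5, 3.0, 3.5]
points = roc_points(toy_signal, toy_background)
rejection_50 = rejection_at(points, 0.5)
```

Comparing `rejection_at(points, 0.5)` between two variables corresponds to comparing them at the 50% working point.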


Summary and Todo

▪ Checked the performance of both the two-variable and three-variable W/Z taggers

▪ Two variables: FJetMass + D2

▪ Three variables: FJetMass + D2 + NTrk500 is the best

▪ Validation of LCTopo jets

▪ For both taggers, BottomUp SD is better than LCTopo jets

▪ Z tagging: BottomUp SD is the best

▪ Simple mass-decorrelated tagger

▪ Use the k-NN method to build a simple D2-kNN tagger

▪ ROC curves show it is not as good as D2 (needs validation)

▪ Combine the new mass-decorrelated D2 with Ntrk500 and compare with the previous cuts

Backups


Variable Distribution --- D2

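[Figure panels: D2 distributions for β = 0.5, β = 1.2, and β = 2.0]

▪ D2 gives sensible values for systems that have zero total momentum and for events that are dijet-like.

$D_2^{(\beta)} = \frac{e_3^{(\beta)}}{\left(e_2^{(\beta)}\right)^3}$

$e_2^{(\beta)} = \frac{1}{p_{TJ}^2} \sum_{i<j \in J} p_{Ti}\, p_{Tj}\, R_{ij}^{\beta}$

$e_3^{(\beta)} = \frac{1}{p_{TJ}^3} \sum_{i<j<k \in J} p_{Ti}\, p_{Tj}\, p_{Tk}\, R_{ij}^{\beta} R_{ik}^{\beta} R_{jk}^{\beta}$

$p_{TJ}$: transverse momentum of the jet with respect to the beam

$p_{Ti}$: transverse momentum of particle i

$R_{ij}^2 = (\phi_i - \phi_j)^2 + (y_i - y_j)^2$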
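As a worked illustration of these definitions, the sketch below evaluates e2, e3, and D2 for a list of constituents. It assumes each constituent is a `(pT, y, phi)` tuple and that the jet pT is passed in separately. The function names and the toy constituents are illustrative and not taken from any analysis software.

```python
import math
from itertools import combinations


def delta_r(a, b):
    """R_ij from rapidity and azimuth, with the phi difference wrapped to [-pi, pi)."""
    dphi = (a[2] - b[2] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[1] - b[1], dphi)


def ecf2(constituents, pt_jet, beta):
    """e2: sum over pairs of pT_i * pT_j * R_ij^beta, divided by pT_J^2."""
    total = sum(a[0] * b[0] * delta_r(a, b) ** beta
                for a, b in combinations(constituents, 2))
    return total / pt_jet ** 2


def ecf3(constituents, pt_jet, beta):
    """e3: sum over triples of pT_i pT_j pT_k (R_ij R_ik R_jk)^beta, divided by pT_J^3."""
    total = sum(
        a[0] * b[0] * c[0]
        * (delta_r(a, b) * delta_r(a, c) * delta_r(b, c)) ** beta
        for a, b, c in combinations(constituents, 3)
    )
    return total / pt_jet ** 3


def d2(constituents, pt_jet, beta=1.0):
    """D2 = e3 / e2^3; returns NaN when e2 vanishes (fewer than two separated constituents)."""
    e2 = ecf2(constituents, pt_jet, beta)
    return ecf3(constituents, pt_jet, beta) / e2 ** 3 if e2 > 0 else math.nan


# Toy jet: three unit-pT constituents at the corners of a 0.3-0.4-0.5 triangle.
toy_jet = [(1.0, 0.0, 0.0), (1.0, 0.3, 0.0), (1.0, 0.0, 0.4)]
toy_d2 = d2(toy_jet, pt_jet=3.0, beta=1.0)
```

For the toy jet, e2 = 1.2/9 and e3 = 0.06/27, so D2 = 0.9375. A jet with only two constituents has no triples and therefore D2 = 0, which is the two-prong limit the variable is designed to identify.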


Variable Distribution --- FJet mass & Ntrk500 & KtDR

▪ FJet mass distribution for the W boson

▪ Two-variable tagger: FJet mass and D2 as the baseline

▪ Three-variable tagger: try the Ntrk variable first

▪ Ntrk500 is the number of tracks with pT > 500 MeV that are ghost-associated to the jet

▪ KtDR is obtained from the anti-kt R = 1.0 jet by reclustering its constituents into two subjets; it is the ∆R between these two subjets


Variable Distribution --- Fjet Tau21

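▪ Tau_21 uses N-subjettiness to effectively “count” the number of subjets in a given jet.

$\tau_1 = \frac{1}{d_0} \sum_k p_{T,k} \min\{\Delta R_{1,k}\}$

$\tau_2 = \frac{1}{d_0} \sum_k p_{T,k} \min\{\Delta R_{1,k}, \Delta R_{2,k}\}$

$d_0 = \sum_k p_{T,k} R_0$

$\tau_{21} = \tau_2 / \tau_1$ is the discriminating variable

[Figure panels: $\tau_1$, $\tau_2$, $\tau_{21}$]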
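The N-subjettiness sums can be evaluated directly once the candidate subjet axes are known. The slide does not say how the axes are chosen, so the sketch below simply takes them as an input. Constituents are assumed to be `(pT, y, phi)` tuples and axes `(y, phi)` tuples; all names and toy values are illustrative.

```python
import math


def delta_r(y1, phi1, y2, phi2):
    """Angular distance, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)


def n_subjettiness(constituents, axes, r0=1.0):
    """tau_N = (1/d0) * sum_k pT_k * min over the N axes of dR(axis, k)."""
    d0 = sum(pt for pt, _, _ in constituents) * r0
    total = sum(
        pt * min(delta_r(y, phi, ay, aphi) for ay, aphi in axes)
        for pt, y, phi in constituents
    )
    return total / d0


# Toy two-prong jet: two hard constituents and one soft one in between.
toy_jet = [(2.0, 0.0, 0.0), (2.0, 0.4, 0.0), (1.0, 0.1, 0.0)]
tau1 = n_subjettiness(toy_jet, axes=[(0.2, 0.0)])              # one axis (assumed)
tau2 = n_subjettiness(toy_jet, axes=[(0.0, 0.0), (0.4, 0.0)])  # two axes (assumed)
tau21 = tau2 / tau1
```

For the toy jet, τ1 = 0.18 and τ2 = 0.02, so τ21 is about 0.11. A small τ21 indicates that two axes describe the jet much better than one, as expected for a two-prong W or Z decay.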


Introduction -- Soft Drop

▪ Soft drop declustering recursively removes soft wide-angle radiation from a jet:

1. Break the jet j into two subjets, j1 and j2.

2. Check the soft drop condition $\frac{\min(p_{T1}, p_{T2})}{p_{T1} + p_{T2}} > z_{\mathrm{cut}} \left(\frac{\Delta R_{12}}{R_0}\right)^{\beta}$, where $z_{\mathrm{cut}}$ is the soft drop threshold and $\beta$ is an angular exponent. If the condition is satisfied, j is the final soft-drop jet.

3. Otherwise, redefine j to be the subjet with the larger pT and iterate the procedure.

4. If j cannot be declustered any further, either remove it or leave it as the final jet (two modes).

Recursive Soft Drop:

1. After C/A reclustering, take the remaining branch whose two parent subjets have the widest separation in ∆R, and label these j1 and j2.

2. Apply the same cut condition as in SD.

3. If the pair passes, both subjets are kept; otherwise, the softer one is removed.

BottomUp SD:

Recluster the jet, applying the SD condition from the leaves toward the branches.
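The basic soft drop loop (steps 1 to 4 of the first list) can be sketched on a binary clustering tree. The code assumes the tree has already been built, for example by C/A reclustering, and that each node stores its own pT, rapidity, and azimuth. The `Node` class, the function name, and the toy tree are illustrative; the sketch leaves an undeclusterable jet in place rather than removing it, and the recursive and bottom-up variants are not shown.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One branch of a (hypothetical) pre-built clustering tree."""
    pt: float
    y: float
    phi: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def soft_drop(jet, z_cut=0.1, beta=1.0, r0=1.0):
    """Follow the harder branch until a splitting passes the soft drop condition."""
    node = jet
    while node.left is not None and node.right is not None:
        j1, j2 = node.left, node.right
        z = min(j1.pt, j2.pt) / (j1.pt + j2.pt)
        dphi = (j1.phi - j2.phi + math.pi) % (2.0 * math.pi) - math.pi
        dr = math.hypot(j1.y - j2.y, dphi)
        if z > z_cut * (dr / r0) ** beta:
            return node  # condition satisfied: this is the groomed jet
        node = j1 if j1.pt >= j2.pt else j2  # drop the softer branch
    return node  # nothing left to decluster: keep the jet as it is


# Toy tree: a soft wide-angle branch attached to a hard two-prong core.
core = Node(100.0, 0.12, 0.0,
            left=Node(60.0, 0.0, 0.0), right=Node(40.0, 0.3, 0.0))
toy_jet = Node(105.0, 0.15, 0.0, left=core, right=Node(5.0, 0.8, 0.0))
groomed = soft_drop(toy_jet, z_cut=0.1, beta=1.0, r0=1.0)
```

In the toy tree the first splitting has z of about 0.048, below the threshold 0.1 × 0.68, so the soft branch is dropped. The next splitting has z = 0.4 and passes, so the hard core is returned.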


K-NN

▪ k-nearest-neighbor method

▪ The training examples are vectors in a multidimensional feature space, each with a class label. The training phase of the algorithm consists only of storing the feature vectors and class labels of the training samples. In the classification phase, k is a user-defined constant, and an unlabeled vector (a query or test point) is classified by assigning the label which is most frequent among the k training samples nearest to that query point.

▪ In k-NN regression, the output is the property value for the object. This value is the average of the values of the k nearest neighbors.
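Both uses of k-NN described above fit in a few lines. The sketch below assumes Euclidean distance and unweighted neighbors, and breaks ties in the vote by first occurrence. The function names and toy data are illustrative only.

```python
import math
from collections import Counter


def k_nearest(train, query, k):
    """The k training samples (features, target) closest to the query point."""
    return sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]


def knn_classify(train, query, k):
    """Label that is most frequent among the k nearest training samples."""
    labels = [label for _, label in k_nearest(train, query, k)]
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train, query, k):
    """Average target value of the k nearest training samples."""
    neighbours = k_nearest(train, query, k)
    return sum(value for _, value in neighbours) / len(neighbours)


# Toy training sets: labelled points for classification, values for regression.
labelled = [((0.0, 0.0), "bkg"), ((0.1, 0.2), "bkg"),
            ((1.0, 1.0), "sig"), ((0.9, 1.1), "sig"), ((1.2, 0.8), "sig")]
valued = [((0.0,), 1.0), ((1.0,), 2.0), ((2.0,), 3.0), ((10.0,), 50.0)]

predicted_label = knn_classify(labelled, (1.0, 0.9), k=3)
predicted_value = knn_regress(valued, (1.1,), k=3)
```

The "training" step is nothing more than keeping the list of samples; all the work happens at query time, which is why the method scales poorly with very large training sets unless a spatial index is used.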
