Dataset Based Physics Analysis

Page 1: Dataset Based Physics Analysis

11 December 2006 Elizabeth Gallas 1

Dataset Based Physics Analysis

Elizabeth Gallas, Oxford University

From TOB Task Forces to Final Dress Rehearsal 11 December 2006

Page 2: Dataset Based Physics Analysis

Outline
• Start from the End
• Luminosity and cross section
• Storing Luminosity
• Streaming Test (Ayana)
• Online System
• Assumptions: inclusive streaming
• What happens at Pt 1/Tier 0
• What happens at Tier 1/beyond (Jack, Marjorie)
• Comments about Exclusive Streaming
• Summary
• Conclusion

Page 3: Dataset Based Physics Analysis

Who needs luminosity (normalization)?
• We all agree: for ‘physics’ triggers/analysis, some to greater precision than others
• Think about ‘other’ cases
  – Calibration – not written to ‘physics’ stream !?! Might be luminosity dependent or worse: BCID dependent (Beam Crossing)
  – Calibration – written to ‘physics’ stream ?
• Create a robust system enabling luminosity normalization for many trigger configurations
  – Must come with ‘Operations Rules’ to ensure the necessary bookkeeping is recorded
  – Robustness should include the ability to recover a lower precision luminosity should losses occur

Page 4: Dataset Based Physics Analysis

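Calculating a Physics Cross Section

$$\sigma_{phys} \times BR = \frac{N - N_{bk}}{\epsilon \, A \, L}$$

where

$$L = \int L_{del}(t) \, f_{live}(t) \, P(t) \, dt$$

  $\epsilon$ = efficiency
  $A$ = acceptance
  $L_{del}$ = Delivered Luminosity
  $f_{live}$ = Live Fraction (detector)
  $P$ = Fractional Prescale (fraction of triggered events kept)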
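A minimal sketch of the cross section calculation, assuming the integral over time is approximated by a sum over Luminosity Blocks and that the prescale enters as a fraction between 0 and 1. The class `LumiBlock`, the function names, and all numbers are hypothetical and only illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class LumiBlock:
    delivered_lum: float      # delivered luminosity in this block (nb^-1)
    live_fraction: float      # fraction of the block the detector was live (0..1)
    prescale_fraction: float  # assumed: fraction of triggered events kept (0..1)


def integrated_luminosity(blocks):
    """Discrete stand-in for L = integral of L_del(t) * f_live(t) * P(t) dt."""
    return sum(
        b.delivered_lum * b.live_fraction * b.prescale_fraction for b in blocks
    )


def cross_section_times_br(n_observed, n_background, efficiency, acceptance, luminosity):
    """sigma_phys x BR = (N - N_bk) / (efficiency * acceptance * L)."""
    if efficiency <= 0 or acceptance <= 0 or luminosity <= 0:
        raise ValueError("efficiency, acceptance and luminosity must be positive")
    return (n_observed - n_background) / (efficiency * acceptance * luminosity)


# Made-up example values, not taken from any real run.
blocks = [LumiBlock(10.0, 0.9, 1.0), LumiBlock(20.0, 0.8, 0.5)]
lum = integrated_luminosity(blocks)
sigma_br = cross_section_times_br(120, 35, 0.5, 0.8, lum)
print(f"L = {lum:.2f} nb^-1, sigma x BR = {sigma_br:.2f} nb")
```

With these inputs the two blocks contribute 9 and 8 nb^-1, so the result is in nb.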

Page 5: Dataset Based Physics Analysis

Physics and Trigger Cross Sections

$$\sigma_{phys} = f(\sigma_{trig}, LBN)$$

• Where f includes
  – Trigger efficiency
  – Reconstruction efficiency
• Assumes: physics dataset is composed of all events recorded satisfying a trigger in a well defined set of Luminosity Blocks (LBNs)
• We can measure the trigger cross section for each Luminosity Block: $\sigma_{trig}^{LBN}$

Page 6: Dataset Based Physics Analysis

Measuring a Trigger Cross Section

$$\sigma_{trig}^{LBN} = \frac{N_{trig}}{L_{inst} \, D \, f_{live} \, P_{trig}}$$

  if LBN "good": as above
  if LBN "bad": (the upside-down horseshoe)

where
  $N_{trig}$ = Number of events passing the trigger condition
  $L_{inst}$ = Instantaneous Luminosity ($\epsilon$, $A$ corrected)
  $D$ = Duration (~1 minute or less)
  $f_{live}$ = Live Fraction (trigger, detector)
  $P_{trig}$ = Fractional Prescale (fraction of triggered events kept)

Page 7: Dataset Based Physics Analysis

Simple Luminosity DB – 2 Tables

Run_LBN Table
  Run Number
  LBN
  Start Time
  End Time
  Duration (seconds)
  Luminosity (nb-1)
  Live Fraction
  Quality

Run_LBN_Trigger Table
  Run Number
  LBN
  Trigger Name
  L1_EVENTS
  L2_EVENTS
  L3_EVENTS
  L1_PRESCALE
  L2_PRESCALE
  L3_PRESCALE
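The two tables can be sketched as a small SQLite schema. This is only an illustration: the column types, the primary and foreign keys, and the column name DELIVERED_LUM for "Luminosity (nb-1)" are assumptions, the table names follow the RUN_LBNS / RUN_LBN_TRIGGERS spelling used later in the talk, and the inserted rows are fake.

```python
import sqlite3

# Assumed schema: types and keys are guesses, not the production definition.
SCHEMA = """
CREATE TABLE RUN_LBNS (
    RUN_NUMBER    INTEGER NOT NULL,
    LBN           INTEGER NOT NULL,
    START_TIME    TEXT,
    END_TIME      TEXT,
    DURATION      REAL,   -- seconds
    DELIVERED_LUM REAL,   -- nb^-1
    LIVE_FRACTION REAL,
    QUALITY       TEXT,
    PRIMARY KEY (RUN_NUMBER, LBN)
);
CREATE TABLE RUN_LBN_TRIGGERS (
    RUN_NUMBER   INTEGER NOT NULL,
    LBN          INTEGER NOT NULL,
    TRIGGER_NAME TEXT NOT NULL,
    L1_EVENTS    INTEGER,
    L2_EVENTS    INTEGER,
    L3_EVENTS    INTEGER,
    L1_PRESCALE  REAL,
    L2_PRESCALE  REAL,
    L3_PRESCALE  REAL,
    PRIMARY KEY (RUN_NUMBER, LBN, TRIGGER_NAME),
    FOREIGN KEY (RUN_NUMBER, LBN) REFERENCES RUN_LBNS (RUN_NUMBER, LBN)
);
"""


def create_lumi_db(path=":memory:"):
    """Create an empty luminosity database with the two tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = create_lumi_db()
# Fake rows, for illustration only.
conn.execute(
    "INSERT INTO RUN_LBNS VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    (1000, 1, "2006-12-11T10:00:00", "2006-12-11T10:01:00", 60.0, 5.0, 0.95, "good"),
)
conn.execute(
    "INSERT INTO RUN_LBN_TRIGGERS VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (1000, 1, "EXAMPLE_TRIGGER", 5000, 800, 95, 1.0, 1.0, 2.0),
)
# Join each trigger row to the luminosity block it belongs to.
rows = conn.execute(
    "SELECT t.TRIGGER_NAME, r.DELIVERED_LUM, r.LIVE_FRACTION, t.L3_EVENTS "
    "FROM RUN_LBN_TRIGGERS t "
    "JOIN RUN_LBNS r ON r.RUN_NUMBER = t.RUN_NUMBER AND r.LBN = t.LBN"
).fetchall()
print(rows)
```

Keying both tables on (Run Number, LBN) keeps every trigger count attached to exactly one luminosity block.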

Page 8: Dataset Based Physics Analysis

For the Streaming Test
• This is a simple Luminosity Database being implemented for the Streaming Test
• For each Run_Number, LBN, Trigger we can calculate:
  – Trigger_Luminosity = DELIVERED_LUM * LIVE_FRACTION / PRESCALE
  – Trigger_Cross_Section = L3_EVENTS / Trigger_Luminosity
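The two per-trigger quantities can be sketched as plain functions. The function names are hypothetical, and PRESCALE is assumed here to be a single overall prescale factor (1 means every event is kept); how the Level 1, 2 and 3 prescales combine into it is not specified on the slide.

```python
def trigger_luminosity(delivered_lum, live_fraction, prescale):
    """Trigger_Luminosity = DELIVERED_LUM * LIVE_FRACTION / PRESCALE."""
    if prescale <= 0:
        raise ValueError("prescale must be positive")
    return delivered_lum * live_fraction / prescale


def trigger_cross_section(l3_events, delivered_lum, live_fraction, prescale):
    """Trigger_Cross_Section = L3_EVENTS / Trigger_Luminosity.

    Returns None when the block has no usable luminosity.
    """
    lum = trigger_luminosity(delivered_lum, live_fraction, prescale)
    if lum <= 0:
        return None
    return l3_events / lum


# Made-up numbers: 5 nb^-1 delivered, 80% live, prescale factor 2.
lum = trigger_luminosity(5.0, 0.8, 2.0)
xsec = trigger_cross_section(100, 5.0, 0.8, 2.0)
print(f"trigger luminosity = {lum:.2f} nb^-1, trigger cross section = {xsec:.1f} nb")
```

Returning None for a block with zero live luminosity avoids a division by zero and leaves the decision about such blocks to the caller.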

Page 9: Dataset Based Physics Analysis

Lum DB – StreamTest Data Sources
• Columns populated from Ayana's log files:
  – RUN_LBNS.RUN_NUMBER, LBN
  – RUN_LBN_TRIGGERS.TRIGGER_NAME, L3_ACCEPTS
• Other database columns (Start, End_times, prescales, luminosity ...) populated with 'fake' data based on logical assumptions about what we expect the data to look like; will evolve with the Test.
  – Placeholders for Deadtime, Prescales, Level 1,2 Accepts, Data Quality, Duration

Page 10: Dataset Based Physics Analysis

Lum Database – Real Data Sources
• Online Quantities
  – Run, LBN, Start, End_time: RC (Run Control)
  – Live Fraction: Level 1
  – EVENTS accepted: HLT or EventLossMonitor
• Trigger Configuration
  – Prescales at Level 1, 2, and 3, Trigger Names
• Luminosity system Quantities
  – Acceptance, efficiency corrected Luminosity
• Offline: Store in Conditions Database
  – Richard Hawkings, Trigger/Physics week (3/11/06)

Page 11: Dataset Based Physics Analysis

Conditions DB - basic concepts
• COOL-based database architecture - data with an interval of validity (IOV)

[Diagram: Application; COOL (C++, python APIs, specific data model); Relational Abstraction Layer, CORAL (SQL-like C++ API); back ends: Oracle DB (Online, Tier 0/1), MySQL DB (small DB replicas), SQLite file (file-based subsets), Frontier web (http-based proxy/cache). Folder rows (indexed): IOVstart, IOVstop, channel (tag), payload.]

• COOL IOV (63 bit) can be interpreted as:
  – Absolute timestamp (e.g. for DCS)
  – Run/event number (e.g. for calibration)
  – Run/LB number (possible to implement)
• COOL payload defined per ‘folder’
  – Tuple of simple types (1 DB table row)
  – Can also be a reference to external data
  – Use channels (int, soon string) for multiple instances of data in 1 folder
  – COOL tags allow multiple data versions
• COOL folders organised in a hierarchy
• Athena interfaces, replication, …
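The interval-of-validity idea with a Run/LB key can be illustrated with a toy lookup. This is not the COOL API: the class `IovFolder`, the functions, and the bit layout (run number in the upper 31 bits, LB number in the lower 32 bits of a 63-bit key) are all assumptions made for the sketch.

```python
import bisect

LB_BITS = 32               # assumed: low 32 bits hold the LB number
MAX_KEY = (1 << 63) - 1    # largest 63-bit key


def pack_run_lb(run, lb):
    """Pack (run, LB) into one 63-bit integer key (assumed layout)."""
    if not (0 <= run < (1 << 31)) or not (0 <= lb < (1 << LB_BITS)):
        raise ValueError("run or LB number out of range")
    return (run << LB_BITS) | lb


def unpack_run_lb(key):
    """Inverse of pack_run_lb."""
    return key >> LB_BITS, key & ((1 << LB_BITS) - 1)


class IovFolder:
    """Toy single-channel folder of non-overlapping [since, until) intervals."""

    def __init__(self):
        self._since = []    # sorted interval start keys
        self._entries = []  # (since, until, payload), same order as _since

    def store(self, since, until, payload):
        if not since < until:
            raise ValueError("empty or inverted interval")
        i = bisect.bisect_left(self._since, since)
        # Reject overlap with the previous or the next stored interval.
        if i > 0 and self._entries[i - 1][1] > since:
            raise ValueError("overlaps previous interval")
        if i < len(self._entries) and self._entries[i][0] < until:
            raise ValueError("overlaps next interval")
        self._since.insert(i, since)
        self._entries.insert(i, (since, until, payload))

    def find(self, key):
        """Return the payload valid at key, or None if no interval covers it."""
        i = bisect.bisect_right(self._since, key) - 1
        if i >= 0:
            since, until, payload = self._entries[i]
            if since <= key < until:
                return payload
        return None


folder = IovFolder()
folder.store(pack_run_lb(1000, 1), pack_run_lb(1000, 11), {"quality": "good"})
folder.store(pack_run_lb(1000, 11), pack_run_lb(1000, 21), {"quality": "bad"})
print(folder.find(pack_run_lb(1000, 15)))
```

Packing run and LB into one integer keeps the ordering by run first and LB second, so a single sorted index serves all three interpretations of the IOV.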

Page 12: Dataset Based Physics Analysis

What happens at Point 1/Tier 0
• Assume
  – Run Structure as in Run Structure Report.
  – An Inclusive Streaming model (complications of Exclusive model: later…)
• At Point 1 (ATLAS online):
  – 5-10 data loggers open/close files on LumiBlock boundaries
  – lots of small files in byte stream format
• At Tier 0, small files combined
  – Files contain one/more complete LBNs
• Ignored in this simple model: BCID, Stream… dependence
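One way to picture the Tier 0 combining step is a merge plan that never splits an LBN across output files. The sketch below is purely illustrative: the function `plan_merges`, the file names, and the fixed number of LBNs per output file are hypothetical, not a description of the actual Tier 0 software.

```python
from collections import defaultdict


def plan_merges(small_files, lbns_per_output=2):
    """Group small per-logger files so each output holds complete LBNs.

    small_files: iterable of (run, lbn, filename) tuples.
    Returns a list of dicts with the run, the LBNs covered, and the inputs.
    """
    by_run = defaultdict(lambda: defaultdict(list))
    for run, lbn, name in small_files:
        by_run[run][lbn].append(name)

    plan = []
    for run in sorted(by_run):
        lbns = sorted(by_run[run])
        # Take whole LBNs at a time; all files of an LBN go to the same output.
        for i in range(0, len(lbns), lbns_per_output):
            chunk = lbns[i:i + lbns_per_output]
            inputs = sorted(name for lbn in chunk for name in by_run[run][lbn])
            plan.append({"run": run, "lbns": chunk, "inputs": inputs})
    return plan


# Hypothetical files from two data loggers.
files = [
    (1000, 1, "log1_lb1.data"),
    (1000, 1, "log2_lb1.data"),
    (1000, 2, "log1_lb2.data"),
    (1000, 3, "log2_lb3.data"),
]
plan = plan_merges(files)
for entry in plan:
    print(entry)
```

Because grouping is done per LBN before chunking, no output file can contain only part of a Luminosity Block.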

Page 13: Dataset Based Physics Analysis

What happens at Tier 1, beyond
• Pt 1/Tier 0 makes files respecting LBN boundaries
• Subsequent processing must allow us to collect and track complete data sets for physics analysis in well defined sets of LBNs
• Our bookkeeping must track LBNs
  – By EVENT / TAG (Jack)
  – By Metadata / File (Marjorie)

Page 14: Dataset Based Physics Analysis

“Exclusive Streaming” - Online
• Online: More acCOUNTing
  – Count events by run / lbn / trigger AND new index: stream
• Smooth online running depends on
  – Accurate predictions of rates and overlaps
  – Trigger rates depend on fully operational Triggers and detectors
• Balancing streams
  – Empty file syndrome (empty or lost?)
  – Special runs with special prescales require stream balance analysis
  – How do we keep rates to one stream from exceeding ?? %
  – How do we keep rates to one stream from dropping < ?%
• Heightens importance of predictions of rates / overlaps as trigger configurations evolve
  – Prescales
  – Thresholds
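The extra accounting and the stream balance question can be sketched as follows, assuming each event is recorded as a (run, lbn, trigger, stream) tuple. The function names and stream names are hypothetical, and the balance limits are left as required parameters because the talk does not fix them.

```python
from collections import Counter


def count_events(events):
    """Count events by (run, lbn, trigger, stream)."""
    return Counter(events)


def stream_fractions(counts):
    """Fraction of all counted events that went to each stream."""
    totals = Counter()
    for (_run, _lbn, _trigger, stream), n in counts.items():
        totals[stream] += n
    total = sum(totals.values())
    if total == 0:
        return {}
    return {stream: n / total for stream, n in totals.items()}


def out_of_balance(fractions, min_fraction, max_fraction):
    """Streams whose share of events falls outside the chosen limits."""
    return sorted(
        s for s, f in fractions.items() if f < min_fraction or f > max_fraction
    )


# Made-up events: 6 to streamA, 3 to streamB, 1 to streamC.
events = (
    [(1000, 1, "trigA", "streamA")] * 6
    + [(1000, 1, "trigB", "streamB")] * 3
    + [(1000, 2, "trigC", "streamC")] * 1
)
counts = count_events(events)
fractions = stream_fractions(counts)
flagged = out_of_balance(fractions, min_fraction=0.2, max_fraction=0.5)
print(fractions, flagged)
```

The limits 0.2 and 0.5 in the example are arbitrary placeholders chosen only to exercise both bounds.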

Page 15: Dataset Based Physics Analysis

“Exclusive Streaming” - Offline
• Offline: Implications for TAG database (Jack).
• Splits trigger samples to different Streams
  – LBNs distributed onto 1/more file(sets)
  – Note: a ‘topological’ split, not just random
• Complicates: tracking of processing history (parentage)
• Should processing for different streams occur at different Tier 1 sites, successful analysis depends on Grid coherence
• Robustness should include ability to recover a lower precision luminosity should losses occur

Page 16: Dataset Based Physics Analysis

“Exclusive Streaming” (3)
• In an ideal world, might be manageable, but infrastructure will be quite complex and day-to-day situations will conspire to foil the best laid plans
  – Perhaps after a few ‘months?’ of data taking
  – but bookkeeping much more complicated
• The Streaming Test will not answer:
  – How the system might be actually (ab)used
  – But has experience with ‘grid coherence’
• Make Rules, lots of Rules.
  – Hope people understand, follow, or they will invent their own tools…

Page 17: Dataset Based Physics Analysis

Summary
• FACT: A physics event dataset comprised of or related to the complete set of events passing a trigger recorded in an LBN has a corresponding measurable integrated luminosity.
  – Any dataset which does not DOES NOT
  – Unless that process cross section can be related to a known process cross section taken in the same interval
• We measure luminosity in Luminosity Blocks (smallest complete fundamental unit of ATLAS data taking)
  – Pt 1/Tier 0 makes files respecting LBN boundaries
  – Subsequent processing must allow us to collect and track complete data sets for physics analysis in well defined sets of LBNs
  – Our bookkeeping must track LBNs by
    – EVENT / TAG (Jack)
    – Metadata / File (Marjorie)
• Data Quality
  – Naturally indexed by Run/LBN (or Run/LBN range)
  – Any other index must have a direct relationship to Run/LBN
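The summary point, that a dataset made of complete LBNs has a measurable integrated luminosity, can be sketched as a sum over the (run, LBN) keys the dataset covers. The function `dataset_luminosity`, the data layout, and the choice to skip LBNs not flagged good are assumptions for illustration; the numbers are made up.

```python
def dataset_luminosity(dataset_lbns, lumi_by_lbn, good_lbns):
    """Sum the luminosity of the good LBNs that a dataset covers.

    dataset_lbns: iterable of (run, lbn) keys present in the dataset.
    lumi_by_lbn:  dict mapping (run, lbn) to live, prescale-corrected luminosity.
    good_lbns:    set of (run, lbn) keys passing data quality.
    Returns (total luminosity, sorted list of LBNs actually used).
    """
    keys = set(dataset_lbns)
    missing = sorted(k for k in keys if k not in lumi_by_lbn)
    if missing:
        # Without a luminosity record the dataset cannot be normalized.
        raise KeyError(f"no luminosity recorded for {missing}")
    used = sorted(k for k in keys if k in good_lbns)
    return sum(lumi_by_lbn[k] for k in used), used


# Made-up bookkeeping: three LBNs, the middle one flagged bad.
lumi_by_lbn = {(1000, 1): 2.0, (1000, 2): 3.0, (1000, 3): 1.5}
good_lbns = {(1000, 1), (1000, 3)}
total, used = dataset_luminosity(
    [(1000, 1), (1000, 2), (1000, 3)], lumi_by_lbn, good_lbns
)
print(total, used)
```

Raising an error for an LBN with no luminosity record, rather than silently skipping it, reflects the requirement that every index relate directly to Run/LBN.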

Page 18: Dataset Based Physics Analysis

Conclusion
• Required: Robust Operational Model including a Luminosity Block based (or aware)
  – File management and bookkeeping
  – Data analyses
  – Data quality (Detector, trigger, reconstruction)
• Having it from day 1
  – Enables us to debug problems faster
  – Trigger cross sections are constant: measuring their rates helps monitor stability
• Streaming Models
  – The devil is in the details: ‘exclusive streaming’ by physics / trigger will
    – Initially make it difficult to find problems
    – Be an ongoing challenge to operations and bookkeeping
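Since trigger cross sections are expected to be constant, a simple stability monitor can compare each LBN's trigger cross section with the median over the run. This is a hedged sketch: the function `unstable_lbns`, the use of the median as reference, and the 20% default tolerance are assumptions, not a procedure given in the talk.

```python
from statistics import median


def unstable_lbns(xsec_by_lbn, rel_tolerance=0.2):
    """Return LBNs whose trigger cross section deviates from the median.

    xsec_by_lbn:   dict mapping LBN number to measured trigger cross section.
    rel_tolerance: allowed relative deviation from the median (assumed value).
    """
    if not xsec_by_lbn:
        return []
    reference = median(xsec_by_lbn.values())
    if reference == 0:
        # With a zero reference, any non-zero value counts as a deviation.
        return [lbn for lbn, x in sorted(xsec_by_lbn.items()) if x != 0]
    return [
        lbn
        for lbn, x in sorted(xsec_by_lbn.items())
        if abs(x - reference) > rel_tolerance * abs(reference)
    ]


# Made-up per-LBN trigger cross sections (nb); LBN 4 is clearly off.
xsec_by_lbn = {1: 50.0, 2: 51.0, 3: 49.5, 4: 30.0, 5: 50.5}
flagged_lbns = unstable_lbns(xsec_by_lbn)
print(flagged_lbns)
```

A flagged LBN points to a change in trigger, detector, or luminosity bookkeeping during that block, which is the kind of problem the per-LBN model is meant to expose early.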