Data Quality Update.

Page 1: Data Quality Update.

Data Quality Update.

Andy Blake, Cambridge University

Friday February 23rd 2007

Page 2: Data Quality Update.

Introduction

Andy Blake, Cambridge University Data Quality Talk, slide 2

• Have looked at far detector data up to December 31st 2006.

– Will present updated run selection and performance plots.

– Working on an automation scheme for run selection.

• A large collection of data quality software now exists.

– Data quality information stored in cedar ntuples.

– Detector status information recorded in database.

– Will review the available tools.

• These tools can now be used for physics analyses.

– In place of “Mad/fddataquality.h” look-up table.

Page 3: Data Quality Update.

Run Selection

• Selection of far detector physics data.

– Run types 0x301 (physics), 0x4031 (modified physics).

– Run must contain more than 5 minutes of data.

• Data quality checks.

– Data should contain one or more “physics-like” events.

– Require complete ROP mask (all 16 crates reading out).

– Require some period of data with < 20 dead chips.

– Check for anomalous snarl or event rates.

– Keep track of CRL logs and end of week reports.

– Use feedback from physics analyses to refine run lists.

[N.B: only remove runs that are entirely bad].

• Results currently collated in doc-db #2067.

– Lists of good runs compiled for Aug 1st 2003 → Dec 31st 2006.

– Contains notes and comments on far detector performance.

• Still working on automating run selection code.
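
The automatable part of the criteria above could be sketched as a single pass/fail test. This is only an illustration: `SubRunSummary`, its field names and `PassesRunSelection` are hypothetical and not part of the existing software, and the checks on anomalous rates, CRL logs and analysis feedback are deliberately left out.

```cpp
#include <cstdint>

// Hypothetical per-subrun summary; names are illustrative only.
struct SubRunSummary {
    std::uint32_t runType;     // 0x301 (physics) or 0x4031 (modified physics)
    double durationSeconds;    // length of the subrun
    int physicsLikeEvents;     // number of "physics-like" events found
    int cratesReadingOut;      // crates present in the ROP mask
    int minDeadChips;          // smallest dead-chip count seen during the subrun
};

bool IsPhysicsRunType(std::uint32_t runType) {
    return runType == 0x301 || runType == 0x4031;
}

bool PassesRunSelection(const SubRunSummary& s) {
    if (!IsPhysicsRunType(s.runType)) return false;
    if (s.durationSeconds <= 5.0 * 60.0) return false;  // need >5 minutes of data
    if (s.physicsLikeEvents < 1) return false;          // one or more physics-like events
    if (s.cratesReadingOut != 16) return false;         // complete ROP mask
    if (s.minDeadChips >= 20) return false;             // some period with <20 dead chips
    return true;
}
```

Using the minimum dead-chip count reflects the "some period of data" wording: a subrun is only rejected on this criterion if it never drops below 20 dead chips.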

Page 4: Data Quality Update.

Run Selection Results

RED = some physics runs are rejected after data quality checks.

Month       ALL RUNS   GOOD RUNS
2005
  May       740        740
  Jun       715        715
  Jul       734        733
  Aug       742        742
  Sep       723        723
  Oct       744        737
  Nov       717        716
  Dec       747        747
2006
  Jan       749        663
  Feb       621        621
  Mar       602        586
  Apr       619        616
  May       688        681
  Jun       742        742
  Jul       701        698
  Aug       703        662
  Sep       726        726
  Oct       746        746
  Nov       725        725
  Dec       752        752

ALL RUNS = ALL PHYSICS SUBRUNS (run type 0x301, 0x4031, >5 minutes).

GOOD RUNS = GOOD PHYSICS SUBRUNS (after all data quality checks).

Annotations on the table:

– crate 15 out of readout

– crate 12 out of readout

– crate 0 out of readout

– + HV trips (appears three times)

– many HV trips (a.k.a. “long weekend”)

(N.B.: the majority of bad runs occurred with beam off.)

Excellent recent performance!

Page 5: Data Quality Update.

Far Det Performance: (I) Snarl Rates

[Plot: far detector snarl rates; time axis labelled 2006 and 2007.]

Most LI leaks are due to HV trips (TPMT disabled).

Page 6: Data Quality Update.

Far Det Performance: (II) HV Status

[Plot: HV status for SM1 and SM2; time axis labelled 2006 and 2007.]

HV trips account for ~0.5% of live time.

RED: COLD CHIPS ≥ 20 (HV TRIP). BLUE: COLD CHIPS < 20 (HV OK).

Page 7: Data Quality Update.

Far Det Performance: (III) Coil Status

[Plot: coil status for SM1 and SM2; time axis labelled 2006 and 2007.]

Coil trips account for ~1.2% of live time.

Page 8: Data Quality Update.

Far Det Performance: (IV) GPS Error

[Plot: GPS error; time axis labelled 2006 and 2007.]

0.3% of spills have GPS error > 1000 ns.

Page 9: Data Quality Update.

Far Det Performance: (IV) GPS Error

[Plot: GPS error distribution. Annotations: “0.1 ms!”; “cut at ~1 μs”; “2 GPS units”; “1 GPS unit”.]

GPS error has quite long tails!

Page 10: Data Quality Update.

Far Det Performance: (V) Live Time

[Plot: total live time; time axis labelled May 1st 2005, Jan 1st 2006 and Sep 1st 2006. Annotated features: shutdown, beam off, “long weekend”, HV trips.]

(Total live time from all selected physics runs; requires good HV, coil and GPS.)

N.B.: not yet correlated with beam spills.
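
As a rough sketch of the live time sum described here, one could add up the lengths of all periods in which HV, coil and GPS are simultaneously good. The `StatusInterval` struct and `TotalLiveTimeSeconds` function are hypothetical names for illustration, and the intervals are assumed not to overlap.

```cpp
#include <vector>

// Hypothetical record: one contiguous period with fixed detector status.
struct StatusInterval {
    double startSeconds;
    double endSeconds;
    bool hvOk;
    bool coilOk;
    bool gpsOk;
};

// Sum the durations of intervals where HV, coil and GPS are all good.
// Assumes the intervals do not overlap; malformed intervals are skipped.
double TotalLiveTimeSeconds(const std::vector<StatusInterval>& intervals) {
    double total = 0.0;
    for (const StatusInterval& i : intervals) {
        if (i.hvOk && i.coilOk && i.gpsOk && i.endSeconds > i.startSeconds) {
            total += i.endSeconds - i.startSeconds;
        }
    }
    return total;
}
```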

Page 11: Data Quality Update.

Run Selection Automation

• Problems with current selection code:

– Basically kludged together from pieces of code written during my PhD.

– Runs on “data quality ntuples” generated from raw data.

– Much of the code not in CVS.

– Virtually impossible to automate!

• Currently developing database table to hold run information.

– Easy to look up run information and tune the run selection.

– Will facilitate calculations of live time and PoTs.

– Very easy to automate!

• The table will contain the following information:

– Run, SubRun, RunType.

– StartTime, EndTime, TimeFrames.

– ROP Mask, Trigger Mask.

– Snarls, “Good” Snarls.

– Snarl Rate Min, Max, Mean, Median.

… some other stuff …

– Pass/Fail.
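
One way to picture a row of such a table is as a plain struct, together with a helper that fills in the snarl rate summary. This is a sketch only: the struct names, member types and time units are assumptions, and the real table may be laid out quite differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical snarl rate summary for one subrun.
struct SnarlRateStats {
    double min;
    double max;
    double mean;
    double median;
};

// Hypothetical row layout mirroring the fields listed above.
struct RunInfoRow {
    int run;
    int subRun;
    std::uint32_t runType;
    std::int64_t startTime;     // assumed: seconds since epoch
    std::int64_t endTime;       // assumed: seconds since epoch
    int timeFrames;
    std::uint32_t ropMask;
    std::uint32_t triggerMask;
    int snarls;
    int goodSnarls;
    SnarlRateStats snarlRate;
    bool pass;
};

// Compute min, max, mean and median of a set of snarl rate samples.
// The vector is taken by value so it can be sorted locally.
SnarlRateStats SummarizeSnarlRates(std::vector<double> rates) {
    SnarlRateStats s = {0.0, 0.0, 0.0, 0.0};
    if (rates.empty()) return s;
    std::sort(rates.begin(), rates.end());
    const std::size_t n = rates.size();
    s.min = rates.front();
    s.max = rates.back();
    s.mean = std::accumulate(rates.begin(), rates.end(), 0.0) / static_cast<double>(n);
    s.median = (n % 2 == 1) ? rates[n / 2]
                            : 0.5 * (rates[n / 2 - 1] + rates[n / 2]);
    return s;
}
```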

Page 12: Data Quality Update.

Data Quality Software

Data quality software:

• Standard Ntuples:

– Data Monitoring Information: NtpSRDataQuality class summarizes information contained in monitoring blocks (readout info, spill server info, light injection info, number of bad channels etc.).

– Bad Readout Channels: NtpSRDeadChip class stores information on each bad readout channel.

– Coil and HV Status: NtpSRDetStatus class extracts the detector status from the database.

• Offline Database:

– Coil and HV Status: DcsUser package contains CoilTools and HvStatusFinder classes that provide an interface with BfldDbiCoilState and DbuHvFromSingles database tables.

– Spill Server GPS Error: SpillTiming package contains SpillServerMonFinder class that provides an interface with SpillServerMon database table (this contains the GPS errors).

Page 13: Data Quality Update.

SR Ntuples: (I) NtpSRDataQuality

class NtpSRDataQuality : public TObject {
public:
  Int_t trigsource;         // trigger word
  Int_t trigtime;           // trigger time
  Int_t errorcode;          // snarl error code from RawDigitDataBlock
  Int_t cratemask;          // number of crates enabled
  Int_t pretrigdigits;      // number of pre-trigger digits
  Int_t posttrigdigits;     // number of post-trigger digits
  Int_t snarlmultiplicity;  // number of post-trigger digits in detector
  Int_t spillstatus;        // state of SpillServer
  Int_t spilltype;          // type of spill (real, fake etc...)
  Int_t spilltimeerror;     // GPS error from SpillServer
  Int_t litrigger;          // was there a nearby TMPT hit
  Int_t litime;             // time of the TMPT hit
  Int_t lisubtractedtime;   // (TMPT hit time) - (Trigger Time)
  Int_t lirelativetime;     // Abs(LiSubTime)
  Int_t licalibpoint;       // Current LI point
  Int_t licalibtype;        // type of LI
  Int_t libox;              // pulser box number
  Int_t liled;              // LED number
  Int_t lipulseheight;      // pulse height
  Int_t lipulsewidth;       // pulse width
  Int_t coldchips;          // number of cold chips
  Int_t hotchips;           // number of hot chips
  Int_t busychips;          // number of busy chips
  Int_t readouterrors;      // number of readout errors from RawDigits
  Int_t dataqualityword;    // overall quality (CandDataQuality::EDataQuality)

  ClassDef(NtpSRDataQuality,1)
};

Callout labels grouping the members: Trigger Info, Raw Digits, Spill Info, LI Info, Readout Info.

NtpSRDataQuality ntuple class stores information recorded in CandDataQualityHandles.

Page 14: Data Quality Update.

SR Ntuples: (II) NtpSRDeadChip

class NtpSRDeadChip : public TObject {
public:
  Int_t channelid;  // Channel ID
                    // FarDet: 108*crate+36*varc+6*vmm+3*vaadc+vachip
                    // NearDet: 2560*crate+128*master+16*minder+menu
  Int_t plane0;     // 1st associated plane
  Int_t plane1;     // 2nd associated plane
  Int_t shield;     // veto shield plane (Far Detector)
  Int_t errorcode;  // Error Code
  Int_t status;     // Status (CandDeadChip::EChipStatus)

  ClassDef(NtpSRDeadChip,1)
};

Callout labels grouping the members: Channel ID, Plane Number, Error Code, Problem.

NtpSRDeadChip class stores information recorded in CandDeadChipHandles.
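
The far detector channel ID formula quoted in the class comment can be inverted with integer division, as in the sketch below. The struct and function names are hypothetical, and the index ranges (3 VARCs per crate, 6 VMMs per VARC, 2 VA ADCs per VMM, 3 VA chips per ADC) are inferred from the multipliers rather than stated on the slide.

```cpp
// Hypothetical container for the electronics address of a far detector chip.
struct FarDetChannel {
    int crate;
    int varc;
    int vmm;
    int vaadc;
    int vachip;
};

// Channel ID as given in the NtpSRDeadChip comment:
// 108*crate + 36*varc + 6*vmm + 3*vaadc + vachip
int EncodeFarDetChannelId(const FarDetChannel& c) {
    return 108 * c.crate + 36 * c.varc + 6 * c.vmm + 3 * c.vaadc + c.vachip;
}

// Inverse mapping, valid if each index stays within the range implied
// by the multipliers (varc < 3, vmm < 6, vaadc < 2, vachip < 3).
FarDetChannel DecodeFarDetChannelId(int id) {
    FarDetChannel c;
    c.crate = id / 108;
    id %= 108;
    c.varc = id / 36;
    id %= 36;
    c.vmm = id / 6;
    id %= 6;
    c.vaadc = id / 3;
    c.vachip = id % 3;
    return c;
}
```

With 16 crates this gives IDs from 0 to 1727, so the ID can also serve as a compact array index over all chips.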

Page 15: Data Quality Update.

SR Ntuples: (III) NtpSRDetStatus

class NtpSRDetStatus : public TObject {
public:
  // coilstatus is deprecated, used to be filled from BFieldCoilCurrent
  Short_t coilstatus;     // magnetic coil status: -1(rev),0(off/unknown),1(forward)

  // dcscoilstatus is filled from BfldDbiCoilState, set to
  // ECoilStatus::kUnknown if table is unavailable for a given validity.
  Short_t dcscoilstatus;  // mag coil status: enum'ed as DcsUser::ECoilStatus
  Float_t coilcurrent1;   // coil current in supermodule 1
  Float_t coilcurrent2;   // coil current in supermodule 2

  // HV status using TP singles info
  Short_t dbuhvstatus;    // -1(unknown), 0(bad), 1(good)
  Int_t coldchips1;       // cold chips in supermodule 1
  Int_t coldchips2;       // cold chips in supermodule 2

  ClassDef(NtpSRDetStatus,3)
};

Callout labels grouping the members: COILSTATUS, HVSTATUS.

NtpSRDetStatus class stores information from DbuHvFromSingles and BfldDbiCoilState tables.

Page 16: Data Quality Update.

Detector Status (I)

Coil Current (Near/Far Detector)

– Near and Far Detector coil status recorded in BfldDbiCoilState database table.

– Coil status can be accessed using CoilTools class in DcsUser package.

Bool_t status = CoilTools::IsOK(vldc);

High Voltage Status (Far Detector Only)

– Far Detector high voltage status from TP singles recorded in DbuHvFromSingles database table.

– HV status can be accessed using HvStatusFinder class in DcsUser package.

HvStatus::HvStatus_t status = HvStatusFinder::Instance().GetHvStatus(vldc);

Spill Server Status

– Far Detector Spill Server monitoring information recorded in SpillServerMon database table.

– Spill time error can be accessed using SpillServerMonFinder class in SpillTiming package.

Int_t GpsError = SpillServerMonFinder::Instance().GetSpillTimeError(vldc);

(DbuHvFromSingles and SpillServerMon database tables are currently filled up to the end of December 2006; the ultimate aim is to automate the population of these tables.)

Page 17: Data Quality Update.

Detector Status (II)

COIL STATUS

typedef enum ECoilStatus {
  kOK         = 0x00,
  kBad        = 0x01,
  kReverse    = 0x02,
  kDegauss    = 0x04,
  kCalib      = 0x08,
  kBadCurrent = 0x10,
  kDataGap    = 0x20,
  kUnknown    = 0x80
} CoilStatus_t;

HV STATUS

typedef enum EHvStatus {
  kUnknown    = 0x00,
  kOK         = 0x01,
  kBad        = 0x02,
  kSM1Unknown = 0x10,
  kSM1OK      = 0x20,
  kSM1Bad     = 0x40,
  kSM2Unknown = 0x100,
  kSM2OK      = 0x200,
  kSM2Bad     = 0x400
} HvStatus_t;

– Coil Status and HV Status enumerations contained in the DcsUser package.

– Be sure to analyse only the data where CoilStatus=OK and HvStatus=OK!

– Can use either the database or ntuples to extract this information.
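
One detail worth noting when testing these words: the coil kOK value is 0x00, so it cannot be tested with a bitwise AND and has to be compared against the whole word, whereas the HV kOK value is a bit in its own right. The sketch below uses local copies of the two enumerations in hypothetical namespaces (`CoilStatusSketch`, `HvStatusSketch`); it also assumes, without confirmation from the slides, that the overall HV kOK bit is set whenever the detector as a whole is good.

```cpp
// Local copies of the enumerations, for illustration only.
namespace CoilStatusSketch {
enum ECoilStatus {
    kOK = 0x00, kBad = 0x01, kReverse = 0x02, kDegauss = 0x04,
    kCalib = 0x08, kBadCurrent = 0x10, kDataGap = 0x20, kUnknown = 0x80
};
}

namespace HvStatusSketch {
enum EHvStatus {
    kUnknown = 0x00, kOK = 0x01, kBad = 0x02,
    kSM1Unknown = 0x10, kSM1OK = 0x20, kSM1Bad = 0x40,
    kSM2Unknown = 0x100, kSM2OK = 0x200, kSM2Bad = 0x400
};
}

// Coil is good only if no problem bit is set (kOK is the all-zero word).
bool CoilIsGood(int status) {
    return status == CoilStatusSketch::kOK;
}

// Assumption: the overall kOK bit signals good HV for the whole detector.
bool HvIsGood(int status) {
    return (status & HvStatusSketch::kOK) != 0;
}

// Combined requirement: CoilStatus=OK and HvStatus=OK.
bool DetectorIsGood(int coilStatus, int hvStatus) {
    return CoilIsGood(coilStatus) && HvIsGood(hvStatus);
}
```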

Page 18: Data Quality Update.

Near Detector Data Quality

• Currently the code doesn’t do a great deal for Near Detector data, but could be generalized for use at the Near Detector (if desired).

• Although most data quality variables are extracted straight from the monitoring blocks, some must be calculated from the data.

e.g. “hot chips”, “cold chips”, “busy chips” …

Are there Near Detector definitions for any of these variables?

• What other variables characterize Near Detector data quality?

• Is there any treatment of High Voltage trips in Near Detector data?

• Could use some input from a Near Detector expert here!

Page 19: Data Quality Update.

Summary

• Substantial amount of data quality software now in place.

– Data quality information stored in cedar ntuples.

– Detector status tables now created and filled.

– Should be used in this Physics Analysis.

• Future Work:

– A few refinements will probably be needed as code is exercised.

– Need to develop and document run selection code.

– Need to automate much of the code.

– The data quality code works okay for near detector data but often doesn’t do much – could use input from a near detector expert.

Page 20: Data Quality Update.

Data Quality Software Model

[Diagram: data quality software model. Box labels: Raw Data; Reconstruction; CandMorgue; CandDataQualityHandle; CandDeadChipHandles; SR Ntuples; live time calculator; Fill DB; Database; HWDB; HV Status from Singles; Spill Server Monitoring; Coil Status; access tools (three times); automation. Legend: Implemented; Future Work.]