Data Quality Assurance Linda R. Coney UCR CM26 Mar 25, 2010.
Data Quality Philosophy
• Need offline (and online?) data quality checks to officially pass the data as good
 – Check a basic level of data quality *before* reconstruction and at each stage
• How automated should it be?
 – Can we do this in stages?
• Do we tag runs or event-by-event?
• Need a method to process the data to produce a DST (or a MiceEvent list, or a ROOT file, or something else) that is the official, approved, good data set for analysis
• Data production is iterative:
 – Software version
 – Cabling configuration
 – Geometry
 – Beamline settings / hardware status
Philosophy Continued
• Multi-staged data quality determination:
• Do a RAW data consistency check at the beginning
• Get the mapping, then run the consistency check again
• Check the geometry/alignment
• Check the calibration, then check consistency again
• Finally, reconstruct and create the final product
[Flow diagram: Raw Data → 1st Consistency Check → Alignment → Calibration → 2nd Consistency Check → Reconstruction → DST, with the Map, Geometry, and Calibration Data as inputs to the corresponding stages]
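The staged pass/fail chain above could be sketched as a sequence of named checks, where each stage must succeed before the next one runs and the first failure identifies the stage that rejected the data. This is a minimal illustration, not the G4MICE implementation; the `Stage` and `RunPipeline` names are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// One named quality-check stage (hypothetical sketch).
struct Stage {
    std::string name;
    std::function<bool()> check;  // returns true if the data passes
};

// Run the stages in order: raw consistency, mapping consistency,
// alignment, calibration consistency, reconstruction. Returns "ok"
// if every stage passes, otherwise the name of the first failing stage.
std::string RunPipeline(const std::vector<Stage>& stages) {
    for (const auto& s : stages)
        if (!s.check()) return s.name;
    return "ok";
}
```

A run would be tagged good only if `RunPipeline` returns `"ok"`; otherwise the failing stage name gives the reason for rejection.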
Data Access
• Need a repository (or immediate access) to recent data for expert checking
 – For debugging and commissioning
 – For understanding beam dynamics
 – Technical detail: this has to happen on the second RAID computer, since miceraid1 is busy taking data
 – The new ssh bastion gives this access, potentially also with a Fed ID
• People should access data on the GRID for long-term, complex analysis aimed at publication
Data Quality: Now
• Need to link the appropriate calibration files with data runs
 – The database will solve this (we don't have it yet)
 – Short-term solution: add a column with the calibration filename to the Run Summary spreadsheet
• Chris has code to dump the world into a single file; write a file for Step1 Date1, Step1 Date2, etc.?
• Need more miceModules to match different geometries; Step I does not cover them all
• New G4MICE application: DataQualityCheck
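The short-term run-to-calibration lookup described above could work by exporting the spreadsheet's run-number and calibration-filename columns as CSV and parsing them into a map. A minimal sketch, assuming a hypothetical `run,calibration_file` line format; the function name is illustrative, not part of G4MICE.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Parse "run,calibration_file" lines (e.g. exported from the Run
// Summary spreadsheet) into a run-number -> filename map.
std::map<int, std::string> ParseRunSummary(const std::string& csv) {
    std::map<int, std::string> calib;
    std::istringstream in(csv);
    std::string line;
    while (std::getline(in, line)) {
        const auto comma = line.find(',');
        if (comma == std::string::npos) continue;  // skip malformed lines
        calib[std::stoi(line.substr(0, comma))] = line.substr(comma + 1);
    }
    return calib;
}
```

Once the database exists, the same lookup interface could be backed by a database query instead of the spreadsheet export.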
Data Quality: G4MICE Application
• Initial version of the application created by Mark Rayner
 – Generic code which can be used by anyone who wants to add a feature to the DataQualityCheck application
• Convention for writing code (for use by experts) established:
 – Modular
 – Control of the overall DataQualityCheck application by a single person
 – Detector experts write code to create plots for their own detector, then request its addition to the overall application
 – Single canvas per detector or theme
• Documentation begun:
 – What it does
 – Histograms produced
 – Cuts made on the data
• Website: http://www.physics.ox.ac.uk/users/raynerm/OnOfflineApps.html
G4MICE’s DataQualityCheck Application
Code Structure for Standardized Detector Plots (M. Rayner)
• DataQualityPlot (abstract base class)
 – public: virtual Process() = 0, virtual Plot() = 0, virtual Write() = 0
 – private: one canvas, output file, &MICEEvent, &MICERun
• TemplatePlotClass (concrete subclass)
 – public: Process(), Plot(), Write()
 – private: histograms, graphs, …
• TofMonitor: written by TOF experts
• CkovMonitor: written by CKOV experts
• etc.
• Method roles:
 – Process(): add an event’s data to the histograms, etc.
 – Plot(): paint fresh histograms to the screen
 – Write(): write the final plots to the output file
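The class structure above can be sketched in C++ as follows. This is an illustrative reconstruction, not the actual G4MICE source: the `MICEEvent`/`MICERun` structs are minimal stand-ins for the real framework types, and a `std::map` stands in for a ROOT histogram.

```cpp
#include <cassert>
#include <map>

// Minimal stand-ins for the real G4MICE types (hypothetical).
struct MICEEvent { double tofTimeNs = 0.0; };
struct MICERun   { int runNumber = 0; };

// Abstract base class following the Process/Plot/Write interface
// from the slide.
class DataQualityPlot {
public:
    DataQualityPlot(MICEEvent& ev, MICERun& run) : fEvent(ev), fRun(run) {}
    virtual ~DataQualityPlot() = default;
    virtual void Process() = 0;  // add one event's data to the histograms
    virtual void Plot()    = 0;  // paint fresh histograms to the screen
    virtual void Write()   = 0;  // write the final plots to the output file
protected:
    MICEEvent& fEvent;
    MICERun&   fRun;
};

// Example detector monitor, as a TOF expert might write it.
class TofMonitor : public DataQualityPlot {
public:
    using DataQualityPlot::DataQualityPlot;
    void Process() override { ++fHist[static_cast<int>(fEvent.tofTimeNs)]; }
    void Plot()    override { /* draw fHist on this monitor's canvas */ }
    void Write()   override { /* serialise fHist to the output file */ }
    int Entries() const {
        int n = 0;
        for (const auto& bin : fHist) n += bin.second;
        return n;
    }
private:
    std::map<int, int> fHist;  // bin -> count, stand-in for a histogram
};
```

The single person controlling the overall application would then hold a list of `DataQualityPlot*` and call `Process()` per event, with `Plot()` and `Write()` at the end, without needing to know any detector's internals.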
Data Quality: Future/Long Term
• Database:
 – DAQ (Vassil): spill gate, run number, start/stop date/time, configuration of the DAQ (LDCs etc.), trigger
 – C&M (Pierrick and James): EPICS interface writes magnet currents and …?
 – G4MICE (??): EPICS interface writes magnet currents and …?
• People should access data on the GRID
• Need to know what we should be looking at…
• How do we automatically flag bad runs?
• Do we have EPICS tell the DAQ to stop a run if a magnet has been off for a certain time or a PMT has died?
Input?
Conclusions
• This is something we must think about, plan a strategy for, and implement as soon as possible
 – We must be able to reproduce analysis results
• Progress has been made
 – Thanks to Mark for the code template
 – The Imperial mini-workshop was very useful
• This is a major project with a long way to go