Feature Extractor: overview and history of recent changes
Dmitry Chirkin, UW Madison
Goal:
Given an ATWD or FADC waveform, determine arrival times of some or all photons which contributed:
hit series (DisableHitSeries)
reco pulse series: timestamp, charge
Simple feature extraction: FastFirstPeak options
bit[0]=0
bit[0]=1
Depending on the settings (see documentation), a number of choices are possible:
1. First peak above the baseline, total charge of the entire waveform
2. Largest pulse in a waveform, charge under that pulse
A new option (since December 2008, “FFP=7”) falls back to finding the largest pulse if the waveform stays below the threshold everywhere. The same fallback applies when a multi-pulse extraction method does not find any pulses.
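The two simple choices and the fallback can be illustrated with a short sketch. This is not the FeatureExtractor code: the function names, the local-maximum definition of a "peak", and the "contiguous bins above baseline" definition of a pulse are all assumptions made for illustration, and times are returned as bin indices.

```python
def total_charge(wf, baseline):
    """Baseline-subtracted sum over the entire waveform."""
    return sum(v - baseline for v in wf)

def first_peak(wf, baseline, threshold):
    """Index of the first local maximum above baseline + threshold, or None."""
    for i in range(len(wf)):
        left = wf[i - 1] if i > 0 else float("-inf")
        right = wf[i + 1] if i + 1 < len(wf) else float("-inf")
        if wf[i] - baseline > threshold and wf[i] >= left and wf[i] > right:
            return i
    return None

def largest_pulse(wf, baseline):
    """Peak index and charge of the contiguous above-baseline run around the maximum."""
    peak = max(range(len(wf)), key=lambda i: wf[i])
    lo = peak
    while lo > 0 and wf[lo - 1] > baseline:
        lo -= 1
    hi = peak
    while hi + 1 < len(wf) and wf[hi + 1] > baseline:
        hi += 1
    return peak, sum(wf[i] - baseline for i in range(lo, hi + 1))

def simple_extract(wf, baseline, threshold, fallback=True):
    """First peak with total charge; optionally fall back to the largest pulse."""
    i = first_peak(wf, baseline, threshold)
    if i is not None:
        return i, total_charge(wf, baseline)
    if fallback:  # hypothetical stand-in for the FFP=7 behaviour
        return largest_pulse(wf, baseline)
    return None

wf = [10, 10, 14, 22, 15, 10, 10, 30, 12, 10]
print(simple_extract(wf, baseline=10, threshold=5))
print(simple_extract(wf, baseline=10, threshold=50))
```

With the fallback enabled, at least one pulse is always returned, which matches the "always keeps at least one pulse" description of FFP=7.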
The baseline is found by fitting a parabola around the most repeated values of the waveform. There are two fallback methods, using
1. the average of the first 3 bin values
2. the mean of the lowest ¼ of bin values
ROOT fits (“MaxNumHits” option)
Variable-width pulses can be fit iteratively, from the largest pulse to progressively smaller ones
Bayesian waveform unfolding
• fast waveform feature extraction: 2-3 ms per waveform (cf. 30 s for the ROOT fit method)
invert against the tabulated smearing function
• need to emphasize SPE signal while controlling oscillations of the solution due to noise
Bayesian or regularized unfolding does just that
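The slides do not spell out the unfolding algorithm, so the sketch below uses a generic iterative Bayesian (D'Agostini / Richardson-Lucy style) update as an illustration of inverting a waveform against a tabulated smearing function. The names `smearing_matrix` and `unfold`, the template values, and the iteration count are assumptions, not the FeatureExtractor implementation.

```python
def smearing_matrix(template, n_bins):
    """R[i][j]: signal in observed bin i from unit charge arriving in bin j."""
    R = [[0.0] * n_bins for _ in range(n_bins)]
    for j in range(n_bins):
        for k, t in enumerate(template):
            if j + k < n_bins:
                R[j + k][j] = t
    return R

def unfold(observed, template, n_iter=200):
    """Iterative Bayesian unfolding; the estimate stays non-negative by construction."""
    n = len(observed)
    R = smearing_matrix(template, n)
    eff = [sum(R[i][j] for i in range(n)) for j in range(n)]
    u = [max(sum(observed), 0.0) / n] * n  # flat prior
    for _ in range(n_iter):
        folded = [sum(R[i][j] * u[j] for j in range(n)) for i in range(n)]
        new_u = []
        for j in range(n):
            s = sum(R[i][j] * observed[i] / folded[i]
                    for i in range(n) if folded[i] > 0)
            new_u.append(u[j] * s / eff[j] if eff[j] > 0 else 0.0)
        u = new_u
    return u

template = [0.2, 0.5, 0.3]  # hypothetical tabulated SPE response
truth = [0, 0, 4, 0, 0, 0, 2, 0, 0, 0, 0, 0]
R = smearing_matrix(template, len(truth))
observed = [sum(R[i][j] * truth[j] for j in range(len(truth)))
            for i in range(len(truth))]
print([round(x, 2) for x in unfold(observed, template)])
```

Stopping after a finite number of iterations acts as the regularization here: more iterations sharpen the pulses but also amplify noise-driven oscillations in real data.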
Bayesian waveform unfolding
If a fitted pulse does not start on the boundary, then it is approximated by a superposition of 2 pulses. The weighted average of these pulses gives the estimate of the leading edge.
Simple and complicated waveforms are reconstructed with the same amount of effort
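The leading-edge estimate from two adjacent unfolded pulses can be written as a one-line average. The sketch assumes the weights are the pulse charges, which the slide does not state explicitly; `leading_edge` is a hypothetical helper name.

```python
def leading_edge(t1, q1, t2, q2):
    """Charge-weighted average time of two adjacent unfolded pulses."""
    total = q1 + q2
    if total <= 0:
        raise ValueError("total charge must be positive")
    return (q1 * t1 + q2 * t2) / total

# Two pulses on neighbouring bin boundaries 3.5 ns apart (times in ns)
print(leading_edge(100.0, 3.0, 103.5, 1.0))
```

The estimate always lies between the two bin boundaries and moves toward whichever pulse carries more charge.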
Precision of feature extraction
0.5 ns difference between 2 multi-pulse methods
FADC-ATWD comparison: 0.65 ns
Bin widths:
ATWD: 3.5 ns
FADC: 25 ns
ADCThreshold
ADCThreshold has to be carefully selected based on a choice of the (Modified) Old/New Threshold or some other discriminator voltage estimate (usually as provided by the DOMFunctions)
Very important for the multi-peak unfolding method.
Quite robust for simple single-peak extraction methods, especially with FFP=7 (fallback method).
Plots: old SPE, modified old SPE, new SPE
Documentation
1. Doxygen, for a description of option use
2. http://icecube.wisc.edu/~dima/work/LBNL/reader/docs/fe/fe.pdf: description of the algorithms employed by the FE
3. http://icecube.wisc.edu/~dima/work/WISC/fe/: recommendations on ADCThreshold settings
History of recent changes (svn log)
r51128 | 2008-12-08 | 3 lines | added FastFirstPeak=7 (which always keeps at least one pulse)
r52436 | 2009-02-10 | 2 lines | getting impedance from dom calibration
r53943 | 2009-04-03 | 1 line | use SPEPMTThreshold function to calculate thresholds if the option UseNewDiscThreshold is not set.
r54670 | 2009-04-27 | 2 lines | width set correctly for FADC (this was never used except in Patrick's width vs. charge plots)
r55000 | 2009-05-13 | 3 lines | fixed spurious pulse problem
r55010 | 2009-05-13 | 2 lines | increased FADC cushion to 10 FADC bins (250 ns)
r55060 | 2009-05-14 | 2 lines | completely rely on DOMCalibrator start times, remove the PMTTransit option
r56879 | 2009-07-24 | 2 lines | verify the FFP and Unfolded matched pulse is the same
r56996 | 2009-07-30 | 2 lines | reduced the FADC/ATWD cushion from 125 to 50 ns
r57936 | 2009-08-31 | 2 lines | added "ExclusionSize" parameter (default=2 *25 [ns])
r58032 | 2009-09-03 | 3 lines | added option "TinyThreshold"
r58145 | 2009-09-09 | 2 lines | changed fExclusionSize from 2.0 to 5.0, to match the value used in the last release
r58219 | 2009-09-11 | 2 lines | added back the PMTTransit option with a new default, -1
FADC issues, new features, other problems
1. Various timing offset issues: relies on DOMCalibrator in the new version, PMTTransit option removed, then put back for backwards compatibility
2. Too many low-P.E. pulses? Be sure to select the value of the ADCThreshold correctly for every new discriminator threshold DOMFunction, and possibly for new domcal revisions. Using a more precise SPE shape parameterization improves this.
3. Droop correction constants now use better default values when they are not in the database (this affected both the FeatureExtractor and the DomCalibrator)
4. FADC missing pulses, spurious pulses, etc., hopefully repaired.
5. New option ExclusionSize configures the size of the padding space at the end of the ATWD window, within which the FADC pulses are not kept
6. New option TinyThreshold removes tiny pulses with amplitudes less than a configured fraction of the amplitude of the largest pulse (5% default).
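The two new options in items 5 and 6 amount to simple filters on the extracted pulse list. The sketch below is an illustration only: pulses are represented as hypothetical `(time_ns, amplitude)` tuples, the function names are invented, and the assumption that FADC pulses are kept only after the end of the ATWD window plus the padding is one reading of the option description.

```python
def drop_tiny(pulses, tiny_threshold=0.05):
    """Remove pulses below a fraction of the largest pulse amplitude."""
    if not pulses:
        return []
    largest = max(a for _, a in pulses)
    return [(t, a) for t, a in pulses if a >= tiny_threshold * largest]

def drop_fadc_near_atwd(fadc_pulses, atwd_end, exclusion_size):
    """Keep only FADC pulses later than the ATWD window end plus the padding."""
    return [(t, a) for t, a in fadc_pulses if t >= atwd_end + exclusion_size]

pulses = [(10.0, 100.0), (20.0, 4.0), (30.0, 6.0)]
print(drop_tiny(pulses))

fadc = [(460.0, 1.0), (500.0, 2.0), (600.0, 1.5)]
print(drop_fadc_near_atwd(fadc, atwd_end=450.0, exclusion_size=50.0))
```

The exclusion padding is passed explicitly here because the svn log lists more than one default value over time.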
Selecting the right ExclusionSize
Plot by Martin W.
Charge mismatch and TinyThreshold
Plot by M. Merck
Report problems
What to do if you see a problem, e.g., a pulse that should not be there:
• Identify at least one DOMLaunch that demonstrates the problem
• Verify by plotting the raw and calibrated waveforms that the pulses found are not consistent with each
• Send me a description of the problem and an example event (or its ID)
Problem reports like:
• “I think I see too many large charges” or
• “Is the timing offset between ATWD and FADC wrong?”
are not useful!
This has changed a lot in recent months, though: I’ve had a lot of useful feedback from Patrick B., Juanan, Martin M., Sebastian, Marius, Martin W., Tim Ruhe, Henrik, and Andreas. Thus the rapid progress.
PulseExtractor
• Option list is much simplified, with only 2 options: the input calibrated waveform and the output recopulseseries
• Only one type of waveform feature-extracted per module instance: either ATWD or FADC
• No attempt to improve the baseline is made
• Only the Bayesian unfolding method is implemented. This design decision is a result of the discussion in the UW Madison group (chosen over the multiple pulse over threshold algorithm). It is best suited for using the resulting pulseseries for reconstruction, but not for counting the number of pulses
• 4 templated waveforms are used: ATWD/FADC, old/new DOMs. Chris Wendt provided the 4 templates.