An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto...
![Page 1: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/1.jpg)
An update on the Statistical Toolkit
Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo
July 19th, 2005
![Page 2: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/2.jpg)
G.A.P Cirrone, S. Donadio, S. Guatelli, A. Mantero, B. Mascialino, S. Parlati, M.G. Pia, A. Pfeiffer, A. Ribon, P. Viarengo
“A Goodness-of-Fit Statistical Toolkit”, IEEE Transactions on Nuclear Science (2004), 51 (5): 2056-2063.
Release StatisticsTesting-V1-01-00 downloadable from the web: http://www.ge.infn.it/geant4/analysis/HEPstatistics/
![Page 3: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/3.jpg)
Tests based on maximum distance (unbinned distributions)

[Figure: empirical distribution functions compared with the original distributions; vertical axis from 0 to 1]

SUPREMUM STATISTICS

• Kolmogorov-Smirnov test
$$D_{mn} = \sup_x |F_n(x) - G_m(x)|$$
• Goodman approximation of KS test
$$\chi^2 = 4 D_{mn}^2 \frac{mn}{m+n}$$
• Kuiper test
$$D^* = \max_x [F_0(x) - F_T(x)] + \max_x [F_T(x) - F_0(x)]$$
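The three supremum statistics can be illustrated with a short, standard-library Python sketch. It is not the toolkit's implementation: the function names `ecdf` and `supremum_statistics` are made up for this example, and the empirical distribution functions are assumed to be right-continuous step functions evaluated on the pooled sample.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function x -> fraction of sample values <= x."""
    data = sorted(sample)
    n = len(data)
    return lambda x: bisect_right(data, x) / n


def supremum_statistics(sample_f, sample_g):
    """Two-sample KS distance, Kuiper distance and Goodman's chi-square."""
    f, g = ecdf(sample_f), ecdf(sample_g)
    n, m = len(sample_f), len(sample_g)
    # The difference of two step functions only changes at observed points,
    # so scanning the pooled sample is enough to find its extrema.
    diffs = [f(x) - g(x) for x in sorted(set(sample_f) | set(sample_g))]
    d_plus = max(max(diffs), 0.0)    # largest excess of F over G
    d_minus = max(-min(diffs), 0.0)  # largest excess of G over F
    ks = max(d_plus, d_minus)        # Kolmogorov-Smirnov distance
    kuiper = d_plus + d_minus        # Kuiper distance
    goodman_chi2 = 4.0 * ks**2 * (m * n) / (m + n)
    return ks, kuiper, goodman_chi2


print(supremum_statistics([1, 2, 3, 4], [3, 4, 5, 6]))
```

The Kuiper distance adds the two one-sided deviations, which is why it stays sensitive when the discrepancy sits in the tails or wraps around for cyclic observations.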
![Page 4: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/4.jpg)
• Fisz-Cramer-von Mises test
• Anderson-Darling test
$$t = \frac{n_1 n_2}{(n_1 + n_2)^2} \sum_i [F_1(x_i) - F_2(x_i)]^2$$
$$A^2 = \frac{n-1}{n^2 (k-1)} \sum_{i=1}^{k} \frac{1}{n_i} \sum_{j} h_j \frac{(n F_{ij} - n_i H_j)^2}{H_j (n - H_j) - n h_j / 4}$$
Tests containing a weighting function (binned/unbinned distributions)

[Figure: empirical distribution functions compared with the original distributions; vertical axis from 0 to 1]

QUADRATIC STATISTICS + WEIGHTING FUNCTION

Sum/integral of all the distances
![Page 5: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/5.jpg)
1. Status of the existing tests
2. New GoF Tests added
3. Description of the power study – phase I
4. Description of the power study – phase II
5. A concrete example: IMRT
![Page 6: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/6.jpg)
1. Status of the existing tests: Fisz-Cramer-von Mises
Conover (book) + Darling (1957):
- The two-sample Cramer-von Mises test (Fisz test) has the same asymptotic distribution as the one-sample test (Cramer-von Mises test).
- The equation of the asymptotic distribution is available in the paper by Anderson and Darling (1952).
binned/unbinned distributions
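As an illustration, the two-sample Fisz-Cramer-von Mises distance can be sketched in a few lines of Python, assuming the common form t = n1 n2 / (n1 + n2)^2 times the sum, over all pooled observations, of the squared difference between the two empirical distribution functions. The function name is hypothetical, and the sketch covers the unbinned case only.

```python
from bisect import bisect_right


def fisz_cramer_von_mises(sample1, sample2):
    """Two-sample Cramer-von Mises (Fisz) distance for unbinned data."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    total = 0.0
    for x in s1 + s2:  # every pooled observation contributes one term
        f1 = bisect_right(s1, x) / n1  # EDF of sample 1 at x
        f2 = bisect_right(s2, x) / n2  # EDF of sample 2 at x
        total += (f1 - f2) ** 2
    return n1 * n2 / (n1 + n2) ** 2 * total


print(fisz_cramer_von_mises([1, 2, 3, 4], [3, 4, 5, 6]))
```

Unlike the supremum statistics, every pooled point contributes, so many small deviations can add up to a large distance.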
![Page 7: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/7.jpg)
1. Status of the existing tests: two-sample Anderson-Darling
Scholz and Stephens (1987):
- The two-sample Anderson-Darling test can be written in different ways:
  - exact formulation (for unbinned distributions only)
  - approximate formulation (for binned/unbinned distributions)
- The approximate distance is already available in the toolkit.
- The asymptotic distributions of both the exact and the approximate formulations are available in the paper.
- The two-sample Anderson-Darling test has the same asymptotic distribution as the one-sample test.
binned/unbinned distributions
![Page 8: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/8.jpg)
1. Status of the existing tests: Tiku test
Tiku (1965):
- Cramer-von Mises test in a chi-squared approximation.
- The Cramer-von Mises test statistic is converted into a central chi-square, bypassing the problem of integrating the weighting function.
binned/unbinned distributions
![Page 9: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/9.jpg)
2. New GoF Tests: weighted Kolmogorov-Smirnov
Canner (1975) & Buning (2001):
- Canner modified the KS test by introducing a weighting function identical to the one used in the AD test.
- Buning modified the KS test by introducing a weighting function similar to the one used in the AD test.
- The equation of the asymptotic distribution is not available in Canner’s paper, only a few critical values for some sample sizes (n=m).
$$KSW_{mn} = \sup_x |F_n(x) - G_m(x)| \, \psi(x)$$
unbinned distributions
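A weighted KS distance of this kind might look as follows in Python. The weight used here, 1/sqrt(H(1-H)) with H the empirical distribution function of the pooled sample, is an assumed Anderson-Darling-style choice made for illustration; the slides do not spell out the exact weighting functions of Canner and Buning, and the function name is hypothetical.

```python
from bisect import bisect_right
from math import sqrt


def weighted_ks(sample_f, sample_g):
    """KS distance with an assumed AD-style weight 1/sqrt(H(1-H))."""
    sf, sg = sorted(sample_f), sorted(sample_g)
    n, m = len(sf), len(sg)
    pooled = sorted(sf + sg)
    best = 0.0
    for x in pooled:
        h = bisect_right(pooled, x) / (n + m)  # pooled EDF H(x)
        if h >= 1.0:
            continue  # the weight is undefined where H(x) = 1
        diff = abs(bisect_right(sf, x) / n - bisect_right(sg, x) / m)
        best = max(best, diff / sqrt(h * (1.0 - h)))
    return best


print(weighted_ks([1, 2, 3, 4], [3, 4, 5, 6]))
```

Because the weight grows as H approaches 0 or 1, deviations in the tails count more than deviations near the median, which is the motivation for weighting a supremum statistic.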
![Page 10: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/10.jpg)
2. New GoF Tests: weighted Cramer-von Mises
Buning (2001):
- Buning modified the CVM test by introducing a weighting function similar to the one used in the AD test.
- The equation of the asymptotic distribution is not available in the paper, only critical values for many sample sizes.
$$CVMW_{mn} = \sum [F_n - G_m]^2 \, \psi(x)$$
unbinned distributions
![Page 11: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/11.jpg)
2. New GoF Tests: Watson
Watson (1975):
- Derived from the Cramer-von Mises test statistic.
- Like the Kuiper test, it can be applied in case of cyclic observations.
- The equation of the asymptotic distribution is not available in the paper, only critical values for many sample sizes.
$$W^2 = \sum \left\{ [F_n(x) - G_m(x)] - \overline{[F_n(x) - G_m(x)]} \right\}^2$$
![Page 12: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/12.jpg)
Other news
• New user layer dealing with ROOT histograms (Andreas is working on that).
• Paper to IEEE-TNS.
• Next release of the GoF Statistical Toolkit scheduled for this summer.
![Page 13: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/13.jpg)
Future developments
• Fix some design-related problems.
• New design (add uncertainties).
• Extend the toolkit to the comparison of:
  – Experimental data versus theoretical functions,
  – k-sample problem,
  – Multi-dimensional one-, two-, k-sample problem.
![Page 14: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/14.jpg)
What is the recipe to select the most suitable Goodness-of-Fit test among the ones available in the GoF Statistical Toolkit?
![Page 15: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/15.jpg)
3. Description of the power study – phase I
• Samples: SAMPLE 1 drawn from PARENT 1, SAMPLE 2 drawn from PARENT 2
• Test: EDF statistics (unbinned data): KS, KSW, KSA, KUIPER, CVM, ADA
• Monte Carlo replications: k=1000
• “Empirical” power evaluation
• Results: location-scale alternative, general alternative
• Comparison with published results
• Real data examples
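The Monte Carlo scheme of the power study can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the plain KS distance, estimates the critical value from simulated pairs drawn from the same parent, and all names (`ks_distance`, `empirical_power`, the parent callables) are hypothetical rather than taken from the study.

```python
import random
from bisect import bisect_right


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov distance."""
    sa, sb = sorted(a), sorted(b)
    return max(abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
               for x in sa + sb)


def empirical_power(parent1, parent2, n, m, k=1000, alpha=0.05, seed=0):
    """Fraction of k replications in which the test rejects H0."""
    rng = random.Random(seed)

    def draw(parent, size):
        return [parent(rng) for _ in range(size)]

    # Null distribution of the statistic: both samples come from parent 1.
    null = sorted(ks_distance(draw(parent1, n), draw(parent1, m))
                  for _ in range(k))
    critical = null[min(k - 1, round((1.0 - alpha) * k))]
    # Power: rejection rate when sample 2 really comes from parent 2.
    rejections = sum(ks_distance(draw(parent1, n), draw(parent2, m)) > critical
                     for _ in range(k))
    return rejections / k


# Location alternative: unit Gaussian versus the same Gaussian shifted by 1.
gaussian = lambda rng: rng.gauss(0.0, 1.0)
shifted = lambda rng: rng.gauss(1.0, 1.0)
print(empirical_power(gaussian, shifted, n=30, m=30))
```

Swapping `ks_distance` for another statistic gives a like-for-like power comparison, since every test is then calibrated on the same simulated null.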
![Page 16: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/16.jpg)
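Parent distributions

Uniform: $f_1(x) = 1$
Gaussian: $f_2(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
Double exponential: $f_3(x) = \frac{1}{2} e^{-|x|}$
Cauchy: $f_4(x) = \frac{1}{\pi} \frac{1}{1 + x^2}$
Exponential: $f_5(x) = e^{-x}$
Contaminated Normal Distribution 1: $f_6(x) = 0.9\,\mathcal{N}(0,1) + 0.1\,\mathcal{N}(0,9)$
Contaminated Normal Distribution 2: $f_7(x) = 0.5\,\mathcal{N}(1,4) + 0.5\,\mathcal{N}(-1,1)$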
![Page 17: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/17.jpg)
Skewness and tailweight

$$S = \frac{x_{0.975} - x_{0.5}}{x_{0.5} - x_{0.025}} \qquad T = \frac{x_{0.975} - x_{0.025}}{x_{0.875} - x_{0.125}}$$

| Parent | S (skewness) | T (tailweight) |
|---|---|---|
| f1(x) Uniform | 1 | 1.267 |
| f2(x) Gaussian | 1 | 1.704 |
| f3(x) Double exponential | 1 | 2.161 |
| f4(x) Cauchy | 1 | 5.263 |
| f5(x) Exponential | 4.486 | 1.883 |
| f6(x) Contaminated normal 1 | 1 | 1.991 |
| f7(x) Contaminated normal 2 | 1.769 | 1.693 |
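The quantile-based definitions can be checked numerically. The sketch below evaluates S as (x_0.975 - x_0.5) / (x_0.5 - x_0.025) and T as (x_0.975 - x_0.025) / (x_0.875 - x_0.125) for four of the parents whose quantile functions have a simple closed form; the uniform parent is assumed to live on [0, 1], and the helper name `skewness_tailweight` is made up for this example.

```python
from math import log, pi, tan
from statistics import NormalDist


def skewness_tailweight(quantile):
    """S and T computed from a quantile function (inverse CDF)."""
    q = quantile
    s = (q(0.975) - q(0.5)) / (q(0.5) - q(0.025))
    t = (q(0.975) - q(0.025)) / (q(0.875) - q(0.125))
    return s, t


# Closed-form quantile functions of four of the parent distributions.
parents = {
    "Uniform": lambda p: p,
    "Gaussian": NormalDist().inv_cdf,
    "Cauchy": lambda p: tan(pi * (p - 0.5)),
    "Exponential": lambda p: -log(1.0 - p),
}

for name, quantile in parents.items():
    s, t = skewness_tailweight(quantile)
    print(f"{name:12s} S = {s:.3f}  T = {t:.3f}")
```

For measured data the same function can be fed with sample quantiles instead of a closed-form inverse CDF.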
![Page 18: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/18.jpg)
Comparative evaluation of tests

| Skewness \ Tailweight | Short (T < 1.5) | Medium (1.5 < T < 2) | Long (T > 2) |
|---|---|---|---|
| S ~ 1 | KS | KS - CVM | CVM - AD |
| S > 1.5 | KS - AD | AD | CVM - AD |

$\chi^2$ < Supremum statistics tests < Tests containing a weight function
![Page 19: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/19.jpg)
4. Description of the power study – phase II
• Samples: SAMPLE 1 drawn from PARENT 1, SAMPLE 2 drawn from PARENT 2
• Test: binned/unbinned data: CHI2, KS, KSW, KSA, KUIPER, CVM, CVMA, ADA, AD
• Monte Carlo replications: k=1000
• “Empirical” + “MC” power evaluation
• Results: location-scale alternative, general alternative
• Linear power
• Correlation between tests
• Isodynes
![Page 20: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/20.jpg)
5. A concrete example: IMRT
Which is the most suitable goodness-of-fit test?
EXAMPLE: unbinned data
Lateral profiles (Michela Piergentili)
$$H_0: F_n = G_m \qquad H_1: F_n \neq G_m$$
![Page 21: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/21.jpg)
GoF test selection
1. Classify the type of the distributions in terms of skewness S and tailweight T

$$S = \frac{x_{0.975} - x_{0.5}}{x_{0.5} - x_{0.025}} \qquad T = \frac{x_{0.975} - x_{0.025}}{x_{0.875} - x_{0.125}}$$

Skewness:
S = 1: symmetric distribution
S < 1: left skewed distribution
S > 1: right skewed distribution

Tailweight:
T is always greater than 1; the longer the tail, the greater the value of T.
![Page 22: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/22.jpg)
Comparative evaluation of tests power
2. Choose the most appropriate test for the classified type of distribution

| Skewness \ Tailweight | Short (T < 1.3) | Medium (1.3 < T < 2) | Long (T > 2) |
|---|---|---|---|
| S ~ 1 | KS | KS - CVM | CVM - AD |
| S > 1.5 | KS - AD | AD | CVM - AD |
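The two-step recipe can be expressed as a small lookup. The sketch below encodes the table of this slide (tailweight boundaries 1.3 and 2), but two choices are assumptions of the example and not stated on the slides: a single hypothetical cutoff at S = 1.5 separates the "S ~ 1" row from the "S > 1.5" row, and left-skewed data (S < 1) is handled by mirroring.

```python
def recommend_tests(s, t, skew_cutoff=1.5, short_tail=1.3, long_tail=2.0):
    """Look up the suggested tests for skewness s and tailweight t."""
    # Left-skewed data (S < 1) is mirrored so that only the size of the
    # asymmetry matters; this symmetrisation is an assumption of the sketch.
    skewed = max(s, 1.0 / s) > skew_cutoff
    if t < short_tail:
        return ["KS", "AD"] if skewed else ["KS"]
    if t < long_tail:
        return ["AD"] if skewed else ["KS", "CVM"]
    return ["CVM", "AD"]


print(recommend_tests(1.0, 1.704))    # Gaussian-like: symmetric, medium tail
print(recommend_tests(4.486, 1.883))  # exponential-like: skewed, medium tail
```

Values of S close to the cutoff are a judgment call, so the returned list is best read as a starting point rather than a rule.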
![Page 23: An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo July 19 th, 2005.](https://reader036.fdocuments.us/reader036/viewer/2022062518/56649f3b5503460f94c5a271/html5/thumbnails/23.jpg)
GoF test: test selection & results
RESULTS: unbinned data
X-variable: Ŝ=1.53 T̂=1.36
Y-variable: Ŝ=1.27 T̂=1.34
Moderately skewed, medium tail
KOLMOGOROV-SMIRNOV TEST
D=0.27, p>0.05