dffits
Uploaded by tangguh-wicaksono
8/11/2019 dffits
Residuals
Remember that the predicted values are
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \cdots + \hat{\beta}_m x_{mi}$, $i = 1, \ldots, n$.
The residuals are $e_1, \ldots, e_n$, where
$e_i = y_i - \hat{y}_i$, $i = 1, \ldots, n$.
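As a minimal sketch of these two definitions, the fitted values and residuals can be computed with NumPy. The toy data and the variable names (`x1`, `x2`, `y`, `X`) are hypothetical and serve only as illustration.

```python
import numpy as np

# Hypothetical toy data: n = 6 observations, m = 2 independent variables.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least squares estimates of beta_0, beta_1, beta_2.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ beta_hat  # predicted values
e = y - y_hat         # residuals e_i = y_i - y_hat_i
```

Because the model includes an intercept, the least squares residuals sum to zero up to rounding error, which is a quick sanity check on the fit.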
Plots to consider:
1) Construct a histogram, boxplot or normal probability plot of the residuals to check the normality assumption.
2) Plot the residuals against the predicted values. This is a good plot for checking the equal variances assumption.
3) If the independent variables are not highly related, plot the residuals against each independent variable.
4) If the data are collected over time, plot the residuals against time. If time does not affect the response, this plot should show no pattern. The Durbin-Watson test can be used to test for a time effect. The Durbin-Watson statistic can be obtained in SPSS via Regression → Linear → Statistics → Durbin-Watson. Values of the statistic larger than 2.5 or less than 1.5 indicate a time effect.
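Outside SPSS, the Durbin-Watson statistic is straightforward to compute from the time-ordered residuals as $d = \sum_{t=2}^{n} (e_t - e_{t-1})^2 / \sum_{t=1}^{n} e_t^2$. The sketch below assumes the residuals are already sorted by time; the function names are hypothetical, and the cutoffs 1.5 and 2.5 are the rule of thumb from item 4.

```python
import numpy as np

def durbin_watson(e):
    """Durbin-Watson statistic for residuals e listed in time order."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def suggests_time_effect(d, low=1.5, high=2.5):
    """Rule of thumb from the notes: d outside [1.5, 2.5] hints at a time effect."""
    return d < low or d > high

# Hypothetical residual sequences, for illustration only.
e_alternating = np.array([1.0, -1.0, 1.0, -1.0])   # sign flips every step
e_drifting = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # steady upward drift

d_alt = durbin_watson(e_alternating)   # 12 / 4 = 3.0
d_drift = durbin_watson(e_drifting)    # 4 / 10 = 0.4
```

Values near 2 correspond to residuals with no serial pattern; the drifting sequence gives a small value and the alternating one a large value, so both would be flagged.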
Outliers
As in simple regression, outliers that occur near the boundary of the x-region may not show up in a residual plot. So, methods besides residuals are needed to spot outliers.
Define
$\mathrm{DFFITS}(i) = \dfrac{\hat{y}_i - \hat{y}_{(i)}}{\text{scale factor}}$,
where $\hat{y}_i$ is as usual and $\hat{y}_{(i)}$ is the $i$th predicted value obtained after removing the $i$th observation from the data set.
A large value of $\mathrm{DFFITS}(i)$ indicates that the $i$th observation may be an outlier. Values bigger than 2 in absolute value indicate potential outliers.
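The definition can be sketched directly by refitting the model with each observation left out. The notes do not spell out the scale factor; the code below assumes the conventional choice $s_{(i)}\sqrt{h_{ii}}$, where $s_{(i)}$ is the residual standard deviation of the fit without observation $i$ and $h_{ii}$ is its leverage. The data and function names are hypothetical.

```python
import numpy as np

def fit(X, y):
    """Least squares coefficients for design matrix X and response y."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def dffits(X, y):
    """DFFITS by brute-force leave-one-out refitting.

    Assumed scale factor (not stated in the notes): s_(i) * sqrt(h_ii).
    """
    n, p = X.shape
    h = np.diag(X @ np.linalg.pinv(X))  # leverages h_ii
    y_hat = X @ fit(X, y)               # fitted values from all the data
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta_i = fit(X[keep], y[keep])              # fit without observation i
        resid_i = y[keep] - X[keep] @ beta_i
        s_i = np.sqrt(resid_i @ resid_i / (n - 1 - p))
        y_hat_del = X[i] @ beta_i                   # y_hat_(i)
        out[i] = (y_hat[i] - y_hat_del) / (s_i * np.sqrt(h[i]))
    return out

# Hypothetical data: nine points close to y = 2 + 0.5 x, plus one point at the
# boundary of the x-region (x = 20) that lies far below that line.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 20.0])
noise = np.array([0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.0])
y = np.append(2.0 + 0.5 * x[:9] + noise, 6.0)
X = np.column_stack([np.ones_like(x), x])

d = dffits(X, y)
flagged = np.flatnonzero(np.abs(d) > 2)  # cutoff of 2 from the notes
```

Refitting n times is wasteful for large data sets; statistical packages use an equivalent closed form based on the leverages, but the loop mirrors the definition most directly.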
The DFFITS statistics are obtained in SPSS as follows: Regression → Linear → Save → Standardized DfFit.
Plot $\mathrm{DFFITS}(i)$ against $i$ or one of the independent variables to check for outliers.
Always plot both residuals and DFFITS.
Residuals may miss outliers near the boundary of the x-region.
DFFITS may miss outliers in the middle of the x-region.
What should one do with outliers?
After spotting an outlier, check to see if an error was made in recording the data. If an error was made, correct it and re-estimate the model using all the data.
If no errors were made, there are at least two courses of action:
1) Throw out the outlier(s) and estimate the model with the remaining data. Consult a statistician if you want to predict the response at values of x near the ones thrown out.
2) Use an alternative to least squares analysis, such as robust regression.
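The notes do not name a particular robust method. As one possible illustration, the sketch below uses Huber M-estimation fitted by iteratively reweighted least squares, which downweights observations with large residuals instead of discarding them. The tuning constant 1.345, the MAD-based scale estimate, the function name and the data are all assumptions made for this example.

```python
import numpy as np

def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate of regression coefficients via IRLS (illustrative)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # start from least squares
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust scale: median absolute deviation, rescaled for normal errors.
        scale = np.median(np.abs(r - np.median(r))) / 0.6745
        if scale == 0:
            break
        u = np.abs(r) / scale
        # Huber weights: 1 for small residuals, c/u for large ones.
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Hypothetical data: y = 2 + 0.5 x plus small noise, with one gross outlier
# in the middle of the x-region (x = 5).
x = np.arange(1.0, 11.0)
noise = np.array([0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1, 0.05, -0.05, 0.0])
y = 2.0 + 0.5 * x + noise
y[4] += 10.0
X = np.column_stack([np.ones_like(x), x])

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_rob = huber_irls(X, y)
```

In this example the outlier pulls the least squares intercept well away from 2, while the Huber fit stays close to the line followed by the other nine points. Huber weights depend only on residual size, so this particular method offers little protection against outliers at the boundary of the x-region.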