Outliers


Page 1: Outliers

Sometimes there is data that doesn’t fit the pattern of the rest. In each of the following graphs, circle the point that is unlike the trend.

[Figure: four scatter plots of y (axis from 3.92 to 4.04) against x (axis from 1 to 8), titled Collection 1 Scatter Plot through Collection 4 Scatter Plot.]

A point that is far away from the trend line (line of best fit) of the data is considered an outlier. Sometimes these data points should be removed to give a better representation of the trend of the data. If they are removed, this fact should be mentioned.
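One way to make "far away from the trend line" concrete is to fit a line of best fit and look at each point's vertical distance from it (its residual). The sketch below is only an illustration: the data values are hypothetical (they are not read from the Collection plots), and picking the single largest residual is an assumed rule of thumb, not a method stated in these slides.

```python
import numpy as np

# Hypothetical data: eight points on a gentle upward trend, with the
# point at x = 5 deliberately placed far below the others.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([3.94, 3.95, 3.97, 3.98, 3.92, 4.00, 4.01, 4.02])

# Least-squares line of best fit: y = slope * x + intercept.
slope, intercept = np.polyfit(x, y, 1)

# Residual = vertical distance from each point to the fitted line.
residuals = y - (slope * x + intercept)

# Treat the point with the largest absolute residual as the candidate outlier.
outlier_index = int(np.argmax(np.abs(residuals)))
print(f"candidate outlier: x = {x[outlier_index]:g}, y = {y[outlier_index]:g}")
```

Because the outlier itself influences the fitted line, this check is best read as a way to nominate a point for inspection, not as proof that the point should be removed.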

Page 2: Outliers

Let’s see the effect outliers have on the line of best fit.

[Figure: Collection 1 Scatter Plot, with and without the outlier; x from 1 to 8, y from 3.92 to 4.04.]

With outlier:
Equation: y = 0.000454x + 3.96
r^2 = 0.0016
r = 0.04

Without outlier:
Equation: y = 0.0116x + 3.93
r^2 = 0.87
r = 0.93
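A comparison like the one above can be reproduced by fitting the line twice, once on all points and once with the suspected outlier left out. The following is a minimal sketch under stated assumptions: the data are hypothetical (the actual Collection 1 values are not given in the transcript), and `fit_summary` is a helper name introduced here for illustration.

```python
import numpy as np

def fit_summary(x, y):
    """Return (slope, intercept, r) for the least-squares line through (x, y)."""
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]  # Pearson correlation coefficient
    return slope, intercept, r

# Hypothetical data; the point at index 4 (x = 5) plays the role of the outlier.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([3.94, 3.95, 3.97, 3.98, 3.92, 4.00, 4.01, 4.02])

# Boolean mask that keeps every point except the suspected outlier.
keep = np.arange(len(x)) != 4

with_fit = fit_summary(x, y)
without_fit = fit_summary(x[keep], y[keep])

for label, (slope, intercept, r) in [("With outlier", with_fit),
                                     ("Without outlier", without_fit)]:
    print(f"{label}: y = {slope:.4f}x + {intercept:.3f}; "
          f"r^2 = {r**2:.2f}; r = {r:.2f}")
```

Printing both fits side by side keeps the removal visible, which matches the advice that a removed point should be mentioned.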

Page 3: Outliers

Let’s see the effect outliers have on the line of best fit.

[Figure: Collection 2 Scatter Plot, with and without the outlier; x from 1 to 8, y from 3.92 to 4.04.]

With outlier:
Equation: y = 0.0133x + 3.912
r^2 = 0.59
r = 0.77

Without outlier:
Equation: y = 0.00829x + 3.926
r^2 = 0.85
r = 0.92

Page 4: Outliers

Let’s see the effect outliers have on the line of best fit.

[Figure: Collection 3 Scatter Plot, with and without the outlier; x from 1 to 8, y from 3.92 to 4.04.]

With outlier:
Equation: y = 0.00895x + 3.95
r^2 = 0.46
r = 0.68

Without outlier:
Equation: y = 0.00456x + 3.975
r^2 = 0.64
r = 0.80

Page 5: Outliers

Let’s see the effect outliers have on the line of best fit.

[Figure: Collection 4 Scatter Plot, with and without the outlier; x from 1 to 8, y from 3.92 to 4.04.]

With outlier:
Equation: y = 0.00325x + 3.98
r^2 = 0.045
r = 0.21

Without outlier:
Equation: y = 0.0118x + 3.93
r^2 = 0.83
r = 0.91
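For a straight-line fit, r has the same sign as the slope and a magnitude equal to the square root of r^2, so the r values listed for the four collections can be checked against their r^2 values. The sketch below uses only the slopes, r^2 and r values reported on Pages 2 to 5; the helper name `r_from_r_squared` is introduced here for illustration.

```python
import math

# (collection, case, slope, r^2, r) as reported for each fitted line.
reported = [
    ("Collection 1", "with outlier", 0.000454, 0.0016, 0.04),
    ("Collection 1", "without outlier", 0.0116, 0.87, 0.93),
    ("Collection 2", "with outlier", 0.0133, 0.59, 0.77),
    ("Collection 2", "without outlier", 0.00829, 0.85, 0.92),
    ("Collection 3", "with outlier", 0.00895, 0.46, 0.68),
    ("Collection 3", "without outlier", 0.00456, 0.64, 0.80),
    ("Collection 4", "with outlier", 0.00325, 0.045, 0.21),
    ("Collection 4", "without outlier", 0.0118, 0.83, 0.91),
]

def r_from_r_squared(slope, r_squared):
    """Return r with the sign of the slope and magnitude sqrt(r^2)."""
    return math.copysign(math.sqrt(r_squared), slope)

computed = [r_from_r_squared(slope, r2) for _, _, slope, r2, _ in reported]

for (name, case, _, _, r), value in zip(reported, computed):
    print(f"{name} ({case}): reported r = {r:.2f}, sqrt(r^2) = {value:.2f}")
```

Since the reported r^2 values are already rounded, small differences in the last digit are expected; a larger gap would suggest a transcription error.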

Page 6: Outliers

Compare the correlation before the outlier was removed to after it was removed. What effect does the outlier have on the line of best fit?

The outlier pulls the line of best fit towards it. In each of the four collections, the correlation becomes stronger when the outlier is removed.

[Figure: Collection 4 Scatter Plot; x from 1 to 8, y from 3.92 to 4.04.]

Before removal:
Equation: y = 0.00325x + 3.98
r = 0.21

After removal:
Equation: y = 0.0118x + 3.93
r = 0.91