Re-Expressing Data Get it Straight!. Page 192, #2, 4, 15, 19, 22 Residuals Pg 193, # 11, 23, 27, 33,...
Uploaded by hunter-turkett
Re-Expressing Data: Get it Straight!
Page 192, #2, 4, 15, 19, 22
Residuals Pg 193, #11, 23, 27, 33, 45
Pg 195, #16, 22, 23, 25, 37
Regression Wisdom Pg 214, #1, 3, 4, 8, 10
Goals of Re-Expressing Data
Goal 1: Make the distribution of a variable (as seen in its histogram, for example) more symmetric.
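One way to check the first goal numerically is to compare skewness before and after a re-expression. The sketch below is only an illustration: the data are synthetic (drawn from a log-normal distribution, not taken from these slides), and the helper name `skewness` is an assumption made for this example.

```python
import numpy as np

def skewness(values):
    """Mean cubed z-score; close to 0 for a symmetric distribution."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return float(np.mean(z ** 3))

# Synthetic right-skewed data (assumption: log-normal, for illustration only).
rng = np.random.default_rng(0)
raw = rng.lognormal(mean=3.0, sigma=1.0, size=2000)
logged = np.log10(raw)

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(f"skewness of raw values:   {skew_raw:.2f}")
print(f"skewness of log10 values: {skew_log:.2f}")
```

A histogram of `raw` has a long right tail, while a histogram of `logged` is roughly symmetric, which is what the much smaller skewness value reflects.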
Goals of Re-Expressing Data
Goal 2: Make the spread of several groups (as seen in side-by-side boxplots) more alike, even if their centers differ.
Goals of Re-Expressing Data
Goal 3: Make the form of the scatterplot more nearly linear.
Goals of Re-Expressing Data
Goal 4: Make the scatter in a scatterplot spread out evenly rather than thickening at one end.
In the left scatterplot, the points go from tightly bunched at the left to widely scattered at the right; the plot “thickens.”
In the second plot, log Assets vs. log Sales shows a clean, positive, linear association. The variation in y at each x value is about the same.
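The "thickening" described above can be measured rather than eyeballed. The following is a minimal sketch under stated assumptions: the `sales` and `assets` arrays are synthetic stand-ins (not the data behind the slide's plots), and `spread_ratio` is a hypothetical helper that compares residual spread in the upper and lower halves of x.

```python
import numpy as np

# Synthetic stand-ins for Sales and Assets (assumption, for illustration only):
# the noise is multiplicative, so the raw scatter fans out as sales grow.
rng = np.random.default_rng(1)
sales = rng.lognormal(mean=7.0, sigma=1.5, size=500)
assets = 0.8 * sales * rng.lognormal(mean=0.0, sigma=0.5, size=500)

def spread_ratio(x, y):
    """Fit a least-squares line, then divide the residual SD for the
    upper half of x by the residual SD for the lower half of x."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    upper = x > np.median(x)
    return float(resid[upper].std() / resid[~upper].std())

ratio_raw = spread_ratio(sales, assets)
ratio_log = spread_ratio(np.log10(sales), np.log10(assets))
print(f"raw scale: upper/lower residual SD = {ratio_raw:.1f}")
print(f"log scale: upper/lower residual SD = {ratio_log:.2f}")
```

A ratio far above 1 means the plot thickens toward the right; a ratio near 1 means the scatter is about equally wide everywhere.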
The Ladder of Powers
How do you pick which re-expression to use?
We will create (copy) a ladder of re-expression techniques
The farther you move from the "1" position (the original data), the greater the effect on the data.
| Power | Name | Comment |
| --- | --- | --- |
| 2 | The square of the data values | Try for unimodal, left-skewed distributions. |
| 1 | The raw data, no change at all | Data that can take on both positive and negative values with no bounds are less likely to benefit from re-expression. |
| 1/2 | The square root of the data values | Counts often benefit from a square root re-expression. |
| "0" | We use the 0-power to stand for the logarithm | Good for values that cannot be negative. Good for values that grow by percentage increases, such as salaries or populations. A good place to start. |
| -1/2 | The negative reciprocal of the square root | An uncommon re-expression. |
| -1 | The negative reciprocal | Ratios of two quantities often benefit from a reciprocal. |
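The ladder can be written as a single function so that each rung is easy to try in turn. This is a sketch, not a standard library routine: the name `re_express` and its input checks are assumptions made for this example. The minus sign on the negative powers keeps larger original values mapped to larger re-expressed values, so the direction of an association is not flipped.

```python
import numpy as np

def re_express(values, power):
    """Apply one rung of the ladder of powers (2, 1, 1/2, 0, -1/2, -1).

    Power 0 stands for the logarithm; negative powers return the
    NEGATIVE reciprocal so the ordering of the data is preserved.
    """
    values = np.asarray(values, dtype=float)
    if power <= 0 and np.any(values <= 0):
        raise ValueError("powers 0, -1/2 and -1 need strictly positive data")
    if power == 0.5 and np.any(values < 0):
        raise ValueError("the square root needs non-negative data")
    if power == 0:
        return np.log10(values)
    if power < 0:
        return -(values ** power)
    return values ** power

# Walk down the ladder on a small made-up data set.
data = [1.0, 4.0, 25.0, 100.0]
for rung in (2, 1, 0.5, 0, -0.5, -1):
    print(rung, re_express(data, rung))
```

In practice you would plot the re-expressed values after each step and stop at the rung where the histogram or scatterplot looks best.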
Just Checking
You want to model the relationship between the number of birds counted at a nesting site and the temperature (in degrees C).
The scatterplot of counts vs. temperature shows an upwardly curving pattern, with more birds spotted at higher temperatures. What transformation, if any, of the bird counts might you start with?
Answer: Counts are often best transformed by using the square root.
Just Checking
You want to model the relationship between prices for various items in Paris and in Hong Kong. The scatterplot of Hong Kong prices versus Parisian prices shows a generally straight pattern with a small amount of scatter. What transformation, if any, of the Hong Kong prices might you start with?
Answer: None; the relationship is already linear.
Just Checking
You want to model the population growth of the US over the past 200 years. The scatterplot shows a strongly upward and curved pattern. What transformation, if any, of the population might you start with?
Answer: Even though population values are technically counts, you should probably try a stronger re-expression such as log(population), because populations grow in proportion to their size.
Attack of the Logarithms
What do you do if the curvature is more stubborn?
When none of the data values are zero or negative, try taking the logs of both the x and y variables.
| Model Name | X-axis | Y-axis | Comment |
| --- | --- | --- | --- |
| Exponential | x | log(y) | This model is the “0” power in the ladder approach. Useful for values that grow by percentage increases. |
| Logarithmic | log(x) | y | May help with a wide range of x-values, or with a scatterplot that descends rapidly at the left but levels off toward the right. |
| Power | log(x) | log(y) | The Goldilocks model: when one of the ladder’s powers is too big and the next is too small, this one may be just right. |
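The three log-based models can be tried side by side by fitting a line to each pair of re-expressed axes. The sketch below uses synthetic data generated from a power law (an assumption made so the "right" answer is known in advance), and `r_squared` is a hypothetical helper. R² is used here only as a rough screen; a residual plot for the chosen model remains the real check of straightness.

```python
import numpy as np

def r_squared(x, y):
    """Squared correlation between x and y (R^2 of the least-squares line)."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)

# Synthetic data following y = 3 * x^1.5 with mild multiplicative noise
# (assumption, for illustration only). All values are positive, so logs are safe.
rng = np.random.default_rng(2)
x = np.linspace(1.0, 100.0, 200)
y = 3.0 * x ** 1.5 * rng.lognormal(mean=0.0, sigma=0.05, size=x.size)

# (x-axis, y-axis) pairs matching the table rows.
candidates = {
    "exponential": (x, np.log10(y)),
    "logarithmic": (np.log10(x), y),
    "power": (np.log10(x), np.log10(y)),
}
scores = {name: r_squared(px, py) for name, (px, py) in candidates.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name:12s} R^2 = {score:.4f}")
print("straightest on these data:", best)

# For the power model, the fitted slope estimates the exponent of x.
log_x, log_y = candidates["power"]
slope, intercept = np.polyfit(log_x, log_y, 1)
print(f"log10(y) = {intercept:.3f} + {slope:.3f} * log10(x)")
```

Because the data were built from a power law, the log(x) vs. log(y) plot is the straightest of the three, and the slope of its line recovers the exponent used to generate the data.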
Non-Linear
Homework
Page 239, #3, 5, 8, 11, 13, 15