*Straightening Relationships *Goals of Re-Expression *Ladder of Powers.

Chapter 10 Re-expressing Data: Get it Straight!! *Straightening Relationships *Goals of Re-Expression *Ladder of Powers


Page 1

Chapter 10
Re-expressing Data: Get it Straight!!

* Straightening Relationships
* Goals of Re-Expression
* Ladder of Powers

Page 2

Straightening Relationships

To use a linear model, the scatterplot must be straight enough.
• Check the scatterplot AND the residual plot.

Re-expressing the data can straighten a relationship, so that a linear model can be used for scatterplots that do not satisfy the Straight Enough Condition.

Page 3

MPG and Weight

Page 4

A Hummer weighs about 6000 pounds. What is the predicted MPG?

Page 5

MPG vs Gallons/100 Miles

Change 25 mpg into gallons/100 miles.
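The conversion asked for here is a reciprocal re-expression scaled by 100. A minimal sketch in Python, where the function name is an illustrative choice and not something from the slides:

```python
def mpg_to_gal_per_100_miles(mpg: float) -> float:
    """Convert miles per gallon to gallons per 100 miles.

    Driving 100 miles at `mpg` miles per gallon uses 100 / mpg gallons.
    """
    return 100.0 / mpg


print(mpg_to_gal_per_100_miles(25))  # 4.0
```

So 25 mpg corresponds to 4 gallons per 100 miles.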

Page 6

Scatterplot: gal/100 miles and weight

Page 7

Revisit the Hummer

What is the predicted fuel efficiency for a Hummer (6000 lbs)?

The new model predicts that a 6000 lb Hummer would use 9.7 gallons/100 miles.

Convert that back into MPG.
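Converting back simply undoes the re-expression. A short sketch, assuming only the 9.7 gallons/100 miles prediction quoted on this slide (the function name is hypothetical):

```python
def gal_per_100_miles_to_mpg(gallons_per_100_miles: float) -> float:
    """Invert the re-expression: miles per gallon = 100 / (gallons per 100 miles)."""
    return 100.0 / gallons_per_100_miles


hummer_mpg = gal_per_100_miles_to_mpg(9.7)
print(round(hummer_mpg, 1))  # 10.3
```

The predicted fuel efficiency is therefore about 10.3 mpg.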

Page 8

Not Sold??

You regularly use re-expression.

What units do you use to talk about how fast you went on a bike?

What units do you use to talk about how fast you run?

Page 9

Goals of Re-Expressing

1) Make the distribution of a variable more symmetric
• easier to compare centers
• if it’s unimodal, you could perhaps use the Normal model
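As a rough illustration of this goal, the sketch below measures skewness before and after taking logs. The data values are hypothetical (chosen to double at each step), and the moment-based skewness formula is one common definition, not something specified on the slide.

```python
import math


def skewness(values):
    """Moment-based skewness: mean cubed deviation divided by sd cubed."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)


# Hypothetical right-skewed data: each value is double the previous one.
raw = [1, 2, 4, 8, 16, 32, 64]
logged = [math.log10(v) for v in raw]

print("skewness of raw values:   ", skewness(raw))
print("skewness of log10 values: ", skewness(logged))
```

For these made-up values the raw skewness is clearly positive, while the logged values are evenly spaced and so have skewness of essentially zero.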

Page 10

Goals

2) Make the spread of several groups more alike
• groups with similar spreads are easier to compare
• centers may be different
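A small sketch of this goal, using two hypothetical groups whose spread grows with their level. Taking logs is one possible re-expression here; the group values and the use of the range as the measure of spread are assumptions made for illustration.

```python
import math

# Hypothetical groups: group_b is group_a multiplied by 10.
group_a = [1, 2, 4]
group_b = [10, 20, 40]


def value_range(values):
    """Spread measured as max minus min."""
    return max(values) - min(values)


log_a = [math.log10(v) for v in group_a]
log_b = [math.log10(v) for v in group_b]

print("raw ranges:", value_range(group_a), value_range(group_b))
print("log ranges:", value_range(log_a), value_range(log_b))
```

On the raw scale the second group is ten times as spread out; on the log scale the two ranges match, while the centers still differ.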

Page 11

Goals

3) Make the form of a scatterplot more nearly linear
• linear models are easier to describe

Page 12

Goals

4) Make the scatter in a scatterplot spread out evenly rather than following a fan shape
• having an even scatter is a condition of many methods in Statistics (we will see later)

Page 13

Ladder of Powers

Use it to systematically re-express data.

The farther you move from 1 (the original data), the greater the effect on the data.

Certain re-expressions work better for certain types of data.

Page 14

Ladder of Powers

Power | Name | Comment
2 | y² | unimodal distributions that are skewed to the left
1 | y | data that can be both positive and negative and continue without bound; less likely to benefit from re-expression
1/2 | √y | counted data
0 | log(y) | measurements that can NOT be negative; values that grow by percentages (salaries, populations); if the data contain zeros, add a small constant to each value
-1/2 | -1/√y | uncommon; changing the sign to take the negative of the reciprocal square root preserves the direction
-1 | -1/y | ratios of two quantities (mpg); change the sign if you want to preserve the direction; if there are zeros, add a small constant to all values
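The ladder can be sketched as a single function that takes a value and a power. This is an illustrative helper, not part of the slides: the name `re_express` is hypothetical, base-10 logs are an arbitrary choice for the "0" rung, and the code assumes positive data.

```python
import math


def re_express(y: float, power: float) -> float:
    """Re-express a positive value y using a rung of the ladder of powers.

    power 0 means log(y); negative powers are negated so that the
    direction (ordering) of the data is preserved.
    """
    if power == 0:
        return math.log10(y)
    value = y ** power
    return -value if power < 0 else value


# Walk down the ladder for one value.
for power in (2, 1, 0.5, 0, -0.5, -1):
    print(power, re_express(4, power))
```

Because the negative rungs are negated, larger original values still map to larger re-expressed values on every rung.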

Page 15

Plan B: Attack of the Logs

Try taking the logs of BOTH the x-values and the y-values.

Model Name | x-axis | y-axis | Comment
Exponential | x | log(y) | “0” power from the ladder
Logarithmic | log(x) | y | wide range of x-values; scatterplot descending rapidly at the left and trailing off to the right
Power | log(x) | log(y) | when you are in between powers on the ladder
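Why each pairing of axes straightens its model can be seen by taking logs of the model equation. The derivation below is a sketch: the constants a and b are illustrative symbols that do not appear on the slide, log is a logarithm of any fixed base, and the exponential and power cases assume a > 0 (with b > 0 and x > 0 respectively) so that the logs exist.

```latex
\begin{align*}
\text{Exponential:}\quad y = a\,b^{x}
  &\;\Longrightarrow\; \log y = \log a + (\log b)\,x
  && \text{(linear in } x\text{)} \\
\text{Logarithmic:}\quad y = a + b\,\log x
  &&& \text{(already linear in } \log x\text{)} \\
\text{Power:}\quad y = a\,x^{b}
  &\;\Longrightarrow\; \log y = \log a + b\,\log x
  && \text{(linear in } \log x\text{)}
\end{align*}
```

In the power model the slope of the log-log line is the exponent b, which is why this plot helps when the right power falls between rungs of the ladder.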

Page 16

Example

Let’s try to predict the shutter speed based on the f/stop of a camera’s lens.

Enter data:

Shutter Speed | 1/1000 | 1/500 | 1/250 | 1/125 | 1/60 | 1/30 | 1/15 | 1/8
f/stop | 2.8 | 4 | 5.6 | 8 | 11 | 16 | 22 | 32
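One way to work this example is sketched below. It uses the data from the slide and tries the log-log (power) re-expression from Plan B; that choice is an assumption, since the slide does not say which re-expression it settles on. The helper names are hypothetical.

```python
import math

# Data from the slide.
f_stop = [2.8, 4, 5.6, 8, 11, 16, 22, 32]
shutter_speed = [1/1000, 1/500, 1/250, 1/125, 1/60, 1/30, 1/15, 1/8]


def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Re-express both variables with logs, then fit a line.
log_f = [math.log10(f) for f in f_stop]
log_speed = [math.log10(s) for s in shutter_speed]
intercept, slope = fit_line(log_f, log_speed)


def predict_shutter_speed(f):
    """Back-transform the fitted line to the original shutter-speed scale."""
    return 10 ** (intercept + slope * math.log10(f))


print(f"log10(speed) = {intercept:.3f} + {slope:.3f} * log10(f/stop)")
print("predicted speed at f/8:", predict_shutter_speed(8))
```

The fitted slope comes out close to 2, which suggests shutter speed is roughly proportional to the square of the f/stop. As with any re-expression, the residual plot should still be checked before trusting the model.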

Page 17

What Can Go Wrong?

Don’t expect your model to be perfect.

Don’t choose a model based on R² alone.
• Always check the residual plot.

Watch out for scatterplots that change direction.

Watch out for negative values.

Rescale years.

Don’t stray too far from the ladder.
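The warning about R² can be illustrated with a sketch on made-up data: a straight line fitted to a clearly curved relationship (y = x² for x = 1 to 10) still earns a high R², and only the residuals reveal the bend. The data and helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical curved data: y is exactly x squared.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

mean_y = sum(ys) / len(ys)
sse = sum(r ** 2 for r in residuals)        # unexplained variation
sst = sum((y - mean_y) ** 2 for y in ys)    # total variation
r_squared = 1 - sse / sst

print("R^2:", round(r_squared, 3))
print("residuals:", [round(r, 1) for r in residuals])
```

R² is above 0.9 here, yet the residuals are positive at both ends and negative in the middle. That curved pattern shows the linear model is the wrong form despite the high R².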