Power 16


Transcript of Power 16

Page 1: Power 16

Power 16

Page 2: Power 16

Review
Post-Midterm
Cumulative

Page 3: Power 16

Projects

Page 4: Power 16

Logistics
Put the PowerPoint slide show on a high-density floppy disk, or e-mail it as an attachment, for a WINTEL machine.
Email the slide show to [email protected] as a PowerPoint attachment.

Page 5: Power 16

Assignments
1. Project choice
2. Data Retrieval
3. Statistical Analysis
4. PowerPoint Presentation
5. Executive Summary
6. Technical Appendix
7. Graphics

Power_13

Page 6: Power 16

PowerPoint Presentations: Member 4
1. Introduction: Members 1, 2, 3
   What
   Why
   How
2. Executive Summary: Member 5
3. Exploratory Data Analysis: Member 3
4. Descriptive Statistics: Member 3
5. Statistical Analysis: Member 3
6. Conclusions: Members 3 & 5
7. Technical Appendix: Table of Contents, Member 6

Page 7: Power 16

Executive Summary and Technical Appendix

Page 8: Power 16

I. Your report should have an executive summary of one to one and a half pages that summarizes your findings in words for a non-technical reader. It should explain the problem being examined from an economic perspective, i.e., it should motivate the reader's interest in the issue. Your report should explain, in simple language, how you are investigating the issue. It should explain why you are approaching the problem in this particular fashion. Your executive summary should explain the economic importance of your findings.

You can attach the technical details of your findings as an appendix.

Page 9: Power 16

Technical Appendix
Table of Contents
Spreadsheet of data used and sources or, if extensive, a subsample of the data
Descriptive statistics and histograms for the variables in the study
If time series data, a plot of each variable against time
If relevant, a plot of the dependent variable vs. each of the explanatory variables

Page 10: Power 16

Technical Appendix (Cont.)
Statistical results, for example regression
Plot of the actual, fitted, and error, and other diagnostics
Brief summary of the conclusions and meanings drawn from the exploratory, descriptive, and statistical analysis

Page 11: Power 16

Post-Midterm Review
Project I: Power 16
Contingency Table Analysis: Power 14, Lab 8
ANOVA: Power 15, Lab 9
Survival Analysis: Power 12, Power 11, Lab 7
Multivariate Regression: Power 11, Lab 6

Page 12: Power 16

Slide Show
Challenger disaster

Page 13: Power 16

Project I
Number of O-Rings Failing On Launch i: y_i(#) = a + b*temp_i + e_i
  Biased because of zeros, even if the equation is divided by 6
Two Ways to Proceed
  Tobit, non-linear estimation: y_i(#) = a + b*temp_i + e_i
  Bernoulli variable: probability models
Probability Models: y_i(0,1) = a + b*temp_i + e_i

Page 14: Power 16

Project I (Cont.)
Probability Models: y_i(0,1) = a + b*temp_i + e_i
  OLS, Linear Probability Model: linear approximation to the sigmoid
  Probit: non-linear estimate of the sigmoid
  Logit: non-linear estimate of the sigmoid
Significant Dependence on Temperature
  t-test (or z-test) on slope, H0: b = 0
  F-test
  Wald test
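The slope test for the linear probability model can be sketched in a few lines. This is an illustrative sketch, not the course's EViews output: the launch pairs are transcribed from the data slides in this deck, and the helper name ols_slope_test is an assumption introduced here.

```python
import math

# (failed o-rings, launch temperature in F) for the 24 launches before
# Challenger, transcribed from the data slides of this deck.
LAUNCHES = [
    (3, 53), (1, 57), (1, 58), (1, 63), (0, 66), (0, 67), (0, 67), (0, 67),
    (0, 68), (0, 69), (1, 70), (1, 70), (0, 70), (0, 70), (0, 72), (0, 73),
    (2, 75), (0, 75), (0, 76), (0, 76), (0, 78), (0, 79), (0, 80), (0, 81),
]


def ols_slope_test(x, y):
    """Fit y = a + b*x by least squares; return (a, b, se_b, t_b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Sum of squared residuals, then the usual standard error of the slope.
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(ssr / (n - 2) / sxx)
    return a, b, se_b, b / se_b


temps = [t for _, t in LAUNCHES]
any_failure = [1 if k > 0 else 0 for k, _ in LAUNCHES]  # Bernoulli outcome

a, b, se_b, t_b = ols_slope_test(temps, any_failure)
print(f"LPM: a={a:.3f}, b={b:.4f}, se(b)={se_b:.4f}, t={t_b:.2f}")
```

With 22 degrees of freedom the two-tailed 5% critical value of t is about 2.07, so H0: b = 0 is rejected when |t| exceeds that value.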

Page 15: Power 16

Project I (Cont.)
Plots of Number or Probability vs. Temp.
  Label the axes
Answer all parts, a-f
The most frequent sins
  Did not explicitly address significance
  Did not answer b, 66°: all launches at lower temperatures had one or more o-ring failures
  Did not execute c, estimate linear probability model

Page 16: Power 16

Challenger Disaster
Failure of O-rings that sealed grooves on the booster rockets
Was there any relationship between o-ring failure and temperature?
Engineers knew that the rubber o-rings hardened and were less flexible at low temperatures
But was there launch data that showed a problem?

Page 17: Power 16

Challenger Disaster
What: Was there a relationship between launch temperature and o-ring failure prior to the Challenger disaster?
Why: Should the launch have proceeded?
How: Analyze the relationship between launch temperature and o-ring failure

Page 18: Power 16

Launches Before Challenger
Data
  number of o-rings that failed
  launch temperature

Page 19: Power 16

o-rings  temperature
3        53
1        57
1        58
1        63
0        66
0        67
0        67
0        67
0        68
0        69
1        70

Page 20: Power 16

o-rings  temperature
1        70
0        70
0        70
0        72
0        73
2        75
0        75
0        76
0        76
0        78
0        79

Page 21: Power 16

o-rings  temperature
0        80
0        81

Page 22: Power 16

Exploratory Analysis
Launches where there was a problem

Page 23: Power 16

Orings  temperature
1       58
1       57
1       70
1       63
1       70
2       75
3       53

Page 24: Power 16

[Plot: ORINGS vs. TEMP]

Page 25: Power 16

Exploratory Analysis: All Launches
Plot of failures per observation versus temperature range shows temperature dependence
Mean temperature for the 7 launches with o-ring failures was lower, 63.7, than for the 17 launches without o-ring failures, 72.6
Contingency table analysis
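The two group means can be checked directly from the launch table. A minimal sketch, assuming the 24 (o-rings, temperature) pairs transcribed from the data slides above; the variable names are illustrative.

```python
from statistics import mean

# (failed o-rings, launch temperature in F), transcribed from the data slides.
LAUNCHES = [
    (3, 53), (1, 57), (1, 58), (1, 63), (0, 66), (0, 67), (0, 67), (0, 67),
    (0, 68), (0, 69), (1, 70), (1, 70), (0, 70), (0, 70), (0, 72), (0, 73),
    (2, 75), (0, 75), (0, 76), (0, 76), (0, 78), (0, 79), (0, 80), (0, 81),
]

# Split launch temperatures by whether any o-ring failed.
temps_fail = [t for k, t in LAUNCHES if k > 0]
temps_ok = [t for k, t in LAUNCHES if k == 0]

mean_fail = mean(temps_fail)
mean_ok = mean(temps_ok)
print(len(temps_fail), round(mean_fail, 1))  # 7 launches, mean 63.7
print(len(temps_ok), round(mean_ok, 1))      # 17 launches, mean 72.6
```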

Page 26: Power 16

Launches and O-Ring Failures (Yes/No)

              Fail: Yes   Fail: No   Row Total
53-62 F       3           0          3
63-71 F       3           8          11
72-81 F       1           9          10
Column Total  7           17         24

Page 27: Power 16

Launches and O-Ring Failures (Yes/No): Expected/Observed

              Fail: Yes   Fail: No   Row Total
53-62 F       0.875/3     2.125/0    3
63-71 F       3.208/3     7.792/8    11
72-81 F       2.917/1     7.083/9    10
Column Total  7           17         24

Page 28: Power 16

Launches and O-Ring Failures
Chi-Square, 2 dof = 9.08, crit(α = 0.05) = 6

              Fail: Yes   Fail: No   Row Total
53-62 F       5.16        2.125      3
63-71 F       0.013       0.005      11
72-81 F       1.26        0.519      10
Column Total  7           17         24
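The expected counts and the chi-square statistic follow mechanically from the observed table. A minimal sketch using only the observed counts from the slides; the closed-form p-value exp(-x/2) used here holds only for 2 degrees of freedom, which is the case for this 3 x 2 table.

```python
import math

# Observed launches by temperature range: (failures: yes, failures: no).
observed = {
    "53-62 F": (3, 0),
    "63-71 F": (3, 8),
    "72-81 F": (1, 9),
}

n = sum(sum(row) for row in observed.values())
col_totals = [sum(row[j] for row in observed.values()) for j in range(2)]

chi_sq = 0.0
for label, row in observed.items():
    row_total = sum(row)
    for j, obs in enumerate(row):
        # Expected count under independence: row total * column total / n.
        expected = row_total * col_totals[j] / n
        chi_sq += (obs - expected) ** 2 / expected

dof = (len(observed) - 1) * (len(col_totals) - 1)
# Upper-tail probability of a chi-square with exactly 2 dof.
p_value = math.exp(-chi_sq / 2)
print(f"chi-square = {chi_sq:.2f}, dof = {dof}, p = {p_value:.3f}")
```

Several expected counts are below 5, so the chi-square approximation is rough for a table this small.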

Page 29: Power 16

[Plot: Number of O-ring Failures vs. Temperature (ORINGS against TEMP)]

Page 30: Power 16

Probability Models
[Plot: Probability vs. Temperature; series: Bernoulli, LPM Fitted, Probit Fitted]
Logit extrapolated to 31F:
Probit extrapolated to 31F:

Page 31: Power 16

Number of Failed O-Rings
[Plot: Number vs. Temperature; series: Number of Failed O-Rings, OLS Fitted, Tobit Fitted]
Extrapolating OLS to 31F: OLS: Tobit:

Page 32: Power 16

Conclusions
From extrapolating the probability models (Linear Probability, Probit, or Logit) to 31 F, there was a high probability of one or more o-rings failing.
From extrapolating the number of o-rings failing (OLS or Tobit) to 31 F, 3 or more o-rings would fail.
There had been only one launch out of 24 where as many as 3 o-rings had failed.
Decision theory argument: expected cost/benefit ratio:
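The linear extrapolations to 31 F can be sketched from the launch table. This is a hedged illustration covering only the two OLS fits (number of failures and the linear probability model); probit, logit, and Tobit need iterative estimation and are not attempted here. The helper fit_line and the constant TEMP_AT_LAUNCH are names introduced for this sketch.

```python
# (failed o-rings, launch temperature in F), transcribed from the data slides.
LAUNCHES = [
    (3, 53), (1, 57), (1, 58), (1, 63), (0, 66), (0, 67), (0, 67), (0, 67),
    (0, 68), (0, 69), (1, 70), (1, 70), (0, 70), (0, 70), (0, 72), (0, 73),
    (2, 75), (0, 75), (0, 76), (0, 76), (0, 78), (0, 79), (0, 80), (0, 81),
]
TEMP_AT_LAUNCH = 31  # degrees F, the extrapolation point used in the slides


def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b


temps = [t for _, t in LAUNCHES]
counts = [k for k, _ in LAUNCHES]             # number of failed o-rings
any_failure = [1 if k > 0 else 0 for k in counts]  # one or more failed

count_a, count_b = fit_line(temps, counts)
lpm_a, lpm_b = fit_line(temps, any_failure)

pred_count = count_a + count_b * TEMP_AT_LAUNCH
pred_prob = lpm_a + lpm_b * TEMP_AT_LAUNCH
print(f"OLS number of failed o-rings at 31 F: {pred_count:.2f}")
print(f"LPM fitted value at 31 F: {pred_prob:.2f}")
```

With this transcription of the data the OLS line gives a little under 3 failed o-rings at 31 F. The linear probability line gives a fitted value above 1, which is not a valid probability; that is a known limitation of a linear approximation to the sigmoid when it is extrapolated far outside the data.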

Page 33: Power 16

Conclusions
Decision theory argument: expected cost/benefit ratio:

Page 34: Power 16

Ways to Analyze Challenger
Difference in mean temperatures for failures and successes
Difference in probability of one or more o-ring failures for high and low temperature ranges
Probability models: LPM (OLS), probit, logit
Number of o-ring failures per launch vs. temp.: OLS, Tobit
Contingency table analysis
ANOVA

Page 35: Power 16

Contingency Table Analysis
Challenger example

Page 36: Power 16

Launches and O-Ring Failures (Yes/No)

              Fail: Yes   Fail: No   Row Total
53-62 F       3           0          3
63-71 F       3           8          11
72-81 F       1           9          10
Column Total  7           17         24

Page 37: Power 16

ANOVA and O-Rings
Probability one or more o-rings fail
  Low temp: 53-62 degrees
  Medium temp: 63-71 degrees
  High temp: 72-81 degrees
Average number of o-rings failing per launch
  Low temp: 53-62 degrees
  Medium temp: 63-71 degrees
  High temp: 72-81 degrees

Page 38: Power 16

Probability one or more o-rings fails

Page 39: Power 16

Number of o-rings failing per launch

Page 40: Power 16

Page 41: Power 16

Outline
ANOVA and Regression
(Non-Parametric Statistics)
(Goodman Log-Linear Model)

Page 42: Power 16

Anova and Regression: One-Way
Salesaj = c(1)*convenience + c(2)*quality + c(3)*price + e
E[Salesaj | (convenience=1, quality=0, price=0)] = c(1) = mean for city(1)
  c(1) = mean for city(1) (convenience)
  c(2) = mean for city(2) (quality)
  c(3) = mean for city(3) (price)
Test the null hypothesis that the means are equal using a Wald test: c(1) = c(2) = c(3)

Page 43: Power 16

One-Way ANOVA and Regression

Regression Coefficients are the City Means; F statistic

Page 44: Power 16

Anova and Regression: One-Way, Alternative Specification
Salesaj = c(1) + c(2)*convenience + c(3)*quality + e
E[Salesaj | (convenience=0, quality=0)] = c(1) = mean for city(3) (price, the omitted one)
E[Salesaj | (convenience=1, quality=0)] = c(1) + c(2) = mean for city(1) (convenience)
  c(1) = mean for city(3), the omitted city
  c(2) = mean for city(1) minus mean for city(3)
Test that the mean for city(1) = mean for city(3), using the t-statistic for c(2)

Page 45: Power 16

Anova and Regression: One-Way, Alternative Specification
Salesaj = c(1) + c(2)*convenience + c(3)*price + e
E[Salesaj | (convenience=0, price=0)] = c(1) = mean for city(2) (quality, the omitted one)
E[Salesaj | (convenience=1, price=0)] = c(1) + c(2) = mean for city(1) (convenience)
  c(1) = mean for city(2), the omitted city
  c(2) = mean for city(1) minus mean for city(2)
Test that the mean for city(1) = mean for city(2), using the t-statistic for c(2)
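The link between dummy-variable coefficients and city means can be verified numerically. The sketch below uses made-up sales figures (hypothetical, not the apple juice data from the course) and checks that the candidate coefficients for the specification with price omitted satisfy the OLS normal equations X'(y - Xc) = 0, which characterize the least-squares solution.

```python
from statistics import mean

# Hypothetical weekly sales for the three marketing strategies.
# These numbers are illustrative only and do not come from the course data.
sales = {
    "convenience": [520.0, 580.0, 610.0, 550.0],
    "quality": [640.0, 700.0, 660.0, 680.0],
    "price": [600.0, 590.0, 630.0, 620.0],
}
means = {city: mean(values) for city, values in sales.items()}

# Specification with an intercept and the price city omitted:
#   sales = c1 + c2*convenience + c3*quality + e
c1 = means["price"]                          # mean of the omitted city
c2 = means["convenience"] - means["price"]   # difference from omitted city
c3 = means["quality"] - means["price"]

# Design rows: (intercept, convenience dummy, quality dummy, observed sales).
rows = [
    (1.0, float(city == "convenience"), float(city == "quality"), y)
    for city, values in sales.items()
    for y in values
]
residuals = [y - (c1 * x0 + c2 * x1 + c3 * x2) for x0, x1, x2, y in rows]

# Each regressor must be orthogonal to the residuals at the OLS solution.
normal_eqs = [sum(r[j] * e for r, e in zip(rows, residuals)) for j in range(3)]
print(c1, c2, c3, normal_eqs)
```

Changing which city is omitted changes c(1) and the differences, but the fitted city means c(1), c(1) + c(2), and c(1) + c(3) stay the same.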

Page 46: Power 16

ANOVA and Regression: Two-Way
Series of Regressions; Compare to Table 11, Lecture 15
Salesaj = c(1) + c(2)*convenience + c(3)*quality + c(4)*television + c(5)*convenience*television + c(6)*quality*television + e, SSR = 501,136.7
Salesaj = c(1) + c(2)*convenience + c(3)*quality + c(4)*television + e, SSR = 502,746.3
Test for interaction effect: F(2, 54) = [(502746.3 - 501136.7)/2]/(501136.7/54) = (1609.6/2)/9280.3 = 0.09

Page 47: Power 16

Table 11: 2-Way ANOVA of Apple Juice Sales

Source of Variation               Sum of Squares         Degrees of Freedom      Mean Square
Explained (between treatments)    ESS =
  Strategy                        ESS(Strat) = 98838.6   (a-1) = 2               49419.3
  Medium                          ESS(Med) = 13172.0     (b-1) = 1               13172.0
  Interaction                     ESS(I) = 1609.6        (a-1)(b-1) = 2          804.8
Unexplained (within treatments)   USS = 501136.7         (n-ab) = 60 - 6 = 54    9280.3
Total                             TSS = 614756.98        (n-1) = 59

Table of Two-Way ANOVA for Apple Juice Sales

Page 48: Power 16

ANOVA and Regression: Two-Way
Series of Regressions
Salesaj = c(1) + c(2)*convenience + c(3)*quality + e, SSR = 515,918.3
Test for media effect: F(1, 54) = [(515918.3 - 502746.3)/1]/(501136.7/54) = 13172/9280.3 = 1.42
Salesaj = c(1) + e, SSR = 614,757
Test for strategy effect: F(2, 54) = [(614757 - 515918.3)/2]/(501136.7/54) = (98838.7/2)/9280.3 = 5.32
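The three F statistics can be reproduced from the sums of squared residuals quoted on the slides. A minimal sketch; the function name nested_f is introduced here, and the denominator is always the mean square error of the full interaction model, as in the slides.

```python
def nested_f(ssr_restricted, ssr_unrestricted, q, ssr_full, df_full):
    """F statistic for dropping q terms, scaled by the full model's MSE."""
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_full / df_full)


SSR_FULL = 501136.7       # strategy, medium, and interaction terms
SSR_NO_INTERACT = 502746.3
SSR_STRATEGY_ONLY = 515918.3
SSR_CONSTANT = 614757.0
DF_FULL = 54              # n - ab = 60 - 6

f_interaction = nested_f(SSR_NO_INTERACT, SSR_FULL, 2, SSR_FULL, DF_FULL)
f_media = nested_f(SSR_STRATEGY_ONLY, SSR_NO_INTERACT, 1, SSR_FULL, DF_FULL)
f_strategy = nested_f(SSR_CONSTANT, SSR_STRATEGY_ONLY, 2, SSR_FULL, DF_FULL)

print(f"interaction F(2,54) = {f_interaction:.3f}")
print(f"media       F(1,54) = {f_media:.3f}")
print(f"strategy    F(2,54) = {f_strategy:.3f}")
```

The differences in SSR between successive regressions match the Strategy, Medium, and Interaction sums of squares in Table 11.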

Page 49: Power 16

Survival Analysis
Density, f(t)
Cumulative distribution function, CDF, F(t)
  Probability you failed up to time t* = F(t*)
Survivor function, S(t) = 1 - F(t)
  Probability you survived longer than t*, S(t*)
Kaplan-Meier estimates: (# at risk - # ending)/# at risk
Applications
  Testing a new drug

Page 50: Power 16

Chemotherapy Drug Taxol
Current standard for ovarian cancer is taxol and a platinate such as cisplatin
Previous standard was cyclophosphamide and cisplatin
Kaplan-Meier survival curves comparing the two regimens
Lab 7: (# at risk - # ending)/# at risk

Page 51: Power 16

Taxol (Bristol-Myers Squibb) interrupts cell division (mitosis)

It is a cyclical hydrocarbon

Page 52: Power 16

Top Panel: European Canadian and Scottish, 342 at risk for Tc, 292 survived 1 year
Bottom Panel: Gynecological Oncology Group, 196 at risk for Tc, 168 survived 1 year
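The Kaplan-Meier rule, (# at risk - # ending)/# at risk, applied period by period, can be sketched as a running product. The function name kaplan_meier is introduced here. As a simplifying assumption, everyone in the Tc arm who is not counted as surviving the first year is treated as ending in that year (no censoring), so each panel reduces to a single step.

```python
def kaplan_meier(intervals):
    """Cumulative survival from a list of (at_risk, ending) pairs per period."""
    survival = 1.0
    curve = []
    for at_risk, ending in intervals:
        # Conditional survival for this period, multiplied into the total.
        survival *= (at_risk - ending) / at_risk
        curve.append(survival)
    return curve


# First-year figures for the Tc arm quoted on the slide.
top_panel = kaplan_meier([(342, 342 - 292)])
bottom_panel = kaplan_meier([(196, 196 - 168)])

print(f"Top panel, 1-year survival:    {top_panel[0]:.3f}")
print(f"Bottom panel, 1-year survival: {bottom_panel[0]:.3f}")
```

Passing more (at_risk, ending) pairs extends the curve to later periods, each step multiplying in that period's conditional survival.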

Page 53: Power 16

2003 Final

Page 54: Power 16

Nonparametric Statistics
What to do when the sample of observations is not distributed normally?

Page 55: Power 16

3 Nonparametric Techniques
Wilcoxon Rank Sum Test for independent samples
  Data Analysis Plus
Signs Test for Matched Pairs: Rated Data
  Eviews, Descriptive Statistics
Wilcoxon Signed Rank Sum Test for Matched Pairs: Quantitative Data
  Eviews

Page 56: Power 16

Wilcoxon Rank Sum Test for Independent Samples

Testing the difference between the means of two populations when they are non-normal

A New Painkiller Vs. Aspirin, Xm17-02

Page 57: Power 16

Rating scheme

Score  Legend
5      Extremely Effective
4      Quite Effective
3      Somewhat Effective
2      Slightly Effective
1      Not At All Effective

Page 58: Power 16

Ratings

New Drug  Aspirin
3         4
5         1
4         3
3         2
2         4
5         1
1         3
4         4
5         2
3         2
3         2
5         4
5         3
5         4

Page 59: Power 16

Rank the 30 Ratings
30 total ratings for both samples
3 ratings of 1
5 ratings of 2
etc.

Page 60: Power 16

Rating  Raw Rank  Rank/Ties
1       1         2
1       2         2
1       3         2
2       4         6
2       5         6
2       6         6
2       7         6
2       8         6
3       9         12
3       10        12
3       11        12
3       12        12
3       13        12
3       14        12
3       15        12

Page 61: Power 16

Rating  Raw Rank  Rank/Ties (continued)
4       16        19.5
4       17        19.5
4       18        19.5
4       19        19.5
4       20        19.5
4       21        19.5
4       22        19.5
4       23        19.5
5       24        27
5       25        27
5       26        27
5       27        27
5       28        27
5       29        27
5       30        27

Page 62: Power 16

Drug Rate  Rank    Asp. Rate  Rank
3          12      4          19.5
5          27      1          2
4          19.5    3          12
3          12      2          6
2          6       4          19.5
5          27      1          2
1          2       3          12
4          19.5    4          19.5
5          27      2          6
3          12      2          6
3          12      2          6
5          27      4          19.5
5          27      3          12
5          27      4          19.5
4          19.5    5          27
Rank Sum   276.5              188.5

Page 63: Power 16

Rank Sum, T
E(T) = n1(n1 + n2 + 1)/2 = 15*31/2 = 232.5
VAR(T) = n1*n2(n1 + n2 + 1)/12
VAR(T) = 15*15*31/12 = 581.25, σ_T = 24.1
For sample sizes larger than 10, T is approximately normal
Z = [T - E(T)]/σ_T = (276.5 - 232.5)/24.1 = 1.83
Null hypothesis: the central tendency for the two drugs is the same
Alternative hypothesis: central tendency for the new drug is greater than for aspirin: 1-tailed test
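The rank-sum calculation can be reproduced from the 15 pairs of ratings in the rank table. A minimal sketch; the helper average_ranks is a name introduced here, and, as on the slides, the variance formula carries no correction for ties.

```python
import math

# Ratings read row by row from the rank table (new drug, aspirin).
new_drug = [3, 5, 4, 3, 2, 5, 1, 4, 5, 3, 3, 5, 5, 5, 4]
aspirin = [4, 1, 3, 2, 4, 1, 3, 4, 2, 2, 2, 4, 3, 4, 5]


def average_ranks(values):
    """Map each distinct value to the average of the raw ranks it occupies."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Tied values occupy raw ranks i+1 .. j; use their mean.
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks


ranks = average_ranks(new_drug + aspirin)
n1, n2 = len(new_drug), len(aspirin)

T = sum(ranks[v] for v in new_drug)            # rank sum for the new drug
expected_T = n1 * (n1 + n2 + 1) / 2
sigma_T = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (T - expected_T) / sigma_T
p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal area

print(f"T = {T}, E(T) = {expected_T}, sigma_T = {sigma_T:.1f}")
print(f"z = {z:.2f}, one-tailed p = {p_one_tailed:.3f}")
```

Since z exceeds the one-tailed 5% critical value of 1.645, the null hypothesis of equal central tendency is rejected in favor of the new drug.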

Page 64: Power 16

Figure 1: One-Tailed Test, 5% Level, Normal Distribution (FREQUENCY vs. Z; critical value Z = 1.645, tail area 5%)