Chapter 6. Exercise 1 X=c(5,8,9,7,14) Y=c(3,1,6,7,19) R function ols(x,y) returns (Intercept)...

Chapter 6

Exercise 1

X=c(5,8,9,7,14)
Y=c(3,1,6,7,19)

R function ols(x,y) returns (Intercept) -8.477876, x (slope) 1.823009.

mean(x)=8.6, mean(y)=7.2
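The coefficients above can be cross-checked against the closed-form least squares formulas. The sketch below is only a check, not the textbook's own code: it uses base R's lm() as a stand-in for ols(), which is not part of base R, and the names slope, intercept and fit are illustrative.

```r
# Exercise 1 data
x <- c(5, 8, 9, 7, 14)
y <- c(3, 1, 6, 7, 19)

# Closed-form least squares estimates
slope <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
intercept <- mean(y) - slope * mean(x)

# Base R fit for comparison (stand-in for ols)
fit <- lm(y ~ x)

print(c(intercept = intercept, slope = slope))
print(coef(fit))
```

Both routes should agree with the values reported by ols(x,y).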

Exercise 2

X=c(5,8,9,7,14)
Y=c(3,1,6,7,19)

Exercise 3

X=c(5,8,9,7,14)
Y=c(3,1,6,7,19)

The sum of squared residuals will be larger for this line than for the least squares regression (LSR) line, because the LSR line is by construction the line that minimizes the sum of squared residuals.
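A small numerical illustration of this point, under stated assumptions: the line used in the exercise is not reproduced in these notes, so the comparison line below (intercept 0, slope 1) is purely hypothetical. The helper ssr() is likewise an illustrative name, and lm() stands in for ols().

```r
x <- c(5, 8, 9, 7, 14)
y <- c(3, 1, 6, 7, 19)

# Sum of squared residuals for an arbitrary line y = b0 + b1 * x
ssr <- function(b0, b1) sum((y - (b0 + b1 * x))^2)

fit <- lm(y ~ x)
b <- unname(coef(fit))

ssr_ls  <- ssr(b[1], b[2])   # least squares line
ssr_alt <- ssr(0, 1)         # hypothetical comparison line (assumption)

print(c(least_squares = ssr_ls, alternative = ssr_alt))
```

Any other choice of intercept and slope would give a sum of squared residuals at least as large as ssr_ls.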

Exercise 4

Exercise 5

a=c(3,104,50,9,68,29,74,11,18,39,0,56,54,77,14,32,34,13,96,84,5,4,18,76,34,14,9,28,7,11,21,30,26,2,11,12,6,3,3,47,19,2,25,37,11,14,0)
b=c(0,5,0,0,0,6,0,1,1,2,17,0,3,6,4,2,4,2,0,0,13,9,1,4,2,0,4,0,4,6,4,4,1,6,6,13,3,1,0,3,1,6,1,0,2,11,3)

The R function ols(a,b) returns (Intercept) 4.58061839, x (slope) -0.04051423.

Exercise 6

c=c(300,280,305,340,348,357,380,397,453,456,510,535,275,270,335,342,354,394,383,450,446,513,520,520)
d=c(32.75,28,30.75,29,27,31.20,27,27,23.50,21,21.5,22.8,30.75,27.25,31,26.50,23.50,22.70,25.80,27.80,21.50,22.50,20.60,21)

ols(c,d) yields (Intercept) 39.99094634, x (slope) -0.03565283.

Higher levels of solar radiation predict lower rates of cancer.

Exercise 7

a=c(500,530,590,660,610,700,570,640)
b=c(2.3,3.1,2.6,3.0,2.4,3.3,2.6,3.5)

R function ols(a,b) returns (Intercept) 0.484615385 X (slope) 0.003942308

Exercise 8

R function ols(a,b) returns

$coef
               Estimate  Std. Error   t value  Pr(>|t|)
(Intercept) 0.484615385 1.289275061 0.3758821 0.7199360
x           0.003942308 0.002137246 1.8445735 0.1146492

$Ftest.p.value
    value
0.1146492

$R.squared
[1] 0.3618685

R.squared is 0.36, which means that SAT accounts for about 36% of the variance in GPA. This gives an indication of the strength of the association.
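For simple regression with one predictor, R.squared is the squared Pearson correlation between the two variables. The following sketch checks this with base R, using lm() as a stand-in for ols(); the variable names r2_model and r2_cor are illustrative.

```r
a <- c(500, 530, 590, 660, 610, 700, 570, 640)   # SAT
b <- c(2.3, 3.1, 2.6, 3.0, 2.4, 3.3, 2.6, 3.5)   # GPA

fit <- lm(b ~ a)

r2_model <- summary(fit)$r.squared   # proportion of variance explained
r2_cor   <- cor(a, b)^2              # squared Pearson correlation

print(c(model = r2_model, correlation = r2_cor))
```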

Exercise 9

x=c(40,41,42,43,44,45,46)
y=c(1.62,1.63,1.90,2.64,2.05,2.13,1.94)

ols(x,y)

$coef
               Estimate Std. Error    t value  Pr(>|t|)
(Intercept) -1.25321429 2.73157319 -0.4587885 0.6656396
x            0.07535714 0.06345636  1.1875429 0.2883482

$Ftest.p.value
    value
0.2883482

$R.squared
[1] 0.2200002

Exercise 10

c=c(300,280,305,340,348,357,380,397,453,456,510,535,275,270,335,342,354,394,383,450,446,513,520,520)
d=c(32.75,28,30.75,29,27,31.20,27,27,23.50,21,21.5,22.8,30.75,27.25,31,26.50,23.50,22.70,25.80,27.80,21.50,22.50,20.60,21)

ols(c,d) yields (Intercept) 39.99094634, x (slope) -0.03565283.

600 exceeds the range of X values (270 to 535), so the prediction is based on extrapolation. The relationship between the variables may change at extreme values.
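Whether a prediction is an extrapolation can be checked by comparing the new X value with the observed range. The sketch below does this in base R; the names solar, rate, x_new and outside are illustrative renamings (not from the original), and lm() is used as a stand-in for ols().

```r
# Same data as vectors c and d above, renamed for readability (assumption)
solar <- c(300, 280, 305, 340, 348, 357, 380, 397, 453, 456, 510, 535,
           275, 270, 335, 342, 354, 394, 383, 450, 446, 513, 520, 520)
rate  <- c(32.75, 28, 30.75, 29, 27, 31.20, 27, 27, 23.50, 21, 21.5, 22.8,
           30.75, 27.25, 31, 26.50, 23.50, 22.70, 25.80, 27.80, 21.50,
           22.50, 20.60, 21)

fit <- lm(rate ~ solar)

x_new <- 600
# TRUE when x_new lies outside the observed predictor range
outside <- x_new < min(solar) || x_new > max(solar)

pred <- predict(fit, newdata = data.frame(solar = x_new))

print(range(solar))
print(outside)
print(pred)   # a number is returned, but nothing in the data supports it
```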

Exercise 11

mou=c(63.3,60.1,53.6,58.8,67.5,62.5)
time=c(241.5,249.8,246.1,232.4,237.2,238.4)

R function cor.test(mou,time) returns

Pearson's product-moment correlation
t = -0.7872, df = 4, p-value = 0.4752
sample estimates: cor -0.3662634

There is insufficient evidence to conclude that the correlation differs from 0.

qt(0.975,4): [1] 2.776445
pt(-0.7872,4): [1] 0.2375939; for a two-tailed test, 0.2376*2 = 0.475, so p > 0.05.

T = -0.79 does not exceed the critical values of 2.78 or -2.78.
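The test statistic can be reproduced by hand from r using T = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. A minimal sketch in base R, with illustrative variable names:

```r
mou  <- c(63.3, 60.1, 53.6, 58.8, 67.5, 62.5)
time <- c(241.5, 249.8, 246.1, 232.4, 237.2, 238.4)

n <- length(mou)
r <- cor(mou, time)

# Student's t statistic for H0: rho = 0
t_stat <- r * sqrt((n - 2) / (1 - r^2))
p_two  <- 2 * pt(-abs(t_stat), df = n - 2)   # two-tailed p-value
crit   <- qt(0.975, df = n - 2)              # 0.05-level critical value

res <- cor.test(mou, time)                   # built-in version of the same test

print(c(r = r, t = t_stat, p = p_two, critical = crit))
```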

Exercise 12

x=c(1,2,3,4,5,6)
y=c(1,4,7,7,4,1)

ols(x,y) returns (Intercept) 4.000000e+00, x (slope) -5.838669e-16 (zero up to rounding error).

The data are consistent with an inverted U shape rather than with the linear model. There is an association here that the straight-line fit does not detect.
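One way to see the association that the straight line misses is to add a quadratic term. This is only an illustrative sketch in base R (lm() in place of ols()); the quadratic model is an assumption made here for demonstration, not something the exercise prescribes.

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- c(1, 4, 7, 7, 4, 1)

lin  <- lm(y ~ x)            # straight line: slope is essentially 0
quad <- lm(y ~ x + I(x^2))   # adds curvature (illustrative choice)

slope_lin <- unname(coef(lin)[2])
curv      <- unname(coef(quad)[3])   # negative for an inverted U

print(c(linear_slope = slope_lin, quadratic_term = curv))
print(c(r2_linear = summary(lin)$r.squared,
        r2_quadratic = summary(quad)$r.squared))
```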

Exercise 13

The LSR slope is 0 even though there is a clear linear trend in the data; the trend is masked by a single outlier, the last point (6, 2).

x=c(1,2,3,4,5,6)
y=c(4,5,6,7,8,2)
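The effect of the outlier can be seen by refitting without the last point. A short sketch in base R, with lm() as a stand-in for ols() and illustrative variable names:

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- c(4, 5, 6, 7, 8, 2)

# Slope using all six points
slope_all <- unname(coef(lm(y ~ x))[2])

# Slope after dropping the outlying last point (6, 2)
keep <- x < 6
slope_trim <- unname(coef(lm(y[keep] ~ x[keep]))[2])

print(c(all_points = slope_all, without_outlier = slope_trim))
```

The first five points lie exactly on a line with slope 1, so one observation is enough to flatten the fitted line completely.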

Exercise 14

The nature of the relationship between two variables can vary with the predictor value. In other words, the association between Y and X can change as a function of X values. Extrapolating beyond the data range, therefore, can be problematic, even when the association appears to be linear. In non-linear associations, the LSR line can be misleading.

Exercise 15

age=c(5.2,8.8,10.5,10.6,10.4,1.8,12.7,15.6,5.8,1.9,2.2,4.8,7.9,5.2,0.9,11.8,7.9,1.5,10.6,8.5,11.1,12.8,11.3,1,14.5,11.9,8.1,13.8,15.5,9.8,11.0,14.4,11.1,5.1,4.8,4.2,6.9,13.2,9.9,12.5,13.2,8.9,10.8)
cpep=c(4.8,4.1,5.2,5.5,5,3.4,3.4,4.9,5.6,3.7,3.9,4.5,4.8,4.9,3.0,4.6,4.8,5.5,4.5,5.3,4.7,6.6,5.1,3.9,5.7,5.1,5.2,3.7,4.9,4.8,4.4,5.2,5.1,4.6,3.9,5.1,5.1,6.0,4.9,4.1,4.6,4.9,5.1)

R function cor(age,cpep) returns
[1] 0.3906776

R function hc4test(age,cpep) returns

$test
[1] 4.705966

$p.value
[1] 0.03005811

Thus, r=0.39, and hc4test rejects at the 0.05 level.

Exercise 16

age=c(5.2,8.8,10.5,10.6,10.4,1.8,12.7,15.6,5.8,1.9,2.2,4.8,7.9,5.2,0.9,11.8,7.9,1.5,10.6,8.5,11.1,12.8,11.3,1,14.5,11.9,8.1,13.8,15.5,9.8,11.0,14.4,11.1,5.1,4.8,4.2,6.9,13.2,9.9,12.5,13.2,8.9,10.8)
cpep=c(4.8,4.1,5.2,5.5,5,3.4,3.4,4.9,5.6,3.7,3.9,4.5,4.8,4.9,3.0,4.6,4.8,5.5,4.5,5.3,4.7,6.6,5.1,3.9,5.7,5.1,5.2,3.7,4.9,4.8,4.4,5.2,5.1,4.6,3.9,5.1,5.1,6.0,4.9,4.1,4.6,4.9,5.1)

ols(age[age<7],cpep[age<7])

$coef
             Estimate Std. Error  t value     Pr(>|t|)
(Intercept) 3.5148814 0.37014633 9.495924 6.244186e-07
x           0.2474008 0.08924835 2.772049 1.689761e-02

ols(age[age>7],cpep[age>7])

$coef
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 4.7535568 0.64125948 7.4128445 5.654828e-08
x           0.0132083 0.05550626 0.2379606 8.137083e-01

C-peptide concentrations increase with age up to about age 7 (slope 0.25, p = 0.017). Beyond that age the regression line is nearly flat (slope 0.013, p = 0.81). Using a single line or a single correlation to describe the relationship is therefore misleading.
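The split-sample comparison can be sketched in base R as follows, with lm() and its subset argument standing in for the two ols() calls. The names fit_young, fit_old, slope_young and slope_old are illustrative.

```r
age <- c(5.2, 8.8, 10.5, 10.6, 10.4, 1.8, 12.7, 15.6, 5.8, 1.9, 2.2, 4.8,
         7.9, 5.2, 0.9, 11.8, 7.9, 1.5, 10.6, 8.5, 11.1, 12.8, 11.3, 1,
         14.5, 11.9, 8.1, 13.8, 15.5, 9.8, 11.0, 14.4, 11.1, 5.1, 4.8, 4.2,
         6.9, 13.2, 9.9, 12.5, 13.2, 8.9, 10.8)
cpep <- c(4.8, 4.1, 5.2, 5.5, 5, 3.4, 3.4, 4.9, 5.6, 3.7, 3.9, 4.5, 4.8,
          4.9, 3.0, 4.6, 4.8, 5.5, 4.5, 5.3, 4.7, 6.6, 5.1, 3.9, 5.7, 5.1,
          5.2, 3.7, 4.9, 4.8, 4.4, 5.2, 5.1, 4.6, 3.9, 5.1, 5.1, 6.0, 4.9,
          4.1, 4.6, 4.9, 5.1)

# Separate least squares fits below and above age 7
fit_young <- lm(cpep ~ age, subset = age < 7)
fit_old   <- lm(cpep ~ age, subset = age > 7)

slope_young <- unname(coef(fit_young)[2])
slope_old   <- unname(coef(fit_old)[2])

print(c(n_young = sum(age < 7), n_old = sum(age > 7)))
print(c(slope_young = slope_young, slope_old = slope_old))
```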

Exercise 17

size=c(2359,3397,1232,2608,4870,4225,1390,2028,3700,2949,688,3147,4000,4180,3883,1937,2565,2722,4231,1488,4261,1613,2746,1550,3000,1743,2388,4522)
price=c(510,690,365,592,1125,850,363,559,860,695,182,860,1050,675,859,435,555,525,805,369,930,375,670,290,715,365,610,1290)

R Function ols(size,price) returns (Intercept) 38.1921217 X (Slope) 0.2153008

Taken literally, the intercept says that a home of size 0 costs 38.192, which makes no sense. This illustrates how non-linear relationships can make the regression line misleading. Extrapolation beyond the range of the data can be problematic.

Exercise 18

lot=c(18200,12900,10060,14500,76670,22800,10880,10880,23090,10875,3498,42689,17790,38330,18460,17000,15710,14180,19840,9150,40511,9060,15038,5807,16000,3173,24000,16600)
price=c(510,690,365,592,1125,850,363,559,860,695,182,860,1050,675,859,435,555,525,805,369,930,375,670,290,715,365,610,1290)

R function ols(lot,price) returns

                Estimate   Std. Error  t value     Pr(>|t|)
(Intercept) 436.83367567 66.609568133 6.558122 5.927679e-07
x (slope)     0.01104288  0.002754693 4.008752 4.569549e-04

Exercise 19

This would generally be the case when the relationship is linear and homoscedastic.

Exercise 20

x=c(18,20,35,16,12)
y=c(36,29,48,64,18)

R function ols(x,y) returns:

              Estimate Std. Error   t value  Pr(>|t|)
(Intercept) 25.3283679  23.774217 1.0653713 0.3648449
x            0.6768135   1.096856 0.6170485 0.5808715

$Ftest.p.value: 0.5808715

R function cor.test(x,y) returns:

t = 0.617, df = 3, p-value = 0.5809
sample estimates cor: 0.3355929

The two analyses agree: neither is significant. X and Y can still be dependent in nonlinear ways, and with a sample size this small the tests have little power.
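The agreement is not a coincidence: with a single predictor, the classical t test of a zero slope and the t test of a zero Pearson correlation give the same statistic and p-value. A quick check in base R, using lm() as a stand-in for ols() and illustrative variable names:

```r
x <- c(18, 20, 35, 16, 12)
y <- c(36, 29, 48, 64, 18)

# p-value for the slope from the least squares fit
p_slope <- summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]

# p-value from the test of zero correlation
p_cor <- cor.test(x, y)$p.value

print(c(slope_test = p_slope, correlation_test = p_cor))
```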

Exercise 21

x=c(12.2,41,5.4,13,22.6,35.9,7.2,5.2,55,2.4,6.8,29.6,58.7)
y=c(1.8,7.8,0.9,2.6,4.1,6.4,1.3,0.9,9.1,0.7,1.5,4.7,8.2)

R function ols(x,y) returns

             Estimate  Std. Error   t value     Pr(>|t|)
(Intercept) 0.3269323 0.248122843  1.317623 2.144131e-01
x           0.1550843 0.008413901 18.431919 1.280856e-09

The estimate of the slope is 0.155 with a SE of 0.0084. With the 13 observations listed above, the residual degrees of freedom are 13 - 2 = 11, and the 0.975 quantile of T with 11 df is:

> qt(0.975,11)
[1] 2.200985

The 0.95 confidence interval for the slope is therefore 0.155 ± 2.201 × 0.0084 = (0.137, 0.174).

The scatterplot suggests that X and Y increase together, but the same confidence interval can arise in situations where that is not the case.
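The interval for the slope can be sketched in base R as estimate ± quantile × standard error, with the degrees of freedom taken from the fitted model itself. Here lm() stands in for ols(), and confint() is used only as a cross-check; the variable names are illustrative.

```r
x <- c(12.2, 41, 5.4, 13, 22.6, 35.9, 7.2, 5.2, 55, 2.4, 6.8, 29.6, 58.7)
y <- c(1.8, 7.8, 0.9, 2.6, 4.1, 6.4, 1.3, 0.9, 9.1, 0.7, 1.5, 4.7, 8.2)

fit <- lm(y ~ x)
tab <- summary(fit)$coefficients

est <- tab["x", "Estimate"]
se  <- tab["x", "Std. Error"]
df  <- df.residual(fit)          # n - 2 for simple regression

# 0.95 confidence interval for the slope, computed by hand
ci_manual <- est + c(-1, 1) * qt(0.975, df) * se

# Built-in equivalent
ci_builtin <- as.numeric(confint(fit, "x", level = 0.95))

print(df)
print(ci_manual)
print(ci_builtin)
```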

Exercise 22

x=c(34,49,49,44,66,48,49,39,54,57,39,65,43,43,44,42,71,40,41,38,42,77,40,38,43,42,36,55,57,57,41,66,69,38,49,51,45,141,133,76,44,40,56,50,75,44,181,45,61,15,23,42,61,146,144,89,71,83,49,43,68,57,60,56,63,136,49,57,64,43,71,38,74,84,75,64,48)
y=c(129,107,91,110,104,101,105,125,82,92,104,134,105,95,101,104,105,122,98,104,95,93,105,132,98,112,95,102,72,103,102,102,80,125,93,105,79,125,102,91,58,104,58,129,58,90,108,95,85,84,77,85,82,82,111,58,99,77,102,82,95,95,82,72,93,114,108,95,72,95,68,119,84,75,75,122,127)

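R function ols(x,y) returns

$coef
               Estimate Std. Error    t value     Pr(>|t|)
(Intercept) 97.95728197 4.73432147 20.6908809 9.985891e-33
x (slope)   -0.02136595 0.07096758 -0.3010664 7.641969e-01

With 77 observations there are 77 - 2 = 75 degrees of freedom:

qt(0.975,df=75)
[1] 1.99

The 0.95 confidence interval for the slope is $-0.02 \pm 1.99(0.07) = (-0.16,\ 0.12)$.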

Exercise 23

khomreg(size,price)
$test [1,] 6.115014
$p.value [1,] 0.01340384

khomreg(lot,price)
$test [1,] 0.1683221
$p.value [1,] 0.6816073

We reject homoscedasticity for house size (p = 0.013) but not for lot size (p = 0.68). The test may not have sufficient power to detect heteroscedasticity, so when we fail to reject it is difficult to draw conclusions.

Exercise 24

ols(x,y)

               Estimate Std. Error    t value    Pr(>|t|)
(Intercept) 65.46175413 18.4508380  3.5479014 0.000673844
x (slope)   -0.05649584  0.1876524 -0.3010664 0.764196940

(The slope is close to 0 with p = 0.764, so OLS does not reject.)

$Ftest.p.value
    value
0.7641969 (book has typo)

rqfit(x,y)

$coef
(Intercept)          x
 95.2000000 -0.4333333

$ci
              lower bd   upper bd
(Intercept) 64.4610733 105.972735
x (slope)   -0.5505706  -0.1450298

(The CI for the slope does not contain 0, so rqfit rejects.)

As is evident in the OLS scatterplot, there are several outliers at X values between 100 and 130. Because OLS minimizes the sum of squared residuals, these outliers pull the regression line upward until it is nearly horizontal. rqfit models the median of Y rather than the mean. It is therefore much less sensitive to these outliers, and its regression line (in blue) passes through the middle (the 0.5 quantile of Y given X) of the bulk of the observations.

regplot(y,x,regfun=rqfit)

ols(y,x)

Exercise 25

X=c(2300,750,4300,2600,6000, 10500, 10000, 17000, 5400, 7000, 9400, 32000, 35000, 100000, 100000, 52000, 100000, 4400, 3000, 4000, 1500, 9000, 5300, 10000, 19000, 27000, 28000, 31000, 26000, 21000, 79000, 100000,100000)

Y=c(65,156,100,134,16,108,121,4,39,143,56,26,22,1,1,5,65,56,65,17,7,16,22,3,4,2,3,8,4,3,30,4,43)

ols(X,Y)

$coef
                 Estimate   Std. Error   t value     Pr(>|t|)
(Intercept) 53.8899623928 1.027986e+01  5.242286 1.072131e-05
x           -0.0004461206 2.296306e-04 -1.942775 6.117379e-02

$Ftest.p.value
0.06117379

olshc4(X,Y)

$ci
            Coef.     Estimates      ci.lower      ci.upper      p-value    Std.Error
(Intercept)     0 53.8899623928 30.5619402421  7.721798e+01 4.902827e-05 1.143803e+01
Slope           1 -0.0004461206 -0.0008776261 -1.461508e-05 4.315956e-02 2.115728e-04

olshc4 rejects at the 0.05 level (p = 0.043), whereas ols does not (p = 0.061). olshc4 has a smaller standard error for the slope (2.12e-04 versus 2.30e-04).
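To make the comparison concrete, here is a rough base R sketch of a heteroscedasticity-consistent standard error of the HC4 type. It assumes that olshc4 is based on the HC4 estimator with exponent min(4, n * h_i / p), where h_i are the leverages and p is the number of coefficients, and that p-values use a t distribution with n - p degrees of freedom; this is an assumption about the function, not a reproduction of its code. All variable names are illustrative.

```r
X <- c(2300, 750, 4300, 2600, 6000, 10500, 10000, 17000, 5400, 7000, 9400,
       32000, 35000, 100000, 100000, 52000, 100000, 4400, 3000, 4000, 1500,
       9000, 5300, 10000, 19000, 27000, 28000, 31000, 26000, 21000, 79000,
       100000, 100000)
Y <- c(65, 156, 100, 134, 16, 108, 121, 4, 39, 143, 56, 26, 22, 1, 1, 5, 65,
       56, 65, 17, 7, 16, 22, 3, 4, 2, 3, 8, 4, 3, 30, 4, 43)

fit <- lm(Y ~ X)
M <- model.matrix(fit)     # n x 2 design matrix (intercept, X)
e <- residuals(fit)
h <- hatvalues(fit)        # leverages
n <- nrow(M)
p <- ncol(M)

# HC4-type weights (assumed form): e_i^2 / (1 - h_i)^delta_i
delta <- pmin(4, n * h / p)
w <- e^2 / (1 - h)^delta

# Sandwich estimator: (M'M)^-1 M' diag(w) M (M'M)^-1
bread <- solve(crossprod(M))
meat  <- t(M) %*% (M * w)
vc_hc4 <- bread %*% meat %*% bread

se_ols <- sqrt(diag(vcov(fit)))   # classical standard errors
se_hc4 <- sqrt(diag(vc_hc4))      # heteroscedasticity-consistent version
p_hc4  <- 2 * pt(-abs(coef(fit) / se_hc4), df = n - p)

print(rbind(se_ols, se_hc4))
print(p_hc4)
```

The two rows of standard errors can then be compared directly with the Std. Error columns reported by ols and olshc4.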

The data can be accessed by library(MASS)
