STAT 111 Recitation 8

Linjun Zhang

March 17, 2017

- Midterm grades will be posted next Tuesday or Wednesday.

- The slides can be found at http://stat.wharton.upenn.edu/~linjunz/

- Send me an email at linjunz@wharton.upenn.edu if you have any feedback (e.g., less review, more practice problems?).

Confidence intervals

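A general formula. For a parameter $\theta$, suppose we estimate it by $\hat{\theta}$. Then an approximate 95% confidence interval for $\theta$ is

$$\hat{\theta} - 2\,\widehat{\mathrm{s.d.}}(\hat{\theta}) \quad\text{to}\quad \hat{\theta} + 2\,\widehat{\mathrm{s.d.}}(\hat{\theta}),$$

where $\mathrm{s.d.}(\hat{\theta})$ is the standard deviation of $\hat{\theta}$, and $\widehat{\mathrm{s.d.}}(\hat{\theta})$ is the estimate of $\mathrm{s.d.}(\hat{\theta})$.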

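- If $X$ has a binomial distribution $\mathrm{Binomial}(n, \theta)$ and we observe $X = x$, a (conservative) approximate 95% confidence interval for $\theta$ is

$$p - \frac{1}{\sqrt{n}} \quad\text{to}\quad p + \frac{1}{\sqrt{n}}, \qquad\text{where } p = \frac{x}{n}.$$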
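A minimal Python sketch of the conservative binomial interval, assuming it takes the usual form $p \pm 1/\sqrt{n}$ with $p = x/n$. That form follows from the general formula because the variance of $X/n$ is $\theta(1-\theta)/n \le 1/(4n)$, so two standard deviations are at most $1/\sqrt{n}$. The function name and the example counts are illustrative and do not come from the slides.

```python
import math


def conservative_binomial_ci(x, n):
    """Conservative approximate 95% CI for a binomial proportion: p +/- 1/sqrt(n)."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p = x / n                      # observed proportion
    half_width = 1 / math.sqrt(n)  # upper bound on 2 * s.d.(X/n)
    return p - half_width, p + half_width


# Made-up example: 45 successes in 100 trials.
low, high = conservative_binomial_ci(x=45, n=100)
print(f"{low:.2f} to {high:.2f}")  # 0.35 to 0.55
```

The interval is called conservative because the bound $1/(4n)$ is attained only at $\theta = 1/2$; for other values of $\theta$ the interval is wider than strictly necessary.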

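- If $X_1, X_2, \ldots, X_n$ are i.i.d. with mean $\mu$ and variance $\sigma^2$, and we observe $x_1, x_2, \ldots, x_n$, an approximate 95% confidence interval for $\mu$ is

$$\bar{x} - \frac{2s}{\sqrt{n}} \quad\text{to}\quad \bar{x} + \frac{2s}{\sqrt{n}},$$

where $\bar{x} = \dfrac{x_1 + \cdots + x_n}{n}$ and $s^2 = \dfrac{x_1^2 + \cdots + x_n^2 - n\bar{x}^2}{n-1}$.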
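The interval for a mean, $\bar{x} \pm 2s/\sqrt{n}$ with $s^2 = (\sum x_i^2 - n\bar{x}^2)/(n-1)$, can be sketched in Python as follows. The function name and the sample values are assumptions made for illustration only.

```python
import math


def mean_ci(xs):
    """Approximate 95% CI for the mean: xbar +/- 2 * s / sqrt(n)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = sum(xs) / n
    # Shortcut form of the sample variance: (sum of squares - n * xbar^2) / (n - 1).
    s_sq = (sum(x * x for x in xs) - n * xbar ** 2) / (n - 1)
    half_width = 2 * math.sqrt(s_sq / n)
    return xbar - half_width, xbar + half_width


# Made-up data: xbar = 6 and s^2 = 2.5.
low, high = mean_ci([4.0, 5.0, 6.0, 7.0, 8.0])
print(f"{low:.3f} to {high:.3f}")
```

The shortcut variance formula subtracts two large, nearly equal numbers, so $\bar{x}$ should be carried at full precision rather than rounded before squaring.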

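- If $X_1$ has a binomial distribution $\mathrm{Binomial}(n_1, \theta_1)$, $X_2$ has a binomial distribution $\mathrm{Binomial}(n_2, \theta_2)$, and we observe $X_1 = x_1$, $X_2 = x_2$, a (conservative) approximate 95% confidence interval for $\theta_1 - \theta_2$ is

$$p_1 - p_2 - \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \quad\text{to}\quad p_1 - p_2 + \sqrt{\frac{1}{n_1} + \frac{1}{n_2}},$$

where $p_i = \dfrac{x_i}{n_i}$, for $i = 1, 2$.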
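A short Python sketch for the difference of two proportions, assuming the conservative interval is $p_1 - p_2 \pm \sqrt{1/n_1 + 1/n_2}$ (each variance $\theta_i(1-\theta_i)/n_i$ bounded by $1/(4n_i)$, and $X_1$, $X_2$ independent). The function name and the counts are hypothetical.

```python
import math


def conservative_diff_proportions_ci(x1, n1, x2, n2):
    """Conservative approximate 95% CI for theta1 - theta2."""
    if n1 <= 0 or n2 <= 0:
        raise ValueError("sample sizes must be positive")
    p1, p2 = x1 / n1, x2 / n2
    # 2 * sqrt(1/(4*n1) + 1/(4*n2)) simplifies to sqrt(1/n1 + 1/n2).
    half_width = math.sqrt(1 / n1 + 1 / n2)
    diff = p1 - p2
    return diff - half_width, diff + half_width


# Made-up counts: 60 of 100 versus 45 of 100.
low, high = conservative_diff_proportions_ci(60, 100, 45, 100)
print(f"{low:.3f} to {high:.3f}")
```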

Estimating the difference between two means

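If $X_{11}, X_{12}, \ldots, X_{1n}$ are i.i.d. with mean $\mu_1$ and variance $\sigma_1^2$, $X_{21}, X_{22}, \ldots, X_{2m}$ are i.i.d. with mean $\mu_2$ and variance $\sigma_2^2$, the two samples are independent of each other, and we observe $x_{11}, \ldots, x_{1n}$, $x_{21}, \ldots, x_{2m}$, what can we say about $\mu_1 - \mu_2$?

- We estimate $\mu_1 - \mu_2$ by $\bar{x}_1 - \bar{x}_2$, where $\bar{x}_1 = \dfrac{x_{11} + \cdots + x_{1n}}{n}$ and $\bar{x}_2 = \dfrac{x_{21} + \cdots + x_{2m}}{m}$.

- The variance of $\bar{X}_1 - \bar{X}_2$ is $\dfrac{\sigma_1^2}{n} + \dfrac{\sigma_2^2}{m}$, and we estimate $\sigma_1^2$ and $\sigma_2^2$ by

$$s_1^2 = \frac{x_{11}^2 + \cdots + x_{1n}^2 - n\bar{x}_1^2}{n-1}, \qquad s_2^2 = \frac{x_{21}^2 + \cdots + x_{2m}^2 - m\bar{x}_2^2}{m-1}.$$

- An approximate 95% confidence interval for $\mu_1 - \mu_2$ is

$$\bar{x}_1 - \bar{x}_2 - 2\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{m}} \quad\text{to}\quad \bar{x}_1 - \bar{x}_2 + 2\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{m}}.$$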
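The two-sample recipe, $\bar{x}_1 - \bar{x}_2 \pm 2\sqrt{s_1^2/n + s_2^2/m}$, might be sketched in Python like this. The helper names and the tiny samples are made up for illustration. The variance is computed from centered deviations, which is algebraically the same as the shortcut formula but less sensitive to rounding.

```python
import math


def mean_and_variance(xs):
    """Return the sample mean and the sample variance (divisor n - 1)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = sum(xs) / n
    s_sq = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    return xbar, s_sq


def diff_of_means_ci(sample1, sample2):
    """Approximate 95% CI for mu1 - mu2 from two independent samples."""
    xbar1, s1_sq = mean_and_variance(sample1)
    xbar2, s2_sq = mean_and_variance(sample2)
    half_width = 2 * math.sqrt(s1_sq / len(sample1) + s2_sq / len(sample2))
    estimate = xbar1 - xbar2
    return estimate - half_width, estimate + half_width


# Made-up samples: means 12 and 10, variances 4 and 20/3.
low, high = diff_of_means_ci([10, 12, 14], [7, 9, 11, 13])
print(f"{low:.3f} to {high:.3f}")
```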

Practice problem

Question

We are interested in investigating any potential difference between the mean blood

sugar level of diabetics (µ1) and that of non-diabetics (µ2). To do this we took a

sample of six diabetics and found the following blood sugar levels: 127, 144, 140, 136,

118, 138. We also took a sample of eight non-diabetics and found the following blood

sugar levels: 125, 128, 133, 141, 109, 125, 126, 122. (a) Estimate µ1 − µ2. (b) Find

two numbers between which we are about 95% certain that µ1 − µ2 lies.

Solution

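(a) The estimates of the two means and variances are

$$\hat{\mu}_1 = \bar{x}_1 = \tfrac{1}{6}(127 + 144 + 140 + 136 + 118 + 138) = 133.83,$$

$$\hat{\sigma}_1^2 = s_1^2 = \tfrac{1}{6-1}\left(127^2 + 144^2 + 140^2 + 136^2 + 118^2 + 138^2 - 6 \times 133.83^2\right) = 93.24,$$

$$\hat{\mu}_2 = \bar{x}_2 = \tfrac{1}{8}(125 + 128 + 133 + 141 + 109 + 125 + 126 + 122) = 126.13,$$

$$\hat{\sigma}_2^2 = s_2^2 = \tfrac{1}{8-1}\left(125^2 + 128^2 + 133^2 + 141^2 + 109^2 + 125^2 + 126^2 + 122^2 - 8 \times 126.125^2\right) = 83.55,$$

so the estimate of $\mu_1 - \mu_2$ is $\bar{x}_1 - \bar{x}_2 = 7.71$. The value 93.24 comes from plugging the rounded mean 133.83 into the shortcut formula; with the unrounded mean $803/6$ the same formula gives $s_1^2 = 92.17$.

(b) The approximate 95% confidence interval is given as $\bar{x}_1 - \bar{x}_2 \pm 2\sqrt{\dfrac{s_1^2}{n} + \dfrac{s_2^2}{m}}$ with $n = 6$ and $m = 8$, which is $-2.49$ to $17.90$ using $s_1^2 = 93.24$ (or $-2.45$ to $17.87$ using $s_1^2 = 92.17$).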
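The arithmetic can be checked with a few lines of Python. This sketch uses the standard-library `statistics` module, whose `variance` function uses the divisor $n - 1$; the variable names are illustrative. Because it works with the unrounded mean, it gives $s_1^2 \approx 92.17$ rather than the 93.24 obtained from the rounded mean 133.83, and the interval shifts slightly as a result.

```python
import math
import statistics

# Blood sugar levels from the practice problem.
diabetics = [127, 144, 140, 136, 118, 138]
non_diabetics = [125, 128, 133, 141, 109, 125, 126, 122]

xbar1 = statistics.mean(diabetics)
xbar2 = statistics.mean(non_diabetics)
s1_sq = statistics.variance(diabetics)       # sample variance, divisor n - 1
s2_sq = statistics.variance(non_diabetics)   # sample variance, divisor m - 1

estimate = xbar1 - xbar2
half_width = 2 * math.sqrt(s1_sq / len(diabetics) + s2_sq / len(non_diabetics))
low, high = estimate - half_width, estimate + half_width

print(f"estimate of mu1 - mu2: {estimate:.2f}")      # 7.71
print(f"s1^2 = {s1_sq:.2f}, s2^2 = {s2_sq:.2f}")     # 92.17, 83.55
print(f"approximate 95% CI: {low:.2f} to {high:.2f}")  # -2.45 to 17.87
```

Either way the interval contains 0, so these data do not rule out equal mean blood sugar levels in the two groups.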

Regression

Suppose we observe $n$ data points $(x_i, y_i)$, $i = 1, 2, \ldots, n$, and there appears to be a linear relationship between the random variables $X_i$ and $Y_i$, $i = 1, 2, \ldots, n$, i.e.

$$Y_i = \alpha + \beta X_i + \varepsilon_i,$$

where $\varepsilon_i$ denotes the noise term (we assume that each $y_i$ is observed with noise $\varepsilon_i$ that has mean 0 and variance $\sigma^2$).

- We can view $Y$ as some random, non-controllable quantity and $X$ as some non-random, controllable quantity.

Examples:

- $Y$ is the growth height of a tree, and $X$ is the amount of water.

- In a basketball game analysis, $Y$ is the points scored by a player and $X$ is the minutes that player played.

Regression: before/after the experiment

- Before the experiment
  - Think about the random variables $Y_1, Y_2, \ldots, Y_n$.
  - $Y_1$ corresponds to $x_1$, $Y_2$ corresponds to $x_2$, and so on.
  - The mean of $Y_i$ is $\alpha + \beta x_i$ and the variance of $Y_i$ is $\sigma^2$.
  - The various $Y_i$ are independent but not identically distributed.

- After the experiment
  - Obtain the observed values $y_1, y_2, \ldots, y_n$.
  - Plot the points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ in the x-y plane.

Regression: auxiliary quantities

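$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

$$s_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$$

$$s_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2$$

$$s_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - n\bar{x}\,\bar{y}$$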
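As a rough illustration, the sums $s_{xx}$, $s_{yy}$, $s_{xy}$ can be computed in Python in both forms, the centered sums and the shortcut expressions, to confirm that they agree. The function names and the four data points are made up for this sketch.

```python
def regression_sums(xs, ys):
    """Return (xbar, ybar, sxx, syy, sxy) using the centered definitions."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    return xbar, ybar, sxx, syy, sxy


def regression_sums_shortcut(xs, ys):
    """Same quantities via the shortcut forms, e.g. sum(x^2) - n * xbar^2."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum(x * x for x in xs) - n * xbar ** 2
    syy = sum(y * y for y in ys) - n * ybar ** 2
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar
    return xbar, ybar, sxx, syy, sxy


# Made-up data: xbar = 2.5, ybar = 4.5, sxx = 5, syy = 13, sxy = 8.
xs, ys = [1, 2, 3, 4], [2, 4, 5, 7]
print(regression_sums(xs, ys))
print(regression_sums_shortcut(xs, ys))
```

The shortcut forms are convenient for hand calculation; the centered forms are usually the safer choice in floating-point arithmetic.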

Regression: estimating α, β, σ2

Unbiased estimates:

- Estimate $\beta$ by $b = \dfrac{s_{xy}}{s_{xx}}$.

- Estimate $\alpha$ by $a = \bar{y} - b\bar{x}$.

- Estimate $\sigma^2$ by $s_r^2 = \dfrac{s_{yy} - b^2 s_{xx}}{n-2}$.

- Estimate the regression line by $y = a + bx$.
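The estimates $b = s_{xy}/s_{xx}$, $a = \bar{y} - b\bar{x}$ and $s_r^2 = (s_{yy} - b^2 s_{xx})/(n-2)$ can be sketched as one Python function. The name `fit_line` and the example data are assumptions for illustration, not part of the slides.

```python
def fit_line(xs, ys):
    """Return (a, b, s_r_sq): intercept, slope, and the estimate of sigma^2."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three (x, y) pairs of equal length")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    b = sxy / sxx                            # estimate of beta
    a = ybar - b * xbar                      # estimate of alpha
    s_r_sq = (syy - b * b * sxx) / (n - 2)   # estimate of sigma^2
    return a, b, s_r_sq


# Made-up data: sxx = 5, syy = 13, sxy = 8, so b = 1.6, a = 0.5, s_r^2 = 0.1.
a, b, s_r_sq = fit_line([1, 2, 3, 4], [2, 4, 5, 7])
print(f"y = {a:.3f} + {b:.3f} x, s_r^2 = {s_r_sq:.3f}")
```

The divisor $n - 2$ reflects the two parameters $\alpha$ and $\beta$ estimated from the data, which is why at least three points are required here.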

Regression: practice problem

Practice Problem

Suppose we have observations of average income and total pizza sales for a

1-month period for eight different towns:

Estimate the mean pizza sales of a town with income x via the formula

“estimated mean = a+bx”. (That is, calculate a and b.)

Solution

- $\bar{x} = 10$, $\bar{y} = 43.625$, $s_{xx} = 210$, $s_{yy} = 1829.875$, $s_{xy} = 610$.

- $b = \dfrac{s_{xy}}{s_{xx}} = \dfrac{610}{210} = 2.905$; $a = \bar{y} - b\bar{x} = 43.625 - 2.904762 \times 10 = 14.57738$.
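A quick Python check of these numbers, starting from the summary statistics quoted in the solution and $n = 8$ towns. The estimate $s_r^2$ is not computed on the slide; it is included here only as an application of the formula $s_r^2 = (s_{yy} - b^2 s_{xx})/(n-2)$. The function name `estimated_mean_sales` is hypothetical.

```python
# Summary statistics quoted in the solution (n = 8 towns).
n = 8
xbar, ybar = 10.0, 43.625
sxx, syy, sxy = 210.0, 1829.875, 610.0

b = sxy / sxx                           # slope
a = ybar - b * xbar                     # intercept
s_r_sq = (syy - b ** 2 * sxx) / (n - 2)  # estimate of sigma^2 (not on the slide)


def estimated_mean_sales(income):
    """Estimated mean pizza sales for a town with the given income: a + b * x."""
    return a + b * income


print(f"b = {b:.6f}, a = {a:.5f}")  # b = 2.904762, a = 14.57738
print(f"s_r^2 = {s_r_sq:.4f}")
print(f"estimated mean at x = xbar: {estimated_mean_sales(xbar):.3f}")
```

The fitted line passes through $(\bar{x}, \bar{y})$, so the estimated mean at $x = 10$ reproduces $\bar{y} = 43.625$.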
