Section 02 Review of Probability and Statistics
Transcript of Section 02 Review of Probability and Statistics
7/28/2019 Section 02 Review of Probability and Statistics
http://slidepdf.com/reader/full/section-02-review-of-probability-and-statistics 1/183
Brief Review
Probability and Statistics
Probability distributions
Continuous distributions
Defn (density function)
Let $x$ denote a continuous random variable; then $f(x)$ is called the density function of $x$ if
1) $f(x) \ge 0$
2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3) $\int_a^b f(x)\,dx = P(a \le x \le b)$
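The three defining properties can be checked numerically for a concrete density. The sketch below (an illustration, not part of the slides; the standard normal density and the midpoint rule are arbitrary choices) verifies properties 2) and 3):

```python
import math

# A sketch checking properties (2) and (3) for one concrete density,
# the standard normal, by simple numerical integration (midpoint rule).
def f(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def integrate(g, a, b, n=100_000):
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, -10, 10)    # property (2): should be ~1
p = integrate(f, -1.96, 1.96)    # property (3): P(-1.96 <= x <= 1.96)
print(round(total, 4), round(p, 4))  # 1.0 0.95
```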
Defn (Joint density function)
Let x = (x_1, x_2, x_3, ..., x_n) denote a vector of continuous random variables; then
f(x) = f(x_1, x_2, x_3, ..., x_n)
is called the joint density function of x = (x_1, x_2, x_3, ..., x_n) if
1) $f(\mathbf{x}) \ge 0$
2) $\int_{\mathbb{R}^n} f(\mathbf{x})\,d\mathbf{x} = 1$
3) $\int_R f(\mathbf{x})\,d\mathbf{x} = P(\mathbf{x} \in R)$
Note:
$\int f(\mathbf{x})\,d\mathbf{x} = \int \cdots \int f(x_1, x_2, \dots, x_n)\,dx_1\,dx_2 \cdots dx_n$
and
$\int_R f(\mathbf{x})\,d\mathbf{x} = \int \cdots \int_R f(x_1, x_2, \dots, x_n)\,dx_1\,dx_2 \cdots dx_n$
Defn (Marginal density function)
The marginal density of x_1 = (x_1, x_2, x_3, ..., x_p) (p < n) is defined by
$f_1(\mathbf{x}_1) = \int f(\mathbf{x})\,d\mathbf{x}_2 = \int f(\mathbf{x}_1, \mathbf{x}_2)\,d\mathbf{x}_2$
where x_2 = (x_{p+1}, x_{p+2}, x_{p+3}, ..., x_n).
The marginal density of x_2 = (x_{p+1}, x_{p+2}, x_{p+3}, ..., x_n) is defined by
$f_2(\mathbf{x}_2) = \int f(\mathbf{x})\,d\mathbf{x}_1 = \int f(\mathbf{x}_1, \mathbf{x}_2)\,d\mathbf{x}_1$
where x_1 = (x_1, x_2, x_3, ..., x_p).
Defn (Conditional density function)
The conditional density of x_1 given x_2 (defined on the previous slide) (p < n) is defined by
$f_{1|2}(\mathbf{x}_1 \mid \mathbf{x}_2) = \frac{f(\mathbf{x}_1, \mathbf{x}_2)}{f_2(\mathbf{x}_2)}$
and the conditional density of x_2 given x_1 is defined by
$f_{2|1}(\mathbf{x}_2 \mid \mathbf{x}_1) = \frac{f(\mathbf{x}_1, \mathbf{x}_2)}{f_1(\mathbf{x}_1)}$
Marginal densities describe how the subvector x_i behaves ignoring x_j.
Conditional densities describe how the subvector x_i behaves when the subvector x_j is held fixed.
Defn (Independence)
The two sub-vectors x_1 and x_2 are called independent if
f(x) = f(x_1, x_2) = f_1(x_1) f_2(x_2) = product of marginals,
or, equivalently, if the conditional density of x_i given x_j satisfies
f_{i|j}(x_i | x_j) = f_i(x_i) = marginal density of x_i.
Example (p-variate Normal)
The random vector x (p × 1) is said to have the p-variate Normal distribution with
mean vector μ (p × 1) and
covariance matrix Σ (p × p)
(written x ~ N_p(μ, Σ)) if:
$f(\mathbf{x}) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\!\left[-\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right]$
Example (bivariate Normal)
The random vector x = (x_1, x_2)' is said to have the bivariate Normal distribution with mean vector
$\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}$
and covariance matrix
$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix} = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}$
if
$f(\mathbf{x}) = \frac{1}{2\pi |\Sigma|^{1/2}} \exp\!\left[-\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right]$
Expanding the quadratic form:
$f(x_1, x_2) = \frac{1}{2\pi (\sigma_{11}\sigma_{22} - \sigma_{12}^2)^{1/2}} \exp\!\left[-\tfrac{1}{2} Q(x_1, x_2)\right]$
where
$Q(x_1, x_2) = (\mathbf{x} - \boldsymbol{\mu})' \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}^{-1} (\mathbf{x} - \boldsymbol{\mu})$
$= \frac{\sigma_{22}(x_1 - \mu_1)^2 - 2\sigma_{12}(x_1 - \mu_1)(x_2 - \mu_2) + \sigma_{11}(x_2 - \mu_2)^2}{\sigma_{11}\sigma_{22} - \sigma_{12}^2}$
Equivalently, in terms of $\sigma_1$, $\sigma_2$ and $\rho$:
$f(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\!\left[-\tfrac{1}{2} Q(x_1, x_2)\right]$
where
$Q(x_1, x_2) = \frac{1}{1 - \rho^2}\left[\left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1 - \mu_1}{\sigma_1}\right)\left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2\right]$
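The two parameterizations of the bivariate Normal density should agree. A small sketch (the parameter values are arbitrary, chosen only for illustration) compares the ρ-form with the general matrix form:

```python
import math
import numpy as np

# Sketch: the rho-parameterized bivariate density should agree with the
# general matrix form (2*pi)^{-1} |S|^{-1/2} exp(-Q/2).
m1, m2, s1, s2, rho = 1.0, -2.0, 1.5, 0.8, 0.6

def f_rho(x1, x2):
    z1, z2 = (x1 - m1) / s1, (x2 - m2) / s2
    Q = (z1 * z1 - 2 * rho * z1 * z2 + z2 * z2) / (1 - rho ** 2)
    return math.exp(-Q / 2) / (2 * math.pi * s1 * s2 * math.sqrt(1 - rho ** 2))

def f_matrix(x1, x2):
    mu = np.array([m1, m2])
    S = np.array([[s1 ** 2, rho * s1 * s2], [rho * s1 * s2, s2 ** 2]])
    d = np.array([x1, x2]) - mu
    Q = d @ np.linalg.inv(S) @ d
    return math.exp(-Q / 2) / (2 * math.pi * math.sqrt(np.linalg.det(S)))

print(abs(f_rho(0.3, -1.1) - f_matrix(0.3, -1.1)) < 1e-12)  # True
```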
Theorem (Transformations)
Let x = ( x1 , x2 , x3 , ... , xn) denote a vector of
continuous random variables with joint density
function f ( x1 , x2 , x3 , ... , xn) = f (x). Let
y_1 = φ_1(x_1, x_2, x_3, ..., x_n)
y_2 = φ_2(x_1, x_2, x_3, ..., x_n)
...
y_n = φ_n(x_1, x_2, x_3, ..., x_n)
define a 1-1 transformation of x into y.
Then the joint density of y is g(y) given by:
g(y) = f(x)|J|, where
$J = \frac{\partial(x_1, x_2, x_3, \dots, x_n)}{\partial(y_1, y_2, y_3, \dots, y_n)} = \det\begin{pmatrix} \frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} & \cdots & \frac{\partial x_1}{\partial y_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial x_n}{\partial y_1} & \frac{\partial x_n}{\partial y_2} & \cdots & \frac{\partial x_n}{\partial y_n} \end{pmatrix}$
= the Jacobian of the transformation.
Corollary (Linear Transformations)
Let x = ( x1 , x2 , x3 , ... , xn) denote a vector of
continuous random variables with joint density
function f ( x1 , x2 , x3 , ... , xn) = f (x). Let
y_1 = a_{11} x_1 + a_{12} x_2 + a_{13} x_3 + ... + a_{1n} x_n
y_2 = a_{21} x_1 + a_{22} x_2 + a_{23} x_3 + ... + a_{2n} x_n
...
y_n = a_{n1} x_1 + a_{n2} x_2 + a_{n3} x_3 + ... + a_{nn} x_n
define a 1-1 transformation of x into y.
Then the joint density of y is g(y) given by:
$g(\mathbf{y}) = f(A^{-1}\mathbf{y})\,\frac{1}{|\det(A)|}$
where
$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}$
Corollary (Linear Transformations for Normal Random variables)
Let x = (x_1, x_2, x_3, ..., x_n) denote a vector of continuous random variables having an n-variate Normal distribution with mean vector μ and covariance matrix Σ,
i.e. x ~ N_n(μ, Σ).
Let
y_1 = a_{11} x_1 + a_{12} x_2 + a_{13} x_3 + ... + a_{1n} x_n
y_2 = a_{21} x_1 + a_{22} x_2 + a_{23} x_3 + ... + a_{2n} x_n
...
y_n = a_{n1} x_1 + a_{n2} x_2 + a_{n3} x_3 + ... + a_{nn} x_n
define a 1-1 transformation of x into y.
Then y = (y_1, y_2, y_3, ..., y_n) ~ N_n(Aμ, AΣA').
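This corollary can be checked empirically by sampling. A sketch (arbitrary μ, Σ and A, chosen for illustration) verifying that y = Ax has mean Aμ and covariance AΣA':

```python
import numpy as np

# Sketch: empirical check that y = A x has mean A mu and covariance A S A'
# when x ~ N(mu, S); the particular mu, S and A are arbitrary choices.
rng = np.random.default_rng(0)
mu = np.array([1.0, -1.0])
S = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[1.0, 2.0], [0.0, 3.0]])

x = rng.multivariate_normal(mu, S, size=200_000)
y = x @ A.T

print(np.allclose(y.mean(axis=0), A @ mu, atol=0.05))
print(np.allclose(np.cov(y.T), A @ S @ A.T, atol=0.15))
```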
Defn (Expectation)
Let x = ( x1 , x2 , x3 , ... , xn) denote a vector of
continuous random variables with joint density
function
f (x) = f ( x1 , x2 , x3 , ... , xn).
Let U = h(x) = h(x_1, x_2, x_3, ..., x_n). Then
$E(U) = E[h(\mathbf{x})] = \int h(\mathbf{x}) f(\mathbf{x})\,d\mathbf{x}$
Defn (Conditional Expectation)
Let x = ( x1 , x2 , x3 , ... , xn) = (x1 , x2 ) denote a
vector of continuous random variables with joint
density function
f (x) = f ( x1 , x2 , x3 , ... , xn) = f (x1 , x2 ).
Let U = h(x_1) = h(x_1, x_2, x_3, ..., x_p).
Then the conditional expectation of U given x_2 is
$E(U \mid \mathbf{x}_2) = E[h(\mathbf{x}_1) \mid \mathbf{x}_2] = \int h(\mathbf{x}_1)\, f_{1|2}(\mathbf{x}_1 \mid \mathbf{x}_2)\,d\mathbf{x}_1$
Defn (Variance)
Let x = ( x1 , x2 , x3 , ... , xn) denote a vector of
continuous random variables with joint density
function
f (x) = f ( x1 , x2 , x3 , ... , xn).
Let U = h(x) = h(x_1, x_2, x_3, ..., x_n). Then
$\mathrm{Var}(U) = E\{[U - E(U)]^2\} = E(U^2) - [E(U)]^2 = E[h(\mathbf{x})^2] - \{E[h(\mathbf{x})]\}^2$
Defn (Conditional Variance)
Let x = ( x1 , x2 , x3 , ... , xn) = (x1 , x2 ) denote a
vector of continuous random variables with joint
density function
f (x) = f ( x1 , x2 , x3 , ... , xn) = f (x1 , x2 ).
Let U = h(x_1) = h(x_1, x_2, x_3, ..., x_p).
Then the conditional variance of U given x_2 is
$\mathrm{Var}(U \mid \mathbf{x}_2) = E[h(\mathbf{x}_1)^2 \mid \mathbf{x}_2] - \{E[h(\mathbf{x}_1) \mid \mathbf{x}_2]\}^2$
Defn (Covariance, Correlation)
Let x = ( x1 , x2 , x3 , ... , xn) denote a vector of
continuous random variables with joint densityfunction
f (x) = f ( x1 , x2 , x3 , ... , xn).
Let U = h(x) = h(x_1, x_2, x_3, ..., x_n) and
V = g(x) = g(x_1, x_2, x_3, ..., x_n).
Then the covariance of U and V is
$\mathrm{Cov}(U, V) = E\{[U - E(U)][V - E(V)]\} = E\{[h(\mathbf{x}) - E h(\mathbf{x})][g(\mathbf{x}) - E g(\mathbf{x})]\}$
and the correlation of U and V is
$\rho_{UV} = \frac{\mathrm{Cov}(U, V)}{\sqrt{\mathrm{Var}(U)\,\mathrm{Var}(V)}}$
Properties
• Expectation
• Variance
• Covariance
• Correlation
1. E[a_1 x_1 + a_2 x_2 + a_3 x_3 + ... + a_n x_n] = a_1 E[x_1] + a_2 E[x_2] + a_3 E[x_3] + ... + a_n E[x_n],
or E[a'x] = a'E[x].
2. E[UV] = E[h(x_1) g(x_2)] = E[h(x_1)] E[g(x_2)] = E[U] E[V]
if x_1 and x_2 are independent.
3. $\mathrm{Var}[a_1 x_1 + a_2 x_2 + a_3 x_3 + \dots + a_n x_n] = \sum_{i=1}^{n} a_i^2 \mathrm{Var}[x_i] + 2\sum_{i<j} a_i a_j \mathrm{Cov}[x_i, x_j]$,
or Var[a'x] = a'Σa, where
$\Sigma = \begin{pmatrix} \mathrm{Var}(x_1) & \mathrm{Cov}(x_1, x_2) & \cdots & \mathrm{Cov}(x_1, x_n) \\ \mathrm{Cov}(x_1, x_2) & \mathrm{Var}(x_2) & \cdots & \mathrm{Cov}(x_2, x_n) \\ \vdots & \vdots & & \vdots \\ \mathrm{Cov}(x_1, x_n) & \mathrm{Cov}(x_2, x_n) & \cdots & \mathrm{Var}(x_n) \end{pmatrix}$
4. $\mathrm{Cov}[a_1 x_1 + a_2 x_2 + \dots + a_n x_n,\; b_1 x_1 + b_2 x_2 + \dots + b_n x_n] = \sum_{i=1}^{n} a_i b_i \mathrm{Var}[x_i] + \sum_{i \ne j} a_i b_j \mathrm{Cov}[x_i, x_j]$,
or Cov[a'x, b'x] = a'Σb.
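Properties 3 and 4 can be verified numerically. The sketch below (an arbitrary Σ, a and b, chosen for illustration) compares the summation forms with the matrix forms a'Σa and a'Σb:

```python
import numpy as np

# Sketch: the summation forms of properties 3 and 4 agree with the
# matrix forms a'Sa and a'Sb for an arbitrary covariance matrix S.
S = np.array([[4.0, 2.0, 1.0], [2.0, 17.0, 6.0], [1.0, 6.0, 14.0]])
a = np.array([1.0, -2.0, 0.5])
b = np.array([0.0, 1.0, 3.0])
n = len(a)

var_sum = sum(a[i] ** 2 * S[i, i] for i in range(n)) \
    + 2 * sum(a[i] * a[j] * S[i, j] for i in range(n) for j in range(i + 1, n))
cov_sum = sum(a[i] * b[j] * S[i, j] for i in range(n) for j in range(n))

print(np.isclose(var_sum, a @ S @ a), np.isclose(cov_sum, a @ S @ b))  # True True
```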
5. $E[U] = E\{E[U \mid \mathbf{x}_2]\}$
6. $\mathrm{Var}[U] = E\{\mathrm{Var}[U \mid \mathbf{x}_2]\} + \mathrm{Var}\{E[U \mid \mathbf{x}_2]\}$
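Properties 5 and 6 (the laws of total expectation and total variance) hold exactly. A sketch checking them on a small discrete joint distribution (the values and probabilities are arbitrary):

```python
import math

# Sketch: properties 5 and 6 checked exactly on a small, arbitrary
# discrete joint distribution p(u, v).
p = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
U_vals, V_vals = {0, 1}, {0, 1}

pV = {v: sum(p[(u, v)] for u in U_vals) for v in V_vals}
E_U = sum(u * pr for (u, v), pr in p.items())
E_U_given_V = {v: sum(u * p[(u, v)] for u in U_vals) / pV[v] for v in V_vals}
Var_U = sum(u * u * pr for (u, v), pr in p.items()) - E_U ** 2
Var_U_given_V = {v: sum(u * u * p[(u, v)] for u in U_vals) / pV[v]
                 - E_U_given_V[v] ** 2 for v in V_vals}

E_of_cond_mean = sum(pV[v] * E_U_given_V[v] for v in V_vals)
E_of_cond_var = sum(pV[v] * Var_U_given_V[v] for v in V_vals)
Var_of_cond_mean = sum(pV[v] * E_U_given_V[v] ** 2 for v in V_vals) - E_of_cond_mean ** 2

print(math.isclose(E_U, E_of_cond_mean))                      # property 5: True
print(math.isclose(Var_U, E_of_cond_var + Var_of_cond_mean))  # property 6: True
```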
Multivariate distributions
The Normal distribution
1. The Normal distribution – parameters μ and σ (or σ²)
Comment: If μ = 0 and σ = 1 the distribution is called the standard normal distribution.
[Figure: two Normal density curves, one with μ = 50 and σ = 15, the other with μ = 70 and σ = 20.]
The probability density of the normal distribution is
$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty.$
If a random variable X has a normal distribution with mean μ and variance σ², then we will write:
$X \sim N(\mu, \sigma^2)$
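A minimal sketch evaluating this density (the particular μ = 50, σ = 15 echo the earlier figure; the check values follow directly from the formula):

```python
import math

# Sketch: evaluating the N(mu, sigma^2) density at a few points.
def norm_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-z * z / 2) / (math.sqrt(2 * math.pi) * sigma)

# At x = mu the density equals 1 / (sqrt(2*pi) * sigma).
print(math.isclose(norm_pdf(50, 50, 15), 1 / (math.sqrt(2 * math.pi) * 15)))  # True
# The density is symmetric about mu.
print(math.isclose(norm_pdf(40, 50, 15), norm_pdf(60, 50, 15)))  # True
```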
The multivariate Normal
distribution
Let
$\mathbf{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_p \end{pmatrix}$ = a random vector.
Let
$\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_p \end{pmatrix}$ = a vector of constants (the mean vector).
Let
$\Sigma = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1p} \\ \vdots & & \vdots \\ \sigma_{1p} & \cdots & \sigma_{pp} \end{pmatrix}$ = a p × p positive definite matrix.
Definition
The matrix A is positive semidefinite if $\mathbf{x}' A \mathbf{x} \ge 0$ for all $\mathbf{x}$.
Further, the matrix A is positive definite if $\mathbf{x}' A \mathbf{x} > 0$ for all $\mathbf{x} \ne \mathbf{0}$.
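In practice, positive definiteness of a symmetric matrix is usually checked via its eigenvalues or a Cholesky factorization rather than directly from the definition. A sketch (the test matrices are arbitrary examples):

```python
import numpy as np

# Sketch: for a symmetric matrix, x'Ax > 0 for all x != 0 is equivalent to
# all eigenvalues being positive (or to a Cholesky factorization existing).
def is_positive_definite(A):
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        return False
    try:
        np.linalg.cholesky(A)  # succeeds iff A is (numerically) pos. def.
        return True
    except np.linalg.LinAlgError:
        return False

print(is_positive_definite([[4.0, 2.0], [2.0, 17.0]]))  # True
print(is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))   # False (eigenvalues 3, -1)
```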
Suppose that the joint density of the random vector $\mathbf{x} = (x_1, \dots, x_p)'$ is
$f(\mathbf{x}) = f(x_1, \dots, x_p) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})}$
Then the random vector $(x_1, x_2, \dots, x_p)'$ is said to have a p-variate normal distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\Sigma$.
We will write: $\mathbf{x} \sim N_p(\boldsymbol{\mu}, \Sigma)$
Example: the Bivariate Normal distribution
$f(x_1, x_2) = \frac{1}{2\pi |\Sigma|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})}$
with
$\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}$ and $\Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix} = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}$
Now
$|\Sigma| = \sigma_{11}\sigma_{22} - \sigma_{12}^2 = \sigma_1^2 \sigma_2^2 (1 - \rho^2)$
and
$\Sigma^{-1} = \frac{1}{|\Sigma|} \begin{pmatrix} \sigma_{22} & -\sigma_{12} \\ -\sigma_{12} & \sigma_{11} \end{pmatrix}$
so
$(\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) = \begin{pmatrix} x_1 - \mu_1 & x_2 - \mu_2 \end{pmatrix} \frac{1}{|\Sigma|} \begin{pmatrix} \sigma_{22} & -\sigma_{12} \\ -\sigma_{12} & \sigma_{11} \end{pmatrix} \begin{pmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{pmatrix}$
$= \frac{1}{\sigma_{11}\sigma_{22} - \sigma_{12}^2}\left[\sigma_{22}(x_1 - \mu_1)^2 - 2\sigma_{12}(x_1 - \mu_1)(x_2 - \mu_2) + \sigma_{11}(x_2 - \mu_2)^2\right]$
$= \frac{1}{\sigma_1^2 \sigma_2^2 (1 - \rho^2)}\left[\sigma_2^2(x_1 - \mu_1)^2 - 2\rho\sigma_1\sigma_2(x_1 - \mu_1)(x_2 - \mu_2) + \sigma_1^2(x_2 - \mu_2)^2\right]$
$= \frac{1}{1 - \rho^2}\left[\left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1 - \mu_1}{\sigma_1}\right)\left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2\right]$
Hence
$f(x_1, x_2) = \frac{1}{2\pi |\Sigma|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})} = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}}\, e^{-\frac{1}{2} Q(x_1, x_2)}$
where
$Q(x_1, x_2) = \frac{1}{1 - \rho^2}\left[\left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1 - \mu_1}{\sigma_1}\right)\left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2\right]$
Note:
$f(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}}\, e^{-\frac{1}{2} Q(x_1, x_2)}$
is constant when $Q(x_1, x_2)$ is constant.
This is true when (x_1, x_2) lie on an ellipse centered at (μ_1, μ_2).
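This can be checked directly: parametrizing the contour by x = μ + √c · L(cos t, sin t)', where Σ = LL' is a Cholesky factorization, gives Q = c at every point. A numerical sketch (arbitrary μ, Σ and level c):

```python
import numpy as np

# Sketch: Q(x1, x2) is constant on the ellipse x = mu + sqrt(c) * L @ (cos t, sin t)',
# where L is the Cholesky factor of Sigma (so Sigma = L L').
mu = np.array([1.0, 2.0])
S = np.array([[2.0, 0.8], [0.8, 1.0]])
L = np.linalg.cholesky(S)
Sinv = np.linalg.inv(S)
c = 3.0

Q_vals = []
for t in np.linspace(0, 2 * np.pi, 50):
    x = mu + np.sqrt(c) * L @ np.array([np.cos(t), np.sin(t)])
    d = x - mu
    Q_vals.append(d @ Sinv @ d)

print(np.allclose(Q_vals, c))  # True: Q is constant on the ellipse
```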
Surface Plots of the bivariate
Normal distribution
Contour Plots of the bivariate
Normal distribution
Scatter Plots of data from the
bivariate Normal distribution
Trivariate Normal distribution - Contour map
[Figure: contour surfaces in x_1, x_2, x_3, centered at the mean vector $\boldsymbol{\mu} = (\mu_1, \mu_2, \mu_3)'$; each contour satisfies $(\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) = \text{const}$.]
Trivariate Normal distribution
[Figure: density contour plots in x_1, x_2, x_3.]
example
In the following study, data were collected for a sample of n = 183 females on the variables
• Age,
• Height (Ht),
• Weight (Wt),
• Birth control pill use (Bpl: 1 = no pill, 2 = pill),
and the following blood chemistry measurements:
• Cholesterol (Chl),
• Albumin (Alb),
• Calcium (Ca), and
• Uric Acid (UA).
The data are tabulated on the next page.
The data (columns Age, Ht, Wt, Bpl, Chl, Alb, Ca, UA; printed three records per row):
[Table: n = 183 records; the first reads Age 22, Ht 67, Wt 144, Bpl 1, Chl 200, Alb 43, Ca 98, UA 54. Full listing as in the original slides.]
[Figure: 3D scatterplot of Wt, Ht, Age.]
[Figure: 3D scatterplot of Wt, Ht, Age.]
[Figure: 3D scatterplot of Alb, Chl, Bpl.]
Marginal and Conditional
distributions
Theorem (Woodbury):
$(A + CBD)^{-1} = A^{-1} - A^{-1}C\left(B^{-1} + DA^{-1}C\right)^{-1}DA^{-1}$
Proof:
$(A + CBD)\left[A^{-1} - A^{-1}C\left(B^{-1} + DA^{-1}C\right)^{-1}DA^{-1}\right]$
$= I + CBDA^{-1} - C\left(B^{-1} + DA^{-1}C\right)^{-1}DA^{-1} - CBDA^{-1}C\left(B^{-1} + DA^{-1}C\right)^{-1}DA^{-1}$
$= I + CBDA^{-1} - CB\left[B^{-1} + DA^{-1}C\right]\left(B^{-1} + DA^{-1}C\right)^{-1}DA^{-1}$
$= I + CBDA^{-1} - CBDA^{-1} = I$
Example: Find $(I_n + b J_n)^{-1}$, where $J_n = \mathbf{1}_n \mathbf{1}_n'$ is the n × n matrix of ones.
Solution:
Use $(A + CBD)^{-1} = A^{-1} - A^{-1}C\left(B^{-1} + DA^{-1}C\right)^{-1}DA^{-1}$
with $A = I_n$, $C = \mathbf{1}_n$, $D = \mathbf{1}_n'$, $B = b$; hence
$(I_n + \mathbf{1}_n b \mathbf{1}_n')^{-1} = I_n - \mathbf{1}_n\left(\tfrac{1}{b} + \mathbf{1}_n' \mathbf{1}_n\right)^{-1} \mathbf{1}_n'$
$= I_n - \frac{1}{\tfrac{1}{b} + n} J_n = I_n - \frac{b}{1 + nb} J_n$
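A quick numerical check of this example (the values of n and b are arbitrary):

```python
import numpy as np

# Sketch: numerical check of the example, (I_n + b J_n)^{-1} = I_n - b/(1+nb) J_n.
n, b = 5, 0.7
J = np.ones((n, n))
lhs = np.linalg.inv(np.eye(n) + b * J)
rhs = np.eye(n) - (b / (1 + n * b)) * J
print(np.allclose(lhs, rhs))  # True
```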
Theorem (Inverse of a partitioned symmetric matrix):
Let $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{12}' & A_{22} \end{pmatrix}$ and $B = A^{-1} = \begin{pmatrix} B_{11} & B_{12} \\ B_{12}' & B_{22} \end{pmatrix}$.
Then
$B_{11} = \left(A_{11} - A_{12} A_{22}^{-1} A_{12}'\right)^{-1}$
$B_{22} = \left(A_{22} - A_{12}' A_{11}^{-1} A_{12}\right)^{-1}$
$B_{12} = -A_{11}^{-1} A_{12} \left(A_{22} - A_{12}' A_{11}^{-1} A_{12}\right)^{-1} = -\left(A_{11} - A_{12} A_{22}^{-1} A_{12}'\right)^{-1} A_{12} A_{22}^{-1}$
Also
$B_{11} = A_{11}^{-1} + A_{11}^{-1} A_{12} \left(A_{22} - A_{12}' A_{11}^{-1} A_{12}\right)^{-1} A_{12}' A_{11}^{-1}$
$B_{22} = A_{22}^{-1} + A_{22}^{-1} A_{12}' \left(A_{11} - A_{12} A_{22}^{-1} A_{12}'\right)^{-1} A_{12} A_{22}^{-1}$
Proof:
From $AB = I_p$,
$\begin{pmatrix} A_{11} & A_{12} \\ A_{12}' & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} \\ B_{12}' & B_{22} \end{pmatrix} = I_p$
or
$A_{11} B_{11} + A_{12} B_{12}' = I_q$, $\quad A_{11} B_{12} + A_{12} B_{22} = 0$,
$A_{12}' B_{11} + A_{22} B_{12}' = 0$, $\quad A_{12}' B_{12} + A_{22} B_{22} = I_{p-q}$.
From $A_{12}' B_{11} + A_{22} B_{12}' = 0$ we get $B_{12}' = -A_{22}^{-1} A_{12}' B_{11}$; substituting into $A_{11} B_{11} + A_{12} B_{12}' = I_q$ gives
$\left(A_{11} - A_{12} A_{22}^{-1} A_{12}'\right) B_{11} = I$,
hence $B_{11} = \left(A_{11} - A_{12} A_{22}^{-1} A_{12}'\right)^{-1}$.
Similarly, $B_{22} = \left(A_{22} - A_{12}' A_{11}^{-1} A_{12}\right)^{-1}$,
and from $A_{11} B_{12} + A_{12} B_{22} = 0$,
$B_{12} = -A_{11}^{-1} A_{12} B_{22} = -A_{11}^{-1} A_{12} \left(A_{22} - A_{12}' A_{11}^{-1} A_{12}\right)^{-1}$.
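The block formulas can be verified numerically on a randomly generated symmetric positive definite matrix (a sketch; the 4 × 4 size and 2 + 2 partition are arbitrary choices):

```python
import numpy as np

# Sketch: checking the block formulas for B = A^{-1} on a random
# symmetric positive definite matrix partitioned into 2x2 blocks.
rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4 * np.eye(4)          # symmetric positive definite
A11, A12 = A[:2, :2], A[:2, 2:]
A22 = A[2:, 2:]
B = np.linalg.inv(A)

B11 = np.linalg.inv(A11 - A12 @ np.linalg.inv(A22) @ A12.T)
B22 = np.linalg.inv(A22 - A12.T @ np.linalg.inv(A11) @ A12)
B12 = -np.linalg.inv(A11) @ A12 @ B22

print(np.allclose(B[:2, :2], B11))  # True
print(np.allclose(B[2:, 2:], B22))  # True
print(np.allclose(B[:2, 2:], B12))  # True
```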
Theorem (Determinant of a partitioned symmetric matrix):
Let $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{12}' & A_{22} \end{pmatrix}$.
Then
$|A| = |A_{22}|\,\left|A_{11} - A_{12} A_{22}^{-1} A_{12}'\right| = |A_{11}|\,\left|A_{22} - A_{12}' A_{11}^{-1} A_{12}\right|$
Proof: Note
$\begin{pmatrix} A_{11} & A_{12} \\ A_{12}' & A_{22} \end{pmatrix} = \begin{pmatrix} I & A_{12} A_{22}^{-1} \\ 0 & I \end{pmatrix} \begin{pmatrix} A_{11} - A_{12} A_{22}^{-1} A_{12}' & 0 \\ A_{12}' & A_{22} \end{pmatrix}$
and
$\left|\begin{pmatrix} B & C \\ 0 & D \end{pmatrix}\right| = |B|\,|D|$ and $\left|\begin{pmatrix} B & 0 \\ C & D \end{pmatrix}\right| = |B|\,|D|$.
Theorem (Marginal distributions for the Multivariate Normal distribution):
Let $\mathbf{x} = \begin{pmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{pmatrix}$ (with $\mathbf{x}_1$ q × 1 and $\mathbf{x}_2$ (p − q) × 1) have a p-variate Normal distribution
with mean vector $\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix}$ and covariance matrix $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{pmatrix}$.
Then the marginal distribution of $\mathbf{x}_i$ is $q_i$-variate Normal ($q_1 = q$, $q_2 = p - q$)
with mean vector $\boldsymbol{\mu}_i$ and covariance matrix $\Sigma_{ii}$.
Theorem (Conditional distributions for the Multivariate Normal distribution):
Let $\mathbf{x} = \begin{pmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{pmatrix}$ (with $\mathbf{x}_1$ q × 1 and $\mathbf{x}_2$ (p − q) × 1) have a p-variate Normal distribution
with mean vector $\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix}$ and covariance matrix $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{pmatrix}$.
Then the conditional distribution of $\mathbf{x}_i$ given $\mathbf{x}_j$ is $q_i$-variate Normal
with mean vector
$\boldsymbol{\mu}_{i|j} = \boldsymbol{\mu}_i + \Sigma_{ij} \Sigma_{jj}^{-1} (\mathbf{x}_j - \boldsymbol{\mu}_j)$
and covariance matrix
$\Sigma_{i|j} = \Sigma_{ii} - \Sigma_{ij} \Sigma_{jj}^{-1} \Sigma_{ij}'$
Proof (of the previous two theorems):
The joint density of $\mathbf{x} = \begin{pmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{pmatrix}$ is
$f(\mathbf{x}) = f(\mathbf{x}_1, \mathbf{x}_2) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})} = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}}\, e^{-\frac{1}{2} Q(\mathbf{x}_1, \mathbf{x}_2)}$
where $\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix}$, $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$, and
$Q(\mathbf{x}_1, \mathbf{x}_2) = (\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})$
Write $\Sigma^{-1} = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}^{-1} = \begin{pmatrix} \Sigma^{11} & \Sigma^{12} \\ \Sigma^{21} & \Sigma^{22} \end{pmatrix}$,
where (by the partitioned-inverse theorem)
$\Sigma^{11} = \Sigma_{11}^{-1} + \Sigma_{11}^{-1} \Sigma_{12} \left(\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\right)^{-1} \Sigma_{21} \Sigma_{11}^{-1}$
$\Sigma^{12} = -\Sigma_{11}^{-1} \Sigma_{12} \left(\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\right)^{-1}$
$\Sigma^{22} = \left(\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\right)^{-1}$
Then
$Q(\mathbf{x}_1, \mathbf{x}_2) = (\mathbf{x}_1 - \boldsymbol{\mu}_1)' \Sigma^{11} (\mathbf{x}_1 - \boldsymbol{\mu}_1) + 2(\mathbf{x}_1 - \boldsymbol{\mu}_1)' \Sigma^{12} (\mathbf{x}_2 - \boldsymbol{\mu}_2) + (\mathbf{x}_2 - \boldsymbol{\mu}_2)' \Sigma^{22} (\mathbf{x}_2 - \boldsymbol{\mu}_2)$
Substituting the blocks of $\Sigma^{-1}$,
$Q(\mathbf{x}_1, \mathbf{x}_2) = (\mathbf{x}_1 - \boldsymbol{\mu}_1)' \left[\Sigma_{11}^{-1} + \Sigma_{11}^{-1} \Sigma_{12} \left(\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\right)^{-1} \Sigma_{21} \Sigma_{11}^{-1}\right] (\mathbf{x}_1 - \boldsymbol{\mu}_1)$
$\quad - 2(\mathbf{x}_1 - \boldsymbol{\mu}_1)' \Sigma_{11}^{-1} \Sigma_{12} \left(\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\right)^{-1} (\mathbf{x}_2 - \boldsymbol{\mu}_2)$
$\quad + (\mathbf{x}_2 - \boldsymbol{\mu}_2)' \left(\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\right)^{-1} (\mathbf{x}_2 - \boldsymbol{\mu}_2)$
Completing the square in $\mathbf{x}_2$,
$Q(\mathbf{x}_1, \mathbf{x}_2) = Q_1(\mathbf{x}_1) + Q_2(\mathbf{x}_1, \mathbf{x}_2)$
where
$Q_1(\mathbf{x}_1) = (\mathbf{x}_1 - \boldsymbol{\mu}_1)' \Sigma_{11}^{-1} (\mathbf{x}_1 - \boldsymbol{\mu}_1)$
and
$Q_2(\mathbf{x}_1, \mathbf{x}_2) = \left[(\mathbf{x}_2 - \boldsymbol{\mu}_2) - \Sigma_{21} \Sigma_{11}^{-1} (\mathbf{x}_1 - \boldsymbol{\mu}_1)\right]' \left(\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\right)^{-1} \left[(\mathbf{x}_2 - \boldsymbol{\mu}_2) - \Sigma_{21} \Sigma_{11}^{-1} (\mathbf{x}_1 - \boldsymbol{\mu}_1)\right]$
$= (\mathbf{x}_2 - \mathbf{b})' A^{-1} (\mathbf{x}_2 - \mathbf{b})$
where $\mathbf{b} = \boldsymbol{\mu}_2 + \Sigma_{21} \Sigma_{11}^{-1} (\mathbf{x}_1 - \boldsymbol{\mu}_1)$ and $A = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}$.
Now, using $|\Sigma| = |\Sigma_{11}|\,\left|\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\right| = |\Sigma_{11}|\,|A|$,
$f(\mathbf{x}) = f(\mathbf{x}_1, \mathbf{x}_2) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}}\, e^{-\frac{1}{2} Q(\mathbf{x}_1, \mathbf{x}_2)} = \frac{1}{(2\pi)^{p/2} |\Sigma_{11}|^{1/2} |A|^{1/2}}\, e^{-\frac{1}{2}\left[Q_1(\mathbf{x}_1) + Q_2(\mathbf{x}_1, \mathbf{x}_2)\right]}$
$= \frac{1}{(2\pi)^{q/2} |\Sigma_{11}|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x}_1 - \boldsymbol{\mu}_1)' \Sigma_{11}^{-1} (\mathbf{x}_1 - \boldsymbol{\mu}_1)} \times \frac{1}{(2\pi)^{(p-q)/2} |A|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x}_2 - \mathbf{b})' A^{-1} (\mathbf{x}_2 - \mathbf{b})}$
The marginal distribution of $\mathbf{x}_1$ is
$f_1(\mathbf{x}_1) = \int f(\mathbf{x}_1, \mathbf{x}_2)\,d\mathbf{x}_2 = \int \cdots \int f(\mathbf{x}_1, \mathbf{x}_2)\,dx_{q+1} \cdots dx_p$
$= \frac{1}{(2\pi)^{q/2} |\Sigma_{11}|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x}_1 - \boldsymbol{\mu}_1)' \Sigma_{11}^{-1} (\mathbf{x}_1 - \boldsymbol{\mu}_1)} \int \frac{1}{(2\pi)^{(p-q)/2} |A|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x}_2 - \mathbf{b})' A^{-1} (\mathbf{x}_2 - \mathbf{b})}\,d\mathbf{x}_2$
$= \frac{1}{(2\pi)^{q/2} |\Sigma_{11}|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x}_1 - \boldsymbol{\mu}_1)' \Sigma_{11}^{-1} (\mathbf{x}_1 - \boldsymbol{\mu}_1)}$
since the second factor is a density and integrates to 1; i.e. $\mathbf{x}_1 \sim N_q(\boldsymbol{\mu}_1, \Sigma_{11})$.
The conditional distribution of $\mathbf{x}_2$ given $\mathbf{x}_1$ is:
$f_{2|1}(\mathbf{x}_2 \mid \mathbf{x}_1) = \frac{f(\mathbf{x}_1, \mathbf{x}_2)}{f_1(\mathbf{x}_1)} = \frac{1}{(2\pi)^{(p-q)/2} |A|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x}_2 - \mathbf{b})' A^{-1} (\mathbf{x}_2 - \mathbf{b})}$
where $\mathbf{b} = \boldsymbol{\mu}_2 + \Sigma_{21} \Sigma_{11}^{-1} (\mathbf{x}_1 - \boldsymbol{\mu}_1)$ and $A = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}$;
i.e. $\mathbf{x}_2 \mid \mathbf{x}_1 \sim N_{p-q}(\mathbf{b}, A)$.
The matrix
$\Sigma_{2 \cdot 1} = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}$
is called the matrix of partial variances and covariances.
The (i, j)th element of the matrix $\Sigma_{2 \cdot 1}$,
$\sigma_{ij \cdot 1, 2, \dots, q}$,
is called the partial covariance (variance if i = j) between $x_i$ and $x_j$ given $x_1, \dots, x_q$.
$\rho_{ij \cdot 1, 2, \dots, q} = \frac{\sigma_{ij \cdot 1, 2, \dots, q}}{\sqrt{\sigma_{ii \cdot 1, 2, \dots, q}\;\sigma_{jj \cdot 1, 2, \dots, q}}}$
is called the partial correlation between $x_i$ and $x_j$ given $x_1, \dots, x_q$.
The matrix
$\Sigma_{21} \Sigma_{11}^{-1}$
is called the matrix of regression coefficients for predicting $x_{q+1}, x_{q+2}, \dots, x_p$ from $x_1, \dots, x_q$.
The mean vector of $x_{q+1}, x_{q+2}, \dots, x_p$ given $x_1, \dots, x_q$ is:
$\boldsymbol{\mu}_{2|1} = \boldsymbol{\mu}_2 + \Sigma_{21} \Sigma_{11}^{-1} (\mathbf{x}_1 - \boldsymbol{\mu}_1)$
Example:
Suppose that $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}$ is 4-variate normal with
$\boldsymbol{\mu} = \begin{pmatrix} 10 \\ 15 \\ 6 \\ 14 \end{pmatrix}$ and $\Sigma = \begin{pmatrix} 4 & 2 & 4 & 2 \\ 2 & 17 & 6 & 5 \\ 4 & 6 & 14 & 6 \\ 2 & 5 & 6 & 7 \end{pmatrix}$
The marginal distribution of $\mathbf{x}_{(1)} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ is bivariate normal with
$\boldsymbol{\mu}_1 = \begin{pmatrix} 10 \\ 15 \end{pmatrix}$ and $\Sigma_{11} = \begin{pmatrix} 4 & 2 \\ 2 & 17 \end{pmatrix}$
The marginal distribution of $\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$ is trivariate normal with
mean vector $\begin{pmatrix} 10 \\ 15 \\ 6 \end{pmatrix}$ and covariance matrix $\begin{pmatrix} 4 & 2 & 4 \\ 2 & 17 & 6 \\ 4 & 6 & 14 \end{pmatrix}$
Find the conditional distribution of
$\mathbf{x}_{(2)} = \begin{pmatrix} x_3 \\ x_4 \end{pmatrix}$ given $\mathbf{x}_{(1)} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 15 \\ 5 \end{pmatrix}$.
Now
$\boldsymbol{\mu}_1 = \begin{pmatrix} 10 \\ 15 \end{pmatrix}$, $\boldsymbol{\mu}_2 = \begin{pmatrix} 6 \\ 14 \end{pmatrix}$,
$\Sigma_{11} = \begin{pmatrix} 4 & 2 \\ 2 & 17 \end{pmatrix}$, $\Sigma_{22} = \begin{pmatrix} 14 & 6 \\ 6 & 7 \end{pmatrix}$, and $\Sigma_{12} = \begin{pmatrix} 4 & 2 \\ 6 & 5 \end{pmatrix}$
$\Sigma_{2 \cdot 1} = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12} = \begin{pmatrix} 14 & 6 \\ 6 & 7 \end{pmatrix} - \begin{pmatrix} 4 & 6 \\ 2 & 5 \end{pmatrix} \begin{pmatrix} 4 & 2 \\ 2 & 17 \end{pmatrix}^{-1} \begin{pmatrix} 4 & 2 \\ 6 & 5 \end{pmatrix} = \begin{pmatrix} 9 & 3 \\ 3 & 5 \end{pmatrix}$
The matrix of regression coefficients for predicting $x_3, x_4$ from $x_1, x_2$ is
$\Sigma_{21} \Sigma_{11}^{-1} = \begin{pmatrix} 4 & 6 \\ 2 & 5 \end{pmatrix} \begin{pmatrix} 4 & 2 \\ 2 & 17 \end{pmatrix}^{-1} = \begin{pmatrix} 0.875 & 0.250 \\ 0.375 & 0.250 \end{pmatrix}$
and the conditional mean is
$\boldsymbol{\mu}_{2|1} = \boldsymbol{\mu}_2 + \Sigma_{21} \Sigma_{11}^{-1} (\mathbf{x}_{(1)} - \boldsymbol{\mu}_1)$
As a function of $x_1, x_2$,
$\boldsymbol{\mu}_{2|1} = \begin{pmatrix} 0.875 & 0.250 \\ 0.375 & 0.250 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} -6.5 \\ 6.5 \end{pmatrix}$
since
$\boldsymbol{\mu}_2 - \Sigma_{21} \Sigma_{11}^{-1} \boldsymbol{\mu}_1 = \begin{pmatrix} 6 \\ 14 \end{pmatrix} - \begin{pmatrix} 0.875 & 0.250 \\ 0.375 & 0.250 \end{pmatrix} \begin{pmatrix} 10 \\ 15 \end{pmatrix} = \begin{pmatrix} -6.5 \\ 6.5 \end{pmatrix}$
At $(x_1, x_2) = (15, 5)$:
$\boldsymbol{\mu}_{2|1} = \begin{pmatrix} 0.875(15) + 0.250(5) - 6.5 \\ 0.375(15) + 0.250(5) + 6.5 \end{pmatrix} = \begin{pmatrix} 7.875 \\ 13.375 \end{pmatrix}$
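The whole worked example can be reproduced in a few lines of numpy (a sketch mirroring the computations above):

```python
import numpy as np

# Sketch: reproducing the worked conditional-distribution example.
mu1, mu2 = np.array([10.0, 15.0]), np.array([6.0, 14.0])
S11 = np.array([[4.0, 2.0], [2.0, 17.0]])
S22 = np.array([[14.0, 6.0], [6.0, 7.0]])
S12 = np.array([[4.0, 2.0], [6.0, 5.0]])    # rows: x1, x2; cols: x3, x4

B = S12.T @ np.linalg.inv(S11)              # regression coefficients S21 S11^{-1}
S_2_1 = S22 - S12.T @ np.linalg.inv(S11) @ S12  # partial covariance matrix

x1_obs = np.array([15.0, 5.0])
cond_mean = mu2 + B @ (x1_obs - mu1)

print(B)          # rows (0.875, 0.250) and (0.375, 0.250)
print(S_2_1)      # rows (9, 3) and (3, 5)
print(cond_mean)  # (7.875, 13.375)
```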
The Chi-square distribution
The Chi-square ($\chi^2$) distribution with $\nu$ d.f. has density
$f(x) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, x^{\nu/2 - 1} e^{-x/2}$ for $x > 0$, and $f(x) = 0$ for $x \le 0$.
Graph: the $\chi^2$ distribution
0
0.1
0.2
0 4 8 12 16
(n = 4)
(n = 5)
(n = 6)
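The density can be evaluated directly from the formula; a minimal stdlib sketch (the ν = 2 spot check works because χ²₂ is an Exponential with rate 1/2):

```python
import math

def chi2_pdf(x, nu):
    """Central chi-square density with nu degrees of freedom."""
    if x <= 0:
        return 0.0
    return x**(nu / 2 - 1) * math.exp(-x / 2) / (2**(nu / 2) * math.gamma(nu / 2))

# For nu = 2 the density reduces to (1/2) e^(-x/2):
print(chi2_pdf(1.0, 2))   # (1/2) e^(-1/2) ≈ 0.3033
```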
Basic Properties of the Chi-Square distribution

1. If z has a Standard Normal distribution then z² has a χ² distribution with 1 degree of freedom.

2. If z₁, z₂, …, zₙ are independent random variables each having a Standard Normal distribution then

U = z₁² + z₂² + … + zₙ²

has a χ² distribution with n degrees of freedom.

3. Let X and Y be independent random variables having χ² distributions with ν₁ and ν₂ degrees of freedom respectively. Then X + Y has a χ² distribution with ν₁ + ν₂ degrees of freedom.

continued
4. Let x₁, x₂, …, xₙ be independent random variables having χ² distributions with ν₁, ν₂, …, νₙ degrees of freedom respectively. Then x₁ + x₂ + … + xₙ has a χ² distribution with ν₁ + … + νₙ degrees of freedom.

5. Suppose X and Y are independent random variables with X and X + Y having χ² distributions with ν₁ and ν (ν > ν₁) degrees of freedom respectively. Then Y has a χ² distribution with ν − ν₁ degrees of freedom.
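Property 2 can be checked by simulation; a hedged sketch (the sample size and seed are arbitrary choices) using the χ²ₙ moments mean n and variance 2n:

```python
import random

# Sum of n squared standard normals: should have mean n and variance 2n.
random.seed(0)
n, reps = 5, 20000
draws = [sum(random.gauss(0.0, 1.0)**2 for _ in range(n)) for _ in range(reps)]
mean = sum(draws) / reps
var = sum((d - mean)**2 for d in draws) / reps
print(round(mean, 2), round(var, 2))   # close to 5 and 10
```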
The non-central Chi-squared distribution

If z₁, z₂, …, zₙ are independent random variables each having a Normal distribution with mean μᵢ and variance σ² = 1, then

U = z₁² + z₂² + … + zₙ²

has a non-central χ² distribution with n degrees of freedom and non-centrality parameter

λ = ½ Σᵢ₌₁ⁿ μᵢ²
Mean and Variance of the non-central χ² distribution

If U has a non-central χ² distribution with ν degrees of freedom and non-centrality parameter λ = ½ Σᵢ₌₁ⁿ μᵢ², then

E[U] = ν + 2λ = ν + Σᵢ μᵢ²   and   Var[U] = 2ν + 8λ = 2ν + 4Σᵢ μᵢ²

If U has a central χ² distribution with ν degrees of freedom then λ is zero, thus

E[U] = ν   and   Var[U] = 2ν
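These moment formulas can be checked by simulation; a hedged sketch. Note a convention difference: NumPy's `nonc` parameter equals Σμᵢ², i.e. twice the λ defined here, so the mean is df + nonc = ν + 2λ.

```python
import numpy as np

rng = np.random.default_rng(1)
nu = 4
mus = np.array([1.0, 2.0, 0.5, 0.0])     # arbitrary illustrative means
lam = 0.5 * np.sum(mus**2)               # non-centrality parameter (as defined above)
u = rng.noncentral_chisquare(df=nu, nonc=2 * lam, size=200_000)

print(u.mean(), nu + 2 * lam)            # E[U]  = nu + 2*lambda
print(u.var(), 2 * nu + 8 * lam)         # Var[U] = 2*nu + 8*lambda
```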
Distribution of Linear and Quadratic Forms

Suppose y ~ N(μ, Σ). Consider the random variable

U = y′Ay = a₁₁y₁² + a₂₂y₂² + … + aₙₙyₙ² + 2a₁₂y₁y₂ + … + 2aₙ₋₁,ₙyₙ₋₁yₙ

Questions

1. What is the distribution of U? (many statistics have this form)
2. When is this distribution simple?
3. When we have two such statistics, when are they independent?
Simplest Case: y ~ N(0, I)

U = y′Iy = y₁² + y₂² + … + yₙ²

Then the distribution of U is the central χ² distribution with ν = n degrees of freedom.
Now consider the distribution of other quadratic forms.

Theorem: Suppose y ~ N(μ, σ²I). Then

U = (1/σ²) y′Iy = (1/σ²)(y₁² + y₂² + … + yₙ²) ~ χ²[n, λ]

where

λ = (1/(2σ²)) Σᵢ₌₁ⁿ μᵢ²
Proof: Note z = (1/σ)y ~ N((1/σ)μ, I). Then

U = z′z = (1/σ²) y′Iy = (1/σ²)(y₁² + y₂² + … + yₙ²) ~ χ²[n, λ]

with

λ = ½ ((1/σ)μ)′((1/σ)μ) = (1/(2σ²)) μ′μ = (1/(2σ²)) Σᵢ₌₁ⁿ μᵢ²
Theorem: Suppose y ~ N(μ, Σ). Then

U = y′Σ⁻¹y ~ χ²[n, λ]   with   λ = ½ μ′Σ⁻¹μ

Proof: Let Σ = AA′ with A non-singular. Consider z = A⁻¹y. Then

z ~ N(A⁻¹μ, A⁻¹Σ(A⁻¹)′) = N(A⁻¹μ, I)

and z′z ~ χ²[n, λ] with

λ = ½ (A⁻¹μ)′(A⁻¹μ) = ½ μ′(A⁻¹)′A⁻¹μ = ½ μ′(AA′)⁻¹μ = ½ μ′Σ⁻¹μ

Also

z′z = (A⁻¹y)′(A⁻¹y) = y′(A⁻¹)′A⁻¹y = y′(AA′)⁻¹y = y′Σ⁻¹y = U

Hence U = y′Σ⁻¹y ~ χ²[n, λ] with λ = ½ μ′Σ⁻¹μ.
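A hedged Monte Carlo check of this theorem (the particular μ and Σ are arbitrary illustrative values), using the non-central χ² mean E[U] = n + 2λ:

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([1.0, -1.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
Sinv = np.linalg.inv(Sigma)
lam = 0.5 * mu @ Sinv @ mu                    # lambda = mu' Sigma^{-1} mu / 2

y = rng.multivariate_normal(mu, Sigma, size=100_000)
U = np.einsum('ij,jk,ik->i', y, Sinv, y)      # y' Sigma^{-1} y for each draw
print(U.mean(), 3 + 2 * lam)                  # should agree: E[U] = n + 2*lambda
```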
Theorem: Suppose z ~ N(0, I). Then U = z′Az has a central χ² distribution with r d.f.

if and only if

• A is symmetric idempotent of rank r.

Proof: Since A is symmetric idempotent of rank r, there exists an orthogonal matrix P such that

A = PDP′   or   P′AP = D

Since A is idempotent the eigenvalues of A are 0 or 1 and the number of 1's is r, the rank of A. Thus

D = [Iᵣ 0; 0 0]

Let P = [P₁ P₂]. Then

P′AP = [P₁′AP₁ P₁′AP₂; P₂′AP₁ P₂′AP₂] = [Iᵣ 0; 0 0]

Consider y₁ = P₁′z. Now P′z ~ N(P′0, P′P) = N(0, I), thus y₁ ~ N(0, Iᵣ) and

y₁′y₁ has a χ² distribution with r d.f.

Now

U = z′Az = z′PDP′z = y₁′y₁

Thus z′Az has a χ² distribution with r d.f.
Theorem (a generalization of the previous theorem): Suppose z ~ N(μ, I). Then

U = z′Az has a non-central χ²[r, λ] distribution with non-centrality parameter λ = ½ μ′Aμ

if and only if

• A is symmetric idempotent of rank r.

Proof: Similar to the previous theorem.
Theorem: Suppose y ~ N(μ, Σ). Then

U = y′Ay ~ χ²[r, λ]   with   λ = ½ μ′Aμ

if and only if the following two conditions are satisfied:

1. AΣ is idempotent of rank r.
2. ΣA is idempotent of rank r.

Proof: Let Q be such that Σ = QQ′. Then

z = Q⁻¹y ~ N(Q⁻¹μ, Q⁻¹Σ(Q⁻¹)′) = N(Q⁻¹μ, I)

Let B = Q′AQ, or A = (Q⁻¹)′BQ⁻¹. Then, from the previous theorem,

U = z′Bz = y′(Q⁻¹)′BQ⁻¹y = y′Ay

has a χ²[r, λ] distribution with λ = ½ (Q⁻¹μ)′B(Q⁻¹μ) = ½ μ′Aμ

if and only if B = Q′AQ is idempotent of rank r.

B = Q′AQ is of rank r if and only if A is of rank r (B and A are congruent, hence of equal rank). Also B = Q′AQ idempotent means BB = B, i.e.

Q′AQQ′AQ = Q′AQ,   or   Q′AΣAQ = Q′AQ,   i.e.   AΣA = A

Then AΣAΣ = AΣ and, similarly, it can be shown that ΣAΣA = ΣA.

Thus B is idempotent if and only if AΣ and ΣA are idempotent.
Summarizing:

U = y′Ay ~ χ²[r, λ]   with   λ = ½ μ′Aμ

if and only if the following two conditions are satisfied:

1. AΣ is idempotent of rank r.
2. ΣA is idempotent of rank r.
Application:

Let y₁, y₂, …, yₙ be a sample from the Normal distribution with mean μ and variance σ². Then

U = (n − 1)s²/σ²

has a χ² distribution with ν = n − 1 d.f. and λ = 0 (central).
Proof:

y = (y₁, …, yₙ)′ ~ N(μ1, σ²I), where 1 = (1, 1, …, 1)′ and

ȳ = (1/n) Σᵢ₌₁ⁿ yᵢ = (1/n) 1′y

Then

U = (n − 1)s²/σ² = (1/σ²) Σᵢ₌₁ⁿ (yᵢ − ȳ)²
  = (1/σ²) [ y′y − (1/n)(1′y)² ]
  = (1/σ²) y′(I − (1/n)11′)y = y′Ay

where A = (1/σ²)(I − (1/n)11′).

Now

AΣ = (1/σ²)(I − (1/n)11′) σ²I = I − (1/n)11′

Also

AΣAΣ = (I − (1/n)11′)(I − (1/n)11′) = I − (2/n)11′ + (1/n²)1(1′1)1′ = I − (1/n)11′ = AΣ

since 1′1 = n.

Thus AΣ = I − (1/n)11′ is idempotent of rank r = n − 1.

Hence U = (n − 1)s²/σ² = y′Ay has a χ² distribution with ν = n − 1 d.f. and non-centrality parameter

λ = ½ (μ1)′A(μ1) = (μ²/(2σ²)) 1′(I − (1/n)11′)1 = (μ²/(2σ²)) (1′1 − (1/n)(1′1)²) = (μ²/(2σ²))(n − n) = 0
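A hedged simulation of this application (sample size, mean, and σ are arbitrary choices): U = (n − 1)s²/σ² should behave like χ²ₙ₋₁, with mean n − 1 and variance 2(n − 1).

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma, reps = 8, 2.0, 100_000
y = rng.normal(5.0, sigma, size=(reps, n))          # the population mean is irrelevant
U = (n - 1) * y.var(axis=1, ddof=1) / sigma**2      # (n-1) s^2 / sigma^2 per sample
print(U.mean(), U.var())                            # close to 7 and 14
```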
Independence of Linear and Quadratic Forms

Again, let y have a N(μ, Σ) distribution and consider the following R.V.'s:

U₁ = y′Ay,   U₂ = y′By,   v = Cy

1. When is U₁ = y′Ay independent of U₂ = y′By?
2. When is U₁ = y′Ay independent of v = Cy?
Theorem: Let y have a N(μ, I) distribution. Then U = y′Ay is independent of v = Cy if CA = 0.

Proof: Since A is symmetric there exists an orthogonal matrix P such that P′AP = D where D is diagonal.

Note: since CA = 0, rank(A) = r < n and some of the eigenvalues (diagonal elements of D) are zero. Thus

A = P [D₁ 0; 0 0] P′

and CA = 0 gives CP [D₁ 0; 0 0] P′ = 0. Writing

CP = [B₁₁ B₁₂; B₂₁ B₂₂]

we get

[B₁₁D₁ 0; B₂₁D₁ 0] = [0 0; 0 0]

and since D₁⁻¹ exists, B₁₁ = 0 and B₂₁ = 0. Thus CP = [0 B₁₂; 0 B₂₂].

Let z = P′y. Then z ~ N(P′μ, P′P) = N(P′μ, I). Now

U = y′Ay = z′P′APz = z′ [D₁ 0; 0 0] z = z₁′D₁z₁

Also

v = Cy = CPz = [0 B₁₂; 0 B₂₂] z

which depends only on z₂. Since z₁ is independent of z₂, U = z₁′D₁z₁ is independent of v = Cy.
Theorem: Let y have a N(μ, Σ) distribution where Σ is of rank n. Then CΣA = 0 implies that the quadratic form y′Ay is independent of the linear form v = Cy.

Proof: Exercise. Similar to the previous theorem.
Application:

Let y₁, y₂, …, yₙ be a sample from the Normal distribution with mean μ and variance σ². Then

ȳ = (1/n) Σᵢ yᵢ   and   s² = (1/(n − 1)) Σᵢ (yᵢ − ȳ)²

are independent.

Proof:

v = ȳ = (1/n, 1/n, …, 1/n) y = Cy   with   C = (1/n) 1′

s² = (1/(n − 1)) y′(I − (1/n)11′)y = y′Ay   with   A = (1/(n − 1))(I − (1/n)11′)

Now y ~ N(μ1, σ²I), so Σ = σ²I, and

CΣA = (1/n)1′ σ²I (1/(n − 1))(I − (1/n)11′)
    = (σ²/(n(n − 1))) (1′ − (1/n)(1′1)1′)
    = (σ²/(n(n − 1))) (1′ − 1′) = 0

since 1′1 = n. Hence ȳ and s² are independent.

Q.E.D.
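A hedged empirical check of this application (sample shape and parameters are arbitrary choices): across many normal samples, ȳ and s² should be uncorrelated (they are in fact fully independent, per the theorem).

```python
import numpy as np

rng = np.random.default_rng(4)
y = rng.normal(10.0, 3.0, size=(100_000, 6))   # 100,000 samples of size n = 6
ybar = y.mean(axis=1)
s2 = y.var(axis=1, ddof=1)
r = np.corrcoef(ybar, s2)[0, 1]
print(round(r, 3))   # near 0
```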
Theorem (Independence of quadratic forms): Let y have a N(μ, Σ) distribution where Σ is of rank n. Then AΣB = 0 implies that the quadratic forms

U₁ = y′Ay   and   U₂ = y′By

are independent.

Proof: Let Σ = WW′ (W is non-singular). Then AΣB = 0 gives AWW′B = 0, i.e. V′W̃ = 0 where V = W′A and W̃ = W′B. Note: both A and B are symmetric.
Expected Value and Variance of quadratic forms

Theorem: Suppose E[y] = μ and Var[y] = Σ. Then

E[U] = E[y′Ay] = tr(AΣ) + μ′Aμ

Proof: Let e = y − μ. Then E[e] = 0 and E[ee′] = Σ, and

E[U] = E[(μ + e)′A(μ + e)]
     = E[μ′Aμ + μ′Ae + e′Aμ + e′Ae]
     = μ′Aμ + E[e′Ae]
     = μ′Aμ + E[tr(Aee′)]
     = μ′Aμ + tr(A E[ee′]) = μ′Aμ + tr(AΣ)

Summary:

E[y′Ay] = tr(AΣ) + μ′Aμ = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ aᵢⱼ (E[eᵢeⱼ] + μᵢμⱼ)

where e = y − μ.
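A hedged Monte Carlo check of E[y′Ay] = tr(AΣ) + μ′Aμ (μ, Σ, and the symmetric A are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(5)
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.7], [0.7, 1.0]])
A = np.array([[1.0, 0.5], [0.5, 3.0]])          # any symmetric A works

y = rng.multivariate_normal(mu, Sigma, size=200_000)
U = np.einsum('ij,jk,ik->i', y, A, y)           # y' A y for each draw
expected = np.trace(A @ Sigma) + mu @ A @ mu
print(U.mean(), expected)                       # both near 20.7
```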
Example – One-way Anova

y₁₁, y₁₂, y₁₃, …, y₁ₙ a sample from N(μ₁, σ²)
y₂₁, y₂₂, y₂₃, …, y₂ₙ a sample from N(μ₂, σ²)
…
yₖ₁, yₖ₂, yₖ₃, …, yₖₙ a sample from N(μₖ, σ²)

U = SS_Error = Σᵢ₌₁ᵏ Σⱼ₌₁ⁿ (yᵢⱼ − ȳᵢ)²   where   ȳᵢ = (1/n) Σⱼ₌₁ⁿ yᵢⱼ

Now E[yᵢⱼ] = μᵢ. Thus

E[U] = E[ Σᵢ Σⱼ (yᵢⱼ − ȳᵢ)² ] = Σᵢ E[(n − 1)sᵢ²] = Σᵢ (n − 1)σ² = k(n − 1)σ²

where sᵢ² = (1/(n − 1)) Σⱼ (yᵢⱼ − ȳᵢ)² is the sample variance in group i, and E[sᵢ²] = σ².
Now let

U = SS_Treatment = n Σᵢ₌₁ᵏ (ȳᵢ − ȳ)²   where   ȳ = (1/k) Σᵢ₌₁ᵏ ȳᵢ

Now E[ȳᵢ] = μᵢ and E[ȳ] = μ̄ = (1/k) Σᵢ₌₁ᵏ μᵢ. Writing ēᵢ = ȳᵢ − μᵢ (so Var[ēᵢ] = σ²/n),

E[U] = n E[ Σᵢ (ȳᵢ − ȳ)² ] = n E[(k − 1)s_ē²] + n Σᵢ (μᵢ − μ̄)²

where s_ē² is the sample variance calculated from ē₁, ē₂, …, ēₖ, with E[s_ē²] = Var[ēᵢ] = σ²/n. Thus

E[SS_Treatment] = (k − 1)σ² + n Σᵢ₌₁ᵏ (μᵢ − μ̄)²
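Both expected sums of squares can be checked by simulation; a hedged sketch (the group means, n, and σ are arbitrary illustrative values):

```python
import numpy as np

# E[SS_Error] = k(n-1) sigma^2;  E[SS_Treatment] = (k-1) sigma^2 + n * sum (mu_i - mu_bar)^2
rng = np.random.default_rng(6)
mus, n, sigma, reps = np.array([1.0, 2.0, 4.0]), 5, 1.5, 20_000
k = len(mus)

y = rng.normal(mus[None, :, None], sigma, size=(reps, k, n))   # reps x k groups x n obs
group_means = y.mean(axis=2)
grand_means = y.mean(axis=(1, 2))
ss_error = ((y - group_means[:, :, None])**2).sum(axis=(1, 2))
ss_treat = n * ((group_means - grand_means[:, None])**2).sum(axis=1)

print(ss_error.mean(), k * (n - 1) * sigma**2)
print(ss_treat.mean(), (k - 1) * sigma**2 + n * ((mus - mus.mean())**2).sum())
```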
Statistical Inference

Making decisions from data

There are two main areas of Statistical Inference:

• Estimation – deciding on the value of a parameter
  – Point estimation
  – Confidence Interval, Confidence region estimation
• Hypothesis testing
  – Deciding if a statement (hypothesis) about a parameter is True or False

The general statistical model

Most data fits this situation.
Defn (The Classical Statistical Model)

The data vector: x = (x₁, x₂, x₃, …, xₙ)

The model: Let f(x|θ) = f(x₁, x₂, …, xₙ | θ₁, θ₂, …, θₚ) denote the joint density of the data vector x = (x₁, x₂, x₃, …, xₙ) of observations, where the unknown parameter vector θ ∈ Ω (a subset of p-dimensional space).

An Example

The data vector: x = (x₁, x₂, x₃, …, xₙ), a sample from the normal distribution with mean μ and variance σ².

The model: Then f(x|μ, σ²) = f(x₁, x₂, …, xₙ | μ, σ²), the joint density of x = (x₁, x₂, x₃, …, xₙ), takes on the form:

f(x|μ, σ²) = Πᵢ₌₁ⁿ (2πσ²)^(−1/2) e^(−(xᵢ − μ)²/(2σ²)) = (2πσ²)^(−n/2) e^(−(1/(2σ²)) Σᵢ₌₁ⁿ (xᵢ − μ)²)

where the unknown parameter vector θ = (μ, σ²) ∈ Ω = {(x, y) | −∞ < x < ∞, 0 ≤ y < ∞}.
Defn (Sufficient Statistics)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Then S = (S₁(x), S₂(x), S₃(x), …, Sₖ(x)) is called a set of sufficient statistics for the parameter vector θ if the conditional distribution of x given S = (S₁(x), S₂(x), S₃(x), …, Sₖ(x)) is not functionally dependent on the parameter vector θ.

A set of sufficient statistics contains all of the information concerning the unknown parameter vector.

A Simple Example illustrating Sufficiency

Suppose that we observe a Success-Failure experiment n = 3 times. Let θ denote the probability of Success. Suppose that the data collected is x₁, x₂, x₃, where xᵢ takes on the value 1 if the ith trial is a Success and 0 if the ith trial is a Failure.

The following table gives the possible values of (x₁, x₂, x₃).
(x₁, x₂, x₃)   f(x₁, x₂, x₃|θ)   S = Σxᵢ   g(S|θ)        f(x₁, x₂, x₃|S)
(0, 0, 0)      (1 − θ)³          0         (1 − θ)³      1
(1, 0, 0)      (1 − θ)²θ         1                       1/3
(0, 1, 0)      (1 − θ)²θ         1         3(1 − θ)²θ    1/3
(0, 0, 1)      (1 − θ)²θ         1                       1/3
(1, 1, 0)      (1 − θ)θ²         2                       1/3
(1, 0, 1)      (1 − θ)θ²         2         3(1 − θ)θ²    1/3
(0, 1, 1)      (1 − θ)θ²         2                       1/3
(1, 1, 1)      θ³                3         θ³            1

The data can be generated in two equivalent ways:

1. Generating (x₁, x₂, x₃) directly from f(x₁, x₂, x₃|θ), or
2. Generating S from g(S|θ), then generating (x₁, x₂, x₃) from f(x₁, x₂, x₃|S). Since the second step does not involve θ, no additional information will be obtained by knowing (x₁, x₂, x₃) once S is determined.
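The table above can be reproduced for any numeric θ; a minimal stdlib sketch showing that the conditional column f(x₁, x₂, x₃|S) is free of θ:

```python
from itertools import product
from collections import defaultdict

theta = 0.3   # any success probability gives the same conditional column

def joint(x, th):
    """f(x1, x2, x3 | theta) for independent Bernoulli(theta) trials."""
    s = sum(x)
    return th**s * (1 - th)**(len(x) - s)

# g(S | theta): total probability of each value of S = sum of the x's.
gS = defaultdict(float)
for x in product([0, 1], repeat=3):
    gS[sum(x)] += joint(x, theta)

for x in product([0, 1], repeat=3):
    cond = joint(x, theta) / gS[sum(x)]
    print(x, round(cond, 4))   # 1 for S = 0, 3 and 1/3 for S = 1, 2
```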
The Sufficiency Principle

Any decision regarding the parameter θ should be based on a set of sufficient statistics S₁(x), S₂(x), …, Sₖ(x) and not otherwise on the value of x.

A useful approach in developing a statistical procedure:

1. Find sufficient statistics.
2. Develop estimators, tests of hypotheses etc. using only these statistics.
Defn (Minimal Sufficient Statistics)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Then S = (S₁(x), S₂(x), S₃(x), …, Sₖ(x)) is a set of Minimal Sufficient statistics for the parameter vector θ if S is a set of sufficient statistics and can be calculated from any other set of sufficient statistics.

Theorem (The Factorization Criterion)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Then S = (S₁(x), S₂(x), S₃(x), …, Sₖ(x)) is a set of sufficient statistics for the parameter vector θ if

f(x|θ) = h(x) g(S, θ) = h(x) g(S₁(x), S₂(x), S₃(x), …, Sₖ(x), θ)

This is useful for finding sufficient statistics: if you can factor out the θ-dependence with a set of statistics, then these statistics are a set of sufficient statistics.
Defn (Completeness)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Then S = (S₁(x), S₂(x), S₃(x), …, Sₖ(x)) is a set of Complete Sufficient statistics for the parameter vector θ if S is a set of sufficient statistics and whenever

E[φ(S₁(x), S₂(x), S₃(x), …, Sₖ(x))] = 0

then

P[φ(S₁(x), S₂(x), S₃(x), …, Sₖ(x)) = 0] = 1
Defn (The Exponential Family)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Then f(x|θ) is said to be a member of the exponential family of distributions if:

f(x|θ) = h(x) g(θ) exp[ Σᵢ₌₁ᵏ pᵢ(θ) Sᵢ(x) ]   for a < xᵢ < b,   0 otherwise

where θ ∈ Ω and

1) −∞ ≤ a < b ≤ ∞ are not dependent on θ.
2) Ω contains a nondegenerate k-dimensional rectangle.
3) g(θ), a, b and pᵢ(θ) are not dependent on x.
4) h(x), a, b and Sᵢ(x) are not dependent on θ.

If in addition:

5) The Sᵢ(x) are functionally independent for i = 1, 2, …, k.
6) ∂Sᵢ(x)/∂xⱼ exists and is continuous for all i = 1, 2, …, k and j = 1, 2, …, n.
7) pᵢ(θ) is a continuous function of θ for all i = 1, 2, …, k.
8) R = {(p₁(θ), p₂(θ), …, pₖ(θ)) | θ ∈ Ω} contains a nondegenerate k-dimensional rectangle.

Then the set of statistics S₁(x), S₂(x), …, Sₖ(x) forms a Minimal Complete set of Sufficient statistics.
Defn (The Likelihood function)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Then, for a given value of the observation vector x, the Likelihood function, Lₓ(θ), is defined by:

Lₓ(θ) = f(x|θ)   with θ ∈ Ω

The log-Likelihood function lₓ(θ) is defined by:

lₓ(θ) = ln Lₓ(θ) = ln f(x|θ)   with θ ∈ Ω

The Likelihood Principle

Any decision regarding the parameter θ should be based on the likelihood function Lₓ(θ) and not otherwise on the value of x. If two data sets result in the same likelihood function, the decision regarding θ should be the same.

Some statisticians find it useful to plot the likelihood function Lₓ(θ) given the value of x. It summarizes the information contained in x regarding the parameter vector θ.
An Example

The data vector: x = (x₁, x₂, x₃, …, xₙ), a sample from the normal distribution with mean μ and variance σ².

The joint distribution of x: Then f(x|μ, σ²) = f(x₁, x₂, …, xₙ | μ, σ²), the joint density of x = (x₁, x₂, x₃, …, xₙ), takes on the form:

f(x|μ, σ²) = (2πσ²)^(−n/2) e^(−(1/(2σ²)) Σᵢ₌₁ⁿ (xᵢ − μ)²)

where the unknown parameter vector θ = (μ, σ²) ∈ Ω = {(x, y) | −∞ < x < ∞, 0 ≤ y < ∞}.
The Likelihood function

Assume the data vector is known: x = (x₁, x₂, x₃, …, xₙ).

Then the Likelihood function L(μ, σ) = f(x|μ, σ²) = f(x₁, x₂, …, xₙ | μ, σ²) is

L(μ, σ) = Πᵢ₌₁ⁿ (2πσ²)^(−1/2) e^(−(xᵢ − μ)²/(2σ²)) = (2πσ²)^(−n/2) e^(−(1/(2σ²)) Σᵢ₌₁ⁿ (xᵢ − μ)²)

Now

Σᵢ₌₁ⁿ (xᵢ − μ)² = Σᵢ₌₁ⁿ (xᵢ − x̄)² + n(x̄ − μ)² = (n − 1)s² + n(x̄ − μ)²

since x̄ = (1/n) Σᵢ₌₁ⁿ xᵢ and s² = (1/(n − 1)) Σᵢ₌₁ⁿ (xᵢ − x̄)². Hence

L(μ, σ) = (2πσ²)^(−n/2) e^(−[(n − 1)s² + n(x̄ − μ)²]/(2σ²))
[Graph: contour map of the likelihood, n = 100]

Now consider the following data (n = 100):

57.1 72.3 75.0 57.8 50.3 48.0 49.6 53.1 58.5 53.7
77.8 43.0 69.8 65.1 71.1 44.4 64.4 52.9 56.4 43.9
49.0 37.6 65.5 50.4 40.7 66.9 51.5 55.8 49.1 59.5
64.5 67.6 79.9 48.0 68.1 68.0 65.8 61.3 75.0 78.0
61.8 69.0 56.2 77.2 57.5 84.0 45.5 64.4 58.7 77.5
81.9 77.1 58.7 71.2 58.1 50.3 53.2 47.6 53.3 76.4
69.8 57.8 65.9 63.0 43.5 70.7 85.2 57.2 78.9 72.9
78.6 53.9 61.9 75.2 62.2 53.2 73.0 38.9 75.4 69.7
68.8 77.0 51.2 65.6 44.7 40.4 72.1 68.1 82.2 64.7
83.1 71.9 65.4 45.0 51.6 48.3 58.5 65.3 65.9 59.6

mean x̄ = 62.02,   s = 11.8571

For these data

L(μ, σ) = (6.2832σ²)^(−50) e^(−[99(11.8571)² + 100(62.02 − μ)²]/(2σ²))
[Graph: the likelihood surface L(μ, σ), n = 100]

[Graph: contour map of the likelihood, n = 100]
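The likelihood depends on the data only through the sufficient statistics x̄ and s; a hedged sketch that evaluates the normal log-likelihood this way, using the summaries 62.02 and 11.8571 from the data above (the comparison points 60.0 and 12.0 are arbitrary):

```python
import math

def log_lik(mu, sigma, n, xbar, s):
    """Normal log-likelihood written via (n-1)s^2 + n(xbar - mu)^2."""
    ss = (n - 1) * s**2 + n * (xbar - mu)**2
    return -n / 2 * math.log(2 * math.pi * sigma**2) - ss / (2 * sigma**2)

n, xbar, s = 100, 62.02, 11.8571
# The likelihood is maximized at mu = xbar and sigma^2 = (n-1) s^2 / n:
sigma_hat = math.sqrt((n - 1) * s**2 / n)
print(log_lik(xbar, sigma_hat, n, xbar, s) >= log_lik(60.0, 12.0, n, xbar, s))  # True
```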
The Sufficiency Principle

Any decision regarding the parameter θ should be based on a set of sufficient statistics S₁(x), S₂(x), …, Sₖ(x) and not otherwise on the value of x. If two data sets result in the same values for the set of sufficient statistics, the decision regarding θ should be the same.

Theorem (Birnbaum – Equivalency of the Likelihood Principle and Sufficiency Principle)

Lₓ₁(θ) ∝ Lₓ₂(θ)   if and only if   S₁(x₁) = S₁(x₂), …, and Sₖ(x₁) = Sₖ(x₂)
The following table gives the possible values of (x₁, x₂, x₃).

(x₁, x₂, x₃)   f(x₁, x₂, x₃|θ)   S = Σxᵢ   g(S|θ)        f(x₁, x₂, x₃|S)
(0, 0, 0)      (1 − θ)³          0         (1 − θ)³      1
(1, 0, 0)      (1 − θ)²θ         1                       1/3
(0, 1, 0)      (1 − θ)²θ         1         3(1 − θ)²θ    1/3
(0, 0, 1)      (1 − θ)²θ         1                       1/3
(1, 1, 0)      (1 − θ)θ²         2                       1/3
(1, 0, 1)      (1 − θ)θ²         2         3(1 − θ)θ²    1/3
(0, 1, 1)      (1 − θ)θ²         2                       1/3
(1, 1, 1)      θ³                3         θ³            1
The Likelihood function

[Graphs: Lₓ(θ) for 0 ≤ θ ≤ 1, one panel for each of S = 0, 1, 2, 3]
Estimation Theory

Point Estimation

Defn (Estimator)

Let x = (x₁, x₂, x₃, …, xₙ) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Then an estimator of the parameter φ(θ) = φ(θ₁, θ₂, …, θₖ) is any function T(x) = T(x₁, x₂, x₃, …, xₙ) of the observation vector.
Defn (Mean Square Error)

Let x = (x₁, x₂, x₃, …, xₙ) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Let T(x) be an estimator of the parameter φ(θ). Then the Mean Square Error of T(x) is defined to be:

M.S.E._T(θ) = E[(T(x) − φ(θ))²] = ∫ (T(x) − φ(θ))² f(x|θ) dx

Defn (Uniformly Better)

Let x = (x₁, x₂, x₃, …, xₙ) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Let T(x) and T*(x) be estimators of the parameter φ(θ). Then T(x) is said to be uniformly better than T*(x) if:

M.S.E._T(θ) ≤ M.S.E._T*(θ)   whenever θ ∈ Ω

Defn (Unbiased)

Let x = (x₁, x₂, x₃, …, xₙ) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Let T(x) be an estimator of the parameter φ(θ). Then T(x) is said to be an unbiased estimator of the parameter φ(θ) if:

E[T(x)] = ∫ T(x) f(x|θ) dx = φ(θ)
Theorem (Cramer-Rao Lower Bound)

Let x = (x₁, x₂, x₃, …, xₙ) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Suppose that:

i) ∂f(x|θ)/∂θ exists for all x and for all θ ∈ Ω.

ii) ∂/∂θ ∫ f(x|θ) dx = ∫ ∂f(x|θ)/∂θ dx

iii) ∂/∂θ ∫ t(x) f(x|θ) dx = ∫ t(x) ∂f(x|θ)/∂θ dx

iv) 0 < E[(∂ ln f(x|θ)/∂θᵢ)²] < ∞   for all θ ∈ Ω.

Let M denote the p × p matrix with ijth element

mᵢⱼ = −E[∂² ln f(x|θ)/∂θᵢ∂θⱼ],   i, j = 1, 2, …, p

Then V = M⁻¹ is the lower bound for the covariance matrix of unbiased estimators of θ. That is, var(c′θ̂) = c′ var(θ̂) c ≥ c′M⁻¹c = c′Vc, where θ̂ is a vector of unbiased estimators of θ.
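For a concrete instance of the bound (a hedged sketch with arbitrary n, μ, σ): for a normal sample with σ known, the information for μ is n/σ², so the bound for unbiased estimators of μ is σ²/n, and the sample mean attains it.

```python
import numpy as np

rng = np.random.default_rng(7)
n, sigma, reps = 20, 2.0, 100_000
xbar = rng.normal(3.0, sigma, size=(reps, n)).mean(axis=1)   # many sample means
crlb = sigma**2 / n                                          # Cramer-Rao bound for mu
print(xbar.var(), crlb)                                      # both near 0.2
```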
Defn (Uniformly Minimum Variance Unbiased Estimator)

Let x = (x₁, x₂, x₃, …, xₙ) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Then T*(x) is said to be the UMVU (Uniformly Minimum Variance Unbiased) estimator of φ(θ) if:

1) E[T*(x)] = φ(θ) for all θ ∈ Ω.
2) Var[T*(x)] ≤ Var[T(x)] for all θ ∈ Ω whenever E[T(x)] = φ(θ).

Theorem (Rao-Blackwell)

Let x = (x₁, x₂, x₃, …, xₙ) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Let S₁(x), S₂(x), …, Sₖ(x) denote a set of sufficient statistics. Let T(x) be any unbiased estimator of φ(θ). Then T*[S₁(x), S₂(x), …, Sₖ(x)] = E[T(x) | S₁(x), S₂(x), …, Sₖ(x)] is an unbiased estimator of φ(θ) such that:

Var[T*(S₁(x), S₂(x), …, Sₖ(x))] ≤ Var[T(x)]   for all θ ∈ Ω.

Theorem (Lehmann-Scheffe')

Let x = (x₁, x₂, x₃, …, xₙ) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Let S₁(x), S₂(x), …, Sₖ(x) denote a set of complete sufficient statistics. Let T*[S₁(x), S₂(x), …, Sₖ(x)] be an unbiased estimator of φ(θ). Then:

T*(S₁(x), S₂(x), …, Sₖ(x)) is the UMVU estimator of φ(θ).
Defn (Consistency)
7/28/2019 Section 02 Review of Probability and Statistics
http://slidepdf.com/reader/full/section-02-review-of-probability-and-statistics 164/183
Let x = ( x1 , x2 , x3 , ... , xn) denote the vector of observations having joint density f (x|q) where
the unknown parameter vector q W. Let T n(x)
be an estimator of f (q). Then T n(x) is called aconsistent estimator of f (q) if for any e > 0:
W θθx allfor 0lim e f nn T P
Defn (M. S. E. Consistency)
Let x = (x1, x2, x3, ..., xn) denote the vector of observations having joint density f(x|θ), where the
unknown parameter vector θ ∈ Ω. Let Tn(x) be an estimator of φ(θ). Then Tn(x) is called an M.S.E.
consistent estimator of φ(θ) if:

lim(n→∞) M.S.E.[Tn(x)] = lim(n→∞) E[(Tn(x) − φ(θ))²] = 0 for all θ ∈ Ω.
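For instance (a hypothetical sketch, not from the slides), the sample mean of n Uniform(0, 1) draws has M.S.E. equal to its variance, (1/12)/n, which tends to 0 as n grows, so it is M.S.E. consistent for the true mean 0.5. A quick simulation confirms the decay (the helper name is mine):

```python
import random

def empirical_mse(n, reps=2000, seed=1):
    """Empirical M.S.E. of T_n = mean of n Uniform(0, 1) draws as an
    estimator of the true mean 0.5; theory gives MSE = (1/12)/n -> 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xbar = sum(rng.random() for _ in range(n)) / n
        total += (xbar - 0.5) ** 2
    return total / reps

for n in (10, 100, 1000):
    print(n, round(empirical_mse(n), 6))  # roughly 1/(12*n)
```

Each tenfold increase in n shrinks the empirical M.S.E. by roughly a factor of ten, matching the (1/12)/n rate.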
Methods for Finding Estimators
1. The Method of Moments
2. Maximum Likelihood Estimation
Method of Moments
Let x1, … , xn denote a sample from the density function

f(x; θ1, … , θp) = f(x; θ)

The kth moment of the distribution being sampled is defined to be:

μk = μk(θ1, … , θp) = E[x^k] = ∫ x^k f(x; θ1, … , θp) dx
The kth sample moment is defined to be:

mk = (1/n) Σ(i = 1 to n) xi^k

To find the method of moments estimators of θ1, … , θp we set up the equations:

μ1(θ1, … , θp) = m1
μ2(θ1, … , θp) = m2
...
μp(θ1, … , θp) = mp
We then solve these equations for θ1, … , θp. The solutions θ̂1, … , θ̂p are called the method of
moments estimators.
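As a worked sketch (my own example, not from the slides, assuming a Gamma(shape k, scale s) model, for which μ1 = ks and μ2 = k(k+1)s²): solving μ1 = m1 and μ2 = m2 gives s-hat = (m2 − m1²)/m1 and k-hat = m1²/(m2 − m1²). The helper name below is mine:

```python
import random

def gamma_mom(xs):
    """Method-of-moments estimates for a Gamma(shape k, scale s) sample:
    matching mu1 = k*s and mu2 = k*(k+1)*s^2 to m1, m2 gives
    s-hat = (m2 - m1^2)/m1 and k-hat = m1^2/(m2 - m1^2)."""
    n = len(xs)
    m1 = sum(xs) / n                 # first sample moment
    m2 = sum(x * x for x in xs) / n  # second sample moment
    s_hat = (m2 - m1 * m1) / m1
    k_hat = m1 * m1 / (m2 - m1 * m1)
    return k_hat, s_hat

rng = random.Random(42)
sample = [rng.gammavariate(3.0, 2.0) for _ in range(200000)]
k_hat, s_hat = gamma_mom(sample)
print(f"k-hat = {k_hat:.3f} (true 3), s-hat = {s_hat:.3f} (true 2)")
```

With a large sample, both moment estimates land close to the true shape 3 and scale 2.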
The Method of Maximum Likelihood
Suppose that the data x1, … , xn has joint density function

f(x1, … , xn; θ1, … , θp)

where θ = (θ1, … , θp) are unknown parameters assumed to lie in Ω (a subset of p-dimensional
space). We want to estimate the parameters θ1, … , θp.
Definition: Maximum Likelihood Estimation
Suppose that the data x1, … , xn has joint density function

f(x1, … , xn; θ1, … , θp)

Then the likelihood function is defined to be

L(θ) = L(θ1, … , θp) = f(x1, … , xn; θ1, … , θp)

The maximum likelihood estimators of the parameters θ1, … , θp are the values θ̂1, … , θ̂p
such that

L(θ̂1, … , θ̂p) = max over (θ1, … , θp) in Ω of L(θ1, … , θp)
Note: maximizing L(θ1, … , θp) is equivalent to maximizing

l(θ1, … , θp) = ln L(θ1, … , θp)

the log-likelihood function.
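For example (a hypothetical sketch, not from the slides), for an Exponential(θ) sample with f(x; θ) = θ e^(−θx), the log-likelihood is l(θ) = n ln θ − θ Σ xi, and setting dl/dθ = 0 gives θ̂ = 1/x̄. A crude grid search over the log-likelihood (the function name is mine) recovers the same value:

```python
import math
import random

def exp_mle_grid(xs, lo=0.01, hi=10.0, steps=20000):
    """Maximize the exponential log-likelihood
    l(theta) = n*ln(theta) - theta*sum(x_i) by brute-force grid search."""
    n, sx = len(xs), sum(xs)

    def loglik(t):
        return n * math.log(t) - t * sx

    return max((lo + (hi - lo) * i / steps for i in range(steps + 1)),
               key=loglik)

rng = random.Random(7)
xs = [rng.expovariate(2.0) for _ in range(5000)]  # true theta = 2
grid = exp_mle_grid(xs)
closed = len(xs) / sum(xs)  # dl/dtheta = 0  =>  theta-hat = 1/x-bar
print(f"grid search {grid:.3f}, closed form {closed:.3f}")
```

The numeric maximizer agrees with the calculus answer 1/x̄ to within the grid spacing, and both sit near the true θ = 2.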
Hypothesis Testing
Defn (Test of size α)
Let x = (x1, x2, x3, ..., xn) denote the vector of observations having joint density f(x|θ), where
the unknown parameter vector θ ∈ Ω. Let ω be any subset of Ω. Consider testing the null hypothesis

H0: θ ∈ ω

against the alternative hypothesis

H1: θ ∉ ω.
Let A denote the acceptance region for the test (all values x = (x1, x2, x3, ..., xn) such that the
decision to accept H0 is made) and let C denote the critical region for the test (all values
x = (x1, x2, x3, ..., xn) such that the decision to reject H0 is made).

Then the test is said to be of size α if

P[x ∈ C|θ] = ∫_C f(x|θ) dx ≤ α for all θ ∈ ω, and

P[x ∈ C|θ] = ∫_C f(x|θ) dx = α for at least one θ0 ∈ ω.
Defn (Power)

Let x = (x1, x2, x3, ..., xn) denote the vector of observations having joint density f(x|θ), where the
unknown parameter vector θ ∈ Ω. Consider testing the null hypothesis

H0: θ ∈ ω

against the alternative hypothesis

H1: θ ∉ ω,

where ω is any subset of Ω. Then the power of the test at θ ∉ ω is defined to be:

Power(θ) = P[x ∈ C|θ] = ∫_C f(x|θ) dx

where C is the critical region of the test.
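As a hypothetical illustration (not from the slides): for n N(θ, 1) observations, testing H0: θ = 0 against H1: θ > 0 by rejecting when x̄ > z_α/√n has power 1 − Φ(z_α − θ√n), which can be evaluated with the error function (the function names below are mine):

```python
import math

def std_normal_cdf(z):
    """Phi(z), the standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def power(theta, n=25):
    """Power at theta of the size-0.05 test of H0: theta = 0 vs
    H1: theta > 0 that rejects when x-bar > z_0.05 / sqrt(n)."""
    z_alpha = 1.6448536269514722  # Phi(z_alpha) = 0.95
    return 1.0 - std_normal_cdf(z_alpha - theta * math.sqrt(n))

for theta in (0.0, 0.2, 0.5, 1.0):
    print(theta, round(power(theta), 3))
# at theta = 0 the power equals the size, 0.05
```

The power rises from the size α = 0.05 at the null value toward 1 as θ moves away from it, which is the behavior the definition is meant to capture.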
Defn (Uniformly Most Powerful (UMP) test of size α)
Let x = (x1, x2, x3, ..., xn) denote the vector of observations having joint density f(x|θ), where the
unknown parameter vector θ ∈ Ω. Consider testing the null hypothesis

H0: θ ∈ ω

against the alternative hypothesis

H1: θ ∉ ω,

where ω is any subset of Ω. Let C denote the critical region for the test. Then the test is called
the UMP test of size α if:
P[x ∈ C|θ] = ∫_C f(x|θ) dx ≤ α for all θ ∈ ω, and

P[x ∈ C|θ] = ∫_C f(x|θ) dx = α for at least one θ0 ∈ ω,

and for any other critical region C* such that:

P[x ∈ C*|θ] = ∫_C* f(x|θ) dx ≤ α for all θ ∈ ω, and

P[x ∈ C*|θ] = ∫_C* f(x|θ) dx = α for at least one θ0 ∈ ω,

then

∫_C* f(x|θ) dx ≤ ∫_C f(x|θ) dx for all θ ∉ ω.
Theorem (Neyman-Pearson Lemma)

Let x = (x1, x2, x3, ..., xn) denote the vector of observations having joint density f(x|θ), where the
unknown parameter vector θ ∈ Ω = {θ0, θ1}. Consider testing the null hypothesis

H0: θ = θ0

against the alternative hypothesis

H1: θ = θ1.

Then the UMP test of size α has critical region:

C = { x : f(x|θ0) / f(x|θ1) ≤ K }

where K is chosen so that

P[x ∈ C|θ0] = ∫_C f(x|θ0) dx = α.
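To see the lemma in action (a hypothetical sketch with N(θ, 1) data, not from the slides): for H0: θ0 = 0 vs H1: θ1 = 1 the ratio f(x|θ0)/f(x|θ1) equals exp(n/2 − Σ xi), which is decreasing in x̄, so the region {ratio ≤ K} is exactly {x̄ ≥ c} for some cutoff c. The code below (names are mine) checks the monotonicity numerically:

```python
import math
import random

def np_ratio(xs, th0=0.0, th1=1.0):
    """Neyman-Pearson ratio f(x|theta0)/f(x|theta1) for i.i.d. N(theta, 1)
    data; the (2*pi)^(-n/2) normalizing constants cancel."""
    s0 = sum((x - th0) ** 2 for x in xs)
    s1 = sum((x - th1) ** 2 for x in xs)
    return math.exp(-0.5 * (s0 - s1))

rng = random.Random(3)
samples = [[rng.gauss(0.5, 1.0) for _ in range(20)] for _ in range(200)]
pairs = sorted((sum(xs) / len(xs), np_ratio(xs)) for xs in samples)
ratios = [r for _, r in pairs]
# ratio decreases as x-bar increases: {ratio <= K} is a right tail in x-bar
monotone = all(ratios[i] >= ratios[i + 1] for i in range(len(ratios) - 1))
print(monotone)  # True
```

This monotone-in-x̄ structure is why, in this example, the abstract likelihood-ratio region reduces to the familiar "reject when the sample mean is large" rule.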
Defn (Likelihood Ratio Test of size α)

Let x = (x1, x2, x3, ..., xn) denote the vector of observations having joint density f(x|θ), where the
unknown parameter vector θ ∈ Ω. Consider testing the null hypothesis

H0: θ ∈ ω

against the alternative hypothesis

H1: θ ∉ ω,

where ω is any subset of Ω. Then the likelihood ratio (LR) test of size α has critical region:

C = { x : λ(x) = [max over θ in ω of f(x|θ)] / [max over θ in Ω of f(x|θ)] ≤ K }

where K is chosen so that

P[x ∈ C|θ] = ∫_C f(x|θ) dx ≤ α for all θ ∈ ω, and

P[x ∈ C|θ] = ∫_C f(x|θ) dx = α for at least one θ0 ∈ ω.
Theorem (Asymptotic distribution of the likelihood ratio test criterion)

Let x = (x1, x2, x3, ..., xn) denote the vector of observations having joint density f(x|θ), where the
unknown parameter vector θ ∈ Ω. Consider testing the null hypothesis

H0: θ ∈ ω

against the alternative hypothesis

H1: θ ∉ ω,

where ω is any subset of Ω. Let

λ(x) = [max over θ in ω of f(x|θ)] / [max over θ in Ω of f(x|θ)]

Then, under proper regularity conditions, U = −2 ln λ(x) has asymptotically a chi-square
distribution with degrees of freedom equal to dim(Ω) − dim(ω).
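A simulation sketch of this result (my own hypothetical example, not from the slides): with N(μ, 1) data and H0: μ = 0 against the unrestricted alternative, the criterion simplifies to −2 ln λ(x) = n x̄² and dim(Ω) − dim(ω) = 1, so U should behave like a chi-square with 1 degree of freedom under H0 (the function name is mine):

```python
import random

def minus_two_log_lambda(xs):
    """-2 ln lambda for H0: mu = 0 vs the unrestricted alternative with
    N(mu, 1) data: the maximized-likelihood ratio reduces to n * x-bar^2."""
    n = len(xs)
    xbar = sum(xs) / n
    return n * xbar * xbar

rng = random.Random(11)
stats = [minus_two_log_lambda([rng.gauss(0.0, 1.0) for _ in range(30)])
         for _ in range(10000)]
mean_u = sum(stats) / len(stats)
tail = sum(s > 3.841 for s in stats) / len(stats)
# chi-square(1) has mean 1 and 95th percentile 3.841
print(f"mean of U = {mean_u:.3f}, P[U > 3.841] = {tail:.3f}")
```

In this particular model the chi-square law is in fact exact, since √n x̄ is standard normal under H0, so the simulated mean sits near 1 and the tail probability near 0.05.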