ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
ACTL2002/ACTL5101 Probability and Statistics
© Katja Ignatieva
School of Risk and Actuarial Studies, Australian School of Business
University of New South Wales

Week 3 Video Lecture Notes
Probability: Week 1 Week 2 Week 3 Week 4
Estimation: Week 5 Week 6 Review
Hypothesis testing: Week 7 Week 8 Week 9
Linear regression: Week 10 Week 11 Week 12
Video lectures: Week 1 VL Week 2 VL Week 4 VL Week 5 VL
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Introduction
Special sampling distributions & sample mean and variance
Numerical methods to summarize data: Introduction; Measures of location & spread; Numerical example

Graphical procedures to summarize data: Summarizing data
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Introduction
Population vs sample
Population: the large body of data;
Sample: a subset of the population.
Question: For the following four cases, would we refer to a population or a sample?

1. All the actuaries in Australia;
2. The temperature on 5, randomly chosen, days;
3. All NSW cars;
4. The basket of goods of each fifth customer on a given day.

Solution: 1. Population; 2. Sample; 3. Population; 4. Sample.
402/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Introduction
Summarising data: Numerical approaches
Given a set of observations x₁, x₂, x₃, ..., xₙ selected from a population (usually assumed i.i.d., independent and identically distributed).

Sorted data in ascending order: x₍₁₎, x₍₂₎, ..., x₍ₙ₎, such that x₍₁₎ is the smallest and x₍ₙ₎ is the largest.

Objectives:

- Understand the main features of the data and summarise the data (an essential first step in analysing data);
- Make inferences about the population (more on this later in the course).
403/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Measures of location & spread
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Measures of location & spread
Measures of location

Used to estimate the central point of the sample; also called measures of central tendency.

The sample mean is given by:
\[ \bar{x} = \frac{1}{n} \sum_{k=1}^{n} x_k. \]

The population mean is given by:
\[ \mu_X = \sum_{\text{all } x} p_X(x) \cdot x. \]

The 100α% trimmed mean, the average of the observations after discarding the lowest 100α% and the highest 100α%:
\[ \tilde{x}_\alpha = \frac{x_{(\lfloor n\alpha \rfloor + 1)} + \ldots + x_{(n - \lfloor n\alpha \rfloor)}}{n - 2\lfloor n\alpha \rfloor}, \]
where \( \lfloor n\alpha \rfloor \) is the greatest integer less than or equal to nα.
404/420
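The two location measures above can be sketched in a few lines of Python (an illustrative sketch; the helper names `sample_mean` and `trimmed_mean` are my own, not from the course):

```python
import math

def sample_mean(xs):
    """Plain sample mean: sum of the observations divided by n."""
    return sum(xs) / len(xs)

def trimmed_mean(xs, alpha):
    """100*alpha% trimmed mean: discard the floor(n*alpha) smallest and
    largest order statistics, then average the remaining n - 2*floor(n*alpha)."""
    n = len(xs)
    k = math.floor(n * alpha)      # observations trimmed from each end
    kept = sorted(xs)[k:n - k]     # x_(k+1), ..., x_(n-k)
    return sum(kept) / (n - 2 * k)
```

With alpha = 0 no observations are discarded and the trimmed mean reduces to the ordinary sample mean.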
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Measures of location & spread
Measures of spread

The sample variance:
\[ s^2 = \frac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})^2 = \frac{1}{n-1}\left( \sum_{k=1}^{n} x_k^2 + \sum_{k=1}^{n} \bar{x}^2 - 2 \sum_{k=1}^{n} x_k \bar{x} \right) = \frac{1}{n-1}\left( \sum_{k=1}^{n} x_k^2 - n \bar{x}^2 \right). \]

The population variance:
\[ \sigma^2 = \mathrm{Var}(X) = \sum_{\text{all } x} p_X(x) \cdot (x - \mu_X)^2 = \sum_{\text{all } x} p_X(x) \cdot x^2 - \mu_X^2. \]

Sample standard deviation: \( s = \sqrt{s^2} \).

Population standard deviation: \( \sigma = \sqrt{\sigma^2} \).
405/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Measures of location & spread
Quantiles

P_α, the αth quantile or (α × 100)th percentile, satisfies:
\[ \frac{1}{n}\,[\text{number of } x_k < P_\alpha] \le \alpha \le \frac{1}{n}\,[\text{number of } x_k \le P_\alpha], \]
approximated by linear interpolation as the ((n − 1)α + 1)th observation.

Quartiles: Q₁ (25th percentile) and Q₃ (75th percentile).

Quantile function: F_X⁻¹(u), u ∈ [0, 1], where F_X(x) = u.

Question: What are the 0.025, 0.16, 0.5, 0.84 and 0.975 quantiles of the N(0,1) distribution?

Solution: They are −1.96, −1, 0, 1 and 1.96, respectively.
406/420
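The interpolation rule for P_α and the normal quantiles quoted above can be checked with the standard library (a sketch; the `percentile` helper is my own):

```python
from statistics import NormalDist

def percentile(xs, alpha):
    """P_alpha via linear interpolation at the ((n-1)*alpha + 1)-th
    order statistic, following the rule on the slide."""
    srt = sorted(xs)
    pos = (len(xs) - 1) * alpha + 1     # 1-based, possibly fractional
    i = int(pos)                        # index of the lower order statistic
    if i >= len(xs):                    # alpha = 1 gives the maximum
        return srt[-1]
    return srt[i - 1] + (pos - i) * (srt[i] - srt[i - 1])

# N(0,1) quantiles quoted on the slide, rounded to two decimals:
q = [round(NormalDist().inv_cdf(a), 2) for a in (0.025, 0.5, 0.975)]
```

Here `q` evaluates to [-1.96, 0.0, 1.96], matching the slide.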
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Measures of location & spread
Mode: The mode m is the value that maximises the p.m.f. p_X(x) in the discrete case or the p.d.f. f_X(x) in the continuous case.

Median, M:
\[ M = \begin{cases} x_{\left(\frac{n+1}{2}\right)}, & \text{if } n \text{ is odd}; \\ \frac{1}{2}\left( x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)} \right), & \text{if } n \text{ is even}. \end{cases} \]

Median absolute deviation:
\[ \mathrm{MAD} = \text{median of the numbers } \{ |x_i - M| \}. \]

Range:
\[ R = x_{(n)} - x_{(1)}. \]

Interquartile range:
\[ \mathrm{IQR} = Q_3 - Q_1. \]
407/420
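A minimal sketch of these spread measures in Python (illustrative helpers, not course code):

```python
def median(xs):
    """Middle order statistic; average of the two middle ones when n is even."""
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 == 1 else (s[n // 2 - 1] + s[n // 2]) / 2

def mad(xs):
    """Median absolute deviation: the median of |x_i - M|."""
    m = median(xs)
    return median([abs(x - m) for x in xs])

def data_range(xs):
    """Range R = x_(n) - x_(1)."""
    return max(xs) - min(xs)
```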
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Numerical example
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Numerical example
Numerical example

An insurance company received 26 claims with the following amounts:

1 120  1 000    760    348  1 548  3 400    588
  990    975    346  1 100    752    335  1 245
  450  1 000  2 430  1 245    850    605    540
  478    584  1 406    760  1 000

with
\[ \sum_{i=1}^{26} x_i = 25\,855; \qquad \sum_{i=1}^{26} x_i^2 = 36\,904\,873. \]
408/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Numerical example
Numerical example
First step: arrange in ascending order:

335  588    990  1 245
346  605  1 000  1 406
348  752  1 000  1 548
450  760  1 000  2 430
478  760  1 100  3 400
540  850  1 120
584  975  1 245
409/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Numerical example
Numerical example

Some statistics:

Mean:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{25\,855}{26} = 994.42. \]

Variance:
\[ s^2 = \frac{1}{n-1}\left( \sum_{i=1}^{n} x_i^2 - n \bar{x}^2 \right) = \frac{1}{25}\left( 36\,904\,873 - 26 \cdot (994.42)^2 \right) = 447\,762.6. \]

Standard deviation:
\[ s = \sqrt{447\,762.6} = 669.2. \]
410/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Numerical example
Numerical example
Determine P₀.₃₅, i.e., the 35th percentile (0.35 quantile).

This is the (25 · 0.35 + 1) = 9.75th observation.

Linear interpolation then gives:
\[ P_{0.35} = x_{(9)} + 0.75 \cdot \left( x_{(10)} - x_{(9)} \right) = 0.25 \cdot x_{(9)} + 0.75 \cdot x_{(10)} = 605 + 0.75 \cdot 147 = 715.25. \]
411/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Numerical example
Numerical example
Recall: x₍₁₎ = 335 and x₍₂₆₎ = 3,400.

Range:
\[ R = 3{,}400 - 335 = 3{,}065. \]

Quartiles (using the ((n − 1)α + 1)th observation, here (25α + 1)):
\[ Q_1 = x_{(25 \cdot 0.25 + 1)} = x_{(7.25)} = 0.75\, x_{(7)} + 0.25\, x_{(8)} = 585 \]
\[ Q_2 = M = x_{(25 \cdot 0.5 + 1)} = x_{(13.5)} = \left( x_{(13)} + x_{(14)} \right) / 2 = 912.5 \]
\[ Q_3 = x_{(25 \cdot 0.75 + 1)} = x_{(19.75)} = 0.25\, x_{(19)} + 0.75\, x_{(20)} = 1{,}115 \]

Interquartile range:
\[ \mathrm{IQR} = Q_3 - Q_1 = 1{,}115 - 585 = 530. \]
412/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Numerical methods to summarize data
Numerical example
Numerical example
Determine the 10% trimmed mean:

Step 1: Compute \( \lfloor n\alpha \rfloor = \lfloor 26 \cdot 0.1 \rfloor = 2 \).

Hence, we discard the 2 smallest and the 2 largest observations.

Step 2: Compute the trimmed mean:
\[ \tilde{x}_{0.10} = \frac{x_{(3)} + \ldots + x_{(24)}}{22} = 879.27. \]
413/420
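The numbers in this worked example can be reproduced directly (an illustrative check in Python; the variable names are mine):

```python
import math

claims = [1120, 1000, 760, 348, 1548, 3400, 588,
          990, 975, 346, 1100, 752, 335, 1245,
          450, 1000, 2430, 1245, 850, 605, 540,
          478, 584, 1406, 760, 1000]
n = len(claims)                          # 26 observations
xbar = sum(claims) / n                   # 25,855 / 26, about 994.42
s2 = (sum(x * x for x in claims) - n * xbar ** 2) / (n - 1)
s = math.sqrt(s2)                        # about 669.2
srt = sorted(claims)
p35 = srt[8] + 0.75 * (srt[9] - srt[8])  # 9.75th order statistic = 715.25
trimmed10 = sum(srt[2:24]) / 22          # 10% trimmed mean, about 879.27
```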
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Graphical procedures to summarize data
Summarizing data
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Graphical procedures to summarize data
Summarizing data
Empirical cumulative distribution function (ecdf)

Given a set of observations x₁, ..., xₙ, the empirical cumulative distribution function is given by:
\[ F_n(x) = \frac{1}{n} \cdot (\text{number of observations } x_k \le x), \]
with
\[ \mathbb{E}[F_n(x)] = F_X(x), \qquad \mathrm{Var}(F_n(x)) = \frac{1}{n} \cdot F_X(x) \cdot (1 - F_X(x)). \]

Note: p_n(x) = (number of observations x_k = x)/n, the proportion of observations equal to x.

Proofs of E[F_n(x)] and Var(F_n(x)) are not part of the course.
414/420
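The definition of F_n can be sketched as follows (illustrative; `ecdf` is my own name):

```python
def ecdf(xs):
    """Return the empirical c.d.f. F_n: the proportion of observations <= x."""
    srt = sorted(xs)
    n = len(srt)
    def F(x):
        return sum(1 for v in srt if v <= x) / n
    return F
```

For example, `ecdf([1, 2, 2, 5])(2)` gives 0.75: three of the four observations are at most 2.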
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Graphical procedures to summarize data
Summarizing data
[Figure: e.c.d.f. F₂₆(x) of the 26 claim amounts; x-axis: claim amount (0 to 3,500); y-axis: e.c.d.f. (0 to 1).]

Data:
335   346   348
450   478   540
584   588   605
752   760   760
850   975   990
1000  1000  1000
1100  1120  1245
1245  1406  1548
2430  3400
415/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Graphical procedures to summarize data
Summarizing data
[Figure: histogram of the 26 claim amounts; x-axis: claim amount (0 to 3,500); y-axis: frequency (0 to 14).]

Data:
335   346   348
450   478   540
584   588   605
752   760   760
850   975   990
1000  1000  1000
1100  1120  1245
1245  1406  1548
2430  3400

Count the number of observations in each bin (0, 500], (500, 1000], (1000, 1500], (1500, 2000], (2000, 2500], (2500, 3000], (3000, 3500].

Bin sizes are chosen so that the histogram provides a good summary of the data, i.e., neither too narrow nor too wide.
416/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Graphical procedures to summarize data
Summarizing data
Stem-and-leaf display

Stem-and-leaf:

0 | 333
0 | 5556668889
1 | 0000011224
1 | 5
2 | 4
2 |
3 | 4

Data:
335   346   348
450   478   540
584   588   605
752   760   760
850   975   990
1000  1000  1000
1100  1120  1245
1245  1406  1548
2430  3400

Each row corresponds to a bin.
The number before | displays the number of thousands (or hundreds/tens, etc.).
Each number after | displays the next digit of an observation.
Note: rounding!
417/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Graphical procedures to summarize data
Summarizing data
Boxplot (Box-and-Whiskers plot)
[Figure: boxplot of the 26 claim sizes; y-axis: claim size (0 to 3,500).]

Data:
335   346   348
450   478   540
584   588   605
752   760   760
850   975   990
1000  1000  1000
1100  1120  1245
1245  1406  1548
2430  3400

Red line: median; blue box: Q₁ and Q₃ (height of box: IQR); black lines: 10th and 90th percentiles; red circles: outliers.
418/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Graphical procedures to summarize data
Summarizing data
Q-Q plot calculations

This is done by plotting the quantile function of your chosen distribution against the order statistics, x₍ᵢ₎.

A small continuity adjustment is made, too.

For the example above, a standard normal Q-Q plot, we have:

i                      1         2       · · ·     25        26
(i − 0.5)/26         0.0192    0.0577    · · ·   0.9423    0.9808
Φ⁻¹((i − 0.5)/26)   −2.0699   −1.5744    · · ·   1.5744    2.0699
x₍ᵢ₎                   335       346     · · ·    2 430     3 400
419/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
Graphical procedures to summarize data
Summarizing data
Q-Q plot (quantile-quantile plot)
[Figure: Q-Q plot of claim size (x-axis, 0 to 3,500) against standard normal quantiles (y-axis, −2.5 to 2.5).]

Data:
335   346   348
450   478   540
584   588   605
752   760   760
850   975   990
1000  1000  1000
1100  1120  1245
1245  1406  1548
2430  3400

The Q-Q plot shows whether a distribution is a good approximation and, if not, where it fails (e.g., in the tails). Calculations: see the previous slide.
420/420
ACTL2002/ACTL5101 Probability and Statistics: Week 3
ACTL2002/ACTL5101 Probability and Statistics
© Katja Ignatieva
School of Risk and Actuarial Studies, Australian School of Business
University of New South Wales

Week 3
Probability: Week 1 Week 2 Week 4
Estimation: Week 5 Week 6 Review
Hypothesis testing: Week 7 Week 8 Week 9
Linear regression: Week 10 Week 11 Week 12
Video lectures: Week 1 VL Week 2 VL Week 3 VL Week 4 VL Week 5 VL
ACTL2002/ACTL5101 Probability and Statistics: Week 3
Last two weeks
Introduction to probability;
Definition of probability measure, events;
Calculating with probabilities; multiplication rule, permutation, combination & multinomial;

Distribution function;

Moments: (non-)central moments, mean, variance (standard deviation), skewness & kurtosis;
Generating functions;
Special (parametric) univariate distributions.
501/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
This week
Joint probabilities:
- Discrete and continuous random variables;
- Bivariate and multivariate random variables;
Covariance;
Correlation;
Law of iterated expectations;
Conditional variance identity.
502/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Introduction
Joint & Multivariate Distributions
The Bivariate Case: Introduction; Exercises; Means, Variances, Covariances; Correlation coefficient; Conditional Distributions; The Bivariate Normal Distribution

Laws: Law of Iterated Expectations; Conditional variance identity; Application & Exercise

The Multivariate Case: Introduction

Summarizing data: Exercises

Summary: Summary
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Introduction
The Bivariate Case
We are often interested in the joint behavior of two (or more) random variables.

Denote a bivariate random vector by a pair as follows: X = [X₁, X₂]ᵀ. The joint distribution function of X is:
\[ F_{X_1,X_2}(x_1, x_2) = \Pr(X_1 \le x_1, X_2 \le x_2). \]

We can write:
\[ \Pr(a_1 \le X_1 \le b_1, a_2 \le X_2 \le b_2) = F_{X_1,X_2}(b_1, b_2) - F_{X_1,X_2}(b_1, a_2) - F_{X_1,X_2}(a_1, b_2) + F_{X_1,X_2}(a_1, a_2). \]
503/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Introduction
Discrete Random Variables
In the case where X₁ and X₂ are both discrete random variables which can take values

x₁₁, x₁₂, ... and x₂₁, x₂₂, ...

respectively, we define:
\[ p_{X_1,X_2}(x_{1i}, x_{2j}) = \Pr(X_1 = x_{1i}, X_2 = x_{2j}), \quad \text{for } i, j = 1, 2, \ldots \]
as the joint probability mass function of X; then:
\[ \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} p_{X_1,X_2}(x_{1i}, x_{2j}) = 1. \]
504/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Introduction
Discrete Random Variables
The marginal p.m.f.s of X₁ and X₂ are respectively:
\[ p_{X_1}(x_{1i}) = \sum_{j=1}^{\infty} p_{X_1,X_2}(x_{1i}, x_{2j}) \quad \text{and} \quad p_{X_2}(x_{2j}) = \sum_{i=1}^{\infty} p_{X_1,X_2}(x_{1i}, x_{2j}) \]
(sum over the other random variable(s)).

Proof: use the Law of Total Probability.
505/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Introduction
Example: discrete random variables

An insurer offers both disability insurance (DI) and unemployment insurance (UI) to small companies.

Most companies buy DI and UI, because of a large discount.

The claims are categorized as "no claims", "mild claims", and "severe claims".

Last year the 100 insured fell into the following categories:

DI | no  no    no      mild  mild  mild    severe  severe  severe
UI | no  mild  severe  no    mild  severe  no      mild    severe
 # | 74  6     2       3     2     4       1       3       5

Question: Find the marginal p.m.f.s of DI and UI.

Solution:

x       | no                   | mild                | severe
p_DI(x) | (74+6+2)/100 = 0.82  | (3+2+4)/100 = 0.09  | (1+3+5)/100 = 0.09
p_UI(x) | (74+3+1)/100 = 0.78  | (6+2+3)/100 = 0.11  | (2+4+5)/100 = 0.11
506/562
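The marginal p.m.f.s are just the row and column sums of the joint counts; a quick sketch (the variable names are my own):

```python
# Joint counts from the table: rows = DI, columns = UI,
# both ordered (no, mild, severe).
counts = [[74, 6, 2],
          [3, 2, 4],
          [1, 3, 5]]
n = sum(sum(row) for row in counts)            # 100 insured

# Marginals: sum the joint counts over the other variable.
p_DI = [sum(row) / n for row in counts]        # row sums
p_UI = [sum(col) / n for col in zip(*counts)]  # column sums
```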
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Introduction
Continuous Random Variables
In the case where X₁ and X₂ are both continuous random variables, we define the joint density function of X as:
\[ f_{X_1,X_2}(x_1, x_2) = \frac{\partial}{\partial x_1} \frac{\partial}{\partial x_2} F_{X_1,X_2}(x_1, x_2), \]
and therefore the joint cumulative distribution function is given by:
\[ F_{X_1,X_2}(x_1, x_2) = \int_{-\infty}^{x_2} \int_{-\infty}^{x_1} f_{X_1,X_2}(z_1, z_2)\, dz_1 dz_2. \]

Note:
\[ F_{X_1,X_2}(\infty, \infty) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X_1,X_2}(z_1, z_2)\, dz_1 dz_2 = 1, \qquad F_{X_1,X_2}(-\infty, -\infty) = 0. \]
507/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Introduction
Continuous Random Variables
The marginal density functions of X₁ and X₂ are respectively:
\[ f_{X_1}(x_1) = \int_{-\infty}^{\infty} f_{X_1,X_2}(x_1, z_2)\, dz_2 \quad \text{and} \quad f_{X_2}(x_2) = \int_{-\infty}^{\infty} f_{X_1,X_2}(z_1, x_2)\, dz_1. \]

The marginal cumulative distribution functions of X₁ and X₂ are then respectively:
\[ F_{X_1}(x_1) = \int_{-\infty}^{x_1} f_{X_1}(u)\, du \quad \text{and} \quad F_{X_2}(x_2) = \int_{-\infty}^{x_2} f_{X_2}(u)\, du, \]
or, alternatively:
\[ F_{X_1}(x_1) = \int_{-\infty}^{\infty} \int_{-\infty}^{x_1} f_X(u_1, u_2)\, du_1 du_2 \quad \text{and} \quad F_{X_2}(x_2) = \int_{-\infty}^{x_2} \int_{-\infty}^{\infty} f_X(u_1, u_2)\, du_1 du_2. \]
508/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Introduction
Continuous Random Variables: example
The joint p.d.f. of X and Y is given by:
\[ f_{X,Y}(x, y) = 4 \cdot x \cdot (1 - y), \quad \text{for } 0 \le x, y \le 1, \text{ and } 0 \text{ otherwise.} \]

a. The marginal p.d.f. of X is:
\[ f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dy = \int_0^1 4x(1 - y)\, dy = \left[ 4x\left( y - \tfrac{1}{2}y^2 \right) \right]_0^1 = 2x. \]

b. The marginal c.d.f. of X is:
\[ F_X(x) = \int_{-\infty}^{x} f_X(z)\, dz = \int_0^x 2z\, dz = \left[ z^2 \right]_0^x = x^2, \quad \text{if } 0 \le x \le 1, \]
and zero if x < 0 and one if x > 1.

c. The marginal p.d.f. of Y is:
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx = \int_0^1 4x(1 - y)\, dx = \left[ 2x^2(1 - y) \right]_0^1 = 2(1 - y). \]

d. The marginal c.d.f. of Y is:
\[ F_Y(y) = \int_{-\infty}^{y} f_Y(z)\, dz = \int_0^y 2(1 - z)\, dz = \left[ 2z - z^2 \right]_0^y = 2y - y^2, \quad \text{if } 0 \le y \le 1, \]
and zero if y < 0 and one if y > 1.
509/562
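The marginal results in parts a and b can be verified numerically; a sketch using a simple midpoint rule (all names are my own):

```python
def f_joint(x, y):
    """Joint density from the example: 4*x*(1 - y) on the unit square."""
    return 4 * x * (1 - y) if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

def integrate(g, a, b, m=10000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / m
    return sum(g(a + (i + 0.5) * h) for i in range(m)) * h

x0 = 0.7
fx = integrate(lambda y: f_joint(x0, y), 0, 1)  # marginal density, near 2*x0
Fx = integrate(lambda z: 2 * z, 0, x0)          # marginal c.d.f., near x0**2
```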
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Exercises
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Exercises
Exercise: Discrete case

Let X be the random variable taking one if there is a positive return on the asset portfolio and zero otherwise.

Let Y be the random variable for the claims for home insurance, which can take values 0, 1, 2, and 3 for few, normal, many claims, and a large number of claims due to floods, respectively.

The marginal probability mass functions of X and Y are:

X = x | Pr(X = x)
  0   |   1/2
  1   |   1/2

Y = y | Pr(Y = y)
  0   |   1/8
  1   |   3/8
  2   |   3/8
  3   |   1/8

Question: What would be the joint probability mass function if X and Y are independent?
510/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Exercises
Exercise: Discrete case
Solution: If the two are independent, we would have:
Pr (X = x ,Y = y) = Pr (X = x) · Pr (Y = y)
For all X = x and Y = y the joint distribution, if they are independent, is described in the table below:

Pr(X = x, Y = y) |       Y = y
     X = x       |  0     1     2     3
       0         | 1/16  3/16  3/16  1/16
       1         | 1/16  3/16  3/16  1/16
511/562
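Under independence the joint p.m.f. is the outer product of the marginals; this can be computed exactly with fractions (an illustrative sketch):

```python
from fractions import Fraction as Fr

p_X = [Fr(1, 2), Fr(1, 2)]                      # Pr(X = 0), Pr(X = 1)
p_Y = [Fr(1, 8), Fr(3, 8), Fr(3, 8), Fr(1, 8)]  # Pr(Y = 0), ..., Pr(Y = 3)

# Joint p.m.f. under independence: Pr(X = x, Y = y) = Pr(X = x) * Pr(Y = y).
joint = [[px * py for py in p_Y] for px in p_X]
```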
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Exercises
Exercise: Discrete case
Suppose instead they are not independent and their joint distribution is described as:

Pr(X = x, Y = y) |       Y = y
     X = x       |  0     1     2     3
       0         |  0    3/16  3/16  1/8
       1         | 1/8   3/16  3/16   0

Question: Prove that X and Y are dependent.

Solution: We have Pr(Y = 3) = 1/8 and Pr(X = 1) = 1/2; however, given that Y takes the value 3, the probability that X takes the value 1 is zero (the joint probability of Y = 3 and X = 1 is zero), so Pr(X = 1, Y = 3) ≠ Pr(X = 1) · Pr(Y = 3).
512/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Exercises
Example: Multinomial distribution
Suppose we have n independent trials with r outcomes with probabilities p₁, p₂, ..., p_r.

The joint frequency distribution is given by:
\[ p_{N_1,N_2,\ldots,N_r}(n_1, n_2, \ldots, n_r) = \frac{n!}{n_1! \cdot n_2! \cdot \ldots \cdot n_r!}\, p_1^{n_1} \cdot p_2^{n_2} \cdot \ldots \cdot p_r^{n_r}. \]

The marginal distribution (a Binomial distribution!) is obtained by summing over the other counts:
\[ p_{N_i}(n_i) = \sum_{n_1} \cdots \sum_{n_{i-1}} \sum_{n_{i+1}} \cdots \sum_{n_r} p_{N_1,N_2,\ldots,N_r}(n_1, n_2, \ldots, n_r) \stackrel{*}{=} \binom{n}{n_i} \cdot p_i^{n_i} \cdot (1 - p_i)^{n - n_i}. \]

* Using the Binomial expansion (proof not required).
513/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Exercises
Exercise: Continuous case
Now consider an example of a bivariate random vector [X, Y]ᵀ whose joint density function is:
\[ f_{X,Y}(x, y) = c\left( x^2 + xy \right), \quad \text{for } 0 \le x \le 1 \text{ and } 0 \le y \le 1, \]
and zero otherwise. To find the constant c, note that f must be a valid density, so that:
\[ 1 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx dy = \int_0^1 \int_0^1 c\left( x^2 + xy \right) dx dy = c \cdot \int_0^1 \left[ \frac{1}{3}x^3 + \frac{1}{2}x^2 y \right]_0^1 dy = c \cdot \left[ \frac{1}{3}y + \frac{1}{4}y^2 \right]_0^1 = c \cdot \frac{7}{12}. \]

Hence c = 12/7; then also f_{X,Y}(x, y) ≥ 0 for all x, y.

a. Question: Find the marginal densities.

b. Question: Find the joint distribution function.
514/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Exercises
Exercise: Continuous case
a. Solution: Knowing the constant, we can then determine the marginal densities. First the marginal density of X:
\[ f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dy = \int_0^1 \frac{12}{7}\left( x^2 + xy \right) dy = \frac{12}{7}\left( x^2 + \frac{1}{2}x \right), \quad \text{for } 0 \le x \le 1, \]
and zero otherwise; and for Y:
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx = \int_0^1 \frac{12}{7}\left( x^2 + xy \right) dx = \frac{12}{7}\left( \frac{1}{3} + \frac{1}{2}y \right), \quad \text{for } 0 \le y \le 1, \]
and zero otherwise.
515/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Exercises
Exercise: Continuous case

b. Solution: You can also determine the joint distribution function, if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, by:
\[ F_{X,Y}(x, y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f_{X,Y}(u, v)\, du dv = \int_0^y \int_0^x \frac{12}{7}\left( u^2 + uv \right) du dv = \int_0^y \frac{12}{7}\left( \frac{1}{3}x^3 + \frac{1}{2}x^2 v \right) dv = \frac{12}{7}\left( \frac{1}{3}x^3 y + \frac{1}{4}x^2 y^2 \right). \]

Hence:
\[ F_{X,Y}(x, y) = \begin{cases} 0, & \text{if } x < 0 \text{ or } y < 0; \\ \frac{12}{7}\left( \frac{1}{3}x^3 y + \frac{1}{4}x^2 y^2 \right), & \text{if } 0 \le x \le 1,\ 0 \le y \le 1; \\ F_X(x), & \text{if } y > 1; \\ F_Y(y), & \text{if } x > 1. \end{cases} \]
516/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Exercises
Exercise: Continuous case
[Figure: surface plot of F_{X,Y}(x, y); panels of F_X(x) and F_Y(y); and a panel of the (x, y) unit square indicating the region used on slide 519.]
517/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Exercises
Exercise: Continuous case
You can then determine the marginal distributions:
\[ F_X(x) = F_{X,Y}(x, 1) = \begin{cases} 0, & \text{if } x < 0; \\ \frac{12}{7}\left( \frac{1}{3}x^3 + \frac{1}{4}x^2 \right), & \text{if } 0 \le x \le 1; \\ 1, & \text{if } x > 1, \end{cases} \]
and
\[ F_Y(y) = F_{X,Y}(1, y) = \begin{cases} 0, & \text{if } y < 0; \\ \frac{12}{7}\left( \frac{1}{3}y + \frac{1}{4}y^2 \right), & \text{if } 0 \le y \le 1; \\ 1, & \text{if } y > 1. \end{cases} \]
Can you confirm the marginal densities are correct?
518/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Exercises
Exercise: Continuous case
It becomes straightforward to compute probability statements such as (using the lower right panel on slide 517):
\[ \Pr(X < Y) = \int_0^1 \int_0^y \frac{12}{7}\left( x^2 + xy \right) dx dy = \frac{12}{7} \int_0^1 \left[ \frac{x^3}{3} + \frac{x^2 y}{2} \right]_0^y dy = \frac{12}{7} \int_0^1 \left( \frac{y^3}{3} + \frac{y^3}{2} \right) dy = \frac{12 \cdot 5}{7 \cdot 6} \left[ \frac{y^4}{4} \right]_0^1 = \frac{5}{14}, \]
so that \( \Pr(X > Y) = \int_{-\infty}^{\infty} \int_y^{\infty} f_{X,Y}(x, y)\, dx dy = 9/14. \)
519/562
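The value Pr(X < Y) = 5/14 can be checked by brute force; a sketch with a midpoint-rule double integral (names are my own; accuracy is limited by the staircase approximation of the boundary x = y):

```python
def f_joint(x, y):
    """Joint density 12/7 * (x**2 + x*y) on the unit square."""
    return 12 / 7 * (x * x + x * y)

def prob_x_less_y(m=400):
    """Approximate the double integral of the density over the region x < y."""
    h = 1.0 / m
    total = 0.0
    for i in range(m):
        for j in range(m):
            x, y = (i + 0.5) * h, (j + 0.5) * h
            if x < y:
                total += f_joint(x, y) * h * h
    return total

p = prob_x_less_y()   # close to 5/14, about 0.357
```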
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Means

Consider the bivariate random vector X = [X₁ X₂]ᵀ.

The mean of X is the vector whose elements are the means of X₁ and X₂, that is:
\[ \mathbb{E}[X] = \begin{bmatrix} \mathbb{E}[X_1] \\ \mathbb{E}[X_2] \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}. \]

If X₁, X₂, ..., Xₙ are jointly distributed random variables with expectations E[Xᵢ] for i = 1, ..., n and Y is an affine function of the Xᵢ, i.e.,
\[ Y = a + \sum_{i=1}^{n} b_i X_i, \]
then we have the additivity rule:
\[ \mathbb{E}[Y] = \mathbb{E}\left[ a + \sum_{i=1}^{n} b_i X_i \right] = a + \sum_{i=1}^{n} \mathbb{E}[b_i X_i] = a + \sum_{i=1}^{n} b_i \mathbb{E}[X_i]. \]
520/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Variances, Covariances

Recall: the variance of X is a measure of the spread of X.

Covariance is a measure of the spread between X₁ and X₂.

The variance of the random vector X is also called the variance-covariance matrix:
\[ \mathrm{Var}(X) = \begin{bmatrix} \mathrm{Var}(X_1) & \mathrm{Cov}(X_1, X_2) \\ \mathrm{Cov}(X_1, X_2) & \mathrm{Var}(X_2) \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{bmatrix}, \]
where the covariance is defined as:
\[ \mathrm{Cov}(X_1, X_2) \equiv \sigma_{12} = \mathbb{E}[(X_1 - \mu_1) \cdot (X_2 - \mu_2)] = \mathbb{E}[X_1 X_2 - X_1 \mu_2 - \mu_1 X_2 + \mu_1 \mu_2] = \mathbb{E}[X_1 X_2] - \mathbb{E}[X_1] \cdot \mathbb{E}[X_2]. \]

Note: Cov(Xᵢ, Xᵢ) = σᵢᵢ = σᵢ², and covariance is only defined for two random variables.
521/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Example: Consider the example from slide 506.

[Figure: 3-D bar chart of the joint probabilities of the DI and UI claim categories (no/mild/severe); z-axis: probability (0 to 0.8).]

Question: Is the covariance positive or negative?

Let "no" = 0, "mild" = 1, and "severe" = 2.

Question: Calculate the means of X₁ = DI and X₂ = UI.

Solution:
E[X₁] = (3+2+4)/100 · 1 + (1+3+5)/100 · 2 = 0.27.
E[X₂] = (6+2+3)/100 · 1 + (2+4+5)/100 · 2 = 0.33.

Question: Calculate the covariance between X₁ and X₂.

Solution:
E[X₁ · X₂] = 0.02 · 1 · 1 + 0.04 · 1 · 2 + 0.03 · 2 · 1 + 0.05 · 2 · 2 = 0.36.
Cov(X₁, X₂) = E[X₁ · X₂] − E[X₁] · E[X₂] = 0.36 − 0.27 · 0.33 = 0.2709.
522/562
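The same covariance calculation, spelled out over the full joint p.m.f. of the table on slide 506 (an illustrative sketch; variable names are mine):

```python
# Joint p.m.f. of (DI, UI) with categories coded no = 0, mild = 1, severe = 2.
p = {(0, 0): 0.74, (0, 1): 0.06, (0, 2): 0.02,
     (1, 0): 0.03, (1, 1): 0.02, (1, 2): 0.04,
     (2, 0): 0.01, (2, 1): 0.03, (2, 2): 0.05}

E_X1 = sum(x1 * pr for (x1, x2), pr in p.items())         # 0.27
E_X2 = sum(x2 * pr for (x1, x2), pr in p.items())         # 0.33
E_X1X2 = sum(x1 * x2 * pr for (x1, x2), pr in p.items())  # 0.36
cov = E_X1X2 - E_X1 * E_X2                                # 0.2709
```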
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Example: Consider the example from slide 509.

[Figure: surface plot of the joint density f_{X,Y}(x, y) = 4x(1 − y) on the unit square; z-axis: 0 to 4.]

Question: Is the covariance positive or negative?

Question: Calculate the means.

Solution:
E[X₁] = ∫ x · f_X(x) dx = ∫₀¹ 2x² dx = [2/3 · x³]₀¹ = 2/3.
E[X₂] = ∫ y · f_Y(y) dy = ∫₀¹ y · 2(1 − y) dy = [y² − 2/3 · y³]₀¹ = 1/3.

Question: Calculate the covariance between X₁ and X₂.

Solution:
E[X₁ · X₂] = ∫∫ f_{X,Y}(x, y) · x · y dx dy = ∫₀¹ ∫₀¹ 4x²(y − y²) dx dy = ∫₀¹ 4/3 · (y − y²) dy = 4/6 − 4/9 = 4/18 = 2/9.
Cov(X₁, X₂) = E[X₁ · X₂] − E[X₁] · E[X₂] = 2/9 − 2/3 · 1/3 = 0.
523/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Let X ∼ Beta(0.2, 1) (the probability of a claim) and Y | X ∼ NB(3, X) (so Y ∼ Beta-Negative-Binomial). Home insurance: an insured is qualified as a bad risk if there are 3 claims within 50 quarters.

Question: Does it have a negative or positive covariance?
524/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Properties of Covariance
If X and Y are jointly distributed random variables with expectations μ_X and μ_Y, the covariance of X and Y is:
\[ \mathrm{Cov}(X, Y) = \mathbb{E}[(X - \mu_X) \cdot (Y - \mu_Y)] = \mathbb{E}[XY - X\mu_Y - Y\mu_X + \mu_X \mu_Y] = \mathbb{E}[XY] - \mu_X \mu_Y. \]

If X and Y are independent:
\[ \mathrm{Cov}(X, Y) = \mathbb{E}[XY] - \mu_X \mu_Y \stackrel{*}{=} \mathbb{E}[X] \cdot \mathbb{E}[Y] - \mu_X \mu_Y = 0. \]

* Using independence of X, Y.
525/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Properties of Covariance
Let X, Y, Z be random variables and a, b ∈ ℝ; we have:
\[ \mathrm{Cov}(a + X, Y) = \mathbb{E}[(a + X - (a + \mu_X))(Y - \mu_Y)] = \mathbb{E}[(X - \mu_X)(Y - \mu_Y)] = \mathrm{Cov}(X, Y) \]
\[ \mathrm{Cov}(aX, bY) = \mathbb{E}[(aX - a\mu_X)(bY - b\mu_Y)] = a b\, \mathbb{E}[(X - \mu_X)(Y - \mu_Y)] = a b\, \mathrm{Cov}(X, Y) \]
\[ \mathrm{Cov}(X, Y + Z) = \mathbb{E}[(X - \mu_X)(Y + Z - \mu_Y - \mu_Z)] = \mathbb{E}[(X - \mu_X)(Y - \mu_Y) + (X - \mu_X)(Z - \mu_Z)] = \mathrm{Cov}(X, Y) + \mathrm{Cov}(X, Z). \]
526/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Properties of Covariance
Suppose X₁, X₂, Y₁ and Y₂ are random variables, and a, b, c, d ∈ ℝ; then:
\[ \mathrm{Cov}(aX_1 + bX_2, cY_1 + dY_2) \stackrel{*}{=} \mathrm{Cov}(aX_1 + bX_2, cY_1) + \mathrm{Cov}(aX_1 + bX_2, dY_2) \stackrel{*}{=} \mathrm{Cov}(aX_1, cY_1) + \mathrm{Cov}(aX_1, dY_2) + \mathrm{Cov}(bX_2, cY_1) + \mathrm{Cov}(bX_2, dY_2) \stackrel{**}{=} ac\,\mathrm{Cov}(X_1, Y_1) + ad\,\mathrm{Cov}(X_1, Y_2) + bc\,\mathrm{Cov}(X_2, Y_1) + bd\,\mathrm{Cov}(X_2, Y_2). \]

* using: Cov(X, Y + Z) = Cov(X, Y) + Cov(X, Z).
** using: Cov(aX, bY) = ab Cov(X, Y).
527/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Properties of Covariance
Let Xᵢ, Yⱼ be random variables and a, bᵢ, c, dⱼ ∈ ℝ for i = 1, ..., n and j = 1, ..., m.

We can generalize this as follows. Suppose:
\[ U = a + \sum_{i=1}^{n} b_i X_i \quad \text{and} \quad V = c + \sum_{j=1}^{m} d_j Y_j. \]

Then:
\[ \mathrm{Cov}(U, V) = \sum_{i=1}^{n} \sum_{j=1}^{m} b_i d_j\, \mathrm{Cov}(X_i, Y_j). \]
528/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Properties of Covariance
Note that Cov(X, X) = Var(X), so the variance of a sum of random variables is:
\[ \mathrm{Var}(X_1 + X_2) = \mathrm{Cov}(X_1 + X_2, X_1 + X_2) = \mathrm{Cov}(X_1, X_1) + \mathrm{Cov}(X_2, X_2) + 2\,\mathrm{Cov}(X_1, X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + 2\,\mathrm{Cov}(X_1, X_2). \]

Also:
\[ \mathrm{Var}(aX_1) = \mathrm{Cov}(aX_1, aX_1) = a^2 \mathrm{Cov}(X_1, X_1) = a^2 \mathrm{Var}(X_1), \]
using the result that we can take constants out of a covariance.
529/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Example: Covariance

Consider the example from slides 506 and 522.

The costs for disability insurance are $1 million if mild and $2 million if severe.

The costs for unemployment insurance are $0.5 million if mild and $1 million if severe.

The price of a contract is the expected value plus half the standard deviation.

Question: What is the price for DI, for UI, and for DI and UI combined?

Solution:
E[X₁²] = (3+2+4)/100 · 1² + (1+3+5)/100 · 2² = 0.45 and
E[X₂²] = (6+2+3)/100 · 1² + (2+4+5)/100 · 2² = 0.55.
Var(X₁) = E[X₁²] − (E[X₁])² = 0.45 − 0.27² = 0.3771 and
Var(X₂) = E[X₂²] − (E[X₂])² = 0.55 − 0.33² = 0.4411.
530/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Solution (cont.)

Price DI (= 1 million × X₁):
\[ \text{Price DI} = \mathbb{E}[X_1 \times \text{million}] + \sqrt{\mathrm{Var}(X_1 \times \text{million})}\,/\,2 = 0.27\,\text{million} + \sqrt{0.3771 \times \text{million}^2}\,/\,2 = 0.5770\,\text{million}. \]

Price UI (= 0.5 million × X₂):
\[ \text{Price UI} = \mathbb{E}[X_2 \times 0.5\,\text{million}] + \sqrt{\mathrm{Var}(X_2 \times 0.5\,\text{million})}\,/\,2 = 0.165\,\text{million} + \sqrt{0.4411 \times 0.25\,\text{million}^2}\,/\,2 = 0.3310\,\text{million}. \]
531/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Means, Variances, Covariances
Solution (cont.)

Price DI and UI combined (= 1 million × X₁ + 0.5 million × X₂):
\[
\begin{aligned}
\text{Price DI and UI} &= \mathbb{E}[X_1 \times \text{million} + X_2 \times 0.5\,\text{million}] + \sqrt{\mathrm{Var}(X_1 \times \text{million} + X_2 \times 0.5\,\text{million})}\,/\,2 \\
&= 0.27\,\text{million} + 0.165\,\text{million} + \sqrt{\left( 0.3771 + 0.4411/4 + 2 \cdot 0.5 \cdot 0.2709 \right) \times \text{million}^2}\,/\,2 \\
&= 0.8704\,\text{million}.
\end{aligned}
\]

This gives a 4.15% discount!
532/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Correlation coefficient
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Correlation coefficient
Correlation coefficient

Large covariance: high dependency or just large variances?

We define the correlation coefficient between X₁ and X₂ as:
\[ \rho(X_1, X_2) \equiv \frac{\mathrm{Cov}(X_1, X_2)}{\sqrt{\mathrm{Var}(X_1) \cdot \mathrm{Var}(X_2)}}, \]
provided Cov(X₁, X₂) exists and the variances Var(X₁) and Var(X₂) are each non-zero.

The value of the correlation coefficient is always between −1 and 1, i.e.:
\[ -1 \le \rho(X_1, X_2) \le 1. \]

Note: the correlation coefficient is only defined for two random variables.

Proof: see the next slides.
533/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Correlation coefficient
Proof: Let Y = X₁/σ₁ − X₂/σ₂. Since Var(Y) ≥ 0 we have:
\[ 0 \le \mathrm{Var}\left( \frac{X_1}{\sigma_1} - \frac{X_2}{\sigma_2} \right) = \mathrm{Var}\left( \frac{X_1}{\sigma_1} \right) + \mathrm{Var}\left( \frac{X_2}{\sigma_2} \right) - 2\,\mathrm{Cov}\left( \frac{X_1}{\sigma_1}, \frac{X_2}{\sigma_2} \right) = \underbrace{\frac{1}{\sigma_1^2}\mathrm{Var}(X_1)}_{=1} + \underbrace{\frac{1}{\sigma_2^2}\mathrm{Var}(X_2)}_{=1} - 2 \underbrace{\frac{1}{\sigma_1}\frac{1}{\sigma_2}\mathrm{Cov}(X_1, X_2)}_{=\rho} = 2(1 - \rho). \]

Consequently, we see that ρ ≤ 1, because the variance of a random variable is non-negative.

The proof continues on the next slide.
534/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Correlation coefficient
Similarly, by considering Y = X₁/σ₁ + X₂/σ₂ with Var(Y) ≥ 0, we have:
\[ 0 \le \mathrm{Var}\left( \frac{X_1}{\sigma_1} + \frac{X_2}{\sigma_2} \right) = 2(1 + \rho), \]
so ρ ≥ −1, which proves the result. ♦

The correlation coefficient gives a measure of the linear relationship between the two variables. In fact, ρ = ±1 gives:
\[ \Pr(X_2 = aX_1 + b) = 1 \]
for some constants a ≠ 0 and b, so that you can write an affine relationship between the two.

Question: Does a correlation of zero imply independence?
Solution:
535/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Correlation coefficient
Note that if X, Y are independent, then Cov(X, Y) = 0, hence:
\[ \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X) \cdot \mathrm{Var}(Y)}} = \frac{0}{\sqrt{\mathrm{Var}(X) \cdot \mathrm{Var}(Y)}} = 0. \]

However, the reverse need not hold.

Let X, Y be random variables with joint p.m.f. (we have set X = Y²):

Pr(X = x, Y = y) |       Y = y
     X = x       | −1     0     1
       0         |  0    1/3    0
       1         | 1/3    0    1/3

We have E[Y] = 0, E[X] = 2/3, and E[XY] = 0. We have:
\[ \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}} = \frac{\mathbb{E}[XY] - \mathbb{E}[X] \cdot \mathbb{E}[Y]}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}} = 0. \]
536/562
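The counter-example above (X = Y²) can be checked in a few lines (an illustrative sketch; variable names are mine):

```python
# Joint p.m.f. with X = Y**2: zero correlation, yet clearly dependent.
p = {(0, 0): 1 / 3, (1, -1): 1 / 3, (1, 1): 1 / 3}

E_X = sum(x * pr for (x, y), pr in p.items())       # 2/3
E_Y = sum(y * pr for (x, y), pr in p.items())       # 0
E_XY = sum(x * y * pr for (x, y), pr in p.items())  # -1/3 + 1/3 = 0
cov = E_XY - E_X * E_Y                              # 0: uncorrelated

# Dependence: Pr(X = 0, Y = 1) = 0, but Pr(X = 0) * Pr(Y = 1) = 1/9.
pr_x0 = sum(pr for (x, y), pr in p.items() if x == 0)
pr_y1 = sum(pr for (x, y), pr in p.items() if y == 1)
```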
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Correlation coefficient
Correlation coefficient

[Figure: scatter plots of Y against X illustrating linear dependence (ρ = 0.9 and ρ = −0.9), quadratic dependence with ρ = 0, and an independent cloud with ρ = 0.]
537/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Conditional Distributions
Conditional Distributions: Discrete case
Let X, Y be random variables with j.p.m.f. Pr(X = xi, Y = yj).

The conditional probability of X given Y is:

Pr(X = xi | Y = yj) = Pr(X = xi, Y = yj) / Pr(Y = yj).

If Pr(Y = yj) = 0, then we define Pr(X = xi | Y = yj) = 0.

Example: Let X ∼ POI(3), Y ∼ POI(2), with X and Y independent. We have:

Pr(X = 2 | Y = 3) = Pr(X = 2, Y = 3) / Pr(Y = 3) = Pr(X = 2) · Pr(Y = 3) / Pr(Y = 3) = Pr(X = 2).
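The cancellation in the example is easy to verify numerically; a minimal sketch (the Poisson p.m.f. is written out by hand from its definition):

```python
from math import exp, factorial

def pois_pmf(k, lam):
    # Poisson p.m.f.: lam^k * e^(-lam) / k!
    return lam**k * exp(-lam) / factorial(k)

# X ~ POI(3) and Y ~ POI(2), independent
joint = pois_pmf(2, 3) * pois_pmf(3, 2)  # Pr(X=2, Y=3) by independence
cond = joint / pois_pmf(3, 2)            # Pr(X=2 | Y=3)

assert abs(cond - pois_pmf(2, 3)) < 1e-15  # conditioning on Y drops out
print(cond)
```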
Conditional Distributions: Continuous case
Let X, Y be random variables with j.p.d.f. fX,Y(x, y).

The conditional density of Y given X is:

fY|X(y|x) = fX,Y(x, y) / fX(x).

If fX(x) = 0, then we define fY|X(y|x) = 0.

Example: consider the example from slide 509.

Question: Find fX|Y(x | y = 0.5).

Solution: fX|Y(x | y = 0.5) = fX,Y(x, 0.5) / fY(0.5) = 2x / 1 = 2x.
Application: an imperfect particle counter
Define the random variable N as the number of incoming claims and X as the number of claims paid. The probability of a fraudulent claim is q = 1 − p, and the number of claims paid is Binomial:

(X | N = n) ∼ Binomial(n, p).

If the number of incoming claims follows a Poisson distribution (with parameter λ), then the number of claims paid turns out to also be Poisson, with parameter λ · p. This is an example of "thinning" of a Poisson distribution.

We will see more on thinning of a Poisson distribution in ACTL2003/5103 using Markov chains.

Proof: See next slides.
Proof: the law of total probability (why can we apply it here?) gives:

Pr(X = k) = sum_{n=0}^∞ Pr(X = k | N = n) · Pr(N = n)
          = sum_{n=k}^∞ (n choose k) · p^k · (1 − p)^(n−k) · λ^n · e^(−λ) / n!   (since n ≥ k)
          = sum_{n=k}^∞ n! / ((n − k)! · k!) · p^k · (1 − p)^(n−k) · λ^n · e^(−λ) / n!,

which continues on the next slide.
Now (making the change of variables j = n − k in the third line):

Pr(X = k) = sum_{n=k}^∞ n! / ((n − k)! · k!) · p^k · (1 − p)^(n−k) · λ^n · e^(−λ) / n!
          = (λ · p)^k / k! · e^(−λ) · sum_{n=k}^∞ (λ · (1 − p))^(n−k) / (n − k)!
          = (λ · p)^k / k! · e^(−λ) · sum_{j=0}^∞ (λ · (1 − p))^j / j!
         *= (λ · p)^k / k! · e^(−λ) · e^(λ·(1−p)) = (λ · p)^k / k! · e^(−λ·p),

which is the p.m.f. of a Poisson(λ · p) random variable.

* using the exponential series exp(x) = sum_{i=0}^∞ x^i / i!, with x = λ · (1 − p).
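The thinning identity can also be checked numerically by evaluating the total-probability sum directly and comparing it with the Poisson(λ·p) p.m.f.; a minimal sketch (the truncation point n_max is our own choice, valid because the Poisson tail is negligible there):

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)

def thinned_pmf(k, lam, p, n_max=60):
    # Law of total probability: sum over incoming claim counts n >= k
    return sum(
        comb(n, k) * p**k * (1 - p)**(n - k) * poisson_pmf(n, lam)
        for n in range(k, n_max)
    )

lam, p = 3.0, 0.6
for k in range(10):
    assert abs(thinned_pmf(k, lam, p) - poisson_pmf(k, lam * p)) < 1e-12
print("X ~ Poisson(lambda * p) verified for k = 0,...,9")
```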
The Bivariate Normal Distribution
Suppose [X, Y]ᵀ has a bivariate normal distribution; then its density is given by:

fX,Y(x, y) = 1 / (2π · σX · σY · √(1 − ρ²)) · exp(−A / (2 · (1 − ρ²))),

where

A = ((x − μX)/σX)² − 2ρ · ((x − μX)/σX) · ((y − μY)/σY) + ((y − μY)/σY)².
The following results are important, although quite tedious to show (see section 5.10 of W+(7ed) for some of the derivation):

1. The marginals are: X ∼ N(μX, σX²) and Y ∼ N(μY, σY²).

2. The conditional distributions are:

(Y | X = x) ∼ N(μY + ρ · (x − μX) · σY/σX, σY² · (1 − ρ²)) and
(X | Y = y) ∼ N(μX + ρ · (y − μY) · σX/σY, σX² · (1 − ρ²)).

3. The correlation coefficient between X and Y is: ρ(X, Y) = ρ.
Simulating the multivariate normal distribution

Bivariate case: use properties 1 & 2 to simulate from i.i.d. standard normal distributions:

X = μX + σX · Z1
Y = μY + σY · ρ · Z1 + σY · √(1 − ρ²) · Z2,

where Z1 and Z2 are i.i.d. N(0, 1).

OPTIONAL: In the multivariate normal case, let Z = [Z1 . . . Zn]ᵀ with i.i.d. N(0, 1) components. We have:

- The Cholesky decomposition: A · Aᵀ = Σ (Σ is the variance-covariance matrix);
- Then X = μ + A · Z.
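The two-line bivariate construction above is easy to sketch in code; a minimal example (the parameter values are our own) drawing a large seeded sample and checking the empirical correlation against ρ:

```python
import numpy as np

rng = np.random.default_rng(0)
mu_x, mu_y, sd_x, sd_y, rho = 1.0, -2.0, 2.0, 0.5, 0.7
n = 200_000

# i.i.d. standard normals Z1, Z2
z1 = rng.standard_normal(n)
z2 = rng.standard_normal(n)

# Bivariate normal via the conditional-distribution construction
x = mu_x + sd_x * z1
y = mu_y + sd_y * rho * z1 + sd_y * np.sqrt(1 - rho**2) * z2

print(np.corrcoef(x, y)[0, 1])  # close to rho = 0.7
```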
Law of Iterated Expectations
Note: E[X | Y = y] is a constant, but E[X | Y] is a random variable.

For any two random variables X and Y, we have the law of iterated expectations:

E[E[Y | X]] = E[Y].

To prove this in the continuous case, first consider:

E[E[Y | X]] = ∫_{−∞}^∞ E[Y | X = x] · fX(x) dx
            = ∫_{−∞}^∞ ( ∫_{−∞}^∞ y · fY|X(y|x) dy ) fX(x) dx.
Interchanging the order of integration, we have:

E[E[Y | X]] = ∫_{−∞}^∞ y · ( ∫_{−∞}^∞ fY|X(y|x) · fX(x) dx ) dy
           *= ∫_{−∞}^∞ y · fY(y) dy
            = E[Y],

since the inner integral equals fY(y).

* using the law of total probability (why can we use it here?).
Conditional variance identity
Another important result is the conditional variance identity:

Var(Y) = Var(E[Y | X]) + E[Var(Y | X)].

Proof (* using the law of iterated expectations):

Var(Y) = E[Y²] − (E[Y])²
      *= E[E[Y² | X]] − (E[E[Y | X]])²
       = E[E[Y² | X]] − E[(E[Y | X])²] + E[(E[Y | X])²] − (E[E[Y | X]])²
       = E[Var(Y | X)] + Var(E[Y | X]).

The proof can also be found in section 5.11 of W+(7ed).
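The identity can be verified exactly on the small j.p.m.f. from the correlation slides (the X = Y² example); a sketch using exact rational arithmetic:

```python
from fractions import Fraction as F

# Joint p.m.f. with X = Y^2: the zero-correlation example from earlier slides
pmf = {(0, 0): F(1, 3), (1, -1): F(1, 3), (1, 1): F(1, 3)}

var_y = (sum(p * y**2 for (x, y), p in pmf.items())
         - sum(p * y for (x, y), p in pmf.items())**2)

xs = {x for (x, _) in pmf}
px = {x0: sum(p for (x, _), p in pmf.items() if x == x0) for x0 in xs}
# Conditional first and second moments of Y given X = x0
ey = {x0: sum(p * y for (x, y), p in pmf.items() if x == x0) / px[x0] for x0 in xs}
ey2 = {x0: sum(p * y**2 for (x, y), p in pmf.items() if x == x0) / px[x0] for x0 in xs}

var_of_cond_mean = (sum(px[x0] * ey[x0]**2 for x0 in xs)
                    - sum(px[x0] * ey[x0] for x0 in xs)**2)
mean_of_cond_var = sum(px[x0] * (ey2[x0] - ey[x0]**2) for x0 in xs)

# Var(Y) = Var(E[Y|X]) + E[Var(Y|X)]
assert var_y == var_of_cond_mean + mean_of_cond_var
print(var_y, var_of_cond_mean, mean_of_cond_var)  # 2/3 = 0 + 2/3
```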
Application: Random Sums
An insurance company usually has uncertainty in both the number of claims and the amount of each claim filed.

Denote the total claim size by S, the individual claim sizes by Xi, and the total number of claims by N.

We are interested in the distribution, mean and variance of a random sum defined as:

S = X1 + X2 + . . . + XN,

where both the Xi's and N are random variables.

We assume all the Xi are independent and also independent of N.
Mean of S: The mean of the aggregate claims is:

E[S] = E[Xi] · E[N].

This is straightforward:

E[S] = E[E[S | N]]
     = E[E[sum_{i=1}^N Xi | N]]
     = E[sum_{i=1}^N E[Xi | N]]
     = E[N · E[Xi | N]]
    *= E[E[Xi]] · E[N] = E[Xi] · E[N].

* using independence of Xi and N.
Variance of S: The variance of the aggregate claims is:

Var(S) = (E[Xi])² · Var(N) + E[N] · Var(Xi).

This is also straightforward to show:

Var(S) *= E[Var(S | N)] + Var(E[S | N])
        = E[Var(sum_{i=1}^N Xi | N)] + Var(E[Xi] · N)
      **= E[N] · Var(Xi) + (E[Xi])² · Var(N),

* using the conditional variance identity; ** using independence between the Xi and N (note that Var(Xi) and E[Xi] are constants).
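Both formulas can be checked by simulation; a minimal sketch (the Poisson claim count and normal claim sizes are our own illustrative choices; for normal Xi the conditional sum X1 + . . . + Xk is itself N(k·μ, k·σ²), which lets us draw S in one vectorized step):

```python
import numpy as np

rng = np.random.default_rng(42)
lam = 4.0              # N ~ Poisson(lam): E[N] = Var(N) = lam
mu, sigma = 10.0, 3.0  # claim-size mean and standard deviation
n_sims = 200_000

n = rng.poisson(lam, n_sims)
# Given N = k, S = X1 + ... + Xk ~ N(k*mu, k*sigma^2) for normal claim sizes
s = rng.normal(n * mu, np.sqrt(n) * sigma)

print(s.mean(), mu * lam)                     # E[S] = E[Xi] * E[N] = 40
print(s.var(), mu**2 * lam + lam * sigma**2)  # Var(S) = (E[Xi])^2 Var(N) + E[N] Var(Xi) = 436
```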
Moment Generating Function of S: The m.g.f. of the aggregate claims is given by:

MS(t) = MN(log(MX(t))).

Finding the m.g.f. is also straightforward:

MS(t) = E[e^(tS)] = E[E[e^(tS) | N]]
      = E[(MX(t))^N] = E[e^(N · log(MX(t)))]
      = MN(log(MX(t))).

Note that when the number of claims has a Poisson distribution, the resulting total claims S is said to have a Compound Poisson distribution.
Exercise: Let X ∼ Gamma(α, β) and Y | X ∼ EXP(1/X).

a. Question: Find E[Y]. (Note: E[X] = α/β, EXP(λ) = Gamma(1, λ).)

b. Question: Find Var(Y). (Note: Var(X) = α/β².)

a. Solution:

E[Y] = E[E[Y | X]] = E[X] = α/β.

b. Solution:

Var(Y) = Var(E[Y | X]) + E[Var(Y | X)]
       = Var(X) + E[X²]
       = α/β² + Var(X) + (E[X])²
       = α/β² + α/β² + (α/β)² = (2α + α²)/β².
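A seeded simulation agrees with both answers; a minimal sketch (the parameter values α = 2, β = 1 are our own, and we read EXP(1/X) as an exponential with conditional mean X):

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, beta = 2.0, 1.0
n = 500_000

# X ~ Gamma(alpha, beta) with rate beta, i.e. scale 1/beta
x = rng.gamma(alpha, 1.0 / beta, n)
# Y | X exponential with conditional mean X
y = rng.exponential(x)

print(y.mean())  # close to alpha/beta = 2
print(y.var())   # close to (2*alpha + alpha^2)/beta^2 = 8
```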
The Multivariate Case
Let X = [X1, X2, . . . , Xn]ᵀ be a random vector with n elements. The joint distribution function (DF) of X is denoted by:

FX1,X2,...,Xn(x1, . . . , xn) = Pr(X1 ≤ x1, . . . , Xn ≤ xn).

In the discrete case, we define the joint probability mass function as:

pX1,X2,...,Xn(x1, . . . , xn) = Pr(X1 = x1, . . . , Xn = xn).

In the continuous case, we define the joint density function of X as:

fX1,X2,...,Xn(x1, . . . , xn) = ∂/∂x1 . . . ∂/∂xn FX1,X2,...,Xn(x1, . . . , xn).
The joint DF is given by:

FX1,X2,...,Xn(x1, . . . , xn) = ∫_{−∞}^{xn} . . . ∫_{−∞}^{x1} fX1,X2,...,Xn(z1, . . . , zn) dz1 . . . dzn.

To derive marginal p.m.f.'s or densities, simply evaluate (sum or integrate) over all of the region except for the variable of interest. For example, in the continuous case the marginal density of Xk, for k = 1, 2, . . . , n, is given by:

fXk(xk) = ∫_{−∞}^∞ . . . ∫_{−∞}^∞ fX1,X2,...,Xn(z1, . . . , xk, . . . , zn) ∏_{j≠k} dzj.
Independent Random Variables

The random variables X1, X2, . . . , Xn are said to be independent if their joint distribution function can be written as the product of their marginal distribution functions:

FX1,X2,...,Xn(x1, . . . , xn) = FX1(x1) · . . . · FXn(xn).

As a consequence, their joint density can also be written as:

fX1,X2,...,Xn(x1, . . . , xn) = fX1(x1) · . . . · fXn(xn)

in the continuous case, and in the discrete case:

pX1,X2,...,Xn(x1, . . . , xn) = pX1(x1) · . . . · pXn(xn).
Also, we have (if independent):

E[X1 · X2 · . . . · Xn] = E[X1] · E[X2] · . . . · E[Xn],

and in general (if independent) we have:

E[gX1(X1) · gX2(X2) · . . . · gXn(Xn)] = E[gX1(X1)] · E[gX2(X2)] · . . . · E[gXn(Xn)].
If X1 and X2 are independent, then:

1. Cov(X1, X2) = 0 and so ρ(X1, X2) = 0.

2. E[X1 | X2] = E[X1] and of course E[X2 | X1] = E[X2].

3. A very useful result about independence is that X1, X2, . . . , Xn are independent if and only if we can write the joint distribution as a product of functions that each involve only one random variable:

FX1,X2,...,Xn(x1, . . . , xn) = HX1(x1) · . . . · HXn(xn)

for some functions HX1, . . . , HXn.
Exercise: summarizing data
An insurer assumes that the time between claims is exponentially distributed. A reinsurer pays out when the insurer has two or more claims within two years. The distribution of interest is Gamma(2,3).

Sorted observations:
1.56 1.88 2.53 3.39
3.62 3.68 5.24 5.25
5.31 5.56 5.66 6.17

Questions: Find the

a. Median;
b. Range;
c. 10% trimmed mean;
d. Interquartile range.

Solutions:

a. M = (3.68 + 5.24)/2 = 4.46;
b. R = 6.17 − 1.56 = 4.61;
c. x̄0.10 = (x(2) + . . . + x(11))/10 = 4.212;
d. Q1 = 2.53 + 0.75 · (3.39 − 2.53) = 3.18,
   Q3 = 0.75 · 5.31 + 0.25 · 5.56 = 5.37,
   IQR = 5.37 − 3.18 = 2.19.
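These summary statistics can be reproduced with numpy, whose default quantile interpolation happens to match the convention used in this exercise; a minimal sketch:

```python
import numpy as np

x = np.array([1.56, 1.88, 2.53, 3.39, 3.62, 3.68,
              5.24, 5.25, 5.31, 5.56, 5.66, 6.17])

median = np.median(x)              # (3.68 + 5.24)/2 = 4.46
data_range = x.max() - x.min()     # 6.17 - 1.56 = 4.61
# 10% trimmed mean: drop floor(0.10 * 12) = 1 observation per tail
trimmed = np.sort(x)[1:-1].mean()  # 4.212
q1, q3 = np.quantile(x, [0.25, 0.75])
iqr = q3 - q1                      # about 2.20 (the slide rounds Q1, Q3 first, giving 2.19)

print(median, data_range, trimmed, q1, q3, iqr)
```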
[Figure: empirical c.d.f. Fn(x) of the sample (colored lines) plotted against the Gamma(2,3) c.d.f. (black solid line) with ±2σ bands (black dashed lines).]

Question: Is the Gamma(2,3) the correct distribution?

Solution: Yes.
Summary joint probabilities

Joint distribution function:

FX1,X2(x1, x2) = Pr(X1 ≤ x1, X2 ≤ x2).

Marginal p.m.f.:

pX1(x1i) = sum_{j=1}^∞ pX1,X2(x1i, x2j).

Marginal density function:

fX1(x1) = ∫_{−∞}^∞ fX1,X2(x1, z2) dz2.

Conditional probability:

Pr(X = xi | Y = yj) = Pr(X = xi, Y = yj) / Pr(Y = yj).
Summary joint probabilities

Covariance:

Cov(X1, X2) ≡ σ12 = E[X1 · X2] − E[X1] · E[X2].

Correlation:

ρ(X1, X2) = Cov(X1, X2) / √(Var(X1) · Var(X2)).

Law of iterated expectations:

E[E[Y | X]] = E[Y].

Conditional variance identity:

Var(Y) = Var(E[Y | X]) + E[Var(Y | X)].