
Probability and Statistics
Part 1. Probability Concepts and Limit Theorems

Chang-han Rhee

Stanford University

Sep 19, 2011 / CME001


Outline

Probability Concepts
- Probability Space
- Random Variables
- Expectation
- Conditional Probability and Expectation

Limit Theorems
- Modes of Convergence
- Law of Large Numbers
- Central Limit Theorem


Probability of an Event in a Random Experiment

The probability of an event is its relative frequency when the random experiment is repeated many times.

e.g. coin flip, die roll, roulette

Sample Space

The set of all possible outcomes.

- Single coin flip: Ω = {H, T}
- Two coin flips: Ω = {(H,H), (H,T), (T,H), (T,T)}
- Single die roll: Ω = {1, 2, 3, 4, 5, 6}
- Two dice rolls:
  Ω = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
       (2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
       (3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
       (4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
       (5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
       (6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}

Event

A subset of the sample space.

- Single coin flip: the event that the coin lands heads
  A = {H}
- Two coin flips: the event that the first coin lands heads
  A = {(H,H), (H,T)}
- Single die roll: the event that the die shows an odd number
  A = {1, 3, 5}
- Two dice rolls: the event that the sum is 4
  A = {(1,3), (2,2), (3,1)}
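These finite examples can be checked mechanically. Below is a minimal Python sketch (not from the slides; all names are ours) that enumerates the two-dice sample space and the event that the sum is 4, assuming all 36 outcomes are equally likely:

```python
from itertools import product
from fractions import Fraction

# Sample space for two dice rolls: all 36 ordered pairs (i, j).
omega = list(product(range(1, 7), repeat=2))

# Event: the sum of the two dice is 4.
A = [(i, j) for (i, j) in omega if i + j == 4]

print(A)                             # [(1, 3), (2, 2), (3, 1)]
print(Fraction(len(A), len(omega)))  # 1/12, under equally likely outcomes
```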


Ω = {(H,H), (H,T), (T,H), (T,T)}   (sample space for two coin flips)

Event "the first coin lands heads": {(H,H), (H,T)}

Outcome "both coins land tails": (T,T)

Probability

Definition. A set function P is called a probability if
- 0 ≤ P(A) ≤ 1 for each event A
- P(Ω) = 1 (unitarity)
- for each sequence A1, A2, . . . of mutually disjoint events,

  P(∪_{i=1}^∞ Ai) = ∑_{i=1}^∞ P(Ai)   (countable additivity)

Back to Examples

- Fair coin:
  P(∅) = 0, P({H}) = 1/2, P({T}) = 1/2, P({H, T}) = 1

- Biased coin (p ∈ [0, 1]):
  P(∅) = 0, P({H}) = p, P({T}) = 1 − p, P({H, T}) = 1
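As a quick illustration of the axioms (our own sketch, not part of the lecture; `make_coin_probability` is a hypothetical helper name), the biased-coin probability can be coded as a set function and checked against unitarity and additivity:

```python
from fractions import Fraction

def make_coin_probability(p):
    """Probability on the biased-coin sample space Omega = {H, T}."""
    mass = {"H": p, "T": 1 - p}

    def P(event):
        # P(A) is the total mass of the outcomes in A.
        return sum(mass[outcome] for outcome in event)

    return P

P = make_coin_probability(Fraction(3, 10))   # p = 3/10
print(P(set()))        # 0     (empty event)
print(P({"H"}))        # 3/10
print(P({"T"}))        # 7/10
print(P({"H", "T"}))   # 1     (unitarity)
# Additivity for the disjoint events {H} and {T}:
print(P({"H", "T"}) == P({"H"}) + P({"T"}))  # True
```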



Random Variables

A random variable is a function from the sample space to the real numbers.

e.g.

- Winnings in a single coin flip:
  X(H) = 1, X(T) = −1

- First roll, second roll, and sum of two dice:
  X(i, j) = i, Y(i, j) = j, Z(i, j) = i + j
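Since a random variable is literally a function on the sample space, the examples above translate directly into code; a small sketch with names of our choosing:

```python
# A random variable is just a function on the sample space.
def winnings(flip):          # X: winnings in a single coin flip
    return 1 if flip == "H" else -1

def first_roll(pair):        # X(i, j) = i
    return pair[0]

def second_roll(pair):       # Y(i, j) = j
    return pair[1]

def dice_sum(pair):          # Z(i, j) = i + j
    return pair[0] + pair[1]

print(winnings("H"), winnings("T"))   # 1 -1
print(first_roll((3, 4)), second_roll((3, 4)), dice_sum((3, 4)))  # 3 4 7
```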


Discrete Random Variables

A discrete random variable X takes values in a discrete subset S of R.

The distribution of a discrete random variable is completely described by a probability mass function pX : R → [0, 1] such that

P(X = x) = pX(x)

e.g.

- [Bernoulli] X ∼ Ber(p) if X ∈ {0, 1} and
  P(X = 1) = 1 − P(X = 0) = p, i.e., pX(1) = p and pX(0) = 1 − p

- [Binomial] X ∼ Bin(n, p) if X ∈ {0, 1, . . . , n} and
  pX(k) = C(n, k) p^k (1 − p)^(n−k)
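A hedged sketch (our helper name `binomial_pmf`; formula exactly as above) of the binomial pmf using Python's math.comb, with a check that the mass sums to 1:

```python
from math import comb

def binomial_pmf(k, n, p):
    """pX(k) = C(n, k) p^k (1 - p)^(n - k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(sum(pmf))               # 1.0 up to rounding: a valid pmf
print(binomial_pmf(3, n, p))  # ≈ 0.2668, the most likely value here
```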


Continuous Random Variables

A continuous random variable X takes values in R.

The distribution of a continuous random variable is completely described by a probability density function fX : R → R+ such that

P(a ≤ X ≤ b) = ∫_a^b fX(x) dx

e.g.

- [Uniform] X ∼ Unif(a, b), a < b, if
  fX(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise

- [Gaussian/Normal] X ∼ N(µ, σ²), µ ∈ R, σ² > 0, if
  fX(x) = (1/√(2πσ²)) exp(−(x − µ)²/(2σ²))
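To connect the density to probabilities, here is a rough numerical check (our own sketch): integrating the standard normal density over [−1, 1] by a midpoint Riemann sum and comparing with the exact value via the error function:

```python
from math import exp, pi, sqrt, erf

def normal_pdf(x, mu=0.0, sigma2=1.0):
    """fX(x) = exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)."""
    return exp(-((x - mu) ** 2) / (2 * sigma2)) / sqrt(2 * pi * sigma2)

def prob_between(a, b, n=100_000):
    """P(a <= X <= b) as a midpoint Riemann sum of the density."""
    h = (b - a) / n
    return h * sum(normal_pdf(a + (i + 0.5) * h) for i in range(n))

print(prob_between(-1.0, 1.0))  # ≈ 0.6827
print(erf(1 / sqrt(2)))         # exact P(-1 <= X <= 1) for N(0, 1)
```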


Probability Distribution*

Each random variable induces another probability PX : 2^R → [0, 1] on the real line through the following:

PX((−∞, x]) := P(X ≤ x)

We often denote the distribution function by FX:

FX(x) := P(X ≤ x)

[NOTATION] The right-hand sides of the displays above are shorthand for the following:

P(X ≤ x) := P({ω ∈ Ω : X(ω) ≤ x})

Note: Distributions can be identical even if the underlying probability spaces are different.

e.g. for the coin flip and the die roll,

X(H) = 1, X(T) = −1

Y(i) = 1 if i is odd, −1 if i is even
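A small enumeration (ours, for illustration) that makes this concrete: pushing the uniform measure on each space through X and Y yields the same pmf:

```python
from fractions import Fraction

coin_space = ["H", "T"]          # space for X
die_space = [1, 2, 3, 4, 5, 6]   # space for Y

X = {"H": 1, "T": -1}
Y = {i: 1 if i % 2 == 1 else -1 for i in die_space}

def pmf(space, rv):
    # Push the uniform measure on the space through the random variable.
    out = {}
    for omega in space:
        out[rv[omega]] = out.get(rv[omega], 0) + Fraction(1, len(space))
    return out

print(pmf(coin_space, X))  # {1: Fraction(1, 2), -1: Fraction(1, 2)}
print(pmf(die_space, Y))   # the same distribution, different space
```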


Joint Distribution

Two random variables X and Y induce a probability PX,Y on R²:

PX,Y((−∞, x] × (−∞, y]) = P(X ≤ x, Y ≤ y)

A collection of random variables X1, X2, . . . , Xn induces a probability PX1,...,Xn on Rⁿ:

PX1,...,Xn((−∞, x1] × · · · × (−∞, xn]) = P(X1 ≤ x1, . . . , Xn ≤ xn)

The joint distribution of two discrete random variables X and Y taking values in SX and SY is completely described by a joint probability mass function pX,Y : R × R → [0, 1] such that

P(X = x, Y = y) = pX,Y(x, y)

The joint distribution of two continuous random variables X and Y is completely described by a joint probability density function fX,Y : R × R → R+ such that

P(X ≤ x, Y ≤ y) = ∫_{−∞}^x ∫_{−∞}^y fX,Y(u, v) dv du
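For intuition, the joint distribution of two discrete random variables can be computed by enumeration; a sketch (our names) using X = first roll and Z = sum for two fair dice:

```python
from itertools import product
from fractions import Fraction

omega = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(omega))   # uniform measure on the 36 outcomes

def joint_cdf(x, z):
    """P(X <= x, Z <= z) with X = first roll, Z = sum of the two rolls."""
    return sum(p for (i, j) in omega if i <= x and i + j <= z)

print(joint_cdf(2, 4))   # 5/36
```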



Expectation

For a discrete random variable X, the expectation of X is

E[X] = ∑_{x∈S} x pX(x)

For a continuous random variable Y, the expectation of Y is

E[Y] = ∫_{−∞}^∞ y fY(y) dy

Computation of Expectation

We can also compute the expectations of g(X) and g(Y) as follows:

E[g(X)] = ∑_{x∈S} g(x) pX(x)

and

E[g(Y)] = ∫_{−∞}^∞ g(y) fY(y) dy
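A worked check (our sketch) of both formulas on a fair die, with g(x) = x²:

```python
from fractions import Fraction

S = range(1, 7)
pX = {x: Fraction(1, 6) for x in S}   # pmf of a fair die roll

E_X = sum(x * pX[x] for x in S)       # E[X]
E_gX = sum(x**2 * pX[x] for x in S)   # E[g(X)] with g(x) = x^2

print(E_X)   # 7/2
print(E_gX)  # 91/6
```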


Properties of Expectation

- Linearity:
  E[aX + bY] = aE[X] + bE[Y]

- Monotonicity:
  X ≤ Y =⇒ E[X] ≤ E[Y]

Probability as an Expectation

[NOTATION] We denote the indicator function of A by IA(·):

IA(ω) = 1 if ω ∈ A, 0 if ω ∉ A

A probability can be written as an expectation:

PX(A) = E[IA(X)]

More generally,

P(A) = E[IA]

Summary Statistics

- Mean:
  E[X]

- Variance:
  var(X) = E[(X − EX)²] = E[X²] − (EX)²

- Standard deviation:
  σ(X) = √var(X)
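A quick check (our sketch) that the two expressions for the variance agree, for a fair die roll:

```python
from fractions import Fraction

S = range(1, 7)
pX = {x: Fraction(1, 6) for x in S}   # fair die

E_X = sum(x * pX[x] for x in S)

var1 = sum((x - E_X)**2 * pX[x] for x in S)    # E[(X - EX)^2]
var2 = sum(x**2 * pX[x] for x in S) - E_X**2   # E[X^2] - (EX)^2

print(var1, var2)   # 35/12 35/12
```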



Conditional Probability

For P(B) > 0, the conditional probability of A given B is defined as

P(A|B) = P(A ∩ B) / P(B)
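A small enumeration (ours) for two fair dice: the probability that the sum is 8 given that the first die shows 3:

```python
from itertools import product
from fractions import Fraction

omega = set(product(range(1, 7), repeat=2))

A = {w for w in omega if w[0] + w[1] == 8}   # the sum is 8
B = {w for w in omega if w[0] == 3}          # the first die shows 3

def P(event):
    return Fraction(len(event), len(omega))

print(P(A & B) / P(B))   # P(A|B) = (1/36) / (6/36) = 1/6
```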


Conditional Probability Mass and Density

If X and Y are both discrete random variables with joint probability mass function pX,Y(x, y),

P(X = x | Y = y) = pX|Y(x|y) := pX,Y(x, y) / pY(y)

If X and Y are both continuous random variables with joint density function fX,Y(x, y),

P(a ≤ X ≤ b | Y = y) = ∫_a^b fX|Y(x|y) dx

where

fX|Y(x|y) = fX,Y(x, y) / fY(y)

Independence

Two events A and B are independent if

P(A ∩ B) = P(A)P(B)

Two random variables X and Y are independent if, for all x and y,

P(X ≤ x, Y ≤ y) = P(X ≤ x)P(Y ≤ y)
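An exhaustive check (our sketch) that the two coordinates of a fair two-dice roll satisfy this factorization at every point:

```python
from itertools import product
from fractions import Fraction

omega = list(product(range(1, 7), repeat=2))

def P(pred):
    return Fraction(sum(1 for w in omega if pred(w)), len(omega))

# The two rolls are independent: the joint cdf factors at every (x, y).
independent = all(
    P(lambda w: w[0] <= x and w[1] <= y)
    == P(lambda w: w[0] <= x) * P(lambda w: w[1] <= y)
    for x, y in product(range(1, 7), repeat=2)
)
print(independent)   # True
```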


Conditional Expectation: Discrete Random Variables

For discrete random variables X and Y, the conditional expectation of X given Y = y is

E[X | Y = y] = ∑_{x∈S} x pX|Y(x|y) = ∑_{x∈S} x P(X = x, Y = y) / P(Y = y)
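A sketch (our names; we condition the sum of two dice on the first roll rather than the slide's generic X and Y) computing a conditional expectation by direct enumeration:

```python
from itertools import product
from fractions import Fraction

omega = list(product(range(1, 7), repeat=2))

def cond_exp_sum_given_first(y):
    """E[Z | X = y] for Z = sum of two dice and X = first roll."""
    matches = [(i, j) for (i, j) in omega if i == y]
    return Fraction(sum(i + j for (i, j) in matches), len(matches))

print(cond_exp_sum_given_first(3))   # 13/2, i.e. 3 + E[second roll]
```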


Conditional Expectation: Continuous Random Variables

For continuous random variables X and Y, the conditional expectation of X given Y = y is

E[X | Y = y] = ∫_{−∞}^∞ x fX|Y(x|y) dx = ∫_{−∞}^∞ x fX,Y(x, y)/fY(y) dx

Properties of Conditional Expectation

- Linearity:
  E[aX + bY | Z] = aE[X|Z] + bE[Y|Z]

- Monotonicity:
  X ≤ Y =⇒ E[X|Z] ≤ E[Y|Z]


Almost Sure Convergence

Let X1, X2, . . . be a sequence of random variables. We say that Xn converges almost surely to X∞ as n → ∞ if

P(Xn → X∞ as n → ∞) = 1

We use the notation Xn →a.s. X∞ to denote almost sure convergence, or convergence with probability 1.

Lp Convergence

[NOTATION] For p > 0, we denote the p-norm of X by ∥ · ∥p:

∥X∥p := (E|X|^p)^(1/p)

Let X1, X2, . . . be a sequence of random variables. For p > 0, we say that Xn converges to X∞ in pth mean if

∥Xn − X∞∥p → 0

as n → ∞.

We use the notation Xn →Lp X∞ to denote convergence in pth mean, or Lp convergence.

Convergence in Probability

Let X1, X2, . . . be a sequence of random variables. We say that Xn converges in probability to X∞ if, for each ϵ > 0,

P(|Xn − X∞| > ϵ) → 0

as n → ∞.

We use the notation Xn →p X∞ to denote convergence in probability.

Weak Convergence

Let X1, X2, . . . be a sequence of random variables. We say that Xn converges weakly to X∞ if

P(Xn ≤ x) → P(X∞ ≤ x)

as n → ∞ for each x at which P(X∞ ≤ ·) is continuous.

We use the notation Xn ⇒ X∞ or Xn →D X∞ to denote weak convergence, or convergence in distribution.

Implications

Almost sure convergence =⇒ convergence in probability
Lp convergence =⇒ convergence in probability
Convergence in probability =⇒ weak convergence


Weak Law of Large Numbers

Theorem (Weak Law of Large Numbers). Suppose that X1, X2, . . . is a sequence of i.i.d. random variables such that E|X1| < ∞. Then

(1/n)(X1 + · · · + Xn) →p E[X1] as n → ∞

Strong Law of Large Numbers

Theorem (Strong Law of Large Numbers). Suppose that X1, X2, . . . is a sequence of i.i.d. random variables such that EX1 exists. Then

(1/n)(X1 + · · · + Xn) →a.s. E[X1] as n → ∞



Central Limit Theorem

Theorem. Suppose that the Xi's are i.i.d. random variables with common finite variance σ². Then, with Sn = X1 + · · · + Xn,

(Sn − nE[X1]) / √n ⇒ σ N(0, 1)

as n → ∞.

From this, we can deduce the following approximation:

(1/n) Sn − E[X1] ≈ (σ/√n) N(0, 1) in distribution
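A simulation sketch (ours) of the normal approximation: standardizing Sn for fair die rolls and comparing an empirical probability with Φ(1):

```python
import random
from math import sqrt, erf

random.seed(0)

n, trials = 500, 5_000
mu, sigma = 3.5, sqrt(35 / 12)   # mean and std. dev. of one die roll

# Standardize S_n and compare a tail probability with the normal limit.
hits = 0
for _ in range(trials):
    s = sum(random.randint(1, 6) for _ in range(n))
    z = (s - n * mu) / (sigma * sqrt(n))
    hits += (z <= 1.0)

print(hits / trials)                 # ≈ Phi(1)
print(0.5 * (1 + erf(1 / sqrt(2))))  # Phi(1) ≈ 0.8413
```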
