Post on 28-Sep-2015
EE132B - Recitation 1B: Probability Review
Outline of Review
- Probability Axioms
- Discrete Random Variables
- Continuous Random Variables
- Expectation Values and Variances
- Moment Generating Function
Components of a Probability Model
- Experiment: the process of observing a phenomenon with multiple possible outcomes.
- Sample Space (S): the set of all possible observable outcomes of a random phenomenon. The sample space may be discrete or continuous.
- Set of Events (E): a set (collection) of one or more outcomes in the sample space, where E ⊆ S.
- Probability of Events (P): a consistent description of the likelihood of observing an event.

Thus a probability model is a triplet defined as (S, E, P).
Probability
The probability of an event A estimates the proportion of times the event is expected to occur in repeated random experiments, and is denoted P(A).

Some properties:
- Probability values are always between 0 and 1: 0 ≤ P(A) ≤ 1.
- Probability is a numerical value of the likelihood that an event will occur.
- P(A) = 0 indicates an event that is impossible (never occurs).
- P(A) = 1 indicates an event that is certain to occur.
Terminology and Definitions
Given events A, B ⊆ S:
- Union of two events: A ∪ B (A or B)
- Intersection of two events: A ∩ B (A and B)
- Complement of an event: A^c (not A)

Cardinality (size) of sets: let |A| be the number of elements (size) of the set A. Then
|A ∪ B| = |A| + |B| - |A ∩ B|.
Mutually Exclusive Events
The sample space S is a set, and events are subsets of this (universal) set.

Two events A and B are mutually exclusive (disjoint) if and only if their intersection is empty, i.e. A ∩ B = ∅.

A set of events A_1, ..., A_n (n ≥ 2) is mutually exclusive iff A_i ∩ A_j = ∅ for i ≠ j (and A_i ∩ A_j = A_i when i = j).
Probability Axioms
For any event A, the probability of that event is such that 0 ≤ P(A) ≤ 1.

- P(S) = 1 and P(∅) = 0
- P(A) + P(A^c) = 1

If events A and B are not mutually exclusive:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

If events A_1, A_2, ..., A_n are all mutually exclusive, then
P(∪_{i=1}^n A_i) = Σ_{i=1}^n P(A_i),
since P(A_i ∩ A_j) = 0 for i ≠ j.
Probability Axioms (cont.)
Given events E = {A_1, A_2, ..., A_n}, and the probability of each outcome p_i = P(e_i):

Sum of disjoint products:
P(∪_{i=1}^n A_i) = P(A_1) + P(A_2 ∩ A_1^c) + ... + P(A_n ∩ A_{n-1}^c ∩ ... ∩ A_1^c)

Total probability theorem:
P(A) = Σ_{i=1}^k P(A | B_i) P(B_i),
where B_1, B_2, ..., B_k are disjoint.

Sum of total probability:
Σ_{i=1}^n P(A_i) = 1, where 0 ≤ P(A_i) ≤ 1 for each i.
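The total probability theorem above can be checked numerically. The partition probabilities and conditionals below are hypothetical illustration values, not from the slides:

```python
# Total probability: P(A) = sum_i P(A | B_i) P(B_i) over a disjoint partition {B_i}.
p_B = [0.5, 0.3, 0.2]          # P(B_i); hypothetical, must sum to 1
p_A_given_B = [0.1, 0.4, 0.8]  # P(A | B_i); hypothetical conditionals

p_A = sum(pa * pb for pa, pb in zip(p_A_given_B, p_B))
print(p_A)  # 0.5*0.1 + 0.3*0.4 + 0.2*0.8 ≈ 0.33
```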
Conditional Probability
The probability of event A occurring, given that event B has occurred, is called the conditional probability of A given B:

P(A | B) = P(A, B) / P(B), P(B) > 0,

where P(A, B) = P(A ∩ B) is the probability that both events A and B occur together.

The (generalized) multiplication rule for the intersection of two events:
P(A ∩ B) = P(A | B) P(B) = P(B | A) P(A)

More variables:
P(A, B, C) = P(A | B, C) P(B | C) P(C)
P(A, B | C) = P(A | C) P(B | A, C)
Mutual Independence
Events A and B are said to be mutually independent iff
P(A ∩ B) = P(A) P(B), or equivalently P(A | B) = P(A),
where P(A) > 0, P(B) > 0.

Knowing about event B gives you no information about event A.

Mutually Independent vs. Mutually Exclusive Events: note that the probability of the union of mutually exclusive events is the sum of their probabilities, while the probability of the intersection of mutually independent events is the product of their probabilities.
Bayes Rule
Suppose we know P(A | B) and want to find P(B | A).

From the multiplication rule and the total probability theorem:
P(B | A) = P(A | B) P(B) / P(A) = P(A | B) P(B) / [P(A | B) P(B) + P(A | B^c) P(B^c)]

General version of Bayes' rule: given P(B_i) and P(A | B_i), where B_i, i = 1, 2, ..., n, are mutually exclusive:
P(B_i | A) = P(A | B_i) P(B_i) / Σ_{j=1}^n P(A | B_j) P(B_j)
Binary Communication Example
[Figure: channel diagram of an unreliable binary communication channel, with inputs T0, T1 and outputs R0, R1 connected by the transition probabilities P(R0|T0), P(R1|T0), P(R0|T1), P(R1|T1).]

T0 = "0 is transmitted"    T1 = "1 is transmitted"
R0 = "0 is received"       R1 = "1 is received"
Binary Communication Example (cont.)
Given:
P(R0 | T0) = 0.92;  P(R1 | T1) = 0.95
P(T0) = 0.45;  P(T1) = 0.55

We can find:
P(R0) = P(R0 | T0) P(T0) + P(R0 | T1) P(T1) = 0.92 × 0.45 + 0.05 × 0.55 = 0.4415
P(R1) = P(R1 | T1) P(T1) + P(R1 | T0) P(T0) = 0.95 × 0.55 + 0.08 × 0.45 = 0.5585,
or it can be calculated as P(R1) = 1 - P(R0).

P(error) = P(T1 ∩ R0) + P(T0 ∩ R1)
         = P(R0 | T1) P(T1) + P(R1 | T0) P(T0)
         = 0.05 × 0.55 + 0.08 × 0.45 = 0.0635
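The channel calculation above is short enough to verify directly; this sketch reproduces it and also applies Bayes' rule to recover P(T0 | R0):

```python
# Numbers from the binary-channel example.
p_T0, p_T1 = 0.45, 0.55
p_R0_T0, p_R1_T1 = 0.92, 0.95
p_R1_T0, p_R0_T1 = 1 - p_R0_T0, 1 - p_R1_T1  # crossover probabilities

p_R0 = p_R0_T0 * p_T0 + p_R0_T1 * p_T1       # total probability theorem
p_R1 = 1 - p_R0
p_err = p_R0_T1 * p_T1 + p_R1_T0 * p_T0      # received symbol != transmitted
p_T0_R0 = p_R0_T0 * p_T0 / p_R0              # Bayes: "a received 0 really was a 0"

print(round(p_R0, 4), round(p_R1, 4), round(p_err, 4))  # 0.4415 0.5585 0.0635
```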
Random Variables
The sample space is often too large to deal with directly. For example, for a sequence of Bernoulli trials (e.g. coin tosses), we don't need detailed information about the actual pattern of heads and tails, but only the number of heads and tails. Such abstractions lead to the notion of a Random Variable (RV), denoted X.

- Discrete RV: countable number of values.
  Probability Mass Function (pmf): p_X(x) = P(X = x)
- Continuous RV: uncountably infinite number of different values.
  Probability Density Function (pdf): f_X(x)

Notation: uppercase letters X, Y for RVs; lowercase x, y for their values.
Probability Mass Function
For a discrete RV X with possible values x_1, x_2, ..., x_n, the pmf is defined as:
p_X(x_i) = P(X = x_i)

The pmf is the probability that the value of the random variable, obtained based on the outcome of the random experiment, is equal to x_i. Note that the pmf is defined for a specific RV value, i.e. P(X = x_i).

Some properties:
- 0 ≤ p_X(x) ≤ 1 for all x
- Σ_x p_X(x) = 1

The discrete random variable assigns some value x to each sample point s ∈ S.
Cumulative Distribution Function
The cdf of a random variable X gives the probability that X will take on a value less than or equal to t when the experiment is performed:
F_X(t) = P(X ≤ t) = Σ_{x ≤ t} p_X(x)

Some properties:
- 0 ≤ F_X(x) ≤ 1 for all x
- P(a < X ≤ b) = P(X ≤ b) - P(X ≤ a) = F_X(b) - F_X(a)
- F_X(x) is a monotonically increasing function of x: if x_1 ≤ x_2, then F_X(x_1) ≤ F_X(x_2)
- lim_{x→-∞} F_X(x) = 0 and lim_{x→+∞} F_X(x) = 1
- F_X(x) is right continuous.
Common Discrete Random Variables
- Constant
- Uniform
- Bernoulli
- Binomial
- Geometric
- Poisson
Constant Random Variable
pmf:
p_X(x) = 1, if x = c; 0, otherwise.
This is also known as the Dirac delta function (in math), or a unit impulse function (in engineering).

cdf:
F_X(x) = 0, if x < c; 1, otherwise.
This is also known as the unit step function.
Uniformly Distributed Random Variable
A discrete RV X that assumes n discrete values with equal probability 1/n.

pmf:
p_X(x) = 1/n, if x is in the image of X; 0, otherwise.

cdf: assume X takes integer values 1, 2, 3, ..., n; then
F_X(x) = Σ_{i=1}^{⌊x⌋} p_X(i) = ⌊x⌋/n, for 0 ≤ x ≤ n.

Note that p_X(i) = F_X(i) - F_X(i - 1).
Bernoulli Random Variable
An RV generated by a single Bernoulli trial that has a binary-valued outcome {0, 1}. Such a binary-valued RV X is called an indicator or Bernoulli random variable.

pmf:
p_X(1) = P(X = 1) = p, where 0 ≤ p ≤ 1
p_X(0) = P(X = 0) = 1 - p = q
p_X(x) = 0, otherwise

cdf:
F_X(x) = 0, for x < 0; q, for 0 ≤ x < 1; 1, for x ≥ 1.
Application: Bernoulli Trials
To be considered a Bernoulli trial, an experiment must meet three criteria:
1. There must be only 2 possible outcomes.
2. Each outcome must have an invariant probability of success. The probability of success is usually denoted by p, and the probability of failure is denoted by q = 1 - p.
3. The outcome of each trial is completely independent of the outcomes of any other trials (mutually independent).

Sequence of Bernoulli trials: n independent repetitions.
Let S_n represent the sample space of n Bernoulli trials:
S_1 = {0, 1}
S_2 = {(0,0), (0,1), (1,0), (1,1)}
S_n = {2^n n-tuples of 0's and 1's},
given that P(0) = q, P(1) = p, p + q = 1, with p, q ≥ 0.
Application: Bernoulli Trials (cont.)
We want to find the probability of exactly k successes in n trials.

Probability of a sequence of k successes followed by (n - k) failures:
P = P(A_1) ... P(A_k) P(A_{k+1}^c) ... P(A_n^c) = p^k q^{n-k}

How about any sequence of k successes out of n trials? The occurrence of k successes can be arranged in C(n, k) different ways, so

P(exactly k successes and n - k failures) = C(n, k) p^k q^{n-k},

where C(n, k) = n! / (k! (n - k)!), 0 ≤ k ≤ n, is called the binomial coefficient.
Binomial Random Variable
A binomial RV counts successes in a fixed number of independent Bernoulli trials. Let the RV Y be the number of successes in n Bernoulli trials.

pmf:
p_Y(k) = P(Y = k) = P(k successes in n trials) = C(n, k) p^k q^{n-k}, 0 ≤ k ≤ n; 0, otherwise.

cdf:
F_Y(t) = Σ_{k=0}^{⌊t⌋} C(n, k) p^k (1 - p)^{n-k}
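The binomial pmf and cdf above translate directly into code; the trial count n = 10 and p = 0.5 below are illustration values:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p): C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(t, n, p):
    """F_Y(t): sum of the pmf from k = 0 up to floor(t)."""
    return sum(binom_pmf(k, n, p) for k in range(int(t) + 1))

# 10 fair Bernoulli trials: P(exactly 5 successes)
print(binom_pmf(5, 10, 0.5))   # 252/1024 = 0.24609375
print(binom_cdf(10, 10, 0.5))  # the whole pmf sums to 1.0
```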
Applications of Binomial Distribution
When transmitting binary digits through a communication channel, the number of digits received correctly, C_n, out of n transmitted digits is given by a binomial distribution, where
p = probability of successfully transmitting one digit.

The probability of exactly i errors, and the probability of an error-free transmission, are given by:
P(i errors) = p_e(i) = C(n, i) (1 - p)^i p^{n-i}
P(error free) = p_e(0) = p^n
Geometric Distribution
Consider a sequence of Bernoulli trials up to and including the first success. We want to find the probability that it will take exactly n trials to produce the first success.

If p is the probability of a success and q = 1 - p the probability of a failure for each Bernoulli trial (recall independence), then Z, the number of trials up to and including the first success, has a geometric distribution with the respective pmf and cdf:

pmf:
p_Z(n) = q^{n-1} p = (1 - p)^{n-1} p, n = 1, 2, 3, ...

cdf:
F_Z(t) = Σ_{n=1}^{⌊t⌋} q^{n-1} p = p (1 - q^{⌊t⌋}) / (1 - q) = 1 - q^{⌊t⌋}, t ≥ 0.
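The geometric sum collapsing to the closed form 1 - q^⌊t⌋ is easy to spot-check numerically; p = 0.3 below is an arbitrary illustration value:

```python
p = 0.3
q = 1 - p

def geom_pmf(n):
    """P(Z = n) = q^(n-1) p, n = 1, 2, 3, ..."""
    return q**(n - 1) * p

def geom_cdf(t):
    """Closed form F_Z(t) = 1 - q^floor(t)."""
    return 1 - q**int(t)

# The closed form matches the partial sum of the pmf.
print(abs(geom_cdf(5) - sum(geom_pmf(n) for n in range(1, 6))) < 1e-12)  # True
```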
Memoryless Property
The geometric distribution is the only discrete distribution that exhibits the memoryless property, whereby future outcomes are independent of past events.

Let Z be the RV denoting the total number of trials up to and including the first success. Assume n trials have been completed, all failures. Let Y denote the additional trials up to and including the first success, i.e. Z = n + Y or Y = Z - n.

The conditional probability is given by:
q_i = P(Y = i | Z > n) = P(Z = n + i | Z > n) = ...
Memoryless Property (cont.)
Continuing:
P(Z = n + i | Z > n) = P(Z = n + i, Z > n) / P(Z > n)
  = P(Z = n + i) / P(Z > n)
  = p q^{n+i-1} / (1 - F_Z(n))
  = p q^{n+i-1} / q^n
  = p q^{i-1}
  = p_Z(i)

Thus, after n unsuccessful trials, the number of trials remaining until the first success has the same pmf as Z had originally.
Poisson Random Variable
A discrete RV X follows the Poisson distribution with parameter α if its pmf is:
P(X = k) = α^k e^{-α} / k!, k = 0, 1, 2, ...

Assuming λ is the rate of arrivals, P(X = k) with α = λt gives the probability of exactly k arrivals over some continuous interval (0, t].

In a small interval of length Δt, the probability of a new arrival is λΔt; the probability of two or more arrivals in an interval of length Δt may be neglected.

Suppose the interval (0, t] is divided into n subintervals of length t/n, and the arrival of a job in any subinterval is mutually independent of the other arrivals.
Poisson Approximation to Binomial

For large n, each of the n subintervals can be thought of as constituting a sequence of Bernoulli trials with probability of success p = λt/n. Therefore, the probability of k arrivals in a total of n subintervals is given by a binomial distribution with parameters (n, p):

P(X = k) = C(n, k) p^k (1 - p)^{n-k} = [n! / (k! (n - k)!)] (λt/n)^k (1 - λt/n)^{n-k}

As n → ∞ and p → 0, with np = λt (the rate of occurrence) moderate, the binomial distribution converges to a Poisson with parameter λt:

lim_{n→∞} C(n, k) (λt/n)^k (1 - λt/n)^{n-k}
  = [(λt)^k / k!] lim_{n→∞} [n(n-1)...(n-k+1) / n^k] (1 - λt/n)^n (1 - λt/n)^{-k}
  = (λt)^k e^{-λt} / k!,

since n(n-1)...(n-k+1)/n^k → 1, (1 - λt/n)^n → e^{-λt}, and (1 - λt/n)^{-k} → 1. Hence

P(X = k) = (λt)^k e^{-λt} / k!
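The convergence above can be seen numerically by comparing the two pmfs for a large n; λt = 2 and n = 10000 are arbitrary illustration values:

```python
from math import comb, exp, factorial

lam_t = 2.0   # alpha = lambda * t, the Poisson parameter (assumed value)
n = 10_000    # number of subintervals; per-trial success probability p = lam_t / n
p = lam_t / n

for k in range(5):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    poisson = exp(-lam_t) * lam_t**k / factorial(k)
    print(k, round(binom, 6), round(poisson, 6))  # the two columns nearly agree
```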
Poisson Distribution Example
In a bank, a rechargeable Bluetooth device sends a "ping" packet to a computer every time a customer enters the door. Customers arrive with a Poisson distribution of α customers per day. The Bluetooth device has a battery capacity of m Joules. Every packet consumes n Joules, therefore the device can send u = m/n packets before it runs out of battery. Assuming that the device starts fully charged in the morning, what is the probability that it runs out of energy by the end of the day?

P(Y ≥ u) = 1 - P(Y < u) = 1 - Σ_{k=0}^{u-1} α^k e^{-α} / k!
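A sketch of the battery calculation above; the arrival rate (30 customers/day) and packet budget (u = 40) are hypothetical values, not from the example:

```python
from math import exp, factorial

alpha = 30  # hypothetical: 30 customers/day on average
u = 40      # hypothetical packet budget u = m/n

# P(runs out) = P(Y >= u) = 1 - sum of the Poisson pmf for k = 0 .. u-1
p_out = 1 - sum(exp(-alpha) * alpha**k / factorial(k) for k in range(u))
print(p_out)  # a small but non-negligible tail probability
```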
Continuous Random Variables
The probability density function (pdf) of a continuous RV X is denoted f_X(x).
Some properties:
- f_X(x) ≥ 0 for all x
- ∫_{-∞}^{∞} f_X(x) dx = 1

The cumulative distribution function (cdf) of a continuous RV X is denoted F_X:
F_X(x) = P(X ≤ x) = ∫_{-∞}^{x} f_X(u) du, for all x

Some properties:
- F_X(x) must be a continuous function of x, monotonically non-decreasing in x
- lim_{x→-∞} F_X(x) = 0 and lim_{x→+∞} F_X(x) = 1
- P(X = c) = P(c ≤ X ≤ c) = ∫_c^c f_X(y) dy = 0
Continuous Random Variables (cont.)
If X is a continuous RV, its pdf and cdf can be derived from each other.

pdf:
f_X(x) = dF_X(x)/dx

cdf:
F_X(x) = P(X ≤ x) = ∫_{-∞}^{x} f_X(u) du, for all x
P(a < X ≤ b) = ∫_a^b f_X(u) du
Common Continuous Random Variables
- Uniform
- Exponential
- Gamma
- Gaussian (Normal)
Uniform Random Variable
The pdf is constant over the interval (a, b):
f(x) = 1/(b - a), a < x < b; 0, otherwise.

The cdf is a ramp function:
F(x) = 0, x ≤ a
F(x) = (x - a)/(b - a), a < x < b
F(x) = 1, x ≥ b
Exponential Distribution
Characteristics:
- Commonly used in communications and queuing theory.
- A non-negative continuous random variable.
- It exhibits the memoryless property (the continuous counterpart of the geometric distribution).
- Related to the (discrete) Poisson distribution.

Often used to model:
- Interarrival times between two IP packets (or voice calls).
- Service time distribution at a server.
- Time to failure, time to repair, etc.
Exponential Distribution (cont.)
A continuous RV X is exponentially distributed with parameter λ > 0 if it has the following:

pdf:
f(x) = λ e^{-λx}, if x ≥ 0; 0, if x < 0.

cdf:
F(x) = P(X ≤ x) = 1 - e^{-λx}, if x ≥ 0; 0, if x < 0.

Note that:
P(X > t) = ∫_t^∞ f(x) dx = e^{-λt}
Exponential and Poisson Distribution
Let N_t be a discrete RV representing the number of jobs arriving at a file server in the interval (0, t], and assume it is Poisson distributed with parameter λt. Let the RV X be the time to the next arrival:

P(X > t) = P(N_t = 0) = (λt)^0 e^{-λt} / 0! = e^{-λt}

Therefore F_X(t) = 1 - e^{-λt}, i.e. X is exponentially distributed with parameter λ.
Memoryless Property
Let the RV X be the time to the next failure. Assume X > t, i.e. we have observed that the component has not failed until time t.

The probability of having to wait at least h additional seconds, given that one has already been waiting t seconds, is the same as the probability of waiting at least h seconds:

P(X > t + h | X > t) = P({X > t + h} and {X > t}) / P(X > t)
  = P(X > t + h) / P(X > t)
  = e^{-λ(t+h)} / e^{-λt}
  = e^{-λh} = P(X > h)

The exponential is the only continuous distribution with the memoryless property.
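The chain of equalities above can be confirmed with a few lines of arithmetic; λ, t, and h below are arbitrary illustration values:

```python
from math import exp, isclose

lam, t, h = 0.5, 3.0, 2.0   # arbitrary illustration values

surv = lambda x: exp(-lam * x)   # survivor function P(X > x) for X ~ Exp(lam)
lhs = surv(t + h) / surv(t)      # P(X > t+h | X > t)
rhs = surv(h)                    # P(X > h)
print(isclose(lhs, rhs))  # True: the t seconds already waited are irrelevant
```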
Gamma Random Variable

The gamma function is defined as:
Γ(z) = ∫_0^∞ x^{z-1} e^{-x} dx, for z > 0

The pdf of a gamma RV X with parameters λ > 0 and α > 0 is given by:
f_X(x) = λ^α x^{α-1} e^{-λx} / Γ(α), 0 < x < ∞

Characteristics:
- The gamma RV is a basic distribution of statistics for non-negative variables.
- It is very versatile due to the gamma function Γ(z). By varying the parameters λ and α, the gamma pdf can fit many types of experimental data.
- Many RVs are special cases of the gamma RV, e.g. the Erlang, exponential, and chi-square distributions.
Gaussian (Normal) Random Variable

A basic distribution of statistics. Many applications arise from the central limit theorem (the average of n observed values approaches a normal distribution, irrespective of the form of the original distribution, under quite general conditions).

In a normal distribution, about 68% of the values are within one standard deviation of the mean, and about 95% of the values are within two standard deviations of the mean.
Normal Random Variable (cont.)

The RV X is Gaussian (normal) distributed if it has the following pdf:
f(x) = (1 / (σ√(2π))) e^{-(x-μ)² / (2σ²)}, -∞ < x < ∞,
where μ = mean, σ = standard deviation, and σ² = variance.

Characteristics:
- Bell shaped and symmetrical.
- μ and σ² completely describe the RV X, thus the RV can also be represented as X ~ N(μ, σ²).
- Central limit theorem: the sum of a large number of mutually independent RVs (having arbitrary distributions) starts following a normal distribution as n → ∞.
- This is significant in statistical estimation, communication theory, etc.
Standard Normal Distribution

Though the Gaussian pdf has a closed form, there exists no closed form for the Gaussian cdf. How do we determine P(a < X ≤ b)?

Answer: use tables, after a transformation to the standard normal distribution N(0, 1).

The pdf of Y = X² can be obtained by differentiation of its cdf:
f_Y(y) = dF_Y(y)/dy = (1 / (2√y)) [f_X(√y) + f_X(-√y)], y > 0; 0, otherwise.
Convolution Integral
For the special case Z = X + Y, where X and Y are non-negative, independent random variables:

f_Z(t) = ∫_0^t f_X(x) f_Y(t - x) dx, t ≥ 0

The above integral is often called the convolution of f_X and f_Y. Thus the density of the sum of two non-negative, independent, continuous random variables is the convolution of the individual densities, which is hard to calculate. That is why we use the generating function instead (which we will discuss later).
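As a concrete instance of the convolution integral, the sum of two iid exponential RVs has (a known result) the Erlang-2 density λ² t e^{-λt}; a crude numerical quadrature of the integral reproduces it. λ = 1.5 and t = 2 are arbitrary illustration values:

```python
from math import exp

lam = 1.5
f = lambda x: lam * exp(-lam * x)   # common density of X and Y ~ Exp(lam)

def conv_density(t, steps=10_000):
    """Riemann-sum evaluation of f_Z(t) = integral_0^t f_X(x) f_Y(t - x) dx."""
    dx = t / steps
    return sum(f(i * dx) * f(t - i * dx) for i in range(steps)) * dx

t = 2.0
# For two iid exponentials the convolution equals the Erlang-2 density.
print(conv_density(t), lam**2 * t * exp(-lam * t))
```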
Introduction: Expectation and Variance
In order to completely describe the behavior of a RV, an entire function (cdf or pdf) must be given. In some situations we are interested in just a few parameters that summarize the information provided by these functions.

For example, when a large collection of data is assembled, we are typically interested not in the individual numbers, but rather in certain quantities such as the average.

There are several ways to abstract information from the cdf/pdf into a single number: mean, median, mode, variance, etc.
Expected Value
The expected value (mean, average) of a RV X:

E[X] = X̄ = Σ_k x_k p_X(x_k)        (X is discrete)
          = ∫_{-∞}^{∞} x f_X(x) dx   (X is continuous)
          = ∫ x dF_X(x)              (general equation)

The expectation of X is a weighted average of the possible values that X can take on.

For a non-negative random variable, instead of using the density function, the expected value can be obtained through the distribution function:

E[X] = Σ_{k=0}^{∞} P(X > k)          (X is discrete)
     = ∫_0^∞ (1 - F_X(x)) dx         (X is continuous)
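The tail-sum identity for non-negative RVs can be checked against the direct definition; the geometric distribution with p = 0.25 (an arbitrary illustration value) has E[Z] = 1/p = 4 and P(Z > k) = q^k:

```python
p = 0.25
q = 1 - p
N = 2000  # truncation point; q**N is negligibly small

# Direct mean of the geometric pmf vs the tail-sum identity
# E[Z] = sum_{k>=0} P(Z > k), with P(Z > k) = q**k.
direct = sum(n * q**(n - 1) * p for n in range(1, N))
tail = sum(q**k for k in range(N))
print(round(direct, 6), round(tail, 6))  # both approach 1/p = 4.0
```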
Semi-Markov Process Example

The following distribution is of interest:
F_X(t) = 1 - (1 - t/T) e^{-λt}, 0 ≤ t < T
F_X(t) = 1, t ≥ T

The mean of the random variable with the above distribution function is:
E[X] = ∫_0^∞ (1 - F_X(t)) dt = ∫_0^T (1 - t/T) e^{-λt} dt = 1/λ - (1 - e^{-λT}) / (λ²T)
For the Poisson distribution with parameter α:
E[X²] = Σ_{j=0}^{∞} j² e^{-α} α^j / j! = α² + α

Variance: Var(X) = E[X²] - (E[X])² = α² + α - α² = α
Relationship of Continuous Distributions
Exponential Distribution
cdf: F(x) = 1 - e^{-λx}, x ≥ 0
pdf: f(x) = λ e^{-λx}, x ≥ 0

Mean and variance:
E[X] = ∫_0^∞ x f(x) dx = ∫_0^∞ λ x e^{-λx} dx
     = [-x e^{-λx}]_0^∞ + ∫_0^∞ e^{-λx} dx = 1/λ

E[X²] = ∫_0^∞ λ x² e^{-λx} dx
      = [-x² e^{-λx}]_0^∞ + ∫_0^∞ 2x e^{-λx} dx = 2/λ²

Var(X) = E[X²] - (E[X])² = 2/λ² - 1/λ² = 1/λ²
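The two moments derived above can be verified by simple numerical integration of x f(x) and x² f(x); λ = 2 and the quadrature settings are illustration values:

```python
from math import exp

lam = 2.0
f = lambda x: lam * exp(-lam * x)   # Exp(lam) density

def integrate(g, upper=30.0, steps=100_000):
    """Midpoint rule over [0, upper]; the tail beyond upper is negligible."""
    dx = upper / steps
    return sum(g((i + 0.5) * dx) * dx for i in range(steps))

mean = integrate(lambda x: x * f(x))
second = integrate(lambda x: x * x * f(x))
var = second - mean**2
print(round(mean, 4), round(var, 4))  # ≈ 1/lam = 0.5 and 1/lam**2 = 0.25
```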
Exponential Distribution (cont.)

- If a component obeys an exponential failure law with parameter λ (known as the failure rate), then its expected life, or mean time to failure (MTTF), is 1/λ.
- If repair time obeys an exponential distribution with parameter μ (known as the repair rate), then its expectation, or mean time to repair (MTTR), is 1/μ.
- If the interarrival times of jobs to a server are exponentially distributed with parameter λ (known as the arrival rate), then the mean (average) interarrival time is 1/λ.
- Finally, if the service time requirement of a job is an exponentially distributed random variable with parameter μ (known as the service rate), then the mean (average) service time is 1/μ.
Continuous Uniform Distribution
The density function is given by:
f_X(x) = 1/(b - a), a ≤ x ≤ b

Expectation is calculated as:
E[X] = ∫_a^b x/(b - a) dx = (b² - a²) / (2(b - a)) = (a + b)/2

The n-th moment is computed as:
E[X^n] = ∫_a^b x^n/(b - a) dx = (b^{n+1} - a^{n+1}) / ((n + 1)(b - a))

Therefore,
Var(X) = E[X²] - (E[X])² = (b - a)² / 12
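The moment formula above yields the variance directly; a and b below are arbitrary illustration values:

```python
def uniform_moment(n, a, b):
    """E[X^n] = (b^(n+1) - a^(n+1)) / ((n + 1)(b - a)) for X ~ Uniform(a, b)."""
    return (b**(n + 1) - a**(n + 1)) / ((n + 1) * (b - a))

a, b = 2.0, 8.0
mean = uniform_moment(1, a, b)            # (a + b) / 2 = 5.0
var = uniform_moment(2, a, b) - mean**2   # (b - a)**2 / 12 = 3.0
print(mean, var)  # 5.0 3.0
```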