Post on 23-Feb-2016
Ch 8: Fundamentals of Probability Theory
ENGR 4323/5323 Digital and Analog Communication
Engineering and Physics, University of Central Oklahoma
Dr. Mohamed Bingabr
Chapter Outline
• Concept of Probability
• Random Variables
• Statistical Averages (MEANS)
• Correlation
• Linear Mean Square Estimation
• Sum of Random Variables
• Central Limit Theorem
Deterministic and Random Signals
Deterministic Signals: Signals that can be described by a mathematical equation or a graph. Their future values can be predicted with 100% certainty.
Random Process Signals: Unpredictable message signals and noise waveforms. These types of signals are information-bearing, and they play key roles in communications.
Concept of Probability
Experiment: In probability theory, an experiment is a process whose outcome cannot be fully predicted (e.g., throwing a die).
Sample space: The set that contains all possible outcomes of an experiment, e.g., {1, 2, 3, 4, 5, 6}.
Sample point (element): A single outcome of an experiment, e.g., {3}.
Event: A subset of the sample space whose points share some common characteristic, e.g., {2, 4, 6} (an even number).
Complement of event A (Aᶜ): The event containing all points not in A, e.g., {1, 3, 5}.
Concept of Probability
Null event (ø): The event that has no sample point.
Union of events A and B (A ∪ B): The event that contains all points in A or in B or in both.
Intersection (joint) of events A and B (A ∩ B, AB): The event that contains all points common to events A and B.
Mutually Exclusive: Events A and B are mutually exclusive if the occurrence of A means that B cannot occur, i.e., A ∩ B = ø.
Relative frequency and Probability: If event A is of interest and an experiment is conducted N times, with A occurring N(A) times, then the relative frequency of A (its probability) is
$$P(A) = \lim_{N \to \infty} \frac{N(A)}{N}$$
Concept of Probability
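Joint Probability: The probability $P(A \cap B)$ that events A and B both occur.
If A and B are mutually exclusive (A ∩ B = ø), then $P(A \cap B) = 0$ and $P(A \cup B) = P(A) + P(B)$.
Conditional Probability: The probability of one event is influenced by the outcome of another event: $P(B \mid A) = P(A \cap B)/P(A)$, for $P(A) > 0$.
Independent Events: The occurrence of one event is not influenced by the occurrence of the other event: $P(B \mid A) = P(B)$, so $P(A \cap B) = P(A)\,P(B)$.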
[Figure: Venn diagram of events A and B in the sample space S]
Bernoulli Trials
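A Bernoulli trial is an experiment with two possible outcomes, success or failure. If the probability of success is p, then the probability of failure is (1 − p).
Number of ways to arrange k successes in n trials:
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
Probability of k successes in n trials:
$$P(k \text{ successes in } n \text{ trials}) = \frac{n!}{k!\,(n-k)!}\, p^{k} (1-p)^{n-k}$$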
Example 1
A binary symmetric channel (BSC) has an error probability Pe = 0.001 (i.e., the probability of receiving 0 when 1 is transmitted, or vice versa). Note that the channel behavior is symmetrical with respect to 0 and 1. A sequence of 8 binary digits is transmitted over this channel. Determine the probability of receiving exactly 2 digits in error.
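Each received digit can be treated as a Bernoulli trial in which "success" means a digit error, so the binomial formula applies with n = 8, k = 2, and p = Pe. The following is a minimal sketch of that calculation; the function name `prob_k_errors` is illustrative and not part of the original slides.

```python
from math import comb

def prob_k_errors(n: int, k: int, pe: float) -> float:
    """Probability of exactly k digit errors in n independent digits (binomial law)."""
    return comb(n, k) * pe**k * (1 - pe) ** (n - k)

# Example 1: 8 digits, Pe = 0.001, exactly 2 errors
p_two_errors = prob_k_errors(8, 2, 0.001)
print(f"P(exactly 2 errors in 8 digits) = {p_two_errors:.3e}")
```

The result is about 2.78 × 10⁻⁵, dominated by the factor 28 · Pe², since (1 − Pe)⁶ is very close to 1.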
Example 2
In binary communication, one of the techniques used to increase the reliability of a channel is to repeat a message several times. For example, we can send each message (0 or 1) three times. Hence, the transmitted digits are 000 (for message 0) or 111 (for message 1). Because of channel noise, we may receive any one of the eight possible combinations of three binary digits. The decision as to which message was transmitted is made by the majority rule. If Pe is the error probability of one digit and P(ϵ) is the probability of making a wrong decision in this scheme, find P(ϵ) in terms of Pe. If Pe = 0.01, what is P(ϵ)?
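With the majority rule, a wrong decision occurs when two or three of the three digits are in error, so P(ϵ) = 3Pe²(1 − Pe) + Pe³ = 3Pe² − 2Pe³. The sketch below evaluates this sum for a general odd number of repeats; the function name and the `repeats` parameter are illustrative assumptions, and digit errors are assumed independent.

```python
from math import comb

def majority_error(pe: float, repeats: int = 3) -> float:
    """Probability that more than half of `repeats` digits are in error."""
    threshold = repeats // 2 + 1  # smallest number of errors that flips the decision
    return sum(
        comb(repeats, k) * pe**k * (1 - pe) ** (repeats - k)
        for k in range(threshold, repeats + 1)
    )

p_wrong = majority_error(0.01)
print(f"P(wrong decision) = {p_wrong:.3e}")
```

For Pe = 0.01 this gives P(ϵ) = 2.98 × 10⁻⁴, compared with 10⁻² for a single unrepeated digit.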
Multiplication Rule for Conditional Probability
$$P(A \cap B) = P(A)\,P(B \mid A)$$
$$P(A_1 A_2 \cdots A_n) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 A_2) \cdots P(A_n \mid A_1 A_2 \cdots A_{n-1})$$
Example
Suppose a box of diodes consists of Ng good diodes and Nb bad diodes. If five diodes are randomly selected, one at a time, without replacement, determine the probability of obtaining the sequence of diodes in the order good, bad, good, good, bad.
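Applying the multiplication rule draw by draw, with N = Ng + Nb, gives P = (Ng/N) · (Nb/(N − 1)) · ((Ng − 1)/(N − 2)) · ((Ng − 2)/(N − 3)) · ((Nb − 1)/(N − 4)). The sketch below carries out the same chain of conditional probabilities with exact fractions. The box contents (6 good, 4 bad) are hypothetical values chosen only for illustration.

```python
from fractions import Fraction

def sequence_probability(n_good: int, n_bad: int, sequence: str) -> Fraction:
    """Probability of drawing the given G/B sequence without replacement."""
    prob = Fraction(1)
    for item in sequence:
        total = n_good + n_bad
        if item == "G":
            prob *= Fraction(n_good, total)  # P(good | earlier draws)
            n_good -= 1
        else:
            prob *= Fraction(n_bad, total)   # P(bad | earlier draws)
            n_bad -= 1
    return prob

# Hypothetical box: 6 good and 4 bad diodes
example = sequence_probability(6, 4, "GBGGB")
print(example)
```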
The Total Probability Theorem
Let n disjoint events A1, …, An form a partition of the sample space S such that
$$\bigcup_{i=1}^{n} A_i = S \quad \text{and} \quad A_i \cap A_j = \emptyset \ \text{ if } i \neq j$$
Then the probability of an event B can be written as
$$P(B) = \sum_{i=1}^{n} P(B \mid A_i)\,P(A_i)$$
This theorem simplifies the analysis of a more complex event of interest, B, by identifying all of its different causes Ai.
Example
The decoding of a data packet may be in error because of n distinct error patterns E1, E2, …, En that it encounters. These error patterns are mutually exclusive, each with probability P(Ei) = pi. When the error pattern Ei occurs, the data packet is incorrectly decoded with probability qi. Find the probability that the data packet is incorrectly decoded.
Bayes' Theorem
Bayes' theorem determines the likelihood of a particular cause of an event among many disjoint possible causes.
Theorem
Let n disjoint events A1, …, An form a partition of the sample space S. Let B be an event with P(B) > 0. Then for j = 1, …, n,
$$P(A_j \mid B) = \frac{P(B \mid A_j)\,P(A_j)}{P(B)} = \frac{P(B \mid A_j)\,P(A_j)}{\sum_{i=1}^{n} P(B \mid A_i)\,P(A_i)}$$
Example
A communication system always encounters one of three possible interference waveforms: F1, F2, or F3. The probabilities of these interferences are 0.8, 0.16, and 0.04, respectively. The communication system fails with probability 0.01, 0.1, and 0.4 when it encounters F1, F2, and F3, respectively. Given that the system has failed, find the probability that the failure is a result of F1, F2, or F3, respectively.
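This example combines the total probability theorem (to get the overall failure probability) with Bayes' theorem (to get the probability of each cause). A minimal sketch of the calculation follows; the variable names are illustrative.

```python
# Prior probabilities of the interference waveforms F1, F2, F3
priors = [0.8, 0.16, 0.04]
# Failure probabilities given each interference
fail_given = [0.01, 0.1, 0.4]

# Total probability theorem: P(fail) = sum of P(fail | Fi) P(Fi)
p_fail = sum(p * f for p, f in zip(priors, fail_given))

# Bayes' theorem: P(Fi | fail) = P(fail | Fi) P(Fi) / P(fail)
posteriors = [p * f / p_fail for p, f in zip(priors, fail_given)]

print(f"P(fail) = {p_fail:.3f}")
for i, post in enumerate(posteriors, start=1):
    print(f"P(F{i} | fail) = {post:.2f}")
```

The failure probability is 0.04, and the posterior probabilities are 0.2, 0.4, and 0.4. Although F1 is by far the most frequent interference, it is the least likely cause of an observed failure.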
Random Variable
A discrete random variable takes numerical values that result from mapping the sample points (outcomes of an experiment) to numbers.
For example, the outcomes of tossing a coin are {H, T}. We can assign 1 to a head and −1 to a tail, so the random variable x takes values in {1, −1}.
$$\sum_{i} P_{\mathrm{x}}(x_i) = 1$$
Random Variable
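For two independent random variables x and y (e.g., tossing two coins):
$$P_{\mathrm{xy}}(x_i, y_j) = P_{\mathrm{x}}(x_i)\,P_{\mathrm{y}}(y_j)$$
The joint probabilities sum to one:
$$\sum_{i}\sum_{j} P_{\mathrm{xy}}(x_i, y_j) = 1$$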
Example
A binary symmetric channel (BSC) has error probability Pe. The probability of transmitting 1 is Q, and that of transmitting 0 is 1 − Q. Determine the probabilities of receiving 1 and 0 at the receiver.
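One way to work this out is to condition on the transmitted digit. Writing x for the transmitted digit and y for the received digit (a labeling assumed here for illustration), a received 1 is either a correctly received 1 or a 0 received in error:

$$P_{\mathrm{y}}(1) = P_{\mathrm{x}}(1)\,P_{\mathrm{y|x}}(1 \mid 1) + P_{\mathrm{x}}(0)\,P_{\mathrm{y|x}}(1 \mid 0) = Q(1 - P_e) + (1 - Q)P_e$$

$$P_{\mathrm{y}}(0) = P_{\mathrm{x}}(0)\,P_{\mathrm{y|x}}(0 \mid 0) + P_{\mathrm{x}}(1)\,P_{\mathrm{y|x}}(0 \mid 1) = (1 - Q)(1 - P_e) + Q P_e$$

The two results sum to 1, as they must. Here Q is the transmission probability defined in the example, not the Gaussian Q-function.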
Conditional Probabilities
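If x and y are two RVs, then the conditional probability of x = xi given y = yj is denoted by $P_{\mathrm{x|y}}(x_i \mid y_j)$.
$$\sum_{i} P_{\mathrm{x|y}}(x_i \mid y_j) = \sum_{j} P_{\mathrm{y|x}}(y_j \mid x_i) = 1$$
$$\sum_{i}\sum_{j} P_{\mathrm{xy}}(x_i, y_j) = 1$$
$$P_{\mathrm{y}}(y_j) = \sum_{i} P_{\mathrm{xy}}(x_i, y_j)$$
$$P_{\mathrm{x}}(x_i) = \sum_{j} P_{\mathrm{xy}}(x_i, y_j)$$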
Conditional Probabilities
The marginal probabilities can also be written in terms of the conditional probabilities:
$$P_{\mathrm{y}}(y_j) = \sum_{i} P_{\mathrm{y|x}}(y_j \mid x_i)\,P_{\mathrm{x}}(x_i)$$
$$P_{\mathrm{x}}(x_i) = \sum_{j} P_{\mathrm{x|y}}(x_i \mid y_j)\,P_{\mathrm{y}}(y_j)$$
Example
Over a certain binary communication channel, the symbol 0 is transmitted with probability 0.4 and 1 is transmitted with probability 0.6. It is given that P(ϵ|0) = 10⁻⁶ and P(ϵ|1) = 10⁻⁴, where P(ϵ|xi) is the probability of a detection error given that xi is transmitted. Determine P(ϵ), the error probability of the channel.
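Averaging the conditional error probabilities over the transmitted symbols gives the worked result:

$$P(\epsilon) = P(0)\,P(\epsilon \mid 0) + P(1)\,P(\epsilon \mid 1) = 0.4 \times 10^{-6} + 0.6 \times 10^{-4} = 6.04 \times 10^{-5}$$

The total is dominated by errors on the symbol 1, whose conditional error probability is 100 times larger.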
Cumulative Distribution Function (CDF)
The CDF, Fx(x), of an RV x is the probability that x takes a value less than or equal to x:
$$F_{\mathrm{x}}(x) = P(\mathrm{x} \le x)$$
Properties of the CDF
1) Fx(x) ≥ 0
2) Fx(∞) = 1
3) Fx(−∞) = 0
4) Fx(x) is a nondecreasing function.
Example
In an experiment, a trial consists of four successive tosses of a coin. If we define an RV x as the number of heads appearing in a trial, determine Px(x) and Fx(x).
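The number of heads in four tosses follows the binomial law, and the CDF is the running sum of the probabilities. The sketch below tabulates both with exact fractions. It assumes a fair coin (probability of heads 1/2), which the example does not state explicitly.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb

n_tosses = 4

# P_x(k): probability of k heads in four tosses of a fair coin (assumed)
pmf = {k: Fraction(comb(n_tosses, k), 2**n_tosses) for k in range(n_tosses + 1)}

# F_x(k): running sum of the probabilities up to and including k
cdf = dict(zip(pmf, accumulate(pmf.values())))

for k in pmf:
    print(f"x = {k}: P_x = {pmf[k]}, F_x = {cdf[k]}")
```

The probabilities are 1/16, 4/16, 6/16, 4/16, and 1/16, so the CDF is a staircase with steps at x = 0, 1, 2, 3, 4 reaching 1/16, 5/16, 11/16, 15/16, and 1.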
Continuous Random Variable
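A continuous random variable takes values from a continuous range.
px(x) is the probability density function (PDF) that describes the relative frequency of occurrence of different values of x.
Cumulative distribution function:
$$F_{\mathrm{x}}(x) = \int_{-\infty}^{x} p_{\mathrm{x}}(u)\,du, \qquad p_{\mathrm{x}}(x) = \frac{dF_{\mathrm{x}}(x)}{dx}$$
Properties of the probability density function:
$$p_{\mathrm{x}}(x) \ge 0$$
$$\int_{-\infty}^{\infty} p_{\mathrm{x}}(x)\,dx = 1$$
$$P(x_1 < \mathrm{x} \le x_2) = \int_{x_1}^{x_2} p_{\mathrm{x}}(x)\,dx = F_{\mathrm{x}}(x_2) - F_{\mathrm{x}}(x_1)$$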
The Gaussian (Normal) Random Variable
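Standard Gaussian RV (µ = 0, σ = 1)
$$p_{\mathrm{x}}(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$
$$F_{\mathrm{x}}(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\,du$$
$$Q(y) = \frac{1}{\sqrt{2\pi}} \int_{y}^{\infty} e^{-x^2/2}\,dx$$
$$Q(x) = 1 - F_{\mathrm{x}}(x)$$
$$F_{\mathrm{x}}(x) = P(\mathrm{x} \le x) = 1 - Q(x)$$
$$P(\mathrm{x} > x) = Q(x)$$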
The Gaussian (Normal) Random Variable
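General Gaussian RV (mean m, standard deviation σ)
$$p_{\mathrm{x}}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-m)^2/2\sigma^2}$$
$$F_{\mathrm{x}}(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-(u-m)^2/2\sigma^2}\,du$$
$$F_{\mathrm{x}}(x) = P(\mathrm{x} \le x) = 1 - Q\!\left(\frac{x-m}{\sigma}\right)$$
$$P(\mathrm{x} > x) = Q\!\left(\frac{x-m}{\sigma}\right)$$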
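The Q-function has no closed form, but it can be evaluated through the complementary error function using the identity Q(x) = ½ erfc(x/√2). The following is a minimal sketch using Python's standard `math.erfc`; the function names `q_function` and `gaussian_cdf` are illustrative.

```python
from math import erfc, sqrt

def q_function(x: float) -> float:
    """Tail probability P(x > x) of a standard Gaussian RV."""
    return 0.5 * erfc(x / sqrt(2))

def gaussian_cdf(x: float, m: float = 0.0, sigma: float = 1.0) -> float:
    """CDF of a Gaussian RV with mean m and standard deviation sigma."""
    return 1.0 - q_function((x - m) / sigma)

print(f"Q(1) = {q_function(1.0):.4f}")
print(f"F(3) for m = 1, sigma = 2: {gaussian_cdf(3.0, m=1.0, sigma=2.0):.4f}")
```

Both printed values come from the same normalized argument (x − m)/σ = 1, so they sum to 1.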
Example
Over a certain binary channel, messages m = 0 and 1 are transmitted with equal probability by using a positive and a negative pulse, respectively. The received pulse corresponding to 1 is p(t), shown in the figure, and the received pulse corresponding to 0 is −p(t). Let the peak amplitude of p(t) be Ap at t = Tp. The channel noise n(t) has a normal distribution with zero mean and standard deviation σn. Because of the channel noise, the received pulse will be
$$r(t) = \pm p(t) + n(t)$$
What is the probability of error Pe?
Example (cont.)
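$$P_e = \sum_{i} P(\epsilon, m_i) = \sum_{i} P(m_i)\,P(\epsilon \mid m_i) = P(0)\,P(\epsilon \mid 0) + P(1)\,P(\epsilon \mid 1)$$
$$P(\epsilon \mid 0) = P(n > A_p) = Q\!\left(\frac{A_p}{\sigma_n}\right)$$
$$P(\epsilon \mid 1) = P(n < -A_p) = Q\!\left(\frac{A_p}{\sigma_n}\right)$$
$$P_e = Q\!\left(\frac{A_p}{\sigma_n}\right)$$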
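The result Pe = Q(Ap/σn) can be checked numerically with a small Monte Carlo simulation. The sketch below assumes the receiver samples the pulse at t = Tp and decides 1 when the sample is positive, which is the decision rule implied by the conditional error probabilities above. The values Ap = 1 and σn = 0.5, the trial count, and the seed are arbitrary choices for illustration.

```python
import random
from math import erfc, sqrt

def q_function(x: float) -> float:
    """Tail probability of a standard Gaussian RV."""
    return 0.5 * erfc(x / sqrt(2))

def simulate_error_rate(a_p: float, sigma_n: float, trials: int, seed: int = 0) -> float:
    """Estimate Pe for polar signaling with a zero-threshold detector."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        bit = rng.random() < 0.5               # messages 0 and 1 are equiprobable
        amplitude = a_p if bit else -a_p       # 1 -> +Ap, 0 -> -Ap at t = Tp
        sample = amplitude + rng.gauss(0.0, sigma_n)
        decided = sample > 0                   # threshold detector at zero
        errors += (decided != bit)
    return errors / trials

a_p, sigma_n = 1.0, 0.5                        # hypothetical values
theory = q_function(a_p / sigma_n)
estimate = simulate_error_rate(a_p, sigma_n, 100_000)
print(f"theory Q(Ap/sigma_n) = {theory:.5f}, simulated = {estimate:.5f}")
```

The simulated error rate should agree with Q(2) ≈ 0.0228 to within the statistical scatter of the estimate.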
Joint Distribution
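For two RVs x and y, the joint CDF Fxy(x, y) is
$$F_{\mathrm{xy}}(x, y) = P(\mathrm{x} \le x \text{ and } \mathrm{y} \le y)$$
Joint PDF:
$$p_{\mathrm{xy}}(x, y) = \frac{\partial^2}{\partial x\,\partial y} F_{\mathrm{xy}}(x, y)$$
$$P(x_1 < \mathrm{x} \le x_2,\ y_1 < \mathrm{y} \le y_2) = \int_{x_1}^{x_2}\!\int_{y_1}^{y_2} p_{\mathrm{xy}}(x, y)\,dy\,dx$$
Marginal PDFs:
$$p_{\mathrm{x}}(x) = \int_{-\infty}^{\infty} p_{\mathrm{xy}}(x, y)\,dy$$
$$p_{\mathrm{y}}(y) = \int_{-\infty}^{\infty} p_{\mathrm{xy}}(x, y)\,dx$$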
Conditional Densities
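For two RVs x and y, the conditional densities are
$$p_{\mathrm{x|y}}(x \mid y) = \frac{p_{\mathrm{xy}}(x, y)}{p_{\mathrm{y}}(y)}$$
$$p_{\mathrm{y|x}}(y \mid x) = \frac{p_{\mathrm{xy}}(x, y)}{p_{\mathrm{x}}(x)}$$
Bayes' rule
$$p_{\mathrm{x|y}}(x \mid y)\,p_{\mathrm{y}}(y) = p_{\mathrm{y|x}}(y \mid x)\,p_{\mathrm{x}}(x)$$
Independent Random Variables
$$p_{\mathrm{x|y}}(x \mid y) = p_{\mathrm{x}}(x)$$
$$p_{\mathrm{y|x}}(y \mid x) = p_{\mathrm{y}}(y)$$
$$p_{\mathrm{xy}}(x, y) = p_{\mathrm{x}}(x)\,p_{\mathrm{y}}(y)$$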
Rayleigh Density Example
Derive the Rayleigh probability density function (PDF).
$$p_{\mathrm{r}}(r) = \begin{cases} \dfrac{r}{\sigma^2}\, e^{-r^2/2\sigma^2}, & r \ge 0 \\ 0, & r < 0 \end{cases}$$
Statistical Averages (MEANS)
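The average value or expected value of an RV x
Discrete RV:
$$\overline{\mathrm{x}} = E[\mathrm{x}] = \sum_{i=1}^{n} x_i\,P_{\mathrm{x}}(x_i)$$
Continuous RV:
$$\overline{\mathrm{x}} = E[\mathrm{x}] = \int_{-\infty}^{\infty} x\,p_{\mathrm{x}}(x)\,dx$$
Mean of a function g(x) of a random variable x
$$\overline{g(\mathrm{x})} = \sum_{i=1}^{n} g(x_i)\,P_{\mathrm{x}}(x_i)$$
$$\overline{g(\mathrm{x})} = \int_{-\infty}^{\infty} g(x)\,p_{\mathrm{x}}(x)\,dx$$
For example, the random variable x can represent alphabetic letters and the function g(x) can be their PCM encoding.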
Example
The output voltage of a sinusoid generator is A cos(ωt). This output is sampled randomly. The sampled output is an RV x, which can take on any value in the range (−A, A). Determine the mean value and the mean square value of the sampled output.
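A sketch of the solution, under the assumption that "sampled randomly" means the phase θ = ωt at the sampling instant is uniformly distributed over (0, 2π), so that x = A cos θ with $p_{\theta}(\theta) = 1/2\pi$:

$$\overline{\mathrm{x}} = \frac{1}{2\pi}\int_{0}^{2\pi} A\cos\theta\,d\theta = 0$$

$$\overline{\mathrm{x}^2} = \frac{1}{2\pi}\int_{0}^{2\pi} A^2\cos^2\theta\,d\theta = \frac{A^2}{2}$$

This uses the mean of a function g(θ) = A cos θ of the RV θ, which avoids having to find the PDF of x itself. The mean square value A²/2 equals the average power of the sinusoid.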
Statistical Averages (MEANS)
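Mean of the Sum
$$\overline{\mathrm{x} + \mathrm{y}} = \overline{\mathrm{x}} + \overline{\mathrm{y}}$$
Mean of the product
$$\overline{g_1(\mathrm{x})\,g_2(\mathrm{y})} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g_1(x)\,g_2(y)\,p_{\mathrm{xy}}(x, y)\,dx\,dy$$
If RVs x and y are independent, then
$$\overline{g_1(\mathrm{x})\,g_2(\mathrm{y})} = \int_{-\infty}^{\infty} g_1(x)\,p_{\mathrm{x}}(x)\,dx \int_{-\infty}^{\infty} g_2(y)\,p_{\mathrm{y}}(y)\,dy = \overline{g_1(\mathrm{x})}\;\overline{g_2(\mathrm{y})}$$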
Moments
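The nth moment of an RV x
$$\overline{\mathrm{x}^n} = \int_{-\infty}^{\infty} x^n\,p_{\mathrm{x}}(x)\,dx$$
The nth central moment of an RV x
$$\overline{(\mathrm{x} - \overline{\mathrm{x}})^n} = \int_{-\infty}^{\infty} (x - \overline{\mathrm{x}})^n\,p_{\mathrm{x}}(x)\,dx$$
The variance σx² and standard deviation σx
$$\sigma_{\mathrm{x}}^2 = \overline{(\mathrm{x} - \overline{\mathrm{x}})^2} = \overline{\mathrm{x}^2} - \overline{\mathrm{x}}^{\,2}$$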
Variance of a Sum of Independent RVs
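$$\mathrm{z} = \mathrm{x} + \mathrm{y}, \qquad \sigma_{\mathrm{z}}^2 = \sigma_{\mathrm{x}}^2 + \sigma_{\mathrm{y}}^2$$
Example: Find the total mean square error in PCM.
[Figure: block diagram, m → Quantization → m̂ → Channel → m̃]
$$q = m - \hat{m}, \qquad \epsilon = \hat{m} - \tilde{m}$$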
Chebyshev’s Inequality
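The standard deviation σ of an RV x is a measure of the width of its PDF. In communication, the standard deviation is also used to estimate the bandwidth of a signal spectrum.
$$P(|\mathrm{x} - \overline{\mathrm{x}}| \le k\sigma_{\mathrm{x}}) \ge 1 - \frac{1}{k^2}$$
For a zero-mean RV:
$$P(|\mathrm{x}| \le k\sigma_{\mathrm{x}}) \ge 1 - \frac{1}{k^2}$$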
Correlation
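The covariance is a measure of the nature of the dependence between the RVs x and y.
$$\sigma_{\mathrm{xy}} = \overline{(\mathrm{x} - \overline{\mathrm{x}})(\mathrm{y} - \overline{\mathrm{y}})} = \overline{\mathrm{xy}} - \overline{\mathrm{x}}\;\overline{\mathrm{y}}$$
The correlation coefficient is a normalized covariance.
$$\rho_{\mathrm{xy}} = \frac{\sigma_{\mathrm{xy}}}{\sigma_{\mathrm{x}}\,\sigma_{\mathrm{y}}}, \qquad -1 \le \rho_{\mathrm{xy}} \le 1$$
Independent variables are uncorrelated; the converse is not necessarily true.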
Linear Mean Square Estimation
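When two random variables x and y are related (dependent), it is possible to estimate the value of y from a knowledge of the value of x. A linear estimate has the form
$$\hat{\mathrm{y}} = a\,\mathrm{x}$$
Minimum mean square error is one possible criterion for the estimation of y:
$$\overline{\epsilon^2} = \overline{(\mathrm{y} - \hat{\mathrm{y}})^2} = \overline{(\mathrm{y} - a\mathrm{x})^2}$$
The optimum estimate chooses a to make the mean square error a minimum:
$$\frac{\partial \overline{\epsilon^2}}{\partial a} = 2a\,\overline{\mathrm{x}^2} - 2\,\overline{\mathrm{xy}} = 0 \quad\Longrightarrow\quad a = \frac{\overline{\mathrm{xy}}}{\overline{\mathrm{x}^2}} = \frac{R_{\mathrm{xy}}}{R_{\mathrm{xx}}}$$
With this choice of a, the error is orthogonal to the data:
$$\overline{\epsilon\,\mathrm{x}} = \overline{\left(\mathrm{y} - \frac{R_{\mathrm{xy}}}{R_{\mathrm{xx}}}\,\mathrm{x}\right)\mathrm{x}} = 0$$
Mean square error
$$\overline{\epsilon^2} = \overline{(\mathrm{y} - a\mathrm{x})^2} = \overline{(\mathrm{y} - a\mathrm{x})\,\mathrm{y}} - a\,\overline{\epsilon\,\mathrm{x}}$$
$$\overline{\epsilon^2} = \overline{(\mathrm{y} - a\mathrm{x})\,\mathrm{y}} = R_{\mathrm{yy}} - a R_{\mathrm{xy}}$$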
Using n Random Variable for Estimation
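Using n random variables x1, x2, …, xn to estimate a random variable x0:
$$\hat{\mathrm{x}}_0 = a_1\mathrm{x}_1 + a_2\mathrm{x}_2 + \dots + a_n\mathrm{x}_n$$
$$\overline{\epsilon^2} = \overline{\left[\mathrm{x}_0 - (a_1\mathrm{x}_1 + a_2\mathrm{x}_2 + \dots + a_n\mathrm{x}_n)\right]^2}$$
Setting the derivative with respect to each ai to zero:
$$\frac{\partial \overline{\epsilon^2}}{\partial a_i} = -2\,\overline{\left[\mathrm{x}_0 - (a_1\mathrm{x}_1 + a_2\mathrm{x}_2 + \dots + a_n\mathrm{x}_n)\right]\mathrm{x}_i} = 0$$
$$R_{0i} = a_1 R_{i1} + a_2 R_{i2} + \dots + a_n R_{in}, \qquad \text{where } R_{ij} = \overline{\mathrm{x}_i \mathrm{x}_j}$$
In matrix form:
$$\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} R_{11} & R_{12} & \cdots & R_{1n} \\ R_{21} & R_{22} & \cdots & R_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ R_{n1} & R_{n2} & \cdots & R_{nn} \end{bmatrix}^{-1} \begin{bmatrix} R_{01} \\ R_{02} \\ \vdots \\ R_{0n} \end{bmatrix}$$
Mean square error:
$$\overline{\epsilon^2} = R_{00} - (a_1 R_{01} + a_2 R_{02} + \dots + a_n R_{0n})$$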
Example
In differential pulse code modulation (DPCM), instead of transmitting sample values directly, we estimate (predict) the value of each sample from the knowledge of the previous n samples. The estimation error ϵk, the difference between the actual value and the estimated value of the kth sample, is quantized and transmitted. Because the estimation error ϵk is smaller than the sample value mk, for the same number of quantization levels, the SNR is increased. The SNR improvement is equal to $\overline{m^2}/\overline{\epsilon^2}$, where $\overline{m^2}$ and $\overline{\epsilon^2}$ are the mean square values of the speech signal and the estimation error ϵ, respectively.
Find the optimum linear second-order predictor and the corresponding SNR improvement.
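For a second-order predictor, x0 = mk, x1 = mk−1, and x2 = mk−2, so the optimum coefficients follow from a 2 × 2 system in the correlations Rij. The sketch below solves that system by Cramer's rule and evaluates the mean square error. The correlation values are hypothetical, normalized so that R00 = R11 = R22 = 1, and chosen only for illustration since the example does not list them. The function name is also illustrative.

```python
from math import log10

def second_order_predictor(r00, r01, r02, r11, r12, r22):
    """Solve R a = r for n = 2 and return (a1, a2, mean square error)."""
    det = r11 * r22 - r12 * r12          # determinant of the 2x2 correlation matrix
    a1 = (r01 * r22 - r12 * r02) / det   # Cramer's rule
    a2 = (r11 * r02 - r12 * r01) / det
    mse = r00 - (a1 * r01 + a2 * r02)    # error = R00 - (a1 R01 + a2 R02)
    return a1, a2, mse

# Hypothetical normalized correlations of a stationary signal:
# R01 = R12 (adjacent samples), R02 (samples two apart)
a1, a2, mse = second_order_predictor(
    r00=1.0, r01=0.825, r02=0.562, r11=1.0, r12=0.825, r22=1.0
)
snr_gain_db = 10 * log10(1.0 / mse)      # SNR improvement = mean square signal / error
print(f"a1 = {a1:.4f}, a2 = {a2:.4f}, mse = {mse:.4f}, gain = {snr_gain_db:.1f} dB")
```

With these assumed correlations, the prediction error has roughly a quarter of the signal's mean square value, which corresponds to an SNR improvement of about 5.6 dB.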
Sum of Random Variables
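How does the PDF of z = x + y relate to the PDFs of x and y?
$$F_{\mathrm{z}}(z) = P(\mathrm{z} \le z) = P(\mathrm{x} \le \infty,\ \mathrm{y} \le z - \mathrm{x}) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{z-x} p_{\mathrm{xy}}(x, y)\,dy$$
$$p_{\mathrm{z}}(z) = \frac{dF_{\mathrm{z}}(z)}{dz} = \int_{-\infty}^{\infty} p_{\mathrm{xy}}(x, z - x)\,dx$$
If x and y are independent random variables:
$$p_{\mathrm{z}}(z) = \int_{-\infty}^{\infty} p_{\mathrm{x}}(x)\,p_{\mathrm{y}}(z - x)\,dx$$
The PDF of z is the convolution of the PDFs of x and y.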
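The same convolution relation holds for discrete RVs, with the integral replaced by a sum over probabilities. The sketch below illustrates it for the sum of two independent fair dice, an example chosen here for illustration and not taken from the slides.

```python
from fractions import Fraction

def convolve_pmfs(pmf_x: dict, pmf_y: dict) -> dict:
    """PMF of z = x + y for independent discrete RVs (discrete convolution)."""
    pmf_z = {}
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            # Each pair (x, y) contributes P_x(x) P_y(y) to the value z = x + y
            pmf_z[x + y] = pmf_z.get(x + y, Fraction(0)) + px * py
    return pmf_z

die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = convolve_pmfs(die, die)

for total in sorted(two_dice):
    print(f"P(z = {total}) = {two_dice[total]}")
```

Convolving two flat PMFs yields a triangular PMF peaking at z = 7, already a first hint of the bell shape that emerges as more independent RVs are added.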
Sum of Gaussian Random Variables
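The sum of jointly distributed Gaussian random variables is also a Gaussian random variable, regardless of their relationship (such as independence).
If x1 and x2 are jointly Gaussian but not necessarily independent, then
$$\mathrm{y} = \mathrm{x}_1 + \mathrm{x}_2$$
is a Gaussian RV with
$$\overline{\mathrm{y}} = \overline{\mathrm{x}}_1 + \overline{\mathrm{x}}_2, \qquad \sigma_{\mathrm{y}}^2 = \sigma_{\mathrm{x}_1}^2 + \sigma_{\mathrm{x}_2}^2 + 2\sigma_{\mathrm{x}_1\mathrm{x}_2}$$
If x1 and x2 are independent (so that the covariance is zero), this reduces to
$$\sigma_{\mathrm{y}}^2 = \sigma_{\mathrm{x}_1}^2 + \sigma_{\mathrm{x}_2}^2$$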
Sum of Gaussian Random Variables
The fact that the sum of jointly distributed Gaussian random variables is also a Gaussian random variable has important practical applications.
For example, if xk is a sequence of jointly Gaussian signal samples passing through a discrete-time filter with impulse response {hi}, then the filter output yk is also Gaussian:
$$\mathrm{y}_k = \sum_{i=0}^{\infty} h_i\,\mathrm{x}_{k-i}$$
The Central Limit Theorem
The sum of a large number of independent RVs tends to be a Gaussian random variable, independently of the probability densities of the variables added.
The Central Limit Theorem (for the Sample Mean)
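Let x1, x2, …, xn be independent random variables from a given distribution with mean µ and variance σ², with 0 < σ² < ∞. Then, for large n, the sample mean
$$\overline{\mathrm{x}}_n = \frac{1}{n}\sum_{i=1}^{n} \mathrm{x}_i$$
is approximately a Gaussian random variable with mean µ and variance σ²/n.
$$\lim_{n \to \infty} P\!\left[\frac{\overline{\mathrm{x}}_n - \mu}{\sigma/\sqrt{n}} \le x\right] = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-v^2/2}\,dv$$
$$\lim_{n \to \infty} P\!\left[\frac{\overline{\mathrm{x}}_n - \mu}{\sigma/\sqrt{n}} > x\right] = Q(x)$$
Also, the sum $\sum_{i=1}^{n} \mathrm{x}_i$ is approximately a Gaussian random variable with mean nµ and variance nσ².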
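The statement about the mean and variance of the sample mean can be checked with a short simulation. The sketch below draws the xi from a uniform distribution on (0, 1), for which µ = 1/2 and σ² = 1/12. The distribution, the values n = 30 and 5000 trials, and the seed are arbitrary choices for illustration.

```python
import random
from statistics import fmean, pvariance

def sample_means(n: int, trials: int, seed: int = 0) -> list:
    """Return `trials` sample means, each over n uniform(0, 1) draws."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]

n = 30
mu, var = 0.5, 1 / 12                    # mean and variance of uniform(0, 1)
means = sample_means(n, trials=5000)

observed_mean = fmean(means)
observed_var = pvariance(means)
print(f"mean of sample means: {observed_mean:.4f} (theory {mu:.4f})")
print(f"variance of sample means: {observed_var:.5f} (theory {var / n:.5f})")
```

The observed values should land close to µ and σ²/n. A histogram of `means` would also show the bell shape, even though each individual xi is uniformly distributed.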