
Review of Probability

Probability Theory: Many techniques in speech processing require the manipulation of probabilities and statistics.

The two principal application areas we will encounter are:

- Statistical pattern recognition.
- Modeling of linear systems.

Events: It is customary to refer to the probability of an event.

An event is a certain set of possible outcomes of an experiment or trial.

Outcomes are assumed to be mutually exclusive and, taken together, to cover all possibilities.

Axioms of Probability: To any event A we can assign a number, P(A), which satisfies the following axioms:

- P(A) ≥ 0.
- P(S) = 1, where S is the certain event (the set of all possible outcomes).
- If A and B are mutually exclusive, then P(A+B) = P(A) + P(B).

The number P(A) is called the probability of A.

Axioms of Probability (some consequences): Some immediate consequences:

- If $\bar{A}$ is the complement of A, so that $A + \bar{A} = S$, then $P(\bar{A}) = 1 - P(A)$.
- $P(\emptyset)$, the probability of the impossible event, is 0.
- P(A) ≤ 1.
- If two events A and B are not mutually exclusive, we can show that P(A+B) = P(A) + P(B) - P(AB).
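These consequences can be checked on a small example. The sketch below assumes a single roll of a fair die as the experiment; the events A and B are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

# Sample space: one roll of a fair die (all outcomes equally likely).
S = {1, 2, 3, 4, 5, 6}

def P(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event & S), len(S))

A = {2, 4, 6}   # "roll is even"
B = {4, 5, 6}   # "roll is greater than 3"

p_union = P(A | B)                   # P(A+B), A and B are not mutually exclusive
p_formula = P(A) + P(B) - P(A & B)   # P(A) + P(B) - P(AB)
p_complement = P(S - A)              # probability of the complement of A

print(p_union, p_formula, p_complement)
```

Exact fractions are used so that the two sides of each identity can be compared without rounding error.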

Conditional Probability: The conditional probability of an event A, given that event B has occurred, is defined as:

$P(A|B) = \frac{P(AB)}{P(B)}$

We can infer P(B|A) by means of Bayes' theorem:

$P(B|A) = \frac{P(A|B)\,P(B)}{P(A)}$
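As a worked illustration of Bayes' theorem, the following sketch uses made-up numbers: B stands for "frame is voiced" and A for "energy is above a threshold". The event names and probabilities are assumptions chosen only for the arithmetic.

```python
def bayes(p_a_given_b, p_b, p_a):
    """Return P(B|A) = P(A|B) P(B) / P(A)."""
    if p_a <= 0:
        raise ValueError("P(A) must be positive")
    return p_a_given_b * p_b / p_a

# Hypothetical values, not taken from the slides.
p_b = 0.4              # P(B): prior probability of a voiced frame
p_a_given_b = 0.9      # P(A|B): high energy given voiced
p_a_given_not_b = 0.2  # P(A|not B): high energy given unvoiced

# P(A) by total probability: P(A|B)P(B) + P(A|not B)P(not B)
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

p_b_given_a = bayes(p_a_given_b, p_b, p_a)
print(round(p_a, 3), round(p_b_given_a, 3))
```

Observing high energy raises the probability of "voiced" from the prior 0.4 to 0.75 in this example.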

Independence: Events A and B may have nothing to do with each other, in which case they are said to be independent.

Two events are independent if P(AB) = P(A)P(B). From the definition of conditional probability, it follows that:

$P(A|B) = P(A), \qquad P(B|A) = P(B)$

$P(A+B) = P(A) + P(B) - P(A)\,P(B)$

Independence: Three events A, B and C are independent only if:

$P(AB) = P(A)\,P(B)$

$P(AC) = P(A)\,P(C)$

$P(BC) = P(B)\,P(C)$

$P(ABC) = P(A)\,P(B)\,P(C)$
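The condition on P(ABC) is needed in addition to the three pairwise conditions. A minimal sketch, assuming two tosses of a fair coin and three illustrative events, shows a case where every pair is independent but the triple is not:

```python
from fractions import Fraction
from itertools import product

# Two fair coin tosses: four equally likely outcomes.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def both(*events):
    """Intersection of several events."""
    return lambda o: all(e(o) for e in events)

A = lambda o: o[0] == "H"    # first toss is heads
B = lambda o: o[1] == "H"    # second toss is heads
C = lambda o: o[0] == o[1]   # the two tosses agree

pairwise = (
    prob(both(A, B)) == prob(A) * prob(B)
    and prob(both(A, C)) == prob(A) * prob(C)
    and prob(both(B, C)) == prob(B) * prob(C)
)
triple = prob(both(A, B, C)) == prob(A) * prob(B) * prob(C)

print(pairwise, triple)
```

Here P(ABC) = 1/4 while P(A)P(B)P(C) = 1/8, so the three events are not independent even though each pair is.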

Random Variables: A random variable is a number chosen at random as the outcome of an experiment. Random variables may be real or complex and may be discrete or continuous. In S.P., the random variables encountered are most often real and discrete. We can characterize a random variable by its probability distribution or by its probability density function (pdf).

Random Variables (distribution function): The distribution function for a random variable y is the probability that y does not exceed some value u,

$F_y(u) = P(y \le u)$

and hence

$P(u < y \le v) = F_y(v) - F_y(u)$

Random Variables (probability density function): The probability density function is the derivative of the distribution:

$f_y(u) = \frac{d}{du} F_y(u)$

$P(u < y \le v) = \int_u^v f_y(y)\,dy$

$F_y(\infty) = 1$

$\int_{-\infty}^{\infty} f_y(y)\,dy = 1$
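The relation between a density and its distribution function can be checked numerically. The sketch below assumes a hypothetical density f(y) = 2y on (0, 1), whose distribution function is F(u) = u^2, and uses a simple midpoint rule for the integrals.

```python
def f(y):
    """Hypothetical density: 2y on (0, 1), zero elsewhere."""
    return 2.0 * y if 0.0 < y < 1.0 else 0.0

def F(u):
    """Distribution function of the density above: P(y <= u)."""
    if u <= 0.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return u * u

def integrate(func, lo, hi, steps=30000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * h) for i in range(steps)) * h

u, v = 0.2, 0.7
p_interval = integrate(f, u, v)   # P(u < y <= v) from the density
p_from_F = F(v) - F(u)            # the same probability from F
total = integrate(f, -1.0, 2.0)   # the density integrates to 1

print(round(p_interval, 6), round(p_from_F, 6), round(total, 6))
```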

Random Variables (expected value): We can also characterize a random variable by its statistics. The expected value of g(x) is written E{g(x)} or <g(x)> and defined as

Continuous random variable: $\langle g(x) \rangle = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx$

Discrete random variable: $\langle g(x) \rangle = \sum_x g(x)\,p(x)$

Random Variables (moments): The statistics of greatest interest are the moments of X. The kth moment of X is the expected value of $X^k$. For a discrete random variable:

$m_k = \langle X^k \rangle = \sum_x x^k\,p(x)$

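Random Variables (mean & variance): The first moment, $m_1$, is the mean of x.

Continuous: $\bar{X} = \langle X \rangle = \int_{-\infty}^{\infty} x\,f(x)\,dx$

Discrete: $\bar{X} = \langle X \rangle = \sum_x x\,p(x)$

The second central moment, also known as the variance of p(x), is given by

$\sigma^2 = \langle (X - \bar{X})^2 \rangle = \sum_x (x - \bar{X})^2\,p(x) = m_2 - \bar{X}^2$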

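Random Variables (estimating statistics): To estimate the statistics of a random variable, we repeat the experiment which generates the variable a large number of times. If the experiment is run N times, then each value x will occur approximately Np(x) times, thus

$\bar{X} = \sum_x x\,p(x) \approx \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad m_k \approx \frac{1}{N} \sum_{i=1}^{N} x_i^k$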
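The moments defined above, and their estimation from repeated trials, can be sketched as follows. The fair six-sided die, the seed, and the number of trials are illustrative assumptions.

```python
import random
from fractions import Fraction

# pmf of a fair six-sided die (an illustrative choice, not from the slides).
p = {x: Fraction(1, 6) for x in range(1, 7)}

def moment(pmf, k):
    """kth moment: sum over x of x**k * p(x)."""
    return sum(x**k * px for x, px in pmf.items())

mean = moment(p, 1)                 # first moment
variance = moment(p, 2) - mean**2   # second central moment

# Estimate the same statistics by repeating the experiment N times.
random.seed(1)
N = 100_000
samples = [random.randint(1, 6) for _ in range(N)]
sample_mean = sum(samples) / N
sample_var = sum((x - sample_mean) ** 2 for x in samples) / N

print(mean, variance)
print(sample_mean, sample_var)
```

The exact values are 7/2 and 35/12; the sample estimates approach them as N grows.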

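Random Variables (Uniform density): A random variable has a uniform density on the interval (a, b) if:

$f_X(x) = \begin{cases} 1/(b-a), & a < x < b \\ 0, & \text{otherwise} \end{cases}$

Its distribution function on the interval is

$F_X(x) = (x-a)/(b-a), \quad a < x < b$

and its variance is

$\sigma^2 = \frac{(b-a)^2}{12}$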

Random Variables (Gaussian density): The Gaussian, or normal, density function is given by:

$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}$

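Random Variables (…Gaussian density): The distribution function of a normal variable is:

$N(x; \mu, \sigma) = \int_{-\infty}^{x} n(u; \mu, \sigma)\,du$

If we define the error function as

$\mathrm{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_{0}^{x} e^{-u^2/2}\,du$

then

$N(x; \mu, \sigma) = \frac{1}{2} + \mathrm{erf}\!\left(\frac{x-\mu}{\sigma}\right)$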
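A minimal sketch of the Gaussian density and its distribution function in Python follows. Note the assumption about conventions: Python's `math.erf` is the standard error function defined with exp(-t^2), which may differ from an error function defined with exp(-u^2/2), so the argument is scaled by sqrt(2) here. The function names are illustrative.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def gaussian_cdf(x, mu, sigma):
    """Normal distribution function, P(y <= x).

    math.erf(z) = (2/sqrt(pi)) * integral from 0 to z of exp(-t^2) dt,
    hence the sqrt(2) scaling of the argument.
    """
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Probability that a standard normal variable lies within one sigma of the mean.
within_one_sigma = gaussian_cdf(1.0, 0.0, 1.0) - gaussian_cdf(-1.0, 0.0, 1.0)
print(round(gaussian_pdf(0.0, 0.0, 1.0), 4), round(within_one_sigma, 4))
```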

Two Random Variables: If two random variables x and y are to be considered together, they can be described in terms of their joint probability density f(x, y) or, for discrete variables, p(x, y).

Two random variables are independent if

$p(x, y) = p(x)\,p(y)$

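Two Random Variables (…continued): Given a function g(x, y), its expected value is defined as:

Continuous: $\langle g(x, y) \rangle = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\,f(x, y)\,dx\,dy$

Discrete: $\langle g(x, y) \rangle = \sum_{x, y} g(x, y)\,p(x, y)$

And the joint moment for two discrete random variables is:

$m_{ij} = \sum_{x, y} x^i y^j\,p(x, y)$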

Two Random Variables (…continued): Moments are estimated in practice by averaging repeated measurements:

$m_{ij} \approx \frac{1}{N} \sum_{n=1}^{N} x_n^i\,y_n^j$

A measure of the dependence of two random variables is their correlation. The correlation of two variables is their joint second moment:

$m_{11} = \langle xy \rangle = \sum_{x, y} x\,y\,p(x, y)$

Two Random Variables (…continued): The joint second central moment of x and y is their covariance:

$\sigma_{xy} = \langle (x - \bar{x})(y - \bar{y}) \rangle = m_{11} - \bar{x}\,\bar{y}$

If x and y are independent, then their covariance is zero.

The correlation coefficient of x and y is their covariance normalized to their standard deviations:

$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x\,\sigma_y}$
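Correlation, covariance, and the correlation coefficient can be computed directly from a joint pmf. The sketch below assumes a small hypothetical joint distribution over binary x and y; the table values are made up for illustration.

```python
import math
from fractions import Fraction

# Hypothetical joint pmf p(x, y) for two binary variables.
p = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}

def expect(g):
    """Expected value of g(x, y) under the joint pmf."""
    return sum(g(x, y) * pxy for (x, y), pxy in p.items())

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
m11 = expect(lambda x, y: x * y)                            # correlation <xy>
cov = expect(lambda x, y: (x - mean_x) * (y - mean_y))      # covariance
var_x = expect(lambda x, y: (x - mean_x) ** 2)
var_y = expect(lambda x, y: (y - mean_y) ** 2)
rho = float(cov) / math.sqrt(float(var_x * var_y))          # correlation coefficient

print(m11, cov, rho)
```

The covariance computed from its definition agrees with m11 minus the product of the means, as the formula above states.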

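Two Random Variables (…Gaussian Random Variables):

Two random variables x and y are jointly Gaussian if their density function is:

$f(x, y) = \frac{1}{2\pi\,\sigma_x \sigma_y \sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \frac{(x-\bar{x})^2}{\sigma_x^2} - \frac{2\rho\,(x-\bar{x})(y-\bar{y})}{\sigma_x \sigma_y} + \frac{(y-\bar{y})^2}{\sigma_y^2} \right] \right\}$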

Two Random Variables (…Sum of Random Variables):

The expected value of the sum of two random variables is:

$\langle x + y \rangle = \langle x \rangle + \langle y \rangle$

This is true whether x and y are independent or not. And also we have:

Two Random Variables (…Sum of Random Variables): The variance of the sum of two independent random variables is:

$\sigma_{x+y}^2 = \sigma_x^2 + \sigma_y^2$

If two random variables are independent, the probability density of their sum is the convolution of the densities of the individual variables:

Continuous: $f_{x+y}(z) = \int_{-\infty}^{\infty} f_x(u)\,f_y(z-u)\,du$

Discrete: $p_{x+y}(z) = \sum_u p_x(u)\,p_y(z-u)$
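The discrete convolution can be illustrated with the sum of two independent fair dice, an assumed example. The helper names below are hypothetical.

```python
from fractions import Fraction

def convolve_pmf(px, py):
    """pmf of z = x + y for independent x and y: sum over u of px(u) * py(z - u)."""
    pz = {}
    for u, pu in px.items():
        for v, pv in py.items():
            pz[u + v] = pz.get(u + v, 0) + pu * pv
    return pz

def variance(pmf):
    """Second central moment of a pmf."""
    mean = sum(x * px for x, px in pmf.items())
    return sum((x - mean) ** 2 * px for x, px in pmf.items())

die = {k: Fraction(1, 6) for k in range(1, 7)}
two_dice = convolve_pmf(die, die)

print(two_dice[7])                        # most likely sum
print(variance(die), variance(two_dice))  # variances add for independent variables
```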

Central Limit Theorem (informal paraphrase):

If many independent random variables are summed, the probability density function (pdf) of the sum tends toward the Gaussian density, no matter what their individual densities are.
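The tendency toward the Gaussian can be observed numerically. This sketch assumes the summed variables are 30 independent fair dice (an arbitrary choice), builds the exact pmf of their sum by repeated convolution, and compares it with a Gaussian density of the same mean and variance.

```python
import math

def convolve_pmf(px, py):
    """pmf of the sum of two independent discrete variables."""
    pz = {}
    for u, pu in px.items():
        for v, pv in py.items():
            pz[u + v] = pz.get(u + v, 0.0) + pu * pv
    return pz

die = {k: 1.0 / 6.0 for k in range(1, 7)}
n_terms = 30

# Exact pmf of the sum of n_terms dice.
total = dict(die)
for _ in range(n_terms - 1):
    total = convolve_pmf(total, die)

mean = sum(z * pz for z, pz in total.items())
var = sum((z - mean) ** 2 * pz for z, pz in total.items())

def gaussian(z):
    """Gaussian density with the same mean and variance as the sum."""
    return math.exp(-((z - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

# Largest pointwise gap between the exact pmf and the Gaussian density.
max_diff = max(abs(pz - gaussian(z)) for z, pz in total.items())
print(mean, var, max_diff)
```

Although a single die has a flat pmf, the pmf of the sum is already close to the bell curve at every point.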

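Multivariate Normal Density: The normal density function can be generalized to any number of random variables. Let X be the random vector,

$X = \mathrm{Col}[X_1, X_2, \ldots, X_n]$

Then

$N(x) = (2\pi)^{-n/2}\,|R|^{-1/2} \exp\left\{ -\frac{1}{2}\,Q(x - \bar{x}) \right\}$

where

$Q(x - \bar{x}) = (x - \bar{x})^T R^{-1} (x - \bar{x})$

The matrix R is the covariance matrix of X (R is positive-definite):

$R = \langle (x - \bar{x})(x - \bar{x})^T \rangle$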
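For n = 2 the density can be evaluated by hand without a linear algebra library. The following is a minimal sketch under that assumption; the function name `mvn_pdf_2d` and the test values are hypothetical.

```python
import math

def mvn_pdf_2d(x, mean, R):
    """Two-dimensional normal density with covariance matrix R (2x2, positive-definite)."""
    (a, b), (c, d) = R
    det = a * d - b * c
    if det <= 0:
        raise ValueError("R must be positive-definite")
    # Inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mean[0], x[1] - mean[1])
    # Quadratic form Q = dx^T R^{-1} dx.
    q = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    n = 2
    return (2 * math.pi) ** (-n / 2) * det ** (-0.5) * math.exp(-0.5 * q)

identity = ((1.0, 0.0), (0.0, 1.0))
peak = mvn_pdf_2d((0.0, 0.0), (0.0, 0.0), identity)
print(peak)  # equals 1 / (2*pi) for the unit-covariance case
```

With a diagonal R the result factors into the product of two one-dimensional Gaussian densities, which is a convenient sanity check.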

Random Functions: A random function is one arising as the outcome of an experiment. Random functions do not need to be functions of time, but in all cases of interest to us they will be.

A discrete stochastic process is characterized by many probability density functions of the form

$p(x_1, x_2, x_3, \ldots, x_n, t_1, t_2, t_3, \ldots, t_n)$

Random Functions: If the individual values of the random signal are independent, then

$p(x_1, x_2, \ldots, x_n, t_1, t_2, \ldots, t_n) = p(x_1, t_1)\,p(x_2, t_2) \cdots p(x_n, t_n)$

If these individual probability densities are all the same, then we have a sequence of independent, identically distributed (i.i.d.) samples.

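Random Functions (mean & autocorrelation):

The mean is the expected value of x(t):

$\langle x(t) \rangle = \sum_x x\,p(x, t)$

The autocorrelation function is the expected value of the product $x(t_1)\,x(t_2)$:

$r_{xx}(t_1, t_2) = \langle x(t_1)\,x(t_2) \rangle = \sum_{x_1, x_2} x_1 x_2\,p(x_1, x_2, t_1, t_2)$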

Random Functions (ensemble & time average): Mean and autocorrelation can be determined in two ways:

- The experiment can be repeated many times and the average taken over all these functions. Such an average is called an ensemble average.
- Take any one of these functions as being representative of the ensemble and find the average from a number of samples of this one function. This is called a time average.

Random Functions (ergodicity & stationarity):

If the time average and ensemble average of a random function are the same, it is said to be ergodic.

A random function is said to be stationary if its statistics do not change as a function of time. This is also called strict-sense stationarity (vs. wide-sense stationarity). Any ergodic function is also stationary.

For a stationary signal we have:

$\langle x(t) \rangle = \bar{x}$

Stationarity is defined as:

$p(x_1, x_2, t_1, t_2) = p(x_1, x_2, \tau), \quad \tau = t_2 - t_1$

And the autocorrelation function is:

$r(\tau) = \sum_{x_1, x_2} x_1 x_2\,p(x_1, x_2, \tau)$

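When x(t) is ergodic, its mean and autocorrelation are:

$\bar{x} = \lim_{N \to \infty} \frac{1}{N} \sum_t x(t)$

$r(\tau) = \langle x(t)\,x(t+\tau) \rangle = \lim_{N \to \infty} \frac{1}{N} \sum_t x(t)\,x(t+\tau)$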

Random Functions (cross-correlation):

The cross-correlation of two ergodic random functions is:

$r_{xy}(\tau) = \langle x(t)\,y(t+\tau) \rangle = \lim_{N \to \infty} \frac{1}{N} \sum_t x(t)\,y(t+\tau)$

The subscript xy indicates a cross-correlation.
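In practice the limit is replaced by a time average over a finite record. The sketch below estimates the cross-correlation of a random sequence x and a delayed copy y; the Gaussian samples, the seed, the record length, and the delay of 3 samples are all assumptions made for illustration.

```python
import random

def cross_correlation(x, y, lag):
    """Time-average estimate of <x(t) y(t + lag)> for lag >= 0."""
    n = min(len(x), len(y) - lag)
    return sum(x[t] * y[t + lag] for t in range(n)) / n

random.seed(0)
N = 5000
delay = 3

# x: unit-variance random sequence; y: the same sequence delayed by `delay` samples.
x = [random.gauss(0.0, 1.0) for _ in range(N)]
y = [0.0] * delay + x[: N - delay]

r_xy = {lag: cross_correlation(x, y, lag) for lag in range(8)}
best_lag = max(r_xy, key=r_xy.get)
print(best_lag)
```

The estimate peaks at the lag equal to the delay, where it is close to the variance of x, and stays near zero at the other lags.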

Random Functions (power & cross spectral density): The Fourier transform of $r(\tau)$ (the autocorrelation function of an ergodic random function) is called the power spectral density of x(t):

$S(\omega) = \sum_{\tau=-\infty}^{\infty} r(\tau)\,e^{-j\omega\tau}$

The cross-spectral density of two ergodic random functions is:

$S_{xy}(\omega) = \sum_{\tau=-\infty}^{\infty} r_{xy}(\tau)\,e^{-j\omega\tau}$

Random Functions (…power density): For an ergodic signal x(t), $r(\tau)$ can be written as:

$r(\tau) = x(\tau) * x(-\tau)$

Then from elementary Fourier transform properties,

$S(\omega) = X(\omega)\,X(-\omega) = |X(\omega)|^2$

assuming x(t) is real.

Random Functions (White Noise): If all values of a random signal are uncorrelated,

$r(\tau) = \sigma^2\,\delta(\tau)$

then this random function is called white noise. The power spectrum of white noise is constant,

$S(\omega) = \sigma^2$

White noise is a mixture of all frequencies.

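Random Signal in Linear Systems: Let T[ ] represent the linear operation; then

$\langle T[x(t)] \rangle = T[\langle x(t) \rangle]$

Given a system with impulse response h(n),

$\langle y(n) \rangle = \langle x(n) * h(n) \rangle = \langle x(n) \rangle * h(n)$

A stationary signal applied to a linear system yields a stationary output,

$r_{yy}(\tau) = r_{xx}(\tau) * h(\tau) * h(-\tau)$

$S_{yy}(\omega) = S_{xx}(\omega)\,|H(\omega)|^2$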
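The two output relations, convolution of the input autocorrelation with h(τ) and h(-τ) and multiplication of the input spectrum by |H(ω)|^2, can be checked against each other numerically. The sketch assumes a white-noise input with variance 2 and a hypothetical three-tap FIR impulse response; sequences are stored as dictionaries from integer index to value.

```python
import cmath
import math

def dtft(seq, omega):
    """Fourier transform of a finite sequence given as {index: value}."""
    return sum(v * cmath.exp(-1j * omega * n) for n, v in seq.items())

def convolve(a, b):
    """Discrete convolution of two finite sequences given as {index: value}."""
    out = {}
    for i, av in a.items():
        for j, bv in b.items():
            out[i + j] = out.get(i + j, 0.0) + av * bv
    return out

sigma2 = 2.0
r_xx = {0: sigma2}                       # white noise: r(tau) = sigma^2 * delta(tau)
h = {0: 1.0, 1: 0.5, 2: 0.25}            # hypothetical FIR impulse response
h_rev = {-n: v for n, v in h.items()}    # h(-tau)

# Output autocorrelation: r_yy = r_xx * h * h(-tau).
r_yy = convolve(convolve(r_xx, h), h_rev)

# Compare the transform of r_yy with S_xx(w) |H(w)|^2 on a grid of frequencies.
omegas = [k * math.pi / 8 for k in range(9)]
max_err = max(abs(dtft(r_yy, w) - sigma2 * abs(dtft(h, w)) ** 2) for w in omegas)

print(r_yy[0], max_err)
```

The zero-lag value r_yy[0] is the output power, which for a white input equals the input variance times the sum of the squared filter taps.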
