Transcript of CSE 473/573 Computer Vision and Image Processing (CVIP), Ifeoma Nwogu, Lecture 27 – Overview of probability concepts

Page 1:

CSE 473/573 Computer Vision and Image Processing (CVIP)

Ifeoma Nwogu

Lecture 27 – Overview of probability concepts


Page 2:

Schedule

• Last class – Object recognition and bag-of-words

• Today – Overview of probability models

• Readings for today: lecture notes


Page 3:

Random variables

• A random variable x denotes a quantity that is uncertain

• May be the result of an experiment (flipping a coin) or a real-world measurement (measuring temperature)

• If we observe several instances of x, we get different values

• Some values occur more often than others, and this information is captured by a probability distribution (see the sketch below)
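
As a minimal sketch (not from the slides; the 0.7 coin bias is an assumption for illustration), here is how repeated observations of a discrete x can be tabulated into an empirical distribution:

```python
# A minimal sketch: simulating repeated observations of a discrete
# random variable x and summarizing them as an empirical distribution.
# The 0.7 bias is an invented value for illustration.
import random
from collections import Counter

random.seed(0)

# x is the outcome of a biased coin flip: "heads" with probability 0.7
samples = ["heads" if random.random() < 0.7 else "tails"
           for _ in range(10_000)]

counts = Counter(samples)
empirical = {outcome: n / len(samples) for outcome, n in counts.items()}
print(empirical)  # approximately {"heads": 0.7, "tails": 0.3}
```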


Page 4:

Discrete Random Variables


Page 5:

Continuous Random Variable
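
As a hedged illustration (the Gaussian model of a temperature measurement and its parameters are assumptions, not from the slides), a continuous random variable is described by a probability density rather than a table of probabilities; probabilities come from integrating that density over a range:

```python
# A minimal sketch: a continuous random variable is described by a
# probability density. The Gaussian parameters here are invented.
import math

def gaussian_pdf(x, mu=20.0, sigma=2.0):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

print(gaussian_pdf(20.0))  # largest density, at the mean
print(gaussian_pdf(26.0))  # much smaller density three sigmas away
```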


Page 6:

Joint Probability

• Consider two random variables x and y

• If we observe multiple paired instances, then some combinations of outcomes are more likely than others

• This is captured in the joint probability distribution

• Written as Pr(x,y)

• Can read Pr(x,y) as "probability of x and y" – here x and y are scalars (see the sketch below)
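
A minimal sketch of estimating a joint distribution from paired observations (the values "rain"/"sun" and "wet"/"dry" are invented for illustration):

```python
# A minimal sketch: estimating the joint distribution Pr(x, y) from
# paired observations of two discrete variables. All values invented.
from collections import Counter

pairs = [("rain", "wet"), ("rain", "wet"), ("sun", "dry"),
         ("sun", "dry"), ("sun", "wet"), ("rain", "dry")]

counts = Counter(pairs)
joint = {xy: n / len(pairs) for xy, n in counts.items()}
print(joint[("rain", "wet")])  # Pr(x="rain", y="wet") = 2/6
```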


Page 7:

Joint Probability (2)

• We can, of course, have more than two random variables, and we would write Pr(x,y,z)

• We may also write Pr(x) for the joint probability of all of the elements of the multidimensional variable x = [x1, x2, …, xn]; if y = [y1, y2, …, yn], then Pr(x,y) represents the joint distribution of all of the elements of x and y


Page 8:

Joint Probability (3)


[Figures: a 2D Hinton diagram, and joint distributions of one discrete and one continuous variable]

Page 9:

Marginalization

We can recover the probability distribution of any variable in a joint distribution by integrating (or summing) over the other variables:

Pr(x) = Σ_y Pr(x,y)   (discrete case)
Pr(x) = ∫ Pr(x,y) dy   (continuous case)
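
A minimal sketch of the discrete case, reusing the invented joint table from the sketch above:

```python
# A minimal sketch: marginalizing y out of a small joint table
# (the toy values match the joint-probability sketch earlier).
joint = {("rain", "wet"): 2/6, ("rain", "dry"): 1/6,
         ("sun", "wet"): 1/6, ("sun", "dry"): 2/6}

marginal_x = {}
for (x, y), p in joint.items():
    # Pr(x) = sum over y of Pr(x, y)
    marginal_x[x] = marginal_x.get(x, 0.0) + p

print(marginal_x)  # {"rain": 0.5, "sun": 0.5}
```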


Page 10:

Marginalization – Scale and Normalization

Page 11:

Marginalization (continued)

Page 12:

Marginalization (continued)

Works in higher dimensions as well – marginalizing leaves the joint distribution over whatever variables are left

Page 13:

Conditional Probability

• The conditional probability of x given that y = y1 is the relative propensity of variable x to take different outcomes given that y is fixed to the value y1

• Written as Pr(x | y=y1)


Page 14:

Conditional Probability

• Conditional probability can be extracted from joint probability

• Extract the appropriate slice and normalize (see the sketch below)
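
A minimal sketch of slice-and-normalize on the same invented joint table:

```python
# A minimal sketch: conditioning on y = "wet" by extracting the
# matching slice of the joint table and renormalizing it to sum to one.
joint = {("rain", "wet"): 2/6, ("rain", "dry"): 1/6,
         ("sun", "wet"): 1/6, ("sun", "dry"): 2/6}

slice_wet = {x: p for (x, y), p in joint.items() if y == "wet"}
total = sum(slice_wet.values())  # this total is Pr(y = "wet")
conditional = {x: p / total for x, p in slice_wet.items()}
print(conditional)  # Pr(x | y="wet"): {"rain": 2/3, "sun": 1/3}
```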


Page 15:

Conditional Probability

• More usually written in compact form:

  Pr(x | y) = Pr(x, y) / Pr(y)

• Can be re-arranged to give:

  Pr(x, y) = Pr(x | y) Pr(y)


Page 16:

Conditional Probability

• This idea can be extended to more than two variables, e.g. Pr(w, x, y, z) = Pr(w | x, y, z) Pr(x | y, z) Pr(y | z) Pr(z)


Page 17:

Bayes’ Rule

From before:

Pr(x, y) = Pr(x | y) Pr(y)   and   Pr(x, y) = Pr(y | x) Pr(x)

Combining:

Pr(y | x) Pr(x) = Pr(x | y) Pr(y)

Re-arranging:

Pr(y | x) = Pr(x | y) Pr(y) / Pr(x)


Page 18:

Bayes’ Rule Terminology

Pr(y | x) = Pr(x | y) Pr(y) / Pr(x)

Posterior – what we know about y after seeing x

Prior – what we know about y before seeing x

Likelihood – propensity for observing a certain value of x given a certain value of y

Evidence – a constant that ensures the left-hand side is a valid distribution
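
A minimal sketch of Bayes’ rule in action; all numbers are invented and match the toy tables above:

```python
# A minimal sketch of Bayes' rule on a toy problem; all numbers are
# invented. y is the unknown state, x = "wet" is the observation.
prior = {"rain": 0.5, "sun": 0.5}                         # Pr(y)
likelihood = {("wet", "rain"): 2/3, ("wet", "sun"): 1/3}  # Pr(x="wet"|y)

# evidence: Pr(x="wet") = sum over y of Pr(x="wet" | y) * Pr(y)
evidence = sum(likelihood[("wet", y)] * prior[y] for y in prior)

# posterior: Pr(y | x="wet") = Pr(x="wet" | y) * Pr(y) / Pr(x="wet")
posterior = {y: likelihood[("wet", y)] * prior[y] / evidence
             for y in prior}
print(posterior)  # {"rain": 0.666..., "sun": 0.333...}
```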


Page 19:

Independence

• If two variables x and y are independent, then variable x tells us nothing about variable y (and vice versa)


Page 20:

Independence (continued)

Page 21:

Independence (continued)

• When variables are independent, the joint factorizes into a product of the marginals:

  Pr(x, y) = Pr(x) Pr(y)
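
A minimal sketch that checks the factorization numerically on an invented table constructed to be independent:

```python
# A minimal sketch: testing whether a joint table factorizes into the
# product of its marginals. This particular (invented) table does.
joint = {("a", "c"): 0.12, ("a", "d"): 0.28,
         ("b", "c"): 0.18, ("b", "d"): 0.42}

px, py = {}, {}
for (x, y), p in joint.items():
    px[x] = px.get(x, 0.0) + p  # marginal Pr(x)
    py[y] = py.get(y, 0.0) + p  # marginal Pr(y)

independent = all(abs(p - px[x] * py[y]) < 1e-9
                  for (x, y), p in joint.items())
print(independent)  # True: Pr(x, y) = Pr(x) Pr(y) everywhere
```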


Page 22:

Probabilistic graphical models


Page 23:

What is a graphical model?

A graphical model is a way of representing probabilistic relationships between random variables.

Variables are represented by nodes:

Conditional (in)dependencies are represented by (missing) edges:

Undirected edges simply give correlations between variables (Markov Random Field or Undirected Graphical model):

Directed edges give causality relationships (Bayesian Network or Directed Graphical Model):
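
As a hedged sketch of the directed case (the rain/sprinkler example and all numbers are invented, not from the lecture), a Bayesian network can be stored as parent lists plus conditional probability tables (CPTs), with the joint probability given by the product of per-node conditionals:

```python
# A minimal sketch of a directed graphical model (Bayesian network):
# parent lists plus conditional probability tables. All names and
# numbers are invented for illustration.
from typing import Dict

parents = {"rain": [], "sprinkler": [], "wet": ["rain", "sprinkler"]}

# cpt[var][parent_values][value] = Pr(var = value | parents)
cpt = {
    "rain":      {(): {True: 0.2, False: 0.8}},
    "sprinkler": {(): {True: 0.3, False: 0.7}},
    "wet": {
        (True, True):   {True: 0.99, False: 0.01},
        (True, False):  {True: 0.90, False: 0.10},
        (False, True):  {True: 0.80, False: 0.20},
        (False, False): {True: 0.00, False: 1.00},
    },
}

def joint_prob(assign: Dict[str, bool]) -> float:
    """Pr(assignment) = product over nodes of Pr(node | its parents)."""
    p = 1.0
    for var, pars in parents.items():
        key = tuple(assign[q] for q in pars)
        p *= cpt[var][key][assign[var]]
    return p

print(joint_prob({"rain": True, "sprinkler": False, "wet": True}))
# 0.2 * 0.7 * 0.9 = 0.126
```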

Page 24:

Graphical Models

• e.g. Bayesian networks (Bayes nets, belief nets), Markov networks, HMMs, dynamic Bayes nets, etc.

• Themes: representation, reasoning, learning

• Original PGM reference book by Daphne Koller and Nir Friedman.

Page 25:

Motivation for PGMs

• Let X1,…,Xp be discrete random variables

• Let P be a joint distribution over X1,…,Xp

• If the variables are binary, then we need O(2^p) parameters to describe P

• Can we do better?

• Key idea: use properties of independence

Page 26:

Motivation for PGMs (2)

• Two variables X and Y are independent if
  – P(X = x | Y = y) = P(X = x) for all values x, y
  – That is, learning the value of Y does not change the prediction of X

• If X and Y are independent, then
  – P(X,Y) = P(X|Y)P(Y) = P(X)P(Y)

• In general, if X1,…,Xp are independent, then
  – P(X1,…,Xp) = P(X1)...P(Xp)
  – Requires only O(p) parameters (see the sketch below)
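
A minimal sketch of the parameter counting behind this motivation: a fully general joint over p binary variables has 2^p − 1 free parameters, while p independent binary variables need only p.

```python
# A minimal sketch: free-parameter counts for p binary variables,
# fully general joint (2**p - 1) versus fully independent (p).
for p in (5, 10, 20):
    full = 2 ** p - 1   # one probability per joint outcome, minus one
    indep = p           # one Bernoulli parameter per variable
    print(f"p={p}: full joint {full} params, independent {indep}")
```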

Page 27:

Motivation for PGMs (3)

• Unfortunately, most random variables of interest are not independent of each other

• A more suitable notion is that of conditional independence

• Two variables X and Y are conditionally independent given Z if
  – P(X = x | Y = y, Z = z) = P(X = x | Z = z) for all values x, y, z
  – That is, learning the value of Y does not change the prediction of X once we know the value of Z (see the sketch below)
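
A minimal sketch that verifies this definition on a tiny invented joint over three binary variables, constructed so that X and Y are conditionally independent given Z:

```python
# A minimal sketch: verifying P(x | y, z) = P(x | z) on a small joint
# built (with invented numbers) as Pr(z) Pr(x | z) Pr(y | z).
from itertools import product

pz = {0: 0.4, 1: 0.6}                                        # Pr(z)
px_z = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.2, (1, 1): 0.8}  # Pr(x | z)
py_z = {(0, 0): 0.5, (1, 0): 0.5, (0, 1): 0.3, (1, 1): 0.7}  # Pr(y | z)

joint = {(x, y, z): pz[z] * px_z[(x, z)] * py_z[(y, z)]
         for x, y, z in product((0, 1), repeat=3)}

def cond_x(y=None, z=None):
    """Pr(x | the given evidence), by slicing the joint and normalizing."""
    sel = {k: v for k, v in joint.items()
           if (y is None or k[1] == y) and (z is None or k[2] == z)}
    total = sum(sel.values())
    return {x: sum(v for k, v in sel.items() if k[0] == x) / total
            for x in (0, 1)}

print(cond_x(y=1, z=0))  # {0: 0.9, 1: 0.1}
print(cond_x(z=0))       # identical: knowing y adds nothing given z
```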

Page 28:

In summary…

Probability theory provides the glue whereby the parts are combined, ensuring that the system as a whole is consistent, and providing ways to interface models to data.

The graph-theoretic side of graphical models provides both an intuitively appealing interface by which humans can model highly interacting sets of variables and a data structure that lends itself naturally to the design of efficient general-purpose algorithms.

Many of the classical multivariate probabilistic systems studied in fields such as statistics, systems engineering, information theory, pattern recognition, and statistical mechanics are special cases of the general graphical model formalism; examples include mixture models, factor analysis, hidden Markov models, and Kalman filters.

Page 29:

Slide Credits

• Simon J.D. Prince – Computer vision: models, learning and inference (©2011)


Page 30:

Next class

• More on graphical models

• Readings for next lecture: lecture notes to be posted


Page 31:

Questions
