midterm - Computer Sciencezickler/download/midterm.pdf · Midterm Exam CS283, Computer Vision...

Midterm Exam

CS283, Computer Vision

Harvard University

Nov. 20, 2009

You have two hours to complete this exam. Show all of your work to get full credit, and write your work in the blue books provided. Work written on this document will not be evaluated.

(Possibly) Useful Information

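Given vectors $u = [u_1, u_2, u_3]^\top$ and $v = [v_1, v_2, v_3]^\top$, we can write the following.

$$u^\top v = u_1 v_1 + u_2 v_2 + u_3 v_3; \qquad u \times v = -v \times u = \begin{bmatrix} u_2 v_3 - u_3 v_2 \\ u_3 v_1 - u_1 v_3 \\ u_1 v_2 - u_2 v_1 \end{bmatrix}$$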
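As a quick sanity check of the dot-product and cross-product identities above, the following sketch evaluates them component by component and compares against MATLAB's built-in `cross`. The example vectors are arbitrary and chosen only for illustration.

```matlab
% Numerical check of the dot- and cross-product identities.
% The vectors u and v are arbitrary example values.
u = [1; 2; 3];
v = [4; 5; 6];

d = u' * v;                      % u1*v1 + u2*v2 + u3*v3
c = [u(2)*v(3) - u(3)*v(2);      % component-wise cross product u x v
     u(3)*v(1) - u(1)*v(3);
     u(1)*v(2) - u(2)*v(1)];

errBuiltin = norm(c - cross(u, v));   % difference from the built-in cross()
errAnti    = norm(c + cross(v, u));   % checks u x v = -(v x u)

disp(d);     % 32
disp(c');    % -3  6  -3
```

Both `errBuiltin` and `errAnti` are zero for these integer inputs.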

A one-dimensional signal $x[n]$, $n = 0, \ldots, N-1$, and its discrete Fourier transform $X[u]$ are related by

$$X[u] = \sum_{n=0}^{N-1} x[n] \exp(-j 2\pi n u / N), \qquad x[n] = \frac{1}{N} \sum_{u=0}^{N-1} X[u] \exp(j 2\pi n u / N).$$
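The transform pair above can be checked numerically. The sketch below builds the DFT matrix directly from the definition and compares the result with MATLAB's `fft`, which uses the same sign and normalization convention. The test signal is an arbitrary example.

```matlab
% Direct evaluation of the DFT pair, compared against fft.
x = [1 2 0 -1 3 0.5 -2 4];           % arbitrary example signal
N = numel(x);
n = 0:N-1;                           % sample indices (row vector)
u = (0:N-1)';                        % frequency indices (column vector)

W  = exp(-1j * 2 * pi * u * n / N);  % W(u+1, n+1) = exp(-j*2*pi*n*u/N)
X  = W * x(:);                       % forward transform from the definition
xr = (W' * X) / N;                   % inverse; W' is the conjugate transpose

errForward = max(abs(X - fft(x(:))));   % agreement with the built-in fft
errInverse = max(abs(xr - x(:)));       % round-trip reconstruction error
```

Both errors should be at the level of floating-point round-off.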

Also, as you proved in Assignment Four, if $y[n] = (-1)^n x[n]$, their Fourier transforms are related by $Y[u] = X[u - N/2]$ (with the index taken modulo $N$), where $N$ is the length of signal $x[n]$.

Below are expressions for the general multivariate Gaussian distribution for a random variable $x \in \mathbb{R}^d$ having mean $\mu$ and covariance matrix $\Sigma$, as well as the special case in which the covariance matrix is a scaled identity matrix, $\Sigma = \sigma^2 I$.

$$p(x) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left[-\frac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\right], \qquad p(x) = \frac{1}{(2\pi)^{d/2} \sigma^d} \exp\left[-\frac{\|x-\mu\|^2}{2\sigma^2}\right]$$
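The two Gaussian expressions above should agree whenever $\Sigma = \sigma^2 I$. The following sketch evaluates both at a single point; the dimension, mean, standard deviation, and query point are arbitrary example values.

```matlab
% Evaluate the general and the isotropic Gaussian density at one point.
% All numeric values below are arbitrary examples.
d     = 3;
mu    = [1; -2; 0.5];
sigma = 0.7;
Sigma = sigma^2 * eye(d);            % scaled-identity covariance
x     = [0.5; -1; 1];

r = x - mu;                          % deviation from the mean

% General form: uses the inverse covariance and its determinant.
pGeneral = exp(-0.5 * (r' * (Sigma \ r))) / ((2*pi)^(d/2) * sqrt(det(Sigma)));

% Special case: only the squared Euclidean distance and sigma are needed.
pIsotropic = exp(-(r' * r) / (2 * sigma^2)) / ((2*pi)^(d/2) * sigma^d);

gap = abs(pGeneral - pIsotropic);    % should be at round-off level
```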

>> help ones
 ONES   Ones array.
    ONES(N) is an N-by-N matrix of ones.
    ONES(M,N) or ONES([M,N]) is an M-by-N matrix of ones.

>> help zeros
 ZEROS  Zeros array.
    ZEROS(N) is an N-by-N matrix of zeros.
    ZEROS(M,N) or ZEROS([M,N]) is an M-by-N matrix of zeros.

>> help repmat
 REPMAT Replicate and tile an array.
    B = repmat(A,M,N) creates a large matrix B consisting of an M-by-N
    tiling of copies of A. The size of B is [size(A,1)*M, size(A,2)*N].
    The statement repmat(A,N) creates an N-by-N tiling.
    B = REPMAT(A,[M N]) accomplishes the same result as repmat(A,M,N).
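For reference, here is a short usage sketch of the three functions documented above. The arrays are small made-up examples; the `repmat` call tiles a row vector so that it can be subtracted from every row of a data array.

```matlab
% Small usage examples for ones, zeros and repmat (values are made up).
X  = [1 2 3; 4 5 6];                    % 2-by-3 data, one point per row
m  = [1 1 1];                           % 1-by-3 row vector
Xc = X - repmat(m, [size(X,1), 1]);     % tile m to 2-by-3, then subtract

T = repmat([1 2; 3 4], 2, 3);           % 2-by-3 tiling of a 2-by-2 block
U = ones(2, 3) / 3;                     % 2-by-3 array, every entry 1/3
Z = zeros(3);                           % 3-by-3 array of zeros

disp(Xc);        % [0 1 2; 3 4 5]
disp(size(T));   % 4 6
```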


Question 1 (12 points)

The equation for a conic in the plane using inhomogeneous coordinates (x, y) is

$$a x^2 + b x y + c y^2 + d x + e y + f = 0. \tag{1}$$

a. Suppose you are given a set of inhomogeneous points $\mathbf{x}_i = (x_i, y_i)$, $i = 1, \ldots, N$. Derive an expression for the least squares estimate of the conic $\mathbf{c} = (a, b, c, d, e, f)$ passing through those points. (Your expression may take the form of a null vector or eigenvector of a matrix.)

b. What is the minimum value of $N$ that allows a unique solution for $\mathbf{c}$?

c. “Homogenize” Eq. 1 by making the substitutions $x \to x_1/x_3$, $y \to x_2/x_3$, and show that in terms of homogeneous coordinates $\mathbf{x} = (x_1, x_2, x_3)$ the conic can be expressed in matrix form,

$$\mathbf{x}^\top C \mathbf{x} = 0,$$

with a symmetric matrix $C$.

d. Suppose we apply a projective transformation to our points: $\mathbf{x}'_i = H \mathbf{x}_i$. The transformed points $\mathbf{x}'_i$ will lie on a transformed conic represented by a new symmetric matrix $C'$. What is the relation between $C'$ and $C$?


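Question 2

Consider a camera with intrinsic parameter matrix

$$K = \begin{bmatrix} 300 & 0 & 300 \\ 0 & 300 & 200 \\ 0 & 0 & 1 \end{bmatrix}$$

and complete camera matrix

$$P = \begin{bmatrix} 300 & 0 & 300 & 300 \\ 0 & 300 & 200 & -400 \\ 0 & 0 & 1 & -2 \end{bmatrix}.$$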

Suppose we add a new camera $P'$ with the same orientation as that of camera $P$. The camera centre of this second camera is located at $[3\ 0\ 2]$ (an inhomogeneous point in $\mathbb{R}^3$), and it has a focal length that is one-third that of $P$.

a. What is the camera center for the first camera (P) in inhomogeneous coordinates?

b. Compute the camera matrix P′.

c. Compute the epipole in each camera, expressed in inhomogeneous coordinates.

d. Are the epipolar lines in the first camera parallel to one another? Justify your answer.



Question 3

Consider a surface patch with BRDF

$$f_r(s, v, n) = \frac{1}{\sqrt{n^\top s}\,\sqrt{n^\top v}},$$

where $n$, $v$, and $s$ are the surface normal, view direction and light source direction, respectively. Here, the BRDF is expressed in ‘global coordinates’, instead of writing the input and output directions in a local coordinate system relative to the surface normal. (The two representations are equivalent. That is, at a small surface patch with normal vector $n$, given surface irradiance due to radiance from direction $s$, this tells us the value of the radiance that is emitted in direction $v$.)

Suppose we view such a surface patch from a known direction $v$, and suppose we capture two radiance measurements $E_1$ and $E_2$ under unit-strength distant lighting from known directions $s_1$ and $s_2$.

a. Write expressions for the measurements $E_1$ and $E_2$ in terms of the view, normal and source directions.

b. Show that you can recover the surface normal from these two measurements. (Hint: derive an expression for $n$ up to scale, and then argue that the sign ambiguity can be resolved by requiring visibility from direction $v$.)


Question 4

a. Let $x_z[n]$, $n \in \{0, \ldots, 2N-1\}$, be a one-dimensional image of length $2N$ with zeros at every alternate pixel. That is, $x_z[n] = 0$ for every odd $n$. Now suppose we down-sample $x_z[n]$ by a factor of two to obtain $x_{dz}[n] = x_z[2n]$, which is of length $N$. Give an expression for $X_{dz}[u]$, $u \in \{0, \ldots, N-1\}$, in terms of $X_z[u]$, where $X_{dz}$ and $X_z$ are the one-dimensional discrete Fourier transforms of $x_{dz}[n]$ and $x_z[n]$ respectively.

b. Next, consider a general one-dimensional image $x[n]$ of length $2N$ (where all elements can now be non-zero). Suppose we downsample $x[n]$ to get $x_d[n] = x[2n]$, essentially throwing away the odd pixels in $x[n]$. For this case, what is the expression for $X_d[u]$ in terms of $X[u]$?


Question 5

According to the principle of trichromacy, given three primaries (i.e., light sources with fixed spectral distributions) $P_1(\lambda)$, $P_2(\lambda)$, $P_3(\lambda)$, a typical person can adjust the weights (the brightness) of these light sources so that the resulting mixture looks the same as any given test light $T(\lambda)$. We write this using algebraic notation as:

$$T(\lambda) \equiv w_1 P_1(\lambda) + w_2 P_2(\lambda) + w_3 P_3(\lambda),$$

where $\equiv$ means “looks the same as”.

An important caveat is that subtractive matching must be allowed, meaning that the person needs to have the ability to add some of the primaries to the test light. This can be viewed as adjusting a primary to a “negative brightness,” if we are willing to apply the algebraic manipulation

$$T(\lambda) + w_1 P_1(\lambda) \equiv w_2 P_2(\lambda) + w_3 P_3(\lambda) \implies T(\lambda) \equiv -w_1 P_1(\lambda) + w_2 P_2(\lambda) + w_3 P_3(\lambda).$$

Explain why the need for subtractive matching implies that if the primaries $P_i(\lambda)$ are all positive functions of $\lambda$ (which they are if we are using real lights), the corresponding color matching functions must be negative at some wavelengths.



Question 6

Suppose we are constructing a binary (two-category) classifier to discriminate between classes $\omega_1$ and $\omega_2$ based on measurements $x \in \mathbb{R}^d$. We are interested in zero-one loss, so our decision rule is

Rule 1: Decide $\omega_1$ if $p(\omega_1|x) > p(\omega_2|x)$.

Another way of characterizing a classifier is through discriminant functions, and when there are two categories, there are two equivalent ways to do this. We can define two discriminant functions $g_1(x)$ and $g_2(x)$ and use the rule

Decide $\omega_1$ if $g_1(x) > g_2(x)$

or we can define a single discriminant function $g(x) \triangleq g_1(x) - g_2(x)$ and use the rule

Decide $\omega_1$ if $g(x) > 0$.
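The equivalence of the two discriminant-function rules can be illustrated with a small sketch. The functions `g1` and `g2` below are hypothetical placeholders chosen only for illustration; they are not derived from any particular densities or priors.

```matlab
% Two-function rule versus single-function rule on a grid of test points.
% g1 and g2 are hypothetical discriminant functions (not from the exam).
g1 = @(x) 2*x + 1;                   % placeholder discriminant for omega_1
g2 = @(x) x.^2;                      % placeholder discriminant for omega_2
g  = @(x) g1(x) - g2(x);             % single discriminant g = g1 - g2

x = linspace(-3, 3, 13);             % scalar measurements to classify

decideTwoFunctions = g1(x) > g2(x);  % true where omega_1 is chosen
decideSingle       = g(x) > 0;       % same decision from the sign of g

agree = isequal(decideTwoFunctions, decideSingle);
```

Here `agree` is true: both rules assign every test point to the same class.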

a. Assuming that the class conditional densities $p(x|\omega_i)$ and prior distributions $p(\omega_i)$ are known, show that the discriminant function for Rule 1 can be written

$$g(x) = \log\frac{p(x|\omega_1)}{p(x|\omega_2)} + \log\frac{p(\omega_1)}{p(\omega_2)}.$$

b. Suppose the two classes are equally probable, so $p(\omega_1) = p(\omega_2)$, and suppose the class conditional densities are Gaussian distributions with means $\mu_1$ and $\mu_2$ and covariance matrices that are diagonal and equal: $\Sigma_1 = \Sigma_2 = \sigma^2 I$. Show that the discriminant function for Rule 1 can now be written

$$g(x) = w^\top x + b,$$

where $w \in \mathbb{R}^d$ and $b \in \mathbb{R}$ depend on the means $\mu_i$ and the variance parameter $\sigma$. (Hint: expand the quadratic forms $\|x - \mu_i\|^2 = x^\top x - 2\mu_i^\top x + \mu_i^\top \mu_i$ and think about which terms can be ignored.)

c. The decision rule Rule 1 induces a decision surface in the measurement space, with measurements being assigned to one class or the other depending on which side of the surface they lie. Based on part (b), provide a geometric interpretation of the decision surface.


Question 7

The listing below contains Matlab code that clusters three-dimensional points using the Expectation-Maximization algorithm and a mixture-of-Gaussian model for the data.

a. Which two lines of this code must be modified so that the function performs the k-means algorithm instead?

b. Substitute new code for these lines to implement this change. (While this can be done by inserting only two new lines of code, you are free to insert multiple lines of code in place of each of the two removed lines.)


 1  function Zo=EM(X,Mo,Co)
 2  %EM Expectation-Maximization for Gaussian mixtures in 3D.
 3  % Input:  X = (numpts) x 3 array of points
 4  %         Mo= (numclusters) x 3 array of initial cluster means
 5  %         Co= 3 x 3 x (numclusters) array of initial cluster
 6  %             covariance matrices
 7  % Output: Zo= (numpts) x 1 vector with cluster number
 8  %             (i.e., one of 1,2,...(numclusters)) for each point
 9
10  numpts=size(X,1);
11  numclusters=size(Mo,1);
12
13  % support maps for each point
14  Z=zeros(numclusters,numpts);
15
16  % mixture weights are initially assumed uniform
17  weights=ones(numclusters,1)/numclusters;
18
19  % Allocate space for mean and covariance at each iteration.
20  % Initialize to Mo and Co.
21  M=Mo; C=Co;
22
23  % repeat for ten iterations
24  for n=1:10
25
26      % E-step
27      for c=1:numclusters
28          Z(c,:)=weights(c)*gaussian(X,M(c,:),C(:,:,c))';
29      end
30      Z=Z./repmat(sum(Z),[numclusters,1]);
31
32      % M-step
33      weights = mean(Z,2);
34      for c=1:numclusters
35          Xm=X-repmat(M(c,:),[numpts,1]);
36          C(:,:,c)=(repmat(Z(c,:),[3,1]).*Xm')*Xm/sum(Z(c,:));
37          M(c,:)=sum(repmat(Z(c,:),[3,1]).*X',2)/sum(Z(c,:));
38      end
39  end
40
41  % Final label for each point is cluster with maximum support
42  [y,Zo]=max(Z);
43  Zo=Zo';
44
45  %%% SUB-ROUTINES
46
47  function G=gaussian(X,M,C)
48  % Evaluate multi-variate Gaussian with mean M and covariance C
49  % at points X.
50
51  ndims=length(M);
52  numpts=size(X,1);
53  X=X-repmat(M(:)',[numpts,1]);
54  G=exp(-(sum(X'.*(inv(C)*X')))'/2)/sqrt(((2*pi)^ndims)*det(C));



Question 8

Consider the six textures below, which are numbered (i)–(vi). Below the textures are six sets of graphs, labeled (a)–(f). Each set of graphs corresponds to one of the six textures, and each set contains a normalized histogram ($p_z(z)$) for the gray levels in the texture as well as two functions computed from the Fourier spectrum:

$$S(\theta) = \int_0^{r_{\max}} F(r, \theta)\, dr, \qquad S(r) = \int_0^{\pi} F(r, \theta)\, d\theta.$$

[Figure: polar coordinates $(r, \theta)$ in the Fourier plane.]

Here $F(r, \theta)$ is a centered Fourier spectrum written in polar coordinates, as shown above. Match each texture to its corresponding set of graphs by writing a label ((a)–(f)) for each texture ((i)–(vi)).
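To make the definitions of $S(r)$ and $S(\theta)$ concrete, the sketch below approximates both by binning a centered magnitude spectrum by integer radius and by angle in whole degrees. This is only one possible discretization, not necessarily how the exam's graphs were produced. The test image, the bin widths, and the choice to fold angles into [0, 180) are all assumptions made for illustration.

```matlab
% Approximate S(r) and S(theta) by binning a centered magnitude spectrum.
% Synthetic test image (an assumption): vertical stripes, 8 cycles across.
N   = 64;
img = repmat(sin(2*pi*8*(0:N-1)/N), [N, 1]);

F = abs(fftshift(fft2(img)));            % centered magnitude spectrum
c = floor(N/2) + 1;                      % index of the zero-frequency sample
[xx, yy] = meshgrid((1:N) - c, (1:N) - c);

r     = round(sqrt(xx.^2 + yy.^2));      % integer radius of each sample
theta = mod(round(atan2(yy, xx) * 180/pi), 180);  % degrees, folded into [0,180)

rmax = N/2 - 1;                          % largest radius available on all sides
keep = (r >= 1) & (r <= rmax);           % drop the DC term and the corners

Sr     = accumarray(r(keep), F(keep), [rmax, 1]);          % S(r), r = 1..rmax
Stheta = accumarray(theta(keep) + 1, F(keep), [180, 1]);   % S(theta), 0..179 deg

[~, rPeak] = max(Sr);                    % radius bin with the most energy
[~, tPeak] = max(Stheta);                % angle bin (1-based) with the most energy
fprintf('peak radius %d, peak angle %d degrees\n', rPeak, tPeak - 1);
```

For this striped image the energy sits at radius 8 and angle 0 degrees, which shows how a strongly oriented, periodic texture produces a sharp peak in both functions. Folding the angle into a half circle sums the two symmetric halves of the spectrum of a real image, so it rescales $S(\theta)$ without changing its shape.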

[Figure: the six textures, labeled (i) through (vi). Images not reproduced.]

[Figure: six sets of graphs, labeled (a) through (f). Each set contains three plots: one with a horizontal axis labeled "Radius" running from 0 to 300, one with a horizontal axis running from 0 to 180, and one with a horizontal axis running from 0 to 250. Plot data not reproduced.]
