CIS 580, Machine Perception, Spring 2016
Homework 2

Due: 2015.02.24. 11:59AM

Instructions. Submit your answers in PDF form to Canvas. This is an individual assignment.

1 Recover camera orientation

By observing the vanishing points of lines or the vanishing line of a plane, we can estimate the camera's orientation. In this problem, we have two pictures of the Levine building, shown in Figure 1 and Figure 2.

The world coordinate system is right-handed with the origin at the door of the building; the vector from the door to S 34th St defines the z-axis, and the vector from the door to the sky defines the y-axis.

The camera intrinsic parameter K is given by

    K = [ 894.0041     0          382.8144
            0        893.0732     512.3807
            0          0            1      ],

all numbers are in pixels.

1.1 Single vanishing point

1. Compute the z-vanishing point, using the points given in Figure 1.

2. Compute the rotation angles, Pan α and Tilt β (defined in the lecture note) using the z-vanishing point.

1.2 Two vanishing points

1. Compute the x/y-vanishing points in Figure 2.

2. Compute the rotation angles, Pan α, Tilt β and Yaw γ (defined in the lecture note) using x/y-vanishing points.
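Both parts boil down to back-projecting a vanishing point through K: a vanishing point v back-projects to the ray direction d ∝ K⁻¹v, and the rotation angles follow from that direction. The sketch below is a hedged Python prototype, not the graded solution: the exact Pan/Tilt sign conventions are defined in the lecture notes, so the atan2 choices here are assumptions, and the example vanishing point is hypothetical.

```python
import math

# Hedged sketch: pan/tilt from a z-vanishing point v_z ~ K * r3.
# The atan2 sign conventions are assumptions (check the lecture notes).
K = [[894.0041, 0.0, 382.8144],
     [0.0, 893.0732, 512.3807],
     [0.0, 0.0, 1.0]]

def backproject(K, p):
    # d = K^{-1} p, using the closed form for a zero-skew upper-triangular K
    x = (p[0] - K[0][2]) / K[0][0]
    y = (p[1] - K[1][2]) / K[1][1]
    return (x, y, 1.0)

def pan_tilt(K, vz):
    dx, dy, dz = backproject(K, vz)
    pan = math.degrees(math.atan2(dx, dz))                    # rotation about y
    tilt = math.degrees(math.atan2(-dy, math.hypot(dx, dz)))  # rotation about x
    return pan, tilt

# hypothetical vanishing point at the principal point -> zero pan and tilt
print(pan_tilt(K, (382.8144, 512.3807, 1.0)))
```

A vanishing point at the principal point means the z-axis projects to the optical axis, so both angles vanish; any measured vanishing point from Figure 1 would give nonzero angles.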



Figure 1: Camera orientation from a single vanishing point. We measured four points in the image belonging to two lines parallel with the z-axis. Locations of the points are specified in the figure (best viewed in color).

Figure 2: Camera orientation from x-y vanishing points. We measured four points in the image belonging to two lines parallel with the y-axis, marked in red, and four points belonging to two lines parallel with the x-axis, marked in green. Locations of the points are specified in the figure (best viewed in color).



2 Homography transformation

A homography describes the geometric transformation between two planes. In this question, we will verify this transformation using a cell phone camera. Let's define the world coordinate system as in Figure 3.

Recall that in HW1 we learned how to locate the camera optical center using converging lines, as shown in Figure 3 and Figure 4. We first place our cell phone (camera) vertically on the paper and adjust its position so that the radiating lines on the paper appear parallel to each other in the image. We denote this image plane as image plane A. We then tilt the phone (camera) forward around the x-axis by 45°, creating image plane B. We will calculate the line equations of the radiating lines on image plane B.

The calibration matrix for our camera is given by

    K = [ 893.0732     0          512.3807
            0        894.0041     382.8144
            0          0            1      ],

all numbers are in pixels.

Figure 3: Homography transformation.

Figure 4: We marked four points on two radiating lines on the paper and measured their positions in the world coordinate system (left). We measured the coordinates of their correspondences in image A (right).

1. Let xA = K λ H1 X be the homography projection of points from the paper plane onto image plane A, where xA and X are the homogeneous coordinates of points on the image plane and the paper plane, respectively.

We marked four points on two radiating lines on the paper plane and measured their coordinates X. On image plane A, we measured the coordinates xA of these points, as shown in Figure 4.

From these four point correspondences, we computed the homography matrix H1 to be

    H1 = [ 2.6698     0.0084    −0.2737
           0.0752    −0.0196    13.7634
           0.1596     2.6648     1.0000 ].

Page 4: CIS 580, Machine Perception, Spring 2016 Homework 2 Due ...

The two lines in image plane A are parallel and intersect at a point at infinity. Write down this point at infinity in homogeneous coordinate representation. Use the homography H1 and K to project this point back to the paper plane, and obtain its exact 3D coordinates.
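Under the stated model xA = K λ H1 X, the back-projection amounts to X ~ (K H1)⁻¹ xA, since the scale λ drops out of homogeneous coordinates. The sketch below shows the mechanics; the direction x_inf is a placeholder, not the point at infinity read from the measured lines in Figure 4.

```python
# Hedged sketch of back-projecting an image-A point to the paper plane:
# X ~ (K H1)^{-1} x_A, following x_A = K lambda H1 X from the text.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def inv3(M):
    # closed-form 3x3 inverse via the adjugate
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)
    adj = [[e*i - f*h, c*h - b*i, b*f - c*e],
           [f*g - d*i, a*i - c*g, c*d - a*f],
           [d*h - e*g, b*g - a*h, a*e - b*d]]
    return [[x / det for x in row] for row in adj]

K = [[893.0732, 0.0, 512.3807],
     [0.0, 894.0041, 382.8144],
     [0.0, 0.0, 1.0]]
H1 = [[2.6698, 0.0084, -0.2737],
      [0.0752, -0.0196, 13.7634],
      [0.1596, 2.6648, 1.0]]

x_inf = [0.0, 1.0, 0.0]  # placeholder direction, NOT the answer from Figure 4
M = inv3(matmul(K, H1))
X = [sum(M[i][j] * x_inf[j] for j in range(3)) for i in range(3)]
```

Dividing X through by its last coordinate (when nonzero) gives the point's inhomogeneous position on the paper plane.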

2. Tilt the camera forward about the x-axis by 45°. Write down the camera rotation matrix R associated with this transformation. Assume the camera rotates with respect to its optical center. Note that R maps the previous camera coordinates to the current camera coordinates, Pcam,after = R Pcam,before, where Pcam,before and Pcam,after are the 3D point coordinates in the camera coordinate system before and after the tilt, respectively.

3. Compute the homography transformation H that maps points from image plane A to image plane B. Hint: the projected 2D point on image plane B is xB = K R λ H1 X.

4. For the two radiating lines we measured in Figure 4, compute their homogeneous line representations on image plane B. Hint: l′ = H^(−T) l.

5. Compute the intersection of these two radiating lines on image plane B. Are they converging or diverging?
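Combining the hints: since xA = K λ H1 X and xB = K R λ H1 X, the plane-A-to-plane-B homography is H = K R K⁻¹, lines map by l′ = H⁻ᵀ l, and two mapped lines meet at their cross product. A hedged numerical sketch (the sign of the 45° "forward" tilt is an assumption):

```python
import math

# Sketch of steps 2-5: H = K R K^{-1} maps image plane A to image plane B,
# lines map by l' = H^{-T} l, and mapped lines intersect at cross(l1', l2').

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def inv3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)
    adj = [[e*i - f*h, c*h - b*i, b*f - c*e],
           [f*g - d*i, a*i - c*g, c*d - a*f],
           [d*h - e*g, b*g - a*h, a*e - b*d]]
    return [[x / det for x in row] for row in adj]

def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]

K = [[893.0732, 0.0, 512.3807],
     [0.0, 894.0041, 382.8144],
     [0.0, 0.0, 1.0]]
th = math.radians(45.0)
R = [[1.0, 0.0, 0.0],                              # tilt about the x-axis;
     [0.0, math.cos(th), -math.sin(th)],           # the sign convention for
     [0.0, math.sin(th), math.cos(th)]]            # "forward" is an assumption

H = matmul(matmul(K, R), inv3(K))
Hinv = inv3(H)
HinvT = [[Hinv[j][i] for j in range(3)] for i in range(3)]

def map_line(l):
    # l' = H^{-T} l
    return [sum(HinvT[i][j] * l[j] for j in range(3)) for i in range(3)]
```

With the two measured lines mapped by `map_line`, `cross(l1p, l2p)` gives their intersection on image plane B; a finite third coordinate means the lines converge.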

3 Estimate the height of objects

We can estimate the height of any object on the ground using three measurements in the image: 1) the horizon line, 2) the vanishing point in Z (perpendicular to the ground plane), and 3) a known object height on the ground. In this question, we will revisit the problem of estimating object heights, and solve it by cross ratio when the image plane is not perpendicular to the ground plane.

Figure 5: Single View Metrology: estimating the height of objects using a known reference object.

1. Take a picture of the Levine building. Include an object, or a friend, with a known height in the picture. Make sure the bottom and top of the object (or your friend) are in the field of view, and that the image plane is NOT perpendicular to the ground plane. In other words, the vertical vanishing point should NOT be at infinity.

2. Compute two vanishing points by intersecting parallel lines on the ground plane or on the building facade.

3. Compute and draw the horizon (a vanishing line) in the image.

4. Compute the vanishing point in the Z axis using vertical lines on the facade.

5. Compute the height of the front door of the Levine building in mm using the cross ratio.
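Steps 2 and 3 rest on two homogeneous-coordinate identities: the intersection of two image lines is their cross product, and the line through two points (here, through two vanishing points, giving the horizon) is also a cross product. A small sketch with hypothetical point coordinates (not measured from a real photo):

```python
# Sketch of steps 2-3: vanishing points from imaged parallel lines, then
# the horizon as the line through the two vanishing points.

def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]

def line_through(p, q):
    # homogeneous line through two image points
    return cross(p, q)

# hypothetical endpoints of two pairs of imaged parallel lines:
v1 = cross(line_through([100, 400, 1], [300, 380, 1]),
           line_through([120, 600, 1], [320, 560, 1]))
v2 = cross(line_through([900, 420, 1], [700, 390, 1]),
           line_through([880, 620, 1], [680, 570, 1]))

horizon = cross(v1, v2)  # both vanishing points lie on this line
```

Step 4 is the same cross-product construction applied to the vertical facade lines, and step 5 then combines the horizon, the vertical vanishing point, and the reference height via the cross ratio.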



4 Camera Rotation

(a) Two meanings of rotation matrix (b) Rotation Combination

Figure 6: Camera Rotation

Recall the camera projection equation is defined as

x = K [R|t] X, (1)

where R and t are the camera extrinsic parameters, and K is the camera intrinsic parameter.

In this problem, we will familiarize ourselves with the concept of the rotation matrix R. Mathematically, the rotation matrix R can be used as follows (3D example):

    [ xb ]       [ xa ]
    [ yb ]  =  R [ ya ]
    [ zb ]       [ za ]

This equation has two geometrical meanings for the same rotation action, as illustrated in Figure 6(a):

• Rotation of point a, (xa, ya, za), to point b, (xb, yb, zb), in the same coordinate system, as shown in Figure 6(a) (left);

• Rotation of coordinate system b to a, transferring the coordinates of a point p in a, (xa, ya, za), to its coordinates in b, (xb, yb, zb), as shown in Figure 6(a) (right).

The camera extrinsic parameter R represents the coordinate transformation from the world coordinate system to the camera coordinate system. Geometrically, R corresponds to the rotation action that moves the world coordinate system to the camera coordinate system. This is the inverse of the rotation action experienced by the camera: if we rotate the camera by RC, the camera extrinsic parameter is R = inv(RC) = RC′.

Another property of rotations is that rotation actions can be composed in sequence (shown in Figure 6(b)): if a rotation a can be separated into two rotations, first rotation b and then rotation c, we have

    Ra = RcRb
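Both facts are easy to check numerically. The sketch below uses rotations about the y-axis, for which composition reduces to adding angles:

```python
import math

# Numerical check of the two facts above: the extrinsic R is the transpose
# (inverse) of the camera rotation RC, and rotations compose as Ra = Rc Rb.

def Ry(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]

RC = Ry(30.0)           # camera rotated 30 degrees about y
R_ext = transpose(RC)   # extrinsic rotation: inverse of the camera motion

Rb, Rc = Ry(10.0), Ry(20.0)
Ra = matmul(Rc, Rb)     # first rotate by b, then by c: equals Ry(30)
```

For coaxial rotations the order does not matter, but for general rotations Rc Rb differs from Rb Rc, which is why the composition order in Ra = RcRb must be respected.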

In HW1 problem 4 on the Dolly Zoom, we learned how the projected point position changes with the camera focal length and camera position. We will extend this example to a more general one that includes camera rotation.

We will use the same synthetic scene as in HW1. Figure 7(a) illustrates the top view of the simple synthetic scene and the camera placement. There are three objects in the scene, denoted as A (green cube), B (triangular pyramid) and C (blue cube).

We will use the following settings:

• The image size is 1920 × 1080, in square pixels, and the image center is aligned with the optical center ray.

• The image plane is perpendicular to the optical center ray.



(a) beginning status (b) rotate world

(c) rotate camera (d) rotate object

Figure 7: Top view of the synthetic scene.

• For the first frame, the image plane is parallel to the xy plane. The horizontal direction is the x-axis; the vertical direction is the y-axis.

• For the first frame, the camera center, denoted as Oc, is located at the origin.

• For the first frame, the camera focal length is fo = 400 (in pixel unit).

1. Constructing the intrinsic K. Given the 3D positions of all the visible vertices, re-render the video of the Dolly Zoom (similar to HW1 4.4, but shorter).

There are two functions that need to be completed:

[ K ] = intrinsic_para( f, alpha, principal_point, s )
GOAL
    construct the camera intrinsic matrix
INPUT
    f: double, focal length
    alpha: 1*2 vector, pixel scale
    principal_point: 1*2 vector, principal point position
    s: double, slant factor
OUTPUT
    K: 3*3 matrix, intrinsic K matrix

[ p2d ] = project( K, p3d )
GOAL
    compute vertex image positions from a given camera intrinsic matrix
INPUT
    K: 3*3 matrix, intrinsic K matrix
    p3d: n by 3, 3D vertex positions in the world coordinate system
OUTPUT
    p2d: n by 2, each row represents a vertex image position, in pixel units

Complete them and use generate_video_1.m to render the video.

In this question, you should re-use your compute_f.m function from HW1. The 3D position (X, Y, Z) of each visible vertex in Figure 8 is given in the data.mat file, which contains the points_A (n-by-3), points_B and points_C matrices.
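The assignment itself is in MATLAB, but the two stubs can be prototyped in a few lines of Python. The exact placement of `alpha` and the slant `s` in K follows one common convention and should be checked against the lecture notes; the example values (image center, f = 400) match the first-frame setup above.

```python
# Python prototype of the two MATLAB stubs (hedged: K-layout convention assumed).

def intrinsic_para(f, alpha, principal_point, s):
    ax, ay = alpha
    px, py = principal_point
    return [[f * ax, s, px],
            [0.0, f * ay, py],
            [0.0, 0.0, 1.0]]

def project(K, p3d):
    # first-frame setup: camera at the origin looking down +z, so x = K [I|0] X
    p2d = []
    for X, Y, Z in p3d:
        u = K[0][0] * X + K[0][1] * Y + K[0][2] * Z
        v = K[1][0] * X + K[1][1] * Y + K[1][2] * Z
        w = K[2][0] * X + K[2][1] * Y + K[2][2] * Z
        p2d.append((u / w, v / w))
    return p2d

K = intrinsic_para(400.0, (1.0, 1.0), (960.0, 540.0), 0.0)
print(project(K, [(0.0, 0.0, 1.0)]))  # -> [(960.0, 540.0)]
```

A point on the optical axis projects to the image center, which is a quick sanity check before rendering the whole video.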

2. Rotating objects about the world origin. Reset the camera to its initial position (keep the focal length at 400 pixels) and render the video for a rotating 3D world (all objects) about the y-axis. We will first rotate the objects by N° left about the y-axis, and then by a sequence of rotations right about the y-axis by M°, as shown in Figure 7(b).



Figure 8: Camera image rendered for the synthetic scene: the first frame with pos=0, and f=400.

The rotation of N° left is given as Rstart, and the incremental rotation of M° is given as Rdelta; both are stored in input_R.mat. We also know the total frame number is 31.

There is one function that needs to be completed:

[ p3d_new ] = rotate_world( frame, p3d )
GOAL
    rotate the 3D points about the y-axis
INPUT
    frame: frame number
    p3d: n by 3, 3D vertex positions in the world coordinate system
OUTPUT
    p3d_new: n by 3, 3D vertex positions after rotation

Complete it and use generate_video_2.m to render the video.
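A hedged Python sketch of the stub's logic: start at N° left, then step right by M° per frame. The real Rstart/Rdelta come from input_R.mat; the N and M values, the "left = negative angle" sign convention, and 1-based frame numbering below are all placeholders.

```python
import math

# rotate_world sketch: frame-dependent rotation of world points about y.
# N, M, the sign convention, and 1-based frames are assumptions here.

def Ry(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rotate_world(frame, p3d, N=30.0, M=2.0):
    ang = -N + M * (frame - 1)   # assumed: left = negative about y
    R = Ry(ang)
    return [[sum(R[i][j] * p[j] for j in range(3)) for i in range(3)]
            for p in p3d]
```

At frame 1 the world sits at the full N° left offset; each later frame undoes M° of it, matching "first rotate left, then a sequence of rotations right."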

3. Rotating the camera. Reset the camera to its initial position (keep the focal length at 400 pixels) and render the video for a camera rotating about the y-axis, as shown in Figure 7(c). We will use the same rotation action: first rotate the camera by N° left about the y-axis, and then apply a sequence of rotations right about the y-axis by M°. Hint: you will need to convert the camera rotation to the extrinsic parameters R and t first.

There are two functions that need to be completed:

[ p3d_c ] = world2camera( R, t, p3d )
GOAL
    transform points from the 3D world to the camera's local coordinates
INPUT
    R: 3 by 3, camera extrinsic parameter
    t: 1 by 3, camera extrinsic parameter
    p3d: n by 3, 3D vertex positions in the world coordinate system
OUTPUT
    p3d_c: n by 3, 3D vertex positions in the camera coordinate system

[ R, t ] = extrinsic_para( frame )
GOAL
    compute the camera extrinsic parameters for a specific frame
INPUT
    frame: frame number
OUTPUT
    R: 3 by 3, camera extrinsic parameter
    t: 1 by 3, camera extrinsic parameter

Complete them and run generate_video_3.m to render the video.
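Following the hint and Section 4's discussion, a camera rotated by RC about its optical center at the origin has extrinsics R = RC′ (the transpose) and t = 0. A hedged Python sketch of both stubs; N, M and the sign convention are placeholders for the values in input_R.mat:

```python
import math

# Sketch: p_cam = R p_world + t, with R = RC^T and t = 0 for a camera
# rotating about its own optical center at the origin (per the problem setup).

def Ry(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def world2camera(R, t, p3d):
    return [[sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
            for p in p3d]

def extrinsic_para(frame, N=30.0, M=2.0):
    RC = Ry(-N + M * (frame - 1))   # assumed camera rotation for this frame
    R = [[RC[j][i] for j in range(3)] for i in range(3)]  # R = RC^T
    t = [0.0, 0.0, 0.0]             # optical center stays at the origin
    return R, t
```

After `world2camera`, the points can be projected with the same `project(K, ...)` routine from question 1, since they are now in camera coordinates.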



4. Spinning object A about itself. Reset the camera to its initial position (keep the focal length at 400 pixels) and render the video for object A rotating about the axis formed by A3A4, as shown in Figure 7(d). We will use the same rotation action: first rotate A by N° left about A3A4, and then apply a sequence of rotations right about A3A4 by M°.

There is one function that needs to be completed:

[ p3d_new ] = rotate_object( frame, p3d )
GOAL
    rotate the 3D points about a specific line
INPUT
    frame: frame number
    p3d: n by 3, 3D vertex positions in the world coordinate system
OUTPUT
    p3d_new: n by 3, 3D vertex positions after rotation

Complete it and use generate_video_4.m to render the video.

Hint: Separate the rotation about A3A4 into a translation and a rotation in the world coordinate system first.
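The hint above can be sketched as a conjugation: translate so the A3A4 axis passes through the origin, rotate, translate back. The sketch assumes A3A4 is parallel to the y-axis; `axis_point`, N, and M are placeholders (the real axis comes from the A3, A4 vertices in data.mat):

```python
import math

# rotate_object sketch: translation + rotation + inverse translation.
# axis_point, N, M, and the y-parallel-axis assumption are placeholders.

def Ry(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rotate_object(frame, p3d, axis_point=(2.0, 0.0, 5.0), N=30.0, M=2.0):
    R = Ry(-N + M * (frame - 1))
    out = []
    for p in p3d:
        q = [p[k] - axis_point[k] for k in range(3)]          # axis -> origin
        r = [sum(R[i][j] * q[j] for j in range(3)) for i in range(3)]
        out.append([r[k] + axis_point[k] for k in range(3)])  # translate back
    return out
```

A quick sanity check: any point on the axis itself is a fixed point of this transformation at every frame.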
