CS231a PSET 3 Review - Stanford University

Krishnan Srinivasan

Overview

1. Problem 1: Image Rectification

2. Problem 2: Space Carving

3. Problem 5: Tracking

4. Problem 3: Representation Learning (colab)

5. Problem 4: Monocular Depth Estimation (colab)

Problem 1 - Image Rectification

Space Carving: Overview

Problem 2 - Space Carving

Problem 2 - Space Carving (d)

Problem 2 - Space Carving (e)

Problem 5 - Tracking and Optical Flow

Lucas-Kanade point feature optical flow:

● cv2.goodFeaturesToTrack(): finds N strongest corners to track in the image for optical flow

● For our problem, use it to find N = 200 points (i.e. features, maxCorners in function)

● OpenCV2 Tutorial for LK Optical Flow

https://docs.opencv.org/3.4/d4/dee/tutorial_optical_flow.html


Track a pixel in the first image frame (at timestep t0): (x, y, t0):

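● Assume that intensity does not change between frames: $I(x, y, t_0) = I(x + dx, y + dy, t_0 + dt)$

● Optical flow equation (first-order Taylor approximation): $I_x u + I_y v + I_t = 0$, where: $I_x = \partial I / \partial x$, $I_y = \partial I / \partial y$, $I_t = \partial I / \partial t$, $u = dx/dt$, $v = dy/dt$

● Lucas-Kanade is used to compute u, v (i.e., pixel movement)

● Steps: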

1) Detect Shi-Tomasi corners (p0) using cv2.goodFeaturesToTrack

2) Iterate through frames, tracking the points from the original frame using cv2.calcOpticalFlowPyrLK
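To make the optical flow equation concrete, here is a minimal NumPy sketch that solves it by least squares over a single window, which is the core idea behind Lucas-Kanade. It is only an illustration under stated assumptions: the synthetic image, the helper names make_frame and lucas_kanade_window, the window size, and the sub-pixel shift are all made up for this example. cv2.calcOpticalFlowPyrLK additionally uses image pyramids and iterative refinement, which this sketch does not.

```python
import numpy as np

def make_frame(shift_x=0.0, shift_y=0.0, size=64):
    """Smooth synthetic image whose content is displaced by (shift_x, shift_y) pixels."""
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    x, y = xs - shift_x, ys - shift_y
    return np.sin(0.2 * x) + np.cos(0.15 * y) + np.sin(0.1 * (x + y))

def lucas_kanade_window(frame0, frame1, cx, cy, half=10):
    """Estimate (u, v) for the window centred on column cx, row cy."""
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    I_y, I_x = np.gradient(frame0)
    I_t = frame1 - frame0  # temporal derivative with dt = 1 frame
    win = (slice(cy - half, cy + half + 1), slice(cx - half, cx + half + 1))
    # One equation I_x * u + I_y * v = -I_t per pixel in the window.
    A = np.stack([I_x[win].ravel(), I_y[win].ravel()], axis=1)
    b = -I_t[win].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Hypothetical ground-truth motion between the two frames (in pixels).
true_u, true_v = 0.2, -0.1
frame0 = make_frame()
frame1 = make_frame(true_u, true_v)
u, v = lucas_kanade_window(frame0, frame1, cx=32, cy=32)
print(f"estimated flow: u={u:.3f}, v={v:.3f}")
```

The estimate should land close to the true shift because the motion is small and the image is smooth, which is exactly when the first-order Taylor approximation holds.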

Example: Increasing maxCorners

http://www.youtube.com/watch?v=9-tNXR2IiNE

http://www.youtube.com/watch?v=ke-8ss6Nm5w

Example: Increasing qualityLevel

http://www.youtube.com/watch?v=plKxyuHjg6k

http://www.youtube.com/watch?v=hBTSPtMOOwE

Example: Increasing windowSize

http://www.youtube.com/watch?v=MWGsm8UO6bQ

http://www.youtube.com/watch?v=iIyixc14C2s


Steps:

1) Write points from Part A in homogeneous coordinates (with depth = 1)

2) Scale points by depth from the depth map (can check for a valid depth with not np.isnan(depth[x, y]))

3) Multiply by the inverted intrinsic camera matrix to get the point in 3D
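The three steps above can be sketched in NumPy as follows. This is a rough illustration, not the assignment's starter code: the function name backproject, the intrinsic matrix K, and the toy depth map are assumptions, and the sketch indexes the depth map as depth[row, column] = depth[y, x], which may differ from the convention used in the problem set.

```python
import numpy as np

def backproject(points_xy, depth, K):
    """Lift 2D pixel coordinates to 3D camera-frame points using a depth map."""
    K_inv = np.linalg.inv(K)
    points_3d = []
    for x, y in points_xy:
        # Assumed indexing: row = y, column = x.
        d = depth[int(round(y)), int(round(x))]
        if np.isnan(d):
            continue  # skip pixels with no valid depth
        pixel_h = np.array([x, y, 1.0])   # step 1: homogeneous coordinates
        scaled = d * pixel_h              # step 2: scale by depth
        points_3d.append(K_inv @ scaled)  # step 3: apply inverse intrinsics
    return np.array(points_3d)

# Hypothetical intrinsics: focal length 100 px, principal point (32, 24).
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
depth = np.full((48, 64), 2.0)  # toy depth map, 2 units everywhere
depth[10, 5] = np.nan           # one invalid measurement
points = [(32.0, 24.0), (42.0, 24.0), (5.0, 10.0)]
pts3d = backproject(points, depth, K)
print(pts3d)
```

A pixel at the principal point maps to a 3D point on the optical axis, and the point with NaN depth is dropped.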

In this notebook, we will be using the MNIST dataset to showcase how self-supervised representation learning can be utilized for more efficient training in downstream tasks. We will do the following things:

1. Train a classifier from scratch on the MNIST dataset and observe how fast and well it learns

2. Train useful representations via predicting digit rotations, rather than classifying digits

3. Transfer our rotation pretraining features to solve the classification task with much less data than in step 1

Problem 3 - Representation Learning

https://en.wikipedia.org/wiki/MNIST_database

PyTorch Training basics:

● torch.utils.data.DataLoader and Dataset to load datasets and make batches

● Defining models using the nn.Module class

● Using torch.optim to take gradient steps


MNISTDataset(Dataset)

● __init__: load pct% of images from processed .pt file

● __getitem__: randomly rotate an image from self.imgs. Hint: use PIL.Image.rotate to rotate the image, and then convert it back to torch.Tensor type

● Hint: Use torch.tensor(rotation_idx).long() to generate rotation labels
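A rough sketch of what such a rotation dataset could look like is shown below. Several things here are assumptions rather than the assignment's interface: the class name RotationDataset, the four rotation classes (0, 90, 180, 270 degrees), pct treated as a fraction in [0, 1], and random tensors standing in for the processed MNIST file. It also swaps in torch.rot90 for PIL.Image.rotate so the example needs only PyTorch.

```python
import random
import torch
from torch.utils.data import Dataset, DataLoader

class RotationDataset(Dataset):
    """Hypothetical stand-in for the assignment's MNISTDataset."""
    ANGLES = (0, 90, 180, 270)  # assumed rotation classes

    def __init__(self, imgs, pct=1.0):
        # Keep only the first pct fraction of the images (assumed meaning of pct).
        n_keep = max(1, int(len(imgs) * pct))
        self.imgs = imgs[:n_keep]

    def __len__(self):
        return len(self.imgs)

    def __getitem__(self, idx):
        rotation_idx = random.randrange(len(self.ANGLES))
        # torch.rot90 is used here in place of PIL.Image.rotate.
        rotated = torch.rot90(self.imgs[idx], k=rotation_idx, dims=(0, 1))
        # Add a channel dimension and return the rotation index as the label.
        return rotated.unsqueeze(0), torch.tensor(rotation_idx).long()

fake_imgs = torch.rand(100, 28, 28)  # random data standing in for MNIST
dataset = RotationDataset(fake_imgs, pct=0.5)
loader = DataLoader(dataset, batch_size=16, shuffle=True)
batch_imgs, batch_labels = next(iter(loader))
print(batch_imgs.shape, batch_labels.shape)
```

The label is the rotation applied, not the digit class, which is what makes the pretraining task self-supervised.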

nn.Sequential(...)

● Creates a stack of layers that pass input data through a model

● nn.Linear(...) layers form weights and biases for a single



Training example (from pytorch-examples repo)

● opt.zero_grad to zero gradients before update

● loss.backward to backpropagate gradients

● opt.step to update model params

https://github.com/jcjohnson/pytorch-examples
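For reference, a minimal training loop that uses the three calls above might look like this. It is a sketch under assumptions, not the pytorch-examples code: the layer sizes, learning rate, step count, and random stand-in data are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(256, 784)         # stand-in for flattened 28x28 images
y = torch.randint(0, 10, (256,))  # stand-in for digit labels

# nn.Sequential stacks layers; sizes here are arbitrary.
model = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

losses = []
for step in range(50):
    logits = model(x)
    loss = loss_fn(logits, y)
    opt.zero_grad()   # clear gradients left over from the previous step
    loss.backward()   # backpropagate to populate .grad on each parameter
    opt.step()        # update parameters using the gradients
    losses.append(loss.item())

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Calling opt.zero_grad before loss.backward matters because PyTorch accumulates gradients across backward passes by default.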

● Need to implement:

○ RandomHorizontalFlip and RandomChannelSwap in data.py

○ depth_loss() in losses.py

○ DenseDepth model in models.py

○ prediction and l1_loss in training.py
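As a hedged illustration of the two augmentations named above, here is a NumPy sketch. The function names, signatures, and array layouts (H x W x 3 image, H x W depth) are assumptions and will not match the classes in data.py. The point it shows is that a horizontal flip must be applied to the image and its depth map together, while a channel swap only touches the image.

```python
import numpy as np

def random_horizontal_flip(image, depth, rng, p=0.5):
    """Flip image and depth left-right together with probability p."""
    if rng.random() < p:
        image = image[:, ::-1].copy()
        depth = depth[:, ::-1].copy()
    return image, depth

def random_channel_swap(image, rng, p=0.5):
    """Permute the colour channels of the image with probability p."""
    if rng.random() < p:
        perm = rng.permutation(3)
        image = image[..., perm]
    return image

rng = np.random.default_rng(0)
image = rng.random((4, 6, 3))  # toy H x W x 3 image
depth = rng.random((4, 6))     # toy H x W depth map

# p=1.0 forces both augmentations so the effect is visible.
flipped_img, flipped_depth = random_horizontal_flip(image, depth, rng, p=1.0)
swapped_img = random_channel_swap(image, rng, p=1.0)
print(flipped_img.shape, flipped_depth.shape, swapped_img.shape)
```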

Problem 4 - Monocular Depth Estimation

● Adapted from High Quality Monocular Depth Estimation via Transfer Learning [Alhashim and Wonka 2018]

● Training for 7 epochs on Colab takes around 7 to 8 hours, so try running it at night (when there is less demand)

Problem 4 - Monocular Depth Estimation

Example of monocular depth estimate prediction
