CS231a PSET 3 Review - Stanford University
Krishnan Srinivasan
Overview
1. Problem 1: Image Rectification
2. Problem 2: Space Carving
3. Problem 5: Tracking
4. Problem 3: Representation Learning (colab)
5. Problem 4: Monocular Depth Estimation (colab)
Problem 1 - Image Rectification
Space Carving: Overview
Problem 2 - Space Carving
Problem 2 - Space Carving (d)
Problem 2 - Space Carving (e)
Problem 5 - Tracking and Optical Flow
Lucas-Kanade point feature optical flow:
● cv2.goodFeaturesToTrack(): finds N strongest corners to track in the image for optical flow
● For our problem, use it to find N = 200 points (i.e., features; this is the maxCorners argument of the function)
● OpenCV2 Tutorial for LK Optical Flow
Problem 5 - Tracking and Optical Flow
Track a pixel in the first image frame (at timestep t0): (x, y, t0):
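● Assume that intensity does not change between frames: $I(x, y, t) = I(x + dx, y + dy, t + dt)$
● Optical flow equation (first-order Taylor approximation): $I_x u + I_y v + I_t = 0$, where: $I_x = \partial I / \partial x$, $I_y = \partial I / \partial y$, $I_t = \partial I / \partial t$, $u = dx/dt$, $v = dy/dt$
● Lucas-Kanade is used to compute u, v (i.e., pixel movement)
● Steps: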
1) Detect Shi-Tomasi corners (p0) using cv2.goodFeaturesToTrack
2) Iterate through frames, tracking the points from the original frame using cv2.calcOpticalFlowPyrLK
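To make the Lucas-Kanade step concrete, here is a minimal NumPy sketch of the least-squares solve for a single window. It stacks the optical flow constraint I_x u + I_y v = -I_t for every pixel in the window and solves for (u, v). This is not what the assignment asks you to write (there, cv2.calcOpticalFlowPyrLK does this work, with image pyramids on top). The function name lucas_kanade_window, the helper gaussian_blob, and the window size are hypothetical choices for illustration.

```python
import numpy as np

def lucas_kanade_window(frame0, frame1, center, half_win=7):
    """Estimate the flow (u, v) of one window by least squares.

    Solves I_x * u + I_y * v = -I_t over all pixels in the window.
    `center` is (row, col); names and defaults are illustrative assumptions.
    """
    f0 = frame0.astype(np.float64)
    f1 = frame1.astype(np.float64)
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(f0)
    It = f1 - f0  # temporal derivative, frame spacing taken as 1
    r, c = center
    win = (slice(r - half_win, r + half_win + 1),
           slice(c - half_win, c + half_win + 1))
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(u), float(v)

def gaussian_blob(cx, cy, size=64, sigma=6.0):
    """Synthetic smooth test image: a Gaussian bump centred at (cx, cy)."""
    yy, xx = np.mgrid[0:size, 0:size]
    return np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2 * sigma ** 2))

# The blob moves 0.3 px in x and 0.2 px in y between the two frames.
frame0 = gaussian_blob(32.0, 32.0)
frame1 = gaussian_blob(32.3, 32.2)
u, v = lucas_kanade_window(frame0, frame1, center=(32, 32))
print(f"u = {u:.2f}, v = {v:.2f}")
```

On this synthetic pair the estimate should land close to the true sub-pixel shift. The linearization only holds for small motions, which is why the OpenCV routine runs the same solve coarse-to-fine on a pyramid.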
Problem 5 - Tracking and Optical Flow
Example: Increasing maxCorners
Example: Increasing qualityLevel
Example: Increasing windowSize
Problem 5 - Tracking and Optical Flow
Steps:
1) Write points from Part A in homogeneous coordinates (with depth = 1)
2) Scale points by depth from the depth map (can check validity with not np.isnan(depth[x, y]))
3) Multiply by the inverted intrinsic camera matrix to get the point in 3D
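The three steps above can be sketched in NumPy as follows. This is only an illustration under assumptions: the function name backproject is hypothetical, points are taken to be (x, y) pixel coordinates, and the depth map is assumed to be indexed as depth[row, col], i.e. depth[y, x]. Check the indexing convention used in the starter code before reusing any of this.

```python
import numpy as np

def backproject(points_xy, depth, K):
    """Lift (N, 2) pixel coordinates to 3D camera-frame points.

    Assumes depth is indexed [row, col] = [y, x]; points with NaN depth
    are dropped. Returns the (M, 3) points and the (N,) validity mask.
    """
    pts = np.asarray(points_xy, dtype=np.float64)
    cols = np.round(pts[:, 0]).astype(int)
    rows = np.round(pts[:, 1]).astype(int)
    z = depth[rows, cols]
    valid = ~np.isnan(z)
    # Step 1: homogeneous pixel coordinates (x, y, 1).
    homog = np.column_stack([pts[valid], np.ones(valid.sum())])
    # Step 2: scale each point by its depth.
    scaled = homog * z[valid, None]
    # Step 3: multiply by the inverse intrinsic matrix.
    pts3d = (np.linalg.inv(K) @ scaled.T).T
    return pts3d, valid

# Hypothetical intrinsics and a flat depth map with one missing value.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
depth = np.full((48, 64), 2.0)
depth[10, 5] = np.nan
points = np.array([[32.0, 24.0], [42.0, 24.0], [5.0, 10.0]])
pts3d, valid = backproject(points, depth, K)
print(pts3d, valid)
```

The point at the principal point maps onto the optical axis, and the point with NaN depth is filtered out by the mask rather than producing a NaN 3D point.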
In this notebook, we will be using the MNIST dataset to showcase how self-supervised representation learning can be utilized for more efficient training in downstream tasks. We will do the following things:
1. Train a classifier from scratch on the MNIST dataset and observe how fast and well it learns
2. Train useful representations via predicting digit rotations, rather than classifying digits
3. Transfer our rotation pretraining features to solve the classification task with much less data than in step 1
Problem 3 - Representation Learning
PyTorch Training basics:
● torch.utils.data.DataLoader and Dataset to load datasets and make batches
● Defining models using the nn.Module class
● Using torch.optim to take gradient steps
Problem 3 - Representation Learning
MNISTDataset(Dataset)
● __init__: load pct% of images from processed .pt file
● __getitem__: randomly rotate an image from self.imgs. Hint: use PIL.Image.rotate to rotate the image, and then convert it back to torch.Tensor type
● Hint: Use torch.tensor(rotation_idx).long() to generate rotation labels
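A rough sketch of what the __getitem__ hints could look like is given below. It is not the notebook's reference solution: the class name RotationDataset, the set of four 90-degree rotations, and the assumption that images arrive as a uint8 tensor of shape (N, 28, 28) are all illustrative guesses, and loading pct% of the .pt file is left out.

```python
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class RotationDataset(Dataset):
    """Hypothetical rotation-prediction dataset (not the assignment's class)."""

    ANGLES = (0, 90, 180, 270)  # assumed rotation classes

    def __init__(self, imgs):
        # imgs: uint8 tensor of shape (N, 28, 28), assumed already loaded.
        self.imgs = imgs

    def __len__(self):
        return len(self.imgs)

    def __getitem__(self, idx):
        # Pick a random rotation class for this sample.
        rotation_idx = int(torch.randint(len(self.ANGLES), (1,)))
        # Tensor -> PIL image, rotate, then back to a float tensor in [0, 1].
        pil_img = Image.fromarray(self.imgs[idx].numpy())
        rotated = pil_img.rotate(self.ANGLES[rotation_idx])
        img = torch.from_numpy(np.array(rotated)).float().unsqueeze(0) / 255.0
        # The label is the rotation index, not the digit.
        return img, torch.tensor(rotation_idx).long()

# Stand-in data: five blank 28x28 images.
dataset = RotationDataset(torch.zeros(5, 28, 28, dtype=torch.uint8))
img, label = dataset[0]
print(img.shape, label)
```

The key point is that the label comes for free from the transformation applied, which is what makes the pretraining task self-supervised.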
nn.Sequential(...)
● Creates a stack of layers that pass input data through a model
● nn.Linear(...) layers form weights and biases for a single linear (fully connected) layer
Problem 3 - Representation Learning
Training example (from pytorch-examples repo)
● opt.zero_grad to zero gradients before update
● loss.backward to backpropagate gradients
● opt.step to update model params
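For reference, the three calls fit together roughly as in the sketch below. This is a generic PyTorch loop on random stand-in data, not the notebook's training code: the layer sizes, learning rate, batch size, and the train_epoch helper are assumptions made for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-in data: 256 flattened 28x28 inputs with 10 classes.
x = torch.randn(256, 28 * 28)
y = torch.randint(0, 10, (256,))
loader = DataLoader(TensorDataset(x, y), batch_size=64, shuffle=True)

# A small nn.Sequential classifier; the sizes are illustrative.
model = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

def train_epoch():
    """Run one pass over the loader and return the mean training loss."""
    total = 0.0
    for xb, yb in loader:
        opt.zero_grad()                 # clear gradients from the last step
        loss = loss_fn(model(xb), yb)   # forward pass and loss
        loss.backward()                 # backpropagate gradients
        opt.step()                      # update model parameters
        total += loss.item() * len(xb)
    return total / len(loader.dataset)

losses = [train_epoch() for _ in range(5)]
print(losses)
```

Forgetting opt.zero_grad() is a common bug: PyTorch accumulates gradients across backward calls, so each step would otherwise use the sum of all previous gradients.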
● Need to implement:
○ RandomHorizontalFlip and RandomChannelSwap in data.py
○ depth_loss() in losses.py
○ DenseDepth model in models.py
○ prediction and l1_loss in training.py
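As a starting point for the two augmentations in data.py, here is a NumPy-only sketch. The real starter code may pass PIL images or use a different sample layout; the dict with "image" (H, W, 3) and "depth" (H, W) arrays, the probability argument p, and the class bodies below are assumptions for illustration only.

```python
import random
from itertools import permutations

import numpy as np

class RandomHorizontalFlip:
    """Flip image and depth left-right together with probability p (sketch)."""

    def __init__(self, p=0.5):
        self.p = p

    def __call__(self, sample):
        image, depth = sample["image"], sample["depth"]
        if random.random() < self.p:
            # Both must be flipped, or the depth no longer matches the image.
            image = image[:, ::-1].copy()
            depth = depth[:, ::-1].copy()
        return {"image": image, "depth": depth}

class RandomChannelSwap:
    """Permute the colour channels of the image with probability p (sketch)."""

    def __init__(self, p=0.5):
        self.p = p
        self.orders = list(permutations(range(3)))

    def __call__(self, sample):
        image, depth = sample["image"], sample["depth"]
        if random.random() < self.p:
            order = list(random.choice(self.orders))
            image = image[..., order]  # depth is left untouched
        return {"image": image, "depth": depth}

# Tiny example sample with distinct values per pixel and channel.
sample = {
    "image": np.arange(2 * 4 * 3).reshape(2, 4, 3),
    "depth": np.arange(2 * 4, dtype=np.float64).reshape(2, 4),
}
flipped = RandomHorizontalFlip(p=1.0)(sample)
swapped = RandomChannelSwap(p=1.0)(sample)
print(flipped["depth"])
print(swapped["image"][0, 0])
```

The design point worth noting is that geometric augmentations must be applied to the image and its depth map together, while colour augmentations apply to the image alone.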
Problem 4 - Monocular Depth Estimation
● Adapted from High Quality Monocular Depth Estimation via Transfer Learning [Alhashim and Wonka 2018]
● Training for 7 epochs on Colab takes around 7 to 8 hours, so try running it at night (when there is less demand)
Problem 4 - Monocular Depth Estimation
Example of monocular depth estimate prediction