COMP 776: Computer Vision - Computer Sciencelazebnik/spring09/lec01_intro.pdf• Obstacle detection,...

COMP 776: Computer VisionCOMP 776: Computer Vision

Basic InfoBasic Info

• Instructor: Svetlana Lazebnik (lazebnik@cs.unc.edu)

• Office hours: By appointment, FB 244

• Textbook (recommended): Forsyth & Ponce, Computer Vision: A Modern Approach

• Class webpage:http://www.cs.unc.edu/~lazebnik/spring09

TodayToday

• Introduction to computer vision• Course overview• Course requirements

The goal of computer visionThe goal of computer vision

• To perceive the story behind the picture

What we see What a computer seesSource: S. Narasimhan

The goal of computer visionThe goal of computer vision

• To perceive the story behind the picture

• What exactly does this mean?– Vision as a source of metric 3D information– Vision as a source of semantic information

Vision as measurement deviceVision as measurement device

Real-time stereo Structure from motion

NASA Mars Rover

Pollefeys et al.

Multi-view stereo forcommunity photo collections

Goesele et al.

Vision as a source of semantic information

slide credit: Fei-Fei, Fergus & Torralba

Object categorization

building

wallbanner

street lamp

Scene and context categorization• outdoor• city• traffic• …

Qualitative spatial information

slanted

rigid moving object

horizontal

vertical

rigid moving object

non-rigid moving object

Why study computer vision?Why study computer vision?

Personal photo albums

Surveillance and security

Movies, news, sports

Medical and scientific images

• Vision is useful: Images and video are everywhere!

Why study computer vision?Why study computer vision?

• Vision is useful• Vision is interesting• Vision is difficult

– Half of primate cerebral cortex is devoted to visual processing– Achieving human-level visual perception is probably “AI-complete”

Why is computer vision difficult?Why is computer vision difficult?

Challenges: viewpoint variation

Michelangelo 1475-1564 slide credit: Fei-Fei, Fergus & Torralba

Challenges: illumination

image credit: J. Koenderink

Challenges: scale

Challenges: deformation

Xu, Beihong 1943

Challenges: occlusion

Magritte, 1957 slide credit: Fei-Fei, Fergus & Torralba

Challenges: background clutter

Challenges: Motion

Challenges: object intra-class variation

Challenges: local ambiguity

Challenges or opportunities?Challenges or opportunities?

• Images are confusing, but they also reveal the structure of the world through numerous cues

• Our job is to interpret the cues!

Image source: J. Koenderink

Depth cues: Linear perspectiveDepth cues: Linear perspective

Depth cues: Aerial perspectiveDepth cues: Aerial perspective

Depth ordering cues: OcclusionDepth ordering cues: Occlusion

Source: J. Koenderink

Shape cues: Texture gradientShape cues: Texture gradient

Shape and lighting cues: ShadingShape and lighting cues: Shading

Position and lighting cues: Cast shadowsPosition and lighting cues: Cast shadows

Grouping cues: Similarity (color, texture,Grouping cues: Similarity (color, texture,proximity)proximity)

Grouping cues: Grouping cues: ““Common fateCommon fate””

Image credit: Arthus-Bertrand (via F. Durand)

Bottom lineBottom line

• Perception is an inherently ambiguous problem– Many different 3D scenes could have given rise to a particular 2D picture

Image source: F. Durand

Bottom lineBottom line

• Perception is an inherently ambiguous problem– Many different 3D scenes could have given rise to a particular 2D picture

• Possible solutions– Bring in more constraints (more images)– Use prior knowledge about the structure of the world

• Need a combination of different methodsImage source: F. Durand

Connections to other disciplinesConnections to other disciplines

Computer Vision

Image Processing

Machine Learning

Artificial Intelligence

Robotics

PsychologyNeuroscience

Computer Graphics

Origins of computer visionOrigins of computer vision

L. G. Roberts, Machine Perception of Three Dimensional Solids,Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

Progress to dateThe next slides show some examples of what

current vision systems can do

Earth viewers (3D modeling)

Image from Microsoft’s Virtual Earth(see also: Google Earth)

Source: S. Seitz

Photosynth

http://labs.live.com/photosynth/

NOTE: Noah Snavely talk at GLUNCH tomorrow!

Source: S. Seitz

Optical character recognition (OCR)

Digit recognition, AT&T labshttp://www.research.att.com/~yann/

Technology to convert scanned docs to text• If you have a scanner, it probably came with OCR software

License plate readershttp://en.wikipedia.org/wiki/Automatic_number_plate_recognition

Source: S. Seitz

Face detection

Many new digital cameras now detect faces• Canon, Sony, Fuji, …

Source: S. Seitz

Smile detection?

Sony Cyber-shot® T70 Digital Still Camera Source: S. Seitz

Object recognition (in supermarkets)

LaneHawk by EvolutionRobotics“A smart camera is flush-mounted in the checkout lane, continuously watching for items. When an item is detected and recognized, the cashier verifies the quantity of items that were found under the basket, and continues to close the transaction. The item can remain under the basket, and with LaneHawk,you are assured to get paid for it… “

Source: S. Seitz

Face recognition

Who is she? Source: S. Seitz

Vision-based biometrics

“How the Afghan Girl was Identified by Her Iris Patterns” Read the story

Source: S. Seitz

Login without a password…

Fingerprint scanners on many new laptops,

other devices

Face recognition systems now beginning to appear more widely

http://www.sensiblevision.com/

Source: S. Seitz

Object recognition (in mobile phones)

This is becoming real:• Microsoft Research• Point & Find

Source: S. Seitz

iPhone Apps: (www.kooaba.com)

iPhone Apps: (www.snaptell.com)

The Matrix movies, ESC Entertainment, XYZRGB, NRC

Special effects: shape capture

Source: S. Seitz

Pirates of the Carribean, Industrial Light and Magic

Special effects: motion capture

Source: S. Seitz

Sports

Sportvision first down lineNice explanation on www.howstuffworks.com

Source: S. Seitz

Smart cars

Mobileye• Vision systems currently in high-end BMW, GM, Volvo models • By 2010: 70% of car manufacturers.

Slide content courtesy of Amnon Shashua

Source: S. Seitz

Vision-based interaction (and games)

Nintendo Wii has camera-based IRtracking built in. See Lee’s work atCMU on clever tricks on using it tocreate a multi-touch display!

Source: S. Seitz

Assistive technologies

Sony EyeToy

Vision in space

Vision systems (JPL) used for several tasks• Panorama stitching• 3D terrain modeling• Obstacle detection, position tracking• For more, read “Computer Vision on Mars” by Matthies et al.

NASA'S Mars Exploration Rover Spirit captured this westward view from atop a low plateau where Spirit spent the closing months of 2007.

Source: S. Seitz

Robotics

http://www.robocup.org/NASA’s Mars Spirit Roverhttp://en.wikipedia.org/wiki/Spirit_rover

Source: S. Seitz

The computer vision industryThe computer vision industry

• A list of companies here:

http://www.cs.ubc.ca/spider/lowe/vision.html

Course overviewCourse overview

I. Early vision: Image formation and processingII. Mid-level vision: Grouping and fittingIII. Multi-view geometryIV. RecognitionV. Advanced topics

I. Early visionI. Early vision

Cameras and sensorsLight and color

Linear filteringEdge detection

Feature extraction: corner and blob detection

• Basic image formation and processing

II. II. ““MidMid--level visionlevel vision””

Fitting: Least squaresHough transform

RANSAC

Alignment

• Fitting and grouping

III. MultiIII. Multi--view geometryview geometry

Projective structure from motion

Stereo

Affine structure from motion

Tomasi & Kanade (1993)

Epipolar geometry

IV. RecognitionIV. Recognition

Patch description and matching Clustering and visual vocabularies

Bag-of-features models Classification

Sources: D. Lowe, L. Fei-Fei

V. Advanced TopicsV. Advanced Topics• Time permitting…

Segmentation

Articulated models

Face detection

Motion and tracking

Course requirementsCourse requirements• Philosophy: computer vision is best experienced hands-on

• Programming assignments: 50%– Three or four assignments– Expect the first one in the next couple of classes– Brush up on your MATLAB skills (see web page for tutorial)

• Final assignment: 30% – Recognition competition– Winner gets a prize!

• Participation: 20% – Come to class regularly– Ask questions– Answer questions

Collaboration policyCollaboration policy

• Feel free to discuss assignments with each other, but coding must be done individually

• Feel free to incorporate code or tips you find on the Web, provided this doesn’t make the assignment trivial and you explicitly acknowledge your sources

• Remember: I can Google too!

For next timeFor next time

• Self-study: MATLAB tutorial • Reading: cameras and image formation (F&P chapter 1)

COMP 776: Computer Vision - Computer Sciencelazebnik/spring09/lec01_intro.pdf• Obstacle detection,...

Documents

Transcript of COMP 776: Computer Vision - Computer Sciencelazebnik/spring09/lec01_intro.pdf• Obstacle detection,...

Last time: Binocular stereo - Computer Sciencelazebnik/spring09/lec14_multiview_stereo.pdf · What is stereo vision? • Generic problem formulation: given several images of the same

AMaximumEntropyFrameworkforPart ... - Computer Sciencelazebnik/publications/iccv05.pdf · University of Illinois, USA slazebni@uiuc.edu Cordelia Schmid INRIA Rhone-Alpesˆ Montbonnot,

Face detection and recognition - Computer Sciencelazebnik/spring09/lec22_eigenfaces.pdfFace Recognition using Eigenfaces, CVPR 1991 Eigenfaces example Training images x 1,…, x N

Fitting: The Hough transform - Computer Sciencelazebnik/spring11/lec10_hough.pdf · Hough transform • An early type of voting scheme • General outline: • Discretize parameter

Review: Game theory - Computer Sciencelazebnik/fall11/lec10_game_theory2.pdf · Review: Game theory Alice: Testify Alice: Refuse Bob: Testify-5,-5 -10,0 Bob: 0 -10 -1 -1 Refuse, ,

Feature-Based Image Metamorphosis - Computer Sciencelazebnik/research/fall08/qi_mo.pdf · Image Morphing History Morphing is turning one image into another through a seamless transition

First-Order LogicOrder Logic - Computer Sciencelazebnik/fall10/lec13_fol.pdf · · 2010-10-26First-order logicorder logic • Propositional logic assumes the worldlogic assumes

Review: Support vector machines - Computer Sciencelazebnik/fall09/lec06_svm.pdf · 2009-09-14 · Review: Support vector machines Margin optimization min (w;w 0) 1 2 kwk2 subject

Blob detection - Computer Sciencelazebnik/spring11/lec08_blob.pdfimage circle Laplacian. Characteristic scale • We define the characteristic scale of a blob as the scale that produces

lec23 face detection - Computer Sciencelazebnik/spring11/lec23_face_detection.pdf• A seminal approach to real-time object detection • Training is slow, but detection is very fast

Linear filtering - Computer Sciencelazebnik/spring11/lec05_filter.pdf · Microsoft PowerPoint - lec05_filter Author: lazebnik Created Date: 2/1/2011 4:55:54 PM ...

Linear filtering - Computer Sciencelazebnik/spring09/lec05_filter.pdf · Alternative idea: Median filtering • A median filter operates over a window by selecting the median intensity

Review: Image formation - Computer Sciencelazebnik/spring11/lec04_color.pdf · Review: Image formation • What determines the brightness of an image pixel? Light sourceLight source

Image segmentation - Computer Sciencelazebnik/spring09/lec24_segmentation.pdf · Mean shift clustering and segmentation • An advanced and versatile technique for clustering-based

Poisson Image Editing - Computer Sciencelazebnik/research/fall08/jia_pan.pdfComposition Image I is generated by foreground and background with alpha matte I = alpha*F + (1-alpha)*B

More details on presentations - Computer Sciencelazebnik/research/fall08/lec02_texture.pdfMore details on presentations • Aim to speak for ~50 min (after 15 min review, leaving 10

Two-view geometry - Computer Sciencelazebnik/spring09/lec12_epipolar.pdf · • Center the image data at the origin, and scale it so the mean squared distance between the origin and

Blob detection - Computer Sciencelazebnik/spring11/lec08_blob.pdf · Blob detection in 2D Laplacian of Gaussian: Circularly symmetric operator for blob detection in 2D ...

The Camera - Computer Sciencelazebnik/research/spring08/lec02_camera.pdf · Let’s design a camera • Idea 1: put a piece of film in front of an object ... Masaccio, Trinity, Santa

Tour Into the Picture - Computer Sciencelazebnik/research/fall08/haohan...YOUICHI HORRY, KEN-ICHI ANJYO, KIYOSHI ARAI PROCEEDINGS OF THE 24TH ANNUAL CONFERENCE ON COMPUTER GRAPHICS

Poisson Image Editing - Computer Sciencelazebnik/research/fall08/jia_pan.pdfComposition Image I is generated by foreground and background with alpha matte I = alphaF + (1-alpha)B