Are Categories Necessary in a Data-Rich World? Alexei (Alyosha) Efros CMU Joint work with Tomasz...

Are Categories Necessaryin a Data-Rich World?

Alexei (Alyosha) EfrosCMU

Joint work with Tomasz Malisiewicz

Acknowledgements

Talks by Moshe Bar; writings of Shimon Edelman

Murphy Big Book of Concepts

Weinberger Everything is Miscellaneous

Many great discussions with many colleagues, especially Tomasz Malisiewicz, James Hays, and Derek Hoiem

“Unreasonable Effectiveness of Data”

• Parts of our world can be explained by elegant mathematics:– physics, chemistry, astronomy, etc.

• But much cannot:– psychology, genetics, economics, etc.

• Enter: The Magic of Big Data– Great advances in several fields:

• e.g. speech recognition, machine translation, search engines

[Halevy, Norvig, Pereira 2009]

Categorization vs. The Data

Philosophy and PsychologyPhilosophy and Psychology

LanguageLanguageArts and recreationArts and recreation

LiteratureLiterature

TechnologyTechnology ReligionReligion

Weinberger, Everything is Miscellaneous

categorization is losing…

What’s in a name?

“…That which we call a roseBy any other name would smell as sweet.”

“chair” category (PASCAL VOC)

“train” category (PASCAL VOC)

Why Categorize?

1. Knowledge Transfer

2. Communication

Tigercat

Leopard

Classical View of Categories

•Dates back to Plato & Aristotle 1. Categories are defined by a

list of properties shared by all elements in a category

2. Category membership is binary

3. Every member in the category is equal

Problems with Classical View• Humans don’t do this!

– People don’t rely on abstract definitions / lists of shared properties (Wittgenstein 1953, Rosch 1973)• e.g. define the properties shared by all “games” • e.g. are curtains furniture? Are olives fruit?

– Typicality• e.g. Chicken -> bird, but bird -> eagle, pigeon, etc.

– Language-dependent• e.g. “Women, Fire, and Dangerous Things” category is

Australian aboriginal language (Lakoff 1987)

– Doesn’t work even in human-defined domains• e.g. Is Pluto a planet?

Visual Problems with Categories• A lot of categories are

functional

• Categories are 3D, but images are 2D

• World is too varied

Typical HOG car detector

Felzenszwalb et al, PASCAL 2007

Why not?

submitted to CVPR 2011

Semantic -> Visual Categories

Aspect ratiosplits

“parts” Poselets

All use a priori domain information

Fundamental Problem with Categorization

Making decisions too early!Like Amazinn.com, can we just categorize at

run-time, once we know the task!

On-the-fly Categorization?

1. Knowledge Transfer

2. Communication

Association instead of categorization

Ask not “what is this?”, ask “what is this like”

– Moshe Bar

• Exemplar Theory (Medin & Schaffer 1978, Nosofsky 1986, Krushke 1992)–categories represented in terms of remembered objects

(exemplars)–Similarity is measured between input and all exemplars–think non-parametric density estimation

“What is this?”

CarCarCar

Building

Input Image

He 2004, Tu 2004, Shotton 2006, Galleguillos 2008, Fei-Fei 2009, Gould 2009, etc.

“What is this like?”

Malisiewicz & Efros, CVPR’08

What is the ultimate goal?

• Understanding / Parsing Images

• A “what is it like?” machine

Our Contributions

• Posing Recognition as Association–Use large number of object exemplars

•Learning Object Similarity–Different distance function per exemplar

•Recognition-Based Object Segmentation– Use multiple segmentation approach

Visual Associations

• How are objects similar?

Distance “Similarity” Functions

• Positive Linear Combinations of Elementary Distances Computed Over 14 Features

Building e Distance Function

Building e

Learning Object Similarity

• Learn a different distance function for each exemplar in training set

• Formulation is similar to Frome et al [1,2][1] Andrea Frome, Yoram Singer, Jitendra Malik. "Image Retrieval and Recognition Using Local Distance Functions." In NIPS, 2006.

[2] Andrea Frome, Yoram Singer, Fei Sha, Jitendra Malik. "Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification." In ICCV, 2007.

Non-parametric density estimation

Color Dimension

Class 1Class 2

Class 3

Color Dimension

Class 1Class 2

Class 3

Color Dimension

Class 1Class 2

Class 3

Learning Distance Functions

28Dshape

Dcolor

Focal Exemplar

“similar” side

DecisionDecisionBoundaryBoundary

“dissimilar” side

Don’t Care

Visualizing Distance Functions (Training Set)

Top Neighbors with Tex-Hist Dist

Top Neighbors with Learned Dist

Visualizing Distance Functions (Training Set)

Labels Crossing Boundary

Object Segmentation via Recognition

• Generate Multiple Segmentations (Hoiem 2005, Russell 2006, Malisiewicz 2007)

– Mean-Shift and Normalized Cuts

– Use pairs and triplets of adjacent segments

– Generate about 10,000 segments per image

• Enhance training with bad segments

• Apply learned distance functions to bottom-up segments

Example AssociationsBottom-Up Segments

Quantitative Evaluation

Object hypothesis is correct if labels match and OS > .5

*We do not penalize for multiple correct overlapping associations

OS(A,B) = Overlap Score = intersection(A,B) / union(A,B)

Towards Image Parsing

Need for Context 35

Image Parsing with Context

Bush’s Memex (1945)• Store publications, correspondence, personal work, on

microfilm• Items retrieved rapidly using index codes

–Builds on “rapid selector”• Can annotate text with margin notes, comments• Can construct a trail through the material and save it

–Roots of hypertext• Acts as an external memory

Visual Memex, a proposal[Malisiewicz & Efros]

Nodes = instancesEdges = associations

types of edges:• visual similarity• spatial, temporal co-occurrence• geometric structure• language• geography•..

New object

Torralba’s Context Challenge

Slide by Antonio Torralba

Torralba’s Context Challenge

Chance ~ 1/30000 Slide by Antonio Torralba

Our Challenge Setup

Malisiewicz & Efros, NIPS’09

3 models

• Visual Memex: exemplars, non-parametric object-object relationships– Recurse through the graph

• Baseline: CoLA: categories, parametric object-object relationships

• Reduced Memex: categories, non-parametric relationships

Qual. results

Quant. results

Next Step: top-down segmentation

Visual Memex

Take Home Message

• Categorization is not a goal in itself–Rather, it is a means for transferring knowledge onto a new instance

• Skipping explicit categorization might make things easier, not harder–The “harder intermediate problem” syndrome

• Keeping around all your data isn’t so bad… –you never know when you will need it

Questions?

Are Categories Necessary in a Data-Rich World? Alexei (Alyosha) Efros CMU Joint work with Tomasz...

Documents

Transcript of Are Categories Necessary in a Data-Rich World? Alexei (Alyosha) Efros CMU Joint work with Tomasz...

Learning Category-Specific Mesh Reconstruction from Image ... · Angjoo Kanazawa ∗, Shubham Tulsiani , Alexei A. Efros, Jitendra Malik University of California, Berkeley {kanazawa,shubhtuls,efros,malik}@eecs.berkeley.edu

Wrap Up 15-463: Rendering and Image Processing Alexei Efros.

CS143, Brown James Hays Stereo and Structure from Motion Many slides by Kristen Grauman, Robert Collins, Derek Hoiem, Alyosha Efros, and Svetlana Lazebnik.

Image Mosaics and Panoramas Building a Panorama - IDC · 1 1 Image Mosaics and Panoramas Lots of slides stolen from Bill Freeman, who stole them from Alyosha Efros, who stole them

Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

HOGgles: Visualizing Object Detection Features...HOGgles: Visualizing Object Detection Features∗ Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz, Antonio Torralba Massachusetts

Richard Zhang Phillip Isola Alexei A. Efros Berkeley AI Research … · 2016. 11. 30. · Richard Zhang Phillip Isola Alexei A. Efros Berkeley AI Research (BAIR) Laboratory University

16-824: Learning-based Methods in Vision Instructors: Alexei (Alyosha) Efros efros@cs.cmu.edu, 225 Smith Hall efros@cs.cmu.edu Leon Sigal lsigal@disneyresearch.com,

Recognizing Action at a Distance - Computer Graphicsgraphics.cs.cmu.edu/people/efros/research/action/efros... · 2005-06-03 · motion descriptors nearest appearance neighbor(s) Retrieve

Holiness and Glory in the Bible- Efros, Israel

CS 194/294-26: Image Manipulation and Computational ...cs194-26/fa16/Lectures/... · A bit about me Alexei (Alyosha) Efros Moved (back) to UC Berkeley in 2013, after decade at CMU

Fcv scene efros

Learning and Inference in Vision: from Features to Scene Understanding Jonathan Huang, Tomasz Malisiewicz MLD Student Research Symposium, 2009.

Matting & Compositingvision.cse.psu.edu/courses/CompPhoto/basicmatting.pdf · – Page layout: extract objects, magazine covers. Motivation Slide from Alyosha Efros. Motivation From

Automatic Photo Pop-up Derek Hoiem Alexei A. Efros Martial Hebert.

Image Warping Computational Photography Derek Hoiem, University of Illinois 09/27/12 Many slides from Alyosha Efros + Steve Seitz Photo by Sean Carroll.

Face Collections 15-463: Rendering and Image Processing Alexei Efros.

Texture Synthesis by Non-parametric Sampling / Image Quilting for Texture Synthesis & Transfer by Efros and Leung / Efros and Freeman ICCV ’99 / SIGGRAPH.

Image Morphing 15-463: Rendering and Image Processing Alexei Efros.

Computational Photography CS498dwh - Course Websites 10... · Image Warping Computational Photography Derek Hoiem, University of Illinois 09/28/17 Many slides from Alyosha Efros +