Class 6: Attributes and Semantic Features

Rogerio Feris, March 6, 2014

EECS 6890 – Topics in Information Processing, Spring 2014, Columbia University

http://rogerioferis.com/VisualRecognitionAndSearch2014


Visual Recognition And Search Columbia University, Spring 2014

Paper Review Reminder

Paper review due March 11 (solo, no groups):

Perronnin et al, Improving the Fisher Kernel for Large-Scale Image Classification, ECCV 2010

You can use up to 3 late days over the course of the semester

Required content (1-2 pages):

Summary

Strengths and Weaknesses

Experimental Analysis

Proposed Extensions

Check more details at:

http://rogerioferis.com/VisualRecognitionAndSearch2014/PaperReviews.html


Project Update Reminder

Project Update Presentation: March 25/27

Milestones, preliminary results.

More information about the project update requirements coming soon.


What we have seen so far

Low-Level Features

SIFT, SURF, HOG, BRISK, etc.

Feature Coding and Pooling

Bag-of-words, Sparse coding, Fisher vector coding, etc.

Encoding Structure: Part-based Models

Deformable Part-based Models, Poselets, etc.

Attributes And Semantic Features [Today]

Part I: From Low-level to Semantic Visual Representations


Introduction to Semantic Features

Use the scores of semantic classifiers as high-level features

Semantic Features

Off-the-shelf Classifiers

Compact / powerful descriptor with semantic meaning (allows “explaining” the decision)

[Figure: the input image is scored by water, sand, and sky classifiers; the three scores are the features passed to a beach classifier]
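The idea on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the three-dimensional low-level feature, the classifier weights, and the names `CONCEPT_CLASSIFIERS`, `semantic_features`, and `beach_score` are all hypothetical, not taken from the lecture.

```python
import math

def linear_score(weights, bias, x):
    """Sigmoid score of one linear classifier."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical off-the-shelf concept classifiers over a 3-D low-level feature
# (assumed here to be mean blue, yellow, and green intensity in [0, 1]).
CONCEPT_CLASSIFIERS = {
    "water": ([4.0, -2.0, -1.0], -1.0),
    "sand":  ([-2.0, 5.0, -2.0], -1.0),
    "sky":   ([3.0, -1.0, -3.0], -0.5),
}

def semantic_features(x):
    """Stack the concept scores into a short, interpretable descriptor."""
    return [linear_score(w, b, x) for w, b in CONCEPT_CLASSIFIERS.values()]

# Hypothetical scene classifier that sees only the semantic features.
BEACH_WEIGHTS, BEACH_BIAS = [2.0, 2.0, 1.0], -2.5

def beach_score(x):
    return linear_score(BEACH_WEIGHTS, BEACH_BIAS, semantic_features(x))

beach_like = [0.8, 0.7, 0.1]   # lots of blue and yellow
forest_like = [0.1, 0.1, 0.9]  # mostly green
print(semantic_features(beach_like), beach_score(beach_like))
print(semantic_features(forest_like), beach_score(forest_like))
```

Because every dimension of the descriptor is a named concept score, a high beach score can be "explained" by pointing at the water, sand, and sky scores that produced it.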


Semantic Features (Frame-Level)

Illustration of early IBM work (multimedia community) describing this concept

[John Smith et al, Multimedia Semantic Indexing Using Model Vectors, ICME 2003]

Concatenation / Dimensionality Reduction


Semantic Features (Frame-level)

System evolved to the IBM Multimedia Analysis and Retrieval System (IMARS)

Ensemble Learning

Rapid event modeling, e.g., “accident with high-speed skidding”

Discriminative semantic basis [Rong Yan et al, Model-Shared Subspace Boosting for Multi-label Classification, KDD 2007]


Classemes (Frame-level)

[L. Torresani et al, Efficient Object Category Recognition Using Classemes, ECCV 2010]

Noisy Labels

Images used to train the “table” classeme (from Google image search)

Descriptor is formed by concatenating the outputs of weakly trained classifiers called classemes (trained with noisy labels)
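A rough sketch of such a descriptor follows. The weak classifiers, their weights, and the 2-D input feature are hypothetical, and the sign-based binarization with Hamming comparison is an assumed compaction step added for illustration rather than something stated on the slide.

```python
def classeme_descriptor(x, classifiers):
    """Concatenate the raw outputs of many weak linear classifiers."""
    return [sum(w * xi for w, xi in zip(weights, x)) + bias
            for weights, bias in classifiers]

def binarize(descriptor):
    """Assumed compaction step: keep only the sign of each output."""
    return [1 if v > 0 else 0 for v in descriptor]

def hamming(a, b):
    """Number of positions where two binary descriptors differ."""
    return sum(ai != bi for ai, bi in zip(a, b))

# Hypothetical weak classifiers (weights, bias) over a 2-D low-level feature.
classifiers = [([1.0, 0.0], -0.5), ([0.0, 1.0], -0.5),
               ([1.0, 1.0], -1.0), ([1.0, -1.0], 0.0)]

img_a = binarize(classeme_descriptor([0.9, 0.2], classifiers))
img_b = binarize(classeme_descriptor([0.8, 0.3], classifiers))
img_c = binarize(classeme_descriptor([0.1, 0.7], classifiers))
print(hamming(img_a, img_b), hamming(img_a, img_c))
```

The individual classifiers need not be accurate; the descriptor is useful as long as similar images produce similar output patterns.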


Classemes (Frame-level)

Compact and efficient descriptor, useful for large-scale classification

Features are not really semantic!


Semantic Features (Object Level)

Object Bank

http://vision.stanford.edu/projects/objectbank/

[Li-Jia Li et al, Object Bank: A High-Level Image Representation for Scene Classification and Semantic Feature Sparsification, NIPS 2010]

Source code available (~7 seconds per image)


Shifting from Naming to Describing:

Representations based on Semantic Attributes


Semantic Attributes

Modifiers rather than (or in addition to) nouns

Semantic properties that are shared among objects

Attributes are category-independent and transferable

[Figure: Naming (?) vs. Describing (Bald, Beard, Red Shirt)]


Examples of Semantic Attributes

http://whatbird.com


Examples of Semantic Attributes

[Lampert et al, Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer, CVPR 2009]


Examples of Semantic Attributes

[Farhadi et al, Describing Objects by their Attributes, CVPR 2009]


Examples of Semantic Attributes

[Berg et al, Automatic Attribute Discovery and Characterization, ECCV 2010]


Examples of Semantic Attributes

[Chen et al, Describing Clothing by Semantic Attributes, ECCV 2012]


Examples of Semantic Attributes

http://www.galaxyzoo.org/


Attribute Models

Slide credit: Devi Parikh

[Kumar et al., Describable Visual Attributes for Face Verification and Image Search, PAMI 2011]

(Or confidence)

Binary Attributes


Attribute Models

Slide credit: Devi Parikh

> natural

< smiling

Parikh and Grauman, Relative Attributes, ICCV 2011

Max-margin learning to rank formulation of Joachims 2002

Relative Attributes
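A minimal sketch of the max-margin ranking idea, assuming ordered pairs only: it learns a weight vector w so that w·x is larger for the image showing more of the attribute. The similarity constraints of the full formulation are omitted, plain subgradient descent stands in for a proper solver, and the features, the names `train_ranker` and `faces`, and the hyperparameters are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_ranker(pairs, dim, reg=0.01, lr=0.1, epochs=200):
    """Subgradient descent on a pairwise hinge loss.

    pairs: list of (x_more, x_less), meaning x_more shows the attribute
    more strongly than x_less.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for x_more, x_less in pairs:
            diff = [a - b for a, b in zip(x_more, x_less)]
            # L2 regularization shrinks w on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if dot(w, diff) < 1.0:  # margin violated: push the pair apart
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w

# Hypothetical 2-D features; the first dimension loosely tracks "smiling".
faces = {"a": [0.9, 0.3], "b": [0.5, 0.8], "c": [0.1, 0.4]}
pairs = [(faces["a"], faces["b"]), (faces["b"], faces["c"]),
         (faces["a"], faces["c"])]

w = train_ranker(pairs, dim=2)
ranking = sorted(faces, key=lambda k: dot(w, faces[k]), reverse=True)
print(ranking)
```

The learned score w·x is real-valued, so it can order unseen images by attribute strength instead of only labeling the attribute as present or absent.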


Attribute-Based Classification

Scalable Learning


Attribute-based Classification

[Lampert et al, Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer, CVPR 2009]

Recognition of Unseen Classes (Zero-Shot Learning)

1) Train semantic attribute classifiers

2) Obtain a classifier for an unseen object (no training samples) by just specifying which attributes it has
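The two steps above can be sketched as follows. The sketch assumes step 1 has already produced per-attribute probabilities for a test image, treats attributes as independent, and ignores attribute priors; the classes, attributes, and numbers are hypothetical.

```python
import math

# Class-attribute associations for unseen classes (hypothetical values);
# attribute order is (striped, has_hooves, lives_in_water).
UNSEEN_CLASSES = {
    "zebra":   [1, 1, 0],
    "dolphin": [0, 0, 1],
    "horse":   [0, 1, 0],
}

def class_log_likelihood(attr_probs, signature):
    """Sum of log p(a_m = signature_m | x), treating attributes as independent."""
    eps = 1e-9
    total = 0.0
    for p, a in zip(attr_probs, signature):
        total += math.log(max(p if a == 1 else 1.0 - p, eps))
    return total

def zero_shot_predict(attr_probs, classes=UNSEEN_CLASSES):
    """Pick the class whose attribute signature best matches the predictions."""
    return max(classes,
               key=lambda c: class_log_likelihood(attr_probs, classes[c]))

# Step 1 output (assumed): probabilities from pre-trained attribute classifiers.
attr_probs = [0.9, 0.8, 0.1]
prediction = zero_shot_predict(attr_probs)
print(prediction)
```

No image of a zebra, dolphin, or horse is needed to build this classifier; adding a new class only requires writing down its attribute signature.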


Zero-Shot Learning

[Figure: flat multi-class classification vs. attribute-based classification, in which semantic attribute classifiers link the training classes to unseen categories]


Class-Attribute Associations

Manual Specification of Class-Attribute Associations


Class-Attribute Associations

Associations may be extracted automatically from other sources

Rohrbach et al. "What Helps Where – And Why? Semantic Relatedness for Knowledge Transfer", CVPR 2010


Label Embedding

Label Embedding Framework

Manual Specification of Attributes

Akata et al. "Label Embedding for Attribute-based Classification", CVPR 2013
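As a rough sketch of the label embedding framework: each class y is embedded as an attribute vector phi(y), each image as a feature vector theta(x), and a bilinear compatibility F(x, y; W) = theta(x)^T W phi(y) scores how well they match. The matrix W below is a hand-picked hypothetical stand-in for a learned one, and the classes and features are made up for illustration.

```python
def compatibility(theta_x, W, phi_y):
    """F(x, y; W) = theta(x)^T W phi(y)."""
    projected = [sum(theta_x[i] * W[i][j] for i in range(len(theta_x)))
                 for j in range(len(phi_y))]
    return sum(p * a for p, a in zip(projected, phi_y))

def predict(theta_x, W, class_embeddings):
    """Return the class whose embedding is most compatible with the image."""
    return max(class_embeddings,
               key=lambda y: compatibility(theta_x, W, class_embeddings[y]))

# Hypothetical attribute embeddings phi(y): (has_stripes, has_wings).
class_embeddings = {"zebra": [1.0, 0.0], "eagle": [0.0, 1.0], "bee": [1.0, 1.0]}

# Hypothetical W mapping a 3-D image feature into attribute space;
# in practice W would be learned, e.g. with a ranking objective.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [-0.5, -0.5]]

label = predict([0.9, 0.1, 0.4], W, class_embeddings)
print(label)
```

Since classes enter only through phi(y), a class with no training images can still be scored as long as its attribute vector is specified.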


Label Embedding

Frome et al. "DeViSE: A Deep Visual-Semantic Embedding Model", NIPS 2013

Label Embedding Framework

Automatic Discovery of word associations


Label Embedding

Language Model Source Code: https://code.google.com/p/word2vec/

Zero-Shot Learning / Semantically close mistakes

Label Embedding Framework

Automatic Discovery of word associations
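A small sketch of prediction in a word-embedding label space, under several assumptions: the 3-D word vectors are toy values (real ones would come from a language model such as word2vec), the image embedding is taken as given, and cosine similarity is used as the matching score.

```python
import math

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den

# Hypothetical 3-D word vectors for three labels.
WORD_VECTORS = {
    "tiger":   [0.9, 0.1, 0.0],
    "leopard": [0.8, 0.3, 0.1],   # label with no training images
    "truck":   [0.0, 0.2, 0.9],
}

def rank_labels(image_embedding, word_vectors=WORD_VECTORS):
    """Sort labels by cosine similarity to the projected image."""
    return sorted(word_vectors,
                  key=lambda w: cosine(image_embedding, word_vectors[w]),
                  reverse=True)

# Assumed output of a visual model projected into the word-vector space.
image_embedding = [0.75, 0.35, 0.05]
ranked = rank_labels(image_embedding)
print(ranked)
```

The label without training images can still be ranked first because it has a word vector, and the runner-up is a semantically close label rather than an unrelated one.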


Attributes as mid-level features

Face verification [Kumar et al, ICCV 2009]

Action recognition [Liu et al, CVPR 2011]

Semantic attributes + discriminative (non-semantic) features


Attributes as mid-level features

Person Re-identification [Layne et al, BMVC 2012]

Bird Categorization [Farrell et al, ICCV 2011]


Attributes as mid-level features

[Dhar et al, High Level Describable Attributes for Predicting Aesthetics and Interestingness, CVPR 2011]


Attributes as mid-level features

Slide credit: Tamara Berg

Detecting Interesting Insects


Attributes as mid-level features

Slide credit: Tamara Berg

Detecting Interesting Beaches


Attributes as mid-level features

Note: Several recent methods use the term “attributes” to refer to non-semantic model outputs. In this case, attributes are just mid-level features, like PCA, hidden layers in neural nets, … (non-interpretable splits)


Attributes for Fine-Grained Categorization


Fine-Grained Categorization


Fine-Grained Categorization

Visipedia (http://visipedia.org/)

Machines collaborating with humans to organize visual knowledge, connecting text to images, images to text, and images to images

Easy annotation interface for experts (powered by computer vision)

Picture credit: Serge Belongie

Visual Query: Fine-grained Bird Categorization


Fine-Grained Categorization

Slide Credit: Christoph Lampert

[Figure: African elephant and Indian elephant]

Is it an African or Indian Elephant?

Example-based Fine-Grained Categorization is Hard!!


Fine-Grained Categorization

Is it an African or Indian Elephant?

Visual distinction of subordinate categories may be quite subtle, usually based on Parts and Attributes

African: Larger Ears; Indian: Smaller Ears


Fine-Grained Categorization

Codebook

Standard classification methods may not be suitable because the variation between classes is small and intra-class variation is still high. [B. Yao, CVPR 2012]


Fine-Grained Categorization

Humans rely on field guides!

Field guides usually refer to parts and attributes of the object

Slide Credit: Pietro Perona


Fine-Grained Categorization

[Branson et al, Visual Recognition with Humans in the Loop, ECCV 2010]


Computer vision reduces the amount of human interaction (minimizes the number of questions)
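One way to picture how vision can cut down the number of questions is the sketch below: the vision system supplies an initial class posterior, each yes/no answer updates it with Bayes' rule, and the next question is the one with the lowest expected remaining entropy. The classes, questions, answer probabilities, and function names are hypothetical and only illustrate the principle.

```python
import math

def entropy(p):
    return -sum(v * math.log(v) for v in p.values() if v > 0)

def update(posterior, answer_model, question, answer):
    """Bayes update: p(c | answer) is proportional to p(answer | c) p(c)."""
    unnorm = {c: posterior[c] * (answer_model[c][question] if answer
                                 else 1.0 - answer_model[c][question])
              for c in posterior}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

def expected_entropy(posterior, answer_model, question):
    """Average entropy after asking, weighted by the answer probabilities."""
    p_yes = sum(posterior[c] * answer_model[c][question] for c in posterior)
    h_yes = entropy(update(posterior, answer_model, question, True))
    h_no = entropy(update(posterior, answer_model, question, False))
    return p_yes * h_yes + (1.0 - p_yes) * h_no

def next_question(posterior, answer_model, questions):
    """Ask the question that minimizes expected remaining uncertainty."""
    return min(questions,
               key=lambda q: expected_entropy(posterior, answer_model, q))

# Hypothetical p(user answers "yes" | class) for two yes/no questions.
answer_model = {
    "cardinal": {"is_red": 0.95, "has_crest": 0.90},
    "tanager":  {"is_red": 0.95, "has_crest": 0.05},
    "bluejay":  {"is_red": 0.05, "has_crest": 0.90},
}
# Assumed computer-vision posterior: the image already looks red.
vision_posterior = {"cardinal": 0.48, "tanager": 0.48, "bluejay": 0.04}

question = next_question(vision_posterior, answer_model, ["is_red", "has_crest"])
after_yes = update(vision_posterior, answer_model, question, True)
print(question, max(after_yes, key=after_yes.get))
```

Because the vision posterior has already nearly ruled out the non-red class, asking about color would add little, so the crest question is selected first.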


Fine-Grained Categorization

[Wah et al, Multiclass Recognition and Part Localization with Humans in the Loop, ICCV 2011]

Localized part and attribute detectors.

Questions include asking the user to localize parts.


Fine-Grained Categorization

Video Demo


Fine-Grained Categorization

http://www.vision.caltech.edu/visipedia/CUB-200-2011.html


Fine-Grained Categorization

Check the fine-grained visual categorization workshop: http://www.fgvc.org/


Fine-Grained Categorization

Is fine-grained recognition different? Check https://sites.google.com/site/fgcomp2013/


Attribute-Based Search


People Search in Surveillance Videos

Traditional Approaches: Face Recognition (“Naming”)

Face recognition is very challenging under lighting changes, pose variation, and low-resolution imagery (typical conditions in surveillance scenarios)

Attribute-based People Search (“Describing”)

[Vaquero et al, Attribute-based People Search in Surveillance Environments, WACV 2009]

Rather than relying on face recognition only, a complementary people search framework based on semantic attributes is provided

Query Example:

“Show me all bald people at the 42nd street station last month with dark skin, wearing sunglasses, wearing a red jacket”
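A query like this one can be pictured as a filter over stored detections followed by a ranking on attribute confidence. The record layout, camera names, attribute names, threshold, and the `search` function are hypothetical and do not describe the actual system.

```python
from datetime import datetime

# Hypothetical detection records: one entry per detected person, with
# attribute detector confidences in [0, 1].
detections = [
    {"id": 1, "camera": "42nd-street", "time": datetime(2014, 2, 10, 8, 30),
     "attrs": {"bald": 0.9, "sunglasses": 0.8, "red_jacket": 0.7}},
    {"id": 2, "camera": "42nd-street", "time": datetime(2014, 2, 11, 9, 0),
     "attrs": {"bald": 0.2, "sunglasses": 0.9, "red_jacket": 0.8}},
    {"id": 3, "camera": "14th-street", "time": datetime(2014, 2, 12, 17, 45),
     "attrs": {"bald": 0.95, "sunglasses": 0.9, "red_jacket": 0.9}},
]

def search(detections, camera, start, end, required, threshold=0.5):
    """Return matching ids, ranked by their weakest required attribute score."""
    hits = []
    for d in detections:
        if d["camera"] != camera or not (start <= d["time"] < end):
            continue
        scores = [d["attrs"].get(a, 0.0) for a in required]
        if min(scores) >= threshold:
            hits.append((min(scores), d["id"]))
    return [det_id for _, det_id in sorted(hits, reverse=True)]

results = search(detections, "42nd-street",
                 datetime(2014, 2, 1), datetime(2014, 3, 1),
                 ["bald", "sunglasses", "red_jacket"])
print(results)
```

Only the description is needed at query time; no image of the person being sought enters the search.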


People Search in Surveillance Videos

Feris et al, ICMR 2014


People Search in Surveillance Videos

Boston Bombing Event

“Show me all images of people matching the suspect description from time X to time Y from all cameras in area Z.”

Ability to spot a person with, e.g., a white hat in a crowded scene

Suspect #1 found in 4 images in top 8 results

Suspect #2 found in 3 images in top page

1071 detected faces from 50 high-res Boston images (all from Flickr)


People Search in Surveillance Videos

People search based on textual descriptions: it does not require training images for the target suspect.

Robustness: attribute detectors are trained using lots of training images covering different lighting conditions, pose variation, etc.

Works well in low-resolution imagery (typical in video surveillance scenarios)


People Search in Surveillance Videos

[Siddiquie, Feris and Davis, “Image Ranking and Retrieval Based on Multi-Attribute Queries”, CVPR 2011]

Modeling attribute correlations


MugHunt Demo

http://mughunt.securics.com/


Whittle Search

Slide credit: Kristen Grauman


Whittle Search

Check the Whittle Search demo at: http://godel.ece.vt.edu/whittle/


Resources

http://rogerioferis.com/VisualRecognitionAndSearch2014/Resources.html


http://rogerioferis.com/PartsAndAttributes/

http://pub.ist.ac.at/~chl/PnA2012/