Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by...
![Page 1: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/1.jpg)
Large-Scale Visual Recognition Powered by Big Data and Big Crowd
Fei-Fei Li
Stanford University
![Page 2: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/2.jpg)
Prof. Kai Li Princeton U.
Prof. Alex Berg Stony Brook U.
Jonathan Krause Stanford U.
Sanjeev Satheesh Stanford U.
Zhiheng Huang Stanford U.
Olga Russakovsky Stanford U.
Dr. Jia Deng Stanford U. -> U. Michigan
![Page 3: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/3.jpg)
Build a computer to recognize EVERYTHING
![Page 4: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/4.jpg)
Recognition Engine
Surveillance Robotics Assistive tools
Wearable devices Driverless cars
Mining social media Image search Smart photo album
![Page 5: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/5.jpg)
What can computers already recognize?
![Page 6: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/6.jpg)
![Page 7: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/7.jpg)
![Page 8: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/8.jpg)
But when it comes to generic objects in the world…
![Page 9: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/9.jpg)
What about gas pumps?
But when it comes to generic objects in the world…
![Page 10: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/10.jpg)
20 object classes: PASCAL VOC [Everingham et al. 2006-2012]
Airplane Bird Boat Bike Bottle Bus Car Cat Chair Cow
Dining table Dog Horse Motorbike Person Potted plant Sheep Sofa Train TV monitor
But when it comes to generic objects in the world…
![Page 11: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/11.jpg)
How many things are there?
3.5M+ unique tags [Sigurbjörnsson & Zwol ’08]
WordNet
80K+ English nouns [Miller ’95; Fellbaum ’98]
60K+ product categories
4.1M+ articles
10K+ [Biederman ’87]
20 [Everingham ’06-’12]
PASCAL VOC
![Page 12: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/12.jpg)
PASCAL VOC [Everingham et al. 2006-2012]
From PASCAL’s 20 classes to Millions?
![Page 13: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/13.jpg)
![Page 14: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/14.jpg)
Agenda
How to build a large-scale recognition engine using big data
STEP 1:
STEP 2:
STEP 3:
?
?
?
![Page 15: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/15.jpg)
Agenda
How to build a large-scale recognition engine using big data
STEP 1:
STEP 2:
STEP 3:
Build a Large Knowledge Base
?
?
![Page 16: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/16.jpg)
Get a list of everything
Crawl the web
WordNet
80K nouns
• Expert constructed • Rich structure
• Taxonomy, Partonomy • Widely used
[Torralba, Fergus, Freeman ’08] [Yao, Yang, Zhu ’07] [Everingham et al ’06] [Russell et al ’05] [Griffin & Perona ’03] [Fei-Fei, Fergus, Perona ’03]
![Page 17: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/17.jpg)
![Page 18: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/18.jpg)
![Page 19: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/19.jpg)
Crawl the web
WordNet
80K nouns
• Expert constructed • Rich structure
• Taxonomy, Partonomy • Senses disambiguated • Widely used
[Torralba, Fergus, Freeman ’08] [Yao, Yang, Zhu ’07] [Everingham et al ’06] [Russell et al ’05] [Griffin & Perona ’03] [Fei-Fei, Fergus, Perona ’03]
Clean up
Get a list of everything
![Page 20: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/20.jpg)
Graduate Students The Crowd
Very few of them
Good at complex tasks
Good quality
High cost
Estimate: 20 Years, $2M+
![Page 21: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/21.jpg)
Graduate Students The Crowd
Very few of them
Good at complex tasks
Good quality
High cost
![Page 22: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/22.jpg)
| Graduate Students | The Crowd |
| --- | --- |
| Very few of them | Many of them |
| Good at complex tasks | Good at simple tasks |
| Good quality | Mixed quality |
| High cost | Low cost |
| … | … |
![Page 23: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/23.jpg)
![Page 24: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/24.jpg)
![Page 25: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/25.jpg)
22,000 categories and 14,000,000+ images
www.image-net.org [Deng et al. 2009]
• Animals • Bird • Fish • Mammal • Invertebrate
• Plants • Tree • Flower
• Food • Materials
• Structures • Artifact
• Tools • Appliances • Structures
• Person • Scenes
• Indoor • Geological Formations
• Sport Activities
![Page 26: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/26.jpg)
Number of Labeled Images
Caltech101, 9K [Fei-Fei, Fergus, Perona ’03]
PASCAL VOC, 30K [Everingham et al. ’06-’12]
LabelMe, 37K [Russell et al. ’07]
SUN, 131K [Xiao et al. ’10]
ImageNet, 14M [Deng et al. ’09]
![Page 27: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/27.jpg)
Number of images in ImageNet, Jan-08 to May-11 (chart; labeled values: 0M, 3M, 10M, 11M, 12M, 14M)
hired 50K+ AMT workers
who looked at 160M+ images
and made 550M+ binary decisions
U.S. economy outlook (Gallup)
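One common way to turn many noisy binary decisions from crowd workers into a single label per image is majority voting. The sketch below is only an illustration of that idea, not the ImageNet pipeline itself; the function name `aggregate_votes`, the `min_votes` and `threshold` parameters, and the image IDs are all hypothetical.

```python
def aggregate_votes(votes, min_votes=3, threshold=0.5):
    """Combine binary worker decisions into one label per image.

    votes: dict mapping image_id -> list of bool ("does this image show the category?").
    Images with fewer than min_votes decisions are skipped (assumed to need more workers).
    """
    accepted = {}
    for image_id, decisions in votes.items():
        if len(decisions) < min_votes:
            continue  # not enough evidence yet
        yes_fraction = sum(decisions) / len(decisions)
        accepted[image_id] = yes_fraction > threshold
    return accepted


# Hypothetical votes from three workers on two images, one worker on a third.
votes = {
    "img_001": [True, True, False],
    "img_002": [False, False, True],
    "img_003": [True],
}
labels = aggregate_votes(votes)
print(labels)
```

A fixed vote count is the simplest choice; a real system would likely ask for more votes on ambiguous images, since crowd quality is mixed.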
![Page 28: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/28.jpg)
Le et al. Building high-level features using large scale unsupervised learning. ICML 2012.
Kuettel, Guillaumin, Ferrari. Segmentation Propagation in ImageNet. ECCV 2012
ECCV 2012 Best paper Award
Krizhevsky, Sutskever, Hinton. ImageNet classification with deep convolutional neural networks. NIPS 2012
![Page 29: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/29.jpg)
![Page 30: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/30.jpg)
Agenda
How to build a large-scale recognition engine using big data
STEP 1:
STEP 2:
STEP 3:
Build a Large Knowledge Base (ImageNet)
?
?
![Page 31: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/31.jpg)
Learn to Classify 10K Classes
Deng, Berg, Li, & Fei-Fei, ECCV2010
• 9 Million images
• 4 methods
– SPM+SVM [Lazebnik et al. ’06]
– BOW+SVM [Csurka et al. ’04]
– BOW+NN
– GIST+NN [Oliva et al. ’01]
• Best accuracy: 6.4% for 10K categories
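To make the BOW+NN baseline concrete, here is a minimal sketch: each image's local descriptors are quantized against a codebook into a normalized histogram, and a query is labeled by its nearest training histogram. This is a toy illustration under assumptions, not the ECCV 2010 implementation; the function names, the 2-D descriptors, and the two-word codebook are hypothetical.

```python
import numpy as np


def bow_histogram(descriptors, codebook):
    """Quantize local descriptors to their nearest codeword and return a normalized histogram."""
    # Pairwise distances: (num_descriptors, num_codewords)
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)


def nearest_neighbor_label(query_hist, train_hists, train_labels):
    """Return the label of the training histogram closest to the query (Euclidean distance)."""
    dists = np.linalg.norm(train_hists - query_hist, axis=1)
    return train_labels[int(dists.argmin())]


# Toy data: a 2-word codebook and hand-made 2-D descriptors (all hypothetical).
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
train_a = bow_histogram(np.array([[0.1, 0.0], [0.0, 0.2], [0.9, 1.0]]), codebook)
train_b = bow_histogram(np.array([[1.0, 0.9], [0.8, 1.1], [0.1, 0.1]]), codebook)
train_hists = np.stack([train_a, train_b])
train_labels = ["class_a", "class_b"]

query = bow_histogram(np.array([[0.0, 0.1], [0.2, 0.0]]), codebook)
predicted = nearest_neighbor_label(query, train_hists, train_labels)
print(predicted)
```

At 10K classes and millions of images, brute-force nearest neighbor as written here would be far too slow; the sketch only shows the representation and decision rule.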
![Page 32: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/32.jpg)
Deng, Berg, Li, & Fei-Fei, ECCV2010
Learn to Classify 10K Classes
![Page 33: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/33.jpg)
Fine-grained categories are a lot harder
Deng, Berg, Li, & Fei-Fei, ECCV2010
Chart axis: Average Semantic Distance, from Finer to Coarser (labeled examples: Vehicle, Artifact, Entity)
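One way to quantify semantic distance between two categories in a hierarchy is the height of their lowest common ancestor: the higher you must climb before the two categories meet, the coarser the distinction. The sketch below assumes that definition and uses a small hypothetical tree; the node names other than Entity, Artifact, and Vehicle are made up for illustration.

```python
def ancestors(node, parent):
    """Return the path from node up to the root, starting with node itself."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def height(node, parent):
    """Length of the longest downward path from node to a leaf."""
    children = [child for child, p in parent.items() if p == node]
    if not children:
        return 0
    return 1 + max(height(child, parent) for child in children)


def semantic_distance(a, b, parent):
    """Height of the lowest common ancestor of a and b (assumed definition)."""
    ancestors_b = set(ancestors(b, parent))
    lca = next(n for n in ancestors(a, parent) if n in ancestors_b)
    return height(lca, parent)


# Hypothetical toy hierarchy, given as child -> parent.
parent = {
    "artifact": "entity",
    "mammal": "entity",
    "vehicle": "artifact",
    "tool": "artifact",
    "car": "vehicle",
    "boat": "vehicle",
    "hammer": "tool",
    "zebra": "mammal",
}

print(semantic_distance("car", "boat", parent))    # meet at vehicle
print(semantic_distance("car", "hammer", parent))  # meet at artifact
print(semantic_distance("car", "zebra", parent))   # meet at entity
```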
![Page 34: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/34.jpg)
Agenda
How to build a large-scale recognition engine using big data
STEP 1:
STEP 2:
STEP 3:
Build a Large Knowledge Base (ImageNet)
Fine-Grained Recognition
?
![Page 35: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/35.jpg)
?
What breed is this dog?
Why is Fine-Grained Recognition Difficult?
![Page 36: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/36.jpg)
Cardigan Welsh Corgi
…
Pembroke Welsh Corgi
…
Why is Fine-Grained Recognition Difficult?
?
What breed is this dog?
Key: Find the right features.
![Page 37: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/37.jpg)
Cardigan Welsh Corgi
…
Pembroke Welsh Corgi
…
Learning
Existing Work
[Branson et al. '10]
[Farrell et al. '11]
[Yao et al. ’12]
[Yao et al. ’11]
[Bo et al. '10]
Why is Fine-Grained Recognition Difficult?
![Page 38: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/38.jpg)
Cardigan Welsh Corgi
…
Pembroke Welsh Corgi
…
Learning
Why is Fine-Grained Recognition Difficult?
Existing Work
[Branson et al. '10]
[Farrell et al. '11]
[Yao et al. ’12]
[Yao et al. ’11]
[Bo et al. '10]
![Page 39: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/39.jpg)
Why is Fine-Grained Recognition Difficult?
Cardigan Welsh Corgi
…
Pembroke Welsh Corgi
…
Learning
How to help computers select features?
![Page 40: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/40.jpg)
Machine Crowd
Machine-Crowd Collaboration
KNOWLEDGE
![Page 41: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/41.jpg)
Machine-Crowd Collaboration
Baseline Model
Confusing Class Pairs
Learning with New Knowledge
Annotation Task: Question, Answer
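A plausible way to obtain confusing class pairs from a baseline model is to read them off its confusion matrix: the pairs with the most mutual errors are the ones worth sending to the crowd. The sketch below assumes that approach; the function name, the toy confusion counts, and the third class are hypothetical and not taken from the paper.

```python
import numpy as np


def most_confused_pairs(confusion, class_names, top_k=3):
    """Rank unordered class pairs by how often the baseline mixes them up.

    confusion[i][j] = number of class-i images predicted as class j.
    """
    confusion = np.asarray(confusion, dtype=float)
    mutual = confusion + confusion.T  # count errors in both directions
    np.fill_diagonal(mutual, 0.0)     # ignore correct predictions
    pairs = []
    for i in range(len(class_names)):
        for j in range(i + 1, len(class_names)):
            pairs.append((mutual[i, j], class_names[i], class_names[j]))
    pairs.sort(key=lambda item: item[0], reverse=True)
    return [(a, b) for _, a, b in pairs[:top_k]]


# Hypothetical confusion counts for three classes.
class_names = ["Cardigan Welsh Corgi", "Pembroke Welsh Corgi", "Zebra"]
confusion = [
    [50, 40, 1],
    [35, 60, 0],
    [2, 1, 90],
]
top_pairs = most_confused_pairs(confusion, class_names, top_k=1)
print(top_pairs)
```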
![Page 42: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/42.jpg)
Machine-Crowd Collaboration
Baseline Model
Confusing Class Pairs
Learning with New Knowledge
Annotation Task
![Page 43: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/43.jpg)
Deng, Krause, & Fei-Fei, CVPR2013
![Page 44: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/44.jpg)
Machine-Crowd Collaboration
Baseline Model
Confusing Class Pairs
Learning with New Knowledge
Bubbles Game: Question, Answer
![Page 45: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/45.jpg)
KMeans [Ball & Hall ’67]
Sparse coding [Olshausen & Field ’96]
Random [Coates & Ng ’11, Yao et al. ’12]
BubbleBank
Machine Learning with Crowd-picked Bubbles Classifier (SVM)
Training Images
?
… Linear SVM
Deng, Krause, & Fei-Fei, CVPR2013
Test Image
The BubbleBank Representation
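As a rough sketch of the idea behind a bank of crowd-picked bubbles: treat each bubble as a template descriptor, score it against every patch of an image, and keep the strongest response per bubble. The resulting vector (one dimension per bubble) is what a linear SVM would then be trained on. This is a simplification under assumptions, not the CVPR 2013 implementation: pooling here runs over all patches, and the function name and toy descriptors are hypothetical.

```python
import numpy as np


def bubblebank_features(patch_descriptors, bubble_templates):
    """Max-pooled template responses: one feature per bubble.

    patch_descriptors: (num_patches, dim) descriptors extracted from one image.
    bubble_templates:  (num_bubbles, dim) descriptors of crowd-selected bubbles.
    """
    # Dot-product response of every bubble at every patch: (num_patches, num_bubbles)
    responses = patch_descriptors @ bubble_templates.T
    # Keep the best-matching patch for each bubble.
    return responses.max(axis=0)


# Toy example with 2-D descriptors (hypothetical values).
bubble_templates = np.array([[1.0, 0.0], [0.0, 1.0]])
patch_descriptors = np.array([[0.9, 0.1], [0.2, 0.3]])
features = bubblebank_features(patch_descriptors, bubble_templates)
print(features)
```

The feature vectors of the training images, together with their class labels, would be the input to the linear SVM shown on the slide.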
![Page 46: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/46.jpg)
Accuracy on CUB-200 [Welinder et al. ’10] (bar chart; values shown: 18, 19, 19.2, 22.4, 26.2, 26.7, 32.8, 26.5)
mAP on CUB-14 [Welinder et al. ’10] (bar chart; values shown: 37.02, 40.05, 44.73, 58.47, 43.72)
Deng, Krause, & Fei-Fei, CVPR2013
Legend: MKL [Branson et al. ’10], Birdlet [Farrell et al. ’11], CFAF [Yao et al. ’12]
Legend: MKL [Branson et al. ’10], LLC [Wang et al. ’09], RF [Yao et al. ’11], MultiCue [Khan et al. ’11], KDES [Bo et al. ’10], Tricos [Chai ’12]
![Page 47: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/47.jpg)
Deng, Krause, & Fei-Fei, CVPR2013
Top Activated Bubbles (successful predictions)
![Page 48: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/48.jpg)
Agenda
How to build a large-scale recognition engine using big data
STEP 1:
STEP 2:
STEP 3:
Build a Large Knowledge Base (ImageNet)
Fine-Grained Recognition (Bubbles)
?
![Page 49: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/49.jpg)
Agenda
How to build a large-scale recognition engine using big data
STEP 1:
STEP 2:
STEP 3:
Build a Large Knowledge Base (ImageNet)
Fine-Grained Recognition (Bubbles)
Putting a label on “everything”
![Page 50: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/50.jpg)
The Current State of the Art
10K classes 32.6% Krizhevsky et al. NIPS 2012
20K classes 15% Le et al. ICML 2012
Not quite practical yet…
But we are measuring the very fine-grained level
![Page 51: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/51.jpg)
Hedging: Be as informative as possible with few mistakes
…..
Entity
….. Mammal
Zebra Kangaroo
Kangaroo
Mammal …..
Entity
….. Mammal
Zebra Kangaroo
Deng, Krause, Berg, Fei-Fei, CVPR2012
![Page 52: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/52.jpg)
Deng, Krause, Berg, Fei-Fei, CVPR2012
entity
mammal vehicle
kangaroo zebra car boat
Formal Problem Statement
![Page 53: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/53.jpg)
Deng, Krause, Berg, Fei-Fei, CVPR2012
entity
mammal vehicle
kangaroo zebra car boat
All Correct
Formal Problem Statement
![Page 54: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/54.jpg)
Deng, Krause, Berg, Fei-Fei, CVPR2012
entity
mammal vehicle
kangaroo zebra car boat
All Correct
$0
$1
$2
Formal Problem Statement
![Page 55: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/55.jpg)
Deng, Krause, Berg, Fei-Fei, CVPR2012
$r$: rewards of the nodes.
Reward $R(f, r)$: reward of the classifier.
Accuracy $\Phi(f)$: accuracy of the classifier.

$$\max_{f} \; R(f) \quad \text{subject to} \quad A(f) \ge 1 - \varepsilon$$
Formal Problem Statement
Assumptions
• Same distribution for training and test.
• A base classifier g that gives posterior probability on the hierarchy.
Goal
• Find a decision rule f
• Expected accuracy A(f) is at least 1-ε
• Maximize expected reward R(f)
posterior for all nodes
g f
Test image
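The problem statement above can be sketched in code. One way to trade reward against accuracy is to pick, for each image, the node v maximizing (r_v + λ) · p_v, where p_v is the posterior from g, and to search for the smallest λ that meets the accuracy target on held-out data. The sketch below follows that idea under several assumptions: the rewards ($0 for the root, $1 for mid-level nodes, $2 for leaves) are read off the slide's figure, accuracy is assumed to be non-decreasing in λ, and all names and posterior values are hypothetical.

```python
import numpy as np

# Toy hierarchy from the slide: each node maps to the leaf indices it covers.
NODES = {
    "entity": [0, 1, 2, 3],
    "mammal": [0, 1],
    "vehicle": [2, 3],
    "kangaroo": [0],
    "zebra": [1],
    "car": [2],
    "boat": [3],
}
# Assumed rewards: more specific answers earn more.
REWARD = {"entity": 0.0, "mammal": 1.0, "vehicle": 1.0,
          "kangaroo": 2.0, "zebra": 2.0, "car": 2.0, "boat": 2.0}


def node_posteriors(leaf_probs):
    """Posterior of an internal node = sum of the posteriors of its leaves."""
    return {node: float(sum(leaf_probs[i] for i in idx)) for node, idx in NODES.items()}


def decide(leaf_probs, lam):
    """Decision rule f: pick the node maximizing (reward + lam) * posterior."""
    post = node_posteriors(leaf_probs)
    return max(NODES, key=lambda v: (REWARD[v] + lam) * post[v])


def accuracy(probs, labels, lam):
    """A prediction is correct if the chosen node covers the true leaf."""
    return float(np.mean([labels[i] in NODES[decide(p, lam)] for i, p in enumerate(probs)]))


def tune_lambda(probs, labels, epsilon, lam_max=1000.0, iters=30):
    """Binary search for a small lam whose held-out accuracy is at least 1 - epsilon."""
    if accuracy(probs, labels, 0.0) >= 1 - epsilon:
        return 0.0
    lo, hi = 0.0, lam_max
    for _ in range(iters):
        mid = (lo + hi) / 2
        if accuracy(probs, labels, mid) >= 1 - epsilon:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical held-out posteriors over (kangaroo, zebra, car, boat); both images are kangaroos.
probs = [[0.9, 0.05, 0.03, 0.02], [0.45, 0.5, 0.03, 0.02]]
labels = [0, 0]
lam = tune_lambda(probs, labels, epsilon=0.0)
print(decide(probs[0], lam), decide(probs[1], lam))
```

With λ = 0 the rule always commits to the highest-reward guess and mislabels the second image as a zebra; a slightly larger λ makes it back off to "mammal" for that image while still answering "kangaroo" for the confident one.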
![Page 56: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/56.jpg)
Deng, Krause, Berg, Fei-Fei, CVPR2012
Ours
LEAF-GT
MAX-REW
MAX-EXP
![Page 57: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/57.jpg)
![Page 58: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/58.jpg)
![Page 59: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/59.jpg)
![Page 60: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/60.jpg)
![Page 61: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/61.jpg)
Agenda
How to build a large-scale recognition engine using big data
STEP 1:
STEP 2:
STEP 3:
Build a Large Knowledge Base (ImageNet)
Fine-Grained Recognition (Bubbles)
Putting a label on “everything” (Hedging)
![Page 62: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/62.jpg)
Conclusion & Future Work
Harvesting Knowledge Crowd-Machine Collaboration Visual Representation Active Learning
Visual Turing Test Vision and Language Visual Reasoning
Managing Big Visual Data Large-Scale Learning Indexing and Retrieval
Knowledge Transfer Exploiting Data Biases Domain Adaptation
Mining Big Visual Data Visual Knowledge Graph Social Media
![Page 63: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/63.jpg)
Conclusion & Future Work
Harvesting Knowledge Crowd-Machine Collaboration Visual Representation Active Learning
Visual Turing Test Vision and Language Visual Reasoning
Managing Big Visual Data Large-Scale Learning Indexing and Retrieval
Knowledge Transfer Exploiting Data Biases Domain Adaptation
Mining Big Visual Data Visual Knowledge Graph Social Media
![Page 64: Visual Recognition Powered by Big Data - microsoft.com · Large-Scale Visual Recognition Powered by Big Data and Big Crowd Fei-Fei Li Stanford ... How to build a large-scale recognition](https://reader036.fdocuments.us/reader036/viewer/2022070715/5ed7d8878e533201dc4aed69/html5/thumbnails/64.jpg)
Thank you!
Prof. Kai Li Princeton U.
Prof. Alex Berg Stony Brook U.
Jonathan Krause Stanford U.
Sanjeev Satheesh Stanford U.
Zhiheng Huang Stanford U.
Olga Russakovsky Stanford U.
Dr. Jia Deng Stanford U.