MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification...
![Page 1: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/1.jpg)
Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval
Bo Wang (1), Martha Larson (1, 2)
(1) Delft University of Technology, the Netherlands
(2) Radboud University, the Netherlands
![Page 2: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/2.jpg)
Query ambiguity: topic coverage (sub-topic retrieval, IA-Select…)
Redundancy: novelty (Maximal Marginal Relevance, various visual-feature-based unsupervised learning algorithms)
![Page 3: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/3.jpg)
| | P@20 | CR@20 | F1@20 |
| --- | --- | --- | --- |
| Pearson's coefficient | 0.049 | 0.044 | 0.061 |
| p-value | 0.653 | 0.690 | 0.575 |

Pearson's coefficient between query clarity score and the Flickr baseline
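
For readers who want to reproduce this kind of check, the following is a minimal sketch of Pearson's coefficient in plain Python. The per-query numbers are hypothetical placeholders, not the data behind the slide, and the p-value computation is left out.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-query values (assumption, for illustration only).
clarity = [1.2, 0.8, 2.5, 1.9, 0.4]
p_at_20 = [0.70, 0.65, 0.60, 0.75, 0.55]
r = pearson_r(clarity, p_at_20)
print(f"Pearson's r = {r:.3f}")
```

A coefficient close to zero, as in the table, would indicate that the clarity score carries little linear information about the baseline's per-query performance.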
![Page 4: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/4.jpg)
Broad Latent Aspects
1. Broad latent aspects apply to a broad set of queries.
2. User queries frequently leave these aspects unspecified.
![Page 5: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/5.jpg)
The sum of the choices made by photographers on exactly how to portray the subject matter that they have decided to photograph.
— Riegler et al.
• A sailing boat within vast, unending space.
• A sailing boat as an object.
• The characteristics of a group of people on a sailing boat.
• Other information sources related to sailing boats.
![Page 6: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/6.jpg)
Tag-based search engine built on YFCC100M
81 NUS-WIDE concepts
Top 200 documents
15618 images
Examine each image in turn
Does a preliminary intent class exist?
Yes
No: introduce a new intent class.
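
The examine-in-turn loop on this slide can be sketched as below. This is only an illustration: on the slide the judgement is made by a human annotator, so `label_of` is a hypothetical stand-in for that judgement, and reusing an existing class on the "Yes" branch is an assumption. The two class names are taken from the examples shown later in the deck.

```python
def discover_intent_classes(images, label_of):
    """Examine images in turn; reuse an existing class or introduce a new one.

    `label_of(image)` is a hypothetical stand-in for the annotator's decision.
    """
    classes = []      # intent classes discovered so far, in order of appearance
    assignment = {}   # image -> intent class
    for image in images:
        label = label_of(image)
        if label not in classes:
            # No preliminary class matches: introduce a new intent class.
            classes.append(label)
        # Assumption: otherwise the image joins the existing class.
        assignment[image] = label
    return classes, assignment

# Toy annotations (hypothetical image ids).
toy_labels = {"img1": "Candid", "img2": "Social Event Public", "img3": "Candid"}
classes, assignment = discover_intent_classes(list(toy_labels), toy_labels.get)
print(classes)
```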
![Page 7: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/7.jpg)
VGG Net: chop off the classification layer
Softmax classifier with cross-entropy loss
on 15618 images
Intent class: Candid, probability: 73.60%
Intent class: Social Event Public, probability: 89.36%
71% accuracy
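
As a reminder of what the classifier head computes, here is a minimal sketch of softmax and cross-entropy loss for a single example in plain Python. The logits are hypothetical, and the sketch omits the VGG feature extraction and the training loop entirely.

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[target_index])

# Hypothetical scores for three intent classes (assumption).
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
loss = cross_entropy(logits, 0)
print([round(p, 3) for p in probs], round(loss, 3))
```

The probabilities reported per image on the slide correspond to the largest softmax output for that image.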
![Page 8: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/8.jpg)
![Page 9: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/9.jpg)
| Runs | TF_IDF Reranking | Feature | Clustering |
| --- | --- | --- | --- |
| Visual (run1) | FALSE | CNN_Features | K-means |
| Text_rerank + Text (run2) | TRUE | Weighted Word Embedding Aggregation | K-means |
| Text_rerank + Visual (run3) | TRUE | CNN_Features | K-means |
| Text_Rerank + Intent (run4) | TRUE | CNN_Feature | Intent |
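
The slides do not say how the final list is assembled from the clusters or intent classes. One common option is round-robin selection over the groups, sketched below as an assumption rather than as the authors' procedure; `group_of` is a hypothetical function returning an item's cluster id or predicted intent class.

```python
def diversify_by_group(ranked_items, group_of, k=20):
    """Interleave items from different groups while keeping in-group rank order."""
    groups = {}  # dicts keep insertion order, so groups follow first appearance
    for item in ranked_items:
        groups.setdefault(group_of(item), []).append(item)
    queues = list(groups.values())
    result = []
    position = 0
    # Take the best remaining item of each group in turn until k items are chosen.
    while len(result) < k and any(position < len(q) for q in queues):
        for q in queues:
            if position < len(q):
                result.append(q[position])
                if len(result) == k:
                    break
        position += 1
    return result

# Toy ranked list; the first letter plays the role of the group (hypothetical).
ranked = ["a1", "a2", "a3", "b1", "c1"]
top4 = diversify_by_group(ranked, lambda item: item[0], k=4)
print(top4)
```

With groups given by predicted intent class, this would place one image per intent near the top of the list before any intent repeats.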
![Page 10: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/10.jpg)
| Data Set | Evaluation | Visual (1) | Text-rerank + text (2) | Text-rerank + visual (3) | Text-rerank + intent (4) |
| --- | --- | --- | --- | --- | --- |
| dev | P@20 | 61.52% | 67.72% | 67.72% | 67.69% |
| dev | CR@20 | 49.29% | 52.36% | 53.61% | 55.61% |
| dev | F1@20 | 54.73% | 59.05% | 59.83% | 61.07% |
| test | P@20 | 66.01% | 70.36% | 70.71% | 72.62% |
| test | CR@20 | 56.98% | 61.42% | 58.09% | 61.25% |
| test | F1@20 | 58.30% | 63.43% | 61.21% | 64.62% |
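
For reference, the three metrics can be sketched as follows for a single query. The definitions used here (precision over the top N, cluster recall as the fraction of ground-truth clusters covered in the top N, F1 as their harmonic mean) are the usual ones and are stated as an assumption; the function and variable names are hypothetical.

```python
def precision_at(ranked, relevant, n=20):
    """Fraction of the top-n results that are relevant."""
    return sum(1 for item in ranked[:n] if item in relevant) / n

def cluster_recall_at(ranked, cluster_of, n=20):
    """Fraction of ground-truth clusters represented in the top-n results.

    `cluster_of` maps each relevant item to its ground-truth cluster id.
    """
    all_clusters = set(cluster_of.values())
    covered = {cluster_of[item] for item in ranked[:n] if item in cluster_of}
    return len(covered) / len(all_clusters)

def f1(p, cr):
    """Harmonic mean of precision and cluster recall."""
    return 0.0 if p + cr == 0 else 2 * p * cr / (p + cr)

# Toy query: "x" is irrelevant, "d" is relevant but not retrieved (hypothetical).
ranked = ["a", "b", "x", "c"]
cluster_of = {"a": 1, "b": 1, "c": 2, "d": 3}
p = precision_at(ranked, set(cluster_of), n=4)
cr = cluster_recall_at(ranked, cluster_of, n=4)
print(p, cr, f1(p, cr))
```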
![Page 11: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/11.jpg)
Pros and Cons
• Intent-based diversification has the advantage of better understandability.
• Hyperparameters do not necessarily need to be fine-tuned.
• Faster than unsupervised approaches.
• A single annotator introduces subjectivity into the intent classes.
![Page 12: MediaEval 2017 Retrieving Diverse Social Images Task: Exploiting Visual-based Intent Classification for Diverse Social Image Retrieval](https://reader033.fdocuments.us/reader033/viewer/2022051710/5a6dc6577f8b9add228b4839/html5/thumbnails/12.jpg)
Conclusions
• We point out that approaches based on ambiguity and redundancy removal might not work.
• Broad latent aspects might help.
• We propose an intent-based approach.
• Intent-based search result diversification is able to bring high performance with several extra benefits.
• http://www.wangbo.info/pdf/intent.pdf
• http://www.wangbo.info/ACMMM-MUSA-2017/