Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging
Lei Wu *†, Steven C.H. Hoi*, Rong Jin#, Jianke Zhu‡, Nenghai Yu†
*Nanyang Technological University, †University of Sci. & Tech. of China, #Michigan State University, ‡ETH Zurich
BACKGROUND
- Annotation/tagging is essential to making images accessible to Web users
- Billions of images on the Web lack proper annotation/tags
- Automatic image annotation has been actively studied in the multimedia community
BACKGROUND (cont’)
- Social media data on social websites carry rich tagging information provided by Web users
- Can we resolve the challenge of automatic photo annotation by leveraging the huge and growing amount of rich social media data?
BACKGROUND (cont’)
Annotation by Search from Social Images
[Figure: example social images with their tags]
- Sun, Bird, Sky, Blue, …
- Bird, Fly, White, Cloud, …
- Sun, Cloud, Hawk, Fly, …
- Hawk, Bird, Sky, Eagle, …
MOTIVATION
Annotation by Search
- Find similar images in the social image DB
- Annotate the image with the tags of high frequency
Research Challenges
- Visual feature representation
- Tag data mining
- Scalable search & indexing
- Distance/similarity measure
  - Distance Metric Learning
MOTIVATION (cont’)
Related Work on Automated Photo Tagging
- Built a large collection of web images with ground-truth labels to support object recognition research (Russell et al. 2008)
- A fast search-based approach to image annotation using an efficient hashing technique (Wang et al. 2006)
- Used visual and text modalities simultaneously in clustering images (Rege et al. 2008)
- Efficient image search and scene matching techniques for exploring a large-scale web image repository (Torralba et al. 2008)
- Learning-based method for improving the efficiency of manual image annotation (Yan et al. 2008)
These approaches adopt Hamming or Euclidean distance
MOTIVATION (cont’)
Distance Measure
- Hamming distance
- Euclidean distance
- Mahalanobis distance
Distance Metric Learning
- Learning to optimize the metric M
- Side Information (a.k.a. “Pairwise Constraints”)
  - Similar pairs S(x1, x2): x1 and x2 belong to the same category
  - Dissimilar pairs D(x1, x2): x1 and x2 belong to different categories
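As an illustration of the distance measures listed above, the following is a minimal sketch of the Mahalanobis distance, d_M(x1, x2) = sqrt((x1 - x2)^T M (x1 - x2)), assuming M is a symmetric positive semi-definite matrix. The function name and the toy vectors are illustrative and do not come from the slides. With M set to the identity matrix, the measure reduces to the Euclidean distance.

```python
import numpy as np

def mahalanobis_distance(x1, x2, M):
    """Distance between x1 and x2 under the metric matrix M (assumed symmetric PSD)."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    # Clip at zero to guard against tiny negative values caused by round-off.
    return float(np.sqrt(max(diff @ M @ diff, 0.0)))

# Toy vectors (hypothetical data).
x1 = np.array([1.0, 2.0])
x2 = np.array([4.0, 6.0])

d_euclid = mahalanobis_distance(x1, x2, np.eye(2))            # identity metric
d_scaled = mahalanobis_distance(x1, x2, np.diag([4.0, 0.0]))  # only the first axis counts
```

Learning M amounts to choosing which directions of the feature space should count more or less when comparing two images.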
MOTIVATION (cont’)
Related Work on Distance Metric Learning
- Probabilistic Global Distance Metric Learning (PGDM) (Xing et al. 2002)
- Neighbourhood Components Analysis (NCA) (Goldberger et al. 2005)
- Relevance Component Analysis (RCA) (Bar-Hillel et al. 2005)
- Discriminative Component Analysis (DCA) (Hoi et al. 2006)
- Large Margin Nearest Neighbor (LMNN) (Weinberger et al. 2006)
- Regularized Distance Metric Learning (RDML) (Si et al. 2006)
- Information-Theoretic Metric Learning (ITML) (Davis et al. 2007)
In these methods, clean side information is given explicitly
MOTIVATION (cont’)
Annotation by Search from Social Media
- NO explicit pairwise side information is available
- But rich information is available with social images
Ideas of our research
- To discover implicit pairwise relationships between social images via a probabilistic approach
- To learn effective distance metrics from uncertain side information that is discovered implicitly from social images
METRIC LEARNING FRAMEWORK FOR AUTOMATED PHOTO TAGGING
Overview of Our Approaches
- Discovery of probabilistic side information
  - A Graphical Model Approach
- Learning distance metrics from probabilistic side information
  - A Probabilistic RCA Method
- Automated photo tagging by applying the optimized metric in visual similarity search
METRIC LEARNING FRAMEWORK FOR AUTOMATED PHOTO TAGGING
[Figure not preserved in this transcript]
Latent Chunklet Estimation for Probabilistic Side Info.
Problem Formulation
- Latent Chunklets
  - i.e., the hidden topics
- Assumption
  - Both visual images and associated textual metadata are generated from the hidden topics
- Calculation
  - Multi-modal hidden topic analysis
Graphical Model for Latent Chunklet Estimation
[Figure: graphical model; labels: Text Modality, Visual Modality, Hidden Topic]
Graphical Model for Latent Chunklet Estimation (cont’)
- Generation Process
- Inference
- Probabilistic Side Info., as Prior Prob. Matrix
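The inference formulas for this slide are not preserved in the transcript. As a rough sketch of what a prior probability matrix could look like, the code below turns per-image topic counts (for example, counts collected from a topic model's samples) into a row-stochastic membership matrix, here called P0. The symmetric smoothing constant `alpha`, the function name, and the toy counts are all assumptions made for illustration, not the authors' estimator.

```python
import numpy as np

def membership_prior(topic_counts, alpha=0.1):
    """Turn per-image topic counts (n_images x n_topics) into a row-stochastic matrix."""
    counts = np.asarray(topic_counts, dtype=float)
    smoothed = counts + alpha  # additive smoothing (assumed), avoids zero probabilities
    return smoothed / smoothed.sum(axis=1, keepdims=True)

# Hypothetical counts: 3 images, 3 latent chunklets (hidden topics).
toy_counts = np.array([[8, 1, 1],
                       [0, 5, 5],
                       [2, 2, 6]])
P0 = membership_prior(toy_counts)
```

Each row of P0 gives one image's membership probabilities over the latent chunklets, so every row sums to 1.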
Probabilistic Distance Metric Learning
Problem Definition and Notations
- Probabilistic Side Info.:
  - Centers/Means for the Latent Chunklets
  - Membership Probability
- Given the estimation of latent chunklets P0, how do we formulate the DML problem to find the optimal metric M?
- Propose an extension of RCA with Prob. Side Info.
Probabilistic Relevance Component Analysis (pRCA)
The objective function of pRCA:
- Minimizes the sum of squared distances of examples from their chunklets’ centers
- Includes a regularization term that prevents the trivial solution
Corollary 1. When the chunklet means μ and the matrix of probability assignments P are fixed (assuming hard assignments of 0 and 1), the Probabilistic Relevance Component Analysis (pRCA) formulation reduces to regular RCA learning.
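Corollary 1 refers to regular RCA, which can be sketched as follows: compute the covariance of the examples around their own chunklet means, then use its inverse as the metric M. The function name, the small `ridge` term that keeps the covariance invertible, and the toy data are assumptions added for illustration; the pRCA objective itself is not preserved in this transcript.

```python
import numpy as np

def rca_metric(X, chunklets, ridge=1e-6):
    """Regular RCA: M is the inverse of the average within-chunklet covariance."""
    d = X.shape[1]
    n_total = sum(len(idx) for idx in chunklets)
    C = np.zeros((d, d))
    for idx in chunklets:
        pts = X[idx]
        diff = pts - pts.mean(axis=0)  # deviations from this chunklet's center
        C += diff.T @ diff
    C = C / n_total + ridge * np.eye(d)  # ridge is an assumption, not part of RCA proper
    return np.linalg.inv(C)

# Toy data: two chunklets given as lists of row indices (hard assignments).
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 1.0], [10.0, 3.0]])
M = rca_metric(X, [[0, 1], [2, 3]])
```

In this toy case the within-chunklet covariance is 0.5 times the identity, so M comes out close to 2 times the identity, apart from the small ridge term.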
Probabilistic Relevance Component Analysis (pRCA)
Iterative algorithm
- Fix P and μ, optimize M
- Fix M and μ, optimize P
- Fix P and M, find μ
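The update formulas for the three steps are not preserved in this transcript, so the following is only a sketch of the alternating structure under stated assumptions. The M step uses a membership-weighted version of the RCA covariance, the P step reweights the prior P0 by a Gaussian-style factor of the squared distance to each center, and the μ step takes membership-weighted means. All three update rules, the function names, and the toy data are hypothetical stand-ins, not the authors' derivation.

```python
import numpy as np

def fit_metric(X, P, mu, ridge=1e-6):
    """M step (assumed): inverse of the membership-weighted within-chunklet covariance."""
    d = X.shape[1]
    C = np.zeros((d, d))
    for k in range(mu.shape[0]):
        diff = X - mu[k]
        C += (diff * P[:, k:k + 1]).T @ diff  # weight example i by P[i, k]
    return np.linalg.inv(C / X.shape[0] + ridge * np.eye(d))

def update_memberships(X, M, mu, P0):
    """P step (assumed): prior times exp(-0.5 * squared distance), row-normalized."""
    diff = X[:, None, :] - mu[None, :, :]              # shape (n, K, d)
    sq = np.einsum("nkd,de,nke->nk", diff, M, diff)    # squared distances under M
    logit = np.log(P0 + 1e-12) - 0.5 * sq
    logit -= logit.max(axis=1, keepdims=True)          # stabilize the exponentials
    P = np.exp(logit)
    return P / P.sum(axis=1, keepdims=True)

def update_centers(X, P):
    """mu step (assumed): membership-weighted means of the examples."""
    return (P.T @ X) / (P.sum(axis=0)[:, None] + 1e-12)

def alternate(X, P0, n_iter=10):
    """Cycle through the three updates for a fixed number of rounds."""
    P = P0.copy()
    mu = update_centers(X, P)
    for _ in range(n_iter):
        M = fit_metric(X, P, mu)
        P = update_memberships(X, M, mu, P0)
        mu = update_centers(X, P)
    return M, P, mu

# Toy data: two Gaussian blobs with an uncertain (0.8 / 0.2) prior.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(5.0, 1.0, (20, 2))])
P0 = np.vstack([np.tile([0.8, 0.2], (20, 1)), np.tile([0.2, 0.8], (20, 1))])
M, P, mu = alternate(X, P0)
```

A fixed iteration count is used here for simplicity; a convergence check on the change in M would be a reasonable alternative.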
Probabilistic Relevance Component Analysis (pRCA)
pRCA Algorithm
Automated Photo Tagging
Steps of Auto Photo Tagging via Search
- Query image
- Distance/Similarity Measure
- Retrieve a set of visually similar social photos
  - Set of k-Nearest Neighbor Images
  - Set of images with distance less than some threshold
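The two retrieval options above can be sketched with a brute-force search under a learned metric M. This is a minimal illustration, assuming image features are rows of a NumPy array; the function names and toy data are hypothetical, and a real system over a large database would need an index rather than a full scan.

```python
import numpy as np

def metric_distances(query, database, M):
    """Distances from the query to every database row under the metric M."""
    diff = database - query
    sq = np.einsum("nd,de,ne->n", diff, M, diff)
    return np.sqrt(np.maximum(sq, 0.0))

def retrieve_knn(query, database, M, k):
    """Indices and distances of the k nearest database images."""
    dist = metric_distances(query, database, M)
    order = np.argsort(dist, kind="stable")[:k]
    return order, dist[order]

def retrieve_within(query, database, M, threshold):
    """Indices and distances of all database images closer than the threshold."""
    dist = metric_distances(query, database, M)
    idx = np.flatnonzero(dist < threshold)
    return idx, dist[idx]

# Toy database of 2-D features (hypothetical), identity metric.
database = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
query = np.array([0.0, 0.0])
knn_idx, knn_dist = retrieve_knn(query, database, np.eye(2), k=2)
near_idx, near_dist = retrieve_within(query, database, np.eye(2), threshold=3.5)
```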
Automated Photo Tagging (cont’)
- Annotate the query photo with the relevant tags associated with the set of similar images
  - A tag is preferred if it has a higher frequency among the set of similar social images
  - A tag is preferred if its associated social images are visually more similar to the query photo
- Our tagging approach
  - Frequency of tag w among the retrieved social images
  - Average distance from the query photo to the tag’s associated social images
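The scoring formula on this slide is not preserved in the transcript. The sketch below combines the two quantities it names, tag frequency and average distance, in one simple way: score(w) = frequency(w) / (average distance(w) + eps). That ratio, the function name, and the toy distances are assumptions for illustration only; the toy tags reuse the example tag sets shown earlier in the deck.

```python
from collections import defaultdict

def rank_tags(neighbor_tags, neighbor_dists, top_t=10, eps=1e-9):
    """Rank tags by frequency divided by average distance (an assumed scoring rule)."""
    freq = defaultdict(int)
    dist_sum = defaultdict(float)
    for tags, dist in zip(neighbor_tags, neighbor_dists):
        for tag in set(tags):  # count each tag at most once per image
            freq[tag] += 1
            dist_sum[tag] += dist
    scores = {t: freq[t] / (dist_sum[t] / freq[t] + eps) for t in freq}
    # Higher score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_t]

# Tags of four retrieved images and their (hypothetical) distances to the query.
neighbor_tags = [
    ["sun", "bird", "sky", "blue"],
    ["bird", "fly", "white", "cloud"],
    ["sun", "cloud", "hawk", "fly"],
    ["hawk", "bird", "sky", "eagle"],
]
neighbor_dists = [0.5, 1.0, 1.5, 2.0]
top_tags = rank_tags(neighbor_tags, neighbor_dists, top_t=3)
```

Here "bird" ranks first because it appears in three of the four retrieved images, two of which are close to the query.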
EXPERIMENTS
Experimental Testbed
- 205,442 photos in total from Flickr
  - Distance Metric Learning: 16,588 photos + tags
  - Knowledge Database: 186,854 photos + tags
  - Query Images: 2,000 random photos
Compared Schemes:
- Relevance Component Analysis (RCA)
- Discriminative Component Analysis (DCA)
- Information-Theoretic Metric Learning (ITML)
- Large Margin Nearest Neighbor (LMNN)
- Neighbourhood Components Analysis (NCA)
- Regularized Distance Metric Learning (RDML)
EXPERIMENTS (cont’)
Settings:
- 500 latent chunklets
- 1,000 visual words
- 10,000 tags
- Learning rate γ=0.5
- Top k nearest photos, k=30
- Top t relevant tags for annotation, t=1,…,10
Average Precision
- Fixed the number of nearest neighbors k to 30 for all compared methods
Average Recall
- Fixed the number of nearest neighbors k to 30 for all compared methods
Precision-Recall Curves
- Fixed the number of nearest neighbors k to 30 for all compared methods
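The slides do not spell out how average precision and average recall are computed. A common reading, assumed here, is precision and recall of the top t recommended tags against each query's ground-truth tags, averaged over all queries. The function names and toy data below are illustrative.

```python
def precision_recall_at_t(ranked_tags, true_tags, t):
    """Precision and recall of the top-t recommended tags for one query."""
    top = ranked_tags[:t]
    hits = sum(1 for tag in top if tag in true_tags)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(true_tags) if true_tags else 0.0
    return precision, recall

def average_over_queries(results, t):
    """Mean precision and mean recall at t over (ranked_tags, true_tags) pairs."""
    pairs = [precision_recall_at_t(ranked, truth, t) for ranked, truth in results]
    n = len(pairs)
    return sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n

# Two hypothetical queries: recommended tags and ground-truth tag sets.
results = [
    (["bird", "sky", "sun"], {"bird", "sun", "hawk", "eagle"}),
    (["tree", "car"], {"tree"}),
]
avg_precision, avg_recall = average_over_queries(results, t=2)
```

Sweeping t from 1 to 10 and plotting the resulting pairs would give a precision-recall curve of the kind named on this slide.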
Empirical Observations
- DML techniques are beneficial and critical to retrieval-based photo tagging tasks
- In general, the pRCA algorithm considerably outperformed the other approaches in most cases
- In some cases, some DML methods did not perform well and could be even worse than the Euclidean method
  - Noisy (uncertain) side information issue
  - Robustness is important to DML
Evaluation of Varied k and t
- Examine the annotation performance of pRCA by varying the value of k from 10 to 50
Empirical Observations
- The number of nearest neighbors k can influence the annotation performance
  - In our case, when k equals 30, the resulting performance is generally better than with other values
  - If k is too large, many noisy tags may be included, as there may not be many relevant images in the database
  - If k is too small, some relevant tags may not appear, which again may degrade the performance
Time Cost for Metric Learning
- To evaluate the time efficiency of the proposed DML algorithm on the same dataset
Findings
- The most efficient method is the regular RCA approach
- The most time-consuming one is NCA
- pRCA is quite competitive: it is slower than RCA, DCA, and RDML, but considerably faster than ITML, LMNN, and NCA
Some Good Examples
Top Recommended Tags for each Query Photo (photos not preserved):
- autumn, fall, forest, trees, nature, tree, wood, germany, path, creative
- sunset, clouds, sky, sea, beach, abigfave, sun, water, landscape, ocean
- tiger, zoo, specanimal, impressedbeauty, abigfave, nature, animal, cat, animals, aplusphoto
- garden, flowers, yellow, nature, hdr, nikon, spring, festival, impressedbeauty
Some Poor Examples
Top Recommended Tags for each Query Photo (photos not preserved):
- macro, nikon, bokeh, nature, flower, canon, storm, eos, plane, flickrsbest
- nikon, street, water, sport, blue, bike, lebanon, kids, eric mckenna, krissy mckenna
- winter, photography, art, beach, usa, fashion, portrait, travel, party, snow
- park, river, travel, trees, lake, hiking, winter, green, vacation, water
CONCLUSIONS
Contributions:
- Studied DML from uncertain side information by exploiting probabilistic side information
- Proposed a two-step probabilistic distance metric learning (PDML) framework
- Presented an effective probabilistic RCA (pRCA) algorithm
- Applied the algorithm to the task of automated photo annotation by search
- Encouraging results showed that our technique is effective and promising
Future Work
- To improve visual feature representation, especially for annotating objects
- To expand the scale of the database
- To improve large-scale search & indexing
- To filter spam and irrelevant tags
- To adopt users’ feedback to improve automated tagging performance on APT
Q&A
More information is available:
Http://www.cais.ntu.edu.sg/~chhoi/APT/
Online demo of Auto Photo Tagging (APT) is available:
Http://msm.cais.ntu.edu.sg/APT/
Contact:
- WU Lei [email protected]
- Steven CH Hoi [email protected]
School of Computer Engineering
Nanyang Technological University
Singapore 639798
Email: [email protected]
Tel: (+65) 6513-8040
Fax: (+65) 6792-6559
Http://www.ntu.edu.sg/home/chhoi/
GRAPHICAL MODEL FOR LATENT CHUNKLET ESTIMATION
Inference
- Joint probability on documents and topics
- Conditional probability on tags, visual words and topics
- Gibbs sampling estimation
Appendix