Nonnegative Shared Subspace Learning and IAlii SilIts Application … · 2011-03-24 · Previous...

Nonnegative Shared Subspace Learning and Its Application to Social Media Retrieval

Presenter: Andy Lim

Paper Topic

• Folksonomy

• Social media sharing platforms

The Problem

• Rise in popularity of social image and video sharing platforms

• Precision of tag‐based media retrieval

• Tags are
  • Noisy
  • Ambiguous
  • Incomplete
  • Subjective

• Lack of constraints

Tags: hotdog, chinese, trololol, aidjishi, sandwich, bread

• Free‐text tags (e.g. “djfja;sldfkj”)

Previous Research (Internal)

• Improving tag relevance

• Sigurbjornsson and Zwol
  • Developed a method of recommending a set of relevant tags based on tag popularity

• Li et al.
  • List all images for a given tag and determine tag relevance from visual similarity

• All are confined to noisy tags within the primary dataset

The Approach

• Internal vs. External

• Leverage external auxiliary sources of information to improve target tagging systems (presumably much noisier)

• Exploit disparate characteristics of target domain using auxiliary source

• Note: What is the optimal level of joint modeling such that the target domain still benefits from the auxiliary source?

Assumptions

• There is a common underlying subspace shared by the primary and secondary domains

• The primary domain is much noisier than the secondary domains

Nonnegative Matrix Factorization

• X (M x N data matrix), where N = number of documents, each described in terms of M vocabulary words

• F (M x R nonnegative matrix) represents R basis vectors

• H (R x N nonnegative matrix) contains coordinates of each document
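The factorization X ≈ FH described on this slide can be sketched in a few lines. This is a minimal illustration using the standard Lee-Seung multiplicative updates for the squared-error objective; the function name `nmf`, the toy matrix, and the iteration count are assumptions made for the example, not details from the slides.

```python
import numpy as np

def nmf(X, R, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative X (M x N) as F (M x R) @ H (R x N)."""
    rng = np.random.default_rng(seed)
    M, N = X.shape
    F = rng.random((M, R)) + eps  # R basis vectors over the vocabulary
    H = rng.random((R, N)) + eps  # coordinates of each document
    for _ in range(n_iter):
        # Multiplicative updates only rescale entries by nonnegative
        # ratios, so F and H stay nonnegative throughout.
        H *= (F.T @ X) / (F.T @ F @ H + eps)
        F *= (X @ H.T) / (F @ H @ H.T + eps)
    return F, H

# Hypothetical toy data: 4 vocabulary words (rows), 5 documents (columns).
X = np.array([
    [2, 1, 0, 0, 1],
    [1, 2, 0, 0, 1],
    [0, 0, 3, 1, 0],
    [0, 0, 1, 2, 0],
], dtype=float)

F, H = nmf(X, R=2)
rel_error = np.linalg.norm(X - F @ H) / np.linalg.norm(X)
print(F.shape, H.shape, round(float(rel_error), 3))
```

Each column of F is one basis vector over the M words, and each column of H gives the coordinates of one document in that R-dimensional basis.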

Joint Shared Nonnegative Matrix Factorization (JSNMF)

• Input:
  • X (target domain), Y (auxiliary domain), R1 and R2 (dimensionality of the underlying subspaces of X and Y), K (number of shared basis vectors)

• Output:
  • W (joint shared subspace), U (remaining subspace in target domain), V (remaining subspace in auxiliary domain), H (coordinate matrix for target domain), L (coordinate matrix for auxiliary domain)
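One way to read these inputs and outputs is as two coupled factorizations, X ≈ [W | U] H and Y ≈ [W | V] L, where the K columns of W are shared. The sketch below assumes an unweighted sum of the two squared reconstruction errors and uses multiplicative updates derived for that objective; the weighting and update rules in the original paper may differ. The function name `jsnmf` and the toy data are hypothetical.

```python
import numpy as np

def jsnmf(X, Y, R1, R2, K, n_iter=300, eps=1e-9, seed=0):
    """Sketch: X ~ [W|U] @ H and Y ~ [W|V] @ L with K shared basis vectors.

    Assumes X (M x N1) and Y (M x N2) share the same M-word vocabulary.
    """
    rng = np.random.default_rng(seed)
    M = X.shape[0]
    assert Y.shape[0] == M and 0 <= K <= min(R1, R2)
    W = rng.random((M, K)) + eps        # shared subspace
    U = rng.random((M, R1 - K)) + eps   # remaining subspace, target domain
    V = rng.random((M, R2 - K)) + eps   # remaining subspace, auxiliary domain
    H = rng.random((R1, X.shape[1])) + eps
    L = rng.random((R2, Y.shape[1])) + eps
    for _ in range(n_iter):
        A, B = np.hstack([W, U]), np.hstack([W, V])
        # Coordinate updates: ordinary NMF steps with the bases held fixed.
        H *= (A.T @ X) / (A.T @ A @ H + eps)
        L *= (B.T @ Y) / (B.T @ B @ L + eps)
        Hw, Hu = H[:K], H[K:]
        Lw, Lv = L[:K], L[K:]
        # The shared basis W receives contributions from both datasets.
        W *= (X @ Hw.T + Y @ Lw.T) / (A @ H @ Hw.T + B @ L @ Lw.T + eps)
        # Domain-specific bases only see their own dataset.
        U *= (X @ Hu.T) / (np.hstack([W, U]) @ H @ Hu.T + eps)
        V *= (Y @ Lv.T) / (np.hstack([W, V]) @ L @ Lv.T + eps)
    return W, U, V, H, L

# Hypothetical toy data: 6 words, 2 truly shared basis vectors.
rng = np.random.default_rng(1)
W0 = rng.random((6, 2))
X = np.hstack([W0, rng.random((6, 1))]) @ rng.random((3, 8))
Y = np.hstack([W0, rng.random((6, 1))]) @ rng.random((3, 7))

W, U, V, H, L = jsnmf(X, Y, R1=3, R2=3, K=2)
err_x = np.linalg.norm(X - np.hstack([W, U]) @ H) / np.linalg.norm(X)
err_y = np.linalg.norm(Y - np.hstack([W, V]) @ L) / np.linalg.norm(Y)
print(round(float(err_x), 3), round(float(err_y), 3))
```

In this sketch, K = 0 gives two independent factorizations and K = R1 = R2 forces every basis vector to be shared.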

Retrieval using JSNMF

• Input: W, U, H, query sentence SQ, number of images (or videos) to be retrieved N, and image (or video) dataset

• Output: Return top N retrieved images (or videos)
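The slides do not spell out the ranking step, so the following is only a plausible sketch. It assumes the query is a bag-of-words vector over the same vocabulary, projects it onto the target basis [W | U] with nonnegative coordinates, and ranks items by cosine similarity against the columns of H. The function names `project_query` and `retrieve` and the toy matrices are hypothetical.

```python
import numpy as np

def project_query(q, A, n_iter=200, eps=1e-9):
    """Nonnegative coordinates h such that A @ h approximates q."""
    h = np.ones(A.shape[1])
    Atq, AtA = A.T @ q, A.T @ A
    for _ in range(n_iter):
        h *= Atq / (AtA @ h + eps)  # multiplicative update, basis held fixed
    return h

def retrieve(W, U, H, q, n):
    """Return indices of the n items whose coordinates best match the query."""
    A = np.hstack([W, U])
    h_q = project_query(q, A)
    # Cosine similarity between the query coordinates and each column of H.
    norms = np.linalg.norm(H, axis=0) * np.linalg.norm(h_q) + 1e-12
    sims = (H.T @ h_q) / norms
    return np.argsort(-sims)[:n]

# Hypothetical vocabulary: ["cloud", "sky", "road", "car"].
W = np.array([[1.0], [1.0], [0.0], [0.0]])  # shared "sky scene" basis vector
U = np.array([[0.0], [0.0], [1.0], [1.0]])  # target-only "street" basis vector
H = np.array([[1.0, 0.9, 0.1, 0.0],         # coordinates of 4 items
              [0.0, 0.1, 0.9, 1.0]])
q = np.array([0.0, 0.0, 1.0, 0.0])          # query: "road"

top = retrieve(W, U, H, q, n=2)
print(top)
```

Because matching happens in the subspace rather than on raw tags, an item tagged only "car" can still be returned for the query "road".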

Experiment

• Use LabelMe tags (auxiliary) to improve
  • Image retrieval in Flickr
  • Video retrieval in YouTube

• Why LabelMe?
  • Object image tagging
  • Controlled vocabulary

Flickr Dataset

• Downloaded 50,000 images from Flickr

• Average number of distinct tags = 8

• Removed
  • Rare tags (appearing less than 5 times)
  • Images with no tags or non‐English tags

• Obtained 20,000 labeled images

• 7,000 examples are kept for investigating an internal auxiliary dataset
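The tag cleanup described for this dataset can be sketched as below. The example assumes "appearing" means the number of items that carry the tag, and it does not attempt the non-English filter; the function name `clean_tags` and the sample items are hypothetical.

```python
from collections import Counter

def clean_tags(items, min_count):
    """Drop tags carried by fewer than min_count items, then drop
    items that are left with no tags at all."""
    # Count each tag once per item (assumption: item-level frequency).
    counts = Counter(tag for tags in items.values() for tag in set(tags))
    cleaned = {}
    for item_id, tags in items.items():
        kept = [t for t in tags if counts[t] >= min_count]
        if kept:
            cleaned[item_id] = kept
    return cleaned

# Hypothetical mini dataset; the slides use min_count=5 for Flickr.
items = {
    "img1": ["cloud", "sky"],
    "img2": ["cloud", "trololol"],
    "img3": ["aidjishi"],
}
cleaned = clean_tags(items, min_count=2)
print(cleaned)
```

The same routine would apply to the YouTube and LabelMe tags with a threshold of 2.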

YouTube Dataset

• Downloaded 18,000 videos’ metadata (tags, URL, category, title, comments, etc.)

• Average number of distinct tags = 7

• Removed
  • Rare tags (appearing less than 2 times)
  • Videos with no tags or non‐English tags

• Obtained dataset corresponding to 12,000 videos

• Again, kept 7,000 examples to be used as an internal auxiliary dataset

LabelMe Dataset

• Added 7,000 images with tags from LabelMe

• Average number of distinct tags = 32

• Removed
  • Rare tags (appearing less than 2 times)

• Cleanup does not reduce dataset

Evaluation Measures

• Defined query set Q

• {cloud, man, street, water, road, leg, table, plant, girl, drawer, lamp, bed, cable, bus, pole, laptop, plate, kitchen, river, pool, flower}

• Manually annotated the two datasets (Flickr and YouTube) with respect to the query set (no benchmark dataset available)

• A query term and an image (or video) are considered relevant if the concept is clearly visible in the image (or video)

Results with JSNMF

• Precision‐Scope Curve

• Fix recall at 0.1
  • Users are usually only interested in the first few results

• 10% improvement
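A precision-scope curve reports, for each scope s, the fraction of the top s retrieved items that are relevant. The sketch below shows that computation on made-up data; the function names and the sample ranking are hypothetical, and the slides' exact evaluation protocol may differ.

```python
def precision_at_scope(ranked_ids, relevant_ids, scope):
    """Fraction of the top `scope` retrieved items that are relevant."""
    top = ranked_ids[:scope]
    if not top:
        return 0.0
    hits = sum(1 for item in top if item in relevant_ids)
    return hits / len(top)

def precision_scope_curve(ranked_ids, relevant_ids, scopes):
    """List of (scope, precision) points for plotting."""
    return [(s, precision_at_scope(ranked_ids, relevant_ids, s)) for s in scopes]

# Hypothetical ranking for one query, with manually annotated relevance.
ranked = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "e"}
curve = precision_scope_curve(ranked, relevant, scopes=[1, 2, 5])
print(curve)
```

Averaging these points over every term in the query set Q would give one curve per method.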

Results with JSNMF

• Under‐representation
  • Shares very few basis vectors

• Over‐representation
  • Forces many basis vectors to represent both datasets

• Appropriate level of representation

Flickr Retrieval Results

• Results are better with LabelMe

• As recall increases, precision decreases

• When K=0 (no sharing) or K=40 (fully sharing), precision is lower compared to K=15

YouTube Retrieval Results

• Similar to Flickr Results

Extra Notes & Questions?

• Can be extended to multiple datasets (not just 2)

• Can use generic model to apply to other data mining tasks
