MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster


A. Castellanos (1), X. Benavent (2), A. García-Serrano (1), E. De Ves (2), J. Cigarrán (1)

(1) ETSI Informática, UNED. {acastellanos, agarcia, juanci}@lsi.uned.es

(2) Dept. d’Informàtica, ETSE, Univ. de València. {xaro.benavent, esther.deves}@uv.es

ABSTRACT

This paper details the participation of the UNED-UV group in the 2015 Retrieving Diverse Social Images Task. This year, our proposal is a multi-modal approach. First, a textual algorithm based on Formal Concept Analysis (FCA) and Hierarchical Agglomerative Clustering (HAC) detects the latent topics addressed by the images, so that the images can be diversified according to these topics. Second, a Local Logistic Regression model, fitted on the low-level features of some relevant and non-relevant samples, estimates the relevance probability of every image in the database.

CONTENT-BASED INFORMATION RETRIEVAL

Content-based image retrieval models the user's preferences with a relevance feedback algorithm. The general methodology involves five steps:

STEP 1: Reduction of the data dimensionality: the low-level features, x, are reduced by Principal Component Analysis (PCA), which retains only the first components that account for 80% of the data variability. The reduced vector is denoted xr.

STEP 2: Selection of the relevant and non-relevant sets:

•  Set of relevant/positive images, IP:
   •  One-topic: images given by Wikipedia.
   •  Multi-topic: the first five images of the result list.

•  Set of non-relevant/negative images, Is: a set generated by taking into account the guidelines (photos with regular people as the main subject, photos of riots and protests).
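The 80% cut-off in Step 1 can be illustrated with a minimal sketch. It assumes the PCA eigenvalues (the variance of each component) are already available; the function name and the example eigenvalues are hypothetical and do not come from the poster.

```python
from itertools import accumulate


def components_for_variance(eigenvalues, target=0.80):
    """Return the smallest k such that the k largest eigenvalues
    account for at least `target` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= target:
            return k
    return len(ordered)


# Hypothetical eigenvalues of five principal components.
eigenvalues = [4.0, 3.0, 2.0, 0.6, 0.4]
k = components_for_variance(eigenvalues)
print(k)  # number of components kept for the reduced vector xr
```

With these made-up values the first two components cover 70% of the variance and the first three cover 90%, so three components are kept.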

STEP 3: Parameter estimation of the Local Logistic Regression models:

•  The reduced feature vector, xr, is dynamically split into m groups of non-fixed size.
•  The xr of the IP and Is images are the inputs of the models.
•  The output is the relevance probability, p, of image Ij.

STEP 4: Fusion of the probability vector p into a single scalar value for each image using a weighted average:

•  The weight, w, of a given probability is obtained from the amount of variance accounted for by the group of components used to fit the model.
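The fusion in Step 4 can be sketched as a variance-weighted average. This is only an illustration under stated assumptions: the per-group probabilities are taken as given (the logistic models themselves are not fitted here), the weights are assumed to be directly proportional to the variance of each group, and all names and numbers are hypothetical.

```python
def fuse_probabilities(probabilities, group_variances):
    """Fuse the per-group relevance probabilities of one image into a scalar.

    probabilities[i]   : output of the model fitted on component group i
    group_variances[i] : variance accounted for by component group i
    """
    if len(probabilities) != len(group_variances):
        raise ValueError("one variance value is needed per probability")
    total = sum(group_variances)
    if total <= 0:
        raise ValueError("group variances must sum to a positive value")
    weighted = sum(p * v for p, v in zip(probabilities, group_variances))
    return weighted / total


# Hypothetical image scored by m = 3 local models.
score = fuse_probabilities([0.9, 0.6, 0.3], [0.5, 0.2, 0.1])
print(round(score, 4))
```

Dividing by the total variance normalizes the weights, so the fused value stays in the [0, 1] range of the individual probabilities.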

STEP 5: Ranking of the database: the final diversified ranking is generated by selecting the highest-probability image from each textual cluster (run3, run5). For run1, the automatic run that uses only visual information, the clusters are produced by k-means (k=50) over the PCA components xr.
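Step 5 can be sketched as picking one representative per cluster. The poster does not state how the selected images are ordered among themselves, so sorting them by decreasing probability is an assumption here, as are the function name and the tuple layout.

```python
def diversified_ranking(images):
    """Pick the highest-probability image of each cluster.

    images: iterable of (image_id, cluster_id, probability) tuples.
    Returns the selected image ids, ordered by decreasing probability
    (this ordering is an assumption, not taken from the poster).
    """
    best = {}
    for image_id, cluster_id, probability in images:
        current = best.get(cluster_id)
        if current is None or probability > current[1]:
            best[cluster_id] = (image_id, probability)
    ordered = sorted(best.values(), key=lambda pair: pair[1], reverse=True)
    return [image_id for image_id, _ in ordered]


# Hypothetical images spread over three clusters.
images = [("a", 0, 0.90), ("b", 0, 0.70), ("c", 1, 0.80), ("d", 2, 0.95)]
print(diversified_ranking(images))
```

The same routine applies whether the cluster ids come from the textual clusters (run3, run5) or from k-means over the visual components (run1).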


METHODOLOGY

Our multimedia system has two sub-systems:

•  Textual-Based Information Retrieval system, TBIR, which works with the textual information and generates the clusters used for diversity.
•  Content-Based Information Retrieval system, CBIR, which estimates the relevance of each image in the generated clusters.

TEXTUAL-BASED INFORMATION RETRIEVAL

STEP 1: Kullback-Leibler Divergence (KLD) is used to obtain the image representation from the most divergent terms: those with weights in the first tertile (not only the most relevant but also the most “different and original”).

STEP 2: Formal Concept Analysis (FCA) organizes the images in a lattice according to their shared KLD features (for 123 locations).

STEP 3: An HAC approach identifies the Formal Concepts that will contain the most diverse images:

•  The Zero-Induces measure is used to compute the similarity between lattice nodes (Formal Concepts).
•  The lattice nodes are regrouped so as to maximize diversity.
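The term selection in TBIR Step 1 can be sketched as follows. The poster does not give the exact KLD formulation, so this sketch assumes the common pointwise form, p_doc(t) * log(p_doc(t) / p_coll(t)), and reads "first tertile" as the highest-weighted third of the terms. Both choices, the function names and the toy tokens are assumptions.

```python
import math
from collections import Counter


def kld_term_weights(doc_tokens, collection_tokens):
    """Pointwise KL contribution of each term of one image's text,
    measured against the whole collection (assumed formulation)."""
    doc_counts = Counter(doc_tokens)
    coll_counts = Counter(collection_tokens)
    weights = {}
    for term, count in doc_counts.items():
        p_doc = count / len(doc_tokens)
        p_coll = coll_counts[term] / len(collection_tokens)
        if p_coll > 0:  # terms unseen in the collection are skipped
            weights[term] = p_doc * math.log(p_doc / p_coll)
    return weights


def top_tertile(weights):
    """Keep the highest-weighted third of the terms (at least one)."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    keep = max(1, math.ceil(len(ranked) / 3))
    return ranked[:keep]


# Toy data: the collection contains the document's own tokens.
doc = ["bull", "bull", "run", "photo", "photo", "city"]
collection = doc + ["photo"] * 10 + ["city"] * 5 + ["beach"] * 3
weights = kld_term_weights(doc, collection)
print(top_tertile(weights))
```

Terms that are frequent everywhere, such as "photo" in this toy collection, receive a low or negative weight, while terms concentrated in the image's own text are kept as its features.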

RESULTS

Metrics      run1      run2      run3      run5

One-topic
P@20         0.6362    0.7094+   0.7051    0.7645*
CR@20        0.3704    0.4082+   0.3995    0.4194*
F1@20        0.4618    0.5068+   0.4988    0.5240*

Multi-topic
P@20         0.7300+   0.6393    0.7207    0.7886*
CR@20        0.4257    0.4407+   0.4116    0.4491*
F1@20        0.5130+   0.5025    0.5001    0.5519*

Overall
P@20         0.6835    0.6741    0.7129+   0.7766*
CR@20        0.3983    0.4246+   0.4056    0.4344*
F1@20        0.4876    0.5046+   0.4994    0.5380*

Table 2. Official Metrics for Retrieving Diverse Social Images Task. Best result for each topic set is marked with * (bold on the poster), and best result for the automatic runs is marked with + (blue on the poster). First block results are for the one-topic subset, the second for the multi-topic subset, and the third for the overall set.
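The three metrics reported in Table 2 can be related with a minimal sketch. It assumes the usual definitions for this task: P@20 is the fraction of relevant images among the first 20 results, CR@20 is the fraction of a topic's ground-truth clusters represented among them, and F1@20 is their harmonic mean. The function names and data layout are hypothetical.

```python
def precision_at(relevant_flags, n=20):
    """Fraction of relevant images among the first n results."""
    return sum(relevant_flags[:n]) / n


def cluster_recall_at(cluster_ids, total_clusters, n=20):
    """Fraction of ground-truth clusters covered by the first n results.

    cluster_ids[i] is the ground-truth cluster of result i,
    or None when that result is not relevant.
    """
    covered = {c for c in cluster_ids[:n] if c is not None}
    return len(covered) / total_clusters


def f1(precision, cluster_recall):
    """Harmonic mean of precision and cluster recall."""
    if precision + cluster_recall == 0:
        return 0.0
    return 2 * precision * cluster_recall / (precision + cluster_recall)


# Values reported on the poster for Topic 249 (run5).
print(round(f1(0.9000, 0.8333), 4))
```

For a single topic the harmonic mean reproduces the reported figure. The rows of Table 2 are presumably averages over topics, so the F1@20 of a row need not equal the harmonic mean of that row's P@20 and CR@20.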

•  Our best result for all metrics is obtained with the multimedia human-based approach, run5.

•  Among the automatic runs, the best results, depending on the metric, are obtained by run2 on the one-topic subset, by run1 and run2 on the multi-topic subset, and by run2 and run3 on the overall set.

CONCLUSIONS AND FUTURE WORK

•  The best results are achieved by the manual version of the multimedia approach.

•  Our challenge is to make the automatic approach select the relevant and non-relevant images as well as a human being does.

EXPERIMENTS

run1  run2  run3  run5  Algorithm  Details
-     ✓     ✓     ✓     TBIR       Clusters with FCA
-     ✓     -     -     TBIR       Ranking by selecting the first-ranked image at each FCA cluster
✓     -     -     -     CBIR       Clusters with k-means using PCA visual low-level features
✓     -     ✓     -     CBIR       Automated selection of the image sets (relevant: Wikipedia images or first five ranked images; non-relevant: set generated from the non-relevance guidelines)
-     -     -     ✓     CBIR       Manual selection of relevant and non-relevant image sets
-     -     ✓     ✓     CBIR       Ranking by selecting the highest-probability image at each FCA cluster
✓     -     -     -     CBIR       Ranking by selecting the highest-probability image at each k-means cluster

Table 1. Details of the TBIR and CBIR algorithms used for the submitted runs.

ACKNOWLEDGEMENTS

This work has been partially supported by the Spanish projects VOXPOPULI (TIN2013-47090C3-1-P) and DPI2013-47279-C2-1-R.

EXPERIMENT RESULT (run5, multimedia with relevance feedback algorithm): Topic 249, Pamplona bull run (P@20 = 0.9000, CR@20 = 0.8333, F1@20 = 0.8654)