MediaEval 2016 - Placing Task Overview


PLACING TASK 2016

Bart Thomee (Google, San Bruno)
Olivier Van Laere (Blueshift Labs, San Francisco)
Claudia Hauff (TU Delft, Netherlands)
Jaeyoung Choi (ICSI, Berkeley / TU Delft, Netherlands)

Oct. 20th, 2016 Hilversum, Netherlands

TASK DESCRIPTION

• Given a video or a photo, how accurately can it be placed on a map, e.g., by giving longitude and latitude coordinates or by selecting a neighborhood?

TASK OVERVIEW

• Two sub-tasks (in 2015 these were the locales-based and mobility-based sub-tasks):

• estimation-based sub-task

• verification-based sub-task

• Organizer baselines provided

• Live leaderboard

ESTIMATION-BASED SUB-TASK

• Participants are given a hierarchy of places across the world

• place hierarchy from the YFCC100M Places expansion pack

• Country - State - City - Neighborhood

• For each photo/video, participants can either:

• estimate the GPS coordinate, or

• choose a node (i.e., a place) from the hierarchy in which they most confidently believe it was taken (see the sketch below)
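A minimal sketch of these two output options in Python, using hypothetical data structures (PlaceNode and Estimate are illustrative names, not the task's official submission format):

```python
# Hypothetical data structures illustrating the two ways to answer:
# a raw GPS estimate, or a node from the Country - State - City - Neighborhood hierarchy.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class PlaceNode:
    level: str                     # "country", "state", "city", or "neighborhood"
    name: str                      # e.g. "Paris"
    centroid: Tuple[float, float]  # (latitude, longitude) representing the place

@dataclass
class Estimate:
    item_id: str
    coordinate: Optional[Tuple[float, float]] = None  # direct lat/lon estimate
    place: Optional[PlaceNode] = None                  # or the most confident hierarchy node

# Example: back off to the city level when the evidence does not support a finer guess.
guess = Estimate(item_id="123456", place=PlaceNode("city", "Paris", (48.8566, 2.3522)))
```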

VERIFICATION-BASED SUB-TASK

• Verify whether or not the media item was really captured in the given place

• Requires a notion of confidence (see the sketch below)

Paris, France?

Yes / No
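A minimal sketch of the required decision, assuming a hypothetical model that scores how likely the item was captured inside the claimed place (the 0.5 threshold is illustrative, not prescribed by the task):

```python
# Hypothetical: 'confidence' comes from whatever geo-verification model a team builds.
def verify(confidence: float, threshold: float = 0.5) -> str:
    """Map a confidence score in [0, 1] to the required Yes/No answer."""
    return "Yes" if confidence >= threshold else "No"

print(verify(0.83))  # "Yes" -> the model believes the photo was taken in Paris, France
print(verify(0.12))  # "No"
```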

ORGANIZER BASELINES

• Two open-source baselines are provided

• Estimation-based sub-task

• Verification-based sub-task

http://bit.ly/2dnggcg

LIVE LEADERBOARD

• Participants can submit runs and view their standing relative to the other teams

• Evaluated on a development set (a subset of the test set)

TASK DATASET

• Training: 4,991,679 photos, 24,955 videos

• Testing: 1,497,464 photos, 29,934 videos

• Drawn from the Yahoo Flickr Creative Commons 100 Million (YFCC100M) dataset

• only photos and videos that were successfully reverse geocoded are included

PRECOMPUTED FEATURES

• textual metadata

• as included in YFCC100M

• visual features

• LIRE, GIST, SIFT

• CNN Codes (HybridNet, VGG)

• audio features

• MFCC, pitch (Kaldi, SAcC); see the sketch below
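The audio features are distributed precomputed (Kaldi/SAcC outputs). Purely as an illustration of what MFCC extraction looks like, and not the task's own pipeline, one could compute them with librosa:

```python
# Illustration only: the task ships precomputed Kaldi/SAcC features.
# This shows roughly how MFCCs are extracted from an audio track with librosa.
import librosa

y, sr = librosa.load("video_audio_track.wav", sr=16000)  # hypothetical file name
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)       # array of shape (13, n_frames)
print(mfcc.shape)
```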

TASK EVALUATION

• estimation-based sub-task

• geographic distance between the ground-truth coordinate and the predicted coordinate or place from the hierarchy

• verification-based sub-task

• classification accuracy is measured

• Karney's formula is used to calculate the distance between the ground truth and the estimated location (a sketch follows below)
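A minimal sketch of that distance computation, using the geographiclib package, which implements Karney's geodesic algorithm (the helper name error_km and the example coordinates are illustrative):

```python
# Geodesic distance on the WGS84 ellipsoid via Karney's algorithm (geographiclib).
# pip install geographiclib
from geographiclib.geodesic import Geodesic

def error_km(true_lat, true_lon, pred_lat, pred_lon):
    """Distance in kilometres between the ground truth and the estimated location."""
    g = Geodesic.WGS84.Inverse(true_lat, true_lon, pred_lat, pred_lon)
    return g["s12"] / 1000.0  # 's12' is the geodesic distance in metres

# Example: ground truth in Hilversum, estimate in Paris -> roughly 425 km.
print(error_km(52.2292, 5.1669, 48.8566, 2.3522))
```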

RUNS

• run1 - Only the provided textual metadata

• run2 - Only the provided visual & aural features

• run3 - Only the provided textual metadata as well as the visual & aural features

• run4 & 5 - Everything is allowed (e.g., gazetteers, dictionaries, Web corpora)

• Except for crawling the exact items contained in the test set

PARTICIPANT STATISTICS

                  estimation-based sub-task        verification-based sub-task
                  run1   run2   run3   run4/5      run1   run2   run3   run4/5
CERTH/CEA LIST     O      O      O      O
RECOD              O      O      O      O
CSUA               O      O      O      O            O      O      O      O

Two 'veterans' and one new participant

RESULT - RUN1: TEXTUAL METADATA ONLY

Percentage of items placed within each radius:

           CERTH/CEA LIST      RECOD               CSUA
           Photo     Video     Photo     Video     Photo     Video
10m        0.59      0.55      0.59      0.45      0.27      0.27
100m       6.42      6.86      6.07      5.74      2.88      3.03
1km        24.55     22.73     21.06     18.69     14.13     13.5
10km       43.32     40.6      38        33.57     35.28     33.48
100km      51.26     48.24     46.23     41.56     50.28     47.6
1000km     64.06     60.84     59.69     54.51     64.17     60.06

[Bar charts of the table above: one over the full 10m-1000km range and one zoomed in on the 10m, 100m, and 1km radii, per team, photos vs. videos.]


RESULT - RUN2: VISUAL & AUDIO ONLY

Percentage of items placed within each radius:

           CERTH/CEA LIST      RECOD               CSUA
           Photo     Video     Photo     Video     Photo     Video
10m        0.08      0         0.09      0         0         0
100m       1.84      0.06      0.87      0.03      0         0
1km        5.62      0.5       2.36      0.15      0.42      0.14
10km       8.16      2.48      4.47      1.15      2.13      0.81
100km      10.21     4.97      5.88      2.46      4         1.77
1000km     26.31     22.1      21.46     13.54     22.97     6.95

[Bar charts of the table above: one over the full 10m-1000km range and one zoomed in on the 10m, 100m, and 1km radii, per team, photos vs. videos.]


RESULT - RUN3: TEXT + VISUAL + AUDIO

Percentage of items placed within each radius:

           CERTH/CEA LIST      RECOD               CSUA
           Photo     Video     Photo     Video     Photo     Video
10m        0.56      0.55      0.56      0.51      0.27      0.27
100m       6.58      6.86      5.97      5.82      2.89      3.03
1km        25.03     22.73     20.83     18.46     14.13     13.5
10km       43.73     40.6      37.72     33.38     35.26     33.48
100km      51.69     48.24     46.04     41.2      50.25     47.6
1000km     64.58     60.84     59.89     54.77     64.03     60.08

[Bar charts of the table above: one over the full 10m-1000km range and one zoomed in on the 10m, 100m, and 1km radii, per team, photos vs. videos.]


RUN4 & RUN5: USE ANYTHING

[Bar charts of the table below: one over the full 10m-1000km range and one zoomed in on the 10m, 100m, and 1km radii, per team and run, photos vs. videos.]

Percentage of items placed within each radius:

           CERTH/CEA LIST run4   CERTH/CEA LIST run5   RECOD               CSUA
           Photo     Video       Photo     Video       Photo     Video     Photo     Video
10m        0.7       0.72        0.72      0.71        0.71      0.37      0.27      0.27
100m       7.96      8.27        8.27      8.19        8.19      4.03      2.94      3.36
1km        27.82     28.54       28.54     26.16       26.16     13.51     13.24     13.29
10km       46.52     46.45       46.45     43.62       43.62     25.76     33.02     32.61
100km      53.96     53.5        53.5      50.44       50.44     33.02     51.14     49.35
1000km     66.11     65.32       65.32     61.93       61.93     47.67     64.58     61.18


VERIFICATION TASK: PHOTO

[Bar chart: classification accuracy (%) for run1-run4 and the organizer baseline, at the neighborhood, city, state, and country levels.]

VERIFICATION TASK: VIDEO

[Bar chart: classification accuracy (%) for run1-run4 and the organizer baseline, at the neighborhood, city, state, and country levels.]

WHAT WE LEARNED

• Participants' systems

• No visible trend; many different approaches

• language models, similarity search, genetic programming, etc.

• Fusion: heuristic, confidence-based, ranking fusion (see the sketch after this list)

• External data (gazetteers) helps. More data helps!

• Photos were better geo-located than videos (but not always)
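As a generic illustration of confidence-based fusion, and not any particular team's method, one simple strategy keeps, per item, the estimate from whichever modality is most confident:

```python
# Generic confidence-based late-fusion sketch (not any specific participant's system):
# each modality proposes (latitude, longitude, confidence); keep the most confident one.
from typing import Dict, Tuple

Candidate = Tuple[float, float, float]  # (lat, lon, confidence in [0, 1])

def fuse(candidates: Dict[str, Candidate]) -> Tuple[float, float]:
    """Return the lat/lon proposed by the most confident modality."""
    lat, lon, _ = max(candidates.values(), key=lambda c: c[2])
    return lat, lon

# Example: the text-based estimate wins over the visual one here.
print(fuse({"text": (48.8566, 2.3522, 0.9), "visual": (51.5074, -0.1278, 0.4)}))
```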

THANK YOU!!!