Search and Hyperlinking Overview @MediaEval2014

Search and Hyperlinking 2014

Overview

Maria Eskevich, Robin Aly, David Nicolás Racca, Roeland Ordelman, Shu Chen, Gareth J.F. Jones

Find what you were (not) looking for

Search & Explore

Jump-in points!


Users (main group: user, target)

• Researchers & Educators:

– Journalists: Research

– Academic researchers & students: Investigate

– Academic educators: Educate

• Public users:

– Citizens: Entertainment, Infotainment

• Media Professionals:

– Broadcast Professionals: Reuse

– Media Archivists: Annotate

Recommendation (Linking)

Not what we want

Linking Audio-Visual Content

1998 2002 2008 2010 2013 2015

DATA

BIG DATA?

not representative

representative

Search & Hyperlinking task

• User oriented: aim to explore the needs of real users expressed as queries.

– How: UK citizens and crowdsourcing for retrieval assessment

• Temporal aspect: seek to direct users to the relevant parts of retrieved video (“jump-in point”).

– How: segmentation, segment overlap, transcripts, prosodic, visual (low-level, high-level; keyframes)

• Multimodal: want to investigate technologies for addressing variety in user needs and expectations

– varied visual and audio contributions, intentional gap between query and multimodal descriptors in content

ME Search & Hyperlinking task in development: 2012 – 2014

Columns: Search 2012, 2013, 2014; Hyperlinking 2012, 2013, 2014

Dataset: BlipTv BBC BlipTv BBC

Features released:

Transcripts: 2 ASR 3 ASR 2 ASR 3 ASR

Prosodic features: no yes no yes

Visual clues for queries: yes no no

Concept detection: yes yes

Type of the task: Known-item Ad-hoc Ad-hoc

Query/Anchors creation: PC iPad PC iPad

Number of queries/anchors: 30/30 4/50 50/30 30/30 11/ 98/30

Relevance assessment: MTurk users (BBC) MTurk MTurk

Numbers of assessed cases: 30, 50, 9 900, 3 517, 9 975, 13 141

Evaluation metrics: MRR, MASP, MASDWP MAP(-bin/tol), MAP MAP(-bin/tol), P@5/10

Dataset: Video collection

• BBC copyright cleared broadcast material:

– Videos:

• Development set: 6 weeks between 01.04.2008 and 11.05.2008 (1335 hours, 2323 videos)

• Test set: 11 weeks between 12.05.2008 and 31.07.2008 (2686 hours, 3528 videos)

– Manually transcribed subtitles

– Metadata

• Additional data:

– ASR: LIMSI/Vocapia, LIUM, NST-Sheffield

– Shot boundaries, keyframes

– Output of visual concept detectors by University of Leuven and University of Oxford

Dataset: Query

• 28 Users

– Policemen, Hairdresser, Bouncer, Sales manager, Student, Self-employed

• Two hour session on iPads:

– Search the archive (document level)

– Define clips (segment level)

– Define anchors (anchor level)

Statement of Information Need → Search / Refine → Relevant Clips → Define Anchors

Data cleaning: Usable Information Need

• Description clearly specifies what is relevant

• A query with a suitable title exists

• Sufficient relevant segments exist (try query)

Data cleaning: Process

• For each information need in batch:

1. Check if usable

2. If in doubt, use search to look for relevant data

3. Reword & spellcheck description

4. Select the first suitable query

5. Save

Data cleaning: Usable Anchor

• Longer than 5 seconds

• Destination description clearly identifies the material the user wants to see when activating the anchor described by the label

• It is likely that there are some relevant items in the collection

Data cleaning: Process

• For each information need in assigned batch

– Go through anchors

• check if usable

• reword & spellcheck description

• Assess whether it is likely that links can be found in the collection (possibly using search)

– Save

Dataset: outcome (1/2)

• 30 queries:

<top>
<queryId>query_6</queryId>
<refId>53b3cf9d42b47e4c32545510</refId>
<queryText>saturday kitchen cocktails</queryText>
</top>

<top>
<queryId>query_1</queryId>
<refId>53b3c64b42b47e4a362be4ce</refId>
<queryText>sightseeing london</queryText>
</top>
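A minimal sketch of how such a query file could be read in Python. It assumes the `<top>` elements come without an enclosing root element, as in the excerpt, so a wrapper element is added before parsing. The function name `parse_queries` and the wrapper tag `topics` are illustrative and not part of the task release.

```python
import xml.etree.ElementTree as ET

# The two example queries from the slide.
SAMPLE = """
<top>
<queryId>query_6</queryId>
<refId>53b3cf9d42b47e4c32545510</refId>
<queryText>saturday kitchen cocktails</queryText>
</top>
<top>
<queryId>query_1</queryId>
<refId>53b3c64b42b47e4a362be4ce</refId>
<queryText>sightseeing london</queryText>
</top>
"""


def parse_queries(xml_text):
    """Return one dict per <top> element with queryId, refId and queryText."""
    # Assumption: there is no single root element, so wrap the text in one.
    root = ET.fromstring("<topics>" + xml_text + "</topics>")
    queries = []
    for top in root.findall("top"):
        queries.append({
            "queryId": top.findtext("queryId").strip(),
            "refId": top.findtext("refId").strip(),
            "queryText": top.findtext("queryText").strip(),
        })
    return queries


queries = parse_queries(SAMPLE)
for query in queries:
    print(query["queryId"], "->", query["queryText"])
```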

Dataset: outcome (2/2)

• 30 anchors:

<anchor>
<anchorId>anchor_1</anchorId>
<refId>53b3c46f42b47e459265d06f</refId>
<startTime>16.38</startTime>
<endTime>17.35</endTime>
<fileName>v20080629_184000_bbctwo_killer_whales_in_the</fileName>
</anchor>
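A small sketch of checking an anchor against the "longer than 5 seconds" criterion from the data cleaning slides. It assumes the time values are written as minutes.seconds (so `16.38` means 16 min 38 s); the slides do not state the format, and the helper names are illustrative.

```python
import xml.etree.ElementTree as ET

# The example anchor from the slide.
ANCHOR = """
<anchor>
<anchorId>anchor_1</anchorId>
<refId>53b3c46f42b47e459265d06f</refId>
<startTime>16.38</startTime>
<endTime>17.35</endTime>
<fileName>v20080629_184000_bbctwo_killer_whales_in_the</fileName>
</anchor>
"""

MIN_ANCHOR_SECONDS = 5  # "Longer than 5 seconds" criterion


def mmss_to_seconds(value):
    """Convert a 'minutes.seconds' string to seconds (assumed format)."""
    minutes, seconds = value.strip().split(".")
    return int(minutes) * 60 + int(seconds)


def anchor_duration(anchor_xml):
    """Return the anchor length in seconds."""
    anchor = ET.fromstring(anchor_xml.strip())
    start = mmss_to_seconds(anchor.findtext("startTime"))
    end = mmss_to_seconds(anchor.findtext("endTime"))
    return end - start


duration = anchor_duration(ANCHOR)
is_long_enough = duration > MIN_ANCHOR_SECONDS
print(duration, is_long_enough)
```

Under the minutes.seconds assumption the example anchor lasts 57 seconds and passes the length check.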

Ground truth creation

• Queries/Anchors: user studies at BBC:

– 28 users with the following profile: age 18-30 years old; use of search engines and services on iPads on a daily basis

• Relevance assessment: via crowdsourcing on the Amazon MTurk platform:

– Top 10 results from 58 search and 62 hyperlinking submissions

– 1 judgment per query or anchor, accepted or rejected by an automated algorithm; special cases of user typos were checked manually

– Number of evaluated HITs: 9 900 for search, and 13 141 for hyperlinking

Evaluation metrics

• P@5/10/20

• MAP based:

– MAP: taking into account any overlapping segment

– MAP-bin: relevant segments are binned for relevance

– MAP-tol: only start times of the segments are considered
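A rough sketch of how precision at k and average precision could be computed over time segments, with relevance decided either by segment overlap or by start times only. This is an illustration under assumptions and not the official scoring code: the 30-second tolerance is a hypothetical value, counting each ground-truth segment at most once is a choice made here, and the binned variant (MAP-bin) is not shown. MAP would be the mean of the average precision values over all queries or anchors.

```python
def overlaps(seg, truth_seg):
    """True if two (start, end) segments share any time span."""
    return seg[0] < truth_seg[1] and truth_seg[0] < seg[1]


def within_tolerance(tolerance):
    """Build a matcher that compares start times only."""
    def matches(seg, truth_seg):
        return abs(seg[0] - truth_seg[0]) <= tolerance
    return matches


def judge(ranked, truth, matches):
    """Flag each ranked segment that matches a not yet matched truth segment."""
    used = set()
    flags = []
    for seg in ranked:
        hit = next((i for i, t in enumerate(truth)
                    if i not in used and matches(seg, t)), None)
        if hit is not None:
            used.add(hit)  # assumption: each truth segment counts once
        flags.append(hit is not None)
    return flags


def precision_at_k(flags, k):
    """Fraction of the top-k results that are relevant."""
    return sum(flags[:k]) / k


def average_precision(flags, num_relevant):
    """Average precision of one ranked list."""
    if num_relevant == 0:
        return 0.0
    hits, total = 0, 0.0
    for rank, relevant in enumerate(flags, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / num_relevant


# Hypothetical ground truth and ranked result list, in seconds.
truth = [(60, 120), (300, 360)]
ranked = [(100, 150), (200, 250), (310, 330), (110, 130)]

flags_overlap = judge(ranked, truth, overlaps)
flags_tol = judge(ranked, truth, within_tolerance(30))
ap_overlap = average_precision(flags_overlap, len(truth))
ap_tol = average_precision(flags_tol, len(truth))
p_at_2 = precision_at_k(flags_overlap, 2)
```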

RESULTS

Results: Search sub-task: MAP

[Bar chart; legend: LIMSI/Vocapia, Manual, No ASR, NST/Sheffield, LIUM]

Results: Search sub-task: MAP_bin

[Bar chart]


Results: Search sub-task: MAP_tol

[Bar chart]


Results: Hyperlinking sub-task: MAP

[Bar chart of MAP per submitted run; some run labels are truncated on the slide]

• CUNI: CUNI_F_M_NoOverlapAu…, CUNI_F_M_NoOverlapKSI…, CUNI_F_M_NoOverlapKSI…, CUNI_F_M_NoOverlapNo…, CUNI_F_M_OverlapKSIWe…, CUNI_F_N_NoOverlapAud…, CUNI_F_N_NoOverlapKSI…, CUNI_F_N_NoOverlapNo…, CUNI_O_M_NoOverlapKSI…

• DCLab: DCLab_Sh_N_Concept2, DCLab_Sh_N_ConceptEnri…

• IRISAKUL: IRISAKUL_Ss_N_HTM, IRISAKUL_Ss_N_NGRAM, IRISAKUL_Ss_N_TM1, IRISAKUL_Ss_N_TM2, IRISAKUL_Ss_O_NGRAMN…

• JRS: JRS_F_MV_ATextVisR, JRS_F_MV_AwConcept, JRS_F_MV_CTextVisR, JRS_F_MV_CwConcept, JRS_F_M_AText, JRS_F_M_CText, JRS_F_V_AcOnly, JRS_F_V_CcOnly

• LINKEDTV2014: LINKEDTV2014_O_O_K, LINKEDTV2014_O_VO_KC7S, LINKEDTV2014_O_VO_KC…, LINKEDTV2014_Ss_N_ALL, LINKEDTV2014_Ss_N_TEXT


Results: Hyperlinking sub-task: MAP_bin

[Bar chart of MAP_bin per submitted run: CUNI, DCLab, IRISAKUL, JRS and LINKEDTV2014 runs]


Results: Hyperlinking sub-task: MAP_tol

[Bar chart of MAP_tol per submitted run: CUNI, DCLab, IRISAKUL, JRS and LINKEDTV2014 runs]


Lessons learned

1. iPad vs PC = different user behaviour and expectation from the system.

2. Prosodic features broaden the scope of the search sub-task.

3. Use of shot segmentation based units achieves the worst scores for both sub-tasks.

4. Use of metadata improves results for both sub-tasks.

The Search and Hyperlinking task was supported by

We are grateful to Jana Eggink and Andy O'Dwyer from the BBC for preparing the collection and hosting the user trials.

... and of course Martha for advice & crowdsourcing access.

JRS at Search and Hyperlinking of Television Content Task

Werner Bailer, Harald Stiegler

MediaEval Workshop, Barcelona, Oct. 2014

Linking sub-task

• Matching terms from textual resources

• Reranking based on visual similarity (VLAT)

• Using visual concepts (only/in addition)

• Results

– Differences between different text resources

– Context helped only in few of the cases

– Visual reranking provides small improvement

– Visual concepts did not provide improvements


Solution with concept enrichment

• Concept enrichment: the set of words is extended with their synonyms or other conceptually connected words.

• Top 10 vs top 50 conceptually connected words for each word

• Conclusion: the results show that concept enrichment with fewer words gives better precision, because adding more words introduces more noise.
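The enrichment step can be pictured with a short sketch, assuming a precomputed mapping from each word to its related words, ranked from most to least related. The mapping `RELATED`, its entries, and the function `enrich` are hypothetical; the slides do not say how the connected words were obtained.

```python
def enrich(words, related, top_n):
    """Extend a word set with up to top_n related words per original word."""
    enriched = set(words)
    for word in words:
        # Keep only the top_n closest related words to limit noise.
        enriched.update(related.get(word, [])[:top_n])
    return enriched


# Hypothetical related-word lists, most related first.
RELATED = {
    "kitchen": ["cooking", "chef", "recipe", "house"],
    "cocktails": ["drinks", "bar", "party"],
}

small = enrich(["kitchen", "cocktails"], RELATED, top_n=1)
large = enrich(["kitchen", "cocktails"], RELATED, top_n=3)
print(sorted(small))
print(sorted(large))
```

A larger `top_n` pulls in more loosely related words, which is the source of the extra noise mentioned in the conclusion.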

Zsombor Paróczi, Bálint Fodor, Gábor Szűcs

Television Linked To The Web

www.linkedtv.eu

H.A. Le¹, Q.M. Bui¹, B. Huet¹, B. Cervenková², J. Bouchner², E. Apostolidis³, F. Markatopoulou³, A. Pournaras³, V. Mezaris³, D. Stein⁴, S. Eickeler⁴, and M. Stadtschnitzer⁴

¹ Eurecom, Sophia Antipolis, France

² University of Economics, Prague, Czech Republic

³ Information Technologies Institute, CERTH, Thessaloniki, Greece

⁴ Fraunhofer IAIS, Sankt Augustin, Germany

16-17 Oct 2014

LinkedTV @ MediaEval 2014

Search and Hyperlinking Task

• Different granularities: video level, scene level (visual/topic) and sentence level.

• Different features: text (subtitles / transcripts), visual concepts, keywords, etc…

Reasons to visit the LinkedTV poster


• How to incorporate visual information into the search?

• Visual concept detection in the search query: mapping between query keywords and visual concepts (151 semantic concepts from TRECVID 2012)

– Semantic word distance based on WordNet

– Identification of salient visual concepts from Google Image search results (query keywords)



• How to incorporate visual information into the search?

• Integration of detected visual concepts to the search:

– Designing an enriched query, based on textual (text query) and visual information (range query)

– Fusion of text score (Solr) and visual concepts scores
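One simple way to picture the fusion step is a weighted sum of the two scores per segment. The sketch below assumes both scores are already normalised to the range 0 to 1 and uses a hypothetical weight of 0.3 for the visual part; the slides do not give the fusion formula or weights that LinkedTV used.

```python
def fuse_scores(text_scores, visual_scores, visual_weight=0.3):
    """Rank segments by a weighted sum of text and visual concept scores."""
    fused = {}
    for seg_id, text_score in text_scores.items():
        # Segments without a visual score contribute 0 for the visual part.
        visual_score = visual_scores.get(seg_id, 0.0)
        fused[seg_id] = ((1 - visual_weight) * text_score
                         + visual_weight * visual_score)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical normalised scores for three segments.
text_scores = {"a": 0.9, "b": 0.8, "c": 0.2}
visual_scores = {"a": 0.1, "b": 0.9}

ranking = fuse_scores(text_scores, visual_scores)
for seg_id, score in ranking:
    print(seg_id, round(score, 2))
```

In this toy example segment b overtakes segment a once the visual score is taken into account.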


