MediaEval 2016 - UNIFESP Predicting Media Interestingness Task
Predicting Media Interestingness Task Overview
Claire-Hélène Demarty – Technicolor
Mats Sjöberg – University of Helsinki
Bogdan Ionescu – University Politehnica of Bucharest
Thanh-Toan Do – Singapore University of Technology and Design
Hanli Wang – Tongji University
Ngoc Q.K. Duong – Technicolor
Frédéric Lefebvre – Technicolor
MediaEval 2016 Workshop, October 20-21, 2016
Interestingness?
Are these interesting images?
Definition? Subjective; semantic; perceptual.
Task definition
- Derives from a use case at Technicolor: helping professionals illustrate a Video on Demand (VOD) web site by selecting interesting frames and/or video excerpts for the posted movies.
- The frames and excerpts should help a user decide whether he/she is interested in watching the underlying movie.
- Two subtasks: Image and Video
  - Image subtask: given a set of key-frames extracted from a movie, …
  - Video subtask: given the video shots of a movie, …
  … automatically identify those images/shots that viewers report to be the most interesting in the given movie.
- Binary classification task on a per-movie basis, but confidence values are also required.
12/7/16
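The required output per movie (a binary decision plus a confidence value for each shot or key-frame) could be produced by thresholding classifier scores. A minimal illustrative sketch, assuming the raw scores serve as confidences and the top fraction of shots is flagged as interesting (`rank_and_decide` and `top_frac` are hypothetical names; the fraction here mirrors the roughly 10% interesting rate reported for the dataset):

```python
def rank_and_decide(scores, top_frac=0.1):
    """Illustrative per-movie post-processing: keep the raw classifier
    scores as the required confidence values, and flag the top fraction
    of shots as interesting (top_frac is a hypothetical choice)."""
    # indices sorted by decreasing score
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    k = max(1, round(top_frac * len(scores)))
    decisions = [0] * len(scores)
    for i in order[:k]:
        decisions[i] = 1
    return decisions, scores  # binary labels + confidences
```
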
Dataset & additional features
- From Hollywood-like movie trailers
- Manual segmentation of shots
- Extraction of the middle key-frame of each shot

              Development Set           Test Set
              Total   % interesting     Total   % interesting
Trailer #     52                        26
Shot #        5054    8.3               2342    9.6
Key-frame #   5054    9.4               2342    10.3

- Precomputed content descriptors:
  - Low-level: dense SIFT, HoG, LBP, GIST, HSV color histograms, MFCC, fc7 and prob layers from AlexNet
  - Mid-level: face detection and tracking-by-detection
Manual annotations
- Annotation pipeline: pair comparison protocol → pairs → aggregation into rankings → binary decision (manual thresholding)
- Annotators: >310 persons for video, >100 persons for image, from 29 countries
Thank you Mats! Thanks to all of you!
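The aggregation of pairwise comparisons into rankings could be done, for example, with a Bradley-Terry model; the slides do not name the aggregation method, so this is purely an illustrative choice. A minimal sketch using the classic minorization-maximization updates:

```python
from collections import defaultdict

def bradley_terry(pair_wins, n_items, n_iters=100):
    """Fit Bradley-Terry scores from pairwise comparison counts.
    pair_wins maps an ordered pair (i, j) to the number of times
    annotators judged item i more interesting than item j."""
    wins = defaultdict(float)   # total wins per item
    games = defaultdict(float)  # comparisons per unordered pair
    for (i, j), w in pair_wins.items():
        wins[i] += w
        games[frozenset((i, j))] += w
    p = [1.0] * n_items
    for _ in range(n_iters):
        new_p = []
        for i in range(n_items):
            denom = 0.0
            for j in range(n_items):
                if i == j:
                    continue
                n_ij = games[frozenset((i, j))]
                if n_ij:
                    denom += n_ij / (p[i] + p[j])
            new_p.append(wins[i] / denom if denom else p[i])
        s = sum(new_p)          # normalize away the scale invariance
        p = [x / s for x in new_p]
    return p  # higher score = ranked as more interesting
```

Sorting items by the fitted scores yields the ranking, which can then be thresholded into the binary labels, as in the pipeline above.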
Required runs
- Image subtask: visual information only, no external data
- Video subtask: audio and visual information, no external data
- External data IS:
  - Additional datasets and annotations dedicated to interestingness prediction
  - Pre-trained models, features, detectors obtained from such dedicated datasets
  - Additional metadata that could be found on the internet about the provided content
- External data IS NOT:
  - CNN features generated on generic datasets not dedicated to interestingness prediction
Evaluation metrics
- Official measure: Mean Average Precision (MAP), computed over all trailers
- Additional metrics are also computed: false alarm rate, miss detection rate, precision, recall, F-measure, etc.
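Mean Average Precision over trailers can be sketched as follows. This is an illustrative implementation, not the official evaluation script; it assumes each trailer's shots are ordered by decreasing predicted confidence, with `1` marking a ground-truth interesting shot:

```python
def average_precision(ranked_labels):
    """AP for one trailer: ranked_labels lists the ground-truth labels
    (1 = interesting) ordered by decreasing predicted confidence."""
    hits, precisions = 0, []
    for k, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            precisions.append(hits / k)  # precision at each relevant rank
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(per_trailer_ranked_labels):
    """Official measure: the mean of the per-trailer APs."""
    aps = [average_precision(r) for r in per_trailer_ranked_labels]
    return sum(aps) / len(aps)
```
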
Task participation
[Bar chart "Task Participation": number of teams (0-35) at each stage – Registrations, Returned agreements, Submitting teams, Workshop]
- Registrations: 31 teams
  - 16 countries
  - 3 'experienced' teams
- Submissions: 12 teams
  - 9 teams submitted to both subtasks
  - 2 teams on the image subtask only
  - 1 team on the video subtask only
Official results – Image subtask – 27 runs

Run                                                  MAP     Team
me16in_tudmmc2_image_histface                        0.2336  TUDMMC
me16in_technicolor_image_run1_SVM_rbf*               0.2336  Technicolor
me16in_technicolor_image_run2_DNNresampling06_100*   0.2315  Technicolor
me16in_MLPBOON_image_run5                            0.2296  MLPBOON
me16in_BigVid_image_run5FusionCNN                    0.2294  BigVid
me16in_MLPBOON_image_run1                            0.2205  MLPBOON
me16in_tudmmc2_image_hist                            0.2202  TUDMMC
me16in_MLPBOON_image_run4                            0.217   MLPBOON
me16in_HUCVL_image_run1                              0.2125  HUCVL
me16in_HUCVL_image_run2                              0.2121  HUCVL
me16in_UITNII_image_FA                               0.2115  UITNII
me16in_RUC_image_run2                                0.2035  RUC
me16in_MLPBOON_image_run2                            0.2023  MLPBOON
me16in_HUCVL_image_run3                              0.2001  HUCVL
me16in_RUC_image_run3                                0.1991  RUC
me16in_RUC_image_run1                                0.1987  RUC
me16in_ethcvl1_image_run2                            0.1952  ETHCVL
me16in_MLPBOON_image_run3                            0.1941  MLPBOON
me16in_HKBU_image_baseline                           0.1868  HKBU
me16in_ethcvl1_image_run1                            0.1866  ETHCVL
me16in_ethcvl1_image_run3                            0.1858  ETHCVL
me16in_HKBU_image_drbaseline                         0.1839  HKBU
me16in_BigVid_image_run4SVM                          0.1789  BigVid
me16in_UITNII_image_V1                               0.1773  UITNII
me16in_lapi_image_runf1*                             0.1714  LAPI
me16in_UNIGECISA_image_ReglineLoF                    0.1704  UNIGECISA
BASELINE (on test set)                               0.1655
me16in_lapi_image_runf2*                             0.1398  LAPI

* organizers
Official results – Video subtask – 28 runs

Run                                                       MAP     Team
me16in_recod_video_run1                                   0.1815  RECOD
me16in_recod_video_run1_old                               0.1753  RECOD
me16in_HKBU_video_drbaseline                              0.1735  HKBU
me16in_UNIGECISA_video_RegsrrLoF                          0.171   UNIGECISA
me16in_RUC_video_run2                                     0.1704  RUC
me16in_UITNII_video_A1                                    0.169   UITNII
me16in_recod_video_run4                                   0.1656  RECOD
me16in_RUC_video_run1                                     0.1647  RUC
me16in_UITNII_video_F1                                    0.1641  UITNII
me16in_lapi_video_runf5                                   0.1629  LAPI
me16in_technicolor_video_run5_CSP_multimodal_80_epoch7    0.1618  Technicolor
me16in_recod_video_run2                                   0.1617  RECOD
me16in_recod_video_run3                                   0.1617  RECOD
me16in_ethcvl1_video_run2                                 0.1574  ETHCVL
me16in_lapi_video_runf3                                   0.1574  LAPI
me16in_lapi_video_runf4                                   0.1572  LAPI
me16in_tudmmc2_video_histface                             0.1558  TUDMMC
me16in_tudmmc2_video_hist                                 0.1557  TUDMMC
me16in_BigVid_video_run3RankSVM                           0.154   BigVid
me16in_HKBU_video_baseline                                0.1521  HKBU
me16in_BigVid_video_run2FusionCNN                         0.1511  BigVid
me16in_UNIGECISA_video_RegsrrGiFe                         0.1497  UNIGECISA
BASELINE (on test set)                                    0.1496
me16in_BigVid_video_run1SVM                               0.1482  BigVid
me16in_technicolor_video_run3_LSTM_U19_100_epoch5         0.1465  Technicolor
me16in_recod_video_run5                                   0.1435  RECOD
me16in_UNIGECISA_video_SVRloAudio                         0.1367  UNIGECISA
me16in_technicolor_video_run4_CSP_video_80_epoch9         0.1365  Technicolor
me16in_ethcvl1_video_run1                                 0.1362  ETHCVL

* organizers
What we have learned
- On the task itself:
  - Image interestingness is NOT video interestingness
  - Issue with the video dataset (needs more iterations? more data samples?)
  - Overall low MAP values: room for improvement!
- On the participants' systems:
  - This year's trend? No trend!
  - Classic machine learning and deep learning systems… but also rule-based systems
  - Some multimodal (audio, video, text), some temporal… and some not
  - (Mostly) no use of external data
  - Simple systems did as well as (or better than) sophisticated systems
  - Dataset imbalance: an issue?
  - Dataset size: penalizing deep learning systems?
Thank you!