MediaEval 2012 SED Opening

Social Event Detection (SED): Challenges, Dataset and Evaluation

Raphaël Troncy <[email protected]> Vasileios Mezaris <[email protected]> Symeon Papadopoulos <[email protected]> Emmanouil Schinas <[email protected]> Ioannis Kompatsiaris <[email protected]>

Description

Opening presentation of the Social Event Detection (SED) task at MediaEval 2012, October 2012, Pisa, Italy

Transcript of MediaEval 2012 SED Opening

Page 1: MediaEval 2012 SED Opening

Social Event Detection (SED): Challenges, Dataset and Evaluation

Raphaël Troncy <[email protected]> Vasileios Mezaris <[email protected]> Symeon Papadopoulos <[email protected]> Emmanouil Schinas <[email protected]> Ioannis Kompatsiaris <[email protected]>

Page 2: MediaEval 2012 SED Opening

What are Events?

Events are observable occurrences grouping people, places and time: experiences documented by media.

04/10/2012 - Social Event Detection (SED) Task - MediaEval 2012, Pisa, Italy

Page 3: MediaEval 2012 SED Opening

SED: bigger, longer, harder

In 2011:
- 2 challenges
- 73k photos (2.43 GB)
- No training dataset
- 18 teams interested, 7 teams submitted runs
- Considered easy: F-measure = 85% (challenge 1), F-measure = 69% (challenge 2)

In 2012:
- 3 challenges, 1 from SED 2011
- 167k photos (5.5 GB), CC licence check
- Training dataset = SED 2011
- 21 teams interested ... from 15 countries
- 5 teams submitted runs
- Much harder!

Page 4: MediaEval 2012 SED Opening

Three challenges (type and venue)

1. Find all technical events that took place in Germany in the test collection.

2. Find all soccer events taking place in Hamburg (Germany) and Madrid (Spain) in the collection.

3. Find all demonstration and protest events of the Indignados movement occurring in public places in Madrid in the collection.

For each event, we provided relevant and non-relevant example photos.

Task = detect events and provide all illustrating photos.

Page 5: MediaEval 2012 SED Opening

Dataset Construction

- Collected 167,332 Flickr photos (Jan 2009 - Dec 2011) from 4,422 unique Flickr users, all under a CC licence
- All geo-tagged in 5 cities: Barcelona (72,255), Cologne (15,850), Hannover (2,823), Hamburg (16,958), Madrid (59,043), plus 0.22% (403) from EventMedia
- Altered metadata: geo-tags removed for 80% of the photos (at random); 33,466 photos remained geo-tagged
- Only metadata was provided ... but the real media (5.5 GB) were available to participants on request
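The geo-tag removal step described above can be sketched as follows; a minimal illustration assuming each photo record is a dict with optional `latitude`/`longitude` keys (these field names are hypothetical, not the actual dataset schema):

```python
import random

def anonymize_geotags(photos, keep_fraction=0.2, seed=42):
    """Strip geo-coordinates from a random 80% of the photo records,
    keeping them for the remaining 20% (as done for the SED 2012
    test collection). Returns new records; the input is not mutated."""
    rng = random.Random(seed)
    photos = [dict(p) for p in photos]  # work on copies
    keep = set(rng.sample(range(len(photos)), int(len(photos) * keep_fraction)))
    for i, p in enumerate(photos):
        if i not in keep:
            p.pop("latitude", None)
            p.pop("longitude", None)
    return photos
```

With a fixed seed the split is reproducible, which matters when the altered metadata has to be distributed identically to all participants.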

Page 6: MediaEval 2012 SED Opening

Ground Truth and Evaluation Measures

CrEve annotation tool: http://www.clusttour.gr/creve/
- For each of the 6 collections, review all photos and associate them to events (which have to be created)
- Search by text, geo-coordinates, date and user
- Review annotations made by others
- Use EventMedia and machine tags (upcoming:event=xxx)

Evaluation measures:
- F-score: harmonic mean of Precision and Recall
- Normalized Mutual Information (NMI): jointly considers the goodness of the photos retrieved and their correct assignment to different events
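The two measures can be sketched in a few lines of Python; this is a minimal version that assumes the geometric-mean normalization for NMI (the presentation does not specify which normalization variant was used):

```python
import math
from collections import Counter

def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def entropy(labels):
    """Shannon entropy (in nats) of a clustering given as a label list."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def nmi(labels_true, labels_pred):
    """Normalized Mutual Information between two clusterings of the same
    photos, normalized by the geometric mean of the two entropies."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    ct, cp = Counter(labels_true), Counter(labels_pred)
    mi = sum((c / n) * math.log(n * c / (ct[t] * cp[p]))
             for (t, p), c in joint.items())
    h_t, h_p = entropy(labels_true), entropy(labels_pred)
    if h_t == 0 or h_p == 0:
        return 0.0  # degenerate single-cluster case
    return mi / math.sqrt(h_t * h_p)
```

A predicted clustering identical to the ground truth up to a renaming of the event labels yields an NMI of 1.0, which is why NMI complements the F-score: it rewards correct grouping regardless of which event each cluster is called.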

Page 7: MediaEval 2012 SED Opening

What ideally should be found

- Challenge 1: 19 events, 2,234 photos (avg = 117); baseline precision (random): 0.01%
- Challenge 2: 79 events, 1,684 photos (avg = 21); baseline precision (random): 0.01%
- Challenge 3: 52 events, 3,992 photos (avg = 77); baseline precision (random): 0.02%

Page 8: MediaEval 2012 SED Opening

Who Has Participated?

21 teams registered (18 in 2011)

5 teams crossed the finish line (7 in 2011, with 2 overlapping)

One participant is missing at the workshop!

Page 9: MediaEval 2012 SED Opening

Quick Summary of Approaches

2011: all but 1 participant used background knowledge
- Last.fm (all), Fbleague (EURECOM), PlayerHistory (QMUL)
- DBpedia, Freebase, Geonames, WordNet

2012: all but 2 participants used a generic approach
- IR approach, query matching clusters (metadata, temporal, spatial): MISIMS
- Classification approach: topic detection with LDA, city classification with TF-IDF, event detection using peaks in the timeline of the query topics: AUTH-ISSEL
- Learning a model from the training data with an SVM: CERTH-ITI
- Background knowledge: QMUL, DISI

2012: none of the approaches is fully automatic
- Manual selection of some parameters (e.g. topics)
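The peak-based idea mentioned above can be illustrated with a crude sketch (this is not the actual AUTH-ISSEL pipeline, just the underlying intuition): days whose photo count spikes well above the average daily count are flagged as candidate event days.

```python
from collections import Counter

def timeline_peaks(photo_days, factor=2.0):
    """Flag days whose photo count exceeds `factor` times the mean
    daily count -- a crude stand-in for peak-based event detection
    on a photo-upload timeline (illustrative only)."""
    counts = Counter(photo_days)
    mean = sum(counts.values()) / len(counts)
    return sorted(day for day, c in counts.items() if c > factor * mean)
```

In a real pipeline the timeline would first be restricted to photos matching the query topic (e.g. via LDA or TF-IDF), so that a burst of uploads on one day actually corresponds to a candidate event for that topic.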

Page 10: MediaEval 2012 SED Opening

Results – Challenge 1 (Technical Events)


Run            Precision   Recall   F-score     NMI
AUTHISSEL_4        76.29     94.9     84.58  0.7238
CERTH_1            43.11    11.91     18.66  0.1877
DISI_1             86.23    59.13     70.15  0.6011
MISIMS_2            2.52     1.88      2.15  0.0236
QMUL_4              3.86    12.85      5.93  0.0475

[Bar chart: F-score per run, values as in the table]

Page 11: MediaEval 2012 SED Opening

Results – Challenge 2 (Soccer Events)


Run            Precision   Recall   F-score     NMI
AUTHISSEL_4        88.18    93.49     90.76  0.8499
CERTH_1            85.57    66.19     74.64  0.6745
DISI_1                 -        -         -       -
MISIMS_2           34.49    17.25     22.99  0.1993
QMUL_4             79.04    67.12     72.59  0.6493

[Bar chart: F-score per run, values as in the table]

Page 12: MediaEval 2012 SED Opening

Results – Challenge 3 (Indignados Events)


Run            Precision   Recall   F-score     NMI
AUTHISSEL_4        88.91    90.78     89.83  0.738
CERTH_1            86.24    54.61     66.87  0.4654
DISI_1             86.15    47.17     60.96  0.4465
MISIMS_2           48.3     46.87     47.58  0.3088
QMUL_4             22.88    33.48     27.19  0.1988

[Bar chart: F-score per run, values as in the table]

Page 13: MediaEval 2012 SED Opening

Conclusion

Lessons learned:
- Clear winner for all tasks: a generic approach, but with manual selection of the topics
- Use of background knowledge is still helpful, if well used

Looking at next year's SED:
- Shlomo Geva (Queensland University of Technology) + Philipp Cimiano (University of Bielefeld)
- Dataset: bigger, more diverse
- Media: photos and videos? (at least 10% videos?)
- Metadata: include some social network relationships, participation at events
- Evaluation measures: event granularity? Time/CPU?