MediaEval 2015 - PERCOLATTE: a multimodal person discovery system in TV broadcast for the Medieval...
PERCOLATTE
A multimodal person discovery system in TV broadcast for
the MediaEval 2015 evaluation campaign
Meriem Bendris, Delphine Charlet, Gregory Senay, MinYoung Kim, Benoit Favre, Mickael Rouvier, Frederic Bechet, Geraldine Damnati
Sept. 14-15, 2015, Wurzen, Germany
Task
Context
Percol, Percolator and Percolatte!
Task
Talking faces identification in TV broadcast
1. Search engine
2. No biometric systems
3. Identification evidence
4. Provided baseline modules
Percolatte approach
Scene analysis features and restricted names propagation
1. Scene analysis features
Anchor name detection
Document chaptering: shot classification (Studio/Report)
Speaker role classification (Anchor/Reporter/Other)
2. Restricted names propagation
Prior knowledge about broadcast news structure
Text processing
Anchor name detector
OCR (a) on the first 2 minutes
List of names: metadata from the INA website (2004-2009)
Soft mapping: Levenshtein distance on last names
Recall = 93%
(a) https://github.com/meriembendris/ADNVideo
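The soft mapping between OCR output and the list of names can be illustrated with a small sketch. It is not the authors' implementation: the distance threshold `max_distance`, the whitespace tokenization, the rule "last token = last name" and the sample journalist list are all assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def match_anchor(ocr_text, known_names, max_distance=2):
    """Return the known name whose last name is closest to an OCR token.

    Returns None when no last name is within `max_distance` edits
    (the threshold is a hypothetical value, not taken from the slides).
    """
    tokens = ocr_text.lower().split()
    best_name, best_distance = None, max_distance + 1
    for name in known_names:
        last_name = name.lower().split()[-1]  # assumption: last token is the last name
        for token in tokens:
            distance = levenshtein(token, last_name)
            if distance < best_distance:
                best_name, best_distance = name, distance
    return best_name


# Hypothetical journalist list and a noisy OCR string.
journalists = ["Alice Martin", "Jean Dupont"]
print(match_anchor("jean dupomt presente", journalists))
```

Matching on last names only keeps the comparison robust to OCR errors in first names or titles, at the cost of ambiguity when two journalists share a last name.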
Audio processing
Speaker clustering [Barras et al., 2006]
BIC clustering + GMMs/CLR
Speaker role classification [Damnati and Charlet, 2011]
Anchor: recurring speaker with the largest total speaking time
Reporter/Other: GMM classification
Corpus: 38 broadcast news shows from 7 channels (Oct. 2008-Jan. 2009), 14.5 hours, 1400 speakers
Train: 24 shows / test: 14 shows
EER = 15%
Speaker identification
Propagate names to the speaker turn that maximises temporal overlap and to its speaker cluster within the same chapter
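The speaker identification rule can be sketched as follows, under stated assumptions: the `Turn` structure, its field names and the `(name, start, end)` input format are hypothetical, and a name is never allowed to overwrite one that is already assigned.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Turn:
    start: float
    end: float
    cluster: str               # speaker-cluster label from diarization
    chapter: int               # chapter index from document chaptering
    name: Optional[str] = None


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def propagate(names, turns):
    """Assign each (name, start, end) interval to speaker turns.

    The name goes to the turn with the largest temporal overlap, then to
    every turn of the same speaker cluster inside the same chapter.
    """
    if not turns:
        return turns
    for name, start, end in names:
        best = max(turns, key=lambda t: overlap(start, end, t.start, t.end))
        if overlap(start, end, best.start, best.end) == 0.0:
            continue  # the name overlaps no speaker turn
        for turn in turns:
            same_group = (turn.cluster == best.cluster
                          and turn.chapter == best.chapter)
            if same_group and turn.name is None:
                turn.name = name
    return turns


# Hypothetical example: cluster S1 speaks in chapters 0 and 1.
turns = [Turn(0, 10, "S1", 0), Turn(10, 20, "S2", 0),
         Turn(20, 30, "S1", 0), Turn(30, 40, "S1", 1)]
propagate([("Jean Dupont", 2.0, 6.0)], turns)
print([t.name for t in turns])
```

In this example the last turn stays unnamed even though it belongs to cluster S1, because it lies in another chapter: the chapter restriction limits the damage of diarization errors.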
Visual processing
Shot boundaries
Colour histogram peaks on sliding window
Shot boundaries mapping:
July 1st: overlapping shots over 0.5s
July 8th: overlapping coverage above 50%
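The two mapping criteria can be compared on a toy example. This is only a sketch: shots are assumed to be `(start, end)` pairs in seconds, and the slides do not say which shot the 50% coverage is measured against, so the reference shot is assumed here.

```python
def overlap(a, b):
    """Length in seconds of the intersection of two (start, end) shots."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def map_by_duration(detected, reference, min_overlap=0.5):
    """Pair shots whose intersection lasts more than `min_overlap` seconds."""
    return [(i, j)
            for i, d in enumerate(detected)
            for j, r in enumerate(reference)
            if overlap(d, r) > min_overlap]


def map_by_coverage(detected, reference, min_coverage=0.5):
    """Pair shots when the intersection covers more than `min_coverage`
    of the reference shot (assumption: coverage is relative to the reference)."""
    pairs = []
    for i, d in enumerate(detected):
        for j, r in enumerate(reference):
            duration = r[1] - r[0]
            if duration > 0 and overlap(d, r) / duration > min_coverage:
                pairs.append((i, j))
    return pairs


# Hypothetical boundaries: the detector cuts 0.8 s earlier than the reference.
detected = [(0.0, 4.0), (4.0, 10.0)]
reference = [(0.0, 4.8), (4.8, 10.0)]
print(map_by_duration(detected, reference))
print(map_by_coverage(detected, reference))
```

With a fixed 0.5 s threshold, a small boundary shift is enough to link a detected shot to two reference shots, whereas the coverage criterion keeps only the dominant pairing.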
Shot similarities
Cosine-based distance on:
RGB histograms
HOG features on resized frames (128×64)
Image embeddings: feature vectors at the 3rd fully-connected layer of the AlexNet DNN [Krizhevsky et al., 2012] (1000-dimensional vectors)
Shot clustering
Integer Linear Program clustering [Rouvier and Meignier, 2012].
No face-related processing (detection/identification) is used in our approach
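For the shot similarities listed above, a cosine-based distance between two feature vectors can be written as below. The slides do not give the exact formula, so the common form 1 - cosine similarity is assumed, and the 4-bin histograms are made-up values.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical 4-bin colour histograms for three shots.
shot_a = [10, 20, 30, 40]
shot_b = [20, 40, 60, 80]   # same distribution, different scale
shot_c = [40, 30, 20, 10]   # reversed distribution
print(cosine_distance(shot_a, shot_b))
print(cosine_distance(shot_a, shot_c))
```

Because the measure ignores vector magnitude, the same function applies unchanged to RGB histograms, HOG descriptors and DNN embeddings.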
Document chaptering
Shot annotations
8 videos, 4914 shots
4 labels: Studio, Report, Mixed, Other
Shot classification
Train = 3688 shots / test = 1226 shots
Liblinear classifier
Accuracy = 99.43%
Chaptering
Successive shots having the same label
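The chaptering step, grouping successive shots that share a label, can be sketched in a few lines. The output format `(label, first_shot, last_shot)` is an assumption made for the example.

```python
from itertools import groupby


def chapters(shot_labels):
    """Merge runs of identical labels into (label, first_shot, last_shot)."""
    result = []
    index = 0
    for label, run in groupby(shot_labels):
        length = len(list(run))
        result.append((label, index, index + length - 1))
        index += length
    return result


labels = ["Studio", "Studio", "Report", "Report", "Report", "Studio"]
print(chapters(labels))
# [('Studio', 0, 1), ('Report', 2, 4), ('Studio', 5, 5)]
```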
Talking faces identification
Secondary strategy
Speaker identification + rule-based speaker-face mapping
Name propagation
The speaker is visible when:
Name appears on screen
On studio shots
On report shots when the role is not a reporter
Scores
No scoring function was developed (score = 1)
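The rule-based speaker-face mapping of the secondary strategy can be expressed as a single predicate. This is a sketch: the label strings follow the shot and role classes named in the slides, while the function name and the handling of other shot labels (treated as "not visible") are assumptions.

```python
def speaker_is_visible(name_on_screen: bool, shot_label: str, role: str) -> bool:
    """Decide whether the current speaker is assumed to be on screen."""
    if name_on_screen:
        return True                      # the name appears on screen
    if shot_label == "Studio":
        return True                      # studio shots
    if shot_label == "Report" and role != "Reporter":
        return True                      # report shots, speaker is not a reporter
    return False                         # assumption: all other cases


SCORE = 1  # no scoring function: every hypothesis gets the same score

print(speaker_is_visible(False, "Report", "Reporter"))  # voice-over reporter
print(speaker_is_visible(False, "Report", "Other"))     # interviewee
```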
Primary strategy
Shot clusterings + chapter-restricted propagation
Name propagation
Direct propagation: names to overlapping shots
Within a chapter, to shot-clusters sharing the speaker-cluster
Anchor name:
Propagate anchor names to overlapping studio shots and their shot clusters
Propagate anchor names if the speaker role is anchor
Scores
Initialize with OCR scores, then increase incrementally according to the origin of the name:
Direct propagation: OCR shot overlapping
Talking-face score > 0.8
Name pronounced around the shot (± 5s)
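The incremental scoring can be sketched as below. The slides give the three conditions but not the size of each increment, so the `bonus` value of 0.1 is purely hypothetical, as are the function and parameter names.

```python
def hypothesis_score(ocr_score, direct, talking_face_score,
                     spoken_times, shot_start, shot_end, bonus=0.1):
    """Start from the OCR score and add `bonus` for each supporting cue.

    `bonus` is a hypothetical increment; the actual values are not stated.
    """
    score = ocr_score
    if direct:                                   # OCR name overlaps the shot
        score += bonus
    if talking_face_score > 0.8:                 # talking-face score threshold
        score += bonus
    window_start, window_end = shot_start - 5.0, shot_end + 5.0
    if any(window_start <= t <= window_end for t in spoken_times):
        score += bonus                           # name pronounced within +/- 5 s
    return score


# Hypothetical shot from 15 s to 20 s, name pronounced at 12 s.
print(hypothesis_score(0.5, True, 0.9, [12.0], 15.0, 20.0))
```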
Submissions
Systems
Primary: primary strategy with DNN- and HOG-based shot clustering
Primary DNNOnly: primary strategy with DNN-based shot clustering
Primary RGBOnly: primary strategy with RGB-based shot clustering
Secondary: secondary strategy based on speaker identification and speaker-face rule-based mapping
Evidence
For each name, select the provided OPN shot that maximizes the OCR result score
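Evidence selection amounts to an argmax per name. A minimal sketch, assuming the OCR results are available as `(name, shot_id, ocr_score)` tuples (a hypothetical input format; the shot identifiers below are invented):

```python
def select_evidence(ocr_hits):
    """Return {name: shot_id}, keeping for each name the shot with the
    highest OCR score (the first one wins in case of a tie)."""
    best = {}
    for name, shot_id, score in ocr_hits:
        if name not in best or score > best[name][1]:
            best[name] = (shot_id, score)
    return {name: shot_id for name, (shot_id, _) in best.items()}


hits = [("jean_dupont", "shot_12", 0.71),
        ("jean_dupont", "shot_40", 0.93),
        ("alice_martin", "shot_07", 0.88)]
print(select_evidence(hits))
```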
Results
Run                            EwMAP   MAP     C
Baseline                       78.35   78.64   92.71
Secondary on July 1st          85.89   86.12   97.6
Secondary on July 8th          86.40   86.61   97.68
Primary DNNOnly on July 1st    81.41   81.67   97.63
Primary DNNOnly on July 8th    87.75   88.01   97.63
Primary RGBOnly on July 1st    81.02   81.28   97.63
Primary RGBOnly on July 8th    87.33   87.60   97.63
Primary on July 1st deadline   81.70   81.96   97.63
Primary on July 8th            88.19   88.45   97.63
Table: Performance of PERCOLATTE 2015 runs
Run                                   EwMAP   MAP     C
Baseline                              78.35   78.64   92.71
Primary                               88.19   88.45   97.63
Primary DNNOnly                       87.75   88.01   97.63
Primary HOGOnly                       88.04   88.30   97.63
Primary RGBOnly                       87.33   87.60   97.63
Primary without speaker restriction   88.49   88.75   97.63
Primary without anchor process        88.05   88.31   97.39
Table: Performance of PERCOLATTE 2015 runs
Conclusion
Talking faces identification
Without face-related processing
Easy-to-establish features:
Shot classification
Speaker role classification
Minor prior knowledge about broadcast news:
Chapter restriction
List of journalists
About +10 points of MAP compared to the Baseline (88.45 vs. 78.64)
TV programs = ambiguous context but regular structure
References
Barras, C., Zhu, X., Meignier, S., and Gauvain, J.-L. (2006). Multi-stage speaker diarization of broadcast news. IEEE Transactions on Audio, Speech and Language Processing.
Damnati, G. and Charlet, D. (2011). Multi-view approach for speaker turn role labeling in TV broadcast news shows. In INTERSPEECH, pages 1285–1288. ISCA.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Pereira, F., Burges, C., Bottou, L., and Weinberger, K., editors, Advances in Neural Information Processing Systems 25, pages 1097–1105. Curran Associates, Inc.
Rouvier, M. and Meignier, S. (2012). A global optimization framework for speaker diarization. In Speaker Odyssey.