CLEF 2008 Workshop, Aarhus, September 17, 2008
ELDA
Overview of QAST 2008
- Question Answering on Speech Transcriptions -
J. Turmo, P. Comas (1), L. Lamel, S. Rosset (2), N. Moreau, D. Mostefa (3)
(1) UPC, Spain (2) LIMSI, France (3) ELDA, France
QAST website: http://www.lsi.upc.edu/~qast/
Outline
1. Objectives
2. Description of the tasks
3. Participants
4. Results
5. Future work
Objectives of QAST 2008
- Development of robust QA for speech transcripts
- Measure loss due to ASR inaccuracies: manual transcriptions, automatic transcriptions
- Measure loss at different ASR word error rates
- Test with different kinds of speech: spontaneous speech, prepared speech
- Development of QA for languages other than English: English, French, Spanish
QAST 2008 Organization
Task jointly organized by:
- UPC, Spain (Coordinator): J. Turmo, P. Comas
- ELDA, France: N. Moreau, D. Mostefa
- LIMSI-CNRS, France: S. Rosset, L. Lamel
Evaluation Data
| Corpus | Lang. | Description | Tasks | WER |
|---|---|---|---|---|
| CHIL (QAST 2007) | English | Lectures (~25h) | T1(a): Manual transcriptions | - |
| | | | T1(b): ASR transcriptions | 20% |
| AMI (QAST 2007) | English | Meetings (~100h) | T2(a): Manual transcriptions | - |
| | | | T2(b): ASR transcriptions | 38% |
| ESTER | French | Broadcast News (~10h) | T3(a): Manual transcriptions | - |
| | | | T3(b): ASR transcriptions | 11.9% / 23.9% / 35.4% |
| EPPS-EN | English | European Parliament sessions (~3h) | T4(a): Manual transcriptions | - |
| | | | T4(b): ASR transcriptions | 10.6% / 14.0% / 24.1% |
| EPPS-ES | Spanish | European Parliament sessions (~3h) | T5(a): Manual transcriptions | - |
| | | | T5(b): ASR transcriptions | 11.5% / 12.7% / 13.7% |
Questions
| Task | Development set: data | Development set: # questions | Evaluation set: data | Evaluation set: # questions |
|---|---|---|---|---|
| T1 (CHIL, English) | 10 seminars | 50 | 15 seminars | 100 |
| T2 (AMI, English) | 50 meetings | 50 | 118 meetings | 100 |
| T3 (ESTER, French) | 6 shows | 50 | 12 shows | 100 |
| T4 (EPPS, English) | 3 sessions | 50 | 3 sessions | 100 |
| T5 (EPPS, Spanish) | 1 session | 50 | 5 sessions | 100 |

• Factual questions: ~75%
  Expected answers = named entities (10 types: person, location, organization, language, system, measure, time, color, shape, material)
• Definition questions: ~25%
  4 types of answers: person, organization, object, other
• ‘NIL’ questions: ~10%
Submissions
• Participants could submit up to:
  – 2 submissions per task (and per WER)
  – 5 answers per question
• Answers for ‘manual transcriptions’ tasks: Answer_string + Doc_ID
• Answers for ‘automatic transcriptions’ tasks: Answer_string + Doc_ID + Time_start + Time_end
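The answer formats above can be pictured as a small record type. This is only an illustrative sketch: the field names mirror the slide (Answer_string, Doc_ID, Time_start, Time_end), but the Python types, the use of seconds for time stamps, and the `validate_answers` helper are assumptions made here, not the official QAST submission format.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_ANSWERS_PER_QUESTION = 5  # limit stated on the slide


@dataclass(frozen=True)
class Answer:
    """One candidate answer; field names follow the slide, types are assumed."""
    answer_string: str
    doc_id: str
    time_start: Optional[float] = None  # assumed to be seconds; ASR tasks only
    time_end: Optional[float] = None    # assumed to be seconds; ASR tasks only


def validate_answers(answers: List[Answer], asr_task: bool) -> List[str]:
    """Return a list of problems found (empty if the answers look well-formed)."""
    problems = []
    if len(answers) > MAX_ANSWERS_PER_QUESTION:
        problems.append(f"more than {MAX_ANSWERS_PER_QUESTION} answers")
    for i, a in enumerate(answers, start=1):
        has_times = a.time_start is not None and a.time_end is not None
        if asr_task and not has_times:
            problems.append(f"answer {i}: missing Time_start/Time_end")
        if has_times and a.time_end < a.time_start:
            problems.append(f"answer {i}: Time_end precedes Time_start")
    return problems
```

For a manual-transcription task, `validate_answers([Answer("Barcelona", "doc1")], asr_task=False)` returns an empty list, while the same answer checked with `asr_task=True` is flagged for its missing time stamps.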
Assessments
• Four possible judgments (as in QA@CLEF): Correct / Incorrect / Inexact / Unsupported
• ‘Manual transcriptions’ tasks: manual assessment with the QASTLE interface
• ‘Automatic transcriptions’ tasks: automatic assessment (script) + manual check
• 2 metrics:
  – Mean Reciprocal Rank (MRR): measures how well right answers are ranked on average
  – Accuracy: fraction of questions with a correct answer ranked in the first position
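The two metrics can be sketched in a few lines of Python. This is a minimal illustration, not the organizers' scoring script: it assumes each question is represented by a ranked list of booleans (True only for answers judged Correct), and the function names and the sample judgments are hypothetical. How Inexact and Unsupported answers or NIL questions are scored is not covered here.

```python
from typing import Sequence


def reciprocal_rank(judgments: Sequence[bool]) -> float:
    """Return 1/rank of the first correct answer, or 0.0 if none is correct."""
    for rank, is_correct in enumerate(judgments, start=1):
        if is_correct:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(questions: Sequence[Sequence[bool]]) -> float:
    """Average the reciprocal rank over all questions."""
    if not questions:
        return 0.0
    return sum(reciprocal_rank(j) for j in questions) / len(questions)


def top1_accuracy(questions: Sequence[Sequence[bool]]) -> float:
    """Fraction of questions whose first-ranked answer is correct."""
    if not questions:
        return 0.0
    return sum(1 for j in questions if j and j[0]) / len(questions)


# Hypothetical judgments for four questions (ranked answers, best first).
example = [
    [True, False, False],         # correct at rank 1 -> RR = 1
    [False, True],                # correct at rank 2 -> RR = 1/2
    [False, False, False, True],  # correct at rank 4 -> RR = 1/4
    [],                           # no answer returned -> RR = 0
]

print(mean_reciprocal_rank(example))  # (1 + 0.5 + 0.25 + 0) / 4 = 0.4375
print(top1_accuracy(example))         # 1 of 4 questions -> 0.25
```

With at most 5 answers per question, the reciprocal rank of a question is one of 1, 1/2, 1/3, 1/4, 1/5 or 0, so MRR is always at least as high as accuracy.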
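Participants
49 submissions from 5 participants:

| Participant | T1a | T1b | T2a | T2b | T3a | T3b | T4a | T4b | T5a | T5b |
|---|---|---|---|---|---|---|---|---|---|---|
| Univ. Chemnitz (CUT) | 2 | - | - | - | - | - | 2 | - | - | - |
| INAOE | - | - | - | - | - | - | 1 | 2 | - | - |
| LIMSI | 1 | 1 | 1 | 1 | 2 | 3 | 1 | 3 | 2 | 3 |
| Univ. Alicante (UA) | - | - | - | - | - | - | 1 | 3 | - | - |
| UPC | 1 | 2 | 1 | 2 | - | - | 1 | 6 | 1 | 6 |
| TOTAL | 4 | 3 | 2 | 3 | 2 | 3 | 6 | 14 | 3 | 9 |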
Best results for manual transcriptions
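| Task | Factual MRR | Factual Acc (%) | Definitional MRR | Definitional Acc (%) | All MRR | All Acc (%) |
|---|---|---|---|---|---|---|
| T1a | 0.53 | 47.4 | 0.18 | 18.2 | 0.45 | 41.0 |
| T2a | 0.47 | 37.8 | 0.22 | 19.2 | 0.40 | 33.0 |
| T3a | 0.50 | 45.3 | 0.47 | 44.0 | 0.49 | 45.0 |
| T4a | 0.44 | 40.0 | 0.16 | 16.0 | 0.37 | 34.0 |
| T5a | 0.32 | 29.3 | 0.44 | 36.0 | 0.35 | 31.0 |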
Best results for ASR transcriptions
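| Task | WER | All MRR | All Acc (%) | All (manual) MRR | All (manual) Acc (%) |
|---|---|---|---|---|---|
| T1b | 20.0% | 0.34 | 31.0 | 0.45 | 41.0 |
| T2b | 38.0% | 0.20 | 18.0 | 0.40 | 33.0 |
| T3b | 11.9% | 0.45 | 41.0 | 0.49 | 45.0 |
| | 23.9% | 0.30 | 25.0 | | |
| | 35.4% | 0.24 | 21.0 | | |
| T4b | 10.6% | 0.33 | 30.0 | 0.37 | 34.0 |
| | 14.0% | 0.24 | 20.0 | | |
| | 24.1% | 0.23 | 19.0 | | |
| T5b | 11.5% | 0.26 | 24.0 | 0.35 | 31.0 |
| | 12.7% | 0.23 | 20.0 | | |
| | 13.7% | 0.25 | 23.0 | | |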
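The gap between manual and ASR results can be made explicit with a little arithmetic. The sketch below uses the best "All" accuracies reported for manual and ASR transcriptions; the pairing of each ASR row with its task and WER follows the order of the slide, and the `accuracy_loss` helper is a hypothetical name introduced here. Since the best manual and best ASR runs may come from different systems, the figures indicate a trend rather than the loss of any single system.

```python
# Best "All" accuracy (%) on manual transcriptions, per task.
MANUAL_ACC = {"T1": 41.0, "T2": 33.0, "T3": 45.0, "T4": 34.0, "T5": 31.0}

# (task, WER %, best "All" accuracy % on ASR transcriptions).
ASR_ACC = [
    ("T1", 20.0, 31.0),
    ("T2", 38.0, 18.0),
    ("T3", 11.9, 41.0), ("T3", 23.9, 25.0), ("T3", 35.4, 21.0),
    ("T4", 10.6, 30.0), ("T4", 14.0, 20.0), ("T4", 24.1, 19.0),
    ("T5", 11.5, 24.0), ("T5", 12.7, 20.0), ("T5", 13.7, 23.0),
]


def accuracy_loss(task: str, asr_acc: float) -> tuple:
    """Return (absolute loss in points, relative loss as a fraction)."""
    manual = MANUAL_ACC[task]
    absolute = manual - asr_acc
    return absolute, absolute / manual


for task, wer, acc in ASR_ACC:
    absolute, relative = accuracy_loss(task, acc)
    print(f"{task}b  WER {wer:4.1f}%  loss {absolute:4.1f} points "
          f"({relative:.0%} relative)")
```

For example, T1 drops from 41.0% to 31.0% at 20% WER, a loss of 10 points, or about 24% relative.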
Conclusion
• 5 participants (as in 2007)
• 4 different countries (vs. 5 in 2007): Germany, Spain, France, Mexico
• 49 submitted runs (vs. 28 runs in 2007)
• Loss in accuracy with ASR transcribed speech (performance falls when WER rises)
• QAST 2009: Written & Oral Questions...