Sacodeyl Birmingham 2007

Spoken multimedia corpora for pedagogical purposes Sabine Braun (University of Surrey) Pascual Pérez-Paredes (Universidad de Murcia) Ylva Berglund (Oxford University) Birmingham Corpus Linguistics Conference 2007


http://www.um.es/sacodeyl


Page 1: Sacodeyl Birmingham 2007

Spoken multimedia corpora for pedagogical purposes

Sabine Braun (University of Surrey)

Pascual Pérez-Paredes (Universidad de Murcia)

Ylva Berglund (Oxford University)

Birmingham Corpus Linguistics Conference 2007

Page 2: Sacodeyl Birmingham 2007

Introduction

• The usefulness of corpora in language pedagogy is widely recognised.

• But there is a need for pedagogically relevant corpora, reflected e.g. in initiatives to create 'ad-hoc' corpora in pedagogical contexts.

• The creation of pedagogically relevant corpora raises challenges for corpus design.

• Past and current initiatives have largely focussed on written corpora; spoken discourse is becoming more important in pedagogical contexts.

• The creation of pedagogically relevant spoken corpora raises additional challenges for corpus design.

Page 3: Sacodeyl Birmingham 2007

The challenges (1)

CORPUS DESIGN: Traditional reference corpora (content, size, data format, transcription, annotation, query)

CORPUS EXPLOITATION: Data-Driven Learning (focus on non-linear reading: concordances and co-texts)

• Corpora contain textual records of discourse; their interpretation requires (re-)contextualisation.

• Learners may have difficulties analysing corpus data; they require pedagogical mediation.

• Pedagogical corpus uses differ from linguistic description; this requires e.g. pedagogically motivated query options.

• Corpora need to be integrated with curricula; this requires e.g. complementarity of content and effective delivery.

Traditional corpus design and exploitation methods do not fully support pedagogical requirements.

Page 4: Sacodeyl Birmingham 2007

The challenges (2)

CORPUS DESIGN: Traditionally, representation in written format

CORPUS EXPLOITATION: Work with text-only data and e.g. conversational markup

• Spoken discourse is more dependent on shared physical contexts.

• It is adjusted to aural and online perception (e.g. chunking).

• It is affected by limitations of processing capacity (false starts, repair).

• It is marked by accents.

• It is multimodal.

Again, this does not fully support pedagogical requirements.

Page 5: Sacodeyl Birmingham 2007

Requirements

• Format: multimedia to retain multimodal character of spoken language

• Content: complementary with curriculum topics, more coherence than in traditional corpora

• Pedagogically motivated transcription, annotation and alignment (transcript-video)

• Combination of query methods: text-based exploration and application of corpus techniques

• Pedagogical enrichment of corpora with complementary resources (e.g. exercises, explanations)

• Effective delivery of corpora and additional resources to learners/teachers
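The transcript-video alignment named in the requirements can be pictured as a list of time-stamped segments. The following is only a minimal sketch under stated assumptions: the `Segment` fields, the `segment_at` helper and the sample utterances are invented for illustration and do not describe the format used by the ELISA or SACODEYL tools.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the video clip
    end: float    # seconds; the segment covers start <= t < end
    speaker: str
    text: str


def segment_at(segments, t):
    """Return the segment spoken at playback time t, or None during a pause.

    Assumes segments are sorted by start time and do not overlap.
    """
    starts = [s.start for s in segments]
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < segments[i].end:
        return segments[i]
    return None


# Invented sample data (hypothetical, not taken from any interview).
segments = [
    Segment(0.0, 3.5, "E", "Where did you spend your last holidays?"),
    Segment(3.5, 9.0, "S", "We went to the coast with my family."),
    Segment(10.0, 14.0, "E", "What are your plans for the next holidays?"),
]

current = segment_at(segments, 4.0)
print(current.speaker, current.text)
```

A lookup of this kind is what lets a player highlight the transcript line that matches the current video position, or jump to the video position for a selected line.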

Page 6: Sacodeyl Birmingham 2007

Corpus creation (1)

• Examples: ELISA and SACODEYL

ELISA
• Professional English
• Accounts of professional life
• Different varieties

SACODEYL
• 7 European languages
• Youth language corpora
• Speakers 13-15 and 16-18

• Interview format

• Video clips with transcript

• Communicatively relevant topics, e.g. in SACODEYL topics outlined in the Common European Framework

• Elicitation process: briefing informants and prompting them during the interview, ensuring naturally flowing discourse

Page 7: Sacodeyl Birmingham 2007

Corpus creation (1)

Topic | Interview question | Age | CEF | Grammatical functions
Holidays | 1. Where did you spend your last holidays? | 13-15, 16-18 | A2: can describe past activities, personal experiences | Past tense
Holidays | 2. What are your plans for the next holidays? | 13-15 | B1: can describe dreams, hopes and ambitions | Future, conditional, modal verbs
Plans for the future | 1. What are your plans for your career? | 16-18 | B1: can explain/give reasons for my plans, intentions and actions | Future
Plans for the future | 2. On what grounds do you decide? | 16-18 | B2: can speculate about causes, consequences, hypothetical situations | Conditional, modal verbs

Example of topics in SACODEYL

Page 8: Sacodeyl Birmingham 2007

Corpus creation (2)

Markup

Pedagogic annotation

XML files

TEI-compliant corpora

Transcription

CONTINUUM: RAW, ORTHOGRAPHIC TRANSCRIPTION – ANNOTATED CORPORA
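To make the step from transcription to pedagogically annotated XML more concrete, here is a rough sketch of how an annotated interview section might be read back and grouped by topic. The `<text>`, `<div>` and `<u who="...">` elements exist in TEI, but the `topic` and `cef` attributes, the sample utterances and the `sections_by_topic` helper are assumptions made for this illustration; they are not the project's actual annotation scheme.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical annotated excerpt; attribute names are illustrative only.
SAMPLE = """
<text>
  <div type="section" topic="holidays" cef="A2">
    <u who="E">Where did you spend your last holidays?</u>
    <u who="S">Last summer we went to the coast.</u>
  </div>
  <div type="section" topic="future-plans" cef="B1">
    <u who="E">What are your plans for your career?</u>
    <u who="S">I would like to study medicine.</u>
  </div>
</text>
"""


def sections_by_topic(xml_string):
    """Group interview sections by their (assumed) topic attribute."""
    root = ET.fromstring(xml_string)
    index = defaultdict(list)
    for div in root.iter("div"):
        # Each section becomes a list of (speaker, utterance) pairs.
        utterances = [(u.get("who"), (u.text or "").strip())
                      for u in div.iter("u")]
        index[div.get("topic")].append(utterances)
    return dict(index)


for topic, sections in sections_by_topic(SAMPLE).items():
    print(topic, len(sections))
```

Keeping the pedagogic categories as attributes on standard elements is one way an annotation layer can sit on top of a plain orthographic transcription without changing the transcribed text itself.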

Page 9: Sacodeyl Birmingham 2007

Corpus creation (2)

SACODEYL TRANSCRIPTOR

SACODEYL ANNOTATOR

Markup

Pedagogic annotation

XML files

TEI-compliant corpora

Transcription

Page 10: Sacodeyl Birmingham 2007

Corpus creation (3)

SACODEYL TRANSCRIPTOR

Page 11: Sacodeyl Birmingham 2007

Corpus creation (2)

[METADATA]
Title: La Unión Europea une a los ciudadanos
Date Recording: 2006-11-05
Date Transcription: 2007-02-02
Locale: I.E.S. Floridablanca, Murcia, España
Principal Investigator: Pascual Perez-Paredes
Researcher: Pascual Perez-Paredes
Transcriber: Encarnación Tornero Valero
Editor:
Autority: SACODEYL Project
ID:
Language: ES
MediaFileName: ES02.avi
Participants:
person: Chico
name:
role: Entrevistado
sex: Hombre
age: 16
description:
person: E
name: Andrés Mercader Rodríguez
role: Entrevistador
sex: Hombre
age: 32
description:
[/METADATA]
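A metadata header of this kind is simple to read by program. The sketch below assumes the block is laid out with one "key: value" field per line and that each "person:" line starts a new participant; the `parse_metadata` function is a hypothetical helper, not part of the SACODEYL Transcriptor, and the sample keeps only a few of the fields shown on the slide.

```python
def parse_metadata(block):
    """Split a [METADATA] block into header fields and participants."""
    header, participants = {}, []
    current = None  # the participant currently being filled in, if any
    for raw in block.strip().splitlines():
        line = raw.strip()
        if not line or line in ("[METADATA]", "[/METADATA]"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Participants":
            continue
        if key == "person":
            current = {"person": value}
            participants.append(current)
        elif current is not None:
            current[key] = value
        else:
            header[key] = value
    return header, participants


# Abridged copy of the slide's example, one field per line (assumed layout).
SAMPLE = """
[METADATA]
Title: La Unión Europea une a los ciudadanos
Date Recording: 2006-11-05
Language: ES
MediaFileName: ES02.avi
Participants:
person: Chico
role: Entrevistado
age: 16
person: E
role: Entrevistador
age: 32
[/METADATA]
"""

header, participants = parse_metadata(SAMPLE)
print(header["Language"], len(participants))
```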

Page 18: Sacodeyl Birmingham 2007

Corpus query

• Query options will support text- and corpus-based exploration and include e.g.
– Easy access to entire interviews
– A topic index supporting the analysis of similar sections across interviews ("topic concordances")
– Other indices based on the annotation categories
– Ready-made data (e.g. frequency lists of each interview; selective concordances)
– A concordancer for extended/advanced search; adapted to pedagogical requirements
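Two of the query options listed here, frequency lists and concordances, rest on very simple techniques. The following is a minimal sketch, assuming a plain lowercase word tokenizer and an invented sample text; `tokenize`, `frequency_list` and `kwic` are illustrative names and say nothing about how the project's own concordancer is implemented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; keeps simple contractions such as "don't"."""
    return re.findall(r"\w+(?:'\w+)?", text.lower())


def frequency_list(text, n=10):
    """Return the n most frequent word forms with their counts."""
    return Counter(tokenize(text)).most_common(n)


def kwic(text, keyword, width=3):
    """Key word in context: (left context, keyword, right context) tuples."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text (hypothetical, not from the corpora).
SAMPLE = ("Last summer we went to the coast. We stayed with my family "
          "and we went to the beach every day.")

print(frequency_list(SAMPLE, 3))
for left, word, right in kwic(SAMPLE, "went"):
    print(f"{left:>20} [{word}] {right}")
```

A pedagogically adapted concordancer would add to this, for example by restricting hits to sections with a given topic or by linking each line back to its interview and video clip.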

Page 19: Sacodeyl Birmingham 2007

Corpus query

Page 20: Sacodeyl Birmingham 2007

Pedagogical enrichment

• The corpora will be enriched with prototypical learning activities.

• These will focus on one interview section, on one interview as a whole, or on sections across interviews.

• They will include e.g.
– linguistic and cultural explanations and exercises (form-focussed as well as communication-oriented),
– (listening) comprehension and production tasks,
– explorative tasks (concordance-based as well as interview-based).

• Use of authoring tool Telos Language Partner to create learning packages with ranges of activities.

Page 21: Sacodeyl Birmingham 2007

Pedagogical enrichment

Page 22: Sacodeyl Birmingham 2007

Pedagogical enrichment

Page 23: Sacodeyl Birmingham 2007

Pedagogical enrichment

Page 24: Sacodeyl Birmingham 2007

Pedagogical enrichment

Page 25: Sacodeyl Birmingham 2007

Corpus delivery

• Effective delivery as a further prerequisite for integration into curriculum

• In SACODEYL, use of Moodle learning platform, giving access to:
– Corpora (query interfaces)
– Resources created in the project (different types of learning activities)
– Resources created by future corpus users

Page 26: Sacodeyl Birmingham 2007

Summary

• Method outlined is transferable to other pedagogical contexts, topics, languages

• Method helps to use corpora more efficiently in pedagogical contexts – from sporadically used resource to systematic exploitation

• Corpus creation complies with standards to facilitate reuse of corpora for other contexts (research)

Page 27: Sacodeyl Birmingham 2007

Contact

Sabine Braun: [email protected]

Pascual Pérez-Paredes: [email protected]

Ylva Berglund: [email protected]

And visit our poster session…

As well as our websites:

www.um.es/sacodeyl

www.corpora4learning.net/elisa