Calico 2014 intelligent call - def

CALICO 2014 Seven ways to make CALL more intelligent. Towards the effective integration of NLP techniques Piet Desmet in collaboration with Frederik Cornillie, Ruben Lagatie, Sien Moens, Maribel Montero Perez, Hans Paulussen & Serge Verlinde

Transcript of Calico 2014 intelligent call - def

Page 1: Calico 2014   intelligent call - def

CALICO 2014

Seven ways to make CALL more intelligent. Towards the effective integration of NLP techniques

Piet Desmet

in collaboration with

Frederik Cornillie, Ruben Lagatie, Sien Moens, Maribel Montero Perez, Hans Paulussen & Serge Verlinde

Page 2: Calico 2014   intelligent call - def

0. Introduction

0.1. Terminology

Parser-based CALL (Holland et al. 1993)

NLP-enhanced CALL (Nerbonne 2005)

Intelligent CALL or iCALL

“ICALL – Intelligent CALL – is a field within CALL which applies concepts, techniques, algorithms and technologies from AI to CALL (…). Most relevant to CALL is research in four branches of AI: (1) natural language processing, (2) user modelling, (3) expert systems and (4) intelligent tutoring systems”. (Schulze 2010: 65)

-> progressively larger scope

Page 3: Calico 2014   intelligent call - def

0.2. Challenges and opportunities

Most CALL environments do not use NLP -> use of NLP in classrooms is hardly mainstream practice

“The development of systems using NLP technology is not on the agenda of most CALL experts, and interdisciplinary research projects integrating computational linguists and foreign language teachers remain very rare” (Amaral & Meurers 2011: 6).

cf. also the report by the Dutch Language Union: Onderzoek taal- en spraaktechnologie en onderwijs (Van den Heuvel, T’Sas & Verberne 2012)

Technological concerns

Pedagogical concerns

Page 4: Calico 2014   intelligent call - def

Technological concerns

Can NLP account for the full complexity of natural human languages?

“I am pessimistic about the possibility of ICALL” (Nyns 1989: 46)

+ in educational settings, no room for erroneous analyses -> need for nearly error-free applications

But: the accuracy of NLP tools is too limited -> risk of mislearning of the L2

http://www.flickr.com/photos/maphutha/1303854829/

Page 5: Calico 2014   intelligent call - def

“It is therefore of the utmost importance to warn the users of the limitations of the tools. Ideally, inadequate outputs provided by the NLP tools should make the learners reflect even more on the language they are learning and on its structure. Therefore, even NLP tool errors should help the learners master the target language because they require more thinking on their part. Thus, NLP tools can be useful for CALL and usefully used in CALL. The present, imperfect state of technology should not be a hindrance, although there is obviously room for improvement. However, improvements are often triggered by remarks and studies on how tools effectively work in context. Thus, if one waits to see improvements before using NLP tools in CALL software, one might have to wait for a long time, while using the tools in their present state will encourage improvements to be made according to the needs of the language learners” (Vandeventer 2003 – Linguistik online 17 – Learning and Teaching (in) Computational Linguistics)

-> users are intelligent and undemanding (BUT: for all language levels?)

+ ICALL stimulates reflective practice

-> Perhaps the best is yet to come, but let’s choose not to wait…

Page 6: Calico 2014   intelligent call - def

The Penrose impossible stairs

Computing power doubles every 2 years

Technological progress vs Pedagogical progress!

Pedagogical concerns

Page 7: Calico 2014   intelligent call - def

Pedagogical concerns

Initial NLP-enhanced CALL is mainly form-focused <-> also ‘focus on meaning’ & meaning-based activities in foreign language learning and teaching (FLLT)

ICALL allows for the use of authentic materials and skills-oriented activities (cf. communicative approach)

BUT: appropriate task design is still needed in ICALL

“From a computational perspective, a well-defined task design with its clear set of relevant language constructions facilitates the restriction to a linguistic domain which is ‘manageable’ for a system’s natural language processing modules” (Schulze 2010: 79)

<-> In FLLT, focus on authentic language tasks with fully open & unpredictable interaction in real-life settings (cf. task-based language teaching) -> challenges for ICALL!

Page 8: Calico 2014   intelligent call - def

0.3. Towards a typology

(a) Linguistic perspective: type of language involved

Written vs spoken, native vs learner language, etc.

(b) Technological perspective: NLP & AI technologies involved

Parsing, NER, topic detection, sentiment mining, text summarization, etc.

(c) Pedagogical perspective: type of learning activities involved

Reception, production, interaction, mediation

(d) CALL perspective: type of technology-based learning or teaching activities

Page 9: Calico 2014   intelligent call - def

CALL perspective: type of technology-based learning or teaching activities

From receptive to productive activities with focus on written language input & output

1. Input provider

2. Reading companion

3. Exercise and test generator

4. Error detector, feedback generator and automatic scoring tool

5. Writing aid

6. Adaptive item sequencer

7. Resource generator

-> Conceptual outline + applications from academic R&D

Page 10: Calico 2014   intelligent call - def

1. Input provider

What? (Semi-)automated selection of comprehensible & authentic text material

How?

a. analysis of readability and formal complexity + syntactic and lexical text simplification/elaboration

b. analysis of meaning or text categorization (e.g. subject categorization, topic detection, etc.)

Seven roles for ICALL applications

Page 11: Calico 2014   intelligent call - def

a. Text retrieval on the basis of readability evaluation of input

REAP-project (Carnegie Mellon) http://reap.cs.cmu.edu

“Reader-Specific Lexical Practice for Improved Reading Comprehension”: supports learners in searching for texts that are well-suited for providing vocabulary and reading practice.

+ analysis of formal complexity of input

SATO-project (François Daoust) http://www.ling.uqam.ca/sato

Système d’analyse de texte par ordinateur + SATO-Calibrage

Automated formal analysis of a corpus (existing or personal)
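
To make the idea of readability-based text selection concrete, here is a minimal sketch. It is not how REAP or SATO work: it simply assumes two crude surface indicators (average sentence length and average word length) and ranks candidate texts by their unweighted sum. The function names and the scoring rule are hypothetical.

```python
import re

def readability_features(text):
    """Return (average sentence length in words, average word length in letters)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[^\W\d_]+", text.lower())  # alphabetic tokens only
    avg_sentence_len = len(words) / max(len(sentences), 1)
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    return avg_sentence_len, avg_word_len

def rank_by_difficulty(texts):
    """Sort texts from easiest to hardest (assumed score: sum of both indicators)."""
    return sorted(texts, key=lambda t: sum(readability_features(t)))

easy = "The cat sat. The dog ran."
hard = ("Notwithstanding considerable methodological heterogeneity, "
        "longitudinal investigations consistently demonstrate improvement.")
ranked = rank_by_difficulty([hard, easy])
```

A realistic input provider would add lexical frequency lists and syntactic measures, and would match the score against a learner profile rather than just sorting.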

Page 12: Calico 2014   intelligent call - def

b. analysis of meaning & text categorization

© Sien Moens (2009)

Page 13: Calico 2014   intelligent call - def

2. Reading companion

What? Helping learners understand foreign-language input

How? Annotation layers, both formal and semantic

GLOSSER-project (John Nerbonne) http://www.let.rug.nl/glosser/Glosser

- lemmatization of the inflected forms

- dictionary entry (cf. Van Dale)

- examples of the word from corpora

iRead+ project (ITEC)

- lemmatization & POS-tagging

- named entity recognition (persons, organisations, locations)

Page 14: Calico 2014   intelligent call - def

GLOSSER-project

Lemmatization of inflected forms

Dictionary entry (Van Dale)

Examples from corpora
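
The three annotation layers listed above (lemma, dictionary entry, corpus examples) can be sketched as a simple lookup chain. This is only an illustration under strong assumptions: the lemma table, glosses and corpus below are invented toy data, and a real reading companion would rely on a morphological analyser and a licensed dictionary instead.

```python
import re

# Toy data, invented for illustration only.
LEMMAS = {"mange": "manger", "mangeons": "manger", "chevaux": "cheval"}
GLOSSES = {"manger": "to eat", "cheval": "horse"}
CORPUS = ["Nous mangeons à midi.", "Les chevaux dorment.", "Il mange une pomme."]

def tokens(sentence):
    """Lower-cased alphabetic tokens of a sentence."""
    return re.findall(r"[^\W\d_]+", sentence.lower())

def gloss(word):
    """Return lemma, dictionary gloss and corpus examples for a clicked word."""
    form = word.lower()
    lemma = LEMMAS.get(form, form)  # unknown forms fall back to themselves
    forms = {f for f, l in LEMMAS.items() if l == lemma} | {lemma}
    examples = [s for s in CORPUS if forms & set(tokens(s))]
    return {"lemma": lemma, "gloss": GLOSSES.get(lemma), "examples": examples}

entry = gloss("Chevaux")
```

Looking up any inflected form returns examples for all forms of the same lemma, which is the point of lemmatizing before consulting the dictionary and the corpus.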

Page 15: Calico 2014   intelligent call - def

Page 16: Calico 2014   intelligent call - def

Page 17: Calico 2014   intelligent call - def

Page 18: Calico 2014   intelligent call - def

3. Exercise and test generator

What? (Semi-)automated generation of exercise and test items

How? Based on the analysis of L1 text materials and/or on the analysis of learner errors

- morpho-syntactical activities:

iRead+ project (ITEC)

VIEW-project (Detmar Meurers) http://sifnos.sfs.uni-tuebingen.de/VIEW Visual Input Enhancement of the Web

- lexical activities: Alfalex-project (Serge Verlinde) http://ilt.kuleuven.be/publicaties/tools.php

- semantic activities: cf. Sien Moens - semantic frame labeling

Page 19: Calico 2014   intelligent call - def

iRead+ Exercise generator: Three step model

– Explore

1. Application highlights target items

2. Learner recognizes target items and clicks on target items

– Practice

• Fill gaps or multiple choice activities

• Feedback

– Play

• Target items, drill & practice

• Time constraint
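
The "Practice" step (fill-gap items generated from a text) can be sketched as follows. This is an assumption-laden illustration, not the iRead+ implementation: the set of target forms is supplied by hand here, whereas an actual generator would obtain it from POS tagging or lemmatization. All names are hypothetical.

```python
import re

def make_gap_items(text, targets):
    """Create one gap-fill item per occurrence of a target form."""
    items = []
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        for match in re.finditer(r"[^\W\d_]+", sentence):
            if match.group().lower() in targets:
                prompt = sentence[:match.start()] + "____" + sentence[match.end():]
                items.append({"prompt": prompt, "answer": match.group()})
    return items

def is_correct(item, response):
    """Case-insensitive exact comparison with the removed word."""
    return response.strip().lower() == item["answer"].lower()

items = make_gap_items("She went home. They were tired.", {"went", "were"})
```

The same items could be turned into multiple-choice questions by adding distractors, for example other inflected forms of the same verb.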

Page 20: Calico 2014   intelligent call - def

Page 21: Calico 2014   intelligent call - def

Alfalex (KU Leuven – Serge Verlinde)

Page 22: Calico 2014   intelligent call - def

Page 23: Calico 2014   intelligent call - def

Semantic frame labeling © Sien Moens (2009)

Page 24: Calico 2014   intelligent call - def

4. Error detector, feedback generator and automatic scoring tool

What? Automated analysis of learner output and generation of feedback

Not limited to closed exercises (MC, fill-in-the-blank, etc.) but also (semi-)open practice tasks (translation, correction, rephrasing, etc.)

How?

1) Approximate (or fuzzy) string matching (ASM) -> anticipate all potential well-formed and ill-formed learner responses (with inclusion of regular expressions)

2) Parser-based (malrules, constraint relaxation, etc.) and/or statistical and machine learning methods
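
Method 1 can be illustrated with a small sketch that first checks a response against anticipated patterns (regular expressions paired with feedback) and then falls back on fuzzy matching against a model answer. The patterns, feedback messages, model answer and the 0.8 threshold are all invented for the example; the similarity measure is the ratio from Python's standard `difflib`, chosen here for convenience rather than taken from any of the systems discussed.

```python
import re
from difflib import SequenceMatcher

# Hypothetical anticipated responses for one item, each with its feedback.
PATTERNS = [
    (re.compile(r"she went home\.?", re.IGNORECASE), "Correct."),
    (re.compile(r"she goed home\.?", re.IGNORECASE),
     "'go' is irregular: the past tense is 'went'."),
]
MODEL_ANSWER = "She went home."

def feedback(response, threshold=0.8):
    """Return feedback for a learner response to this single item."""
    text = response.strip()
    for pattern, message in PATTERNS:
        if pattern.fullmatch(text):  # anticipated well- or ill-formed answer
            return message
    similarity = SequenceMatcher(None, text.lower(), MODEL_ANSWER.lower()).ratio()
    if similarity >= threshold:      # close to the model answer, e.g. a typo
        return "Almost: check your spelling against the expected answer."
    return "This answer was not anticipated."
```

The ill-formed pattern plays the role of an anticipated error with targeted feedback; everything else is handled by the generic similarity fallback.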

Page 25: Calico 2014   intelligent call - def

[Figure: number of publications per year (y-axis: Number of publications, 0 to 20; x-axis: Year, 1978 to 2012) for two series: Parser-based ICALL and Statistical ICALL]

Looking at the number of publications shown in this figure, parser-based CALL dominated research until 2005, but research interest seems to have shifted since then, and statistical error detection systems have taken the upper hand.

© PhD Ruben Lagatie

Page 26: Calico 2014   intelligent call - def

BUT: fully open-ended learner output is hard to manage -> constrain learner output (reduce possible answers) by reducing the search space of the processing tools

How?

° task: structured towards a specific type of output

- require the learner to use certain language material

e-tutor-project (Trude Heift) http://www.e-tutor.sfu.ca (translation)

Tagarela-project (Detmar Meurers) http://sifnos.sfs.uni-tuebingen.de/tagarela (answer should include certain words)

- the task design offers clues towards a limited set of possible answers

Dialog Dungeon (ITEC) (gamified written dialog tasks)

° correction: focus on a specific set of problems (e.g. past tenses, subject-verb agreement, etc.)

Page 27: Calico 2014   intelligent call - def

E-tutor project

Page 28: Calico 2014   intelligent call - def

Tagarela-project

Page 29: Calico 2014   intelligent call - def

‘social’ log-in

Dialog Dungeon gamified written dialogue tasks

@Frederik Cornillie

semi-open written exercise

learning support: link to responses of peers

Page 30: Calico 2014   intelligent call - def

learning support: responses of peers

Page 31: Calico 2014   intelligent call - def

focused exercise on tenses

Page 32: Calico 2014   intelligent call - def

‘typical’ error

Page 33: Calico 2014   intelligent call - def

contrastive analysis: learner response vs closest correct match (based on approximate string matching, POS tagging, lemmatization)
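
A contrastive analysis of this kind can be approximated with a token-level diff between the learner response and the closest correct answer. The sketch below is a simplification: it compares surface tokens only, using `difflib.SequenceMatcher` from the Python standard library, and leaves out the POS tagging and lemmatization mentioned on the slide. The function name is hypothetical.

```python
from difflib import SequenceMatcher

def contrast(response, target):
    """List (operation, learner tokens, target tokens) for every difference."""
    r, t = response.split(), target.split()
    differences = []
    for op, i1, i2, j1, j2 in SequenceMatcher(None, r, t).get_opcodes():
        if op != "equal":  # op is 'replace', 'delete' or 'insert'
            differences.append((op, " ".join(r[i1:i2]), " ".join(t[j1:j2])))
    return differences

diff = contrast("Yesterday she go home", "Yesterday she went home")
```

Each difference could then be mapped to a hint; with POS tags available, a replaced verb form could trigger a prompt about tense rather than simply revealing the correct word.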

Page 34: Calico 2014   intelligent call - def

supportive prompt with metalinguistic hints (based on POS tagging)

Page 35: Calico 2014   intelligent call - def

5. Writing aid

What? Support the learner in writing a functional, well-formed text

How? No automated correction, but suggestions that help learners improve their output themselves

Post-writing checks (or on-the-fly prompts)

Assistance on lower-order skills (spelling) but also on lexical (including MWE), lexico-grammatical and grammatical skills

Interactive Language Toolbox-project (Serge Verlinde) https://ilt.kuleuven.be/inlato

Bon patron, SpellCheckPlus & SpanishChecker (Nadaclair Language Technologies) http://bonpatron.com; http://spellcheckplus.com; http://spanishchecker.com

+ semantic and pragmatic analysis

Glosser-project (Univ. of Sydney) http://www.glosserproject.org/en/

Page 36: Calico 2014   intelligent call - def

Interactive Language Toolbox

Page 37: Calico 2014   intelligent call - def

Page 38: Calico 2014   intelligent call - def

Page 39: Calico 2014   intelligent call - def

Glosser project (Sydney)

Page 40: Calico 2014   intelligent call - def

6. Adaptive item sequencer

(a) Adaptivity: to what?

- difficulty level of the tasks

- learner profile (prior knowledge, motivation, cognitive load, interests & preferences)

- context (time and place, device, etc.)

(b) What to adapt?

- adaptive form representation (form in which content is presented to the learner, e.g. dynamically generated hypermedia pages)

- adaptive content representation (e.g. with or without learner support)

- adaptive curriculum sequencing (e.g. selection of items as a function of difficulty level, learner profile, etc.)

(c) How to implement adaptivity?

- full program control (via a reasoning component with ‘if… then…’ rules)

- full learner control

- shared control

Wauters, K., Desmet, P., Van Den Noortgate, W. (2010). Adaptive Item-Based Learning Environments Based on the Item Response Theory: Possibilities and Challenges. Journal of Computer Assisted Learning, 26 (6), 549-562.
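
As a rough illustration of program-controlled curriculum sequencing, the sketch below uses the one-parameter (Rasch) IRT model, in which the probability of a correct answer is 1 / (1 + exp(-(theta - b))) for learner ability theta and item difficulty b. The next item is the unseen one with the highest information p(1 - p), i.e. the one whose difficulty lies closest to the current ability estimate. The item bank, the ability value and all function names are hypothetical, and the cited article discusses far richer setups than this.

```python
import math

def p_correct(theta, b):
    """Rasch model: probability that a learner of ability theta solves an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item; it peaks where b equals theta."""
    p = p_correct(theta, b)
    return p * (1.0 - p)

def next_item(theta, difficulties, seen):
    """Pick the most informative unseen item, or None when the bank is exhausted."""
    candidates = {k: b for k, b in difficulties.items() if k not in seen}
    if not candidates:
        return None
    return max(candidates, key=lambda k: item_information(theta, candidates[k]))

bank = {"a": -2.0, "b": 0.1, "c": 1.5}  # invented item difficulties
first = next_item(0.0, bank, seen=set())
```

Under shared control, the same ranking could be shown to the learner as a recommendation instead of being enforced.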

Page 41: Calico 2014   intelligent call - def

What about AI?

(a) Adaptivity: to what?

- linguistic complexity -> language agent (expert system)

- item difficulty -> IRT

- learner profile (prior knowledge, motivation, cognitive load, interests & preferences) -> student agent

(b) What to adapt?

- adaptive curriculum sequencing (e.g. selection of items as a function of difficulty level, learner profile, etc.) -> tutor agent

(c) How to implement adaptivity?

- full or shared control -> tutor agent

Beuls, Katrien (2013) Processing, learning and tutoring of Spanish verb morphology. Brussels: VUB. (PhD)

Page 42: Calico 2014   intelligent call - def

7. Resource generator

What? Creating reference materials:

- concordancing on bilingual corpora

REBECA-project (ITEC) Ressources électroniques bilingues extraites de corpus alignés

- more advanced search engines

DPC-project (ITEC) http://www.kuleuven-kulak.be/DPC/ Dutch Parallel Corpus

- corpus-enriched learner dictionaries & grammars

BLF-project (Serge Verlinde) http://ilt.kuleuven.be/blf/ Base Lexicale du Français

How? Annotation & exploitation of corpora

For whom? (Intermediate or) advanced learners with well-developed linguistic awareness
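
Concordancing on a sentence-aligned bilingual corpus boils down to searching one side and returning the aligned sentence from the other side. The sketch below assumes a tiny list of French-Dutch sentence pairs invented for the example and uses plain case-insensitive substring search; it does not reflect how REBECA or the DPC search engine are built.

```python
# Toy sentence-aligned corpus (French, Dutch), invented for illustration.
ALIGNED = [
    ("En réalité, il pleut.", "In werkelijkheid regent het."),
    ("Il fait beau.", "Het is mooi weer."),
    ("C'est en réalité très simple.", "Het is eigenlijk heel eenvoudig."),
]

def concordance(query, pairs):
    """Return every aligned pair whose source sentence contains the query."""
    q = query.lower()
    return [(src, tgt) for src, tgt in pairs if q in src.lower()]

hits = concordance("en réalité", ALIGNED)
```

Even on toy data, the output shows why such a resource suits advanced learners: the same source expression surfaces with different translation equivalents that have to be interpreted in context.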

Page 43: Calico 2014   intelligent call - def

REBECA-project

Page 44: Calico 2014   intelligent call - def

DPC-project

Page 45: Calico 2014   intelligent call - def

Example: “en réalité” French-Dutch

Page 46: Calico 2014   intelligent call - def

Example: “en réalité” French - Dutch

Page 47: Calico 2014   intelligent call - def

Extension: video search

Exploitation of XML-based captions (transcriptions and translations) to render video selection more versatile and attractive: lemmas are indexed with time tags, which serve as reference points to video selections

Page 48: Calico 2014   intelligent call - def

à 00:00:09.6800

a 00:00:24.8900|00:00:28.8400|00:00:59.9000|00:01:12.7000

alors 00:00:24.8900

assez 00:00:18.0300|00:00:31.9000

au 00:00:38.3000

aussi 00:00:38.3000

aux 00:00:09.6800|00:01:05.8100

bas 00:00:35.6500

beaucoup 00:00:24.8900|00:00:55.0000|00:00:59.9000

bonnes 00:00:31.9000

calais 00:00:51.1200

ce 00:01:05.8100

celle 00:00:42.0800

c’est 00:00:09.6800|00:00:24.8900

cette 00:00:06.6400|00:01:05.8100|00:01:12.7000

chefs 00:01:16.0000

chercher 00:00:18.0300|00:00:31.9000

chou 00:00:04.0000

choux 00:01:05.8100

club 00:01:16.0000

comme 00:00:06.6400

comment 00:00:18.0300

composer 00:00:42.0800

cuisine 00:00:04.0000|00:00:14.1800|00:01:12.7000

curiosité 00:00:38.3000

dans 00:00:18.0300|00:00:18.0300|00:00:24.8900|00:00:28.8400|00:00:31.9000|00:00:35.6500|00:00:47.2800|00:01:16.0000|00:01:19.9500

Viewing video extracts based on lexical search in transcription and/or translation (challenge: search on semantically related words)
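
The index lines shown above (a word followed by pipe-separated time tags) can be turned into a lookup table with a few lines of code. This is a sketch under the assumption that each entry sits on one line in exactly that format; the function names are hypothetical, and the two sample lines are copied from the listing.

```python
def parse_index(lines):
    """Map each word to its sorted, de-duplicated list of time tags."""
    index = {}
    for line in lines:
        word, _, stamps = line.strip().partition(" ")
        if stamps:
            # Fixed-width hh:mm:ss.ffff tags sort chronologically as strings.
            index[word] = sorted(set(stamps.strip("|").split("|")))
    return index

def to_seconds(stamp):
    """Convert 'hh:mm:ss.ffff' into seconds from the start of the video."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

def segments_for(word, index):
    """Start offsets (in seconds) of the caption segments containing the word."""
    return [to_seconds(s) for s in index.get(word, [])]

index = parse_index([
    "chou 00:00:04.0000",
    "cuisine 00:00:04.0000|00:00:14.1800|00:01:12.7000",
])
```

A player could seek to each returned offset in turn. Handling the challenge noted above (semantically related words) would require expanding the query before the lookup, for instance through a thesaurus.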

Page 49: Calico 2014   intelligent call - def

Conclusion…

Digital (learning) is the new normal / a must / trivial / mainstream

It’s not about technology. It’s about usage & added value!

Usage: ICALL should enter our classrooms + be integrated into applications

Added value: ICALL is a (potentially) powerful means to realise the main objective: improve learning

-> qualified optimism about NLP-enhanced CALL