NLP-Experimente: Überblick und Workflow

HS Experimentelles Arbeiten in der Sprachverarbeitung
Nils Reiter
4 November 2021


Experimental Structure

Experiments

[Diagram: Corpus → Manual Annotation → Gold Standard; Corpus → Program/Automatization → System output; Gold Standard and System output → Comparison/Evaluation. Program v2 and Program v3 each produce their own System output.]

           P      R      F
Program    5.7    9.2    …
v2         9.9    16.7   …
v3         15.3   21.8   …

Reiter NLP-Experimente: Überblick und Workflow 4. November 2021 3 / 32


Two papers

▶ What were the hypotheses?
▶ Could they be confirmed?


Experiments

▶ Reproducibility
▶ Hypotheses about the operationalisation of language/text phenomena

Examples
▶ The position within a sentence is indicative of the part of speech
▶ The meaning of a word depends on its context
▶ The protagonist of a play is the character who talks the most
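The third example hypothesis can be operationalised as a very small rule-based program. The following is a minimal sketch, assuming a play is given as a list of (speaker, utterance) pairs and that tokens can be approximated by whitespace splitting; the function name `predict_protagonist` and the toy data are hypothetical and not taken from the lecture.

```python
from collections import Counter


def predict_protagonist(utterances):
    """Return the speaker with the most spoken tokens.

    `utterances` is a list of (speaker, text) pairs. Tokens are
    approximated by whitespace splitting (an assumption).
    """
    token_counts = Counter()
    for speaker, text in utterances:
        token_counts[speaker] += len(text.split())
    # most_common(1) returns a one-element list [(speaker, count)]
    return token_counts.most_common(1)[0][0]


# Toy data, invented for illustration only
play = [
    ("A", "I have a great deal to say about this matter"),
    ("B", "Indeed"),
    ("A", "And I am not finished yet"),
]
predicted = predict_protagonist(play)
print(predicted)
```

Whether this prediction is correct can only be judged against a gold standard in which the protagonist has been annotated manually.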


What do we need?

▶ Gold standard
  ▶ Formal, ‘machine-readable truth’
▶ Program (rule-based, machine learning)
  ▶ Encodes our hypotheses
▶ Evaluation metric
  ▶ Formalised comparison of annotations
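A common formalised comparison is precision, recall and F-score (presumably the P, R and F columns of the results table). The sketch below assumes that gold and system annotations are sets of (start, end, label) tuples and that only exact matches count as correct; the function name and the example annotations are illustrative assumptions.

```python
def precision_recall_f1(gold, system):
    """Compare system annotations to gold annotations by exact match."""
    gold, system = set(gold), set(system)
    true_positives = len(gold & system)
    # Guard against empty sets to avoid division by zero
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        # F1 is the harmonic mean of precision and recall
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented annotations: (start token, end token, category)
gold = {(0, 2, "PER"), (5, 6, "LOC"), (9, 11, "ORG"), (14, 15, "PER")}
system = {(0, 2, "PER"), (5, 6, "ORG")}

p, r, f = precision_recall_f1(gold, system)
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")
```

Here one of the two system annotations is correct (precision 0.5), and one of the four gold annotations is found (recall 0.25). Exact matching is a design choice; partial-overlap matching would yield different numbers.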


What do we learn?

▶ Directly
  ▶ Prediction quality of the program on this corpus
▶ Indirectly
  ▶ Insights into why the program works well (or not)
  ▶ Estimation of the quality on other corpora
▶ Long term
  ▶ Iterative improvement of the programs (e.g., in shared tasks)


Two papers

▶ What were the hypotheses?
▶ Could they be confirmed?


Manual Annotation


Annotation

▶ Interdisciplinary ‘false friend’
  ▶ Different meanings in different disciplines
▶ Adding TEI/XML markup: DH community
▶ Adding comments to page margins: Hermeneutic traditions
  ▶ Literary studies, biblical studies
▶ Assigning categories to textual material: (computational) linguistics


Annotation Workflow

[Diagram: iterative annotation cycle with the nodes Theory, Pilot annotation, Analysis, Annotation guidelines v1, Corpus, Annotation, Analysis, Annotation guidelines v2, Annotation]

Hovy/Lavid (2010); Pagel et al. (2018)


Annotation guidelines

▶ Describe the way to create the machine-readable truth
▶ What is to be annotated (which words)
▶ Working definitions or tests for categories
▶ Living documents: Need to be iteratively improved
▶ Community-wide accepted standards are needed


Annotation Analysis

▶ Multiple annotators annotate the same text(s)
▶ Annotations are compared
▶ Disagreements can be quantified (‘inter-annotator agreement’, IAA)
  Cohen, 1960; Fleiss, 1971; Fournier, 2013; Mathet et al., 2015
▶ Inter- and intra-annotator agreement
▶ … it’s also a good idea to talk to the annotators
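One way to quantify agreement between exactly two annotators is Cohen's kappa (Cohen, 1960), which corrects the observed agreement for the agreement expected by chance. The following is a minimal sketch, assuming both annotators assigned one category to each of the same items; the function name and the toy labels are illustrative assumptions.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one and the same label throughout;
        # returning 1.0 here is a convention, kappa is undefined
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented part-of-speech labels for eight tokens
annotator_1 = ["N", "V", "N", "N", "V", "N", "V", "N"]
annotator_2 = ["N", "V", "V", "N", "V", "N", "N", "N"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

In this toy example the annotators agree on 6 of 8 items (0.75), but because chance agreement is already 0.53, kappa is considerably lower than the raw agreement.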


Indirect Annotations

▶ Annotations as a by-product of games
  ▶ https://www.artigo.org Kohle (2010)
  ▶ https://anawiki.essex.ac.uk/phrasedetectives/ Chamberlain et al. (2008)
▶ Captchas for OCR correction


Learning without Annotations

▶ Train on things that are already there
  ▶ word2vec: Is ‘dog’ a context word of ‘lazy’? Mikolov et al. (2013)
  ▶ BERT Devlin et al. (2019)
    ▶ Can you fill in this blanked word? (“masked language modeling”, MLM)
    ▶ Are these two sentences natural neighbours? (“next sentence prediction”, NSP)
▶ Training data available in abundance
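Both ideas derive training examples from raw text alone. The sketch below illustrates the principle only and is not the actual word2vec or BERT training procedure: `skipgram_pairs` lists (word, context word) pairs within a fixed window, and `mask_one_token` blanks a single token and keeps it as the answer. Function names, the window size and the example sentence are assumptions.

```python
import random


def skipgram_pairs(tokens, window=2):
    """List (target, context) pairs within `window` tokens of each other."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def mask_one_token(tokens, rng, mask_symbol="[MASK]"):
    """Blank one randomly chosen token; return masked tokens and answer."""
    position = rng.randrange(len(tokens))
    masked = list(tokens)
    masked[position] = mask_symbol
    return masked, tokens[position]


tokens = "the quick brown fox jumps over the lazy dog".split()

pairs = skipgram_pairs(tokens, window=2)
print(("lazy", "dog") in pairs)  # 'dog' is a context word of 'lazy'

masked, answer = mask_one_token(tokens, random.Random(0))
print(" ".join(masked), "->", answer)
```

No manual annotation is involved: the labels (the context word, the blanked word) come from the text itself, which is why such training data is available in abundance.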
