NLP_lectures_English

Introduction to Natural Language Processing in 3 Sessions. Dr. Alexandra M. Liguori. Incubio The Big Data Academy. Barcelona, March - April, 2015.

Transcript of NLP_lectures_English

Page 1: NLP_lectures_English

Introduction to Natural Language Processing in 3 Sessions

Dr. Alexandra M. Liguori

Incubio – The Big Data Academy

Barcelona, March - April, 2015

Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions

Page 2: NLP_lectures_English

Outline: Lecture 1

1 Introduction
2 Natural Language Processing
3 Linguistic Ambiguities
4 Definition of corpus
5 Typical NLP tasks
6 POS-tagging


Page 3: NLP_lectures_English

Outline: Lecture 2

1 Recap: Typical NLP tasks → practical examples with GATE
2 Definition of semantics
3 Frames approach
  1 FrameNet
  2 GATE for semantic/content analysis


Page 4: NLP_lectures_English

Outline: Lecture 3

1 Recap: Typical NLP tasks
2 Automatic Question Answering
3 Reference resolution
4 Named Entity Recognition (NER)
5 Keyword / topic / information extraction


Page 5: NLP_lectures_English

Welcome!

Here we go...!!!

Main references:

Textbook: Speech and Language Processing by D. Jurafsky and J. H. Martin

English FrameNet: https://framenet.icsi.berkeley.edu/fndrupal/

Spanish FrameNet: http://sfn.uab.es:8080/SFN

GATE: https://gate.ac.uk/


Page 7: NLP_lectures_English

Introduction: Intelligent machines?

Video: https://www.youtube.com/watch?v=dSIKBliboIo

(Stanley Kubrick and Arthur C. Clarke, screenplay of 2001: A Space Odyssey)


Page 8: NLP_lectures_English

Introduction: Intelligent machines?

Dave Bowman: Open the pod bay doors, HAL.

HAL: I’m sorry Dave, I’m afraid I can’t do that.

(Stanley Kubrick and Arthur C. Clarke, screenplay of 2001: A Space Odyssey)

https://www.youtube.com/watch?v=dSIKBliboIo


Page 9: NLP_lectures_English

Introduction: Intelligent machines?

1 Phonetics and phonology
2 Morphology → produce contractions I’m and can’t
3 Syntax → cf. Open the pod bay doors, HAL. vs. HAL, the pod bay door is open. vs. HAL, is the pod bay door open?
4 Lexical semantics → meaning of component words
5 Compositional semantics → knowledge of how components combine to form larger meanings
6 Pragmatics → cf. I’m sorry ..., I’m afraid I can’t vs. No, I won’t open the door. vs. No.
7 Discourse conventions → engaging in structured conversation using the reference "that" in I’m sorry Dave, I’m afraid I can’t do that


Page 17: NLP_lectures_English

Natural Language Processing

NLP: techniques that process written human language as language.

Applications:
  word counting
  automatic hyphenation
  automated question answering
  named entity extraction (NER)
  information/content extraction
  semantic analysis
  sentiment analysis
  machine translation


Page 21: NLP_lectures_English

Natural Language Processing

An ideal NLP team is very interdisciplinary, including:
  Language experts (linguists)
  Maths experts (mathematicians, physicists, statisticians)
  Programmers (computer scientists)


Page 22: NLP_lectures_English

NLP: Maths & Computer Science


Page 23: NLP_lectures_English

NLP: Six categories of linguistic knowledge

1 Phonetics and phonology ↔ red - read - read; sleigh - slay
2 Morphology ↔ I/you/we/you/they walk - he/she/it walks; walked; walking
3 Syntax ↔ She ate a mammoth breakfast - She eating a mammoth breakfast
4 Semantics ↔ book (verb) - book (noun); duck (verb) - duck (noun)
5 Pragmatics ↔ open the door - can you open the door? - could you open the door, please?


Page 29: NLP_lectures_English

NLP: Six categories of linguistic knowledge

6 Discourse

Gracie: Oh yeah... And then Mr. and Mrs. Jones were having matrimonial trouble, and my brother was hired to watch Mrs. Jones.
George: Well, I imagine she was a very attractive woman.
Gracie: She was, and my brother watched her day and night for six months.
George: Well, what happened?
Gracie: She finally got a divorce.
George: Mrs. Jones?
Gracie: No, my brother’s wife.

John went to Bill’s car dealership to check out an Acura Integra. He looked at it for about an hour.


Page 32: NLP_lectures_English

NLP: Ambiguities and Solutions


Page 34: NLP_lectures_English

Linguistic Ambiguities


Page 36: NLP_lectures_English

Linguistic Ambiguities

Example: I made her duck.

Five possible interpretations:

1 I cooked waterfowl for her.
2 I cooked waterfowl belonging to her.
3 I created the (plaster?) duck she owns.
4 I caused her to quickly lower her head or body.
5 I waved my magic wand and turned her into undifferentiated waterfowl.


Page 44: NLP_lectures_English

Linguistic Ambiguities

Morphological ambiguity
  duck: verb or noun
  her: dative pronoun or possessive pronoun

Syntactic ambiguity: make
  transitive: taking a single direct object (case 2)
  ditransitive: taking two objects, meaning that the first object (her) got made into the second object (duck) (case 5)
  taking a direct object and a verb, meaning that the object (her) got caused to perform the verbal action (duck) (case 4)

Semantic ambiguity: make
  cook
  create


Page 48: NLP_lectures_English

Corpus

Definition: Corpus = large and structured set of texts.

NLP: two types of corpora:
  Training corpus ↔ to make the list of rules or to get the statistical data
  Test corpus ↔ to test the results found with the training corpus
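
As an illustration of the training/test distinction, here is a minimal Python sketch that splits a corpus (a list of sentences) into two disjoint parts. The function name `split_corpus`, the 80/20 ratio and the fixed random seed are assumptions made for this example, not something prescribed by the lecture.

```python
import random


def split_corpus(sentences, test_fraction=0.2, seed=0):
    """Shuffle the sentences and return (training, test) as two disjoint lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(sentences)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical ten-sentence corpus, only to show the call.
corpus = [f"sentence {i}" for i in range(10)]
training, test = split_corpus(corpus)
```

The point of keeping the two parts disjoint is that rules or statistics derived from the training corpus are then evaluated on text they have not seen.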


Page 50: NLP_lectures_English

Typical NLP tasks: Basic and simpler tasks

Tokenization: RegEx
Sentence splitting: RegEx
POS-tagging: POS-tagging algorithms and tag sets
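
To make the "RegEx" entries concrete, the following is a rough Python sketch of regex-based tokenization and sentence splitting. The two patterns are assumptions chosen for this example; they are deliberately naive and are not the rules used by GATE or any other tool named in the lecture.

```python
import re

# A token is a word (optionally with internal apostrophes, as in "can't")
# or a single punctuation character.
TOKEN_RE = re.compile(r"\w+(?:['’]\w+)*|[^\w\s]")

# A sentence boundary is whitespace that follows ., ! or ? and precedes
# an uppercase letter.
SENTENCE_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def tokenize(text):
    """Return the list of word and punctuation tokens in text."""
    return TOKEN_RE.findall(text)


def split_sentences(text):
    """Split text into sentences at the naive boundary pattern."""
    return [s for s in SENTENCE_RE.split(text.strip()) if s]


text = "Open the pod bay doors, HAL. I'm sorry Dave, I'm afraid I can't do that."
sentences = split_sentences(text)
tokens = tokenize(sentences[0])
```

Note the limits of this approach: the sentence pattern also splits after abbreviations, so "Mr. Dobbs came." would be cut after "Mr.", and the tokenizer keeps "can't" as one token, whereas some tag sets expect contractions to be split. Handling such cases is where real tokenizers add exception lists or learned models.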


Page 57: NLP_lectures_English

Typical NLP tasks: Complex tasks

Lemmatization or Stemming: implementations of the Porter Stemmer (e.g. in Java), Stanford NLP tool, GATE, ...

Syntactic parsing: Earley algorithm, CYK algorithm, GHR algorithm, Stanford Parser (Java implementation of probabilistic algorithm)

Topic extraction, NER, Semantic analysis, ...: ad hoc tools, e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...
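
Of the parsing algorithms listed, CYK is the easiest to sketch in a few lines. The Python example below is a CYK recognizer (it only answers whether a sentence is derivable, it does not build a tree). The grammar and lexicon are a hypothetical toy fragment in Chomsky normal form, invented for this example around the "mammoth breakfast" sentence used earlier in the lecture.

```python
from itertools import product

# Toy grammar in Chomsky normal form (hypothetical, for illustration only):
# (left child, right child) -> set of parent nonterminals.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
    ("Det", "Nom"): {"NP"},
    ("Adj", "Nom"): {"Nom"},
}

# word -> set of nonterminals that can produce it directly.
LEXICON = {
    "she": {"NP"},
    "ate": {"V"},
    "a": {"Det"},
    "mammoth": {"Adj"},
    "breakfast": {"Nom"},
}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICON.get(word, ()))
    # Fill in longer spans from shorter ones, trying every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())
    return start in table[0][n - 1]


grammatical = cyk_recognize("she ate a mammoth breakfast".split())
ungrammatical = cyk_recognize("she eating a mammoth breakfast".split())
```

With this toy lexicon the first sentence is accepted and the second is rejected simply because "eating" has no entry; a realistic grammar would reject it on structural grounds instead.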


Page 64: NLP_lectures_English

POS-tagging

Example with Penn Treebank POS-tags:

A/DT woman/NN came/VBD from/IN the/DT back/NN of/IN the/DT store/NN ./. She/PP appeared/VBD to/TO be/VB sleepy/JJ and/CC quite/RB a/DT bit/NN younger/JJR than/IN Mr./NNP Dobbs/NNP and/CC to/TO be/VB wearing/VBG too/RB much/RB makeup/NN ./.
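
The word/TAG notation above is easy to read programmatically. The short Python sketch below turns such a string into (word, tag) pairs; the function name `parse_tagged` is an assumption for this example, and the code assumes every whitespace-separated item contains at least one slash.

```python
def parse_tagged(text):
    """Turn 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    for item in text.split():
        # Split on the last slash only, so that './.' yields ('.', '.').
        word, _, tag = item.rpartition("/")
        pairs.append((word, tag))
    return pairs


tagged = parse_tagged(
    "A/DT woman/NN came/VBD from/IN the/DT back/NN of/IN the/DT store/NN ./."
)
nouns = [word for word, tag in tagged if tag == "NN"]
```

Once the text is in this form, simple queries such as "all singular nouns" become one-line filters over the tag.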


Page 67: NLP_lectures_English

POS-tagging

Example of ambiguity:

1 Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
2 People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.


Page 70: NLP_lectures_English

POS-tagging

Three main tagging algorithms or methods:
1 rule-based tagging, e.g. ENGTWOL
2 stochastic tagging, e.g. HMM tagger
3 transformation-based tagging, e.g. Brill tagger


Page 71: NLP_lectures_English

Rule-based POS-tagging

Example of ambiguity:
1 Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
2 People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.

Large database of hand-written disambiguation rules, e.g.:
TO + VB → YES
TO + NN → NO
DT + NN → YES
DT + VB → NO
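
The four example rules can be applied mechanically. The Python sketch below is a toy illustration, not ENGTWOL: the three-word lexicon and the function name `rule_based_tag` are assumptions for this example, and the rule table contains only the four pairs shown above, keyed by (previous tag, candidate tag).

```python
# Candidate tags per word (a tiny hypothetical lexicon).
LEXICON = {"to": ["TO"], "the": ["DT"], "race": ["VB", "NN"]}

# (previous tag, candidate tag) -> allowed?  Taken from the four rules above.
RULES = {
    ("TO", "VB"): True,
    ("TO", "NN"): False,
    ("DT", "NN"): True,
    ("DT", "VB"): False,
}


def rule_based_tag(words):
    """Tag each word, discarding candidates forbidden after the previous tag."""
    tags = []
    for word in words:
        candidates = LEXICON[word.lower()]
        prev = tags[-1] if tags else None
        # Pairs not mentioned in RULES are treated as allowed.
        allowed = [t for t in candidates if RULES.get((prev, t), True)]
        # Fall back to the first candidate if every tag was ruled out.
        tags.append((allowed or candidates)[0])
    return list(zip(words, tags))


after_to = rule_based_tag(["to", "race"])
after_the = rule_based_tag(["the", "race"])
```

The same word "race" comes out as VB after "to" and as NN after "the", which is exactly the disambiguation the two example sentences require. A real rule-based tagger relies on a much larger lexicon and many more constraints.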


Page 75: NLP_lectures_English

Stochastic POS-tagging

Example of ambiguity:
1 Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
2 People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.

Training corpus to compute probability of given word having given tag in given context, e.g.:

is/VBZ expected/VBN to/TO race/VB → 98%
is/VBZ expected/VBN to/TO race/NN → 2%
reason/NN for/IN the/DT race/NN → 97%
reason/NN for/IN the/DT race/VB → 3%
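
The idea of estimating tag probabilities from a training corpus can be sketched with plain relative frequencies. In the Python example below, the context is reduced to the previous tag, and the five-sentence training corpus is invented for illustration, so the resulting numbers are not the 98% / 2% figures quoted above. This is also not a full HMM tagger, which would combine tag-transition and word-emission probabilities and decode the best tag sequence for the whole sentence.

```python
from collections import Counter, defaultdict


def train(tagged_sentences):
    """Count how often each tag occurs for a (previous tag, word) context."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start marker
        for word, tag in sentence:
            counts[(prev, word.lower())][tag] += 1
            prev = tag
    return counts


def tag_probabilities(counts, prev_tag, word):
    """Relative frequency of each tag for word after prev_tag ({} if unseen)."""
    seen = counts.get((prev_tag, word.lower()), Counter())
    total = sum(seen.values())
    return {tag: n / total for tag, n in seen.items()} if total else {}


# Hypothetical toy training corpus.
training_corpus = [
    [("to", "TO"), ("race", "VB")],
    [("to", "TO"), ("race", "VB")],
    [("to", "TO"), ("race", "VB")],
    [("to", "TO"), ("race", "NN")],
    [("the", "DT"), ("race", "NN")],
]
counts = train(training_corpus)
probs_after_to = tag_probabilities(counts, "TO", "race")
probs_after_the = tag_probabilities(counts, "DT", "race")
```

A tagger built on these estimates would pick the most probable tag in each context; with more training data the estimates move toward sharper values like those shown on the slide.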


Page 79: NLP_lectures_English

Transformation-based POS-tagging

Example of ambiguity:

Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.

Rules automatically induced from data using Machine Learning techniques, e.g.:

1. A priori, prob(race = NN) = 65%, prob(race = VB) = 35% → system would always take race = NN.
2. Machine Learning to learn conditional probabilities:
3. is/VBZ expected/VBN to/TO race/VB → 98%
   reason/NN for/IN the/DT race/NN → 97%
4. System takes race = NN or race = VB depending on context.
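The steps above can be sketched as a two-pass tagger: a most-frequent-tag baseline followed by context-dependent corrections. In this sketch the lexicon is a toy one and the single transformation (change NN to VB after TO) is written by hand as an assumption; a real transformation-based tagger would induce such rules from a tagged corpus.

```python
# Step 1: most-frequent-tag baseline (toy lexicon; "race" defaults to NN,
# matching the a priori 65% / 35% split on the slide).
MOST_FREQUENT_TAG = {
    "is": "VBZ", "expected": "VBN", "to": "TO", "race": "NN",
    "the": "DT", "reason": "NN", "for": "IN",
}

# Step 2: transformations of the form (from_tag, to_tag, previous_tag).
# Hand-written here; a learner would induce them from data.
TRANSFORMATIONS = [("NN", "VB", "TO")]

def tag(words):
    """Tag with the baseline, then apply each transformation left to right."""
    tags = [MOST_FREQUENT_TAG[w] for w in words]
    for from_tag, to_tag, prev_tag in TRANSFORMATIONS:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == prev_tag:
                tags[i] = to_tag
    return list(zip(words, tags))

print(tag(["is", "expected", "to", "race"]))   # race ends up as VB
print(tag(["reason", "for", "the", "race"]))   # race stays NN
```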

Page 86: NLP_lectures_English

POS-tagging

POS-tagging tools for English:

- Brill tagger
- Stanford Log-linear POS-tagger (Java)
- POS-tagger integrated in GATE (Java)
- POS-tagger with NLTK (Python)

Page 87: NLP_lectures_English

Outline: Lecture 2

1. Recap: Typical NLP tasks → practical examples with GATE
2. Def. of semantics
3. Frames approach
   1. FrameNet
   2. GATE for semantic/content analysis

Page 88: NLP_lectures_English

Typical NLP tasks: Basic and simpler tasks

Tokenization: RegEx
Sentence splitting: RegEx
POS-tagging: POS-tagging algorithms and tag sets
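As an illustration of the RegEx approach to the first two tasks, here is a minimal Python sketch. Both patterns are simplifying assumptions chosen for this example, not the expressions used by GATE or any other tool.

```python
import re

def split_sentences(text):
    """Split after ., ! or ? when followed by whitespace and an uppercase letter."""
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())

def tokenize(sentence):
    """Words (with optional internal apostrophe or hyphen) or single punctuation marks."""
    return re.findall(r"\w+(?:['-]\w+)*|[^\w\s]", sentence)

text = "Secretariat is expected to race tomorrow. People continue to inquire."
for sentence in split_sentences(text):
    print(tokenize(sentence))
```

A splitter this simple breaks on abbreviations such as "Mr. Jones", which is one reason practical sentence splitters add abbreviation lists on top of the regular expressions.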

Page 95: NLP_lectures_English

Typical NLP tasks: Complex tasks

Lemmatization or Stemming: Implementations of Porter Stemmer (e.g. in Java), Stanford NLP tool, GATE, ...

Topic extraction, NER, Semantic analysis, ...: Ad hoc tools, e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...

Page 100: NLP_lectures_English

EX.: GATE

Concrete examples with GATE:

1. Tokenizer
2. Sentence-splitter
3. POS-tagger
4. Stemmer

GATE
https://gate.ac.uk/
development at the University of Sheffield, UK.

Page 102: NLP_lectures_English

Semantics

’Then you should say what you mean,’ the March Hare went on.
’I do,’ Alice hastily replied; ’at least, I mean what I say – that’s the same thing, you know.’
’Not the same thing a bit!’ said the Hatter. ’You might just as well say that ”I see what I eat” is the same thing as ”I eat what I see”!’

Lewis Carroll, Alice in Wonderland

Page 103: NLP_lectures_English

Frames and FrameNet

Frame
A schematic representation of a situation involving various participants, and other conceptual roles. E.g.:

Abby bought a car from Robin for $5,000.
Robin sold a car to Abby for $5,000.

English FrameNet
https://framenet.icsi.berkeley.edu/fndrupal/
development at the International Computer Science Institute in Berkeley, California.
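The point of the two example sentences is that different verbs can evoke the same situation with the same participants. A minimal Python sketch of that idea follows; the frame name "Commerce", the role names and the `analyze` helper are simplified assumptions made for this illustration and are not the FrameNet data format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FrameInstance:
    frame: str   # the situation evoked by the verb
    buyer: str
    seller: str
    goods: str
    money: str

def analyze(subject, verb, goods, partner, money):
    """Map a simple clause onto the shared frame, whichever verb was used."""
    if verb == "bought":      # "X bought G from Y": the subject is the buyer
        buyer, seller = subject, partner
    elif verb == "sold":      # "X sold G to Y": the subject is the seller
        buyer, seller = partner, subject
    else:
        raise ValueError(f"unknown lexical unit: {verb}")
    return FrameInstance("Commerce", buyer, seller, goods, money)

a = analyze("Abby", "bought", "a car", "Robin", "$5,000")
b = analyze("Robin", "sold", "a car", "Abby", "$5,000")
print(a == b)  # True: both sentences describe the same situation
```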

Page 105: NLP_lectures_English

English FrameNet Example

Page 115: NLP_lectures_English

Frame Relations

FrameNet additionally captures relationships between different frames using relations. These include the following:

- Inheritance: When one frame is a more specific version of another, more abstract parent frame. Anything that is true about the parent frame must also be true about the child frame, and a mapping is specified between the frame elements of the parent and the frame elements of the child.
- Perspectivized-in: A neutral frame (like Commerce-transfer-goods) is connected to a frame with a specific perspective of the same scenario (e.g. the Commerce-sell frame, which assumes the perspective of the seller, or the Commerce-buy frame, which assumes the perspective of the buyer).
- Subframe: Some frames, like the Criminal-process frame, refer to complex scenarios that consist of several individual states or events that can be described by separate frames like Arrest, Trial, and so on.

Page 116: NLP_lectures_English

Frame Relations

- Precedes: The Precedes relation captures a temporal order that holds between subframes of a complex scenario.
- Causative-of and Inchoative-of: There is a fairly systematic relationship between stative descriptions (e.g. the Position-on-a-scale frame, "She had a high salary") and causative descriptions (Cause-change-of-scalar-position, "She raised his salary") or inchoative descriptions (Change-position-on-a-scale, e.g. "Her salary increased").
- Using: A relation that holds when a frame in some way involves another frame. For instance, the Judgment-communication frame uses both the Judgment frame and the Statement frame, but does not inherit from either of them because there is no clear correspondence of the frame elements.
- See-also: Connects frames that bear some resemblance but need to be distinguished carefully.

Page 117: NLP_lectures_English

Spanish FrameNet

Frame
A schematic representation of a situation involving various participants, and other conceptual roles. E.g.:

El rock influye en los artistas de hoy en día para sus producciones artísticas.
(Rock influences today's artists in their artistic productions.)
Los artistas de hoy en día se inspiran al rock para sus producciones artísticas.
(Today's artists draw inspiration from rock for their artistic productions.)

Spanish FrameNet
http://sfn.uab.es:8080/SFN
development at the Universidad Autónoma de Barcelona and International Computer Science Institute in Berkeley, California.

Page 119: NLP_lectures_English

Spanish FrameNet Example

Page 129: NLP_lectures_English

Frames and GATE

And now... Ex. in English implementing FRAMES, LUs, and FEs with GATE !!!

GATE
https://gate.ac.uk/
development at the University of Sheffield, UK.

Page 131: NLP_lectures_English

Outline: Lecture 3

1. Recap: Typical NLP tasks
2. Automatic Question Answering
3. Reference resolution
4. Named Entity Recognition (NER)
5. Keyword / topic / information extraction

Page 132: NLP_lectures_English

Typical NLP tasks: Basic and simpler tasks

Tokenization: RegEx
Sentence splitting: RegEx
POS-tagging: POS-tagging algorithms and tag sets

Page 139: NLP_lectures_English

Typical NLP tasks: Complex tasks

Lemmatization or Stemming: Implementations of Porter Stemmer (e.g. in Java), Stanford NLP tool, GATE, ...

Syntactic parsing: Earley algorithm, CYK algorithm, GHR algorithm, Stanford Parser (Java implementation of probabilistic algorithm)

Question answering, Topic extraction, NER, Semantic analysis, ...: Ad hoc tools, e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...

Page 146: NLP_lectures_English

NLP: Ambiguities and Solutions

Page 148: NLP_lectures_English

Question Answering

Example from Who’s on first

Page 150: NLP_lectures_English

Simple Question Answering

ELIZA
ELIZA simulated a Rogerian psychologist, and could carry on conversations with the user like the following:

User-1: Men are all alike.
ELIZA-1: IN WHAT WAY
User-2: They’re always bugging us about something or other.
ELIZA-2: CAN YOU THINK OF A SPECIFIC EXAMPLE
User-3: Well, my boyfriend made me come here.
ELIZA-3: YOUR BOYFRIEND MADE YOU COME HERE
User-4: Yes, he says I’m depressed much of the time.
ELIZA-4: I AM SORRY TO HEAR YOU ARE DEPRESSED.

Page 151: NLP_lectures_English

ELIZA

Regular expression substitutions
Eliza worked by having a cascade of regular expression substitutions that each matched some part of the input lines and changed them:

change all instances of my to YOUR, and I’m to YOU ARE, etc., e.g.:

1. User-3: Well, my boyfriend made me come here.
   ELIZA-3: YOUR BOYFRIEND MADE YOU COME HERE
2. User-4: ... I’m depressed ... .
   ELIZA-4: ... YOU ARE DEPRESSED.
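The first stage of such a cascade can be sketched with `re.sub` in Python. The three substitutions below are assumptions chosen to reproduce the two examples above; they are not ELIZA's original script.

```python
import re

# Ordered substitutions: first-person forms become second-person forms.
SWAPS = [
    (r"\bI'm\b", "YOU ARE"),
    (r"\bmy\b", "YOUR"),
    (r"\bme\b", "YOU"),
]

def reflect(utterance):
    """Apply each substitution in turn, then upper-case the result."""
    for pattern, replacement in SWAPS:
        utterance = re.sub(pattern, replacement, utterance, flags=re.IGNORECASE)
    return utterance.upper()

print(reflect("my boyfriend made me come here"))
print(reflect("he says I'm depressed"))
```

The word-boundary anchors (`\b`) keep "me" from matching inside words such as "come".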

Page 154: NLP_lectures_English

ELIZA

Regular expression substitutions
Eliza worked by having a cascade of regular expression substitutions that each matched some part of the input lines and changed them:

relevant patterns in the input → create an appropriate output; e.g.:

1. s/.* YOU ARE (depressed|sad) .*/I AM SORRY TO HEAR YOU ARE \1/
2. s/.* YOU ARE (depressed|sad) .*/WHY DO YOU THINK YOU ARE \1/
3. s/.* all .*/IN WHAT WAY/
4. s/.* always .*/CAN YOU THINK OF A SPECIFIC EXAMPLE/
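The substitution rules above translate fairly directly into Python. In this sketch the rules are tried in order and the first match wins; that ordering, the use of word boundaries instead of literal spaces, and the fallback reply are assumptions made for the example rather than details given on the slide.

```python
import re

# (pattern, response template) pairs, tried in order; the first match wins.
RULES = [
    (r".*\bYOU ARE (depressed|sad)\b.*", r"I AM SORRY TO HEAR YOU ARE \1"),
    (r".*\ball\b.*", "IN WHAT WAY"),
    (r".*\balways\b.*", "CAN YOU THINK OF A SPECIFIC EXAMPLE"),
]

def respond(utterance):
    """Return the response of the first rule whose pattern matches the whole input."""
    for pattern, template in RULES:
        match = re.fullmatch(pattern, utterance, flags=re.IGNORECASE)
        if match:
            # expand() fills \1 with the captured word (depressed or sad).
            return match.expand(template).upper()
    return "PLEASE GO ON"  # fallback reply, not taken from the slide

print(respond("Men are all alike."))
print(respond("They're always bugging us about something or other."))
print(respond("he says YOU ARE depressed much of the time"))
```

The last input is written as it would look after the pronoun substitutions (I'm → YOU ARE) have already been applied, since the rules on the slide match on YOU ARE.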

Page 159: NLP_lectures_English

Reference resolution

Discourse

Gracie: Oh yeah... And then Mr. and Mrs. Jones were having matrimonial trouble, and my brother was hired to watch Mrs. Jones.
George: Well, I imagine she was a very attractive woman.
Gracie: She was, and my brother watched her day and night for six months.
George: Well, what happened?
Gracie: She finally got a divorce.
George: Mrs. Jones?
Gracie: No, my brother’s wife.

John went to Bill’s car dealership to check out an Acura Integra. He looked at it for about an hour.

Page 162: NLP_lectures_English

Reference resolution

1 Reference phenomena
2 Constraints on coreference
3 Preferences in pronoun interpretation
4 Example of algorithm for pronoun resolution

Page 167: NLP_lectures_English

Reference resolution

Reference phenomena

1 Indefinite noun phrases ↔ I saw a Honda Civic today.
2 Definite noun phrases ↔ I saw a Honda Civic today. The Honda Civic was blue.
3 Pronouns ↔ I saw a Honda Civic today. It was blue.
4 Demonstratives ↔ I bought a Honda Civic today. It’s similar to the one I bought five years ago. That one was really nice, but I like this one even better.
5 One-anaphora ↔ I saw no less than 6 Honda Civics today. Now I want one.
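As a rough illustration of the five categories above, the following sketch labels a referring expression by looking only at its first and last word. The word lists and the function name `reference_type` are assumptions for this example; a real system would rely on POS tags and syntactic analysis rather than surface words.

```python
# Small hand-written word lists (illustrative, far from complete).
INDEFINITE = {"a", "an", "some"}
DEMONSTRATIVE = {"this", "that", "these", "those"}
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}


def reference_type(noun_phrase):
    """Guess the type of a referring expression from its surface form."""
    tokens = noun_phrase.lower().split()
    if not tokens:
        return "unknown"
    first, last = tokens[0], tokens[-1]
    if len(tokens) == 1 and first in PRONOUNS:
        return "pronoun"
    if first in DEMONSTRATIVE:
        return "demonstrative"
    if last == "one":
        return "one-anaphora"
    if first in INDEFINITE:
        return "indefinite"
    if first == "the":
        return "definite"
    return "unknown"


for phrase in ["a Honda Civic", "The Honda Civic", "It", "That one", "one"]:
    print(phrase, "->", reference_type(phrase))
```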

Page 173: NLP_lectures_English

Reference resolution

Constraints on coreference

1 number agreement ↔ John has a new car. It is red. / John has a new car. They are red.
2 person and case agreement ↔ John and Mary have new cars. They love them.
3 gender agreement ↔ John has a new car. It is attractive. / John has a new car. He is attractive.
4 syntactic constraints ↔ John bought himself a new car. / John bought him a new car.
5 selectional restrictions ↔ John parked his car in the garage. He had driven it around for hours.
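The agreement constraints above act as hard filters: a candidate antecedent that disagrees with the pronoun is discarded. Below is a minimal sketch of the number and gender filters only. The `Mention` class, the pronoun feature table, and the function name are hypothetical, and syntactic constraints and selectional restrictions are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    text: str
    number: str  # "sg" or "pl"
    gender: str  # "masc", "fem" or "neut"


# Hand-written features for a few pronouns (illustrative assumption).
# A gender of None means the pronoun imposes no gender constraint.
PRONOUN_FEATURES = {
    "he": ("sg", "masc"),
    "she": ("sg", "fem"),
    "it": ("sg", "neut"),
    "they": ("pl", None),
}


def compatible_antecedents(pronoun, candidates):
    """Keep only the candidates that agree with the pronoun in number and gender."""
    number, gender = PRONOUN_FEATURES[pronoun.lower()]
    return [
        c for c in candidates
        if c.number == number and (gender is None or c.gender == gender)
    ]


# "John has a new car. It / He / They ..."
candidates = [Mention("John", "sg", "masc"), Mention("a new car", "sg", "neut")]
for pronoun in ["It", "He", "They"]:
    print(pronoun, "->", [c.text for c in compatible_antecedents(pronoun, candidates)])
```

With these candidates, "They" finds no compatible antecedent, which mirrors the number agreement example on the slide.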

Page 179: NLP_lectures_English

Reference resolution

Preferences in pronoun interpretation

1 recency ↔ Peter has an Audi. Bob has a Honda. Anne likes to drive it.
2 grammatical role ↔ Peter went to the Honda dealership with Bob. He bought a Civic. / Bob went to the Honda dealership with Peter. He bought a Civic.
3 repeated mention ↔ Anne needed a car to drive to her new job. She decided she wanted something roomy. Carol went to the Honda dealership with her. She bought a Civic.
4 parallelism ↔ Anne went with Carol to the Honda dealership. Sally went with her to the VW dealership.
5 verb semantics ↔ Peter seized the Honda pamphlet from Bob. He loves reading about cars. / Peter passed the Honda pamphlet to Bob. He loves reading about cars.
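Unlike the hard constraints, these preferences are soft and can be combined into a score. The sketch below covers only recency, grammatical role, and repeated mention, in the spirit of salience-based pronoun resolution. The weights, role names, and function names are illustrative assumptions, not values from the lecture, and the candidates are assumed to have already passed the agreement filters.

```python
# Illustrative weights (assumptions): subjects are more salient than objects,
# which are more salient than other (oblique) positions.
ROLE_WEIGHT = {"subject": 80, "object": 50, "oblique": 40}
RECENCY_WEIGHT = 100


def salience(mentions, current_sentence):
    """Sum a score per entity over all of its mentions.

    mentions: list of (entity, sentence_index, role) tuples.
    Each mention's score is halved for every sentence of distance, so recent
    mentions count more (recency), and several mentions of the same entity
    add up (repeated mention).
    """
    scores = {}
    for entity, sentence_index, role in mentions:
        distance = current_sentence - sentence_index
        score = (RECENCY_WEIGHT + ROLE_WEIGHT[role]) / (2 ** distance)
        scores[entity] = scores.get(entity, 0.0) + score
    return scores


def resolve(mentions, current_sentence):
    """Pick the most salient entity as the antecedent of the pronoun."""
    scores = salience(mentions, current_sentence)
    return max(scores, key=scores.get)


# Grammatical role: "Peter went to the Honda dealership with Bob. He bought a Civic."
print(resolve([("Peter", 0, "subject"), ("Bob", 0, "oblique")], 1))

# Repeated mention: Anne is mentioned in three sentences, Carol only in the last one.
print(resolve(
    [("Anne", 0, "subject"), ("Anne", 1, "subject"),
     ("Carol", 2, "subject"), ("Anne", 2, "oblique")],
    3,
))
```

Parallelism and verb semantics need syntactic and lexical knowledge and are left out of this sketch.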

Page 185: NLP_lectures_English

NER

Named Entity Recognition

Can be broken down into two distinct problems, i.e.:

1 detection of names
2 classification of the names by the type of entity to which they refer → 4 standard types:
    1 person (e.g. "Carol", "Tom Hanks", etc.)
    2 organization (e.g. "WWF", "IBM", "Bank of America", etc.)
    3 location (e.g. "London", "Washington D.C.", "L.A.", "Barcelona", etc.)
    4 other (e.g. "Hotel Sunshine", etc.)
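The two sub-problems can be illustrated with a toy recognizer: detection by a capitalization pattern, classification by lookup in a small gazetteer built from the examples above. This is a sketch under strong assumptions (the pattern, the gazetteer, and the function names are made up for illustration); real tools use statistical models or much richer rules.

```python
import re

# Tiny gazetteer built from the slide's examples (illustrative only).
GAZETTEER = {
    "person": {"Carol", "Tom Hanks"},
    "organization": {"WWF", "IBM", "Bank of America"},
    "location": {"London", "Washington D.C.", "L.A.", "Barcelona"},
}

# A name token is either a dotted abbreviation ("L.A.", "D.C.") or a
# capitalized word of at least two characters ("London", "IBM").
TOKEN = r"(?:[A-Z]\.(?:[A-Z]\.)*|[A-Z]\w+)"
# A name is one or more such tokens, optionally linked by "of".
NAME_PATTERN = re.compile(rf"{TOKEN}(?:\s+(?:of\s+)?{TOKEN})*")


def detect(text):
    """Step 1: detection of candidate names via capitalization."""
    return [m.group() for m in NAME_PATTERN.finditer(text)]


def classify(name):
    """Step 2: classification by gazetteer lookup, defaulting to 'other'."""
    for entity_type, names in GAZETTEER.items():
        if name in names:
            return entity_type
    return "other"


def recognize(text):
    return [(name, classify(name)) for name in detect(text)]


print(recognize("Carol flew from London to Barcelona to meet Tom Hanks at Hotel Sunshine."))
print(recognize("IBM opened an office in Washington D.C."))
```

A capitalization pattern like this one also fires on ordinary sentence-initial words such as "The", which is one reason practical systems do not rely on it alone.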

Page 193: NLP_lectures_English

NER

Tools for Named Entity Recognition

GATE for English, Spanish, and many more, via graphical interface and Java API (development at the University of Sheffield, UK)
https://gate.ac.uk/

NETagger: Java-based Illinois Named Entity Recognition (development by Cognitive Computation Group at University of Illinois at Urbana-Champaign)
http://cogcomp.cs.illinois.edu/page/software_view/NETagger

OpenNLP: rule-based and statistical Named Entity Recognition (development by Apache)
http://opennlp.apache.org/index.html

Stanford CoreNLP: Java-based CRF Named Entity Recognition (development by Stanford Natural Language Processing Group)
http://nlp.stanford.edu/software/CRF-NER.shtml

Page 194: NLP_lectures_English

Keyword / topic / information extraction

Tools

Keyword extraction: e.g. GATE (ANNIE tool) for English, Spanish, and many more, via graphical interface and Java API → simply using jape files for the LUs
tool from Volker ?
Topic / information extraction: e.g. GATE (ANNIE tool) for English, Spanish, and many more, via graphical interface and Java API → using jape files for the LUs, FEs, and FRAMES

GATE
https://gate.ac.uk/
development at the University of Sheffield, UK

Page 195: NLP_lectures_English

Thank you for your attention!

For more information:

Example text book: Speech and Language Processing by D. Jurafsky and J. H. Martin

Web page: www.alexandramliguoriphd.com

Linkedin profile: Alexandra M. Liguori, Ph.D.
