Semantic Analysis in Language Technology

Lecture 1: Introduction & Digressions

Marina Santini

Program: Computational Linguistics and Language Technology

Dept of Linguistics and Philology

Uppsala University, Sweden

12 Nov 2013 (revised 17 Dec)

Course Website: http://stp.lingfil.uu.se/~santinim/sais/sais_fall2013.htm

Acknowledgements

Thanks to Mats Dahllöf for the many slides I borrowed from his previous course and for structuring such interesting and comprehensive content.

Practical Information

Intended Learning Outcomes

Assignments and Examination

Reading List

Demos

Course Website & Contact Details

Course website:

http://stp.lingfil.uu.se/~santinim/sais/sais_fall2013.htm

Contact details:

[email protected]

[email protected]

[email protected]

Check the website regularly and make sure to refresh the page: we are building up this course together, so this page will be continuously updated!

About the Course

Introduction to Semantics in Language Technology and NLP.

Focus on methods used in Language Technology and NLP to perform the following tasks:

Sentiment Analysis (SA)

Word Sense Disambiguation (WSD)

Semantic Role Labelling/Predicate-Argument Extraction (SRL/PAS)

Intended Learning Outcomes

In order to pass the course, a student must be able to:

describe systems that perform the following tasks, apply them to authentic linguistic data, and evaluate the results:

1. detect and extract attitudes and opinions from text, i.e. Sentiment Analysis (SA);

2. disambiguate instances of polysemous lemmas, i.e. Word Sense Disambiguation (WSD);

3. perform Semantic Role Labelling/Predicate-Argument Extraction (SRL/PAS).

Compulsory Readings

1. Bing Liu (2012), Sentiment Analysis and Opinion Mining, Morgan & Claypool.

2. Daniel Jurafsky and James H. Martin (2009), Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Second Edition, Pearson Education.

3. Martha Palmer, Daniel Gildea and Paul Kingsbury (2005), The Proposition Bank: An Annotated Corpus of Semantic Roles, Computational Linguistics 31(1), 71-106.

4. Additional suggested readings will be listed on the course website.

Assignments and Examination

Four Assignments:

1. Essay writing: independent study of a system, an approach, or a field within semantics-oriented language technology. The study will be presented both as a written essay and as an oral presentation. The essay work will also include a feedback step where the work of another group is reviewed.

2. Assignment on Semantic Role Labelling/Predicate-Argument Structure (SRL/PAS)

3. Assignment on Sentiment Analysis (SA)

4. Assignment on Word Sense Disambiguation (WSD)

General Info:

No lab sessions; supervision by email.

Essay and assignments must be submitted to [email protected]

Examination:

Written reports are submitted for each assignment.

All four assignments are necessary to pass the course.

Grade G will be given to students who pass each assignment. Grade VG will be given to those who pass the essay assignment and at least one of the other assignments with distinction.

IMPORTANT!

Start thinking about a topic you are interested in for your essay writing assignment!

Practical Organization

45 min + 15 min break

Lectures on Course webpage and SlideShare

Email all your questions to me: [email protected]

IMPORTANT: Send me an email to [email protected], so that I can make sure that I have all the correct email addresses. If you do not get an acknowledgement of receipt, please give me a shout!

SuperImportant: Interaction and Cooperation

Communicate with me and with your classmates to exchange ideas, especially if you have problems understanding notions, concepts, or practical implementations.

Recommendation: share your knowledge with your peers and let off steam.

Cheating is not permitted

Semantics in Language Technology - Overview

Semantics in Language Technology

Applications

Lexical Semantics

Representation of Meaning

Summary

What is Semantics?

Your impromptu definitions:

1. Meaning of language (words, phrases, etc.)

2. Break down complex meaning into simpler blocks of meaning

3. Content understanding

4. Disambiguation

5. Understanding a phrase

6. Understanding the meaning of phrases depending on different contexts

7. Meaning and connotation

Logic and Semantics

Aristotelian logic: important ever since (dominant until the advent of predicate logic at the end of the 19th century).

Syllogisms, e.g.:

Premise: No reptiles have fur.

Premise: All snakes are reptiles.

Conclusion: No snakes have fur.

This is an inference system
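
The syllogism above can be checked mechanically on a small example. The following is a minimal sketch in Python, assuming a toy world in which each category is a plain set of individuals. The animal names and the helper names `no_a_is_b` and `all_a_are_b` are illustrative assumptions, not part of the lecture.

```python
# Toy world: each category is modelled as a set of individuals (illustrative data).
reptiles = {"cobra", "python", "iguana"}
snakes = {"cobra", "python"}
furry = {"cat", "bear"}


def no_a_is_b(a: set, b: set) -> bool:
    """'No A is B': the two sets share no member."""
    return a.isdisjoint(b)


def all_a_are_b(a: set, b: set) -> bool:
    """'All A are B': A is a subset of B."""
    return a <= b


premise_1 = no_a_is_b(reptiles, furry)     # No reptiles have fur.
premise_2 = all_a_are_b(snakes, reptiles)  # All snakes are reptiles.
conclusion = no_a_is_b(snakes, furry)      # No snakes have fur.

# In a valid syllogism the conclusion holds whenever both premises hold.
if premise_1 and premise_2:
    assert conclusion
print(premise_1, premise_2, conclusion)
```

Running this on one toy world only illustrates the inference pattern; it does not prove the syllogism valid in general, which is what a logical calculus is for.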

Formal and Computational Semantics

Computational semantics “is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions.” (Wikipedia).

Early systems were rule-based. The most famous example is “Montague grammar” (1970), with sophisticated mechanisms for translating English into a very rich logic. Underlying belief: natural languages (NL) and formal languages can be treated in the same way.

Language technology: Recent interest in data-driven and machine learning-based methods.

Modern Logic and Formal Semantics

Modern logic develops in the late 19th century: more general and systematic (Gottlob Frege). Predicate logic: symbolic formal systems such as first-order logic (FOL). Two common quantifiers: “there exists” (∃) and “for all” (∀).

Formal semantics in linguistics and philosophy based on logic (20th Century). Formal semantics seeks to understand linguistic meaning by constructing precise mathematical models.
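
As an illustration of the two quantifiers, the snake syllogism from the Aristotelian logic slide can be written in predicate logic. This is a sketch only; the predicate names Reptile, Snake and HasFur are assumptions chosen for readability.

```latex
\begin{align*}
\text{Premise 1:}  \quad & \neg\exists x\,(\mathit{Reptile}(x) \land \mathit{HasFur}(x)) \\
\text{Premise 2:}  \quad & \forall x\,(\mathit{Snake}(x) \rightarrow \mathit{Reptile}(x)) \\
\text{Conclusion:} \quad & \neg\exists x\,(\mathit{Snake}(x) \land \mathit{HasFur}(x))
\end{align*}
```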

Formal Semantics

Semantics, meanings and states of affairs: what a sentence means is a structure involving (lexical) concepts and relations among them. It can be articulated as a semantic representation.

E.g. “I ate a turkey sandwich.” in predicate logic:

A sentence, and likewise its semantic representation, also represents a possible state of affairs.
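
For the example sentence above, one possible rendering in an event-based style of predicate logic is sketched below. The predicate names (Eating, Eater, Eaten, TurkeySandwich) and the constant Speaker are illustrative assumptions and are not necessarily those used on the original slide. Past tense is deliberately left out of this sketch.

```latex
\[
\exists e\,\exists y\;\bigl(\mathit{Eating}(e) \land \mathit{Eater}(e, \mathit{Speaker})
  \land \mathit{Eaten}(e, y) \land \mathit{TurkeySandwich}(y)\bigr)
\]
```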

Semantics and Truth

Correspondence theory of truth:

A sentence is true if its content corresponds to an actual state of affairs; otherwise, it is false.

However:

Many sentences are difficult to formalize in logic. (Modality, conditionality, vague quantification, tense, etc.)
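
For simple sentences, the correspondence idea can be sketched in a few lines of Python. This is a minimal sketch, assuming that a state of affairs is a set of atomic facts and that anything not listed is false (a closed-world assumption). The fact tuples and the function name `is_true` are hypothetical.

```python
# A "state of affairs" is modelled as a set of atomic facts (illustrative data).
world = {
    ("ate", "speaker", "turkey_sandwich"),
    ("is_a", "cobra", "snake"),
}


def is_true(fact: tuple, state_of_affairs: set) -> bool:
    """A sentence's content is true if it corresponds to a fact in the world."""
    return fact in state_of_affairs


print(is_true(("ate", "speaker", "turkey_sandwich"), world))
print(is_true(("ate", "speaker", "pizza"), world))
```

Nothing in this sketch handles modality, conditionality, vague quantification or tense, which is exactly where the difficulties listed above begin.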

Formalizing Meaning

Linguistic content has – at least to a certain degree – a logical structure that can be formalized by means of logical calculi – meaning representations.

The representation languages should be simple and unambiguous – in contrast to complex and ambiguous NL.

Logical calculi come with accounts of logical inference. They are useful for reasoning-based applications.

Meaning formalization faces far-reaching conceptual and computational difficulties.

Compositionality

Linguistic content is compositional: Simple expressions have a given (lexical) meaning; the meaning of complex expressions is determined by the meanings of their constituents.

People produce and understand new phrases and sentences all the time. (NLP must also deal with these.)

Compositionality is studied in detail in compositional syntax-driven semantics. Work in this field is typically about hand-coded rule systems for small fragments of NL.
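
A hand-coded rule system for a very small fragment can be sketched as follows. This is a minimal illustration in Python, assuming binary-branching trees and a single combination rule (function application). The lexicon, the toy model and the function name `meaning` are all hypothetical.

```python
# Toy model: the individuals that sleep, and who likes whom (illustrative data).
sleepers = {"mary"}
likes = {("john", "mary")}

# Lexicon: each word is paired with its meaning.
# Names denote individuals; verbs denote functions built from the model.
lexicon = {
    "John": "john",
    "Mary": "mary",
    "sleeps": lambda subj: subj in sleepers,
    "likes": lambda obj: lambda subj: (subj, obj) in likes,
}


def meaning(tree):
    """Compute the meaning of a binary-branching tree bottom-up.

    A leaf is a word looked up in the lexicon; a branch combines the
    meanings of its two daughters by function application.
    """
    if isinstance(tree, str):
        return lexicon[tree]
    left, right = (meaning(child) for child in tree)
    # Apply whichever daughter denotes a function to the other one.
    return left(right) if callable(left) else right(left)


print(meaning(("John", ("likes", "Mary"))))  # "John likes Mary"
print(meaning(("Mary", "sleeps")))           # "Mary sleeps"
print(meaning(("John", "sleeps")))           # "John sleeps"
```

The meaning of each sentence is computed only from the lexical meanings of its words and the way they are combined, so a sentence never seen before, such as “Mary likes John”, gets a meaning without any new rule.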

Compositional Aspects

Compositional semantics in Language Technology

First-Order Predicate Logic

“flexible, well-understood, and computationally tractable approach to the representation of knowledge [and] meaning” (Jurafsky and Martin 2009: 589)

expressive

verifiability against a knowledge base (related to database languages)

inference

model-theoretic semantics
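
Two of the properties listed above, verifiability against a knowledge base and inference, can be sketched together. The following Python sketch assumes a deliberately restricted fragment: ground facts with one argument and rules of the form "for all x, P(x) implies Q(x)". The facts, the rules and the function name `forward_chain` are illustrative assumptions, not a general FOL prover.

```python
# Knowledge base of ground facts, written as (predicate, argument) pairs.
facts = {("Snake", "cobra"), ("Reptile", "iguana")}

# Rules of the form: for all x, Premise(x) -> Conclusion(x).
rules = [("Snake", "Reptile"), ("Reptile", "Animal")]


def forward_chain(facts: set, rules: list) -> set:
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, individual in list(derived):
                new_fact = (conclusion, individual)
                if predicate == premise and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


kb = forward_chain(facts, rules)

# Verifiability: check a candidate statement against the knowledge base.
print(("Animal", "cobra") in kb)
print(("Snake", "iguana") in kb)
```

The first check succeeds only because of inference: `("Animal", "cobra")` is not stored as a fact but follows from the two rules.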

Cannot cover: multi-word expressions

Multi-Word Expressions, what are they?

MWEs (a.k.a. multiword units or MUs) are lexical units encompassing a wide range of linguistic phenomena, such as idioms (e.g. kick the bucket = to die), collocations (e.g. cream tea = a small meal eaten in Britain, with small cakes and tea), regular compounds (e.g. cosmetic surgery), graphically unstable compounds (e.g. self-contained <> self contained <> selfcontained; all graphical variants have a huge number of hits on Google), light verbs (e.g. do a revision vs. revise), lexical bundles (e.g. in my opinion), etc. While MWEs are easily mastered by native speakers, their correct interpretation remains challenging both for non-native speakers and for language technology (LT), due to their complex and often unpredictable nature.
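
Graphically unstable compounds are one of the few MWE types that a simple pattern can partly handle. The sketch below, in Python, assumes the compound is supplied in its hyphenated form and builds a regular expression that also accepts the spaced and solid spellings. The function name `variant_pattern` and the sample text are hypothetical, and idioms or light verbs would need very different treatment.

```python
import re


def variant_pattern(compound: str):
    """Build a regex matching the hyphenated, spaced and solid spellings.

    `compound` is given in its hyphenated form, e.g. "self-contained".
    """
    parts = [re.escape(part) for part in compound.split("-")]
    # Between the parts allow a hyphen, a whitespace character, or nothing.
    return re.compile(r"\b" + r"[-\s]?".join(parts) + r"\b", re.IGNORECASE)


pattern = variant_pattern("self-contained")
text = "A self-contained unit, a self contained flat and a selfcontained module."
matches = pattern.findall(text)
print(matches)
```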

Conclusions (i)

Logic-based semantics is a theoretical foundation for NLP semantics, but implemented systems are typically more coarse-grained and of a more limited scope.

Meaning depends both on literal content and contextual information. This is a challenge for most NLP tasks.

Most NLP applications have to be highly sensitive to semantics.

Conclusions (ii)

Finding and interpreting names and other referential expressions is a central issue for NLP semantics.

Disambiguation of polysemous lexical tokens is also a central issue for NLP semantics.

Accessing the content of lexical tokens is also useful.

Meaning representation involves semantic role labelling and predicate-argument structure, which captures a basic aspect of NL compositionality.

Read the chapters recommended on the next slide.

Start thinking about an essay topic.

Tell me your thoughts next time…

Suggested Readings

Jurafsky and Martin (2009):

Ch. 17 “Representation of Meaning”

Ch. 18 “Computational Semantics”

Clark et al. (2010):

Ch. 15 “Computational Semantics”

Indurkhya and Damerau (2010):

Ch. 5 “Semantic Analysis”

This is the end… Thanks for your attention!
