
Shallow Processing: Recap
Domain Adapt

Shallow Processing Techniques for NLP
Ling570

Day 21 - December 6, 2012


Roadmap

• MT & Domain Adaptation
• Looking back:
  – Topics covered
  – Tools and Data
• Looking forward:
  – Upcoming courses


DOMAIN ADAPTATION

Slides 4-19 adapted from Barry Haddow’s slides


The Battle of Word Senses and Domains

• Most words have multiple senses
• Cross-lingual mapping is difficult across contexts
• Senses are often “domain” specific

Fun with Tables and Chairs (into German):
  – Table
    • Tisch (general usage)
    • Tabelle (tech usage)
  – Chair
    • Stuhl (general usage)
    • Vorsitzende (governmental usage)


The Battle of Word Senses and Domains: Contexts

• Table
  The food is on the table.
  Das Essen ist auf dem Tisch.
  The results are in the table.
  Die Ergebnisse sind in der Tabelle.

• Chair
  He sat on the chair.
  Er saß auf dem Stuhl.
  He is chair of the committee.
  Er ist Vorsitzender des Ausschusses.


Domain?

• Not a well-defined concept
• Should be based on some notion of textual similarity
  – Lexical choice
  – Grammar
• What level of granularity?
  – News → Sports → Football


Examples of Domains

• Europarl (ep)
  – European parliamentary proceedings
• News-commentary (nc)
  – Analysis of current affairs
• Subtitles (st)
  – Film subtitles


Europarl

Resumption of the session

I declare resumed the session of the European Parliament adjourned on Friday 17 December 1999, and I would like once again to wish you a happy new year in the hope that you enjoyed a pleasant festive period.

Although, as you will have seen, the dreaded 'millennium bug' failed to materialise, still the people in a number of countries suffered a series of natural disasters that truly were dreadful.

You have requested a debate on this subject in the course of the next few days, during this part-session.


News Commentary

Musharraf's Last Act?

Desperate to hold onto power, Pervez Musharraf has discarded Pakistan's constitutional framework and declared a state of emergency.

His goal?

To stifle the independent judiciary and free media.

Artfully, though shamelessly, he has tried to sell this action as an effort to bring about stability and help fight the war on terror more effectively.


Subtitles

I'll call in 30 minutes to check

Oh, hello Fujio

Is your mother here, too?

Why are you outside?

It's no fun listening to women's talk

Well, why don't we go in together


Translation Performance

• Measure effect of domain on performance
• Train two systems
  – One using in-domain data (nc or st)
  – One using out-of-domain data (ep)
• Test on in-domain data (nc or st)
• Only vary the translation model


Test on nc


Test on st


NC vs. EP: Example

• Source:
  – Veamos el mercado accionario de los Estados Unidos, el mayor del mundo por mucho.
• Translation – ep:
  – Consider the US accionario market, the world's largest by much.
• Translation – nc:
  – Consider the US stock market, the largest by far.


Translations of “stock market”


Domain Adaptation Techniques

• Data selection
  – Filtering
• Data weighting
  – Corpus weighting
  – Model interpolation (sketched below)
  – Topic-based clustering and weighting
  – Phrase or sentence weighting
• Enlarging the data set
  – Web crawling
  – Extracting parallel data from monolingual corpora
  – Self-training
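As a rough illustration of the model-interpolation option above, the sketch below linearly mixes a small in-domain unigram model with an out-of-domain one. The toy sentences, function names, and mixing weight are illustrative assumptions, not settings from the slides.

```python
# Minimal sketch of linear LM interpolation for domain adaptation:
# P(w) = lam * P_in(w) + (1 - lam) * P_out(w).
from collections import Counter

def unigram_model(tokens):
    """Relative-frequency unigram probabilities for a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def interpolate(p_in, p_out, lam=0.7):
    """Mix an in-domain and an out-of-domain distribution."""
    vocab = set(p_in) | set(p_out)
    return {w: lam * p_in.get(w, 0.0) + (1 - lam) * p_out.get(w, 0.0)
            for w in vocab}

# Toy data; in practice lam would be tuned on held-out in-domain text.
p_in = unigram_model("the results are in the table".split())
p_out = unigram_model("the food is on the table".split())
print(interpolate(p_in, p_out)["table"])
```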


Data Selection

• Suppose we have:
  – A small in-domain corpus I
  – A large out-of-domain corpus O
• Select data from O which is similar to I
  – Equivalent to weighting the sentences with a 1-0 weighting
  – Would be better not to discard data...


Data Selection for Translation Models

• Modified Moore-Lewis (Axelrod et al., EMNLP 2011; Axelrod et al., IWSLT 2012)

• Moore-Lewis scores each sentence by its cross-entropy (perplexity) difference under LMs trained on I and O

• For the TM, apply this to both the source and target sentences

• Selects the sentences most like I and least like O
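A minimal sketch of the (modified) Moore-Lewis score just described: the cross-entropy difference of a sentence under in-domain and out-of-domain LMs, summed over source and target sides for translation-model selection. The add-alpha unigram LMs and toy sentences below are stand-ins for real n-gram models; all names are illustrative assumptions.

```python
# Sketch of Moore-Lewis / modified Moore-Lewis data selection.
import math
from collections import Counter

def unigram_lm(tokens, alpha=1.0):
    """Add-alpha smoothed unigram LM (stand-in for a real n-gram LM)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1          # reserve mass for unseen words
    return lambda w: (counts[w] + alpha) / (total + alpha * vocab)

def cross_entropy(sentence, lm):
    """Per-word cross-entropy (bits) of a whitespace-tokenized sentence."""
    toks = sentence.split()
    return -sum(math.log2(lm(w)) for w in toks) / len(toks)

def moore_lewis(sentence, lm_in, lm_out):
    """H_I(s) - H_O(s): lower = more like I, less like O."""
    return cross_entropy(sentence, lm_in) - cross_entropy(sentence, lm_out)

def modified_moore_lewis(src, tgt, lm_in_src, lm_out_src, lm_in_tgt, lm_out_tgt):
    """Bilingual variant for TM selection: sum over source and target sides."""
    return (moore_lewis(src, lm_in_src, lm_out_src)
            + moore_lewis(tgt, lm_in_tgt, lm_out_tgt))

# Toy illustration: rank sentences from O and keep the lowest-scoring fraction.
lm_in = unigram_lm("the stock market fell sharply today".split())
lm_out = unigram_lm("i declare resumed the session of the european parliament".split())
print(moore_lewis("the market fell", lm_in, lm_out))
```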


Does it work?

Axelrod et al. (2011); see also Axelrod et al. (2012) for comparisons between domain-adapted systems trained on selected data (~10%) and their full-data (100%) counterparts


COURSE RECAP


Unit #0

• Unit #0 (0.5 weeks):
  – HW #1
  – Introduction to NLP & shallow processing
  – Tokenization


Unit #1

• Unit #1 (0.5 weeks):
  – Formal Languages and Automata (1 week)
    • Formal languages
    • Finite-state Automata (sketched below)
    • Finite-state Transducers
    • Morphological analysis
  – Transition from PFAs to Markov Models (MMs)
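As a small reminder of the finite-state machinery in this unit, here is a toy DFA encoded as a transition table. The language it accepts (strings over {a, b} ending in "ab") and the state names are illustrative assumptions, not examples from the course.

```python
# Toy DFA: accepts strings over {a, b} that end in "ab".
# States: q0 = otherwise, q1 = last symbol was 'a', q2 = last two were "ab".
TRANSITIONS = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q1", ("q2", "b"): "q0",
}
START, FINAL = "q0", {"q2"}

def accepts(s):
    """Run the DFA over s; reject on any symbol with no defined transition."""
    state = START
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in FINAL

print(accepts("aab"), accepts("abb"))   # True False
```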


Unit #2

• Unit #2 (2 weeks):
  – HW #2, #3
    • HW #2 – Building Markov models for English
    • HW #3 – Building a Korean POS HMM tagger
  – Markov Chains and HMMs
    • Building and applying Markov models (sketched below)
    • Part-of-speech (POS) tagging:
      – N-gram
      – Hidden Markov Models
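A minimal sketch of the kind of counting behind HW #2 and #3: relative-frequency transition and emission estimates from a tagged corpus. The toy sentences and tag set are illustrative assumptions, not the course data; a real tagger would also add smoothing and Viterbi decoding.

```python
# Sketch of relative-frequency HMM estimation from a tagged corpus.
from collections import defaultdict, Counter

def train_hmm(tagged_sentences):
    """Estimate transition P(t_i | t_{i-1}) and emission P(w | t)
    from sentences given as lists of (word, tag) pairs."""
    trans = defaultdict(Counter)   # previous tag -> next-tag counts
    emit = defaultdict(Counter)    # tag -> word counts
    for sent in tagged_sentences:
        prev = "<s>"               # sentence-start pseudo-tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    def normalize(table):
        return {k: {x: c / sum(row.values()) for x, c in row.items()}
                for k, row in table.items()}
    return normalize(trans), normalize(emit)

# Toy tagged corpus (illustrative only):
toy = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
       [("the", "DT"), ("dog", "NN"), ("sleeps", "VBZ")]]
transition, emission = train_hmm(toy)
print(transition["DT"]["NN"], emission["NN"]["dog"])   # 1.0 1.0
```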


Unit #3

• Unit #3: Intro to Classification (1.5 weeks)
  – Project #1 – News/blog classification
  – Classification & Machine Learning
    • Intro to classification & Mallet
    • Intro to feature engineering
    • Document classification with classifiers


Unit #4

• Unit #4: Language Models & Smoothing (1.5 weeks)
  – HW #4 – Building LMs and applying smoothing
  – HW #5 – Building LMs, applying KL divergence to compare models
  – Intro to Language Models
  – Intro to smoothing techniques
    • Laplace (add-1) (sketched below)
    • Good-Turing
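A minimal sketch of Laplace (add-1) smoothing for a unigram model, the first of the two techniques above; the toy sentence and vocabulary size are illustrative assumptions (Good-Turing is not shown).

```python
# Sketch of add-1 (Laplace) smoothing: P(w) = (c(w) + 1) / (N + V).
from collections import Counter

def laplace_unigram(tokens, vocab_size):
    """Return an add-1 smoothed unigram probability function."""
    counts = Counter(tokens)
    total = len(tokens)
    def prob(word):
        return (counts[word] + 1) / (total + vocab_size)
    return prob

# Toy example (HW #4 used Portuguese data; these values are illustrative):
p = laplace_unigram("o gato está na mesa".split(), vocab_size=10000)
print(p("gato"), p("cão"))   # seen and unseen words both get nonzero mass
```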


Unit #5

• Unit #5: QA & IR (1 week)
  – Introduction to QA & IR
    • Applying NLP methods to QA & IR
    • Reviewing data “pipelines” for NLP and related tasks


Unit #6

• Unit #6: Discriminative sequence modeling (1.5 weeks)
  – Project #2 – Applying discriminative models to POS tagging
  – POS tagging with classifiers
  – Chunking
  – Named Entity (NE) recognition


Unit #7

• Unit #7: Misc topics in Stat NLP (2 weeks)
  – Introduction to IE
  – Application of IE to “linguistics”
  – Introduction to MT
    • NLP models and techniques as applied to MT
    • Word Alignment
    • Intro to EM algorithm
    • Domain Adaptation as applied to MT


TOOLS & DATA


Tools Developed

• English tokenizer: HW #1
• Markov Models from corpora: HW #2
  – Building a transition matrix
  – Building an emission matrix
• Korean POS tagger (using an HMM): HW #3
  – Apply HW #2 to Korean data
  – Simple smoothing
• Text classifier: Project #1
  – Classifier of blog/news data, right vs. left
• Language Modeler: HW #4
  – Tool to build and smooth an LM
  – Applied to Portuguese data
• Tools for calculating Entropy and KL Divergence: HW #5 (sketched below)
  – Building and smoothing multilingual LMs
  – How to compare LMs and distributions
• Discriminative POS Tagger: Project #2
  – Korean POS Tagger, part 2
  – ML applied to Sequence Labeling problems
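For the HW #5 entry above, a minimal sketch of the entropy and KL-divergence calculations used to compare LMs and distributions; the toy distributions are illustrative, not built from the course corpora.

```python
# Sketch of entropy and KL divergence over discrete distributions (in bits).
import math

def entropy(p):
    """H(P) = -sum_x P(x) * log2 P(x)."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)

def kl_divergence(p, q):
    """D(P || Q) = sum_x P(x) * log2(P(x) / Q(x)).
    Q must be smoothed so that q[x] > 0 wherever p[x] > 0."""
    return sum(px * math.log2(px / q[x]) for x, px in p.items() if px > 0)

# Toy distributions standing in for two smoothed LMs:
p = {"a": 0.5, "b": 0.3, "c": 0.2}
q = {"a": 0.4, "b": 0.4, "c": 0.2}
print(entropy(p), kl_divergence(p, q))
```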


Corpora & Systems

• Data:
  – Penn Treebank
    • Wall Street Journal
    • Air Travel Information System (ATIS)
  – Korean Treebank
  – Portuguese Newswire Text Corpus
  – LM training files from Cavnar & Trenkle (multiple languages)
  – Online blogs from various media sites
• Systems:
  – Mallet Machine Learning Package
  – Porter Stemmer


LOOKING FORWARD


Winter Courses

• Ling 571: Deep Processing Techniques for NLP
  – Parsing, Semantics (Lambda Calculus), Generation
• Ling 572: Advanced Statistical Methods in NLP
  – Roughly, machine learning for CompLing
  – Decision Trees, Naïve Bayes, MaxEnt, SVM, CRF, …
• Ling 567: Knowledge Engineering for Deep NLP
  – HPSG and MRS for novel languages
• Ling 575:
  – (Xia) Domain Adaptation
    • Dealing with system degradation when training and test data are from different domains
  – (Tjalve) Speech Technologies
  – (Bender) Semantic Representations
  – (Levow) Spoken Dialog Systems (?), in Spring


Tentative Outline for Ling 572

• Unit #0 (1 week): Basics
  – Introduction
  – Feature representations
  – Classification review
• Unit #1 (2.5 weeks): Classic Machine Learning
  – K Nearest Neighbors
  – Decision Trees
  – Naïve Bayes


• Unit #3 (4 weeks): Discriminative Classifiers
  – Feature Selection
  – Maximum Entropy Models
  – Support Vector Machines
• Unit #4 (1.5 weeks): Sequence Learning
  – Conditional Random Fields
  – Transformation-Based Learning
• Unit #5 (1 week): Other Topics
  – Semi-supervised learning, …


Ling 572 Information

• No required textbook:
  – Online readings and articles
• More math/stat content than 570
  – Probability, Information Theory, Optimization
• Please try to register at least 2 weeks in advance


Beyond Ling 572

• Machine learning:
  – Graphical models
  – Bayesian approaches
  – Online learning
  – Reinforcement learning
  – …
• Applications:
  – Information Retrieval
  – Question Answering
  – Generation
  – Machine Translation
  – …


Ling 575: Domain Adaptation

• Handling system degradation when training and test data are from different domains

• Focus on improving performance of POS taggers and parsers

• Time: Thurs 3:30-5:50pm


Notes

• Grades will be submitted by 12/18
  – Any issues (errors) with grades in Gradebook: please email us by 12/15