Thomas Muller, Hinrich Schutze and Helmut Schmid ACL June 3-8, 2012 Reporter:Sitong Yang


A Comparative Investigation of Morphological Language Modeling for the Languages of the European Union

Thomas Muller, Hinrich Schutze and Helmut Schmid

ACL, June 3-8, 2012

Reporter: Sitong Yang

ICT


Outline

• Introduction
• Modeling of morphology and shape
• Experimental Setup
• Results and Discussion
• Conclusion


Introduction

• Motivation

• Main idea


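Motivation

• Language model: a frequent history such as "potentially" has been observed with many continuations (large, dangerous, serious).
• A rare history such as "hypothetically" should allow the same continuations (large, dangerous, serious), but they are rarely or never observed after it.
• How can the statistics be transferred from the frequent history to the rare history?
• Answer: morphology.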
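The transfer idea can be illustrated with a toy sketch: if histories are mapped to a class, continuation counts seen after the frequent history become available to the rare one. The `word_class` function (all words ending in "ly" share a class) and the bigram counts are made up for illustration and are not taken from the talk.

```python
from collections import Counter, defaultdict


def word_class(word):
    # Hypothetical class function: all words ending in "ly" share one class.
    return "-ly" if word.endswith("ly") else word


def pool_by_class(bigrams):
    """Pool continuation counts over the class of the history word."""
    pooled = defaultdict(Counter)
    for history, continuation in bigrams:
        pooled[word_class(history)][continuation] += 1
    return pooled


# Toy data: the frequent history is seen three times, the rare one once.
bigrams = [
    ("potentially", "large"),
    ("potentially", "dangerous"),
    ("potentially", "serious"),
    ("hypothetically", "large"),
]
pooled = pool_by_class(bigrams)

# "serious" never follows "hypothetically" in the data,
# but the shared class has a non-zero count for it.
print(pooled[word_class("hypothetically")]["serious"])
```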


Main idea

• Goal
  • Perplexity reduction (PD) for a large number of languages
• Features
  • Morphology
  • Shape features
• Parameters
  • Frequency threshold θ
  • Number of suffixes used φ
  • Morphological segmentation algorithm
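A minimal sketch of how a frequency threshold θ might partition the vocabulary, assuming that only words below the threshold are clustered and frequent words are left alone. The function name and the use of `>=` rather than `>` are assumptions, not details from the slides.

```python
from collections import Counter


def split_by_frequency(tokens, theta):
    """Return (frequent, rare) word sets for a frequency threshold theta."""
    counts = Counter(tokens)
    # Assumption: words with count >= theta are treated as frequent.
    frequent = {w for w, c in counts.items() if c >= theta}
    rare = set(counts) - frequent
    return frequent, rare


tokens = ["the"] * 5 + ["union"] * 2 + ["hypothetically"]
frequent, rare = split_by_frequency(tokens, theta=3)
```

With a larger θ more words fall into the rare set, so more of the vocabulary is handled by classes.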


Modeling of morphology and shape

• Morphology

• Shape features

• Similarity measure


Morphology

• Automatic suffix identification algorithms: Reports, Morfessor and Frequency

• Parameter φ: only the φ most frequent suffixes are used
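One plausible reading of a frequency-based suffix selection can be sketched as follows: count word-final character sequences over the vocabulary and keep the φ most frequent. The maximum suffix length, counting over word types, and the longest-match rule in `suffix_of` are assumptions made for this sketch, not details from the slides.

```python
from collections import Counter


def top_suffixes(vocabulary, phi, max_len=4):
    """Return the phi most frequent word-final character sequences."""
    counts = Counter()
    for word in vocabulary:
        # Leave at least one character as the stem.
        for k in range(1, min(max_len, len(word) - 1) + 1):
            counts[word[-k:]] += 1
    return [suffix for suffix, _ in counts.most_common(phi)]


def suffix_of(word, suffixes):
    """Return the longest selected suffix that the word ends in, or ''."""
    matches = [s for s in suffixes if len(s) < len(word) and word.endswith(s)]
    return max(matches, key=len, default="")


vocabulary = ["walking", "talking", "singing", "walked", "talked"]
selected = top_suffixes(vocabulary, phi=3)
```

A raw frequency count also picks up single letters such as "g", which is one reason a dedicated segmentation algorithm might be expected to do better.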


Shape features

• Capitalization
• Special characters
• Word length
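The three feature groups can be sketched as a small feature extractor. The exact feature definitions are in the prior work cited in the talk; the feature names and the use of `string.punctuation` as the set of special characters are assumptions for this sketch.

```python
import string


def shape_features(word):
    """Extract simple shape features from a word form."""
    return {
        # Capitalization
        "is_capitalized": word[:1].isupper(),
        "is_all_caps": word.isupper(),
        # Special characters (assumed here: digits and ASCII punctuation)
        "has_digit": any(c.isdigit() for c in word),
        "has_special": any(c in string.punctuation for c in word),
        # Word length
        "length": len(word),
    }


features = shape_features("EU-27")
```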


Similarity measure

• The similarity measure and the details of the shape features are described in prior work (Müller and Schütze, 2011).


Experimental Setup

• Baseline

• Morphological class language model

• Distributional class language model

• Corpus


Experimental Setup

• Experiments
  • SRILM: Kneser-Ney (KN) and generic class model implementations, optimal interpolation parameters
• Baseline
  • Modified KN model


Morphological class language model

Class-based language model:

Word emission probability:
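For reference, the standard class-based n-gram factorization, which the two labels above presumably refer to, can be written as below. The class function g, the count function c and the maximum-likelihood form of the emission estimate are assumptions taken from the textbook formulation, not from the slide; the authors' emission probability may treat unknown words differently.

```latex
\[
P_C(w_i \mid w_{i-N+1}^{i-1})
  = P\bigl(g(w_i) \mid g(w_{i-N+1}) \dots g(w_{i-1})\bigr)
    \cdot P\bigl(w_i \mid g(w_i)\bigr)
\]
\[
P\bigl(w \mid g(w)\bigr)
  = \frac{c(w)}{\sum_{w' :\, g(w') = g(w)} c(w')}
\]
```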


Morphological class language model

The final model P_M interpolates P_C with a modified KN model:

Unknown word estimation:
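The interpolation step can be sketched as a linear mixture of the two probabilities. A single scalar weight `lam` is a simplification assumed here (the actual weights may depend on the history), and the grid search in `best_lambda` is only a stand-in for whatever estimation procedure was used to obtain optimal interpolation parameters.

```python
import math


def interpolate(p_class, p_kn, lam):
    """Linear interpolation of a class-model and a KN probability."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_class + (1.0 - lam) * p_kn


def best_lambda(pairs, grid):
    """Pick the weight from `grid` that maximizes validation log-likelihood.

    `pairs` holds one (p_class, p_kn) tuple per validation token.
    """
    def log_likelihood(lam):
        return sum(math.log(interpolate(pc, pk, lam)) for pc, pk in pairs)

    return max(grid, key=log_likelihood)
```

For example, `interpolate(0.2, 0.1, 0.5)` mixes the two estimates equally, and `best_lambda` would be run on the validation set before evaluating on the test set.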


Morphological class language model

Modified class model P_C'


Distributional class language model

• P_D has the same form as P_M

• The difference is that the classes are morphological for P_M and distributional for P_D

• Based on a whole-context distributional vector space model


Corpus

• Training set (80%)
• Validation set (10%)
• Test set (10%)
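The 80/10/10 split can be sketched as below. Splitting contiguously by sentence without shuffling is an assumption; the slides do not say how the portions were selected.

```python
def split_corpus(sentences, train_pct=80, valid_pct=10):
    """Split a list of sentences into training, validation and test parts."""
    n = len(sentences)
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    n_train = n * train_pct // 100
    n_valid = n * valid_pct // 100
    train = sentences[:n_train]
    valid = sentences[n_train:n_train + n_valid]
    test = sentences[n_train + n_valid:]
    return train, valid, test


train, valid, test = split_corpus([f"sentence {i}" for i in range(100)])
```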


Results and Discussion

• Morphological model vs. Distributional model

• Sensitivity analysis of parameters


Morphological model vs. Distributional model

• MM: the more morphologically complex the language, the larger the perplexity reduction and the larger φ.

• MM: considerable perplexity reductions of 3%-11%.

• The Frequency algorithm performs surprisingly well.

• In only 4 cases does DM perform better than MM.

• DM: clustering is restricted to less frequent words.
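For readers who want to see how a figure such as "3%-11%" is computed, here is a minimal sketch of perplexity and relative perplexity reduction. The numbers in the usage lines are made up and are not results from the talk.

```python
import math


def perplexity(token_probs):
    """Perplexity of a test set given the model probability of each token."""
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(-log_sum / len(token_probs))


def perplexity_reduction(ppl_baseline, ppl_model):
    """Relative reduction of the model's perplexity over the baseline."""
    return (ppl_baseline - ppl_model) / ppl_baseline


# Illustrative values only.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])
reduction = perplexity_reduction(200.0, 180.0)
```

A model that assigns every token probability 0.25 has perplexity 4, and going from a baseline perplexity of 200 to 180 is a 10% reduction.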


Sensitivity analysis of parameters

• For each parameter: the best and worst values, and the difference in perplexity improvement between the two.
• θ
  • Strong influence on perplexity reduction (PD)
  • Positively correlated with morphological complexity
• φ and segmentation algorithms
  • Negligible effect
  • Frequency performs best
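The best-versus-worst comparison can be sketched as follows. How the other parameters are handled when one parameter is varied is not stated on the slide; this sketch assumes each parameter value is scored by the best reduction reachable over the remaining parameters. The result grid in the usage lines is invented.

```python
def sensitivity(results, axis):
    """Compare the best and worst value of one parameter.

    `results` maps (theta, phi, algorithm) tuples to perplexity reductions;
    `axis` selects which tuple position to analyze.
    Returns (best_value, worst_value, difference).
    """
    best_per_value = {}
    for params, reduction in results.items():
        value = params[axis]
        previous = best_per_value.get(value, float("-inf"))
        best_per_value[value] = max(previous, reduction)
    best = max(best_per_value, key=best_per_value.get)
    worst = min(best_per_value, key=best_per_value.get)
    return best, worst, best_per_value[best] - best_per_value[worst]


# Invented grid: theta matters a lot, phi hardly at all.
results = {
    (100, 50, "frequency"): 0.040,
    (100, 100, "frequency"): 0.045,
    (1000, 50, "frequency"): 0.080,
    (1000, 100, "frequency"): 0.082,
}
theta_best, theta_worst, theta_gap = sensitivity(results, axis=0)
phi_best, phi_worst, phi_gap = sensitivity(results, axis=1)
```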


Conclusion

• Features: morphology and shape features
• Result: perplexity reductions of 3%-11%
• Parameters
  • θ: considerable influence
  • φ and segmentation algorithms: small effect


Future Work

• A model that interpolates KN, the morphological class model and the distributional class model.
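Such a combined model could look like the three-way mixture sketched below. This is only an illustration of the proposal: the function name and the constraint that the weights are non-negative and sum to one are assumptions.

```python
def interpolate3(p_kn, p_morph, p_dist, weights):
    """Mix KN, morphological-class and distributional-class probabilities."""
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    w_kn, w_morph, w_dist = weights
    return w_kn * p_kn + w_morph * p_morph + w_dist * p_dist


# Illustrative probabilities and weights only.
p = interpolate3(0.1, 0.2, 0.3, weights=(0.5, 0.25, 0.25))
```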


My thoughts

• Language modeling for minority languages


Q&A?


Thank you!
