Language Model Methods and Metrics
![Page 1: Language Model Methods and Metrics](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d65550346895d9275a6/html5/thumbnails/1.jpg)
Language Model Methods and Metrics
Gary Luu, Ryan Fortune
![Page 2: Language Model Methods and Metrics](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d65550346895d9275a6/html5/thumbnails/2.jpg)
Skip N-grams
• Interpolated with the bigram model
• Captures the influence of words further away without increasing dimensionality
• Learning curve
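The interpolation on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `SkipBigramLM`, the weight `lam`, and the add-alpha smoothing are all assumptions introduced here. The skip model conditions on the word two positions back, so it adds longer-range context while each component still conditions on a single word.

```python
from collections import Counter


class SkipBigramLM:
    """Bigram P(w | w[i-1]) interpolated with skip-bigram P(w | w[i-2]).

    Hypothetical sketch: add-alpha smoothing and the weight `lam` are
    assumptions, not values taken from the slides.
    """

    def __init__(self, sentences, lam=0.7, alpha=1.0):
        self.lam, self.alpha = lam, alpha
        self.bi, self.bi_ctx = Counter(), Counter()
        self.skip, self.skip_ctx = Counter(), Counter()
        self.vocab = set()
        for sent in sentences:
            toks = ["<s>", "<s>"] + sent.split() + ["</s>"]
            self.vocab.update(toks[2:])
            for i in range(2, len(toks)):
                # Adjacent context: the previous word.
                self.bi[(toks[i - 1], toks[i])] += 1
                self.bi_ctx[toks[i - 1]] += 1
                # Skip context: the word two positions back.
                self.skip[(toks[i - 2], toks[i])] += 1
                self.skip_ctx[toks[i - 2]] += 1

    def _smoothed(self, pair_counts, ctx_counts, ctx, w):
        v = len(self.vocab)
        return (pair_counts[(ctx, w)] + self.alpha) / (ctx_counts[ctx] + self.alpha * v)

    def prob(self, w, w_prev2, w_prev1):
        p_bi = self._smoothed(self.bi, self.bi_ctx, w_prev1, w)
        p_skip = self._smoothed(self.skip, self.skip_ctx, w_prev2, w)
        return self.lam * p_bi + (1 - self.lam) * p_skip


lm = SkipBigramLM(["the cat sat", "the dog sat", "a cat ran"])
p = lm.prob("sat", "the", "cat")  # P(sat | the, cat) under the interpolated model
```

In practice `lam` would be tuned on held-out data rather than fixed by hand.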
![Page 3: Language Model Methods and Metrics](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d65550346895d9275a6/html5/thumbnails/3.jpg)
Skip N-gram Learning Curve
![Page 4: Language Model Methods and Metrics](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d65550346895d9275a6/html5/thumbnails/4.jpg)
Content Word Language Model
• Help predict the next word using the last uncommon (content) word, to try to capture context
• Found a list of the 250 most common words
• Tried different sizes for the common-word list
• Interpolated with other language models, since this model alone wouldn’t maintain grammar
• P(w|C), where C is the last uncommon word
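A rough sketch of P(w|C) is shown below, assuming C is the most recent word in the history that is not on the common-word list. The names `ContentWordLM` and `interpolate`, the `"<none>"` placeholder, the add-alpha smoothing, and the weight `lam` are hypothetical choices made for illustration only.

```python
from collections import Counter


class ContentWordLM:
    """Estimate P(w | C), with C the last uncommon word seen so far.

    Hypothetical sketch: smoothing and the "<none>" placeholder are assumptions.
    """

    def __init__(self, sentences, common_words, alpha=1.0):
        self.common = set(common_words)
        self.alpha = alpha
        self.pair, self.ctx = Counter(), Counter()
        self.vocab = set()
        for sent in sentences:
            last_content = "<none>"  # no content word seen yet
            for w in sent.split():
                self.vocab.add(w)
                self.pair[(last_content, w)] += 1
                self.ctx[last_content] += 1
                if w not in self.common:
                    last_content = w

    def last_content_word(self, history):
        for w in reversed(history):
            if w not in self.common:
                return w
        return "<none>"

    def prob(self, w, history):
        c = self.last_content_word(history)
        v = len(self.vocab)
        return (self.pair[(c, w)] + self.alpha) / (self.ctx[c] + self.alpha * v)


def interpolate(p_content, p_base, lam=0.3):
    """Mix the content-word estimate with a grammar-aware base model estimate."""
    return lam * p_content + (1 - lam) * p_base


lm = ContentWordLM(
    ["the bank of the river", "the bank raised the rate"],
    common_words={"the", "of"},
)
p = interpolate(lm.prob("river", ["the", "bank", "of", "the"]), p_base=0.05)
```

The size of `common_words` is the knob the slide refers to; `p_base` here is a placeholder value standing in for an n-gram model's probability.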
![Page 5: Language Model Methods and Metrics](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d65550346895d9275a6/html5/thumbnails/5.jpg)
Content Word Model
![Page 6: Language Model Methods and Metrics](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d65550346895d9275a6/html5/thumbnails/6.jpg)
Bag Generation Metrics
• Bag generation is NP-hard
• Random-restart greedy hill-climbing
• Stability metric
  • Given the correct sentence, does the model maintain it as an optimum?
  • Reported as the percentage of sentences that remain stable
• Reconstruction metric
  • Needs to be compared against a lucky/random baseline
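The search and the two metrics can be sketched as below. Several details are assumptions made for illustration: the neighborhood is all pairwise word swaps, "stable" is read as "no swap neighbor scores strictly higher", and `score` is any function mapping a word tuple to a number (for example a language model log-probability). None of the function names come from the slides.

```python
import random
from itertools import combinations


def swap_neighbors(order):
    """Yield every ordering reachable by swapping two positions."""
    for i, j in combinations(range(len(order)), 2):
        nxt = list(order)
        nxt[i], nxt[j] = nxt[j], nxt[i]
        yield tuple(nxt)


def hill_climb(start, score):
    """Greedy ascent: move to the best strictly better neighbor until none exists."""
    current = tuple(start)
    current_score = score(current)
    while True:
        best, best_score = current, current_score
        for cand in swap_neighbors(current):
            s = score(cand)
            if s > best_score:
                best, best_score = cand, s
        if best == current:
            return current, current_score
        current, current_score = best, best_score


def random_restart(bag, score, restarts=10, seed=0):
    """Run hill climbing from several shuffled starts and keep the best result."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(restarts):
        start = list(bag)
        rng.shuffle(start)
        cand, s = hill_climb(start, score)
        if s > best_score:
            best, best_score = cand, s
    return best, best_score


def is_stable(sentence, score):
    """True if no single swap improves on the given (correct) ordering."""
    sent = tuple(sentence)
    base = score(sent)
    return all(score(n) <= base for n in swap_neighbors(sent))


def stability(sentences, score):
    """Fraction of correct sentences that are local optima under `score`."""
    return sum(is_stable(s, score) for s in sentences) / len(sentences)


def reconstruction_rate(sentences, score, restarts=10, seed=0):
    """Fraction of sentences recovered exactly from their shuffled bag of words."""
    hits = sum(
        random_restart(s, score, restarts, seed)[0] == tuple(s) for s in sentences
    )
    return hits / len(sentences)
```

For the random baseline, a bag of n distinct words has n! orderings, so a uniformly random guess reconstructs the sentence with probability 1/n!.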
![Page 7: Language Model Methods and Metrics](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d65550346895d9275a6/html5/thumbnails/7.jpg)
Bag Generation Metrics
![Page 8: Language Model Methods and Metrics](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d65550346895d9275a6/html5/thumbnails/8.jpg)
Clustering - IBMFullPredict
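• Clustering overview
• Perplexity down to 107 with a million-sentence corpus
• $P_{\text{ibmfullpredict}}(w_i \mid w_{i-2} w_{i-1}) = [\lambda P(W_i \mid w_{i-2} w_{i-1}) + (1-\lambda) P(W_i \mid W_{i-2} W_{i-1})] \times [\mu P(w_i \mid w_{i-2} w_{i-1} W_i) + (1-\mu) P(w_i \mid W_{i-2} W_{i-1} W_i)]$, where $W_j$ is the cluster of word $w_j$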
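As a small worked illustration of the formula, the sketch below combines four component probabilities into the IBMFullPredict estimate and computes perplexity from per-word probabilities. The function names, argument names, and the example numbers are hypothetical; how the four components are estimated (counts, smoothing, clustering method) is not specified on the slide and is left out here.

```python
import math


def ibm_full_predict(p_cluster_given_words, p_cluster_given_clusters,
                     p_word_given_words, p_word_given_clusters,
                     lam=0.5, mu=0.5):
    """Combine the four components of the IBMFullPredict formula.

    p_cluster_given_words:    P(W_i | w_{i-2} w_{i-1})
    p_cluster_given_clusters: P(W_i | W_{i-2} W_{i-1})
    p_word_given_words:       P(w_i | w_{i-2} w_{i-1} W_i)
    p_word_given_clusters:    P(w_i | W_{i-2} W_{i-1} W_i)
    """
    # First factor: predict the cluster of the next word.
    cluster_term = lam * p_cluster_given_words + (1 - lam) * p_cluster_given_clusters
    # Second factor: predict the word itself given its cluster.
    word_term = mu * p_word_given_words + (1 - mu) * p_word_given_clusters
    return cluster_term * word_term


def perplexity(word_probs):
    """Perplexity = exp of the negative mean log-probability per word."""
    return math.exp(-sum(math.log(p) for p in word_probs) / len(word_probs))


# Made-up component values, purely to show the arithmetic.
p = ibm_full_predict(0.4, 0.2, 0.5, 0.1, lam=0.5, mu=0.5)
```

With these made-up inputs each bracket evaluates to 0.3, so the product is about 0.09. The weights `lam` and `mu` would normally be tuned on held-out data.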
![Page 9: Language Model Methods and Metrics](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d65550346895d9275a6/html5/thumbnails/9.jpg)
Learning Curve for IBMFullPredict