Learning Bit by Bit Class 4 - Ngrams. Ngrams: Counting words. Using observation to make predictions.
Date posted: 22-Dec-2015
Bigram
• “the storm swept through the land”
• [(the, storm), (storm, swept), (swept, through), (through, the), (the, land)]
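The bigram list above can be reproduced with a few lines of Python. This is a minimal sketch that assumes whitespace tokenization; the helper name `ngrams` is made up for illustration and is not from the slides.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Assumption: words are separated by single spaces, no punctuation handling.
tokens = "the storm swept through the land".split()
bigrams = ngrams(tokens, 2)
print(bigrams)
```

Calling `ngrams(tokens, 3)` on the same token list would give trigrams instead.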
Markov Assumption
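• The assumption that the probability of a word depends only on the previous word, or the previous N words, rather than on everything that came before it
• One-word history: P(“land” | “the”)
• Full history: P(“land” | “the storm swept through the”)
• Under the assumption, the first is used as an approximation of the second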
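In general form, the assumption can be written as below. The notation is an assumed convention rather than something taken from the slides: w_i stands for the i-th word of the text.

```latex
% Previous word only (bigram model)
P(w_n \mid w_1, \dots, w_{n-1}) \approx P(w_n \mid w_{n-1})

% Previous N words
P(w_n \mid w_1, \dots, w_{n-1}) \approx P(w_n \mid w_{n-N}, \dots, w_{n-1})
```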
Maximum Likelihood Estimation
• N-gram probability based on corpus counts
• P(word n | word n-1) = (count of times word n-1 is followed by word n) / (count of all times word n-1 occurs)
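As a rough sketch of this estimate, the snippet below counts word pairs in the short “storm” sentence and divides by the count of the preceding word. The function name `bigram_mle` and the tiny one-sentence corpus are assumptions made for illustration; a real corpus would be far larger.

```python
from collections import Counter

def bigram_mle(tokens):
    """Build p(word, prev), the count-based estimate of P(word | prev)."""
    pair_counts = Counter(zip(tokens, tokens[1:]))  # (prev, word) pairs
    word_counts = Counter(tokens)                   # all occurrences of each word

    def p(word, prev):
        if word_counts[prev] == 0:
            return 0.0  # unseen context: no estimate available, report 0
        return pair_counts[(prev, word)] / word_counts[prev]

    return p

tokens = "the storm swept through the land".split()
p = bigram_mle(tokens)
p_land = p("land", "the")     # "the" occurs twice, once followed by "land"
p_swept = p("swept", "storm") # "storm" occurs once, followed by "swept"
print(p_land, p_swept)
```

Pairs that never occur in the corpus get probability 0 under this estimate, which is one reason larger corpora matter.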
Trigram
• “the quick red fox jumped the quick black bear. The quick red fox hopped away.”
• [(the, quick, red), (quick, red, fox), (red, fox, jumped), (fox, jumped, the), (jumped, the, quick), (the, quick, black), (quick, black, bear), (the, quick, red), (quick, red, fox), (red, fox, hopped), (fox, hopped, away)]
Trigram
• P(“red” | “the quick”) = occurrences of “the quick red” / occurrences of “the quick” = 2 / 3 in the example above
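The same count-and-divide idea can be sketched for the trigram example. The code below assumes lowercasing, splitting sentences on end punctuation, and not forming trigrams across sentence boundaries, which matches the trigram list in the slides; the name `trigram_prob` is hypothetical.

```python
import re
from collections import Counter

text = ("the quick red fox jumped the quick black bear. "
        "The quick red fox hopped away.")

trigram_counts = Counter()
bigram_counts = Counter()

# Assumption: sentences end at '.', '!' or '?', and case is ignored.
for sentence in re.split(r"[.!?]+", text.lower()):
    tokens = sentence.split()
    trigram_counts.update(zip(tokens, tokens[1:], tokens[2:]))
    bigram_counts.update(zip(tokens, tokens[1:]))

def trigram_prob(w1, w2, w3):
    """Estimate P(w3 | w1 w2) as count(w1 w2 w3) / count(w1 w2)."""
    context = bigram_counts[(w1, w2)]
    return trigram_counts[(w1, w2, w3)] / context if context else 0.0

p_red = trigram_prob("the", "quick", "red")      # 2 of the 3 "the quick" contexts
p_black = trigram_prob("the", "quick", "black")  # 1 of the 3 "the quick" contexts
print(p_red, p_black)
```

The two estimates sum to 1 because “red” and “black” are the only words that follow “the quick” in this text.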