Morphological learning as principled argument
Lars G Johnsen, University of Bergen, Norway
Maybe in order to understand mankind, we have to look at the word itself: "Mankind". Basically, it's made up of two separate words - "mank" and "ind". What do these words mean? It's a mystery, and that's why so is mankind.
Jack Handey
There are reasons for positing a word structure
There are at least three conditions on structuring a word w into x.y
x is a stem and y is a suffix
y selects x
x and y are relevant for the distribution of w
Arguments for x being a stem carry over to arguments that y is a suffix
If x is a stem then x has meaning
stem(x) → meaning(x)
word(x) → meaning(x)
Being a stem is translated into being a word
Pr( stem(x)| w=x.y) ~ Pr(word(x)| w=x.y)
$$\frac{|\{\, z \mid W(z.y) \wedge W(z) \,\}|}{|\{\, z \mid W(z.y) \,\}|}$$
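As a minimal sketch of this proportion, assume the lexicon is simply a set of attested word forms. For a suffix y, a word z.y counts as positive when the remainder z is itself a word and as negative when it is not. The function name `suffix_counts` and the toy lexicon are illustrative assumptions, not taken from the talk.

```python
def suffix_counts(suffix, lexicon):
    """Count words z.suffix in the lexicon, split by whether z is also a word.

    Returns (positive, negative): positive = z is in the lexicon,
    negative = z is not. The lexicon is assumed to be a set of strings.
    """
    positive = negative = 0
    for word in lexicon:
        # Only consider words that end in the suffix and leave a non-empty z.
        if word.endswith(suffix) and len(word) > len(suffix):
            z = word[: -len(suffix)]
            if z in lexicon:
                positive += 1
            else:
                negative += 1
    return positive, negative


# Hypothetical toy lexicon, chosen by hand for illustration only.
TOY_LEXICON = {
    "home", "homeless", "end", "endless", "care", "careless",
    "hope", "hopeless", "bless",
}

pos, neg = suffix_counts("less", TOY_LEXICON)
print(pos, neg)            # 4 1  ("bless" leaves "b", which is not a word)
print(pos / (pos + neg))   # 0.8
```

The pair (positive, negative) is what the beta distribution takes as its two parameters.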
A beta distribution is used for assigning a probability based on the proportion
beta(positive, negative)
The top ten list
Morph   Ratio  SD  Prob   Pos  Neg
less       95   1    94   368   18
'          93   1    92  1723  133
's         92   0    92  9783  857
ship       91   2    89   167   16
like       91   2    89   140   14
house      91   3    88    75    7
'll        91   3    88   105   11
head       88   4    85    61    8
fish       88   4    84    66    9
stone      87   4    83    66   10
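The columns of the list can be reproduced from Pos and Neg alone. The sketch below assumes that Ratio is the mean of beta(Pos, Neg), SD is its standard deviation, and Prob is the mean minus one standard deviation. That reading is inferred from the numbers in the list and is not stated in the talk; the function names are illustrative.

```python
import math


def beta_summary(pos, neg):
    """Mean, standard deviation, and mean - SD of Beta(pos, neg).

    Both pos and neg must be greater than zero for the beta
    distribution to be defined.
    """
    total = pos + neg
    mean = pos / total
    # Variance of Beta(a, b) is a*b / ((a+b)^2 * (a+b+1)).
    sd = math.sqrt(pos * neg / (total ** 2 * (total + 1)))
    # Assumption: the Prob column is the mean lowered by one SD,
    # which penalises suffixes supported by few observations.
    return mean, sd, mean - sd


def as_percent(x):
    """Round a fraction to a whole percentage, as in the list."""
    return round(100 * x)


for morph, pos, neg in [("less", 368, 18), ("head", 61, 8), ("stone", 66, 10)]:
    ratio, sd, prob = beta_summary(pos, neg)
    print(morph, as_percent(ratio), as_percent(sd), as_percent(prob))
# less 95 1 94
# head 88 4 85
# stone 87 4 83
```

Subtracting the standard deviation is one way to keep a suffix with few observations from outranking a well-attested one with the same ratio.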
Analyzing easiness
easi ness 78
eas iness 46
easines s 42
easin ess 6
easine ss 4
Analyzing termites
Suffix  Ratio  SD  Prob    Pos    Neg
s          42   0    42  18098  25001
ites       43   4    40     78    102
es         23   0    23   2094   6925
tes        19   1    17    211    927
The measure of meaning captures the stem and suffix part
x is a stem and y is a suffix
Selectional relation is treated as the predictive power of the stem and suffix
easiness
easi → easi.er, easi.ly
ness → readi.ness, fond.ness, hard.ness
eas → eas.ier, eas.ily, eas.ter, eas.ton
iness → read.iness
Combining the endings from the stem and the starts from the suffix results in a collection of possible words
The first hypothesis is easi.ness
easi → .er, .ly
ness → readi., fond., hard.
readi.er, readi.ly, fond.er, fond.ly, hard.er, hard.ly
5 positive, 1 negative, approx. 90%
The second hypothesis is eas.iness
eas → .ier, .ily, .ter, .ton
iness → read.
read.ier, read.ily, read.ter, read.ton
1 positive 3 negative, 25%
easi.ness is best on both accounts and is the preferred analysis
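The selection test can be sketched as follows: collect the other endings that occur with the stem and the other stems that occur with the suffix, combine them, and count how many of the combinations are attested words. The function names and the toy lexicon are assumptions made for illustration. The talk does not say which candidate is missing from its word list; the lexicon here leaves out "readier", which reproduces the counts given for both hypotheses.

```python
def stem_endings(stem, lexicon):
    """All non-empty endings e such that stem + e is in the lexicon."""
    return {w[len(stem):] for w in lexicon
            if w.startswith(stem) and len(w) > len(stem)}


def suffix_starts(suffix, lexicon):
    """All non-empty starts s such that s + suffix is in the lexicon."""
    return {w[:-len(suffix)] for w in lexicon
            if w.endswith(suffix) and len(w) > len(suffix)}


def selection_counts(word, cut, lexicon):
    """Score the split word[:cut] . word[cut:] by its predicted words.

    Returns (positive, negative): how many start + ending combinations
    are in the lexicon and how many are not.
    """
    stem, suffix = word[:cut], word[cut:]
    endings = stem_endings(stem, lexicon) - {suffix}   # e.g. .er, .ly
    starts = suffix_starts(suffix, lexicon) - {stem}   # e.g. readi., fond.
    candidates = {s + e for s in starts for e in endings}
    positive = sum(1 for c in candidates if c in lexicon)
    return positive, len(candidates) - positive


# Hypothetical toy lexicon; "readier" is deliberately absent (assumption).
TOY_LEXICON = {
    "easiness", "easier", "easily", "easter", "easton",
    "readiness", "fondness", "hardness",
    "readily", "fonder", "fondly", "harder", "hardly",
}

print(selection_counts("easiness", 4, TOY_LEXICON))  # (5, 1) for easi.ness
print(selection_counts("easiness", 3, TOY_LEXICON))  # (1, 3) for eas.iness
```

With these counts the raw proportions are 5/6 for easi.ness and 1/4 for eas.iness, so the ranking of the two hypotheses agrees with the one in the talk.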