Implications and applications of corpus-based analysis Christiane Fellbaum Princeton University.
Implications and applications of corpus-based analysis
Christiane Fellbaum
Princeton University
Implications
• Linguist is no longer his/her own corpus
• Corpus data don’t necessarily agree with introspection, intuition
• Broad speaker community may reveal linguist’s idiolect
Implications
• New research methodology requires new analyses
• Statistical, “soft” rules rather than hard “yes/no” rules
• “Messy” theories (hence remaining resistance to CL!)
Applications
Gain better theoretical understanding of linguistic phenomena
For lexical semantics work:
--New challenges for lexicographic representation
--Natural Language Processing Applications, e.g., text understanding, language generation
Two examples
• Large-scale corpus analysis of German VP idioms
• Discovery of scales
Corpus-based study of idioms
Linguists, lexicographers, psycholinguists assume that idioms are fixed
kein Blatt vor den Mund nehmen
No leaf in front of the mouth take
“speak freely and frankly”
Non-compositional, opaque
Corpus data show
• Morphosyntactic variation:
Ein Blatt nehmen sie dabei vor keinen Mund
A leaf take they in front of no mouth
(topicalization, shift of negation)
• Lexical variation:
Ein Regierungssprecher ist ein Mann,
A government spokesman is a man
der sich 100 Blaetter vor den Mund nimmt
who 100 leaves in front of his mouth takes
• No theory of idiom grammar/representation accounts for all phenomena
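A fixed-string search for the citation form would miss both variants above. As a rough illustration only (not the method used in the study), the sketch below flags a sentence as an idiom candidate when all three content words co-occur in any order, and then checks whether the canonical word order is present. The stem patterns, the function name `classify`, and the two extra test sentences are assumptions made up for this example; a real corpus study would rely on a lemmatizer and parser rather than hand-written alternations.

```python
import re

# Hypothetical stem patterns for the idiom's content words (illustrative only;
# a lemmatizer would replace these hand-written alternations in practice).
COMPONENTS = {
    "Blatt": re.compile(r"\bBl(?:att|ätter|aetter)\w*", re.IGNORECASE),
    "Mund": re.compile(r"\bMund\w*", re.IGNORECASE),
    "nehmen": re.compile(r"\b(?:ge)?(?:nehm|nimm|nahm|nähm|nomm)\w*", re.IGNORECASE),
}

# Canonical surface order of the nominal part: "kein Blatt vor den Mund".
CANONICAL = re.compile(r"\bkein\w*\s+Blatt\s+vor\s+den\s+Mund\b", re.IGNORECASE)


def classify(sentence):
    """Return 'canonical', 'variant', or None for a single sentence."""
    # Candidate only if every component word occurs somewhere in the sentence.
    if not all(p.search(sentence) for p in COMPONENTS.values()):
        return None
    return "canonical" if CANONICAL.search(sentence) else "variant"


SENTENCES = [
    # The two attested variants quoted on the slide.
    "Ein Blatt nehmen sie dabei vor keinen Mund",
    "Ein Regierungssprecher ist ein Mann, der sich 100 Blaetter vor den Mund nimmt",
    # Constructed examples (not corpus data): citation form and a non-idiom.
    "Sie nimmt kein Blatt vor den Mund.",
    "Das Blatt fiel vom Baum.",
]

RESULTS = [classify(s) for s in SENTENCES]
for sentence, label in zip(SENTENCES, RESULTS):
    print(label, "|", sentence)
```

Co-occurrence alone overgenerates (a sentence can mention a leaf, a mouth, and taking without using the idiom), so hits from a search like this would still need manual inspection.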
Discovering scales
• Scalar adjectives (Sheinman & Tokunaga 2009; Schulam & Fellbaum 2010)
…terrible-lousy-bad-mediocre-good-great-outstanding…
• Gradable emotions (Fellbaum & Mathieu 2010)
…alarm-frighten-scare-terrify…
• Where on the scale are these words placed? What is their relative position (their strength, intensity)?
Discovering scales
Corpus searches with a seed pair reveal lexical-semantic patterns for asymmetry, such as
X even Y (Y is stronger than X)
If not X, at least Y (X > Y)
X but not Y (X is weaker than Y)
Patterns can be applied to all members of a scale to establish their relative order
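The pattern-based ordering can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of the cited papers: the regular expressions are simplified English renderings of the three patterns, the toy sentences are invented rather than taken from a corpus, and the names `PATTERNS`, `extract_pairs`, and `order_scale` are hypothetical. Each match yields a (weaker, stronger) pair, and a topological sort over those pairs gives a relative order.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (regex, first_is_stronger): simplified versions of the three patterns.
PATTERNS = [
    (re.compile(r"\b(\w+),? even (\w+)\b"), False),           # X even Y: Y stronger
    (re.compile(r"\bif not (\w+), at least (\w+)\b"), True),  # if not X, at least Y: X stronger
    (re.compile(r"\b(\w+),? but not (\w+)\b"), False),        # X but not Y: Y stronger
]


def extract_pairs(sentences, scale_words):
    """Collect (weaker, stronger) pairs for words known to be on the scale."""
    pairs = []
    for sentence in sentences:
        text = sentence.lower()
        for pattern, first_is_stronger in PATTERNS:
            for x, y in pattern.findall(text):
                if x in scale_words and y in scale_words:
                    pairs.append((y, x) if first_is_stronger else (x, y))
    return pairs


def order_scale(pairs):
    """Order words from weakest to strongest via topological sort."""
    graph = {}
    for weaker, stronger in pairs:
        # TopologicalSorter expects node -> set of predecessors.
        graph.setdefault(stronger, set()).add(weaker)
    return list(TopologicalSorter(graph).static_order())


# Invented toy sentences standing in for corpus hits.
TOY_CORPUS = [
    "The food was good, even great.",
    "It was great but not outstanding.",
    "The service was, if not good, at least mediocre.",
    "The view was good, even outstanding.",
]
SCALE = {"mediocre", "good", "great", "outstanding"}

PAIRS = extract_pairs(TOY_CORPUS, SCALE)
ORDER = order_scale(PAIRS)
print(ORDER)
```

Real corpus hits are noisy and can contradict one another, in which case the topological sort raises a `CycleError`; a more robust version would count how often each pair is attested and keep only the majority direction.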
Conclusion
Corpus may reveal linguistic data that
• challenge current theories
• escape introspection