Syntagmatic Preferences
Patrick Hanks
Masaryk University
In honour of Yorick Wilks
BCS, London, June 22, 2007
What's so important about “My car drinks gasoline”?
• Violation of “selectional restrictions” is normal.
• So selectional restrictions aren't restrictions at all.
– They are, in fact, selectional preferences
– Different combinations of selectional preferences activate different senses
• Yorick's insights of the 1970s deserve to be followed up more vigorously and systematically than they have been.
A language is a double helix
• Start from the bottom up:
– Let’s look at what the words do.
– How do people use words to make meanings?
• A natural language is a system of norms and exploitations:
– Norms: Animals drink water, people drink beverages
– Exploitations: My car drinks gasoline
• Syntagmatic rules governing normal linguistic behaviour systematically interact with exploitation rules governing how those norms are exploited
Patterns of linguistic behaviour
• Normal linguistic behaviour is highly patterned.
• Words in isolation have meaning potential, not meaning
– A meaning potential is a more or less vague cluster of possibilities – e.g. what does fire mean?
– A burning process? (and if so is it a good thing – in a house, under control – or a bad thing, raging out of control in a forest?) An electric heater? A sense of enthusiasm? Dismiss from employment? Operate a gun? Shoot an arrow? Cause to enthuse? Bake?
– All of these and more.
– Much overlap.
– Sense enumeration doesn’t get it (cf. Pustejovsky’s lexical conceptual paradigms)
• In context, the range of possible interpretations of a word is severely limited:
– People firing guns, ideas that fire people with enthusiasm, employers firing their staff, firing pottery in a kiln
Word Use, Meaning, and Linguistic Theory
• The normal uses of a word can be grouped into patterns, and meanings can be associated with the patterns (rather than the word in isolation)
• So far they haven’t been. Why not?
– Lack of evidence
  • Lexical analysis can only be done effectively with large corpora
– Tradition and intuition
  • direttissimo assaults on word meaning
  • No one thought to go the long way round, via patterns
– The tyranny of “all and only”
  • Lexicographers aimed to cover all possible uses, not just all normal uses
  • NLP and linguistic theory focused on boundary cases
– Syntactocentrism in linguistic theory
  • misses the point about syntagmatics
– Lack of a suitable theory
  • Aha! Preference Semantics provides the basis for such a theory
  • We should take PS seriously and ally it with other relevant theoretical work (Wittgenstein, Putnam, Rosch, Sinclair, Hoey, Pustejovsky, …)
Why is a Pattern Dictionary Necessary?
• Standard dictionaries do not provide the contexts that distinguish one sense of a word from another.
– very poor syntagmatic information
– give equal prominence to normal and merely possible senses
– definitions (and senses) are not mutually exclusive
• WordNet: synsets ≠ word senses!
• FrameNet: frames ≠ word senses!
Identifying norms is hard
• ... and boring
– The painful rediscovery of the obvious,
– which is only obvious when pointed out
• Identifying norms is possible only through painstaking corpus analysis.
• What counts as a normal use of any verb?
– e.g. drink
Norms for 'drink', v.
1. 55% [[Human]] drink [[{Liquid = Water} | Beverage]]
2. 4% [[Animal]] drink [[Liquid = Water]]
3. 39% [[Human]] drink [NO OBJ]
4. 1% [[Human]] drink [[Experience]] {in}
5. 1% [[Human]] drink ([[Liquid = Beverage]]) {up}
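The pattern notation above can be read as a small data structure. The following Python sketch is an illustration only, not the Pattern Dictionary's implementation: the `Pattern` class, the `classify` function, and the toy `SEMANTIC_TYPES` lexicon are hypothetical, and it assumes that `[[{Liquid = Water} | Beverage]]` means an object that is either water or a beverage.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy lexicon (noun -> semantic types); not from the EPD ontology.
SEMANTIC_TYPES = {
    "man": {"Human"},
    "horse": {"Animal"},
    "car": {"Artifact"},
    "water": {"Liquid", "Water"},
    "tea": {"Liquid", "Beverage"},
    "gasoline": {"Liquid"},
}


@dataclass(frozen=True)
class Pattern:
    number: int                      # pattern number as on the slide
    share: int                       # percentage of sample lines, as on the slide
    subject: str                     # required semantic type of the subject
    obj: frozenset = frozenset()     # acceptable object types; empty = [NO OBJ]
    obj_optional: bool = False       # True when the object is in parentheses
    particle: Optional[str] = None   # e.g. "in", "up"


DRINK = [
    Pattern(1, 55, "Human", frozenset({"Water", "Beverage"})),
    Pattern(2, 4, "Animal", frozenset({"Water"})),
    Pattern(3, 39, "Human"),
    Pattern(4, 1, "Human", frozenset({"Experience"}), particle="in"),
    Pattern(5, 1, "Human", frozenset({"Beverage"}), obj_optional=True, particle="up"),
]


def classify(subject, obj=None, particle=None):
    """Return the first matching norm, or flag the clause for later analysis."""
    subj_types = SEMANTIC_TYPES.get(subject, set())
    obj_types = SEMANTIC_TYPES.get(obj, set()) if obj else set()
    for p in DRINK:
        if p.subject not in subj_types or p.particle != particle:
            continue
        if obj is None:
            # No object in the clause: the pattern must allow that.
            if p.obj and not p.obj_optional:
                continue
        elif not (p.obj & obj_types):
            # Object present but of a type the pattern does not prefer.
            continue
        return f"norm {p.number}"
    return "exploitation candidate"
```

Under these assumptions, `classify("man", "tea")` matches norm 1 and `classify("man")` matches norm 3, while `classify("car", "gasoline")` matches no norm and is flagged as a candidate exploitation. A real analysis would leave that final judgement to the lexicographer.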
Some Exploitations of 'drink'
A metaphor (or literary allusion):
• The child of a nonconformist father learnt to drink deep of the Catholic tradition.
– Owen Chadwick, 1991. Michael Ramsey: a life.
A coercion:
• ‘He knows them all,’ she says adoringly, ‘and they all drink shampoo -- nearly every night.
– The Guardian, 1989.
How pervasive is ambiguity?
• Not as pervasive as you might think.
– If we attach meanings to patterns, not to words, most “ambiguities” don't get a chance to rear their ugly heads.
• But here's one: He drank.
• Could be a null-object alternation of “he drank [[Beverage]]”
• or it could mean that he had a problem with alcohol (pattern 3)
Getting the right level of generalization is hard
“John fired at a line of stags”
• Corpus evidence shows that fire at does not prefer ANIM in the prepositional object slot. Any PHYSOBJ will do.
• Building a pattern dictionary is a constant struggle to get “the right level” (or at least an acceptable level) of generalization
• Art is required to choose a level.
• There are no right answers (no absolutes).
– But plenty of wrong ones!
Semantic Types and Semantic Roles
• fire at assigns the semantic role “Target” to words of semantic type [[Physical Object]]
• Semantic types are the intrinsic prototypical values of nouns – their essences
• Semantic roles are assigned by context
Word Meaning: a complex linguistic Gestalt
• In the mind of an English speaker, the verb land is primed for any or all of the following:
– passengers land from a plane
– the pilot lands the plane
– the plane lands
– we landed at Heathrow
– passengers land from a boat (but more probably they are soldiers)
– a commander lands his troops (but not from a plane)
– a boat lands its cargo
– a trawler lands its catch
– an angler lands a fish
– Yorick landed the role of Caliban
– He landed a job in Sheffield
– someone else may land in trouble
– or be landed with a problem
– and someone may even land a blow on your nose
Imposing order on chaos
In the Pattern Dictionary:
• Verbs are sorted into patterns
• Exploitations are flagged for later analysis
• Nouns (“lexical sets”) are clustered into an ontology
• The ontology is “distorted” by usage
• Lexical sets “shimmer”
Lexical Sets “shimmer”
• [[Human]] attend [[Event]]
– Lexical set [[Event]] = {meeting, conference, funeral, ceremony, course, school, seminar, lecture, session, class, rally, dinner, hearing, briefing, reception, workshop, wedding, inquest, summit, concert, event, premiere, …}
• [[Human]] participate {in [[Event]]}
– Lexical set [[Event]] = {debate, election, exercise, coup, demonstration, activity, process, conference, consultation, selection, meeting, …}
• [[Human]] hail [[Event]]
– Lexical set [[Event]] = {victory, success, agreement, vote, opening, development, result, start, resurgence, …}
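One rough way to see the “shimmer” is to compare the members listed for each verb. The Python sketch below uses only the nouns shown on this slide; the lists are truncated in the original, so the comparison says nothing about unlisted members. The names `EVENT_SETS` and `shared` are assumptions made for illustration.

```python
from itertools import combinations

# Members of [[Event]] as listed on the slide for each verb (lists are truncated there).
EVENT_SETS = {
    "attend": {
        "meeting", "conference", "funeral", "ceremony", "course", "school",
        "seminar", "lecture", "session", "class", "rally", "dinner", "hearing",
        "briefing", "reception", "workshop", "wedding", "inquest", "summit",
        "concert", "event", "premiere",
    },
    "participate in": {
        "debate", "election", "exercise", "coup", "demonstration", "activity",
        "process", "conference", "consultation", "selection", "meeting",
    },
    "hail": {
        "victory", "success", "agreement", "vote", "opening", "development",
        "result", "start", "resurgence",
    },
}


def shared(verb_a, verb_b):
    """Nouns listed under [[Event]] for both verbs."""
    return EVENT_SETS[verb_a] & EVENT_SETS[verb_b]


# Print the overlap for every pair of verbs.
for verb_a, verb_b in combinations(EVENT_SETS, 2):
    print(f"{verb_a} / {verb_b}: {sorted(shared(verb_a, verb_b))}")
```

On the listed members, only conference and meeting are shared by attend and participate in, and hail shares none with either: the same semantic type label covers a different set of nouns for each verb.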
Patterns are contrastive
• 2% [[Human]] launch [[Boat]]
• 7% [[Human]] launch [[Projectile]]
• 58% [[Human | Institution]] launch [[Activity | Plan]]
• 24% [[Institution]] launch [[{Artifact = Product} | {Activity = Service}]]
What is a Pattern Dictionary?
• an inventory of all normal patterns of verb use
– not all possible uses.
• an ontology of “shimmering” lexical sets (clusters of nouns according to semantic type and argument roles)
• an inventory of semantically motivated syntagmatic distinctions
Tools needed to build a Pattern Dictionary
• A balanced corpus of the language (i.e. general language)
• A theory
– An initial lexical architecture that guides clustering
  • Wilks, Pustejovsky, Sinclair, …
– A lexical model that distinguishes norms from exploitations
• A methodology: Corpus Pattern Analysis
– Hanks 2004, Hanks and Pustejovsky 2005
– Including statistical corpus analysis
  • Church and Hanks 1989, Kilgarriff et al. 2004, 2005
• A shallow ontology
– A hierarchical organization of semantic types, reflecting word groupings, not scientific conceptualization of the universe
• A suite of corpus tools: Manatee, Bonito, Word Sketch Engine
  • Kilgarriff, Rychlý
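The statistical corpus analysis cited above (Church and Hanks 1989) rests on an association measure based on mutual information. A minimal Python sketch, assuming the usual formulation I(x, y) = log2(P(x, y) / (P(x) P(y))) with probabilities estimated from corpus counts. The function name and the example counts are hypothetical and are not taken from any corpus or tool named on this slide.

```python
import math


def association_ratio(f_xy, f_x, f_y, n):
    """Mutual-information score for a word pair, estimated from counts.

    f_xy: how often x and y co-occur (hypothetical count)
    f_x, f_y: how often each word occurs on its own
    n: corpus size in tokens
    """
    p_xy = f_xy / n
    p_x = f_x / n
    p_y = f_y / n
    return math.log2(p_xy / (p_x * p_y))


# Made-up counts: the pair co-occurs 16 times more often than chance predicts.
print(association_ratio(f_xy=8, f_x=16, f_y=32, n=1024))
```

A score well above zero suggests that the two words co-occur more often than chance would predict, which makes the pair a candidate collocation for the lexicographer to inspect. A score near zero suggests no particular association.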
CPA procedure
• Create a sample concordance (KWIC index) for a word:
– 250 examples of actual uses of the word
• Identify the typical syntagmatic patterns.
• Assign each line of the sample to one of the patterns.
• Take further samples if necessary.
– Introspection is used to interpret data, but not to create data.
• Store the pattern in the entry manager.
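The first step of the procedure can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (a pre-tokenized corpus held in memory, a fixed context window, random sampling); it is not how Manatee, Bonito, or the Word Sketch Engine work, and the function names `kwic` and `sample_concordance` are hypothetical.

```python
import random


def kwic(tokens, keyword_forms, width=4):
    """Build keyword-in-context lines: (left context, keyword, right context)."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() in keyword_forms:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


def sample_concordance(lines, size=250, seed=0):
    """Draw a random sample of concordance lines (all of them if there are few)."""
    if len(lines) <= size:
        return list(lines)
    return random.Random(seed).sample(lines, size)


tokens = "My car drinks gasoline and people drink tea".split()
for left, keyword, right in kwic(tokens, {"drink", "drinks"}, width=2):
    print(f"{left:>12} | {keyword} | {right}")
```

Identifying the patterns and assigning each sampled line to one of them remain human judgements; the code only prepares the evidence.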
In CPA, every line in the sample must be classified
The choices are:
• Norms
• Exploitations
• Alternations
• Names (Midnight Storm: name of a horse, not a storm)
• Mentions (to mention a word or phrase is not to use it)
• Errors (e.g. learned mistyped as leaned)
• Unassignables
– See Proceedings of the Eleventh EURALEX International Congress, pages 105–116, Lorient, France, 2004.
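The requirement that every line receive exactly one of these labels can be expressed as a small check. The Python sketch below is hypothetical (the `LineClass` enum and `tally` function are not part of any CPA tool); it rejects a sample that contains unclassified lines and otherwise reports the share of each category.

```python
from collections import Counter
from enum import Enum


class LineClass(Enum):
    NORM = "norm"
    EXPLOITATION = "exploitation"
    ALTERNATION = "alternation"
    NAME = "name"
    MENTION = "mention"
    ERROR = "error"
    UNASSIGNABLE = "unassignable"


def tally(labels):
    """labels maps a concordance line id to a LineClass (None = not yet classified)."""
    if not labels:
        raise ValueError("empty sample")
    unclassified = [line_id for line_id, label in labels.items() if label is None]
    if unclassified:
        raise ValueError(f"unclassified lines: {unclassified}")
    counts = Counter(labels.values())
    # Share of the sample falling into each category, including empty ones.
    return {cls: counts[cls] / len(labels) for cls in LineClass}
```

For example, a four-line sample labelled NORM, NORM, EXPLOITATION, and NAME yields a share of 0.5 for norms and 0.25 each for exploitations and names.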
How normal are norms? How frequent are exploitations?
• Roughly 75% of all clauses activate “primary norms”
• About 20% activate secondary norms
– including conventional metaphors
– and some expressions that may once have been exploitations themselves
• About 4% of all clauses involve exploitations of various sorts
– dynamic metaphors, other tropes, coercions, ellipsis, etc.
• About 1% of all clauses are unclassifiable
Browsing and Feedback
• The English Pattern Dictionary
• http://nlp.fi.muni.cz/projects/cpa/
• Browse the first 50 verbs at https://apollo.fi.muni.cz:8007/
– Login and password are both “guest”
– Click on the pattern number to see the whole pattern
– Click on “lines” to see supporting corpus evidence
• 50 verb entries have been completed and released
– Feedback, please!
• 400 additional entries have been analysed, awaiting release
– A shallow ontology has been drafted and is being edited
– But not populated with nouns yet
– 6500 verbs remain to be analysed
• EPD will not include rare words like saltate or saccharify