Multiword Expressions: An Extremist Approach. Charles J. Fillmore, ICSI and UCB.
Multiword Expressions: An Extremist Approach
Charles J. Fillmore
ICSI and UCB
Background: or, Why Do I Care?
FrameNet Project
How to evaluate progress
"Words" versus LUs: complain, take off, depend on
Search problems and word frequency
General questions of polysemy
Some corpus linguistics traditions
Certain technical problems of representation: parcelling out meanings
MWEs and the rest of the grammar
Estimation of vocabulary size
Questions of acquisition, typology, etc.
What is a MWE?
Any linguistic expression, involving more than one word, that requires an interpreter – human or machine – to have more than the abilities of an "Innocent Speaker-Hearer".
The concept is not limited to lexicalized (listable) expressions.
Innocent Speaker-Hearer
The ISH knows
– individual simple lexical units,
– the basic head-to-dependent grammatical relations,
– the basic head-to-dependent semantic relations as determined by the frame of the governing lexical unit,
– regular and specific rules for realizing these,
– strategies for building a semantic structure out of all this.
That's all it knows.
Dependency Representation
Since the ISH's knowledge is about
– unitary words and
– word-to-word relations,
that knowledge can be represented in dependency diagrams in which
– each node is a word and
– each word-to-word link, i.e., each branch,
• stands for one of the basic grammatical relations and
• is capable of bearing a frame-based semantic relation to the governor.
Here's a simple case:
His parents gave me a copy of that fascinating book about frogs.
gave
  parents
    his
  me
  copy
    a
    of
      book
        that
        fascinating
        about
          frogs
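A minimal sketch of how such a diagram might be held in code, assuming a simple node class of my own devising (`Node`, `add`, and `links` are hypothetical names). The relation labels on the individual links are one reading of the diagram, not something the slides state link by link.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One word in a dependency diagram, with the relation it bears to its governor."""
    word: str
    relation: str = "root"  # label of the link to the governor (assumed labels)
    dependents: List["Node"] = field(default_factory=list)

    def add(self, word: str, relation: str) -> "Node":
        """Attach a dependent and return it so that deeper links can be chained."""
        child = Node(word, relation)
        self.dependents.append(child)
        return child

    def links(self) -> Iterator[Tuple[str, str, str]]:
        """Yield (governor, relation, dependent) for every branch below this node."""
        for child in self.dependents:
            yield (self.word, child.relation, child.word)
            yield from child.links()


# Relation labels below are an assumed reading of the diagram, not taken from the slides.
gave = Node("gave")
parents = gave.add("parents", "complementation")
parents.add("his", "specification")
gave.add("me", "complementation")
copy = gave.add("copy", "complementation")
copy.add("a", "specification")
of = copy.add("of", "complementation")
book = of.add("book", "complementation")
book.add("that", "specification")
book.add("fascinating", "modification")
about = book.add("about", "modification")
about.add("frogs", "complementation")

for governor, relation, dependent in gave.links():
    print(f"{governor} --{relation}--> {dependent}")
```

Every word other than the root appears exactly once as a dependent, which is the property the ISH idealization relies on: one word per node, one labelled branch per word-to-word relation.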
Basic syntactic relations
Complementation
Specification
Modification
(there are others)
Complementation
His parents gave me a copy of that fascinating book about frogs.
gave
  parents
    his
  me
  copy
    a
    of
      book
        that
        fascinating
        about
          frogs
Actually, copy of should be treated as a MWE.
Specification
His parents gave me a copy of that fascinating book about frogs.
gave
  parents
    his
  me
  copy
    a
    of
      book
        that
        fascinating
        about
          frogs
Actually, his can also be thought of as satisfying a frame requirement of the relational noun parents.
Modification
His parents gave me a copy of that fascinating book about frogs.
gave
  parents
    his
  me
  copy
    a
    of
      book
        that
        fascinating
        about
          frogs
So ...
The study of MWEs proceeds by examining meaning units of the language that do not lend themselves to such a simple treatment.
(Consider a parser.)
Where the ISH idealization fails
1. Some apparent MWEs are best analyzed as single words, occupying one node.
2. Some MWEs are the product of "non-core" constructions and semi-independent mini-grammars.
3. Some MWEs are the products of "regular" processes but have institutionally stipulated meanings.
4. Some MWEs can be represented as dependency subgraphs (not "just" word strings, or collocate sets).
1. "Runs"
There are things that look like MWEs (that are written as sequences of words), but they have no internal variation and may just as well be thought of as long words with spaces in them.
Examples
– used to, let alone, of course, all of a sudden, first off
Many are easily mislearned
– used to > used of
– by and large > by in large
– to all intents and purposes > to all intensive purposes
– an arm and a leg > a nominal egg
2. Special Constructions
Some common grammatical constructions require structures that go beyond the "core" provisions of a grammar. Consider the structure of:
– the faster we drive the sooner we'll get there
– what's this scratch doing on my violin?
– she's older than any of us realized
– she wouldn't give her mother a nickel let alone a dollar
Minigrammars
Some MWEs are generated by simple generative structures, usually finite state automata, for which dependency – or constituency – representations are not always relevant.
– Names
– Numbers
– Locations (addresses, coordinates)
– Time Expressions
– Kinterms
Personal Names
Reverend Dr T. Allen Hampton-Smith III
Components: titles, honorifics, given names, patronymics, family names, extensions, ...
English Kinterms
grandfather, great grandfather, great great grandfather, etc.
first cousin, second cousin, third cousin
first cousin once removed, second cousin three times removed, etc.
father-in-law, son-in-law, sister-in-law, etc.
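Because a regular expression describes exactly what a finite state automaton can recognize, the kinterm patterns listed above can be sketched as one. This is an illustration only: the name `KIN_TERM`, the helper `is_kin_term`, and the particular word lists (including grandson/granddaughter and the cap at "third" and "four times") are my assumptions, not part of the talk.

```python
import re

# Three alternative paths through the automaton, one per pattern on the slide.
KIN_TERM = re.compile(
    r"""
    ^(?:
        (?:great\ )*grand(?:father|mother|son|daughter)          # great great grandfather
      | (?:first|second|third)\ cousin                            # second cousin
        (?:\ (?:once|twice|(?:three|four)\ times)\ removed)?      # ... three times removed
      | (?:father|mother|son|daughter|brother|sister)-in-law      # sister-in-law
    )$
    """,
    re.VERBOSE,
)


def is_kin_term(phrase: str) -> bool:
    """Return True if the phrase is generated by the kinterm minigrammar sketch."""
    return KIN_TERM.match(phrase) is not None


for phrase in ["great great grandfather", "second cousin three times removed",
               "sister-in-law", "great cousin"]:
    print(phrase, "->", is_kin_term(phrase))
```

Note that nothing here needs a dependency or constituency analysis: the pattern only cares about which pieces may follow which, which is the point of calling these minigrammars.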
siblings
cousins
second cousins
first cousins once removed
first cousins twice removed
[Diagram shown for each term: a family tree with X at the top and the pairs A B, C D, E F, G H in successive generations below.]
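The cousin terms are compositional in a way that a few lines of arithmetic can capture. The sketch below assumes the usual English convention: the degree is one less than the shorter distance to the common ancestor, and the removal is the difference between the two distances. The function name `cousin_term` and its depth arguments are hypothetical, and aunt/uncle and niece/nephew terms are deliberately left out.

```python
ORDINALS = ["first", "second", "third", "fourth", "fifth"]
TIMES = {1: "once", 2: "twice", 3: "three times"}


def cousin_term(depth_a: int, depth_b: int) -> str:
    """Name the relation between two descendants of a common ancestor.

    depth_a and depth_b count generations below the ancestor
    (1 = child, 2 = grandchild, and so on).
    """
    if depth_a < 1 or depth_b < 1:
        raise ValueError("both people must be descendants of the common ancestor")
    degree = min(depth_a, depth_b) - 1
    removed = abs(depth_a - depth_b)
    if degree == 0:
        if removed == 0:
            return "siblings"
        raise ValueError("aunt/uncle and niece/nephew terms are not covered by this sketch")
    if degree > len(ORDINALS):
        raise ValueError("degree beyond the ordinals listed in this sketch")
    term = f"{ORDINALS[degree - 1]} cousins"
    if removed:
        term += f" {TIMES.get(removed, str(removed) + ' times')} removed"
    return term


print(cousin_term(1, 1))
print(cousin_term(2, 2))
print(cousin_term(2, 4))
```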
Digression
Ordinary techniques of computational linguistics/corpus linguistics won't be able to recognize the constructional nature of some expressions.
Test case
another $600
Indefinite article   Qualifier     Quantifier   Plural Noun
a                    whopping      600          dollars
an                   additional    10           pages
a                    paltry        20           euros
a                    respectable   6,000        francs
*a                   mere          -            pages
*a                   -             12           pages
But how do we analyze "another $600"?
*an -other 600 $
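To make the difficulty concrete, here is a deliberately naive surface matcher for the four-slot construction. Everything in it is an assumption for illustration: the tiny word lists come from the examples above, and `fits_construction` is a hypothetical helper, not a proposal for how the construction should really be recognized.

```python
import re

# Toy lexicons taken from the example rows; a real system would need far more.
QUALIFIERS = {"whopping", "additional", "paltry", "respectable", "mere"}
PLURAL_NOUNS = {"dollars", "pages", "euros", "francs"}
NUMERAL = re.compile(r"\d[\d,]*")


def fits_construction(phrase: str) -> bool:
    """True if the phrase is article + qualifier + numeral + plural noun, in that order."""
    tokens = phrase.lower().split()
    if len(tokens) != 4:
        return False
    article, qualifier, quantifier, noun = tokens
    return (
        article in {"a", "an"}
        and qualifier in QUALIFIERS
        and NUMERAL.fullmatch(quantifier) is not None
        and noun in PLURAL_NOUNS
    )


for phrase in ["a whopping 600 dollars", "a mere pages", "a 12 pages", "another $600"]:
    print(repr(phrase), "->", fits_construction(phrase))
```

The matcher correctly rejects the two starred rows, but it also rejects "another $600", where the article and qualifier are fused in one written word and the noun precedes the numeral in writing. Token-level pattern matching has no way to see that it is the same construction.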
Relations to the rest of the grammar
It would be most convenient if the products of minigrammars could be "sealed" and not interfere with the rest of the sentence. But:
– Croatian names
– Finnish numbers
– Internal grammar
3. Stipulated Designations
Translucent Idioms: regular productions with stipulated designations
From one point of view these are just "long words" with special meanings, but they are semantically penetrable; e.g.,
– names of organizations
The American Society for the Prevention of Cruelty to Animals (ASPCA)
– names of titles
Deputy Undersecretary of Defense for Intelligence
– names of officially designated crimes
assaulting a federal officer with a deadly or lethal weapon
4. Dependency Subgraphs
Here we refer to lexical units that are continuous parts of dependency structures.
x
  y
x
  y
  z
x
  y
    z
Dependency Subgraphs
A given lexical unit of this kind can have its own subcategorization requirements.
x
  y
x
  y
  z
x
  y
    z
[Each of the three subgraphs is shown with an additional dependent slot, A.]
(Motivating digression)
word strings - "wrist watch" - how to find - statistical significance ("of the")
discontinuous - "collocates" - within spans - within sentences
some kind of grammatical relation between them?
Subcategorization Details
Particle Verbs - Intransitive
Verb > particle is the lexical unit.
Exx: wake up, go away, sit down, shut up
Interruptible: Shut the hell up!
V
  part
  X
shut
  up
  X
shut
  up
  the_hell
Particle Verbs - Transitive
Verb > particle is the lexical unit.
Exx: take off ('remove'), take out ('date'),
Interruptible: Take your shoes off. I took her out once.
V
  part
  X
  Y
take
  off
  X
  Y
take
  off
  your shoes
In the Old Days ...
About half a century ago it was generally believed that in Deep Structure, phrases like pick up, take off, etc., started out as single constituents, and a Particle Movement Transformation allowed the extraction of the particle so that it could follow the direct object.
[take off] [your shoes] >> [take] [your shoes] [off]
A dependency subgraph can recognize the unity of the two-word block without worrying about phrasal constituency.
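A small sketch of what that buys us, assuming a sentence has already been parsed into (governor, relation, dependent) edges. The edge lists are built by hand, the relation labels ("part", "obj", and so on) are assumed, and `find_particle_verbs` is a hypothetical helper; only the two glosses come from the examples above.

```python
from typing import List, Tuple

Edge = Tuple[str, str, str]  # (governor lemma, relation, dependent lemma)

# Lexical units stored as verb-particle links, with the glosses given in the examples.
PARTICLE_VERBS = {("take", "off"): "remove", ("take", "out"): "date"}


def find_particle_verbs(edges: List[Edge]) -> List[Tuple[str, str, str]]:
    """Return (verb, particle, gloss) for every listed verb-particle link in the edges."""
    hits = []
    for governor, relation, dependent in edges:
        if relation == "part" and (governor, dependent) in PARTICLE_VERBS:
            hits.append((governor, dependent, PARTICLE_VERBS[(governor, dependent)]))
    return hits


# "Take off your shoes" and "Take your shoes off" differ only in linear order,
# so under this sketch they share one hand-built set of edges.
shoes_edges = [("take", "part", "off"), ("take", "obj", "shoe"), ("shoe", "spec", "your")]
# "I took her out once."
date_edges = [("take", "subj", "I"), ("take", "part", "out"),
              ("take", "obj", "her"), ("take", "mod", "once")]

print(find_particle_verbs(shoes_edges))
print(find_particle_verbs(date_edges))
```

Since the lookup is over links rather than adjacent words, an intervening object or adverb does not get in the way, and no movement rule is needed to relate the two word orders.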
Prepositional Verbs - Intransitive
Verb > preposition is the lexical unit.
Exx: look for ('seek'), object to ('oppose'), look into ('investigate')
Interruptible: I looked long and hard for the perfect wife. We objected strenuously to her proposal.
Comment: Some PPs are omissible, some aren't: look (for), look into
V
  X
  prep
    Y
look
  X
  for
    Y
PP Omissibility
Omissible (under conditions of zero anaphora)
Look at it! - I'm looking.
Look for it. - I'm looking.
Non-omissible
Could you look into this problem for me? - *I've already started looking.
Prepositional Verbs - Transitive
Verb > preposition is the lexical unit.
Exx: talk into ('persuade'), rid of
Comment: PP is sometimes omissible:
The judge cleared me (of all charges).
They tried to talk me *(into quitting my job).
Who will rid me *(of this meddlesome priest)?
V
  X
  Y
  prep
    Z
clear
  X
  Y
  of
    Z
Particle-&-Preposition Verbs
Verb > {part,prep} is the lexical unit.
Exx: put up with ('tolerate'), look up to ('respect'), break in on ('interrupt')
Not generally interruptible, I think (haven't checked corpus data).
V
  part
  X
  prep
    Y
put
  up
  X
  with
    Y
V+N+P Verbs
Verb > /N,prep/ is the lexical unit.
Exx: take advantage of ('exploit'), take part in ('participate in'), take charge of
Comments: N can be modified; N can be passive subject:
Considerable advantage was taken of this opportunity.
Pseudo-passive:
They were cruelly taken advantage of.
N does not take a determiner.
V
  N
  X
  prep
    Y
take
  part
  X
  in
    Y
Other Parts of Speech
Adjectives can have prepositional and clausal complements:– fond of cats; interested in math; similar to mud
Nouns can have prepositional and clausal complements:– top of the tower; friend to the poor; journey into the jungle; copy of the book
VP Idioms
Obvious ones
– pull someone's leg, blow one's nose
– kick the bucket
Less obvious ones
– answer the door (Would you answer the door?)
– mention someone's name (Did anybody mention my name at the party?)
Support Constructions
Support Verbs with Subject N
Verb > N is the lexical unit, N is semantic head, V is support verb.
Exx: The wind is blowing, the fire is burning, the rain is falling, a riot occurred, an accident happened
Comment: The frame is evoked by the noun. The support verb is selected by the noun. Compare "the fire is burning" with "the house is burning".
V
  N
blow
  wind
Note linearization: Since these are intransitive, the N is (or heads) the subject NP and the verb is the predicate.
V
blows
Support Verbs with Object N
Verb > N is the lexical unit, N is semantic head, V is support verb. N has its own valence.
Exx: We had an argument with the kids. ('we argued with the kids') I made the decision to leave. ('I decided to leave')
Comment: The frame is evoked by the noun. The SV is selected by the noun, which also brings in its own complement structure.
Comment: The N doesn't have to be deverbal: wage war, commit a crime
V
  N
  X
have
  argument
    with
      Y
  X
Ditransitive Support Verbs
Verb > N is the lexical unit, N is semantic head, V is support verb. X and Y are each participants in N's frame.
Exx: She gave me a kiss. ('she kissed me') I paid him a bribe. ('I bribed him') They gave me good advice. ('they advised me well')
V
  N
  X
  Y
give
  kiss
  X
  Y
SVs can resolve polysemy.
Polysemous event nouns can take different support verbs:
– ('quarrel') have an argument
– ('reason') make an argument
– ('rest') take a break
– ('flight') make a break
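One way to picture this is a lookup keyed on the verb-noun pair rather than on the noun alone. The table below contains only the four pairs listed above; the names `SUPPORT_VERB_SENSES` and `noun_sense` are hypothetical.

```python
from typing import Optional

# (support verb, noun) -> sense gloss, as listed on the slide.
SUPPORT_VERB_SENSES = {
    ("have", "argument"): "quarrel",
    ("make", "argument"): "reason",
    ("take", "break"): "rest",
    ("make", "break"): "flight",
}


def noun_sense(verb: str, noun: str) -> Optional[str]:
    """Return the sense of the noun selected by this support verb, if it is listed."""
    return SUPPORT_VERB_SENSES.get((verb, noun))


print(noun_sense("have", "argument"))
print(noun_sense("make", "argument"))
print(noun_sense("take", "break"))
```

A pair that is not in the table returns `None`, which in this sketch simply means the combination has not been recorded as a support construction.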
A common test of SVs:
One frequently proposed characteristic of support verbs is that their nominal object can't really be interrogated, meaning that the verb in question isn't functioning as a self-standing verb. The following are not natural conversations:
– What did you heave? - A sigh.
– What have you made? - A decision to go home.
– What did you have? - A fight with my brother.
– What did you wreak? - Vengeance on my enemies.
– What did you lodge? - A complaint.
Interchangeable with Verbs
She heaved a sigh. (She sighed.)
We made the decision to give up. (We decided to give up.)
I took a bath. (I bathed.)
He suffered a relapse. (He relapsed.)
Let’s say a prayer. (Let’s pray.)
Profiling Different Participants
Agent of event
– perform an operation
– inflict injury
– exact/wreak vengeance
– launch an attack
– give instructions
– submit an application
– ask a question
Undergoer of event
– undergo an operation
– sustain injury
– have a setback
– suffer a defeat
– receive a rebuke
– get advice
Beyond "light verbs"
Simple cases: the verb has essentially no meaning except to reveal that its subject is necessarily a participant in the event named by the noun.
– a. active role
– b. passive role
More nuanced cases: the verb contributes information about register, attitude, aktionsart, or the like.
More extended cases: the verb identifies its subject as a participant in the larger scenario associated with the event named by the noun.
Examples
Simple, active: – he made a complaint
Nuanced: – he registered a complaint
Examples
Simple, active: – she gave an exam
Simple, passive: – he took/sat the exam
Extended: – he passed/failed the exam
Examples
Simple, active: – she made a promise
Extended: – she kept/broke her promise
For the full story, and then some, see ...
Mel'čuk, Igor' (1995). Phrasemes in language and phraseology in linguistics. In M. Everaert et al., Idioms: Structural and Psychological Perspectives. Lawrence Erlbaum Associates.
Mel'čuk, Igor' (1996). Lexical functions: a tool for the description of lexical relations in a lexicon. In Leo Wanner, ed., Lexical Functions in Lexicography and Natural Language Processing. John Benjamins.
Mel'čuk, Igor' (1998). Collocations and lexical functions. In Cowie 1998.
Mel'čuk, Igor' (1995). The future of the lexicon in linguistic description and the explanatory combinatorial dictionary. Linguistics in the Morning Calm 3, 181-270. Seoul: Hanshin.
Support Verbs with Adjective
Verb > A is the lexical unit, A is semantic head, V is support verb, A may have its own complements (e.g., rid of).
Exx: be + any predicate adjective; go crazy, turn red, get naked
Comment: The unit rid of seems to occur only with a SV.
V
  A
  X
get
  naked
  X
Support Prepositions
Prep > N is the lexical unit, N is semantic head, V is support verb. N has its own valence.
Exx: at risk, in danger, on fire, under scrutiny, under arrest
Some are modifiable: at considerable risk, in grave danger, under careful scrutiny
Comment: The P>N structure may function adjectivally or adverbially; the N can have its own complements.
(he participated in the race) at considerable risk to his health, (the building is) in danger of collapse
P
  N
at
  risk
More Complex Cases
Verb > P > N is the lexical unit, N is semantic head, V is support verb, N is generally not expandable.
Exx: take into account, take under consideration, have in (one's) possession
V
  P
    N
  X
  Y
take
  under
    consideration
  X
  Y
Support Verbs with PP
Verb > P > N is the lexical unit, N is semantic head, V is support verb. With possession there are two alignments of the arguments:
Possessor - Possessed
I came into possession of these documents.
Possessed - Possessor
These documents came into my possession.
V
  X
  prep
    N
come
  X
  into
    possession
Transparent Nouns
N of N
N > of is the lexical unit. The second N is semantic head for purposes of external selection.
Comment: sometimes the N > of is "transparent" to the pieces of an MWE; and sometimes the N > of > N is itself an MWE, especially in the case of aggregates and unitizers:
– a case of the flu
– a round of golf
– a herd of cattle
– a flock of geese
– a school of fish
– a pinch of salt
– a pod of whales
N
  of
    N
type
  of
    fish
bout
  of
    flu
Types of transparent nouns
1. Aggregates: bunch, group, collection, herd, school, flock
2. Quantities: flood, number, scores, storm
3. Types: breed, class, ilk, kind, type, sort
4. Portions and Parts: half, segment, top, bottom, part
5. Unitizers: glass, bottle, box, serving
6. Evaluations: gem, idiot, prince
"Transparent" to what?
Relation between locative preposition and object:
– on the shelf; on this part of the shelf
– in the room; in this part of the room
Relation between verb and typical collocating object:
– play golf; play a round of golf
– eat fish; eat this type of fish
Relation between possessor and kin-term:
– my wife; my gem of a wife
– her husband; her jerk of a husband
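The effect of transparency on selection can be sketched as a function that decides which noun of an "N1 of N2" phrase an outside governor should see. The six classes and their members are the ones listed in the types of transparent nouns; the names `TRANSPARENT_NOUNS`, `transparent_class`, and `semantic_head` are hypothetical, and treating every unlisted noun as opaque is a simplifying assumption.

```python
from typing import Optional

# Classes and members as listed on the "Types of transparent nouns" slide.
TRANSPARENT_NOUNS = {
    "aggregate": {"bunch", "group", "collection", "herd", "school", "flock"},
    "quantity": {"flood", "number", "scores", "storm"},
    "type": {"breed", "class", "ilk", "kind", "type", "sort"},
    "portion or part": {"half", "segment", "top", "bottom", "part"},
    "unitizer": {"glass", "bottle", "box", "serving"},
    "evaluation": {"gem", "idiot", "prince"},
}


def transparent_class(noun: str) -> Optional[str]:
    """Return the class of a transparent noun, or None if the noun is not listed."""
    for label, members in TRANSPARENT_NOUNS.items():
        if noun in members:
            return label
    return None


def semantic_head(n1: str, n2: str) -> str:
    """For 'N1 of N2', return the noun that external selection should look at."""
    return n2 if transparent_class(n1) is not None else n1


print(semantic_head("type", "fish"))   # eat this type of fish
print(semantic_head("part", "shelf"))  # on this part of the shelf
print(semantic_head("gem", "wife"))    # my gem of a wife
```

With this in place, a verb such as eat or a preposition such as on can check its usual collocate against the second noun, while an unlisted first noun stays the head.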
Compounds
N > N Compounds
N > N is the lexical unit; listed compounds have the dependent in red; the syntactic head is the frame evoker, the dependent is either a frame element or a "quale". The order is Modifier + Head.
risk (N)
  health (N)
knife (N)
  fish (N)
N+N Compounds
Some are just listed, their internal structure of etymological relevance only. (What's the head of light year? Often misused: "that was light years ago".)
– light year, puppy love
Some are listed, with N2 as the head, N1 as satisfier of some requirement of N2; name pre-existing category.
– bread knife, wine bottle, cork screw
Some are interpretable with reference to completion needs of N2.
– fire risk, health risk, travel risks
A-N Compounds
N > A is the lexical unit; listed compounds have the dependent in red; the syntactic head is the frame evoker, the dependent is either a frame element or a "quale".
Ready-made A+N compounds: hot news, friendly fire, blind alley, dead end
police (N)
  federal (A)
news (N)
  hot (A)
"Pertinative" adjectives
Pertinatives are adjectives whose senses are defined in (some) dictionaries with the phrase "of or pertaining to". Traditional term: relational adjectives. WordNet term: pertainyms.
They are not used predicatively in the same meaning.
They aren't scalar, e.g., they don't get modified with very.
Pertinatives vs. Descriptives
Pertinatives (these are MWEs):
judicial appointment, economic policy, educational practice, criminal law, linguistic society, Canadian government, national interest
Descriptives (these aren't):
judicious appointment, economical housewife, educational experience, criminal behavior, ugly cat, amazing disclosure, bored child
Continuity Hypothesis
I assume the continuity of the lexicon and the constructicon.
Reference: Paul Kay & Charles J. Fillmore (1999), "Grammatical constructions and linguistic generalizations: the What's X Doing Y? construction", Language 75 1-33.
Claim: many lexically-headed constructions can be analyzed as dependency subtrees.
be
  X
  doing
    what
    Y
be is finite (not quite true).
Y is a secondary predicate, i.e., AP, with-absolute, participial, or locative phrase.
Different linearizations and interruptions:
What are you doing here? (be before X)
I wonder what she's doing wearing her mother's dress. (X before be)
What the hell are you still doing standing out there in the rain? (various interruptions)
What are you doing without any shoes on?
Meaning: X is Y, and that is anomalous.
Long line for pre-recall appointments to bench
Phillip Matier, Andrew Ross
Monday, August 25, 2003
1. As the recall clock ticks down, it's interesting to note how many Gray Davis loyalists are putting their names in for some highly coveted judicial appointments.
2. Among the more notable bench seekers:
• The governor's own legal affairs secretary, Barry Goode, who is being vetted by the State Bar for an appointment to the First District Court of Appeal in Sacramento.
• Davis' legal appointments secretary, Burt Pines, who over the past 4 1/2 years has helped his boss fill 304 judgeships around the state. Pines is now under consideration himself for a seat on the L.A. Superior Court.
• And Jeremiah Hallisey, one of the governor's top San Francisco fund-raisers who put together last Thursday's big $1,000-a-head cocktail party for Davis at the Fairmont. Hallisey, who sits on the California Transportation Commission, has filed papers for one of the Superior Court openings in either San Francisco or Contra Costa County.
3. And speaking of the Fairmont fund-raiser (which netted a respectable $600,000), attendees told us the crowd looked like a casting call for wannabe judges and people seeking recall-proof commission appointments.
Personal names, long and short: Gray Davis, Davis; Jeremiah Hallisey, Hallisey
Places: Los Angeles; San Francisco
Organizations, Institutions: First District Court of Appeal; L.A. Superior Court; California Transportation Commission
Noun+Noun Compounds: recall clock; Davis loyalists; casting call; commission appointments
Adjective + Noun Compounds: legal affairs; judicial appointment; medical leave; judicial vacancy
Complex cases: legal affairs secretary; legal appointments secretary
Support Verbs: make ... appointments; submit to ... review
Transparent Nouns: a stack of appointments; a host of 11th hour appointments
Verb-headed phrases: put one's name in for (an appointment); file for (an opening); get the thumbs down from; get one's name cleared; sign off on; get caught flat-footed
Miscellaneous: as the clock ticks down; over the past four and a half years; it is interesting to note; and speaking of ...; a respectable 600 thousand dollars; on the way out the door; on the chance there may be ...; much less; in fairness
Bottom Line
Lexical units can be represented as dependency subgraphs, specifying a semantic head, a syntactic head, required/preferred dependents.
Constraints on dependents can be specified lexically, sortally, morphosyntactically, and in terms of frame roles.
Dependents can be marked as "closed" (not open to modification) and/or "local" (not subject to extraction) and/or "omissible".
The lexical head of the construction bears information about contextual constraints: finiteness, inflection, polarity, etc.
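As a rough illustration of what such an entry might look like, here is a sketch of a record type with the properties just named. The class and field names (`LexicalUnit`, `Dependent`, `head_constraints`, and so on) are my assumptions; the three sample entries use only facts stated earlier in the talk: the PP of look for is omissible, the PP of look into is not, and the N of take under consideration is generally not expandable.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Dependent:
    """One dependent slot in a lexical unit's subgraph."""
    slot: str                        # e.g. "prep", "N", "part"
    lexical: Optional[str] = None    # lexically fixed filler, if any
    frame_role: Optional[str] = None # frame-role constraint, if any (left empty here)
    required: bool = True
    closed: bool = False             # not open to modification
    local: bool = False              # not subject to extraction
    omissible: bool = False


@dataclass
class LexicalUnit:
    """A lexical unit represented as a small dependency subgraph."""
    name: str
    syntactic_head: str
    semantic_head: str
    dependents: List[Dependent] = field(default_factory=list)
    head_constraints: Dict[str, str] = field(default_factory=dict)  # finiteness, polarity, ...

    def omissible_slots(self) -> List[str]:
        """Names of the dependent slots that may be left unexpressed."""
        return [d.slot for d in self.dependents if d.omissible]


look_for = LexicalUnit("look for", "look", "look",
                       [Dependent("prep", lexical="for", omissible=True)])
look_into = LexicalUnit("look into", "look", "look",
                        [Dependent("prep", lexical="into")])
take_under_consideration = LexicalUnit(
    "take under consideration", "take", "consideration",
    [Dependent("P", lexical="under"),
     Dependent("N", lexical="consideration", closed=True)],
)

print(look_for.omissible_slots())
print(look_into.omissible_slots())
print(take_under_consideration.semantic_head)
```

Keeping the syntactic and semantic heads as separate fields is what lets support constructions fit the same record shape as particle and prepositional verbs.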