1
Data Collection and Normalization for the Scenario-Based Lexical Knowledge Resource of a Text-to-Scene Conversion System
Margit Bowler
2
Who I Am
Rising senior at Reed College in Portland
Linguistics major, concentration in Russian
3
Overview
WordsEye & the Scenario-Based Lexical Knowledge Resource (SBLR)
Use of Amazon’s Mechanical Turk (AMT) for data collection
Manual normalization of the AMT data and definition of semantic relations
Automatic normalization techniques for AMT data with respect to building the SBLR
Future automatic normalization techniques
4
WordsEye Text-to-Scene Conversion
Example input: “the humongous white shiny bear is on the american mountain range. the mountain range is 100 feet tall. the ground is water. the sky is partly cloudy. the airplane is 90 feet in front of the nose of the bear. the airplane is facing right.”
5
Scenario-Based Lexical Knowledge Resource (SBLR)
Information on the semantic categories of words
Semantic relations between predicates (verbs, nouns, adjectives, prepositions) and their arguments
Contextual, common-sense knowledge about the visual scenes in which various actions and items occur
6
How to build the SBLR… efficiently?
Manual construction of the SBLR is time-consuming and expensive.
Past methods have included mining information from external semantic resources (e.g. WordNet, FrameNet, PropBank) and information-extraction techniques applied to other corpora.
7
Amazon’s Mechanical Turk (AMT)
An online marketplace for work; anyone can work on AMT.
However, it is possible to screen workers by various criteria. We screened ours by:
Located in the USA
99%+ approval rating
A code sketch of these two screening criteria follows.
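For illustration, here is a minimal sketch of how these two criteria map onto MTurk’s built-in worker qualifications, expressed with the modern boto3 client. The original collection predates this API, so the client setup and HIT wiring are assumptions; only the screening criteria come from the slide.

import boto3

# Hypothetical modern-API sketch: criteria from the slide, wiring assumed.
mturk = boto3.client("mturk", region_name="us-east-1")

screening = [
    {   # worker must be located in the USA
        "QualificationTypeId": "00000000000000000071",   # built-in Locale qualification
        "Comparator": "EqualTo",
        "LocaleValues": [{"Country": "US"}],
    },
    {   # worker must have a 99%+ assignment approval rating
        "QualificationTypeId": "000000000000000000L0",   # built-in approval-rate qualification
        "Comparator": "GreaterThanOrEqualTo",
        "IntegerValues": [99],
    },
]

# The list would be passed as QualificationRequirements=screening when
# creating each task with mturk.create_hit(...).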
8
AMT Tasks
In each task, we asked for up to 10 responses. A comment box was provided for more than 10 responses.
Task 1: Given the object X, name 10 locations where you would find X. (Locations)
Task 2: Given the object X, name 10 objects found near X. (Nearby Objects)
Task 3: Given the object X, list 10 parts of X. (Part-Whole)
9
AMT Task Results
17,200 total responses
Spent $106.90 for all three tasks
Each task took approximately 5 days to complete

Task            Target Words   User Inputs   Reward
Locations       342            6,850         $0.05
Nearby Objects  342            6,850         $0.05
Parts           245            3,500         $0.07
10
Goal: how can data collected from AMT be automatically normalized so that it is useful for building the Scenario-Based Lexical Knowledge Resource (SBLR)?
11
Manual Normalization of AMT Data
Removal of uninformative target item - response item pairs between which no relevant semantic relationship held
Definition of the semantic relations holding between the remaining target item - response item pairs
This manually normalized data set was used as the gold standard against which we measured the automatic normalization techniques.
12
Rejected Target-Response Pairs
Misinterpretation of an ambiguous target item (e.g. mobile)
A viable interpretation of the target item was not contained within the SBLR (e.g. crawfish as food rather than as a living animal)
Overly generic responses (e.g. store in response to turntable)
13
Examples of Approved AMT Responses
Locations: mural - gallery; lizard - desert
Nearby Objects: ambulance - stretcher; cauldron - fire
Part-Whole: scissors - blade; monument - granite
14
Semantic Relations
Defined a total of 34 relations
Focused on defining concrete, graphically depictable relationships
“Generic” relations accounted for most of the labeled pairs (e.g. containing.r, next-to.r)
Finer distinctions were made within these generic relations (e.g. habitat.r and residence.r within the overarching containing.r relation)
15
Example Semantic Relations
Locations: mural - gallery - containing.r; lizard - desert - habitat.r
Nearby Objects: ambulance - stretcher - next-to.r; cauldron - fire - above.r
Part-Whole: scissors - blade - object-part.r; monument - granite - stuff-object.r
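For concreteness, one plausible way to store such normalized triples in code is sketched here. The slides do not specify the SBLR’s actual internal format, so this representation is a hypothetical illustration built only from the example pairs above.

from collections import namedtuple

# Hypothetical storage format; the SBLR's real representation is richer
# and is not specified on these slides.
Relation = namedtuple("Relation", ["target", "response", "relation"])

triples = [
    Relation("mural", "gallery", "containing.r"),
    Relation("lizard", "desert", "habitat.r"),
    Relation("ambulance", "stretcher", "next-to.r"),
    Relation("cauldron", "fire", "above.r"),
    Relation("scissors", "blade", "object-part.r"),
    Relation("monument", "granite", "stuff-object.r"),
]

# e.g. retrieve everything stored about "lizard" as a target item
lizard_facts = [t for t in triples if t.target == "lizard"]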
16
Semantic Relations within Locations Task
We collected 6,850 locations for 342 target objects from our 3D library.

Relation                  Occurrences   % of total scored pairs
containing.r              1,194         38.01%
habitat.r                 346           11.02%
on-surface.r              333           10.60%
geographical-location.r   306           9.74%
group.r                   183           5.83%
17
Semantic Relations within Nearby Objects Task
We collected 6,850 nearby objects for 342 target objects from our 3D library.

Relation        Occurrences   % of total scored pairs
next-to.r       4,988         75.66%
on-surface.r    375           5.69%
containing.r    293           4.44%
habitat.r       243           3.69%
object-part.r   153           2.32%
18
Semantic Relations within Part-Whole Task
We collected 3,500 parts of 245 objects.

Relation        Occurrences   % of total scored pairs
object-part.r   2,675         79.12%
stuff-object.r  552           16.33%
containing.r    50            1.48%
habitat.r       36            1.06%
stuff-mass.r    17            0.50%
19
Automatic Normalization Techniques
Collected AMT data was classified into higher-scoring versus lower-scoring sets by:
Log-likelihood and log-odds of sentential co-occurrences in the English Gigaword corpus
WordNet path similarity
Resnik similarity
WordNet average pair-wise similarity
WordNet matrix similarity
Accuracy was evaluated by comparison against the manually normalized data. Sketches of several of these measures follow.
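Here is a minimal sketch of two of the corpus-statistics measures and two of the WordNet measures listed above. The 2x2 co-occurrence counts, the smoothing constant, and the choice of word senses are illustrative assumptions rather than our exact implementation; the WordNet calls use NLTK’s current interface.

import math
from nltk.corpus import wordnet as wn, wordnet_ic  # may require nltk.download(...)

def log_odds(k11, k12, k21, k22):
    # k11: sentences containing both words; k12: target only;
    # k21: response only; k22: neither. 0.5 is added to every cell
    # (Haldane-Anscombe smoothing) so that empty cells do not blow up.
    return math.log(((k11 + 0.5) * (k22 + 0.5)) /
                    ((k12 + 0.5) * (k21 + 0.5)))

def log_likelihood(k11, k12, k21, k22):
    # Dunning's G^2 log-likelihood statistic over the same 2x2 table.
    n = k11 + k12 + k21 + k22
    observed = (k11, k12, k21, k22)
    expected = ((k11 + k12) * (k11 + k21) / n, (k11 + k12) * (k12 + k22) / n,
                (k21 + k22) * (k11 + k21) / n, (k21 + k22) * (k12 + k22) / n)
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

# WordNet path similarity and Resnik (information-content) similarity for
# one approved pair; the sense choice (first noun sense) is illustrative.
brown_ic = wordnet_ic.ic("ic-brown.dat")
lizard, desert = wn.synset("lizard.n.01"), wn.synset("desert.n.01")
print(lizard.path_similarity(desert))
print(lizard.res_similarity(desert, brown_ic))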
20
Precision & Recall
AMT data is quite cheap to collect, so we were concerned predominantly with precision (obtaining highly accurate data) rather than recall (avoiding loss of some data).
In order to obtain more accurate data (high precision), we accept losing a portion of our AMT data (low recall).
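As a toy illustration of this trade-off: the pair sets below are invented for the example, and only the precision/recall definitions come from this slide.

# Toy illustration: invented pair sets, slide's definitions.
kept = {("mural", "gallery"), ("lizard", "desert"), ("turntable", "store")}  # pairs a filter keeps
gold = {("mural", "gallery"), ("lizard", "desert"), ("cauldron", "fire")}    # manually approved pairs

true_positives = kept & gold
precision = len(true_positives) / len(kept)   # how accurate the kept data is
recall = len(true_positives) / len(gold)      # how much of the good data we retained
print(f"precision={precision:.2f}, recall={recall:.2f}")   # both 0.67 here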
21
Locations Task
Achieved the best precision with log-odds.
Within the high-scoring set, responses that were too general (e.g. turntable - store) were rejected.
Within the low-scoring set, extremely specific locations that were unlikely to occur within a corpus or WordNet’s synsets were approved (e.g. caliper - architect’s briefcase).

           Baseline  Log-likel.  Log-odds  WN Path Sim.  Resnik  WN Avg. PW  WN Matrix Sim.
Precision  0.5527    0.7502      0.7715    0.5462        0.5562  0.6014      0.4782
Recall     1.0       0.7945      0.6486    0.9649        0.9678  0.3454      1.0
22
Nearby Objects Task
Relatively few target-response pairs were discarded, resulting in high recall.
High precision was due to the open-ended nature of the task; responses usually fell under some relation, if not next-to.r.

           Baseline  Log-likel.  Log-odds  WN Path Sim.  Resnik  WN Avg. PW  WN Matrix Sim.
Precision  0.8934    0.8947      0.9048    0.9076        0.9085  0.9764      0.8795
Recall     1.0       1.0         0.8917    1.0           1.0     0.2659      1.0
23
Part-Whole Task
Rejected target-response pairs in the high-scoring set were often due to responses that named attributes, rather than parts, of the target item (e.g. croissant - flaky).
Approved pairs in the low-scoring set were mainly due to obvious, “common sense” responses that would usually be inferred rather than explicitly stated (e.g. bunny - brain).

           Baseline  Log-likel.  Log-odds  WN Path Sim.  Resnik  WN Avg. PW  WN Matrix Sim.
Precision  0.7887    0.7832      0.8231    0.7963        0.7974  0.8823      0.8935
Recall     1.0       0.4129      0.4622    1.0           1.0     0.2621      0.2367
24
Future Automatic Normalization Techniques
Computing word association measures on much larger corpora (e.g. Google’s 1-trillion-word corpus)
WordNet synonyms and hypernyms
Latent Semantic Analysis (LSA) to build word similarity matrices, as sketched below
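A toy sketch of the LSA idea, using scikit-learn’s TruncatedSVD. The three-sentence corpus, the dimensionality, and the library choice are illustrative assumptions, not a description of the planned work.

from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "the lizard basks in the desert sun",
    "a mural hangs in the gallery",
    "the desert lizard hides under a rock",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)        # documents x terms count matrix

# Low-rank SVD of the term-document matrix: each row of word_vectors is a
# word's embedding in the reduced "latent semantic" space.
svd = TruncatedSVD(n_components=2, random_state=0)
word_vectors = svd.fit_transform(X.T)     # terms x latent dimensions

vocab = list(vectorizer.get_feature_names_out())
similarity = cosine_similarity(word_vectors)   # the word similarity matrix
i, j = vocab.index("lizard"), vocab.index("desert")
print(f"sim(lizard, desert) = {similarity[i, j]:.2f}")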
25
In Summary…
WordsEye & the Scenario-Based Lexical Knowledge Resource (SBLR)
Amazon’s Mechanical Turk & our tasks
Manual normalization of AMT data
Automatic normalization techniques used on AMT data, and their results
Possible future automatic normalization methods
26
Thanks to…
Richard Sproat
Masoud Rouhizadeh
All the CSLU interns
27
Questions?