University of Alberta, June 30, 2007

Learning Noun Phrase Query Segmentation
Shane Bergsma and Qin Iris Wang
University of Alberta
EMNLP 2007
Query Segmentation
• Input: a search engine query
• Output: the query separated into phrases
• Goal: improve information retrieval
• Approach: supervised machine learning
  – a classifier makes segmentation decisions
• Conclusion: richer features allow for large increases in segmentation performance
Outline
1. Introduction
2. Segmentation as Classification
3. Features
4. Data and Experiments
5. Results
Growth of the Web
[Chart: total sites across all domains, August 1995 - June 2007]
Source: Netcraft June 2007 Web Server Survey
Query Segmentation
• Matching tokens is not sufficient
• Need better strategies for interpreting queries
• Example query:
  – two man power saw
• Interpretation using phrases:
  – “man power”? “power saw”?
Query Segmentation
Unsegmented: two man power saw
“two” “man” “power” “saw”
Query Segmentation
“two man” “power saw”
Query Segmentation
“two” “man” “power saw”
Query Segmentation
• Improves precision
• Can also help with recall:
  – First step in query substitution / expansion:
  – “two man” “power saw” to: “two person” “power saw”
  – Unsegmented: “two” “man” “power” “saw” to: “three” “woman” “authority” “witnessed”
Query Segmentation
• How to segment?
  – Link tokens with high statistical association
• Jones et al. (WWW 2006) use the Mutual Information (MI):
  – MI(x,y) = Pr(x,y) / (Pr(x) Pr(y))
• Link tokens x and y if their MI > threshold
• For an N-token query, there are 2^(N-1) possible segmentations
Query Segmentation
• Similar to Noun Compound Bracketing
  – forms a binary tree (bracketing) over tokens
  – [used [car parts]] or [[used car] parts]
  – In principle, more bracketings than segmentations
• Our goal:
  – Apply the bracketing statistics used in Nakov & Hearst (CoNLL 2005) to query segmentation
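The "in principle" can be made concrete: an n-token phrase has Catalan(n-1) binary bracketings but only 2^(n-1) segmentations, so bracketings outnumber segmentations once n reaches 6. A small sketch:

```python
from math import comb

def num_bracketings(n):
    """Binary bracketings of an n-token phrase: the (n-1)th Catalan
    number, Catalan(k) = C(2k, k) / (k + 1) with k = n - 1."""
    return comb(2 * (n - 1), n - 1) // n

def num_segmentations(n):
    """Each of the n-1 token gaps independently gets a break or not."""
    return 2 ** (n - 1)

for n in range(2, 8):
    print(n, num_bracketings(n), num_segmentations(n))
```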
Segmentation as Classification
• Our approach:
  – turn query segmentation into classification
  – discriminatively learn a classifier to make segmentation decisions
• Benefits:
  – allows a large number of possibly overlapping features
  – adapts to the training data / task of interest
Segmentation as Classification
“two man” “power saw”

Each gap between adjacent tokens becomes one instance for a Support Vector Machine (+ = segmentation break, - = no break), mapped to a feature vector:
  - two man      <1,0,0,1,…>
  + man power    <0,0,1,1,…>
  - power saw    <0,1,0,1,…>
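A minimal sketch of how a query plus its gold segmentation yields one training instance per token gap (feature extraction omitted; the +1/-1 labels follow the slide's +/- convention):

```python
def gap_instances(tokens, break_after):
    """One binary instance per gap between adjacent tokens:
    +1 = segmentation break at that gap, -1 = no break."""
    instances = []
    for i, (x, y) in enumerate(zip(tokens, tokens[1:])):
        label = +1 if i in break_after else -1
        instances.append((label, (x, y)))
    return instances

# "two man | power saw": the only break is at gap 1 ("man" / "power")
print(gap_instances(["two", "man", "power", "saw"], break_after={1}))
# → [(-1, ('two', 'man')), (1, ('man', 'power')), (-1, ('power', 'saw'))]
```

Each (x, y) pair would then be mapped to a feature vector and fed to the SVM.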
Segmentation as Classification
[Diagram: the decision is whether to place a segment boundary between adjacent tokens X and Y in the query … X Y …]
Features
• Basic Features
  MI(x,y) = Pr(x,y) / (Pr(x) Pr(y))
  log MI(x,y) = log Pr(x,y) - log Pr(x) - log Pr(y)
              = log C(x,y) - log C(x) - log C(y) + normalizer
• Can use the components separately:
  < log C(x,y), log C(x), log C(y) >
  called the Basic features
• Use counts from a search engine
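A sketch of the Basic feature vector computed from raw counts; the add-one smoothing is an assumption here (to avoid log 0 for unseen pairs), not something stated on the slide:

```python
import math

def basic_features(count_xy, count_x, count_y):
    """The three Basic features, used separately by the classifier
    rather than combined into a single MI score."""
    return (math.log(count_xy + 1),   # log co-occurrence count
            math.log(count_x + 1),    # log unigram count of x
            math.log(count_y + 1))    # log unigram count of y
```

Keeping the three log counts as separate features lets the learner weight them individually instead of fixing the MI combination.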
Indicator Features

Name              Description
Is-the            Token x, y = “the”
Is-free           Token x, y = “free”
POS-tags          Part-of-speech tags of x, y
Forward position  Position from beginning of query
Reverse position  Position from end of query
Statistical Features

Name       Description
Definite   Count of “the x y”
Collapsed  Count of “xy”
And-count  Count of “x and y”
Genitive   Count of “x’s y”
Query-DB   Counts of x, “x y” in the AOL database
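The statistical features reduce to counting a few surface patterns. A sketch that just generates the pattern strings one would look up (the counting itself, against a search engine or query log, is outside this snippet):

```python
def statistical_patterns(x, y):
    """Surface patterns whose counts serve as statistical features
    for the token pair (x, y), following the table above."""
    return {
        "Definite":  f"the {x} {y}",
        "Collapsed": f"{x}{y}",
        "And-count": f"{x} and {y}",
        "Genitive":  f"{x}'s {y}",
    }

print(statistical_patterns("power", "saw"))
```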
Example
• star wars weapons guns
  – star wars: high counts of “star wars”, “starwars”, but low “star and wars”
  – weapons guns: lower “weapons guns”, low “weaponsguns”, high “weapons and guns”
• Positively weighted and negatively weighted features work together.
Summary of Feature Spans
[Diagram, built up over three slides, over tokens X1 X2 X3 X4 X5 X6: the Boundary feature spans the candidate token pair; Context features span the neighbouring pairs on either side; Dependency features span pairs of tokens separated by one token]
Context Features
• Consider the segmentation decision between “loan” and “amortization” in:
  bank loan amortization schedule
• Might want to consider the association of “bank” and “loan” as well.
• Get pairwise features with the left and right neighbours, plus trigram, fourgram, and fivegram features, if available.
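A sketch of which spans would feed the decision at a given gap. As an illustration (not the paper's exact feature set), this assumes the n-grams must span the boundary under consideration:

```python
def context_spans(tokens, i):
    """Spans informing the boundary decision between tokens[i] and
    tokens[i+1]: the boundary pair, the neighbouring pairs, and
    3- to 5-grams that cover the gap, where available."""
    spans = [tokens[i:i + 2]]                  # boundary pair
    if i > 0:
        spans.append(tokens[i - 1:i + 1])      # left context pair
    if i + 2 < len(tokens):
        spans.append(tokens[i + 1:i + 3])      # right context pair
    for n in (3, 4, 5):                        # n-grams spanning the gap
        for start in range(max(0, i - n + 2), i + 1):
            gram = tokens[start:start + n]
            if len(gram) == n:
                spans.append(gram)
    return spans

# Decision between "loan" and "amortization" (gap index 1)
print(context_spans(["bank", "loan", "amortization", "schedule"], 1))
```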
Data and Experiments
• Use queries from the AOL query database
  – queries with a click-URL: indicates the user’s intentions for the query
  – only noun phrase queries: tagged as determiners, adjectives, and nouns
  – only queries of length ≥ 4
  – 500 queries for training, 500 for development, 500 for testing
Data and Experiments
• Annotators were asked to segment queries so as to improve search precision
• Test set annotated by three annotators
• Agreement on segmentation decisions around 84%, lower than we expected
• More details in the paper
Results
[Bar chart: segmentation accuracy (Seg-Acc) and whole-query accuracy (Qry-Acc), y-axis 0% to 80%, for seven systems: the MI baseline; Basic features with Boundary, +Context, and +Dependency spans; and All features with Boundary, +Context, and +Dependency spans]
Conclusion
• Proposed a new approach to query segmentation that allows richer features
• Reduces error by 56% over the comparison system
• Future work: train query segmentation (or query expansion / contraction) to directly optimize information retrieval performance
Thanks
Dependency Features
• Consider the segmentation decision between “female” and “bus” in:
  female bus driver
• There is a stronger association between “female” and “driver” than between “female” and “bus”, which might be useful
• Include features between pairs of tokens separated by a token.
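A sketch of the dependency pairs for a given gap, i.e. the token pairs separated by one token that can inform the decision:

```python
def dependency_pairs(tokens, i):
    """Token pairs separated by one token, relevant to the boundary
    decision between tokens[i] and tokens[i+1]."""
    pairs = []
    if i >= 1:
        pairs.append((tokens[i - 1], tokens[i + 1]))  # skips tokens[i]
    if i + 2 < len(tokens):
        pairs.append((tokens[i], tokens[i + 2]))      # skips tokens[i+1]
    return pairs

# Boundary between "female" and "bus" (gap index 0): the pair
# (female, driver) skips "bus" and captures the stronger association.
print(dependency_pairs(["female", "bus", "driver"], 0))
# → [('female', 'driver')]
```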