Better Search Through Query Understanding
Recruiting Solutions
better search through query understanding
Daniel Tunkelang
Head, Query Understanding
overview
query understanding: what is it?
how we do query understanding at LinkedIn
some other thoughts from search in the wild
what I’m not going to cover:
2
bird’s-eye view of how a search engine works
user: information need → query → select from results
system: rank using IR model (tf-idf, PageRank)
3
query understanding
user: information need → query → select from results
system: rank using IR model (tf-idf, PageRank)
4
search is a communication problem
5
6
tag: skill OR title
related skills: search, ranking, …
tag: company
id: 1337
industry: internet
verticals: people, jobs
intent: exploratory
7
query understanding pipeline
raw query → spellcheck → query tagging → vertical intent prediction → query expansion → structured query + annotations
8
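A minimal sketch of how the four stages could be chained, assuming each stage takes the query state and returns it with more annotations. The `QueryState` class, the stage functions, and all the toy lookup tables are hypothetical illustrations, not the LinkedIn implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class QueryState:
    raw: str    # the query exactly as typed
    text: str   # the current (possibly corrected) query text
    annotations: Dict[str, object] = field(default_factory=dict)


Stage = Callable[[QueryState], QueryState]


def spellcheck(state: QueryState) -> QueryState:
    fixes = {"manger": "manager"}  # toy correction table (assumption)
    before = state.text
    state.text = " ".join(fixes.get(tok, tok) for tok in before.split())
    state.annotations["corrected"] = state.text != before
    return state


def tag_query(state: QueryState) -> QueryState:
    titles = {"product manager"}  # toy title dictionary (assumption)
    state.annotations["tags"] = ["TITLE"] if state.text in titles else []
    return state


def predict_vertical(state: QueryState) -> QueryState:
    # Toy rule (assumption): a title query is split between people and jobs.
    is_title = "TITLE" in state.annotations.get("tags", [])
    state.annotations["verticals"] = (
        {"people": 0.5, "jobs": 0.5} if is_title else {"people": 1.0}
    )
    return state


def expand_query(state: QueryState) -> QueryState:
    synonyms = {"software engineer": ["software developer"]}  # toy table
    state.annotations["expansions"] = synonyms.get(state.text, [])
    return state


PIPELINE: List[Stage] = [spellcheck, tag_query, predict_vertical, expand_query]


def understand(raw_query: str) -> QueryState:
    """Run the raw query through every stage, in order."""
    state = QueryState(raw=raw_query, text=raw_query.lower().strip())
    for stage in PIPELINE:
        state = stage(state)
    return state
```

Calling `understand("product manger")` yields the corrected text `product manager` plus the annotations each stage attached along the way.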
spelling correction
fix obvious typos
help users spell names
10
spelling out the details
PEOPLE NAMES, COMPANIES, TITLES, PAST QUERIES
n-grams: marissa => ma ar ri is ss sa
metaphone: mark/marc => MRK
co-occurrence counts: marissa:mayer = 1000
marisa meyer yahoo => {marissa, marisa} {meyer, mayer} {yahoo}
11
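A rough sketch of how character n-grams and co-occurrence counts could work together: bigram overlap proposes candidates for each token, and co-occurrence counts pick the best combination. The function names, the 0.3 overlap threshold, and the toy vocabulary are assumptions; only the `marissa:mayer = 1000` count comes from the slide, and the metaphone signal is left out.

```python
from itertools import product
from typing import Dict, List, Tuple


def bigrams(word: str) -> List[str]:
    """Character bigrams, e.g. marissa => ma ar ri is ss sa."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def bigram_similarity(a: str, b: str) -> float:
    """Jaccard overlap between the bigram sets of two words."""
    sa, sb = set(bigrams(a)), set(bigrams(b))
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def candidates(token: str, vocabulary: List[str], threshold: float = 0.3) -> List[str]:
    """Vocabulary words whose bigrams overlap enough with the token."""
    return [w for w in vocabulary if bigram_similarity(token, w) >= threshold]


def best_correction(
    query: str,
    vocabulary: List[str],
    cooccurrence: Dict[Tuple[str, str], int],
) -> str:
    """Pick the candidate sequence with the highest adjacent co-occurrence."""
    # Fall back to the token itself when nothing in the vocabulary is close.
    options = [candidates(tok, vocabulary) or [tok] for tok in query.split()]

    def score(seq: Tuple[str, ...]) -> int:
        return sum(cooccurrence.get(pair, 0) for pair in zip(seq, seq[1:]))

    return " ".join(max(product(*options), key=score))


vocabulary = ["marissa", "marisa", "meyer", "mayer", "yahoo"]
cooccurrence = {("marissa", "mayer"): 1000}
print(best_correction("marisa meyer yahoo", vocabulary, cooccurrence))
# prints "marissa mayer yahoo"
```

Enumerating every combination is fine for a three-word query; a longer query would call for a dynamic-programming search over the same candidate lattice.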
spelling out the details
problem: corpus as well as query logs contain many spelling errors
certain spelling errors are quite frequent
while genuine words (especially names) might be infrequent
12
spelling out the details
problem: corpus & query logs contain spelling errors
solution: use query chains to infer correct spelling
[product manger] → [product manager] → CLICK
[marissa mayer] → CLICK
13
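One way to read the query-chain idea: when a query gets no click and the very next query in the session does, treat the pair as evidence that the second is a correction of the first. The sketch below assumes a simple session format of `(query, clicked)` pairs and a minimum-support cutoff; both are illustrative assumptions, not details from the talk.

```python
from collections import Counter
from typing import Dict, List, Tuple

Session = List[Tuple[str, bool]]  # (query, clicked) pairs in time order


def mine_rewrites(sessions: List[Session]) -> Counter:
    """Count (unclicked query -> next clicked query) pairs across sessions."""
    counts: Counter = Counter()
    for session in sessions:
        for (q1, clicked1), (q2, clicked2) in zip(session, session[1:]):
            if not clicked1 and clicked2 and q1 != q2:
                counts[(q1, q2)] += 1
    return counts


def correction_table(counts: Counter, min_support: int = 2) -> Dict[str, str]:
    """Keep, for each source query, its most frequent well-supported rewrite."""
    best: Dict[str, Tuple[str, int]] = {}
    for (src, dst), n in counts.items():
        if n >= min_support and n > best.get(src, ("", 0))[1]:
            best[src] = (dst, n)
    return {src: dst for src, (dst, _) in best.items()}


sessions = [
    [("product manger", False), ("product manager", True)],
    [("product manger", False), ("product manager", True)],
    [("marissa mayer", True)],  # clicked right away: no rewrite evidence
]
table = correction_table(mine_rewrites(sessions))
```

Requiring the second query to end in a click is what separates a correction from a user simply changing topic.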
query tagging: identifying entities in the query
TITLE CO GEO
TITLE-237: software engineer, software developer, programmer, …
CO-1441: Google Inc.; Industry: Internet
GEO-7583: Country: US; Lat: 42.3482 N, Long: 75.1890 W
(RECOGNIZED TAGS: NAME, TITLE, COMPANY, SCHOOL, GEO, SKILL)
15
query tagging: identifying entities in the query
TITLE CO GEO
MORE PRECISE MATCHING WITH DOCUMENTS
16
entity-based filtering
[screenshots: search results BEFORE and AFTER entity-based filtering]
20
entity-based suggestions
22
query tagging: sequential model
TRAINING
EMISSION PROBABILITIES (learned from user profiles)
TRANSITION PROBABILITIES (learned from query logs)
23
query tagging: sequential model
INFERENCE
given a query, find the most likely sequence of tags
24
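Finding the most likely tag sequence under emission and transition probabilities is the classic Viterbi computation for a hidden Markov model. The sketch below assumes that formulation; the talk does not name the exact model, and every probability, the `floor` for unseen events, and the example query are made-up toy values.

```python
import math
from typing import Dict, List

Probs = Dict[str, Dict[str, float]]


def viterbi(
    tokens: List[str],
    tags: List[str],
    start: Dict[str, float],
    trans: Probs,
    emit: Probs,
    floor: float = 1e-6,
) -> List[str]:
    """Return the most likely tag sequence for the tokens (log-space Viterbi)."""
    if not tokens:
        return []

    def lp(p: float) -> float:
        return math.log(max(p, floor))  # floor stands in for unseen events

    # best[i][t]: log-probability of the best path ending in tag t at token i
    best = [{t: lp(start.get(t, 0.0)) + lp(emit[t].get(tokens[0], 0.0)) for t in tags}]
    back: List[Dict[str, str]] = []
    for tok in tokens[1:]:
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + lp(trans[p].get(cur, 0.0)))
            scores[cur] = (
                best[-1][prev] + lp(trans[prev].get(cur, 0.0)) + lp(emit[cur].get(tok, 0.0))
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Walk the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


TAGS = ["TITLE", "CO", "GEO"]
START = {"TITLE": 0.6, "CO": 0.3, "GEO": 0.1}
TRANS = {  # toy transition probabilities (would be learned from query logs)
    "TITLE": {"TITLE": 0.5, "CO": 0.4, "GEO": 0.1},
    "CO": {"TITLE": 0.2, "CO": 0.3, "GEO": 0.5},
    "GEO": {"TITLE": 0.1, "CO": 0.1, "GEO": 0.8},
}
EMIT = {  # toy emission probabilities (would be learned from user profiles)
    "TITLE": {"software": 0.4, "engineer": 0.5},
    "CO": {"google": 0.8, "software": 0.1},
    "GEO": {"boston": 0.7},
}
print(viterbi("software engineer google boston".split(), TAGS, START, TRANS, EMIT))
# prints ['TITLE', 'TITLE', 'CO', 'GEO']
```

Working in log space avoids underflow on longer queries, and the floor keeps a single unseen word from zeroing out an otherwise good path.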
vertical intent prediction: distribution
JOBS
PEOPLE
COMPANIES
(probability distribution over verticals)
26
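As a purely illustrative sketch of "a probability distribution over verticals": one simple estimate is to normalize past click counts per vertical for similar queries, with additive smoothing so that no vertical gets probability zero. The talk does not say how the distribution is actually computed; the click-count input, the `alpha` smoothing constant, and the function name are all assumptions.

```python
from typing import Dict, Sequence


def vertical_distribution(
    click_counts: Dict[str, int],
    verticals: Sequence[str] = ("people", "jobs", "companies"),
    alpha: float = 1.0,
) -> Dict[str, float]:
    """Turn per-vertical click counts into a smoothed probability distribution."""
    total = sum(click_counts.get(v, 0) for v in verticals) + alpha * len(verticals)
    return {v: (click_counts.get(v, 0) + alpha) / total for v in verticals}


# Hypothetical click counts for one class of query.
dist = vertical_distribution({"jobs": 7, "people": 2})
top_vertical = max(dist, key=dist.get)
```

Keeping the full distribution, rather than only the top vertical, leaves room to blend results from several verticals when the intent is ambiguous.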
vertical intent prediction: relevance
[company]
[employees]
[jobs]
[name search]
27
query expansion: name synonyms
29
query expansion: job title synonyms
30
query expansion: signals
trained using query chains:
[jon] → [jonathan] → CLICK
[programmer] → [developer] → CLICK
[software engineer] → [software developer] → CLICK
symmetric but not transitive!
[francis] ⇔ [frank]
[franklin] ⇔ [frank]
[francis] ≠ [franklin]
context based!
[software engineer] => [software developer]
[civil engineer] ≠ [civil developer]
31
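The two properties above can be sketched with a small undirected synonym graph: adding a pair makes it symmetric, expanding only to direct neighbours keeps it non-transitive, and storing whole phrases rather than single words keeps it context based. The `SynonymGraph` class is a hypothetical illustration, not the production data structure.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class SynonymGraph:
    """Undirected synonym edges; expansion follows direct edges only."""

    def __init__(self) -> None:
        self._edges: DefaultDict[str, Set[str]] = defaultdict(set)

    def add(self, a: str, b: str) -> None:
        # Symmetric: the edge is stored in both directions.
        self._edges[a].add(b)
        self._edges[b].add(a)

    def expand(self, phrase: str) -> Set[str]:
        # Not transitive: return only direct neighbours, never neighbours
        # of neighbours.
        return {phrase} | self._edges.get(phrase, set())


graph = SynonymGraph()
graph.add("francis", "frank")
graph.add("franklin", "frank")
# Context based: the edge is between whole phrases, so "engineer" alone is
# never rewritten to "developer".
graph.add("software engineer", "software developer")

print(sorted(graph.expand("frank")))
# prints ['francis', 'frank', 'franklin']
print(sorted(graph.expand("civil engineer")))
# prints ['civil engineer']
```

Expanding `francis` returns `frank` but not `franklin`, which is exactly the non-transitive behaviour in the examples above.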
what else can we learn from search in the wild?
33
don’t guess when it’s better to ask
vs.
34
clarify then refine
computers books
35
give users transparency, guidance, and control
36
think beyond individual search queries
Gene Golovchinsky, FXPAL
37
know when you don’t know
Claudia Hauff, Query Difficulty for Digital Libraries [2009]