CrowdQ: A Search Engine with Crowdsourced Query Understanding
Daniel Bruckner, Daniel Haas, Jonathan Harper
http://ec2-50-16-103-42.compute-1.amazonaws.com:8001/
ARCHITECTURE
[Architecture diagram: a user's keyword query enters on-line complex query processing, where a complex-query classifier either falls back to vertical selection and unstructured search, or matches the query (after POS + NER tagging) against existing query templates in the Query Template Index. Matched templates drive structured LOD search over the Linked Open Data cloud, and a result joiner composes answers into the SERP. Off-line, complex query decomposition mines the query log, and a crowd manager posts template-generation tasks (templates + answer types) to a crowdsourcing platform.]
QUERY TEMPLATES
• Templates represent many similar queries.
• Given a 1-Hop query, we generalize it by abstracting the source entity.
• Example: the 1-Hop for “capital of Canada” uses “Canada” as its source node. We generalize “Canada” to <political entity>, and can now answer queries about the capital of any political entity.
• Challenge: Correctness of templates is hard to ensure. Often templates are too broad or too specific.
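The generalization step above can be sketched as follows (the dict representation and function names are illustrative, not the poster's actual code):

```python
def generalize(one_hop: dict, source_type: str) -> dict:
    """Abstract the concrete source entity of a 1-Hop into a typed slot."""
    return {"source_type": source_type, "predicate": one_hop["predicate"]}

def instantiate(template: dict, entity: str, entity_type: str):
    """Fill a template's slot with a new entity of the matching type."""
    if entity_type != template["source_type"]:
        return None  # type mismatch: the template does not cover this entity
    return {"source": entity, "predicate": template["predicate"]}

# "capital of Canada" generalizes to "capital of <political entity>" ...
template = generalize({"source": "Canada", "predicate": "capital"},
                      "political entity")
# ... which now also answers "capital of France"
assert instantiate(template, "France", "political entity") == \
    {"source": "France", "predicate": "capital"}
```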
MOTIVATION
Search engines have begun providing direct answers to web search queries, but there is a long tail of less common queries that cannot be answered this way.
• 97% of unique queries occur 10 or fewer times.
• State-of-the-art NLP techniques are not reliable enough to answer these queries.
• Crowds have been used to gather answers, but this approach is expensive, on the order of $0.50 per query.
• Meanwhile, large open data sets like DBpedia already contain many answers, but crowd input is needed to understand queries and map them onto these databases.
Challenge: Understanding arbitrary query semantics is hard.
Solution: Focus on a subset of queries with common semantics.
CROWD INTERFACE
Example relationship-extraction HIT interface.
WEB INTERFACE
Our search engine UI. Results (center) are not web pages but direct answers. Structured data about the query is shown at left, and alternative interpretations of the query are displayed at right as a fallback. The UI achieves interactive latencies.
1-HOP SEMANTICS
Key Abstraction: a 1-Hop encapsulates a single semantic jump, e.g., “Beatles live albums” or “capital of Canada”.
• Source: a known named entity in the query (“Beatles”)
• Answer: an entity linked directly to the source (an album)
• Filter: a predicate the answer must match (the “live album” type)
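A 1-Hop maps naturally onto a structured query over Linked Open Data. A minimal sketch, assuming DBpedia-style prefixes (the class and its rendering are illustrative, not the system's actual code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class OneHop:
    """A single semantic jump: source entity -> predicate -> answer."""
    source: str                        # known named entity, e.g. "dbr:Canada"
    predicate: str                     # edge to follow, e.g. "dbo:capital"
    answer_type: Optional[str] = None  # optional filter, e.g. "dbo:Album"

    def to_sparql(self) -> str:
        """Render the 1-Hop as a SPARQL SELECT over a LOD graph."""
        lines = ["SELECT ?answer WHERE {",
                 f"  {self.source} {self.predicate} ?answer ."]
        if self.answer_type:
            lines.append(f"  ?answer rdf:type {self.answer_type} .")
        lines.append("}")
        return "\n".join(lines)

# "capital of Canada": source = Canada, predicate = capital, no filter
print(OneHop("dbr:Canada", "dbo:capital").to_sparql())
```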
Answer candidate graphs are used to generate English sentences. Mechanical Turk assignments are generated to ask the crowd for the best query interpretation.
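Turning candidate interpretations into a crowd task can be sketched roughly like this (the question wording and option format are hypothetical, not the actual HIT template):

```python
def hit_question(keyword_query: str, candidates: list) -> str:
    """Render candidate 1-Hop interpretations of a keyword query as an
    English multiple-choice question for a Mechanical Turk worker."""
    options = [f"  {i}. the {c['predicate']} of {c['source']}"
               for i, c in enumerate(candidates, start=1)]
    return (f'Which option best matches the query "{keyword_query}"?\n'
            + "\n".join(options)
            + "\n  0. none of the above")

print(hit_question("capital canada", [
    {"source": "Canada", "predicate": "capital"},
    {"source": "Canada Dry", "predicate": "parent company"},
]))
```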
EVALUATION
Data. DBpedia (general, dirty) and MusicBrainz (narrow, clean). Interestingly, it is easier to produce templates for dirtier data sets, but those templates are less general.
Queries. 100+ queries from the QALD-2 benchmark. The 1-Hop abstraction applies to the majority of QALD queries (29 DBpedia, 73 MusicBrainz).
Candidate generation. Text search on DBpedia finds candidate 1-Hops for 62% of test queries.
Crowd Efficiency. How efficient is the crowd? We posted 252 tasks on Mechanical Turk, costing $0.84 per template. The crowd was 66.7% accurate in answering keyword queries. We evaluated two interfaces, multi-select and single-select, and found that accuracy was the same in both approaches.
Template Coverage. How useful are our templates?
Template Performance. Do we achieve interactive latencies?
[Histogram: Fraction of Templates (y-axis, 0–0.3) vs. Relevant Entities in Template, lower bound shown (x-axis, 1–100,000, log scale)]
Example Templates. We measure generality by how many source entities a template's 1-Hops match (distribution at right).

Query                    | Size    | Comment
Actors in <Top Gun>      | 8,642   | Good!
<Maribor> population     | 164,329 | Great!
members of <The Prodigy> | 3       | Too narrow
German Shepherd breeds   | 659,430 | Too general
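A size-based sanity check on templates could be sketched as follows (the thresholds are illustrative assumptions, not values from the poster):

```python
def classify_template(size: int,
                      min_entities: int = 10,
                      max_entities: int = 500_000) -> str:
    """Flag templates whose 1-Hops match too few source entities
    (too narrow) or too many (too general) to be useful."""
    if size < min_entities:
        return "too narrow"
    if size > max_entities:
        return "too general"
    return "ok"

# Checked against the example templates above
assert classify_template(3) == "too narrow"         # members of <The Prodigy>
assert classify_template(659_430) == "too general"  # German Shepherd breeds
assert classify_template(8_642) == "ok"             # Actors in <Top Gun>
```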
FUTURE WORK
• Improve candidate template generation with NLP tools like stemmers and WordNet
• Extend the 1-Hop abstraction to support more complex queries
• Augment quality controls for data and templates, e.g., by adding verification to the crowd pipeline
• Build an ML model to enable complex template matching
• Optimize the crowd interface performance and apply it to additional sub-problems
• Run on larger query logs (requires entity extraction!)
Charts at right show the latency distribution for a randomized 10K-query benchmark; the client was local in the left chart. Average latency for local requests is 26 ms, and the maximum observed is 240 ms.
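A harness for this kind of latency measurement could look like the sketch below (`run_query` stands in for the real template-matching endpoint, which is an assumption here):

```python
import statistics
import time

def benchmark(run_query, queries):
    """Time each query and summarize the latency distribution in ms."""
    latencies_ms = []
    for q in queries:
        start = time.perf_counter()
        run_query(q)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return {"avg_ms": statistics.mean(latencies_ms),
            "max_ms": max(latencies_ms)}

# Illustrative stand-in query function; a real run would hit the server
stats = benchmark(lambda q: None, ["capital of Canada"] * 1000)
```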