Query Recommendation
Xiaofei Zhu ([email protected])
L3S Research Center, Leibniz Universität Hannover
Introduction
Ambiguous (e.g., Java)
Lack of domain knowledge
Short (1-2 words)
Query Recommendation
It aims to provide users with alternative queries that represent their information needs more clearly, so that the search engine returns better results.
original query
recommendation
Query Recommendation
How to do query recommendation? Find alternative queries with similar search intent. How does this differ from recommending documents or images?
19.04.2023 4
Query log
Query log. A query log records information about the search actions of the users of a search engine.
A typical query log is a set of records <qi, ui, ti, Vi, Ci>:
qi – the submitted query
ui – an anonymized identifier for the user who submitted the query
ti – the timestamp, i.e., the time at which the query was submitted for search
Vi – the set of results returned for the query
Ci – the set of documents clicked by the user
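The record structure above can be sketched as a small data type. This is only an illustration: the class name and field names are hypothetical, and the sample values are taken from the AOL example in this deck.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class QueryLogRecord:
    """One query-log record <q_i, u_i, t_i, V_i, C_i> (hypothetical field names)."""
    query: str                                         # q_i: the submitted query
    user_id: str                                       # u_i: anonymized user identifier
    timestamp: datetime                                # t_i: submission time
    results: List[str] = field(default_factory=list)   # V_i: returned results
    clicks: List[str] = field(default_factory=list)    # C_i: clicked documents


record = QueryLogRecord(
    query="motorola text messages",
    user_id="7051923",
    timestamp=datetime(2006, 3, 24, 19, 35, 31),
    clicks=["http://www.telusmobility.com", "http://support.t-mobile.com"],
)
```

Public logs such as the AOL sample usually omit the returned result set, which is why `results` defaults to an empty list here.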
Example of query log (AOL, 2006)
AnonID   Query                        QueryTime            ItemRank  ClickURL
7051923  motorola text messages       2006-03-24 19:35:31  1         http://www.telusmobility.com
7051923  motorola text messages       2006-03-24 19:35:31  4         http://support.t-mobile.com
7051923  motorola t730 text messages  2006-03-24 19:38:40  2         http://www.phonescoop.com
7051923  motorola t730 text messages  2006-03-24 19:38:40  3         http://www.1800mobiles.com
7051923  motorola t730 text messages  2006-03-24 19:38:40  5         http://cgi.ebay.com
7051923  motorola t730 text messages  2006-03-24 19:38:40  7         http://phonearena.com
7051923  spike muscle car             2006-03-25 12:57:43  2         http://www.classicauto-sales.com
7051923  spike muscle car             2006-03-25 12:57:43  5         http://sev.prnewswire.com
7051923  spike muscle car             2006-03-25 13:00:22
7051923  usps                         2006-03-25 14:23:21  1         http://www.usps.com
7051923  vc2 auctions                 2006-03-25 14:31:41
7051923  auctions for 1               2006-03-25 14:33:47
Microsoft 2006 RFP dataset
QueryID           Query                        Time                 URL                                               Position
0000003a718649f2  schwab                       2006-05-11 08:07:35  http://www.schwab.com/                            1
0000006d43b549c1  us geography                 2006-05-04 14:23:00  http://www.enchantedlearning.com/usa/             3
0000006d43b549c1  us geography                 2006-05-04 14:23:03  http://www.sheppardsoftware.comState15s_500.html  4
0000016aa52e4fbc  wwf                          2006-05-21 09:25:34  http://www.panda.org/                             2
000002aa6e27443f  biggercity                   2006-05-07 13:30:45  http://www.biggercity.com/chat/                   11
000005aac1f6423f  studios                      2006-05-09 14:21:29  http://www.shawneestudios.com/contact_us.php      11
000008d8afaa459a  www.nfl.com                  2006-05-28 18:22:39  http://www.nfl.com/teams/NYJ.html                 77
000009c2848e4a68  north hills school district  2006-05-04 12:29:12  http://www.nhsd.net/                              1

Time                 Query                       QueryID           SessionID         ResultCount
2006-05-01 00:00:01  defination Gravitational    46c13f0705f6436b  19ab975e898d46d1  11
2006-05-01 00:00:01  kimclement                  a3d2cae45e2b4c5b  1b748d1afa9b4828  10
2006-05-01 00:00:01  scientology crazy beliefs   418324ef33d14ed2  10f477402db84c9a  10
2006-05-01 00:00:01  www.joj.sk                  489238bdf8834d68  16271eb6bf174c5c  9
2006-05-01 00:00:04  www.selectcareers.com       f92efd8044904ac4  193f9f8442d44c48  0
2006-05-01 00:00:08  What is May Day?            37afe7af832649d2  21f6a0dfea4348ac  14
2006-05-01 00:00:10  vikings draft choices suck  b0519e4528d84b44  196b0bb2f1d643f2  10
2006-05-01 00:00:10  wwwcrownawards.com          9eda4716dfb045e2  04e3a26067a84748  0
2006-05-01 00:00:15  Australian miners           ba6d190cc4cd4fd3  136fd5e571d24886  10
Click-through data
• Click-through data records the documents a user clicks after submitting a query to the search engine.
Query Feature Representation
Basic Assumption
If a user clicks a document after she issues a query, then the clicked document is more or less relevant to the submitted query; thus the query can be represented by its clicked documents.
Query-URL Graph
If two queries share many co-clicked documents, then they have similar search intent.
[Beeferman, KDD’00][Mei, CIKM’08]
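A minimal sketch of the co-click idea, assuming the click-through data is available as (query, URL) pairs. The Jaccard overlap used here is one simple choice of similarity and is not taken from the cited papers; the function names and the toy click log are hypothetical.

```python
from collections import defaultdict


def clicked_urls(click_log):
    """Map each query to the set of URLs clicked for it."""
    urls = defaultdict(set)
    for query, url in click_log:
        urls[query].add(url)
    return urls


def coclick_similarity(urls, q1, q2):
    """Jaccard overlap of the clicked-URL sets of two queries (an assumed measure)."""
    a, b = urls[q1], urls[q2]
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Toy click log for illustration only.
click_log = [
    ("aa", "www.aa.com"),
    ("american airline", "www.aa.com"),
    ("aa", "www.theaa.com"),
    ("usps", "www.usps.com"),
]
urls = clicked_urls(click_log)
similarity = coclick_similarity(urls, "aa", "american airline")
```

Queries with no shared clicks, such as "aa" and "usps" in this toy log, get similarity 0.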
How to use query log for query recommendation?
Query Session
• Query session: a sequence of related queries submitted by a single user within a time interval for a specific search task.
Association Rules
Basic Assumption
If two queries frequently co-occur in the same sessions, then they are relevant to each other.
Query Graph
Queries submitted consecutively by the same user within a short time interval share similar search intent. [Fonseca, LA-WEB’03]
[Zhang, WWW’06][Boldi, CIKM’08, WSCD’09]
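The two session-based assumptions can be sketched as follows. The 30-minute gap used to cut sessions and the function names are assumptions for illustration; the slides do not specify how sessions are segmented.

```python
from collections import Counter
from itertools import combinations


def split_sessions(events, max_gap_seconds=1800):
    """Split one user's (time_in_seconds, query) events into sessions.

    A new session starts when the gap to the previous query exceeds
    max_gap_seconds (an assumed threshold).
    """
    sessions, current, last_time = [], [], None
    for t, query in sorted(events):
        if last_time is not None and t - last_time > max_gap_seconds:
            sessions.append(current)
            current = []
        current.append(query)
        last_time = t
    if current:
        sessions.append(current)
    return sessions


def cooccurrence_counts(sessions):
    """Count in how many sessions each unordered pair of distinct queries co-occurs."""
    counts = Counter()
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            counts[(a, b)] += 1
    return counts


# Times are seconds since the first query of user 7051923 in the AOL example.
events = [
    (0, "motorola text messages"),
    (189, "motorola t730 text messages"),
    (62532, "spike muscle car"),
]
sessions = split_sessions(events)
counts = cooccurrence_counts(sessions)
```

Pairs with high counts across many users' sessions would be the candidates for recommendation under the association-rule assumption.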
High Relevant Query Recommendation
Query Suggestion Using Hitting Time (CIKM’08): click-through data, query-URL bipartite graph
Query Suggestions Using Query-Flow Graphs (WSCD’09): session data, query-flow graph
Query Suggestion Using Hitting Time (CIKM’08)
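Query-URL Bipartite Graph
- Edges between V1 and V2
- No edge inside V1 or V2
- Edges are weighted
- e.g., V1 = queries; V2 = URLs
[Figure: weighted bipartite graph between V1 and V2; node i in V1 and node j in V2 are joined by an edge of weight 3]
Transition Probabilities
$p(i \to j) = \frac{w(i,j)}{d_i} = \frac{3}{3+7}$
$p(j \to i) = \frac{w(i,j)}{d_j} = \frac{3}{3+1}$
$p(i \to k) = \sum_{j \in V_2} \frac{w(i,j)}{d_i} \cdot \frac{w(k,j)}{d_j}$
where $d_i$ is the weighted degree of node i, i.e., the sum of the weights of its edges.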
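A small sketch of these transition probabilities, using the edge weights 3, 7 and 1 from the worked example. The dictionary layout and the node names "i", "j", "j2", "k" are hypothetical, and query and URL names are assumed not to collide.

```python
def degrees(weights):
    """Weighted degree of every node; weights maps (query, url) -> edge weight."""
    d = {}
    for (q, u), w in weights.items():
        d[q] = d.get(q, 0) + w
        d[u] = d.get(u, 0) + w
    return d


def p_query_to_url(weights, d, q, u):
    """p(q -> u) = w(q, u) / d_q."""
    return weights.get((q, u), 0) / d[q]


def p_url_to_query(weights, d, u, q):
    """p(u -> q) = w(q, u) / d_u."""
    return weights.get((q, u), 0) / d[u]


def p_query_to_query(weights, d, q, k):
    """Two-step probability q -> url -> k, summed over all URLs."""
    urls = {u for (_, u) in weights}
    return sum(
        p_query_to_url(weights, d, q, u) * p_url_to_query(weights, d, u, k)
        for u in urls
    )


# Query i has edges of weight 3 (to j) and 7 (to j2); URL j also has an edge of weight 1 to query k.
weights = {("i", "j"): 3, ("i", "j2"): 7, ("k", "j"): 1}
d = degrees(weights)
p_ij = p_query_to_url(weights, d, "i", "j")    # 3 / (3 + 7)
p_ji = p_url_to_query(weights, d, "j", "i")    # 3 / (3 + 1)
p_ik = p_query_to_query(weights, d, "i", "k")  # via URL j only
```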
Random Walk and Hitting Time
Hitting time: how long does it take to hit node a in a random walk starting at node b?
[Figure: example graph with nodes 1 to 5]
• Start at node 1
• Pick a neighbor i based on the transition probability (uniformly at random when the edges are unweighted)
• Move to i (t = 1)
• Continue (t = 2, ...)
If the random walk hits a node quickly, then it is close to the start node. This is what the hitting time measures.
Hitting time from i to A
[Figure: graph G in which node i moves to its neighbors j and k with probabilities P(i, j) and P(i, k); A is the target]
$h_i^A = 1 + P(i,j)\,h_j^A + P(i,k)\,h_k^A$
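The recurrence can be solved by fixed-point iteration. The sketch below assumes a dense transition matrix given as nested lists and sets the hitting time to 0 on the target set; the function name and the toy chain are illustrative, not the authors' implementation.

```python
def hitting_times(P, target, iterations=1000):
    """Approximate expected hitting times h[i] from every node i to the set `target`.

    P[i][j] is the transition probability from node i to node j.
    Uses h[i] = 1 + sum_j P[i][j] * h[j] outside the target and h[i] = 0 inside it.
    """
    n = len(P)
    h = [0.0] * n
    for _ in range(iterations):
        new_h = [0.0] * n
        for i in range(n):
            if i in target:
                continue  # already at the target
            new_h[i] = 1.0 + sum(P[i][j] * h[j] for j in range(n))
        h = new_h
    return h


# Toy chain 0 - 1 - 2 with target {2}: node 0 always moves to 1,
# node 1 moves to 0 or 2 with probability 1/2 each.
P = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0],
]
h = hitting_times(P, target={2})
```

For this chain the exact values are h[1] = 3 and h[0] = 4, which the iteration approaches.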
Generate Query Suggestion
[Figure: query-URL bipartite graph. Queries: aa, american airline, mexiana. URLs: www.aa.com, www.theaa.com/travelwatch/planner_main.jsp, en.wikipedia.org/wiki/Mexicana. Example edge weights: 300, 15]
• Construct a (kNN) subgraph from the query log data (with a predefined number of queries/URLs)
• Compute the transition probabilities p(i → j)
• Compute the hitting time $h_i^A$
• Rank candidate queries using $h_i^A$
Result: Query Suggestion
Query = ‘aa’
Yahoo             Live               Hitting time
aa route planner  aa route finder    alcoholics anonymous
aa route finder   aa route planner   automobile association
aa airlines       aa airlines        theaa
aa meetings       american airlines  american airlines
aa autoroute      aa meeting         american air
aa road map       aa road map        american airline ticket reservation
Session Data
Definition: the sequence of queries of one particular user within a specific time limit.
Query Suggestions Using Query-Flow Graphs (WSCD’09)
Query Graph
Z. Zhang and O. Nasraoui. Mining search engine query logs for query recommendation. In WWW, pages 1039–1040, 2006.
• This model accumulates many query sessions and adds up the similarity values of each recurring query pair.
• Similarity values are assigned both to two consecutive queries and to queries that are not neighbors in the same session.
Query-Flow Graph
P. Boldi, F. Bonchi, C. Castillo, D. Donato, A. Gionis, S. Vigna: “The query-flow graph: model and applications”. CIKM 2008.
Build Query-flow Graph
The key aspect of the construction of the query-flow graph is to define the weighting function w.
$r : E \to \mathbb{N}$ represents the number of times the transition was observed in the same search session.
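A sketch of building such a graph from sessions. The counts r follow the definition above; normalizing them per source query to obtain w is only one possible weighting and is an assumption here, as are the function name and the toy sessions.

```python
from collections import Counter, defaultdict


def build_query_flow_graph(sessions):
    """Return (r, w): transition counts and count-normalized edge weights.

    r[(q, q_next)] is the number of times q_next directly followed q in a session.
    w[(q, q_next)] = r[(q, q_next)] / sum of r over all edges leaving q (assumed weighting).
    """
    r = Counter()
    for session in sessions:
        for q, q_next in zip(session, session[1:]):
            r[(q, q_next)] += 1
    out_total = defaultdict(int)
    for (q, _), count in r.items():
        out_total[q] += count
    w = {(q, q_next): count / out_total[q] for (q, q_next), count in r.items()}
    return r, w


# Toy sessions for illustration.
sessions = [["q", "q1", "q3"], ["q", "q3", "q4"], ["q", "q4"]]
r, w = build_query_flow_graph(sessions)
```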
Query Recommendation
The query recommendation methods are based on the probability of being at a certain node after performing a random walk over a query graph.
Random Walk with Restart: a random surfer starts at the initial query q; at each step, the surfer
• with probability α, follows one of the outlinks from the current node
• with probability 1 − α, jumps back to q
Random Walk with Restart in matrix form:
$M = \alpha P + (1 - \alpha)\,\mathbf{1} e_q^T$
M: the transition matrix of the Markov chain
P: the row-normalized weight matrix of the query-flow graph
e_j: the vector whose j-th entry is 1 and whose other entries are 0
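A minimal power-iteration sketch of the random walk with restart. It assumes every row of P sums to 1 (no dangling nodes); the value of α, the iteration count and the toy matrix are arbitrary illustrative choices.

```python
def random_walk_with_restart(P, q, alpha=0.85, iterations=100):
    """Iterate x <- x M with M = alpha * P + (1 - alpha) * 1 e_q^T, starting at q."""
    n = len(P)
    x = [0.0] * n
    x[q] = 1.0
    for _ in range(iterations):
        nxt = [0.0] * n
        for i in range(n):
            for j in range(n):
                nxt[j] += alpha * x[i] * P[i][j]  # follow an outlink
        nxt[q] += (1.0 - alpha) * sum(x)          # jump back to q
        x = nxt
    return x


# Toy row-normalized query graph with three queries.
P = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [1.0, 0.0, 0.0],
]
scores = random_walk_with_restart(P, q=0)
```

Candidate queries other than q would then be ranked by their score.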
Random walks
Random walks on graphs correspond to Markov chains.
• The set of states S is the set of nodes of the graph G
• The transition probability matrix gives the probability that we follow an edge from one node to another
Definitions
Adjacency matrix A, transition matrix P
[Figure: example graph shown with edge weights 1 (adjacency matrix A) and with transition probabilities 1, 1/2, 1/2, 1 (transition matrix P), followed by snapshots of a random walk on it at t = 0, 1, 2, 3]
Probability Distributions
$x_t(i)$ = probability that the surfer is on node i at time t
$x_{t+1}(i) = \sum_j (\text{probability of being at node } j) \cdot \Pr(j \to i) = \sum_j x_t(j)\,P(j,i)$
$x_{t+1} = x_t P = x_{t-1} P^2 = x_{t-2} P^3 = \dots = x_0 P^{t+1}$
What happens when the surfer keeps walking for a long time?
Stationary Distribution
Intuitively, the stationary distribution at a node is related to the amount of time a random walker spends visiting that node.
Mathematically, recall that the probability distribution evolves as $x_{t+1} = x_t P$. For the stationary distribution $v_0$ we have
$v_0 = v_0 P$
so $v_0$ is a left eigenvector of the transition matrix P (with eigenvalue 1).
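The stationary distribution can be approximated by repeatedly applying $x_{t+1} = x_t P$. The sketch below uses a small two-state chain chosen for illustration; it assumes the chain is ergodic so that the iteration converges.

```python
def stationary_distribution(P, iterations=1000):
    """Approximate v with v = v P by power iteration from the uniform distribution."""
    n = len(P)
    x = [1.0 / n] * n
    for _ in range(iterations):
        x = [sum(x[j] * P[j][i] for j in range(n)) for i in range(n)]
    return x


# Toy ergodic chain; solving v = v P by hand gives v = (1/3, 2/3).
P = [
    [0.5, 0.5],
    [0.25, 0.75],
]
v = stationary_distribution(P)
```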
Interesting questions
Does a stationary distribution always exist? Is it unique? Yes, if the graph is “well-behaved”, i.e., P is ergodic.
P is ergodic if it is irreducible and aperiodic.
Irreducible: there is a path from every node to every other node.
[Figure: an irreducible graph next to a graph that is not irreducible]
Aperiodic: state i is periodic with period k > 1 if every path from i back to i has a length that is a multiple of k; otherwise it is aperiodic.
[Figure: an aperiodic graph next to a graph with periodicity 3]
If a Markov chain P is irreducible and aperiodic, then the largest eigenvalue of the transition matrix is equal to 1 and all the other eigenvalues are strictly less than 1 in absolute value. Let the eigenvalues of P be {σi | i = 0, …, n−1}, in non-increasing order of |σi|:
$\sigma_0 = 1 > |\sigma_1| \ge |\sigma_2| \ge \dots \ge |\sigma_{n-1}|$
Result: Query Suggestion (q = “apple” and q = “jeep”)
Why Diversity Query Recommendation
In query recommendation, providing only “relevant” recommendations is far from enough to satisfy users’ information needs.
Original Query: Apple
apple ipad 3
apple iphone 4s
apple tree
apple seed
apple computer
⁞
The queries we recommend should cover multiple potential search intents of users and minimize the risk that users will not be satisfied.
High Diversity Query Recommendation
Diversifying Query Suggestion Results [Hao Ma, AAAI’10]: query-URL graph, hitting time
A Unified Framework for Recommending Diverse and Relevant Queries [Xiaofei Zhu, WWW’11]: manifold, manifold ranking with stop points
Graph Construction
Figure 1: Example for Bipartite Graph (extracted from the clickthrough data)
Determining the First Suggested Query
Initial Transition Probability
The initial transition probability from node i to node j is the click frequency between node i and node j divided by a normalization term, namely the total number of times that query node i has been issued in the dataset.
Determining the First Suggested Query
Random Jump
In addition to the transition probability, there are random relations among different queries, so a uniform random relation among different queries is added.
• The random-jump parameter is the probability of taking a “random jump”, i.e., of transiting among different queries.
• Without any prior knowledge, the jump is governed by d, a uniform stochastic distribution vector.
Determining the First Suggested Query
Random Walk on the Query-URL Graph
With the transition probability matrix P defined, a random walk can be performed on the query-URL graph. The probability of transition from node i to node j after a t-step random walk is $P^t(i, j)$.
Explanation:
1) The random walk sums the probabilities of all paths of length t between the two nodes; if there are many paths, the transition probability will be high.
2) The larger the transition probability $P^t(i, j)$, the more similar node j is to node i.
Determining the First Suggested Query
After performing a t-step random walk, the query with the largest transition probability from node q is recommended as the first suggested query.
The parameter t determines the resolution of the Markov random walk:
• Large t: the random walk depends more on the graph structure
• Small t: the random walk preserves information about the starting node
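A sketch of picking the first suggestion with a t-step walk. It ignores the random jump for brevity, and the node numbering, function names and toy graph are hypothetical. On a bipartite query-URL graph the walk only sits on query nodes after an even number of steps, so t is taken to be even here.

```python
def t_step_distribution(P, start, t):
    """Row `start` of P^t: the distribution after a t-step walk from `start`."""
    n = len(P)
    x = [0.0] * n
    x[start] = 1.0
    for _ in range(t):
        x = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
    return x


def first_suggestion(P, q, query_nodes, t):
    """Query node (other than q) with the largest t-step transition probability from q."""
    dist = t_step_distribution(P, q, t)
    candidates = [j for j in query_nodes if j != q]
    return max(candidates, key=lambda j: dist[j])


# Toy graph: nodes 0, 1, 2 are queries and 3, 4 are URLs.
# Click weights: (0,3)=3, (1,3)=3, (0,4)=1, (2,4)=1, rows normalized by degree.
P = [
    [0.0, 0.0, 0.0, 0.75, 0.25],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
]
two_step = t_step_distribution(P, start=0, t=2)
s1 = first_suggestion(P, q=0, query_nodes=[0, 1, 2], t=2)
```

Here query 1 shares the heavily clicked URL 3 with query 0, so it receives more two-step probability than query 2.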
Ranking the Rest of the Queries
Employ the hitting time to rank and diversify the rest of the queries.
Hitting time: let S be a subset of the vertex set V. The expected hitting time h(i|S) of the random walk is the expected number of steps before a random walk starting at node i first visits the set S.
N(i) denotes the neighbors of node i.
Ranking the Rest of the Queries
Property
• Nodes strongly connected to s1 receive far fewer visits by the random walk.
• Nodes far away from s1 still allow the random walk to move among them and thus receive more visits.
The second suggestion node
• Select the second suggestion node s2 ∈ Q with the largest expected hitting time to the subset S containing the two nodes q and s1.
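The selection rule above can be sketched as a greedy loop: after each pick, the chosen node joins S and the hitting times are recomputed. The truncated fixed-point iteration, the function names and the toy chain are assumptions for illustration, not the authors' code.

```python
def hitting_time_to_set(P, S, iterations=200):
    """Truncated expected hitting time h(i|S) from every node i to the set S."""
    n = len(P)
    h = [0.0] * n
    for _ in range(iterations):
        h = [
            0.0 if i in S else 1.0 + sum(P[i][j] * h[j] for j in range(n))
            for i in range(n)
        ]
    return h


def diversify(P, q, first, candidates, k):
    """Greedily pick k suggestions, each with the largest hitting time to S."""
    selected = [first]
    S = {q, first}
    while len(selected) < k:
        h = hitting_time_to_set(P, S)
        remaining = [c for c in candidates if c not in S]
        if not remaining:
            break
        nxt = max(remaining, key=lambda c: h[c])
        selected.append(nxt)
        S.add(nxt)
    return selected


# Toy chain: node 0 is q, node 1 is s1, node 2 sits next to S, node 3 is one step further.
P = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
suggestions = diversify(P, q=0, first=1, candidates=[1, 2, 3], k=2)
```

Node 3 is farther from {q, s1} than node 2, so it is chosen as the second suggestion.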
Result: Query Suggestion
High Diversity Query Recommendation
Diversifying Query Suggestion Results [Hao Ma, AAAI’10]: query-URL graph, hitting time
A Unified Framework for Recommending Diverse and Relevant Queries [Xiaofei Zhu, WWW’11]: manifold, manifold ranking with stop points
Query recommendation should balance relevance and diversity:
• Relevance: manifold ranking
• Diversity: import stop points
A novel unified framework: manifold ranking with stop points
[Figure: feature matrix with URLs u1, u2, …, um as rows and query1, query2, …, queryn as columns, from which the affinity matrix W is built]
Traditional manifold ranking process
W: affinity matrix; D: diagonal matrix
Step 1:
Step 2:
Step 3:
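Manifold ranking with stop points
T: set of stop points
R: set of free points
$f^{(t+1)} = \alpha S f^{(t)} + (1 - \alpha)\,y$  (1)
Partitioning the points into free points R and stop points T:
$\begin{bmatrix} f_R^{(t+1)} \\ f_T^{(t+1)} \end{bmatrix} = \alpha \begin{bmatrix} S_{RR} & S_{RT} \\ S_{TR} & S_{TT} \end{bmatrix} \begin{bmatrix} f_R^{(t)} \\ f_T^{(t)} \end{bmatrix} + (1 - \alpha) \begin{bmatrix} y_R \\ y_T \end{bmatrix}$  (2)
With $S_{RT} = S_{TT} = 0$:
$\begin{bmatrix} f_R^{(t+1)} \\ f_T^{(t+1)} \end{bmatrix} = \alpha \begin{bmatrix} S_{RR} & 0 \\ S_{TR} & 0 \end{bmatrix} \begin{bmatrix} f_R^{(t)} \\ f_T^{(t)} \end{bmatrix} + (1 - \alpha) \begin{bmatrix} y_R \\ y_T \end{bmatrix}$  (3)
$f_R^{(t+1)} = \alpha S_{RR} f_R^{(t)} + (1 - \alpha)\,y_R$  (4)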
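The update for the free points, $f_R^{(t+1)} = \alpha S_{RR} f_R^{(t)} + (1 - \alpha) y_R$, can be sketched directly. The matrix S, the value of α and the toy example are assumptions chosen for illustration; S is taken to be an already normalized affinity matrix.

```python
def rank_with_stop_points(S, y, stop_points, alpha=0.9, iterations=200):
    """Iterate f_R <- alpha * S_RR f_R + (1 - alpha) * y_R over the free points only.

    S is the normalized affinity matrix, y the initial score vector, and
    stop_points the set T of indices that no longer propagate scores.
    Returns a dict mapping each free point to its ranking score.
    """
    n = len(S)
    free = [i for i in range(n) if i not in stop_points]
    f = [0.0] * n
    for _ in range(iterations):
        new_f = list(f)
        for i in free:
            new_f[i] = alpha * sum(S[i][j] * f[j] for j in free) + (1 - alpha) * y[i]
        f = new_f
    return {i: f[i] for i in free}


# Toy example: three mutually similar queries, with the input query at index 0.
S = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
y = [1.0, 0.0, 0.0]
plain = rank_with_stop_points(S, y, stop_points=set())
with_stop = rank_with_stop_points(S, y, stop_points={1})
```

In this toy example, turning query 1 into a stop point lowers the score of its neighbor, query 2, which is how already-covered regions are down-weighted.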
Results: Query recommendation (‘abc’, ‘yamaha’)
Evaluation Metrics
Automatic Evaluation: Open Directory Project (ODP) <-> Relevance
Given two queries q and q’:
c(q): ‘Arts/Television/News’
c(q’): ‘Arts/Television/Stations/North America/United States’
l(c, c’): their longest common prefix, e.g., ‘Arts/Television’
The length of the longer of the two categories c and c’, e.g., 5
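A sketch of an ODP-based relevance score built from the two quantities named in the legend: the length of the longest common prefix divided by the length of the longer category. This ratio is an assumption inferred from the legend, and the function name is hypothetical.

```python
def odp_relevance(category_a, category_b):
    """Length of the longest common category prefix over the longer category length."""
    a = category_a.split("/")
    b = category_b.split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / max(len(a), len(b))


relevance = odp_relevance(
    "Arts/Television/News",
    "Arts/Television/Stations/North America/United States",
)
```

For the example categories the common prefix ‘Arts/Television’ has length 2 and the longer category has length 5.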
Evaluation Metrics
Automatic Evaluation: commercial search engine (i.e., Google) <-> Diversity
Given two queries q and q’:
o(q, q’) is the number of overlapping URLs among the top k search results of queries q and q’.
Evaluation Metrics
Automatic Evaluation
• Open Directory Project (ODP) <-> Relevance
• Commercial search engine (i.e., Google) <-> Diversity
Evaluation metric: Q-measure
β: parameter that controls the tradeoff between relevance and diversity
Experiments
[Figure: Average Q-measure of Query Recommendation over Different Recommendation Size under 5 Approaches, including the proposed method]
Experiments
Manual Evaluation
• Recommendation pool
• 3 human judges
• Label tool
[Figure: label tool showing the recommendation pool and the search results]
Experiments
Evaluation Metrics
– Intent-Coverage
– α-nDCG (α-normalized Discounted Cumulative Gain)
Experiments
Table 2: Performance of recommendation results over a sample of queries under five different approaches.
Why High Utility Query Recommendation
Existing work focuses on recommending queries that are relevant to users’ initial queries (query level: initial query → query 1, query 2, query 3):
• Common Query Terms (Wen J. et al., WWW 2001)
• Same Clicked Documents (Mei Q. et al., CIKM 2008)
• Co-Occurring in Same Search Sessions (Zhang Z. et al., WWW 2006)
Is recommending only relevant queries enough for finding useful search results?
Why High Utility Query Recommendation
Example: ‘iphone start sell’, ‘iphone initial release’, ‘iphone sell time’
Recommend High Utility Query
High Utility Query Recommendation
More Than Relevance: High Utility Query Recommendation By Mining Users’ Search Behaviors [X.F. Zhu, CIKM’12]: probabilistic graphical model (Query Utility Model)
Recommending High Utility Query via Session-Flow Graph [X.F. Zhu, ECIR’13]: session-flow graph, two-phase model based on absorbing random walk
A Typical Search Session
[Figure: a typical search session leading from the user’s information needs to a satisfied user, with examples of bad perceived utility and bad posterior utility; red = relevant, √ = attractiveness]
Probabilistic Graphical Model
[Figure: graphical model over positions i−1, i, i+1 with variables Ri, Ci, Ai, Si and parameters α and β]
Ri: whether there is a reformulation at position i;
Ci: whether the user clicks on some of the search results of the reformulation at position i;
Ai: whether the user is attracted by the search results of the reformulation at position i;
Si: whether the user’s information needs have been satisfied at position i.
Parameter Estimation
Maximum Likelihood Estimation
$P(C_{1:M}, R_{1:M}, A_{1:M}, S_{1:M}) = \prod_{i=1}^{M} P(C_i \mid A_i, R_i)\,P(R_i \mid R_{i-1}, S_{i-1})\,P(A_i)\,P(S_i \mid C_{1:i})$
where
$P(C_i \mid R_i, A_i) = (R_i A_i)^{C_i}\,(1 - R_i A_i)^{1 - C_i}$
$P(R_i \mid R_{i-1}, S_{i-1}) = (R_{i-1}(1 - S_{i-1}))^{R_i}\,(1 - R_{i-1}(1 - S_{i-1}))^{1 - R_i}$
$P(A_i) = \alpha_{(i)}^{A_i}\,(1 - \alpha_{(i)})^{1 - A_i}$
$P(S_i \mid C_{1:i}) = s_i^{S_i}\,(1 - s_i)^{1 - S_i}$, where $s_i$ is computed from $\sum_{k=1}^{i} \beta_{(k)}\, I(C_k = 1)$
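Parameter Estimation
Log Likelihood Function
$L = \sum_{j=1}^{N} \sum_{i=1}^{M_j} \left( A_i^j \log \alpha_{(i)} + (1 - A_i^j) \log(1 - \alpha_{(i)}) + S_i^j \log s_i^j + (1 - S_i^j) \log(1 - s_i^j) \right)$
where $s_i^j$ is computed from $\sum_{k=1}^{i} \beta_{(k)}\, I(C_k^j = 1)$.
$\alpha_t = \frac{\sum_{j=1}^{N} \sum_{i=1}^{M_j} A_i^j\, I((i) = t)}{\sum_{j=1}^{N} \sum_{i=1}^{M_j} I((i) = t)}$
The slide also shows the analogous click-based ratio:
$\frac{\sum_{j=1}^{N} \sum_{i=1}^{M_j} I(C_i^j = 1)\, I((i) = t)}{\sum_{j=1}^{N} \sum_{i=1}^{M_j} I((i) = t)}$
Here $I((i) = t)$ indicates that the query at position i of session j is query t.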
Parameter Estimation
Maximize Log Likelihood Function
The log-likelihood L is maximized with two additions:
• Regularization term: a sum over t = 1, …, M of squared parameters
• Lagrange multiplier: enforces a constraint on the parameters indexed by t
Parameter Estimation
Optimization Condition :
Parameter Estimation
Newton-Raphson
Experimental Results
Dataset
• Our experiments are based on publicly available query logs, namely the UFindIt log data.
• In total there are 40 search tasks, represented by 40 test queries.
Experimental Results
Metrics
• QRR (Query Relevant Ratio): measures the probability that a user finds relevant results when she uses query q for her search task.
• MRD (Mean Relevant Document): measures the average number of relevant results a user finds when she uses query q for her search task.
Experimental Results
Methods compared:
• Query Utility Model (QUM): the expected information gain users obtain from the search results of the query according to their original information needs, which is the product of the two component utilities.
• Adjacency (ADJ): given a test query q, the most frequent queries adjacent to q in the same session are recommended to users [WWW'06].
• Co-occurrence (CO): given a test query q, the most frequent queries co-occurring with q in the same session are selected as recommendations [WSDM'10].
• Query-Flow Graph (QF): builds a query-flow graph from collective search sessions and performs a random walk on this graph for query recommendation [CIKM'08].
• Click-through Graph (CT): builds a query-URL bipartite graph and employs the hitting time as a measure to select queries for recommendation [CIKM'08].
• Perceived Utility method (PCU) and Posterior Utility method (PTU): the two component utilities (i.e., perceived utility and posterior utility) of the QUM method, used separately.
Experiments
Impact of parameter μ on the performance of QUM
Limitation of QUM method
• It cannot make full use of the click-through information: it only considers whether the search results of a reformulated query have some clicked documents or not, and does not take the individual clicked documents into consideration.
• A novel method is needed to capture these specific clicked documents when modeling query utility.
Framework of Our Approach
[Diagram: Query-Flow Graph (reformulation behaviors, random walk) + click behaviors (document nodes, absorbing states) → Session-Flow Graph (absorbing random walk)]
Two-phase model based on Absorbing Random Walk (TARW)
Session Flow Graph
q → q1 → q3
q → q3 → q4
q → q4
⁞
query session
Query-Flow Graph: Boldi et al. (CIKM 2008)
Session Flow Graph
q → q1:u1:u2→ q3:u3
q → q3 → q4:u4:u5
q → q4:u6
⁞
query session
Session Flow Graph: expands the query-flow graph with document nodes and failure nodes
Session Flow Graph
Definition:
Nodes
Edges
Adjacency Matrix
Two-phase model based on absorbing random walk (TARW)
Forward Utility Propagation: the utility score is transferred from the original query node to the reformulation nodes, and is finally absorbed by the document nodes and the failure node.
Backward Utility Propagation: the utility score is transferred back from the document nodes to the reformulation nodes.
Recommendation: queries with the highest utilities.
Forward Utility Propagation
Assign transition probabilities to the different types of nodes (reformulation, document, failure):
Parameter Setting:
α1: reformulation node
α2: document node
α3: failure node
α1 + α2 + α3 = 1
Previous work (Sadikov, WWW 2010): shares the same transition probability setting (α1, α2, α3) for the different types of nodes.
Our work: assigns transition probabilities based on the characteristics of each candidate query. The posterior transition probability is obtained from the observed transition probability and the prior transition probability.
Transition Probability
Reformulation Nodes
Document Nodes:
Failure Node:
Computing the Distribution
In the forward utility propagation, the corresponding transition matrix is:
$P = \begin{bmatrix} P_Q & P_D & P_S \\ 0 & I_D & 0 \\ 0 & 0 & I_S \end{bmatrix}$
P_Q: n × n transition matrix on query nodes
P_D: n × m matrix of transitions from query nodes to document nodes
P_S: n × 1 matrix of transitions from query nodes to the failure node
I_D, I_S: identity matrices, denoting that document nodes and the failure node are absorbing states
P is reducible (no unique stationary distribution)
Computing the Distribution
The absorbing distribution is computed iteratively:
$P^t[i, j]$ represents the probability of moving from node i to node j after a t-step walk.
• We only have to compute the probabilities from queries to documents: $O(tn^3 + n^2 m)$
• In the recommendation scenario, only the probabilities from the original query to the documents are needed, i.e., the row of the matrix for the original query: $O(tn^2 + nm)$
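A sketch of computing only the row for the original query. It accumulates how much probability mass visits each query node over t steps and then multiplies once by the query-to-document and query-to-failure blocks, which gives the $O(tn^2 + nm)$ cost. The block names follow the slide; the function name and the toy numbers are hypothetical.

```python
def absorbing_distribution(P_Q, P_D, P_S, q, t):
    """Mass absorbed by each document and by the failure node within t steps from q.

    P_Q: n x n query-to-query block, P_D: n x m query-to-document block,
    P_S: list of n query-to-failure probabilities.
    """
    n, m = len(P_Q), len(P_D[0])
    x = [0.0] * n        # mass currently on each query node
    x[q] = 1.0
    visits = [0.0] * n   # accumulated mass seen on each query node
    for _ in range(t):
        visits = [v + xi for v, xi in zip(visits, x)]
        x = [sum(x[i] * P_Q[i][j] for i in range(n)) for j in range(n)]
    docs = [sum(visits[i] * P_D[i][d] for i in range(n)) for d in range(m)]
    fail = sum(visits[i] * P_S[i] for i in range(n))
    return docs, fail


# Toy session-flow graph with two query nodes, two documents and one failure node.
P_Q = [[0.0, 0.5], [0.0, 0.0]]
P_D = [[0.3, 0.0], [0.2, 0.6]]
P_S = [0.2, 0.2]
docs, fail = absorbing_distribution(P_Q, P_D, P_S, q=0, t=2)
```

In this toy graph all mass is absorbed after two steps, so the document and failure masses sum to 1.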
Backward Utility Propagation
Experimental Results
Overall Evaluation Results
The TARW method performs significantly better than all the baseline recommendation methods (p-value <= 0.05).
Evaluation of Document Utility
Baseline methods:
• Document Frequency Based Method (DF): the click frequency of a document reflects users’ preference for that document when they search with the original query
• Session Document Frequency Based Method (SDF): clicked documents within the same search session convey a similar search intent
• Markov-model Based Method (MM): based on the document distribution learned for the original query by a Markov-model based method
Metrics:
• Precision at position k (P@k)
• Normalized Discounted Cumulative Gain (NDCG)
• Mean Average Precision (MAP)
Evaluation of Document Utility
TARW improves over MM by:
1) using an adaptive transition probability setting for the different types of nodes
2) modeling users' behavior of giving up their search tasks by introducing the failure nodes.
Summary
query recommendation techniques
High Relevant Query Recommendation
High Diversity Query Recommendation
High Utility Query Recommendation