Post on 11-Jan-2017
Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and How to Fix Them?
Julia Kiseleva, Jaap Kamps, Vadim Nikulin, Nikita Makarov
Eindhoven University of Technology, University of Amsterdam, Yandex
CIKM’15, Melbourne, Australia
Changes in User Intents
[Figure: Wikipedia page views (axis label: 2015) for the articles “Malaysia Airlines Flight 370” and “Malaysia Airlines Flight 17”]
Research Problem
• By analyzing behavioral dynamics at the SERP level, can we detect an important class of detrimental cases (such as search failure) based on changes in observable behavior caused by low user satisfaction?
How Can We Detect the Changes?
[Figure: timeline with two consecutive moments, t_i and t_{i+1}, each showing a <QUERY, SERP> pair]
Behavioral features (BF) observed for a <QUERY, SERP> pair:
BF1 = Reformulation Signal (RS)
BF2 = Abandonment Signal (AS)
BF3 = Volume Signal (VS)
BF4 = Click Position Signal (CS)
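The slides name the four signals but do not define how they are computed. The sketch below shows one plausible way to derive them from a log for a single query and time bucket. The `Impression` record, its fields, and the exact definitions (reformulation as a follow-up query that extends the original, abandonment as an impression with no click, click position as the best clicked rank) are all assumptions made for illustration, not the authors' definitions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Impression:
    """One hypothetical log record: a SERP shown for a query."""
    query: str
    clicked_positions: List[int]   # 1-based ranks of clicked results; empty if none
    next_query: Optional[str]      # follow-up query in the same session, if any


def behavioral_signals(impressions: List[Impression], query: str) -> dict:
    """Compute RS, AS, VS and CS for one query over one time bucket."""
    shown = [imp for imp in impressions if imp.query == query]
    volume = len(shown)
    if volume == 0:
        return {"RS": 0.0, "AS": 0.0, "VS": 0, "CS": None}

    # RS (assumed): share of impressions followed by a query that extends this one.
    reformulated = sum(
        1 for imp in shown
        if imp.next_query is not None
        and imp.next_query != query
        and imp.next_query.startswith(query)
    )
    # AS (assumed): share of impressions that received no click at all.
    abandoned = sum(1 for imp in shown if not imp.clicked_positions)
    # CS (assumed): mean of the best (lowest) clicked rank, over clicked impressions.
    best_clicks = [min(imp.clicked_positions) for imp in shown if imp.clicked_positions]
    mean_click = sum(best_clicks) / len(best_clicks) if best_clicks else None

    return {
        "RS": reformulated / volume,
        "AS": abandoned / volume,
        "VS": volume,
        "CS": mean_click,
    }


# Made-up records, only to exercise the function.
log = [
    Impression("cikm conference", [1], None),
    Impression("cikm conference", [], "cikm conference 2015"),
    Impression("cikm conference", [3, 1], "cikm conference 2015"),
    Impression("cikm conference", [], None),
    Impression("sigir", [2], None),
]
signals = behavioral_signals(log, "cikm conference")
print(signals)
```

Computing these values per time bucket gives one time series per signal, which is the kind of input a change detector can monitor.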
Change Detection Techniques
• Change detection techniques
o In dynamically changing and non-stationary environments, the data distribution can change over time, yielding the phenomenon of concept drift
o Real concept drift refers to changes in the conditional distribution of the output (i.e., the target variable) given the input (the input features)
• Concept drift types (figure: data mean over time for each pattern):
o Sudden/abrupt
o Incremental
o Gradual
o Reoccurring
o Outlier (not concept drift)
• Example queries shown with the figure: “flawless Beyoncé”; “idaho bus crash investigation”; “cikm conference 2015”; and a seasonal change such as “black Friday 2014”
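The definition of real concept drift can be written compactly. The notation is an assumption introduced here, not taken from the slides: X stands for the input features, y for the target variable, and p_t for the distribution at time t.

```latex
\exists X : \quad p_{t_i}(y \mid X) \neq p_{t_{i+1}}(y \mid X)
```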
Detecting Drift in User Satisfaction
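[Figure: the same <QUERY, SERP> pair observed at t_i and at t_{i+1}, with the distribution of a behavioral feature BF_i at each moment: P_{t_i}(BF_i) and P_{t_{i+1}}(BF_i). A drift between the two distributions is marked “Failed SERP”.]
Example: “schedule of matches of the World Cup 2014” and its revision “schedule of matches of the World Cup 2014 ON TV”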
Detecting Drifts in Behavioral Signals
Query: “cikm conference”
Reformulation: “2015”
[Figure: timeline starting at t_0 with the signal value measured at successive points. Window W_0 (size n_0, ending at t_i) holds the values 0.1, 0.1, 0.2, 0.2, 0.3; window W_1 (size n_1, starting at t_i) holds the values 0.7, 0.8, 0.8.]
E(W_0) and E(W_1) are the expected (mean) values of the signal in windows W_0 and W_1.
If |E(W_0) - E(W_1)| > e_out, then a drift is detected.
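The two-window comparison can be sketched in a few lines. The window values are the ones shown on the timeline for “cikm conference”; the threshold value `E_OUT = 0.3` is an assumption chosen for illustration, since the slides do not give a value for e_out.

```python
from statistics import mean
from typing import Sequence


def drift_detected(w0: Sequence[float], w1: Sequence[float], e_out: float) -> bool:
    """Return True when the two window means differ by more than e_out."""
    return abs(mean(w0) - mean(w1)) > e_out


w0 = [0.1, 0.1, 0.2, 0.2, 0.3]   # reference window W_0 (n_0 = 5)
w1 = [0.7, 0.8, 0.8]             # recent window W_1 (n_1 = 3)
E_OUT = 0.3                      # assumed threshold, not taken from the slides

gap = abs(mean(w0) - mean(w1))
print(round(gap, 3), drift_detected(w0, w1, E_OUT))
```

With these values the means are 0.18 and roughly 0.77, so the gap of about 0.59 exceeds the assumed threshold and a drift is reported.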
Sudden VS Incremental Drifts
[Figure: data mean over time, showing a sudden drift followed by an incremental drift]
Sudden drift: disambiguation such as ‘medal Olympics 2016’. Timeline markers: t_i, t_{i+1}, t_{i+1} + W_1.
Incremental drift: disambiguation such as ‘CIKM conference 2015’. Timeline markers: t_j, t_{j+1}, t_{j+1} + W_2.
W_1 << W_2
Signals shown: Click Position Signal (CS), Abandonment Signal (AS), Volume Signal (VS)
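The relation W_1 << W_2 says that a sudden drift completes within a much shorter window than an incremental one. One rough way to illustrate the distinction is to count how many time steps a signal spends between its old and new levels. The function names, the tolerance `tol`, and the cut-off `max_sudden_steps` are hypothetical choices for this sketch and are not taken from the talk.

```python
from typing import Sequence


def transition_length(series: Sequence[float], old_level: float,
                      new_level: float, tol: float = 0.05) -> int:
    """Count the time steps where the signal lies strictly between the two levels."""
    lo, hi = sorted((old_level, new_level))
    return sum(1 for x in series if lo + tol < x < hi - tol)


def drift_kind(series: Sequence[float], old_level: float, new_level: float,
               max_sudden_steps: int = 1) -> str:
    """Label a drift 'sudden' if the transition is short, else 'incremental'."""
    steps = transition_length(series, old_level, new_level)
    return "sudden" if steps <= max_sudden_steps else "incremental"


# Made-up series: both move from 0.1 to 0.8, at different speeds.
sudden_series = [0.1, 0.1, 0.1, 0.8, 0.8, 0.8]
incremental_series = [0.1, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.8]

print(drift_kind(sudden_series, 0.1, 0.8))
print(drift_kind(incremental_series, 0.1, 0.8))
```

This heuristic assumes the old and new levels are already known; a detector working on live logs would have to estimate them from the windows themselves.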
Reoccurring Drifts
[Figure: data mean over time with a reoccurring drift; changes in query intent are marked with “+” and “-” signs]
Disambiguations such as ‘movie premieres November 2014’, ‘movie premieres December 2014’, and ‘movie premieres January 2015’.
The figure labels three positive sudden drifts and three negative sudden drifts.
If sign(E(W_{i+1}) - E(W_i)) > 0 then “+”
If sign(E(W_{i+1}) - E(W_i)) < 0 then “-”
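The sign rule can be sketched as a small function that labels each pair of consecutive windows. The window contents below are made-up values, and the "0" label for equal means is an addition for completeness that the slides do not mention.

```python
from statistics import mean
from typing import List, Sequence


def drift_sign(w_prev: Sequence[float], w_next: Sequence[float]) -> str:
    """Return '+' if the mean rises from w_prev to w_next, '-' if it falls."""
    diff = mean(w_next) - mean(w_prev)
    if diff > 0:
        return "+"
    if diff < 0:
        return "-"
    return "0"  # equal means: no direction (case not covered on the slide)


def label_windows(windows: List[Sequence[float]]) -> List[str]:
    """Label every pair of consecutive windows W_i, W_{i+1}."""
    return [drift_sign(a, b) for a, b in zip(windows, windows[1:])]


# Hypothetical windows for a signal that rises and falls repeatedly.
windows = [[0.1, 0.1], [0.8, 0.9], [0.1, 0.2], [0.7, 0.8], [0.2, 0.1]]
labels = label_windows(windows)
print(labels)
```

An alternating sequence of "+" and "-" labels for the same query is the pattern the figure associates with a reoccurring drift.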
Gradual Drifts
[Figure: data mean over time with a gradual drift; changes in query intent are marked with “+” and “-” signs]
Disambiguations such as ‘novak djokovic fiancée’, ‘novak djokovic wedding’, and ‘novak djokovic baby’.
Drift labels in the figure: positive sudden drift, negative sudden drift, positive incremental drift.
Experimentation
o The dataset consists of 12 months of behavioral log data from Yandex (2015)
o ~25 million users per day
o In total, ~150 million queries per day
o Train period: one month
o Test period: 3, 7, and 14 days
Detected Query Drifts
o Hundreds of thousands of query drifts detected
• a huge number, but a small fraction of traffic
o Over 200,000 unique <Q,Q’> pairs
o Revisions are varied
• unique revision term(s) occur in 3-4 unique pairs
• 2-3% are year revisions (‘2014’, ‘2015’)
• 17-18% of revisions contain a number
o Far more revisions are detected than with standard rules/templates
• Queries and revisions in many languages
• Demonstrates the general applicability of the approach
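The two revision statistics (year revisions and revisions containing a number) could be computed along these lines. This is a minimal sketch: the regular expressions, the reading of "year revision" as a revision that consists only of a four-digit year, and the sample revisions are assumptions for illustration, not the authors' procedure or data.

```python
import re
from typing import Dict, Iterable

YEAR = re.compile(r"(19|20)\d{2}")   # assumed: a year is 19xx or 20xx
DIGIT = re.compile(r"\d")


def revision_stats(revisions: Iterable[str]) -> Dict[str, float]:
    """Return the share of year-only revisions and of revisions containing a digit."""
    items = [r.strip() for r in revisions]
    total = len(items)
    if total == 0:
        return {"year": 0.0, "number": 0.0}
    year = sum(1 for r in items if YEAR.fullmatch(r))
    number = sum(1 for r in items if DIGIT.search(r))
    return {"year": year / total, "number": number / total}


# Made-up revision terms, only to exercise the function.
sample = ["2015", "on tv", "2014", "iphone 6", "wedding", "schedule", "windows 10", "baby"]
stats = revision_stats(sample)
print(stats)
```

Note that year revisions are a subset of the revisions that contain a number, which matches how the two percentages on the slide relate to each other.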
Conclusions
• We conducted a conceptual analysis of success and failure at the SERP level
o we introduced the concept of a successful and a failed SERP
o we analyzed their behavioral consequences, identifying indicators of success and failure
• We conducted an analysis of different types of drift in query intent over time
o we studied different changes in query intent: sudden, incremental, gradual, and reoccurring
o we introduced an unsupervised approach to detect failed SERPs caused by drift (sudden, incremental)
• We tested our detector on massive raw search logs