Efficient Approximate Thompson Sampling for Search Query Recommendation in SAC'15
ebay-inc
import random

def play(y=0.49):
    """Play once; return True (win) with probability y."""
    x = random.random()  # 0 <= x < 1
    return x < y
(Observed)
(Estimate)
I played 10 times -- win 5 and lose 5
I played 100 times -- win 45 and lose 55
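The gap between the hidden win probability and what a player observes can be sketched as follows. This is a minimal illustration, not code from the talk: the helper name `estimate_win_rate`, the `seed` parameter, and the choice of sample sizes are assumptions, and the win probability 0.49 is taken from the snippet above.

```python
import random

def estimate_win_rate(n_plays, win_prob=0.49, seed=None):
    """Play n_plays times; return (wins, losses, observed win rate).

    win_prob is the hidden parameter; the player only sees wins and losses.
    """
    rng = random.Random(seed)  # local generator, so the global state is untouched
    wins = sum(rng.random() < win_prob for _ in range(n_plays))
    return wins, n_plays - wins, wins / n_plays

if __name__ == "__main__":
    for n in (10, 100, 10000):
        wins, losses, rate = estimate_win_rate(n, seed=0)
        print(f"played {n} times: win {wins}, lose {losses}, estimate {rate:.3f}")
```

With few plays the estimate can sit far from the hidden value; with more plays it tends to settle closer to it.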
#Learn#
[Figure: “prior” distribution; x-axis: μ, y-axis: chance of having μ]
The motivation of Thompson-S (2)
[Figure: PDFs of Beta(20,10) and Beta(60,40), plotted over x from 0.0 to 1.0]
See a good one; “learn more”
Intuition (Underdog, but worth learning)
[Figure: PDFs of Beta(4,6) and Beta(60,40), plotted over x from 0.0 to 1.0]
The motivation of Thompson-S (1)
[Figure: PDFs of Beta(10,15) and Beta(60,40), plotted over x from 0.0 to 1.0]
Avoid exploring “low potential” arm early on
Intuition (Equal exploration)
[Figure: PDFs of Beta(40,60) and Beta(60,40), plotted over x from 0.0 to 1.0]
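The four comparisons against Beta(60,40) can be put into numbers with a small sketch. It is an illustration under assumptions, not material from the talk: the helper names `beta_mean_std` and `prob_first_beats_second`, the number of draws, and the seed are all hypothetical. It uses the closed-form mean and variance of a Beta distribution and a Monte Carlo estimate of how often a draw from the first arm exceeds a draw from the second.

```python
import random

def beta_mean_std(s, f):
    """Mean and standard deviation of Beta(s, f)."""
    n = s + f
    mean = s / n
    var = (s * f) / (n * n * (n + 1))
    return mean, var ** 0.5

def prob_first_beats_second(first, second, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(draw from Beta(*first) > draw from Beta(*second))."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(*first) > rng.betavariate(*second)
        for _ in range(n_draws)
    )
    return wins / n_draws

if __name__ == "__main__":
    reference = (60, 40)
    for arm in [(20, 10), (4, 6), (10, 15), (40, 60)]:
        mean, std = beta_mean_std(*arm)
        p = prob_first_beats_second(arm, reference)
        print(f"Beta{arm}: mean={mean:.3f} std={std:.3f} "
              f"P(beats Beta{reference})={p:.3f}")
```

Beta(4,6) and Beta(10,15) share a mean of 0.4, but the first is built on fewer observations and is therefore wider, so its draws should overtake Beta(60,40) more often. That is the sense in which an underdog with little data still gets explored.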
Algorithm
Init: a=1, b=1, Sx=Fx=0 for all x
Each arm x corresponds to a Beta(Sx+a, Fx+b) prior
1. Draw a random number from each arm based on Beta(Sx+a, Fx+b)
2. Play the arm (x’) with the highest number
3. If (see a reward)
       Sx’ += 1
   else
       Fx’ += 1
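The three steps above can be sketched in Python roughly as follows. This is the basic Beta-Bernoulli Thompson sampling loop from the slide, not the efficient approximate variant named in the title. The function name `thompson_sampling`, the simulated `true_rates`, the number of rounds, and the seed are assumptions made for illustration; in a real system the reward would come from user feedback rather than from a simulated coin.

```python
import random

def thompson_sampling(true_rates, n_rounds=5000, a=1, b=1, seed=0):
    """Run Beta-Bernoulli Thompson sampling on simulated arms.

    true_rates: hypothetical reward probabilities, unknown to the learner.
    Returns (S, F): per-arm counts of rewards and non-rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    S = [0] * n_arms  # Sx: rewards observed for arm x
    F = [0] * n_arms  # Fx: non-rewards observed for arm x
    for _ in range(n_rounds):
        # 1. Draw a random number from each arm based on Beta(Sx+a, Fx+b)
        draws = [rng.betavariate(S[x] + a, F[x] + b) for x in range(n_arms)]
        # 2. Play the arm with the highest number
        chosen = max(range(n_arms), key=draws.__getitem__)
        # 3. Update the counts of the played arm (reward is simulated here)
        if rng.random() < true_rates[chosen]:
            S[chosen] += 1
        else:
            F[chosen] += 1
    return S, F

if __name__ == "__main__":
    S, F = thompson_sampling([0.05, 0.10, 0.20])
    print("plays per arm:", [s + f for s, f in zip(S, F)])
```

With a=1 and b=1 every arm starts from the uniform Beta(1,1), so early rounds spread plays across arms; as counts accumulate, the draws concentrate and the arm with the highest reward rate tends to receive most of the plays.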
#Experiments#
Target: popular 100 (queries)
Date: 2 weeks (Nov. 2013)
Goal: identify top M
Measurement: Regret (difference between the Best and the Picked)
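One common way to turn "Best versus Picked" into a number is cumulative expected regret: for each round, add the gap between the best arm's reward rate and the rate of the arm actually picked. The sketch below assumes that single-arm definition; the talk's exact measurement for the top-M setting is not spelled out in the transcript, and the rates and picks shown are made-up values, not experimental data.

```python
def cumulative_regret(true_rates, picked_arms):
    """Running sum of (best rate - rate of the picked arm), one entry per round."""
    best = max(true_rates)
    total = 0.0
    history = []
    for arm in picked_arms:
        total += best - true_rates[arm]
        history.append(total)
    return history

if __name__ == "__main__":
    rates = [0.05, 0.10, 0.20]        # hypothetical reward rates per arm
    picks = [0, 1, 2, 2, 1, 2, 2, 2]  # hypothetical sequence of picked arms
    print(cumulative_regret(rates, picks))
```

A flat stretch in the returned curve means the best arm was picked in those rounds; a learner that keeps picking weaker arms shows a curve that keeps climbing.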