Modeling Sequential Preferences with Dynamic User and … · 2021. 6. 11. · Peter The notion of...

Modeling Sequential Preferences

with Dynamic User and Context Factors

Duc-Trong Le, Yuan Fang, Hady W. Lauw

ECML-PKDD 2016

Riva Del Garda, Italy

Outline

• Motivating examples and Models

• Modeling sequential preferences (HMM)

• Modeling Dynamic User-Bias Emissions (SEQ-E)

• Modeling Dynamic Context-Biased Transitions (SEQ-T)

• Joint Model (SEQ*)

• Experiments

• Real-life Datasets: Twitter & Yes.com

• Synthetic Dataset

2

The notion of Sequence – Song playlists

3

Dream

onCrazy Layla

Faded ForceDisfigure

…

…

Sequential

Preference

Tom

Mary

t=0 t=1 t=2 …


4

HMM-Formulation: 𝜃 = (𝜋, 𝐴, 𝐵)

• 𝜋 is the initial state distribution: 𝜋𝑥 ≜ 𝑃 𝑋1 = 𝑥 ;

• A is the transition matrix: 𝐴𝑥𝑢 ≜ 𝑃 𝑋𝑡 = 𝑢 𝑋𝑡−1 = 𝑥);

• B is the emission matrix: 𝐵𝑥𝑦 = 𝑃 𝑌𝑡 = 𝑦 𝑋𝑡= 𝑥);

∀ 𝑥, 𝑢 ∈ 𝒳; 𝑦 ∈ 𝒴; 𝑡 ∈ {1, 2, … . }

Hidden Markov Model (HMM)

latent states 𝒳

Observations 𝒴


5

HMM-Example: A HMM model with 2 latent states, 4 items

• 𝜋 = 𝜋0, 𝜋1 = 0.8, 0.2 𝜋0 = 𝑃 𝑋1 = 0 = 0.8

• 𝐴 =𝐴00 𝐴01𝐴10 𝐴11

=0.7 0.30.6 0.4

𝐴00 = 𝑃(𝑋𝑡 = 0 | 𝑋𝑡−1 = 0) = 0.7

• 𝐵 =𝐵00 𝐵01 𝐵02 𝐵03𝐵10 𝐵11 𝐵12 𝐵13

=0.6 0.1 0.2 0.10.3 0.5 0.1 0.1

𝐵00 = 𝑃(𝑌𝑡 = "Dream on" | 𝑋𝑡 = 0) = 0.6

𝐵10 = 𝑃(𝑌𝑡 = "Dream on" | 𝑋𝑡 = 1) = 0.3

The notion of Group

6

Dream

onCrazy Layla

Faded Force Disfigure

Tom

Mary

t=0 t=1 t=2

Front-

ierFly

AwayWay Back

HomePeter

Soft

Rock

EDM

EDM

Hypotheses: There exists different groups of users

• Users in the same group share the same emission probabilities

• Users across groups may have different emission probabilities.

Modeling Dynamic User-Bias Emissions

7

Formulation: 𝜃 = 𝜋, 𝜎, 𝐴, 𝐵 with a set of groups 𝒢

• 𝜎 is the group distribution: 𝜎𝑔 ≜ 𝑃(𝐺 = 𝑔)

• B is the new emission tensor: 𝐵𝑔𝑥𝑦 ≜ 𝑃 𝑌𝑡 = 𝑦 𝑋𝑡 = 𝑥, 𝑮 = 𝒈)

∀ 𝑥, 𝑢 ∈ 𝒳; 𝑦 ∈ 𝒴; 𝑔 ∈ 𝒢; 𝑡 ∈ {1, 2, … . }

• Example: 𝐵000 = 𝑃 𝑌𝑡 = "Dream on" 𝑋𝑡 = 0,𝑮 = 𝟎) = 0.8

𝐵100 = 𝑃 𝑌𝑡 = "Dream on" 𝑋𝑡 = 0,𝑮 = 𝟏) = 0.3

SEQ-E

Model

…𝜋 …

𝑌𝑡

𝑋𝑡

𝑌𝑡−1𝑌1

𝑋𝑡−1𝑋1

𝜎 𝐺

Peter

The notion of Context Features, Factors

8

Disfigure

Artist: Blank

Tag: Deep House

Artist: Alan Walker

Tag: Deep HouseArtist: Alan Walker

Tag: Melodic House

Faded Force

Mary

Context

FeaturesLatent Context Factors

Hypotheses: There exists context features and factors

• Latent context factors manifest through context features

• Transitions are affected by latent context factors.

Front-

ier

Fly

AwayWay Back

HomeEDM

Artist: Krale

Tag: DubstepArtist: Krys Talk

Tag: DubstepArtist: Krys Talk

Tag: Chill Trap

EDM

Modeling Dynamic Context-Biased Transitions

9

Hypothesis:

Transitions are affected by latent context factors

Formulation: 𝜃 = 𝜋, 𝜌, 𝐴, 𝐵, 𝐶 with a context factors set ℛ;

• 𝜌 is the distribution of the latent context factor: 𝜌𝑟 ≜ 𝑃 𝑅𝑡 = 𝑟 ;

• A is the new transition tensor: 𝐴𝑟𝑥𝑢 ≜ 𝑃 𝑋𝑡 = 𝑢 𝑋𝑡−1 = 𝑥, 𝑅𝑡−1 = 𝑟);

∀ 𝑥, 𝑢 ∈ 𝒳; 𝑟 ∈ 1, … , ℛ ;

• Examples: 𝐴100 = 𝑃 𝑋𝑡 = 0 𝑋𝑡−1 = 0,𝑹𝒕−𝟏 = 𝟏) = 0.9

𝐴000 = 𝑃 𝑋𝑡 = 0 𝑋𝑡−1 = 0,𝑹𝒕−𝟏 = 𝟎) = 0.4

SEQ-T

Model

…𝜋 …

𝑌𝑡

𝑋𝑡

𝑌𝑡−1

𝑋𝑡−1

𝜌

𝑅𝑡−2 𝑅𝑡−1

𝜌

Modeling Dynamic Context-Biased Transitions

10

Hypothesis:

Latent context factors manifest through context features

Formulation: 𝜃 = 𝜋, 𝜌, 𝐴, 𝐵, 𝐶 with a context features set F =

{𝐹1, 𝐹2, … }, each 𝐹𝑖 takes a values set ℱ𝑖

• C is the feature probability matrix: 𝐶𝑟𝑖𝑓 ≜ 𝑃 𝐹𝑡𝑖 = 𝑓 𝑅𝑡 = 𝑟);

𝑟 ∈ 1, … , ℛ ; 𝑖 ∈ 1,… , 𝐹 ; 𝑓 ∈ ℱ𝑖; 𝑡 ∈ {1, 2, … }

• Examples: 𝐶101 = 𝑃( 𝑭𝒕𝟎 = 𝟏 𝑅𝑡 = 1) = 0.99

𝐶100 = 𝑃( 𝑭𝒕𝟎 = 𝟎 𝑅𝑡 = 1) = 0.01

SEQ-T

Model

…𝜋 …

𝑌𝑡

𝑋𝑡

𝑌𝑡−1

𝑋𝑡−1

𝜌

𝑅𝑡−2…

𝑅𝑡−1𝐹𝑡−11

𝐹𝑡−1|𝐹|

…𝐹𝑡−11

𝐹𝑡−1|𝐹|

… …

𝜌

Joint Model

11

SEQ*

Model

Main idea: Jointly capture user and context factors in a single model

Parameters: The six-tuple 𝜃 = 𝜋, 𝜎, 𝜌, 𝐴, 𝐵, 𝐶 as above

Prediction: 𝑦∗ = argmax𝑦𝑃(𝑌𝑇+1 = 𝑦| 𝑌1, . . , 𝑌𝑇 , 𝐹1, … , 𝐹𝑇; 𝜃∗)

Outline

• Motivating examples and Models

• Modeling sequential preferences (HMM)

• Modeling Dynamic User-Bias Emissions (SEQ-E)

• Modeling Dynamic Context-Biased Transitions (SEQ-T)

• Joint Model (SEQ*)

• Experiments

• Real-life Datasets: Twitter & Yes.com

• Synthetic Dataset

12

Experimental Setup – Real-life Datasets

• Research Question: Do the latent user and context

factors result in significant improvements over HMM ?

• Datasets:

• Features:

• Categories of tags (Yes.com)

• Tweet information (Twitter): #Retweet, Created Time, etc.

Dataset #Observation #SequenceAverage

Length

Song playlists

(Yes.com)3168 250k 7

Hashtag Sequences

(Twitter )2121 114k 19

13

Experimental Setup – Real-life Datasets

• Task: Last item prediction

– For each testing sequence 𝑠 = 𝑌1, 𝑌2, … , 𝑌𝑇−1, 𝑌𝑇 𝑇 ≥ 2

– Save 𝑌𝑇 as ground-truth target. Predict the last item given the

previous items 𝑃 𝑌candidate 𝑌1, 𝑌2, … , 𝑌𝑇−1)

• Evaluation Metrics

• 𝑅𝑒𝑐𝑎𝑙𝑙@𝐾 =# sequences with ground truth in top 𝐾

# sequences in the testing set

Example:

Given a sequence 𝑠 = 𝑖10, 𝑖2, 𝑖5, 𝑖8 ;

𝑃 𝑖candidate 𝑖10, 𝑖2, 𝑖5) ⇒ ranki8 = 9

𝑆test = {𝑠1, 𝑠2, 𝑠3} with respective ranks of actual items

are 3, 6, 11. Recall@5 = 1/3; Recall@10 = 2/3

• Mean Reciprocal Rank (MRR)

14

Result – Twitter - Recall@1%

15

#Group 𝒢 = 2, #Context Factor Level ℛ = 2, #Feature |F| = 7

12

16

20

24

28

32

5 10 15

Rec

all@

1%

Number of States Twitter

HMM

SEQ-T

SEQ-E

SEQ*

+4.1

+5.1

+4.8

Result – Yes.com - Recall@1%

16


12

16

20

24

28

32

5 10 15

Rec

all@

1%

Number of States Yes.com

HMM

SEQ-T

SEQ-E

SEQ*

+10

.3

+6.3

+4.5

Experimental Setup – Synthetic Dataset

• Research Questions:

– Can the implementation recover parameters from the synthetic

dataset?

– Could the effect of latent user and context factors be simulated by

increasing the number of HMM’s states?

• Generative process:

• Task: Last item prediction

• Evaluation metrics: Recall@1, MRR

Dataset #Observation #SequenceAverage

Length

Synthetic 4 10k 10

17

Result - Synthetic Dataset – Recall@1

50

60

70

80

90

2 6 10 14

Recall@1

NumberofStates

HMM SEQ-T

SEQ-E SEQ*

18


Conclusion

• Introduce and model dynamic user and context factors

to capture sequential preferences.

• The proposed model contributes statistically significant

improvement as compared to the baseline HMM in term

of top-K recommendations.

19

Thank you!

Q&A

Any further questions, please contact us:

[email protected]

[email protected]

[email protected]

Backup Slides

Result – Yes.com – Tuning Parameters

0.032

0.035

0.037

0.040

15

16

17

18

19

2 4 6 8 10

MRR

Recall@1%

NumberofFeatures

Recall@1%

MRR0.030

0.033

0.035

0.038

18

19

20

2 4 6 8

MRR

Recall@1%

NumberofContextFactorLevels

Recall@1%

MRR

0.035

0.045

0.055

0.065

20

23

26

29

32

2 4 6 8MRR

Recall@1%

NumberofGroups

Recall@1%MRR

12

Synthetic Dataset – Generative Process

• Initial Probability: 𝜋 = 0.8, 0.2

• Latent Context Factor Distribution: 𝜌 = 0.3, 0.7

• Group Distribution: 𝜎 = {0.9, 0.1}

• Transition Tensor A:

The first context factor favors self-transition to the same state

The second context factor encourages the state switching.

𝐴 = 𝐴0, 𝐴1 ; 𝐴1 =0.01 0.990.70 0.30

;𝐴0 =0.99 0.010.30 0.70

#Group 𝒢 2#Context Factor

Level ℛ2

#States 𝒳 2 #Feature |F| 4

#Observation 𝒴 4 #Feature values |ℱ| 2

10

Result – Twitter – Recall@K & MRR

13


Result – Yes.com – Recall@K & MRR

12


Result - Synthetic Dataset – MRR

0.75

0.80

0.85

0.90

0.95

2 6 10 14

MRR

NumberofStates

HMM SEQ-T

SEQ-E SEQ*

15


Synthetic Dataset – Generative Process

• Emission Tensor B:

Each pair of (state, group) favors one of the four items

𝐵 = 𝐵0, 𝐵1 ; 𝐵0 =

0.991 0.0030.0030.0030.003

0.0030.9910.003

; 𝐵1=

0.003 0.0030.9910.0030.003

0.0030.0030.991

• Feature matrix C:

Each context factor level is associated with 2 of the 4 features.

𝐶 = 𝐶0, 𝐶1 ; 𝐶0 =

0.10 0.900.200.900.90

0.800.100.10

; 𝐶1=

0.90 0.100.900.100.30

0.100.900.70

11

Modeling Sequential Preferences with Dynamic User and … · 2021. 6. 11. · Peter The notion of...

Documents

Transcript of Modeling Sequential Preferences with Dynamic User and … · 2021. 6. 11. · Peter The notion of...