Predicting The Next App That You Are Going To Use

Ricardo Baeza-Yates, Di Jiang, Fabrizio Silvestri, Beverly Harrison

Transcript of Predicting The Next App That You Are Going To Use

Page 1: Predicting The Next App That You Are Going To Use

Predicting The Next App That You Are Going To Use

Ricardo Baeza-Yates, Di Jiang, Fabrizio Silvestri, Beverly Harrison

Page 2: Predicting The Next App That You Are Going To Use

The Idea


Page 3: Predicting The Next App That You Are Going To Use

Yahoo Aviate Dataset: Events Distribution


Page 4: Predicting The Next App That You Are Going To Use

Is App Prediction Easy?


Page 5: Predicting The Next App That You Are Going To Use

Why Is Frequency Not the Best Signal?


Timeslots: divide the day into timeslots; the frequency of app a in timeslot i is

freq(a, i) = (# times a is opened in timeslot i) / (# apps opened in timeslot i)
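As a sketch of this frequency baseline (the events and the hour-granularity timeslots below are made up for illustration), the per-timeslot ratio can be computed as:

```python
from collections import Counter

# Made-up (app, timeslot) open events; timeslots here are hours of the day.
events = [
    ("mail", 9), ("mail", 9), ("mail", 9), ("maps", 9),
    ("skype", 20), ("mail", 20), ("skype", 20), ("skype", 20),
]

opens_per_slot = Counter(slot for _, slot in events)  # apps opened in slot i
opens_per_app_slot = Counter(events)                  # times a is opened in slot i

def freq(app, slot):
    """(# times app is opened in slot) / (# apps opened in slot)."""
    total = opens_per_slot[slot]
    return opens_per_app_slot[(app, slot)] / total if total else 0.0
```

With these toy events, mail is the most frequently opened app overall, yet in the evening slot skype dominates (freq("skype", 20) = 0.75), which is why time-conditioned usage beats raw overall frequency.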

Page 6: Predicting The Next App That You Are Going To Use

Classification Based Approach: Features


Basic Features:
• Time
• Latitude
• Longitude
• Speed
• GPS Accuracy
• Context Trigger
• Context Pulled
• Charge Cable
• Audio Cable

Session Features:
• Last App Open
• Last Location Update
• Last Charge Cable
• Last Audio Cable
• Last Context Trigger
• Last Context Pulled

Page 7: Predicting The Next App That You Are Going To Use

Classification Based Approach: +1’s and -1’s


+1: the app the user actually opened next (positive example).

-1’s: the other candidate apps (negative examples).

Page 8: Predicting The Next App That You Are Going To Use

Session Features via Word2Vec


Session events (e.g. “Open Skype”, “Location changed to XXX YYY”, “Opening Mail”, “Charge cable plugged”) are encoded as tokens and embedded with Word2Vec, so that events occurring in similar contexts end up with high cosine similarity.
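The effect can be illustrated with a simplified stand-in for Word2Vec (plain co-occurrence count vectors over made-up sessions; the actual system trains real embeddings):

```python
from collections import defaultdict
from math import sqrt

# Made-up event sessions; in the talk each session is a sequence of phone
# events (app opens, cable plugs, location changes).
sessions = [
    ["open_skype", "charge_cable_plugged", "open_mail"],
    ["open_skype", "charge_cable_plugged", "open_mail"],
    ["location_changed", "open_maps"],
    ["location_changed", "open_maps", "open_mail"],
]

# Simplified stand-in for Word2Vec: represent each event by its counts of
# co-occurrence with every other event inside the same session.
vocab = sorted({e for s in sessions for e in s})
index = {e: i for i, e in enumerate(vocab)}
vectors = defaultdict(lambda: [0.0] * len(vocab))
for s in sessions:
    for e in s:
        for ctx in s:
            if ctx != e:
                vectors[e][index[ctx]] += 1.0

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu, nv = sqrt(sum(a * a for a in u)), sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

Events that share sessions ("open_skype" and "open_mail") get a higher cosine similarity than events that rarely co-occur ("open_skype" and "open_maps").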

Page 9: Predicting The Next App That You Are Going To Use

Classification Based Approach: Models Tested

• Naïve Bayes

• SVM

• C4.5

• Tree Augmented Naïve Bayes

• Softmax
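Of these models, Naïve Bayes is the simplest to sketch. A toy categorical version with made-up feature names and values (the actual system uses the features listed earlier):

```python
from collections import Counter, defaultdict

# Toy training data: (features, next_app). Feature names/values are invented.
train = [
    ({"slot": "morning", "charge": "on"}, "mail"),
    ({"slot": "morning", "charge": "off"}, "mail"),
    ({"slot": "evening", "charge": "on"}, "skype"),
    ({"slot": "evening", "charge": "off"}, "skype"),
    ({"slot": "evening", "charge": "on"}, "skype"),
]

app_counts = Counter(app for _, app in train)
feat_counts = defaultdict(Counter)  # (app, feature name) -> value counts
for feats, app in train:
    for name, value in feats.items():
        feat_counts[(app, name)][value] += 1

def predict(feats):
    """argmax_a P(a) * prod_f P(f|a), with crude add-one smoothing."""
    best_app, best_score = None, 0.0
    for app, n in app_counts.items():
        score = n / len(train)
        for name, value in feats.items():
            counts = feat_counts[(app, name)]
            score *= (counts[value] + 1) / (n + len(counts) + 1)
        if best_app is None or score > best_score:
            best_app, best_score = app, score
    return best_app
```

On this toy data the model predicts "skype" for evening contexts and "mail" for morning ones.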


Page 10: Predicting The Next App That You Are Going To Use

Experiments: The Dataset


• Yahoo Aviate log data, from October 2013 to April 2014.

• 480 active users chosen uniformly at random (u.a.r.).

• 80/20 train/test split; training is done on a sliding window of 12 hours.

Page 11: Predicting The Next App That You Are Going To Use

Experimental Results: Dominant Apps Filtered


Our method attains 90.2% precision; state-of-the-art methods attain up to ∼20% precision on the same task.

Page 12: Predicting The Next App That You Are Going To Use

App Cold Start

• When a new app is installed we have no signal for it:
– In particular, P(a_ui), the probability that u opens app i, is unknown.
– P(f_i | pa(f_i)), the prior probability for a given feature, can instead be obtained from other users.


Page 13: Predicting The Next App That You Are Going To Use

Short-Term vs. Long-Term Apps

• We fit app usage data to a Beta(α, β) distribution and compute the excess kurtosis: a positive value indicates a short-term app, while a negative value likely indicates a long-term one.
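A rough illustration with synthetic usage times (the talk fits a Beta distribution; here we compute the sample excess kurtosis directly, which captures the same peaked-vs-flat distinction):

```python
def excess_kurtosis(xs):
    """Sample excess kurtosis: m4 / m2^2 - 3 (zero for a normal)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0

# Made-up normalized open times (0..1 over the app's lifetime): a short-term
# app is used in a concentrated burst, a long-term app roughly uniformly.
short_term = [0.50, 0.51, 0.49, 0.50, 0.52, 0.48, 0.50, 0.10, 0.90]
long_term = [i / 9 for i in range(10)]
```

The bursty pattern yields a positive excess kurtosis, while the near-uniform one yields a negative value, matching the short-/long-term rule above.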


Page 14: Predicting The Next App That You Are Going To Use

P(a_ui) Estimation Using Users’ Data


• Short-term apps:
– P(a_ui) = P(a_i), the global probability of opening a_i.
– After a fixed amount of time (e.g. 2 hours) we start using P(a_ui) from the user’s own history.

• Long-term apps (Bayesian average):
– P(a_ui) = (C · m_i + n_ui) / (C + N_u), where
– n_ui is the number of times app a_i has been opened by u,
– N_u is the total number of apps opened by u,
– m_i is the average number of times a_i has been opened, in general, and
– C is the weight we give to the “other users” component.
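A minimal sketch of the Bayesian average above (the default C = 10 is an assumed weight, not a value from the talk):

```python
def p_long_term(n_ui, big_n_u, m_i, c=10.0):
    """Bayesian average: (C * m_i + n_ui) / (C + N_u).

    n_ui    -- times user u opened app a_i
    big_n_u -- total app opens by u
    m_i     -- average usage of a_i across other users
    c       -- weight of the "other users" component (assumed value)
    """
    return (c * m_i + n_ui) / (c + big_n_u)
```

With no observations the estimate falls back to the cross-user average m_i; as the user's own history grows it converges toward n_ui / N_u.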

Page 15: Predicting The Next App That You Are Going To Use

App Cold Start Experiments

• For each app install, we split the original dataset into two subsets:
– one subset containing all the events before the app install,
– one subset containing the remaining events.
– Newly installed apps in the dataset: 1.42%.

• Short-term apps
– Without P(a_ui) estimation: 86.3%; with P(a_ui) estimation: 87.1%.
– New apps precision: 91.3%; old apps precision: 86.25%.

• Long-term apps
– The general method attains 89.3% precision, but no newly installed apps are correctly predicted.
– Using our method we increase precision to 90.3%: new apps precision 91.1%; old apps precision 89.26%.


Page 16: Predicting The Next App That You Are Going To Use

User Cold Start

• Most similar user: naïvely select the user with the most similar app inventory.
– Pros: easy to implement, high coverage.
– Cons: the most similar user might be, in fact, very different from the user considered.

• Pseudo user: minimum set cover, minimizing the sum of inverse similarities.
– Pros: the pseudo user is designed to be very similar to the user considered.
– Cons: NP-hard (a log n approximation exists).
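A greedy sketch of the pseudo-user construction, using the standard O(log n) set-cover approximation with inverse Jaccard similarity as the cost (the user inventories below are made up):

```python
def jaccard(a, b):
    """Jaccard similarity of two app inventories."""
    return len(a & b) / len(a | b) if a | b else 0.0

def pseudo_user(target_inventory, other_users):
    """Greedily cover the new user's app inventory with other users'
    inventories, preferring cheap (highly similar) users: each user's
    cost is the inverse of their Jaccard similarity to the target.
    """
    target = set(target_inventory)
    uncovered = set(target)
    chosen = []
    while uncovered:
        best_uid, best_ratio = None, None
        for uid, inv in other_users.items():
            gain = len(uncovered & inv)  # newly covered apps
            if gain == 0:
                continue
            cost = 1.0 / max(jaccard(target, inv), 1e-9)
            if best_ratio is None or cost / gain < best_ratio:
                best_uid, best_ratio = uid, cost / gain
        if best_uid is None:
            break  # some apps are not covered by any user
        chosen.append(best_uid)
        uncovered -= other_users[best_uid]
    return chosen
```

The chosen users together form the "pseudo user" whose combined history stands in for the new user's missing data.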


Page 17: Predicting The Next App That You Are Going To Use

User Cold Start Experiments

• Most Similar User
– Average precision of 32.7%.
– “Scarcity” of similar users: the Jaccard similarity between two different app inventories has an average value of 0.121465 (±0.038955) and a median of 0.117647.
– Even when similarity is high, the accuracy gain is not satisfactory.

• Pseudo User
– Average precision of 45.7%.


Page 18: Predicting The Next App That You Are Going To Use

Conclusions

• Presented a (scalable) personalized app prediction methodology achieving up to 90.2% precision.

• Dealt with two cold-start problems:
– On app cold start we achieve precision up to 87.1% (short-term) and 90.3% (long-term).
– On user cold start we achieve up to 45.7% precision on the first day the user uses the homescreen launcher.


Page 19: Predicting The Next App That You Are Going To Use

Open Problems

• The biggest open problem is to improve the effectiveness of prediction at cold start, in particular user cold start.

• Is it possible to extend the technique used here to top-k personalized app recommendation?


Page 20: Predicting The Next App That You Are Going To Use
