Predicting The Next App That You Are Going To Use

Ricardo Baeza-Yates, Di Jiang, Fabrizio Silvestri, Beverly Harrison

Transcript of Predicting The Next App That You Are Going To Use

Page 1: Predicting The Next App That You Are Going To Use

Predicting The Next App That You Are Going To Use

Ricardo Baeza-Yates, Di Jiang, Fabrizio Silvestri, Beverly Harrison

Page 2: Predicting The Next App That You Are Going To Use

The Idea


Page 3: Predicting The Next App That You Are Going To Use

Yahoo Aviate Dataset: Events Distribution


Page 4: Predicting The Next App That You Are Going To Use

Is App Prediction Easy?


Page 5: Predicting The Next App That You Are Going To Use

Why Is Frequency Not the Best Signal?


Timeslots: divide the day into timeslots; the frequency of app a in timeslot i is

freq(a, i) = (# times a is opened in timeslot i) / (# apps opened in timeslot i)
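As a sketch of this frequency baseline (the events and the hour-granularity timeslots below are made up for illustration), the per-timeslot ratio can be computed as:

```python
from collections import Counter

# Made-up (app, timeslot) open events; timeslots here are hours of the day.
events = [
    ("mail", 9), ("mail", 9), ("mail", 9), ("maps", 9),
    ("skype", 20), ("mail", 20), ("skype", 20), ("skype", 20),
]

opens_per_slot = Counter(slot for _, slot in events)  # apps opened in slot i
opens_per_app_slot = Counter(events)                  # times a is opened in slot i

def freq(app, slot):
    """(# times app is opened in slot) / (# apps opened in slot)."""
    total = opens_per_slot[slot]
    return opens_per_app_slot[(app, slot)] / total if total else 0.0
```

With these toy events, mail is the most frequently opened app overall, yet in the evening slot skype dominates (freq("skype", 20) = 0.75), which is why time-conditioned usage beats raw overall frequency.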

Page 6: Predicting The Next App That You Are Going To Use

Classification Based Approach: Features


Basic Features:
• Time
• Latitude
• Longitude
• Speed
• GPS Accuracy
• Context Trigger
• Context Pulled
• Charge Cable
• Audio Cable

Session Features:
• Last App Open
• Last Location Update
• Last Charge Cable
• Last Audio Cable
• Last Context Trigger
• Last Context Pulled

Page 7: Predicting The Next App That You Are Going To Use

Classification Based Approach: +1’s and -1’s


+1: the app the user actually opened next (positive example).

-1’s: the other candidate apps (negative examples).

Page 8: Predicting The Next App That You Are Going To Use

Session Features via Word2Vec


Session events (e.g. “Open Skype”, “Location changed to XXX YYY”, “Opening Mail”, “Charge cable plugged”) are encoded as tokens and embedded with Word2Vec, so that events occurring in similar contexts end up with high cosine similarity.
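The effect can be illustrated with a simplified stand-in for Word2Vec (plain co-occurrence count vectors over made-up sessions; the actual system trains real embeddings):

```python
from collections import defaultdict
from math import sqrt

# Made-up event sessions; in the talk each session is a sequence of phone
# events (app opens, cable plugs, location changes).
sessions = [
    ["open_skype", "charge_cable_plugged", "open_mail"],
    ["open_skype", "charge_cable_plugged", "open_mail"],
    ["location_changed", "open_maps"],
    ["location_changed", "open_maps", "open_mail"],
]

# Simplified stand-in for Word2Vec: represent each event by its counts of
# co-occurrence with every other event inside the same session.
vocab = sorted({e for s in sessions for e in s})
index = {e: i for i, e in enumerate(vocab)}
vectors = defaultdict(lambda: [0.0] * len(vocab))
for s in sessions:
    for e in s:
        for ctx in s:
            if ctx != e:
                vectors[e][index[ctx]] += 1.0

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu, nv = sqrt(sum(a * a for a in u)), sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

Events that share sessions ("open_skype" and "open_mail") get a higher cosine similarity than events that rarely co-occur ("open_skype" and "open_maps").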

Page 9: Predicting The Next App That You Are Going To Use

Classification Based Approach: Models Tested

• Naïve Bayes

• SVM

• C4.5

• Tree Augmented Naïve Bayes

• Softmax
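Of these models, Naïve Bayes is the simplest to sketch. A toy categorical version with made-up feature names and values (the actual system uses the features listed earlier):

```python
from collections import Counter, defaultdict

# Toy training data: (features, next_app). Feature names/values are invented.
train = [
    ({"slot": "morning", "charge": "on"}, "mail"),
    ({"slot": "morning", "charge": "off"}, "mail"),
    ({"slot": "evening", "charge": "on"}, "skype"),
    ({"slot": "evening", "charge": "off"}, "skype"),
    ({"slot": "evening", "charge": "on"}, "skype"),
]

app_counts = Counter(app for _, app in train)
feat_counts = defaultdict(Counter)  # (app, feature name) -> value counts
for feats, app in train:
    for name, value in feats.items():
        feat_counts[(app, name)][value] += 1

def predict(feats):
    """argmax_a P(a) * prod_f P(f|a), with crude add-one smoothing."""
    best_app, best_score = None, 0.0
    for app, n in app_counts.items():
        score = n / len(train)
        for name, value in feats.items():
            counts = feat_counts[(app, name)]
            score *= (counts[value] + 1) / (n + len(counts) + 1)
        if best_app is None or score > best_score:
            best_app, best_score = app, score
    return best_app
```

On this toy data the model predicts "skype" for evening contexts and "mail" for morning ones.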


Page 10: Predicting The Next App That You Are Going To Use

Experiments: The Dataset


• Yahoo Aviate log data, from October 2013 to April 2014.

• 480 active users chosen uniformly at random (u.a.r.).

• 80/20 train/test split; training is done on a sliding window of 12 hours.

Page 11: Predicting The Next App That You Are Going To Use

Experimental Results: Dominant Apps Filtered


Our method attains 90.2% precision; state-of-the-art methods attain up to ∼20% precision on the same task.

Page 12: Predicting The Next App That You Are Going To Use

App Cold Start

• When a new app is installed we have no signal for it:
– In particular, P(a_ui), the probability that u opens app i, is unknown.
– P(f_i | pa(f_i)), the prior probability for a given feature, can instead be obtained from other users.


Page 13: Predicting The Next App That You Are Going To Use

Short-Term vs. Long-Term Apps

• We fit app usage data to a Beta(α, β) distribution and compute the excess kurtosis: a positive value indicates a short-term app, while a negative value likely indicates a long-term one.
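A rough illustration with synthetic usage times (the talk fits a Beta distribution; here we compute the sample excess kurtosis directly, which captures the same peaked-vs-flat distinction):

```python
def excess_kurtosis(xs):
    """Sample excess kurtosis: m4 / m2^2 - 3 (zero for a normal)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0

# Made-up normalized open times (0..1 over the app's lifetime): a short-term
# app is used in a concentrated burst, a long-term app roughly uniformly.
short_term = [0.50, 0.51, 0.49, 0.50, 0.52, 0.48, 0.50, 0.10, 0.90]
long_term = [i / 9 for i in range(10)]
```

The bursty pattern yields a positive excess kurtosis, while the near-uniform one yields a negative value, matching the short-/long-term rule above.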


Page 14: Predicting The Next App That You Are Going To Use

P(a_ui) Estimation Using Users’ Data


• Short-term apps:
– P(a_ui) = P(a_i), the global probability of opening a_i.
– After a fixed amount of time (e.g. 2 hours) we start using P(a_ui) from the user’s own history.

• Long-term apps (Bayesian average):
– P(a_ui) = (C · m_i + n_ui) / (C + N_u), where
– n_ui is the number of times app a_i has been opened by u,
– N_u is the total number of apps opened by u,
– m_i is the average number of times a_i has been opened, in general, and
– C is the weight we give to the “other users” component.
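A minimal sketch of the Bayesian average above (the default C = 10 is an assumed weight, not a value from the talk):

```python
def p_long_term(n_ui, big_n_u, m_i, c=10.0):
    """Bayesian average: (C * m_i + n_ui) / (C + N_u).

    n_ui    -- times user u opened app a_i
    big_n_u -- total app opens by u
    m_i     -- average usage of a_i across other users
    c       -- weight of the "other users" component (assumed value)
    """
    return (c * m_i + n_ui) / (c + big_n_u)
```

With no observations the estimate falls back to the cross-user average m_i; as the user's own history grows it converges toward n_ui / N_u.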

Page 15: Predicting The Next App That You Are Going To Use

App Cold Start Experiments

• For each app install, we split the original dataset into two subsets:
– one subset containing all the events before the app install,
– one subset containing the remaining events.
– Newly installed apps in the dataset: 1.42%.

• Short-term apps
– Without P(a_ui) estimation: 86.3%; with P(a_ui) estimation: 87.1%.
– New apps precision: 91.3%; old apps precision: 86.25%.

• Long-term apps
– The general method attains 89.3% precision, but no newly installed apps are correctly predicted.
– Using our method we increase precision to 90.3%: new apps precision 91.1%; old apps precision 89.26%.


Page 16: Predicting The Next App That You Are Going To Use

User Cold Start

• Most similar user: naïvely select the user with the most similar app inventory.
– Pros: easy to implement, high coverage.
– Cons: the most similar user might be, in fact, very different from the user considered.

• Pseudo user: minimum set cover, minimizing the sum of inverse similarities.
– Pros: the pseudo user is designed to be very similar to the user considered.
– Cons: NP-hard (a log n approximation exists).
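A greedy sketch of the pseudo-user construction, using the standard O(log n) set-cover approximation with inverse Jaccard similarity as the cost (the user inventories below are made up):

```python
def jaccard(a, b):
    """Jaccard similarity of two app inventories."""
    return len(a & b) / len(a | b) if a | b else 0.0

def pseudo_user(target_inventory, other_users):
    """Greedily cover the new user's app inventory with other users'
    inventories, preferring cheap (highly similar) users: each user's
    cost is the inverse of their Jaccard similarity to the target.
    """
    target = set(target_inventory)
    uncovered = set(target)
    chosen = []
    while uncovered:
        best_uid, best_ratio = None, None
        for uid, inv in other_users.items():
            gain = len(uncovered & inv)  # newly covered apps
            if gain == 0:
                continue
            cost = 1.0 / max(jaccard(target, inv), 1e-9)
            if best_ratio is None or cost / gain < best_ratio:
                best_uid, best_ratio = uid, cost / gain
        if best_uid is None:
            break  # some apps are not covered by any user
        chosen.append(best_uid)
        uncovered -= other_users[best_uid]
    return chosen
```

The chosen users together form the "pseudo user" whose combined history stands in for the new user's missing data.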


Page 17: Predicting The Next App That You Are Going To Use

User Cold Start Experiments

• Most Similar User
– Average precision of 32.7%.
– “Scarcity” of similar users: the Jaccard similarity between two different app inventories has an average value of 0.121465 (±0.038955) and a median of 0.117647.
– Even when similarity is high, the accuracy gain is not satisfactory.

• Pseudo User
– Average precision of 45.7%.


Page 18: Predicting The Next App That You Are Going To Use

Conclusions

• Presented a (scalable) personalized app prediction methodology achieving up to 90.2% precision.

• Dealt with two cold-start problems:
– On app cold start we achieve precision up to 87.1% (short-term) and 90.3% (long-term).
– On user cold start we achieve up to 45.7% precision on the first day the user uses the homescreen launcher.


Page 19: Predicting The Next App That You Are Going To Use

Open Problems

• The biggest open problem is to improve the effectiveness of prediction at cold start, in particular user cold start.

• Is it possible to extend the technique used here to top-k personalized app recommendation?


Page 20: Predicting The Next App That You Are Going To Use
