Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content


Page 1: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Mining TV-on-demand Services

EPSRC project

Dmytro Karamshuk

Page 2: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Large-scale study of BBC iPlayer

May 2013 – Jan 2014

Users – 32M/month (≈ 50% of the UK population of 64M)

IP addresses – 20M/month

Sessions – 1.9 billion

2 x INFOCOM’2015, ToN’2015, JSAC’2016

Page 3: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Longitudinal View across ISPs

Fixed-line Internet market (5 representative providers)

Mobile market is more dynamic than the fixed-line Internet market

Mobile Internet market (5 representative providers)

Page 4: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Data caps decrease market share

All-you-can-eat data (M1, M5)

Limited-cap data packages (M2–M4)

All-you-can-eat plans boost user consumption

Page 5: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Temporal Patterns in Different ISPs

Fixed-line access (F1-F5) peaks in the evening hours

Mobile users watch more during commutes

[Figure panels: fixed-line ISPs; mobile ISPs with limited data caps]

Page 6: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

There is a problem…

Internet on trains in the UK is no good

A study shows that 23.2% of 3G packets and 37.2% of 4G packets failed on the major train routes

Page 7: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

A useful insight: users watch across networks

Users complete watching across different sessions and networks

[Figure: per-user completion ratio; panels: fixed-line ISPs, mobile ISPs]

Page 8: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Speculative Content Pre-fetching

Pre-fetch at home, watch during commutes

Page 9: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Speculative Content Pre-fetching

Not very efficient…

Per-user mobile savings with pre-fetching
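A minimal sketch of how per-user mobile savings under speculative pre-fetching might be estimated. The slides do not spell out the policy, so this assumes a hypothetical `Session` record and assumes that an episode counts as pre-fetched once the user has streamed any part of it over a fixed-line network. All names here are illustrative, not taken from the study.

```python
from collections import defaultdict
from typing import NamedTuple

class Session(NamedTuple):
    user: str
    episode: str
    network: str   # "fixed" or "mobile" (hypothetical labels)
    seconds: int   # seconds streamed in this session

def greedy_mobile_savings(sessions):
    """Fraction of each user's mobile streaming that a pre-fetch could cover.

    Assumption: an episode is treated as pre-fetched if the user streamed
    any part of it over a fixed-line network in an earlier session.
    """
    started_at_home = defaultdict(set)   # user -> episodes seen on fixed line
    mobile_total = defaultdict(int)      # user -> all mobile seconds
    mobile_saved = defaultdict(int)      # user -> mobile seconds covered by pre-fetch
    for s in sessions:  # sessions are assumed to be in chronological order
        if s.network == "fixed":
            started_at_home[s.user].add(s.episode)
        else:
            mobile_total[s.user] += s.seconds
            if s.episode in started_at_home[s.user]:
                mobile_saved[s.user] += s.seconds
    return {u: mobile_saved[u] / t for u, t in mobile_total.items() if t > 0}
```

Under this reading, only content already started at home can be saved, which is one plausible reason the per-user savings stay modest.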

Page 10: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Can we do better with predictive preloading?

Page 11: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Towards Predicting User Preferences

Featured content

Most Popular Content

Page 12: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

How important is UI guidance?

For 20% of users, more than 60% of their accesses come from the Front Page

Page 13: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Content Types

11 channels

11 categories and 172 genres

thousands of shows

Page 14: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

User Focus on Different Content Types

[Figure: share of users with all their sessions from only a few (1 to 4) channels (out of 11), categories (out of 11), genres (out of 171), or shows (out of thousands)]

Page 15: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Engineering Features

User Preferences (total importance: 0.555)

content category: 0.038
content genre: 0.063
category affinity: 0.042
genre affinity: 0.103
show affinity: 0.179
channel affinity: 0.043
content age: 0.087

UI Guidance (total importance: 0.292)

featured content: 0.061
featured position: 0.061
content popularity rank: 0.071
popularity position: 0.008
featured probability: 0.091

Repeatedly Watched Content (total importance: 0.154)

previously watched: 0.066
completion ratio: 0.081
probability of re-watching: 0.007
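The slides name the affinity features but do not define them. As an illustration only, the sketch below assumes that affinity is the share of a user's past sessions that fall on a given show, genre, category, or channel. The function name and the dictionary-based session format are hypothetical.

```python
from collections import Counter

def affinity(history, key):
    """Map each value of `key` to its share of the user's past sessions.

    history: list of dicts describing one user's sessions,
             e.g. {"show": "A", "genre": "drama"} (hypothetical format).
    key:     which attribute to compute affinity for, e.g. "show".
    """
    counts = Counter(session[key] for session in history)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}
```

The same function would serve for show, genre, category, and channel affinity by changing `key`.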

Page 16: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Supervised Learning

Problem: For a given user U and an episode E predict whether U will watch E

Binary Classification Problem f(U,E) -> {0,1}

Random Forest: fast, good performance on high-dimensional data

Negative Examples: randomly sample from what users did not watch

Predictions: Predict probability, rank all episodes by probability
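The negative-sampling and ranking steps above can be sketched as follows. This is a standard-library illustration, not the study's code: the classifier itself (a random forest in the slides) is represented by a hypothetical `predict_proba(user, episode)` callable, and the function names, the sampling ratio, and the data layout are assumptions.

```python
import random

def build_training_pairs(watched, catalogue, negatives_per_positive=1, seed=0):
    """Return (user, episode, label) triples for a binary classifier.

    Positives (label 1) are the pairs in `watched`; negatives (label 0) are
    sampled uniformly at random from episodes the user did not watch.
    """
    rng = random.Random(seed)
    pairs = []
    for user, episodes in watched.items():
        unseen = sorted(set(catalogue) - set(episodes))
        for ep in episodes:
            pairs.append((user, ep, 1))
        k = min(len(unseen), negatives_per_positive * len(episodes))
        for ep in rng.sample(unseen, k):
            pairs.append((user, ep, 0))
    return pairs

def rank_episodes(user, candidates, predict_proba, top_k=10):
    """Rank candidate episodes by predicted watch probability, highest first."""
    scored = [(predict_proba(user, ep), ep) for ep in candidates]
    scored.sort(key=lambda t: (-t[0], t[1]))  # ties broken by episode id
    return [ep for _, ep in scored[:top_k]]
```

The triples would be turned into feature vectors and passed to the classifier; the ranked list is what a preloading client would consume.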

Page 17: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Accuracy of Personalized Predictions

For 50% of users, the content they watch has an over 70% chance of appearing in the Top-10 predictions
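The slide does not define the accuracy metric precisely. One plausible reading, sketched here under that assumption, is a per-user hit rate: the share of a user's watched episodes that appear in their Top-K ranking, followed by the share of users whose hit rate exceeds a threshold. Both function names are hypothetical.

```python
def hit_rate_at_k(ranked, watched, k=10):
    """Share of a user's watched episodes that appear in the top-k ranking."""
    if not watched:
        return 0.0
    top = set(ranked[:k])
    return sum(1 for ep in watched if ep in top) / len(watched)

def share_of_users_above(hit_rates, threshold):
    """Share of users whose hit rate is strictly above `threshold`."""
    if not hit_rates:
        return 0.0
    return sum(1 for h in hit_rates.values() if h > threshold) / len(hit_rates)
```

With these definitions, the slide's claim would correspond to `share_of_users_above(rates, 0.7)` being about 0.5 for K = 10.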

Page 18: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

When do we do predictions?

Front Pages are updated overnight…

Page 19: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

When do we do predictions?

… and remain largely unchanged for 24h

Page 20: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

How much traffic can be saved?

Predictive pre-fetching can potentially save nearly 71% of mobile usage

Page 21: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

We made mobile users happy!

How about the rest?

Page 22: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Access Patterns

Average per-user # sessions

Correlation with Internet speed

Page 23: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Content Delivery for Home Broadband

Install more distributed caches

May require significant investment

Any alternatives?

Problem: how to handle peak load from 32M users

Page 24: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Alternative: Peer-assisted Content Delivery

[Diagram: content servers and users]

An average of 5K users are online every second during the first day after release

5K duplicates every second!!!

Ask users for assistance

Page 25: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Elegant Theoretical Model for very Complex Behavior

around 88% of savings can be achieved

Data Analysis

Theoretical Model

[Equation garbled in the transcript: G expressed in terms of c and an exponential term in c]

Page 26: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Why does it work?

Top-5% of the content corpus accounts for 80% of traffic

Most accesses happen in the first day after release

Yes, it’s all about very popular content
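A concentration figure like "top 5% of the corpus accounts for 80% of traffic" could be checked with a short calculation such as the one below. This is a sketch under the assumption that per-item traffic volumes are available as a dictionary; the function name and input format are hypothetical.

```python
def top_share(traffic_by_item, top_fraction=0.05):
    """Share of total traffic generated by the most popular `top_fraction` of items."""
    volumes = sorted(traffic_by_item.values(), reverse=True)
    total = sum(volumes)
    if total == 0:
        return 0.0
    n_top = max(1, round(len(volumes) * top_fraction))  # at least one item
    return sum(volumes[:n_top]) / total
```

For example, in a toy corpus of 20 items where one item carries 76 units of traffic and the other 19 carry 1 unit each, the top 5% (one item) accounts for 76 of 95 units.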

Page 27: Take-away TV: Recharging Work Commutes with Greedy and Predictive Preloading of TV Content

Dmytro Karamshuk

King’s College London

“True genius resides in the capacity for evaluation of uncertain, hazardous, and conflicting information” -

Winston Churchill