Forecasting Audience Increase on YouTube


Matthew Rowe

Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom

Reputation on the Social Web

• Reputation is: “the beliefs or opinions that are generally held about someone or something”
• On the Social Web, reputation = greater influence
  – Important to information flow
  – Control information diffusion
• How to quantify reputation?
  – Greater audience = greater reputation
  – Greater reputation = greater influence
  – How to measure ‘reputation’?
    • In-degree, i.e. number of ‘in links’
    • Audience levels, subscriber counts
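As a minimal sketch of the in-degree measure named above, the snippet below counts incoming links per node from a list of (subscriber, channel) pairs. The edge list and the function name `in_degree` are illustrative assumptions, not data or code from the study.

```python
from collections import Counter


def in_degree(edges):
    """Count incoming links per node from (source, target) pairs."""
    return Counter(target for _source, target in edges)


# Hypothetical subscription edges: (subscriber, channel).
edges = [("ann", "bob"), ("cat", "bob"), ("bob", "ann")]
degrees = in_degree(edges)
print(degrees["bob"])  # 2
```

A node with no incoming links simply reports a count of zero.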

Influential Social Nodes

Why Forecast?

• Users want to expand their audience
  – What can users do to increase their audience?
  – What factors contribute to increases?
• Solution: explore the relation between
  – Audience levels, i.e. in-degree, and;
  – Behaviour of user and content
• Discover patterns, then use patterns for forecasting
  – Given my behaviour, will my audience grow?

Features

• User behaviour statistics
  – In-degree, i.e. number of followers
  – Out-degree, i.e. number of users followed
  – User view count: number of posts viewed by the user
  – Post count: number of posts uploaded by the user
• Content statistics
  – Post view count, i.e. number of views
  – Favourite count, i.e. number of likes of content
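The six features above could be held in one time-stamped record per observation. The sketch below is only one possible layout: the class name `Observation`, the field names, and the choice to treat in-degree as the quantity being forecast (and so exclude it from the predictors) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One time-stamped snapshot of a user and one of their posts."""
    timestamp: str
    in_degree: int          # number of followers (audience level)
    out_degree: int         # number of users followed
    user_view_count: int    # posts viewed by the user
    post_count: int         # posts uploaded by the user
    post_view_count: int    # views of the post
    favourite_count: int    # likes of the post

    def predictors(self):
        """Feature values other than in-degree (assumed to be the target)."""
        return [self.out_degree, self.user_view_count, self.post_count,
                self.post_view_count, self.favourite_count]


# Hypothetical values, for illustration only.
obs = Observation("2011-01-01T00:00", 120, 15, 300, 8, 950, 40)
print(obs.predictors())  # [15, 300, 8, 950, 40]
```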

Schema Barrier

• Social Web platforms provide data using bespoke schemas
  – i.e. communicating through different languages
• Data from platform A == data from platform B
• Schema from platform A != schema from platform B
• Models must function across platforms
  – Enabling portable behaviour patterns
• How can we interpret data from different platforms?
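To make the schema barrier concrete, here is a small sketch that renames platform-specific fields to shared feature names before any model sees them. The platform names and native field names in `FIELD_MAPS` are hypothetical, and this dictionary lookup only stands in for the idea of a common format; it is not the ontology-based approach itself.

```python
# Hypothetical per-platform field names mapped onto shared feature names.
FIELD_MAPS = {
    "platform_a": {"subscribers": "in_degree", "subscriptions": "out_degree"},
    "platform_b": {"followers_count": "in_degree", "friends_count": "out_degree"},
}


def to_common(platform, record):
    """Rename a platform-specific record's keys to the shared feature names."""
    mapping = FIELD_MAPS[platform]
    return {common: record[native]
            for native, common in mapping.items() if native in record}


a = to_common("platform_a", {"subscribers": 10, "subscriptions": 3})
b = to_common("platform_b", {"followers_count": 10, "friends_count": 3})
print(a == b)  # True
```

Both records carry the same data under different schemas, and after renaming they compare equal.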

Behaviour Ontology

• Solution: OU Behaviour Ontology

• Defines behaviour in a common format
  – Extending the SIOC ontology
  – Captures ‘impact’
• Vital to capture time-stamped user statistics
• Two classes for impact
  – User impact
    • Models user features
  – Post impact
    • Models post statistics

www.purl.org/NET/oubo/0.23/

Data Collection: YouTube

• Gathered a dataset from the video-sharing platform YouTube

• One aim of usage is to increase ‘channel’ popularity
  – Gain more subscriptions
• For 10 days, at 4 hour intervals:
  – Logged 100 most recently uploaded videos
    • Stopping once 2k were logged
  – Logged user + content stats for each video
• Randomly chose 10% for analysis
  – Split dataset into 80/20 for training/testing
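The sampling and splitting steps can be sketched as follows. This assumes the 10% sample and the 80/20 split are both taken uniformly at random over logged items; the slides do not say whether the split was made per video or per user, and the function name, seed, and integer IDs are illustrative.

```python
import random


def sample_and_split(items, sample_frac=0.10, train_frac=0.80, seed=0):
    """Draw a random sample, then split it into training and testing parts."""
    rng = random.Random(seed)            # fixed seed so the sketch is repeatable
    k = round(len(items) * sample_frac)
    sample = rng.sample(items, k)        # sampled without replacement, in random order
    cut = round(len(sample) * train_frac)
    return sample[:cut], sample[cut:]


videos = list(range(2000))               # hypothetical IDs for 2k logged videos
train, test = sample_and_split(videos)
print(len(train), len(test))  # 160 40
```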

Forecasting Audience Increase

• How can we predict audience levels given observed features?

[Equation: model annotated with coefficient/weight, predictor/independent variable, and error/residual vector]

• What features are good predictors?
  – i.e. can we induce a better model than above?
  – Perform model selection
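The annotations on this slide (coefficient/weight, predictor/independent variable, error/residual) are consistent with a linear regression form, in which the audience level is an intercept plus a weighted sum of predictors plus an error term. The sketch below assumes that form; the function names and all numbers are illustrative and not taken from the study.

```python
def predict(intercept, weights, predictors):
    """Linear prediction: intercept plus the weighted sum of the predictors."""
    return intercept + sum(w * x for w, x in zip(weights, predictors))


def residuals(intercept, weights, rows, observed):
    """Error terms: observed value minus the model's prediction, per row."""
    return [y - predict(intercept, weights, row)
            for row, y in zip(rows, observed)]


# Hypothetical coefficients and observations.
weights = [0.5, 2.0]
rows = [[2, 3], [4, 1]]
observed = [8.5, 5.0]
print(residuals(1.0, weights, rows, observed))  # [0.5, 0.0]
```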

Model Selection I

• To perform model selection:
  – Aim: maximise the coefficient of determination
  – Procedure: average features within the training split in the same time period

• First Model: all features
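One reading of the procedure above is: group the training observations by time step, average each feature across users within a step, and score a fitted model by its coefficient of determination. The sketch below follows that reading; the record layout and function names are assumptions.

```python
from collections import defaultdict


def average_by_step(records):
    """Average each feature over all observations sharing a time step."""
    buckets = defaultdict(list)
    for step, features in records:
        buckets[step].append(features)
    return {step: [sum(col) / len(col) for col in zip(*rows)]
            for step, rows in sorted(buckets.items())}


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical (time step, feature vector) records from two users.
records = [(1, [10, 2]), (1, [20, 4]), (2, [30, 6])]
print(average_by_step(records))          # {1: [15.0, 3.0], 2: [30.0, 6.0]}
print(r_squared([1, 2, 3], [1, 2, 3]))   # 1.0
```

A model that always predicts the mean of the observed values scores 0, and a perfect fit scores 1.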

Model Selection II

• How can we improve upon the previous model?

• Feature selection
  – Exhaustive search of all possible feature combinations
  – Optimize coefficient of determination

• Shows improvements using certain models
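An exhaustive search over feature subsets can be sketched as below. One caveat: on the training data, plain R² never decreases when a predictor is added to a least-squares fit, so this sketch scores subsets by adjusted R² instead. Whether the original work used that variant is not stated, so treat it as an assumption, along with the helper names and the toy data.

```python
from itertools import combinations


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    x = [[1.0] + list(r) for r in rows]
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(p)]
    return solve(xtx, xty)


def adjusted_r2(rows, y, beta):
    """R^2 penalised for the number of predictors used."""
    n, p = len(y), len(beta) - 1
    pred = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]
    mean = sum(y) / n
    ss_res = sum((t - q) ** 2 for t, q in zip(y, pred))
    ss_tot = sum((t - mean) ** 2 for t in y)
    return 1 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)


def best_subset(data, y, names):
    """Try every non-empty feature subset and keep the best-scoring one."""
    best = None
    for k in range(1, len(names) + 1):
        for subset in combinations(range(len(names)), k):
            rows = [[row[i] for i in subset] for row in data]
            score = adjusted_r2(rows, y, fit(rows, y))
            if best is None or score > best[0]:
                best = (score, [names[i] for i in subset])
    return best


# Toy data: the target tracks the first column closely; the others are unrelated.
names = ["post_view_count", "post_count", "out_degree"]
data = [[1, 3, 5], [2, 1, 5], [3, 3, 2], [4, 1, 2],
        [5, 3, 2], [6, 1, 2], [7, 3, 5], [8, 1, 5]]
y = [2.1, 3.9, 5.9, 8.1, 10.1, 11.9, 13.9, 16.1]
best = best_subset(data, y, names)
print(best)
```

The search visits every non-empty subset, so its cost doubles with each extra feature; that is manageable for a handful of features like those listed earlier.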

Model Selection III

• Exhaustive feature selection drops user view count

Forecasting I

• Now have 2 models to forecast with:
  – All features
  – Best features

Which model is best?

• Two experiments to test predictive power:
  – One-step forecast
    • Train model on previous k-steps, predict k+1
  – Final-step forecast
    • Predict t=10, train on previous k-steps
  – Predictions are user dependent
• Evaluation measure: Root Mean Square Error
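The two experiments and the error measure can be sketched for a single user's series of audience levels. To keep the example short, a straight-line trend over time stands in for the regression models compared in the talk; that substitution, the function names, and the sample series are all assumptions.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))


def fit_trend(series):
    """Stand-in model: least-squares line through (step, value) pairs."""
    n = len(series)
    steps = range(1, n + 1)
    mean_t = sum(steps) / n
    mean_y = sum(series) / n
    var_t = sum((t - mean_t) ** 2 for t in steps)
    slope = sum((t - mean_t) * (v - mean_y)
                for t, v in zip(steps, series)) / var_t
    return lambda t: mean_y + slope * (t - mean_t)


def one_step_forecasts(series, min_history=2):
    """For each k >= min_history, train on steps 1..k and predict step k+1."""
    pairs = []
    for k in range(min_history, len(series)):
        model = fit_trend(series[:k])
        pairs.append((series[k], model(k + 1)))
    return pairs


def final_step_forecast(series, k):
    """Train on the first k steps and predict the final step directly."""
    model = fit_trend(series[:k])
    return series[-1], model(len(series))


# Hypothetical audience levels for one user over five time steps.
series = [10, 12, 14, 16, 18]
pairs = one_step_forecasts(series)
error = rmse([o for o, _ in pairs], [p for _, p in pairs])
print(error)  # 0.0
```

Because the sample series grows by a constant amount, the trend model predicts it exactly and the error is zero; real audience series would not be this regular.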

Forecasting: Results

• One-step

• Final Step

Conclusions and Future Work

• Quantified reputation by audience levels
• Content reception linked to increased levels:
  – More content views = increased audience levels
  – More favourites = increased audience levels
• Able to accurately predict audience levels
  – Post feature selection improves performance
• Behaviour ontology captures required features
  – Common conceptualisation of behaviour
• Future work:
  – Extend analysis to a larger dataset
  – Applying models to additional platforms

QUESTIONS

Questions?
people.kmi.open.ac.uk/[email protected]
@mattroweshow
