YouTube Traffic Characterization: A View From the Edge Phillipa Gill¹, Martin Arlitt²¹, Zongpeng Li¹, Anirban Mahanti³ ¹Dept. of Computer Science, University of Calgary, Canada ²Enterprise Systems & Software Lab, HP Labs, USA ³Dept. of Computer Science and Engineering, IIT Delhi, India

1

YouTube Traffic Characterization:

A View From the Edge

Phillipa Gill¹, Martin Arlitt²¹, Zongpeng Li¹, Anirban Mahanti³

¹Dept. of Computer Science, University of Calgary, Canada

²Enterprise Systems & Software Lab, HP Labs, USA

³Dept. of Computer Science and Engineering, IIT Delhi, India

2

Introduction

The way people use the Web is changing.

Creation and sharing of media: Fast, easy, cheap!

Volume of data associated with extremely popular online media.

3

What is Web 2.0?

User generated content

  Text: WordPress, Blogspot

  Photos: Flickr, Facebook

  Video: YouTube, MySpace

Social networking: Facebook, MySpace

Tagging: Flickr, YouTube

4

YouTube: Facts and Figures

Founded in February 2005

Enabled users to easily share movies by converting them to Flash

Largest video sharing Website on the Internet [Alexa2007]

Sold to Google for $1.65 billion in November 2006

5

How YouTube Works (1/2)

GET: /watch?v=wQVEPFzkhaM

OK (text/html)

GET: /vi/fNaYQ4kM4FE/2.jpg

OK (image/jpeg)

6

How YouTube Works (2/2)

GET: swfobject.js

OK (application/x-javascript)

GET: /p.swf

OK (application/x-shockwave-flash)

GET: /get_video?video_id=wQVEPFzkhaM

OK (video/flv)
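
The request sequence on these two slides suggests that each transaction can be grouped by the Content-Type of its response, which is how the later results separate video, image, text and application data. A minimal sketch of such a grouping follows. The function name `classify` and the category labels are assumptions made for illustration, not the authors' code.

```python
def classify(content_type: str) -> str:
    """Map an HTTP Content-Type value to a coarse category (labels are illustrative)."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # The top-level type is the part before the slash, e.g. "video" in "video/flv".
    top_level = media_type.split("/", 1)[0]
    if top_level in ("video", "image", "text", "application"):
        return top_level
    return "other"


for value in ("text/html", "image/jpeg", "application/x-javascript", "video/flv"):
    print(value, "->", classify(value))
```

Grouping on the top-level type keeps the sketch independent of the exact subtype strings a server sends.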

7

Our Contributions

Efficient measurement framework

One of the first extensive characterizations of Web 2.0 traffic

  File properties

  File access patterns

  Transfer properties

Implications for network and content providers

8

Outline

Introduction & Background

Contributions

Methodology

Results

Implications

Conclusions

9

Our View Points

Edge (University Campus)

  28,000 students

  5,300 faculty & staff

  /16 address space

  300 Mb/s full-duplex network link

Global

  Most popular videos

10

Campus Data Collection

Goals:

  Collect data on all campus YouTube usage

  Gather data for an extended period of time

  Protect user privacy

Challenges:

  YouTube’s popularity

  Monitor limitations

  Volume of campus Internet usage

11

Our Methodology

Identify servers providing YouTube content

Use bro to summarize each HTTP transaction in real time

Restart bro daily and compress the daily log

Map visitor identifier to a unique ID
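
The last step can be sketched as follows. The slides do not say how the mapping was implemented, so this version assumes a simple in-memory table that hands out sequential integers. The class name `VisitorAnonymizer` is hypothetical.

```python
class VisitorAnonymizer:
    """Replace each visitor identifier with a stable, opaque integer ID.

    Illustrative only: sequential in-memory IDs are an assumption,
    not the scheme described in the talk.
    """

    def __init__(self):
        self._ids = {}

    def anonymize(self, visitor):
        # setdefault stores the next ID on first sight and returns the
        # previously stored ID on every later call for the same visitor.
        return self._ids.setdefault(visitor, len(self._ids) + 1)


anonymizer = VisitorAnonymizer()
for visitor in ("10.0.0.1", "10.0.0.2", "10.0.0.1"):
    print(anonymizer.anonymize(visitor))
```

Because the same visitor always receives the same ID, per-user analysis stays possible while the original identifier never has to be written to the log.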

12

Categories of Transactions

Complete – the entire transaction was parsed successfully

Interrupted – TCP connection was reset

Gap – monitor missed a packet

Failure – transaction could not be parsed

13

Categories of Transactions (2)

Status        % of Total   % of Video

Complete         90.82        24.66
Interrupted       1.88        24.25
Gap               1.56        51.09
Failure           5.75          -

14

Our Traces

Start Date: Jan. 14, 2007

End Date: Apr. 8, 2007

Total Valid Transactions: 23,250,438

Total Bytes: 6.54 TB

Total Video Requests: 625,593

Total Video Bytes: 6.45 TB

Unique Video Requests: 323,677

Unique Video Bytes: 3.26 TB
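
A few ratios follow directly from these totals. The sketch below only divides the numbers on this slide. Treating "total minus unique" as repeated traffic is an interpretation added here, and the variable names are illustrative.

```python
# Figures copied from the trace summary on this slide.
total_transactions = 23_250_438
total_bytes_tb = 6.54
video_requests = 625_593
video_bytes_tb = 6.45
unique_video_requests = 323_677
unique_video_bytes_tb = 3.26

# Share of all traffic that is video, by transaction count and by bytes.
video_request_share = video_requests / total_transactions
video_byte_share = video_bytes_tb / total_bytes_tb

# Share of video traffic that is not unique (assumed here to mean repeats).
repeat_request_share = 1 - unique_video_requests / video_requests
repeat_byte_share = 1 - unique_video_bytes_tb / video_bytes_tb

print(f"video share of transactions: {video_request_share:.1%}")
print(f"video share of bytes:        {video_byte_share:.1%}")
print(f"repeated video requests:     {repeat_request_share:.1%}")
print(f"repeated video bytes:        {repeat_byte_share:.1%}")
```

Video accounts for under 3% of transactions but nearly 99% of bytes, and roughly half of the video requests and bytes are for content already seen in the trace.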

15

HTTP Response Codes

Code                    % of Responses   % of Bytes

200 (OK)                     75.80          89.78
206 (Partial Content)         1.29          10.22
302 (Found)                   0.05           0.00
303 (See Other)               5.33           0.00
304 (Not Modified)           17.34           0.00
4xx (Client Error)            0.19           0.00
5xx (Server Error)            0.01           0.00

16

Global Data Collection

Crawling all videos is infeasible

Focus on top 100 most popular videos

Four time frames: daily, weekly, monthly and all time

Two-step data collection:

  Retrieve pages of most popular videos

  Use YouTube API to get details on these videos

17

Outline

Introduction & Background

Contributions

Methodology

Results

Implications

Conclusions

18

Results

Campus Usage Patterns

File Properties

File Access Patterns

Transfer Properties

19

Campus Usage Patterns

[Figure annotation: Reading Break]

20

Results

Campus Usage Patterns

File Properties

File Access Patterns

Transfer Properties

21

Unique File Sizes

Video data is significantly larger than the other content types

22

Time Since Modification

Videos and images rarely modified

Text and application data modified more frequently

23

Video Durations

Spike around 3 minutes, likely music videos

Campus videos are relatively short: μ = 3.3 min

24

Summary of File Properties

Video content is much larger than other content types

Image and video content is more static than application and text content

Video durations are relatively short

Videos viewed on campus tend to be more than 1 month old

25

Results

Campus Usage Patterns

File Properties

File Access Patterns

Transfer Properties

26

Relative Popularity of Videos

Video popularity follows a weak Zipf distribution

Possibly due to edge network point of view

β = 0.56
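
In a Zipf-like distribution the number of requests for the video of rank r is proportional to r^(-β), so β is the negated slope of the popularity curve on log-log axes. The slides do not describe how β was fitted. The sketch below assumes an ordinary least-squares fit on the log-log data, and the function name `fit_zipf_exponent` is hypothetical.

```python
import math


def fit_zipf_exponent(counts):
    """Estimate beta in count ~ C * rank**(-beta) by least squares on log-log data."""
    # Rank videos from most to least requested; zero counts cannot be logged.
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive counts")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    # The fitted slope is -beta.
    return -covariance / variance


# Synthetic counts that follow the exponent reported on the slide exactly.
synthetic = [1000 * rank ** -0.56 for rank in range(1, 501)]
print(round(fit_zipf_exponent(synthetic), 2))
```

A smaller β means a flatter curve: requests are spread over many videos rather than concentrated on a few, which is what "weak" refers to here.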

27

Commonality of Videos

~10% commonality between consecutive days during the week

~5% commonality between consecutive days on the weekend
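
The slide does not define commonality precisely. One plausible reading, assumed here, is the fraction of distinct videos requested on a given day that were also requested on the previous day. The function name `commonality` is illustrative.

```python
def commonality(previous_day, current_day):
    """Fraction of the current day's distinct videos also seen the previous day.

    This definition is an assumption; the talk may use a different one.
    """
    previous = set(previous_day)
    current = set(current_day)
    if not current:
        return 0.0
    return len(previous & current) / len(current)


monday = ["a", "b", "c", "a"]
tuesday = ["b", "d", "d"]
print(commonality(monday, tuesday))
```

Under this reading, a value near 0.10 would mean that about one in ten of a day's distinct videos had also been watched the day before.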

28

Summary of File Referencing

Zipf distribution is weak when observed from the edge of the network

There is some overlap between videos viewed on consecutive days

Significant amount of content viewed on campus is non-unique

29

Results

Campus Usage Patterns

File Properties

File Access Patterns

Transfer Properties

30

Transfer Sizes

[Figure annotations: Flash player (p.swf, player2.swf); JavaScript files]

31

Transfer Durations

Video transfers have significantly longer durations than other content types

32

Summary of Transfer Properties

JavaScript and Flash objects have an impact on the size of files transferred

Video transfers have significantly larger sizes and durations

33

Outline

Introduction & Background

Contributions

Methodology

Results

Implications

Conclusions

34

Implications for Network Providers

Web 2.0 poses challenges to caching

  Larger multimedia files

  More diversity in content

Metadata may be used to improve caching efficiency
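
One way to explore these caching challenges is to replay a request log through a simulated cache and measure the hit ratios. The sketch below is not from the talk. It assumes a plain LRU policy with a byte budget, and the function name `lru_hit_ratios` and the input format are hypothetical.

```python
from collections import OrderedDict


def lru_hit_ratios(requests, capacity_bytes):
    """Replay (video_id, size_bytes) pairs through an LRU cache.

    Returns (request hit ratio, byte hit ratio). Illustrative sketch only.
    """
    cache = OrderedDict()  # video_id -> size, least recently used first
    used = hits = hit_bytes = total = total_bytes = 0
    for video_id, size in requests:
        total += 1
        total_bytes += size
        if video_id in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(video_id)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # the object can never fit, so it is not cached
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict the LRU entry
            used -= evicted_size
        cache[video_id] = size
        used += size
    request_ratio = hits / total if total else 0.0
    byte_ratio = hit_bytes / total_bytes if total_bytes else 0.0
    return request_ratio, byte_ratio


trace = [("a", 10), ("b", 10), ("a", 10), ("c", 10), ("b", 10)]
print(lru_hit_ratios(trace, capacity_bytes=20))
```

Large objects and a diverse request stream both push the hit ratios down for a fixed budget, which is the difficulty the slide points to.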

35

Implications for Content Providers

Multimedia content is large!

  65,000 videos/day x 10 MB/video = 19.5 TB/month

Long tail effect -> much of the content will be unpopular

  Cheap storage solutions

Longer transfer durations for video files

  More CPU cycles required for transfers
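
The storage estimate on this slide can be reproduced step by step. The 30-day month and decimal units (1 TB = 1,000,000 MB) are assumptions made here. They are the values under which the slide's 19.5 TB figure comes out exactly.

```python
videos_per_day = 65_000
mb_per_video = 10
days_per_month = 30          # assumption: a 30-day month
mb_per_tb = 1_000_000        # assumption: decimal units

mb_per_day = videos_per_day * mb_per_video        # 650,000 MB per day
tb_per_month = mb_per_day * days_per_month / mb_per_tb

print(tb_per_month)  # 19.5
```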

36

Conclusions

Multimedia content has much larger transfer sizes and durations than other content types

From the edge of the network, video popularity follows a weak Zipf distribution

Web 2.0 facilitates diversity in content which poses challenges to caching

New approaches are needed to efficiently handle the resource demands of Web 2.0 sites

37

Questions?