1
Having Fun with P2P at HUST (GridCast + PKTown)
Bin Cheng, Xiaofei Liao
HUST & MSRA
Huazhong University of Science & Technology, Microsoft Research Asia
AsiaSys, Shanghai, Nov. 12, 2008
2
GridCast: Video-on-Demand with Peer Sharing
Motivation
― Desirable but costly
Key issues
― Peer organization, scheduling, data distribution
Previous studies
― Measurement, optimization, caching, replication
Ongoing work
― Examine scheduling policy based on simulation
Future work
― Analysis, model
3
Motivation of P2P VoD
VoD is more and more popular, but costly
― Hulu, YouTube, Youku
P2P has helped lots of applications
― File sharing
― Live streaming
How does P2P help VoD?
― Real-time playback
― Random seek
― The long tail of content
― With acceptable UX, how to minimize server load?
4
GridCast: Hybrid Architecture
― Tracker: indexes all joined peers
― Source Server: stores a complete copy of every video
― Peer: fetches chunks from source servers or other peers
― Web Portal: provides the video catalog
[Figure: architecture diagram with labels: tracker, source server, web portal]
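The four roles above can be sketched as a small in-memory model. This is only an illustration of the hybrid idea (ask peers first, fall back to the source server), not GridCast's actual protocol; the class names, the `fetch_chunk` function, and the cache layout are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tracker:
    """Indexes joined peers per video (hypothetical in-memory index)."""
    peers_by_video: dict = field(default_factory=dict)

    def join(self, video_id, peer):
        self.peers_by_video.setdefault(video_id, []).append(peer)

    def peers_for(self, video_id):
        return list(self.peers_by_video.get(video_id, []))


@dataclass
class Peer:
    name: str
    cache: dict = field(default_factory=dict)  # (video_id, chunk_no) -> bytes


class SourceServer:
    """Stores a complete copy of every video, so a lookup never misses."""

    def get_chunk(self, video_id, chunk_no):
        # Placeholder payload; a real server would read from disk.
        return f"{video_id}:{chunk_no}".encode()


def fetch_chunk(video_id, chunk_no, me, tracker, source):
    """Try other peers first; fall back to the source server."""
    for peer in tracker.peers_for(video_id):
        if peer is me:
            continue
        data = peer.cache.get((video_id, chunk_no))
        if data is not None:
            return data, peer.name
    return source.get_chunk(video_id, chunk_no), "source"
```

In this sketch every chunk served by a peer is one chunk the source server does not have to upload, which is the server-load saving the hybrid design aims for.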
5
What does GridCast look like?
http://www.gridcast.cn
6
Deployment of GridCast
GridCast has been deployed on CERNET since May 2006
― Network (CERNET)
• 1,500 universities, 20 million hosts
• Good bandwidth, 2 to 100 Mbps to the desktop (core is complicated)
― Hardware
• 1 Windows Server 2003 machine, shared by the tracker and the web portal
• 2 source servers (sharing a 100 Mbps uplink)
― Content
• 2,000 videos
• 48 minutes on average
• 400 to 800 Kbps, 600 Kbps on average
― Users
• 100,000 users (23% behind NATs)
• 400 concurrent users at peak time (limited by our current infrastructure)
― Log (two logs, one for SVC, the other for MVC)
• 40 GB of logs (from Sep. 2006 to Oct. 2007)
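A rough back-of-the-envelope calculation shows why peer sharing matters at this scale. It uses only the deployment figures above and assumes, for simplicity, that every concurrent user streams at the average bitrate and that 1 Mbps = 1,000 Kbps; the variable names are illustrative.

```python
CONCURRENT_USERS = 400      # peak concurrency from the deployment figures
AVG_BITRATE_KBPS = 600      # average video bitrate
SERVER_UPLINK_MBPS = 100    # uplink shared by the two source servers
AVG_LENGTH_MIN = 48         # average video length

# Aggregate playback demand at peak, assuming everyone streams at the average rate.
demand_mbps = CONCURRENT_USERS * AVG_BITRATE_KBPS / 1000

# Fraction of demand that peers must cover once the server uplink is saturated.
peer_share = max(0.0, 1 - SERVER_UPLINK_MBPS / demand_mbps)

# Size of an average video in megabytes (decimal), useful for cache sizing.
avg_video_mb = AVG_LENGTH_MIN * 60 * AVG_BITRATE_KBPS / 8 / 1000

print(f"peak demand: {demand_mbps:.0f} Mbps")
print(f"share peers must supply: {peer_share:.1%}")
print(f"average video size: {avg_video_mb:.0f} MB")
```

Under these assumptions peak demand exceeds the shared uplink by more than a factor of two, so well over half of the traffic has to come from peers.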
7
Key research issues in GridCast
How to organize online peers for better sharing?
How to schedule requests for smooth playback?
How to optimize chunk distribution over peers?
8
Previous work: ring-assisted overlay
Assumptions
― Huge number of peers watching the same video
― Each peer caches only the most recently watched 5 minutes of data
RINDY: ring-assisted overlay network for P2P VoD
― Each peer keeps a set of neighbors
― Near neighbors for sharing, far neighbors for routing
― Gossip + exchange of neighbor lists
Advantages
― Fast relocation to a new neighborhood
― Load balance
― Efficient content sharing
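The near/far distinction can be illustrated with a minimal sketch. It assumes neighbors are compared by playback position and that "near" means within the 5-minute cache window; the slide does not give RINDY's actual ring boundaries or routing rule, so the threshold and both function names are hypothetical.

```python
CACHE_WINDOW_S = 5 * 60  # each peer caches the last 5 minutes it watched


def classify_neighbors(my_pos_s, neighbor_pos_s):
    """Split neighbors into near (candidates for sharing) and far (for routing)."""
    near, far = [], []
    for name, pos in neighbor_pos_s.items():
        if abs(pos - my_pos_s) <= CACHE_WINDOW_S:
            near.append(name)
        else:
            far.append(name)
    return near, far


def route_after_seek(target_pos_s, neighbor_pos_s):
    """Pick the known neighbor whose playback position is closest to the seek target."""
    return min(neighbor_pos_s, key=lambda n: abs(neighbor_pos_s[n] - target_pos_s))
```

For example, a peer at 600 s with neighbors at 500 s, 1000 s, and 2400 s would treat only the first as near; after a seek to 2300 s it would ask the neighbor at 2400 s to help it find a new neighborhood.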
9
Previous work: measurement study
User behavior
― Random seeks are not uncommon (4 seeks per view session on average)
― Forward seeks dominate (forward/backward = 7/3)
― Short seeks dominate (80% are shorter than 5 minutes)
― The long tail
Performance
― Simple prefetching helps to reduce seek latency
― Even moderate concurrency improves system utilization and UX
― Correlation of UX with source server stress and concurrency
10
Previous work: from SVC to MVC
Single Video Caching (SVC)
― Cache only the current video for sharing
Multiple Video Caching (MVC)
― Cache recently watched videos with at most 1 GB of disk space
― Join multiple swarms
From SVC to MVC
― 90% of users have over 90% unused upload and 60% unused download capacity
― Upper bound obtained from simulation
[Figure: source server load (Mbps) by day of week; series: single video caching, multiple video caching without resource constraints]
[Figure: unused bandwidth capacity (%) across users (normalized); series: download, upload]
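A minimal sketch of the MVC idea follows: keep recently watched videos within a disk budget and stay in the swarm of every cached video. The least-recently-watched eviction rule, the class name, and treating 1 GB as 10^9 bytes are assumptions; the slides state only the 1 GB limit and that recently watched videos are cached.

```python
from collections import OrderedDict


class MultiVideoCache:
    """Keeps recently watched videos within a disk budget (assumed LRU eviction)."""

    def __init__(self, capacity_bytes=1_000_000_000):
        self.capacity = capacity_bytes
        self.used = 0
        self.videos = OrderedDict()  # video_id -> cached size in bytes, oldest first

    def touch(self, video_id, size_bytes):
        """Record that video_id was just watched; return the evicted video ids."""
        if video_id in self.videos:
            self.used -= self.videos.pop(video_id)
        self.videos[video_id] = size_bytes
        self.used += size_bytes
        evicted = []
        # Evict least recently watched videos, but always keep the current one.
        while self.used > self.capacity and len(self.videos) > 1:
            old_id, old_size = self.videos.popitem(last=False)
            self.used -= old_size
            evicted.append(old_id)
        return evicted

    def swarms(self):
        """Videos this peer can still serve, i.e. the swarms it participates in."""
        return list(self.videos)
```

With videos of about 216 MB each (48 minutes at 600 Kbps), such a cache holds four videos, and watching a fifth evicts the oldest one.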
11
Previous work: examining caching policies
Major results
― Improve both UX and scalability!
― Higher concurrency, better sharing
― Larger scale, higher sharing utilization
12
Previous work: examining caching policies
Limitation
― Larger cache is not always better
• Hot spots: load imbalance becomes more serious
― Departure misses are a major issue
• 43% of chunk misses are caused by peer departure
[Figure: chunk misses by cause, as a percentage of all played chunks (%): new content 5.3, peer departure 27.6, peer eviction 15.6, connection issue 11.3, insuf. BW 4; peer departure accounts for 43% of misses]
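The 43% figure can be checked against the chart values for this slide. The sketch assumes the five bar values map to the five causes in the order they appear in the chart labels.

```python
# Misses by cause, each as a percentage of all played chunks (values from the chart).
misses_pct_of_played = {
    "new content": 5.3,
    "peer departure": 27.6,
    "peer eviction": 15.6,
    "connection issue": 11.3,
    "insufficient bandwidth": 4.0,
}

# Total miss rate, then the share of misses attributable to peer departure.
total_miss_pct = sum(misses_pct_of_played.values())
departure_share = misses_pct_of_played["peer departure"] / total_miss_pct

print(f"total misses: {total_miss_pct:.1f}% of played chunks")
print(f"share caused by peer departure: {departure_share:.0%}")
```

Under that mapping, departure misses are 27.6 out of 63.8 percentage points, which rounds to the 43% quoted on the slide.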
13
Previous work: proactive replication
Basic idea
― Push chunks to other peers before leaving
Fundamental tradeoff
― Cost: use more bandwidth and disk space
― Benefit: cover more future requests, reduce misses
Three questions
― Which, where, when?
Two predictors as its building blocks
― Peer departure predictor
― Chunk request predictor
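One way the which/where/when questions and the two predictors might fit together is sketched below. It is not the policy evaluated in the talk: the threshold, the per-departure budget, the choice of the peer with the most spare upload as the target, and the function name are all hypothetical, and the predictor outputs are passed in as plain numbers.

```python
def plan_replication(cached_chunks, request_prob, peers_spare_upload,
                     departure_prob, threshold=0.8, budget=2):
    """Return a list of (chunk, target_peer) pushes, or [] if no push is warranted.

    request_prob: output of a chunk request predictor, chunk -> probability.
    departure_prob: output of a peer departure predictor for this peer.
    """
    # When: only replicate once departure looks likely (a lazy policy).
    if departure_prob < threshold or not peers_spare_upload:
        return []
    # Which: the chunks most likely to be requested in the future.
    ranked = sorted(cached_chunks, key=lambda c: request_prob.get(c, 0.0),
                    reverse=True)
    # Where: the peer with the most spare upload capacity.
    target = max(peers_spare_upload, key=peers_spare_upload.get)
    # The budget caps the bandwidth and disk cost of each departure.
    return [(chunk, target) for chunk in ranked[:budget]]
```

The `budget` parameter makes the tradeoff explicit: a larger budget covers more future requests but costs more bandwidth and disk space.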
14
Previous work: proactive replication
Major results
― 50% decrease in chunk misses, 15% decrease in server load
― Lazy simple is close to lazy oracle
― Aggressive replication leads to bad performance due to higher cost
[Figure: number of chunks by category (replication, SS load, new content, departure, eviction, connection, bandwidth); series: before replication, eager replication (efficiency = 0.21), lazy-oracle (a=0.0) (efficiency = 0.78), lazy-simple (a=0.0) (efficiency = 0.33)]
15
Ongoing work: understanding scheduling
Scheduling
― Which chunk to request from which neighbor
― When
• Periodically
• When the last requested chunk comes back
Adaptive scheduling algorithm
― More suitable for random seek
― Metrics: continuity, chunk cost, number of redundant transmissions
― Preliminary results have been generated
― More analysis required
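The "which chunk to which neighbor" decision can be illustrated with a simple greedy sketch. This is not the adaptive algorithm under study; the urgency ordering, the per-neighbor request cap, and the function name are assumptions chosen to show how continuity and redundant transmissions enter the decision.

```python
def schedule_requests(playhead, missing, neighbor_chunks, max_per_neighbor=2):
    """Map each missing chunk at or after the playhead to a neighbor or 'source'."""
    load = {n: 0 for n in neighbor_chunks}
    plan = {}
    # Chunks closest after the playhead are most urgent for continuity.
    for chunk in sorted(c for c in missing if c >= playhead):
        holders = [n for n, have in neighbor_chunks.items()
                   if chunk in have and load[n] < max_per_neighbor]
        if holders:
            # Prefer the least-loaded holder; one holder per chunk avoids
            # redundant transmissions.
            choice = min(holders, key=lambda n: load[n])
            load[choice] += 1
            plan[chunk] = choice
        else:
            plan[chunk] = "source"
    return plan
```

A scheduler like this could be run periodically or each time a requested chunk comes back, matching the two trigger options listed above; after a random seek, only `playhead` changes and the plan is rebuilt.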
16
Ongoing work: reducing hot-spots
Why does a peer become over-utilized?
― Too many requests
― Essentially, larger cache but limited bandwidth
Solutions
― Announce fake BM to other peers when overloaded
― Transfer hot content to other peers with more upload capacity
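The first solution can be sketched as follows, assuming "BM" stands for the buffer map a peer announces to its neighbors and that "fake" means under-reporting what the peer holds. The overload test (active uploads against a cap), the `keep` parameter, and the function name are hypothetical.

```python
def advertised_buffer_map(cached_chunks, active_uploads, max_uploads, keep=0):
    """Return the set of chunks to announce to neighbors."""
    if active_uploads < max_uploads:
        # Not overloaded: announce everything that is cached.
        return set(cached_chunks)
    # Overloaded: under-report holdings so new requests go to other peers.
    return set(sorted(cached_chunks)[:keep])
```

Because neighbors only request chunks that appear in the announced map, hiding chunks shifts load away from the overloaded peer without changing what it actually stores.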
17
Future work
Use a log-mining approach to help scheduling
Optimize content distribution with social network
Develop some models to understand caching
18
PKTown: Gaming Platform with Peer-assistance
Basic idea
― Objective, features
Research issues
― Routing, relay and so on
Current status
19
Basic idea of PKTown
Launch a platform for gaming
Construct a large-scale virtual LAN using P2P
Use application-layer multicast to deliver broadcast messages
Self-organized
20
Major research issues
How to group peers that have low latency to each other?
How to find a better relay node to reduce the communication latency between two peers?
How to determine the best path to efficiently deliver one message to all other peers, possibly with a latency bound?
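The relay question can be made concrete with a brute-force sketch: given pairwise latencies, compare the direct path with every one-hop relay. It assumes a complete latency matrix is known, which a real deployment would have to measure or estimate; the function name and data layout are hypothetical.

```python
def best_relay(a, b, latency_ms):
    """Return (relay_or_None, latency) for the fastest a->b path with at most one relay.

    latency_ms: nested dict, latency_ms[x][y] is the measured latency from x to y.
    """
    best = (None, latency_ms[a][b])  # start with the direct path
    for r in latency_ms:
        if r in (a, b):
            continue
        via = latency_ms[a][r] + latency_ms[r][b]
        if via < best[1]:
            best = (r, via)
    return best
```

This costs one pass over all candidate relays per peer pair, so it only illustrates the objective; finding good relays without full latency knowledge is the open research issue.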
21
Current status
Deployed over CERNET
About 1,000 concurrent users at peak time
22
Summary
Discuss major issues in P2P VoD and P2P gaming (bandwidth-intensive and latency-intensive)
Present some observations from a live P2P VoD system
Raise some open questions for further studies
― Load balance
― Best relay
23
Thanks! Any questions?
http://www.gridcast.cn
http://www.pktown.net