Peer-to-Peer(P2P) Zhenxiang CHEN Network Center of Jinan University czx@ujn.edu.cn.

Post on 13-Jan-2016



Peer-to-Peer(P2P)

Zhenxiang CHEN
Network Center of Jinan University

czx@ujn.edu.cn

Table of Contents

• Background
• Definitions
• P2P based applications
• P2P structure
• Challenges
• Platform and tools
• Conclusion
• References and workgroup

Background

What is peer

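[Figure: where peers sit in the Internet. Intercontinental and Internet backbones connect regional networks, ISPs (enterprise network providers, specialized providers, local ISPs, T1 links), Web servers, application servers, databases, and third-party content. The peers are the consumer and corporate users at the edge, some of them behind firewalls.]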

What is overlay network

• Overlay networks create a structured virtual topology above the basic transport protocol level that facilitates deterministic search and guarantees convergence.

[Figure: an overlay network layered on top of the IP network]

Application Layer Network

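• Application Layer Network
• Overlay Network
• Network: defines the addressing scheme, routing scheme, and service model for communication between hosts
• A network system built entirely at the application layer, on top of the existing Internet transport network
• Topology discovery, routing, and similar functions are carried out entirely by the application layer and do not depend on the network layer
• Large-scale distributed applications based on the Internet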

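Features of Application Layer Networks

• Advantages:
  • Easy to deploy; does not depend on upgrading network equipment
  • Good scalability
• Disadvantages:
  • Adds complexity and processing overhead
  • Cannot use the optimal route, which increases latency
  • Breaks the layered structure of the network model
• Routing is "selfish" (AS/bandwidth/get and offered)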

Other Overlay Networks

• Peer-to-Peer systems
• Application layer multicast
• VPN
• Service Overlay Networks
• 6bone
• Content distribution networks

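[Figure: the 6bone as an overlay. IPv6 nodes (6) and dual-stack nodes (6/4) are connected across the IPv4 Internet, including a 6/4 NAT.]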

Network service scale rules

[Figure: n devices (1, 2, …, n-1, n) connected as A, a broadcast network; B, a fully interconnected network; C, a group-forming network]

• A: Sarnoff's law. Value scales as $O(n)$. The network is a broadcast medium: any one sender (device) reaches many ($n-1$) receivers (devices).
• B: Metcalfe's law. Value scales as $O(n^2)$. The network is a fully interconnected medium: any one device can interact with the other $n-1$, so $n(n-1) = n^2 - n$ transactions can run concurrently.
• C: Reed's law. Value scales as $O(2^n)$. The network is a group-forming medium: it can form $C_n^2 + C_n^3 + \dots + C_n^{n-1} + C_n^n = 2^n - n - 1$ groups.
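The three scaling rules on this slide can be checked numerically. The sketch below is only an illustration; the function names are made up for this example. It counts receivers (Sarnoff), ordered device pairs (Metcalfe), and groups of two or more devices (Reed), and confirms the closed form 2^n - n - 1 for Reed's law.

```python
from math import comb


def sarnoff(n: int) -> int:
    """Broadcast medium: one sender reaches the other n - 1 devices."""
    return n - 1


def metcalfe(n: int) -> int:
    """Fully interconnected medium: n(n - 1) ordered pairs of devices."""
    return n * (n - 1)


def reed(n: int) -> int:
    """Group-forming medium: number of groups with 2..n members."""
    return sum(comb(n, k) for k in range(2, n + 1))


for n in (2, 5, 10, 20):
    # The sum of binomial coefficients matches the closed form on the slide.
    assert reed(n) == 2**n - n - 1
    print(n, sarnoff(n), metcalfe(n), reed(n))
```

Even for small n the group count dominates, which is the argument for why group-forming (P2P-style) applications are attractive.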

Problem

• Client-Server and Web architectures are inherently centralized.

• Some problems involve distributed control, distributed data, or a hierarchical organizational structure.

• Fitting a centralized solution to a decentralized problem makes a poor solution.

[Figure: centralized architectures. Thick clients connected to a server; thin clients (browsers) connected to a Web server, a middle tier, and a database server.]

P2P Architecture

• P2P means actors in the system talk directly with each other as equals.

• Can decentralize some or all of the solution.
• Represents distributed or hierarchical information models.
• Moves data and control to where the action is.

Definitions

Definitions of P2P

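• Intel working group: an application model in which computer resources and services are shared by direct exchange between systems
• R. L. Graham: defined by three key requirements
  • An operational computer of server quality
  • An addressing system independent of DNS
  • The ability to cope with variable connectivity
• C. Shirky:
  • A class of applications that takes advantage of resources (storage, CPU cycles, content, human presence) available at the edges of the Internet
  • Accessing these decentralized resources means operating in an environment of unstable connectivity and unpredictable IP addresses, so P2P nodes must operate outside the DNS system
  • Nodes have significant or total autonomy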

• Milojicic et al. (HP) : P2P refers to a class of systems and applications that employ distributed resources to perform a critical function in a decentralized manner.

Controversy

• Is p2p a new approach?

"Problems in peer-to-peer systems are neither new nor unique; they make us look for solutions to old problems that we all worked around or tried to ignore before."

Andy Oram (O'Reilly & Associates), speech at the Free and Open Source Software Developers' Meeting, Brussels, BE, Feb. 2002

P2P based applications

Examples of p2p usage

• File-sharing applications
• Distributed databases
• Distributed computing (grid?)
• Collaboration
• Distributed games
• Instant messaging
• Ad hoc networks
• Application-level multicast
• Etc.

Peer-to-Peer Systems

Interesting P2P Applications

• Gnutella for dictionaries (with supernode)
  • Worldwide Lexicon, http://picto.weblogger.com
• Infrastructure for interoperability
  • Edutella (RDF-based metadata infrastructure), http://edutella.jxta.org/
• Global-scale storage
  • OceanStore, http://oceanstore.cs.berkeley.edu/
• Payments (not involving a bank)
  • PayPal (more than 0.4 billion accounts, payment volume 15B/year, profit 230M/year)
  • eCount.com (email payment)

Interesting P2P Applications

• Instant messaging
  • Jabber, http://www.jabber.org/
  • Skype, http://www.skype.com
• VoIP (good quality, low latency, 256-bit encryption, NAT/firewall traversal)
  • Skype, www.skype.com/
• Groupware
  • Groove, http://www.groove.net/
• FOAF (Friends-of-a-Friend)
  • FOAFNaut, http://www.foafnaut.org/
  • Friends Reunited, http://www.friendsreunited.com/
  • Orkut, http://www.orkut.com/
  • Tribe.net, http://www.tribe.net/
• Application-layer multicast
  • PPLive, QQLive

Interesting P2P Applications

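• Virtual supercomputer: peer-to-peer technology yields an unprecedented amount of computing power
• Lets medical researchers speed up improvements in treatments and drug design
• Accelerates new discoveries in cancer research

http://www.stanford.edu/group/pandegroup/Cosm/

http://members.ud.com/vypc/cancer/

Folding@home: protein folding and drug design

SETI@home: the search for extraterrestrial intelligence, http://www.equn.com/seticn/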

P2P Structure

P2P Overlay Network structure

• Unstructured
  • Without prior knowledge of the topology
  • Flooding
  • Examples: Freenet, Gnutella, FastTrack/KaZaA, BitTorrent, Overnet/eDonkey
• Structured
  • Topology is tightly controlled
  • DHT (distributed hash table)
  • Examples: CAN, Chord, Tapestry, Pastry, Kademlia, Viceroy

hybrid

Centralized model (Napster)

• File-sharing system
• Almost distributed system
  • The location of a document is centralized
  • The "transfer" is peer-to-peer
• Problems
  • Robustness
  • Scalability (?)
• Impacts
  • Lawsuits
  • Denial of service

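[Figure: Napster over the Internet. Peers register with a central location server. A peer asks the server "Document x?", the server answers "OK: Peer Z, IP = a.b.c.d", and the peer then requests "Document x!" from Peer Z and receives x directly.]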
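The centralized model can be sketched in a few lines: the server only maps document names to peer addresses, and the document itself moves directly between peers. This is an in-memory illustration, not the Napster protocol; `LocationServer`, `Peer`, and their methods are hypothetical names chosen for this example.

```python
class LocationServer:
    """Central index: knows who holds which document, never stores documents."""

    def __init__(self):
        self._index = {}  # document name -> set of peer addresses

    def register(self, address, documents):
        for name in documents:
            self._index.setdefault(name, set()).add(address)

    def unregister(self, address):
        for holders in self._index.values():
            holders.discard(address)

    def locate(self, name):
        return sorted(self._index.get(name, ()))


class Peer:
    def __init__(self, address, files):
        self.address = address
        self.files = dict(files)  # document name -> content

    def join(self, server):
        server.register(self.address, self.files)

    def fetch(self, server, peers, name):
        holders = server.locate(name)  # step 1: centralized lookup
        if not holders:
            return None
        return peers[holders[0]].files[name]  # step 2: peer-to-peer transfer


server = LocationServer()
z = Peer("a.b.c.d", {"x": b"contents of x"})
y = Peer("e.f.g.h", {})
peers = {p.address: p for p in (z, y)}
z.join(server)
y.join(server)
print(server.locate("x"))
print(y.fetch(server, peers, "x"))
```

The robustness problem listed on the slide is visible here: if the `LocationServer` object disappears, no peer can find anything, even though every document is still available on some peer.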

Non-structured system (Gnutella-like)

• Two phases (like Napster)
  • Localization + exchange
• No server
• Open source
  • gnutella.wego.com
• Distributed search
  • The query is flooded
  • Loop avoidance
  • Limited TTL (not all nodes are visited)

[Figure: a query flooding through the overlay; the numbers mark the order of the message hops]
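The distributed search can be sketched as a breadth-first flood with a hop limit. The code below is a simulation under simplifying assumptions, not the Gnutella wire protocol: the overlay is a plain adjacency dictionary, the shared `seen` set stands in for the per-query message ID that real peers use to drop duplicates, and `flood_query` and the example topology are invented for illustration.

```python
from collections import deque


def flood_query(graph, files, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    seen = {start}  # loop avoidance: each peer processes a query once
    queue = deque([(start, ttl)])
    hits, messages = [], 0
    while queue:
        node, remaining = queue.popleft()
        if wanted in files.get(node, ()):
            hits.append(node)
        if remaining == 0:
            continue  # TTL exhausted: do not forward any further
        for neighbour in graph[node]:
            messages += 1  # duplicates still cost a message on the wire
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, remaining - 1))
    return hits, messages


graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}
files = {"D": {"x"}, "F": {"x"}}

print(flood_query(graph, files, "A", "x", ttl=2))  # (['D'], 6)
print(flood_query(graph, files, "A", "x", ttl=4))  # (['D', 'F'], 11)
```

With a TTL of 2 the copy on F is never found, which is the "not all nodes are visited" caveat; raising the TTL finds it at the cost of more messages.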

Freenet

• Anonymity
• Replication, cache
• Routing
  • Local knowledge
  • Cache
  • TTL limits search

FastTrack/KaZaA

[Figure: ordinary peers upload their metadata to supernodes; the numbers mark the steps of a search]

Supernodes still use a broadcast protocol for search.

Related work: Skype

From the KaZaA community

• Promote to super nodes
  • Peer cache of some super nodes
  • Based on availability, capacity
• Protocol among super nodes: ???
• Other features
  • Auto-detect NAT/firewall settings
  • Allows searching for a user (e.g., kun*)
  • History of known buddies
  • All communication is encrypted
  • Conferencing

[Figure: Skype overlay of peers (P)]

BitTorrent

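[Figure: BitTorrent. A peer obtains the tracker URL, contacts the tracker, and downloads from the seed and from other peers.]

The tracker keeps track of all the peers that own pieces of the file and answers peer lookups.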
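The tracker's bookkeeping role can be illustrated with a small in-memory sketch. It is a deliberate simplification: a real tracker is reached over the network and exchanges encoded announce messages, whereas the `Tracker` class, its `announce` and `seeds` methods, and the peer addresses below are hypothetical.

```python
import random


class Tracker:
    """Remembers which peers participate in each torrent (the 'swarm')."""

    def __init__(self):
        self._swarms = {}  # torrent id -> {peer address: is_seed}

    def announce(self, torrent_id, address, is_seed=False, want=50):
        """Record the caller and hand back up to `want` other peers."""
        swarm = self._swarms.setdefault(torrent_id, {})
        others = [p for p in swarm if p != address]
        swarm[address] = is_seed
        return random.sample(others, min(want, len(others)))

    def seeds(self, torrent_id):
        swarm = self._swarms.get(torrent_id, {})
        return sorted(p for p, is_seed in swarm.items() if is_seed)


tracker = Tracker()
tracker.announce("torrent-1", "10.0.0.1:6881", is_seed=True)
print(tracker.announce("torrent-1", "10.0.0.2:6881"))
print(sorted(tracker.announce("torrent-1", "10.0.0.3:6881")))
print(tracker.seeds("torrent-1"))
```

The tracker never touches file data; it only introduces peers to each other, which is why the transfer itself scales with the number of participants.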

Why structured?

• Query time, number of messages, network usage, per-node state, etc.

[Figure: requirements of P2P systems, unstructured vs. structured]

• Data availability
  • Decentralization
  • Scalability
  • Load balancing
  • Fault tolerance
• Maintenance
  • Join/leave
  • Repair
• Efficient searching
  • Proximity
  • Locality
• Search guarantees
  • If present => find it
  • Flooding: not scalable
  • Blind search: inefficient

Core facility: DHT (Distributed Hash Tables)

General concepts of DHTs

• Every object has a (hash) key
• An object is stored at the node responsible for its key
• Every node maintains a small routing (hash) table consisting of its neighboring nodes
• All DHTs provide one elementary function
  • lookup(key) → node
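The DHT concepts listed above can be sketched with a toy, single-process table. The assumptions are stated plainly: keys and node names are hashed with SHA-1 into a 16-bit identifier ring, and the node "responsible" for a key is taken to be the first node at or after the key on the ring (the successor convention that Chord uses; other DHTs define responsibility differently). `ToyDHT` and `ident` are illustrative names, and a real DHT distributes this table across the nodes instead of holding it in one object.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits (assumption for this sketch)


def ident(name: str) -> int:
    """Hash a node name or object key onto the identifier ring."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2**M)


class ToyDHT:
    def __init__(self, node_names):
        self.ring = sorted((ident(n), n) for n in node_names)
        self.ids = [i for i, _ in self.ring]
        self.store = {n: {} for n in node_names}

    def lookup(self, key: str) -> str:
        """lookup(key) -> node: first node at or after the key, wrapping around."""
        pos = bisect_left(self.ids, ident(key)) % len(self.ring)
        return self.ring[pos][1]

    def put(self, key, value):
        self.store[self.lookup(key)][key] = value

    def get(self, key):
        return self.store[self.lookup(key)].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c", "node-d"])
dht.put("song.mp3", "held by peer 10.0.0.7")
print(dht.lookup("song.mp3"), dht.get("song.mp3"))
```

Every participant that applies the same hash function reaches the same node for the same key, so no flooding is needed to decide where an object lives.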

The role of DHT in structured P2P

Chord lookup

Chord lookup w/ finger table

[Figure: Chord ring with an identifier space of size $2^m$, here $m = 6$; each node's finger table has $m$ entries]
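A finger-table lookup on a ring with m = 6 can be simulated as follows. This is a sketch of the idea from the Chord paper rather than its full protocol: the ring is built with global knowledge (no join, leave, or stabilization), and the node identifiers are an assumed example. Entry i of a node's finger table points to the successor of (id + 2^i) mod 2^m.

```python
M = 6
RING = 2**M  # identifier space of size 2^m


def in_open(x, a, b):
    """True if x lies in the circular open interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # the interval wraps past zero


def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return x == b or in_open(x, a, b)


class Node:
    def __init__(self, ident):
        self.id = ident
        self.fingers = []  # fingers[i] = successor((id + 2**i) mod 2**M)

    @property
    def successor(self):
        return self.fingers[0]

    def closest_preceding(self, key):
        """Farthest finger that still lies strictly between this node and key."""
        for finger in reversed(self.fingers):
            if in_open(finger.id, self.id, key):
                return finger
        return self


def build_ring(ids):
    ids = sorted(ids)
    nodes = {i: Node(i) for i in ids}

    def successor(k):
        return nodes[next((i for i in ids if i >= k), ids[0])]

    for node in nodes.values():
        node.fingers = [successor((node.id + 2**i) % RING) for i in range(M)]
    return nodes


def lookup(start, key):
    """Return (id of the node responsible for key, ids of the nodes visited)."""
    node, path = start, [start.id]
    while not in_half_open(key, node.id, node.successor.id):
        nxt = node.closest_preceding(key)
        if nxt is node:
            break
        node = nxt
        path.append(node.id)
    return node.successor.id, path


nodes = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
print(lookup(nodes[8], 54))  # (56, [8, 42, 51])
```

Each hop roughly halves the remaining distance to the key, which is why a lookup needs on the order of log n hops with only m finger entries per node.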

Challenges

Technical Challenges of P2P

• Decentralization
• Control
• Security
• Sustainability
• Management

Decentralization

• Fully decentralized means every peer is an equal participant and no peers have special or administrative abilities

• Fully decentralized is difficult and many P2P systems are hybrids

• Decentralization is a tool, not a goal
  • Centralize the parts that need to be fast and need to scale
  • Decentralize the parts required by the problem model

Control

• Myth: P2P has no control over their systems
  Truth: P2P has no central control, but each peer is constrained by its own rules

• Myth: P2P systems must rely on an honor system and are prone to malicious users
  Truth: P2P systems have a design tradeoff, openness vs. susceptibility

• Myth: There is no way to control the data in a P2P system
  Truth: No one has super-user access to the data, but users control the data they create

• Myth: P2P has anonymous users with no accountability
  Truth: Mechanisms like pseudonyms allow anonymity while enforcing accountability

• Myth: P2P systems can't exclude known malicious users
  Truth: Decentralized user access control is possible but tricky

Security

• P2P applications can be made secure much like the IP protocol

• Encryption can ensure that a file is unread and unmodified even if it passes through the control of malicious peers (e.g. Freenet)

• Data's origin can be ensured even though anyone can add data to the system (e.g. Groove)
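The "unmodified" half of the encryption claim can be illustrated with content hashing: if a file is requested by the hash of its content, the requester can detect any change made by an intermediate peer. The sketch below covers integrity only. Keeping the file unread would additionally require encrypting it, and proving its origin would require digital signatures; neither is shown. The function names are illustrative, and actual systems such as Freenet use more elaborate key schemes.

```python
import hashlib
import hmac


def content_key(data: bytes) -> str:
    """Key under which the data is published: the SHA-256 of its content."""
    return hashlib.sha256(data).hexdigest()


def verify(key: str, data: bytes) -> bool:
    """Check that data received from an untrusted peer matches the requested key."""
    return hmac.compare_digest(key, content_key(data))


original = b"P2P lecture notes"
key = content_key(original)

print(verify(key, original))              # True
print(verify(key, b"P2P lecture n0tes"))  # False: a relaying peer altered the data
```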

Sustainability

• You need a cool idea and a critical mass
• System must be easy to use
• Normal use of the system needs to contribute to the system
• Imposition on users must be things they don't mind

Management

• Selfishness
• Equitableness
• Impact on IP networks
• Copyright and laws

Platform and tools

Jini – a service broker

• Jini is a Java-based service toolkit
• Provides a service broker called the Jini Lookup Service
• Provides a discovery and notification API
• Service stubs are passed to the requester

[Figure: Jini Lookup Service. Three Jini service providers register with the lookup service: Service X with attributes B, D; Service Y with attribute C; Service X with attributes A, D. A service requester asks "Need service X with attribute A" and is answered with Service Provider 3.]
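The brokering step in the figure, matching a request against registered services and their attributes, can be sketched in plain Python. This is not the Jini API (Jini is a Java toolkit, and it passes service stubs rather than names); `LookupService`, `register`, and `lookup` are hypothetical names that only model the matching rule.

```python
class LookupService:
    """Broker that matches a requested service name and attributes to providers."""

    def __init__(self):
        self._registry = []  # (provider, service name, attributes)

    def register(self, provider, service, attributes):
        self._registry.append((provider, service, frozenset(attributes)))

    def lookup(self, service, required=()):
        """Providers offering `service` with at least the required attributes."""
        required = set(required)
        return [
            provider
            for provider, name, attributes in self._registry
            if name == service and required <= attributes
        ]


broker = LookupService()
broker.register("Provider 1", "X", {"B", "D"})
broker.register("Provider 2", "Y", {"C"})
broker.register("Provider 3", "X", {"A", "D"})

print(broker.lookup("X", {"A"}))  # ['Provider 3']
```

Providers 1 and 3 both offer service X, but only Provider 3 carries attribute A, so it is the only match for the request in the figure.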

JXTA (Sun)

• Open platform for p2p cooperation

• Interoperability
  • Any system/peer/application
• Platform independence
  • Languages (C, Java, etc.)
  • System platforms (Unix, Windows, etc.)
  • Networking platforms (802.11, Bluetooth, TCP/IP, etc.)
• Ubiquity
  • Sensors, PDAs, routers, desktops, laptops, storage systems

JXTA (Sun)

• Objectives
  • Find peers and resources
  • Share files with anyone across the network
  • Create a particular group of peers across different networks
  • Communicate securely with peers across public networks
• Projects
  • Applications (24 projects)
  • Core (13 projects)
  • Demos (3 projects)
  • Forge (15 projects)
  • Other (12 projects)
  • Services (24 projects)

JXTA (Sun) Protocols

• Peer discovery protocol
• Peer resolver protocol
• Peer information protocol
• Pipe binding protocol
• Endpoint routing protocol
• …

JXTA (Sun)

[Figure: JXTA layers on any peer (desktop, cell phone, PDA, etc.). JXTA core: security, peer groups, peer pipes, peer monitoring. JXTA services: JXTA community services, Sun JXTA services. JXTA applications: JXTA community applications, JXTA shell, peer commands.]

PlanetLab

• Testbed to experiment with your networked applications
• >400 nodes, >150 sites
• PlanetLab consortium: 80+ universities, Intel, HP
• View presented to users: a distributed set of VMs
• Allocation unit: a slice = a set of virtual machines (VMs), one VM at each node

452 nodes, 162 sites, 450 research projects

[Figure: four nodes, each running a VMM on top of its OS; a slice spans one VM on every node]

http://www.planet-lab.org/

PlanetLab usage examples

• Stress-test your Grid services (Globus RLS)
• GSLab: a playground to experiment with grid services
• 'Better-than-Internet' services:
  • Resilient overlays
  • Multipath TCP (mTCP)
  • Multicast overlays

[Figure: four nodes, each running a VMM and an OS that hosts a user account]

Conclusions

Reviews

• P2P solutions can fit the problem model better than client-server or web solutions

• P2P solutions can do some cool things
• P2P solutions can be production quality, but have different issues than client-server or web solutions
• It is not hard to code a P2P solution
• Interesting application
• Big challenge

Final remarks

• P2P spans a very large spectrum of areas
• High interest in both academia and industry
• Much has already been done, but no conclusions are definitive
• IPv6 and P2P
  • NAT, firewalls, IPv6 as an overlay
• Many open issues
  • Trust, security, scalability, QoS, etc.

P2P related research in future

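[Diagram: research areas surrounding P2P]

• Security and protection: trust, anonymity, reputation
• Intelligent agents / Web-based services: matchmaking, service description
• Network architecture and design: network topology, routing, overlay networks
• Distributed databases: query decomposition, query distribution, arbitration
• Social networks: small-world phenomenon, power-law networks
• Business and legal issues: business models, intellectual property
• Distributed data structures: distributed hash tables, scalable distributed data

Research topics: scalable routing and object location; performance improvement; semantic overlay networks; P2P algorithms; replication; Web-based information search; incentives and fairness; privacy/security/trust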

What can we learn from P2P?

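Summary of P2P system case studies: system characteristics

System | Category | Alternative solution | Platform | Languages/tools | Distinguishing features | Network
Avaki | Distributed computing | Single-installation HPC, supercomputers | Linux, Windows, Solaris | OO, parallel Fortran, Ada, C | Distributed administration, heterogeneity, security, high parallelism | Internet, intranet
SETI@home | Distributed computing | Distributed objects | All common OSes | Closed source | Large scale | Internet
Groove | Collaboration | Web-based collaboration | Windows | JavaScript, VB, Perl, C++, XML | Replay, self-updating | Internet, intranet
Magi | Collaboration | Distributed file sharing, chat/messaging | Windows, Mac | Java, XML, HTTP, WebDAV | HTTP-based, platform independent | Internet, ad hoc
Freenet | Content sharing | Anonymous, trusted single point | Any with Java | Java implementation and APIs | Preserves anonymity | Internet
Gnutella | Content sharing | Central servers | Windows, Linux | Java, C | Protocol | Internet
JXTA | Platform | Client/server | Solaris, Linux, Windows | Java, C, Perl | Open source | Internet
.NET/My Services | Platform | Web-based | Windows | C#, VC++, JScript, VBScript, VB | Based on Microsoft applications | Internet, mobile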

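Comparison of P2P system case studies: system characteristics

System | Decentralization | Scalability | Anonymity | Self-organization | Cost of ownership | Ad hoc | Performance | Security | Transparency | Fault tolerance | Interoperability
Avaki | No center | 1000 tested, 2-3 thousand | N/A | Restructures on failure | Low | Compute resources join/leave | Speedup | Encryption, authentication, administrative domains | Location, HW/SW heterogeneity | Checkpoint/restart, reliable messaging | Interoperates with Sun Grid
SETI@home | Master-slave | Millions | Medium | Low | Very low | Compute resources join/leave | Large speedup | Proprietary | High | Timed checkpoints | IP?
Groove | Hybrid P2P | N/A | Poor | High | Low | Collaborators join/leave | Medium | Shared spaces / authentication, authorization | High | Message queuing | IP-based
Magi | Hybrid P2P | About 100 | N/A | N/A | Low | Buddies join/leave | N/A | Certificate authority | Communication with offline buddies | Message queuing | JXTA/WebDAV
Freenet | Pure P2P | Theoretically log N | High | High | Low | Peers join/leave | Medium | Anonymity / DoS prevention | High | No single point of failure |
Gnutella | Pure P2P | Thousands | Low | High | Low | Peers join/leave | Low | Unclear | Medium | Resume download | IP?
JXTA | Pure P2P | Embedded systems | N/A | N/A | Low | Peers join/leave | N/A | Encryption algorithms / distributed trust | Low | Low | Low
.NET/My Services | Hybrid | Worldwide | N/A | Medium | Low | Peers join/leave | High | Passport-based | High | Replication | SOAP/XML/UDDI/WSDL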

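Comparison of business models of the P2P case studies: system characteristics

System | Revenue model | Supported applications | Known customers | Competitors | Funding | Business model
Avaki | Product and open source | Computational grid, shared secure data | None / evaluated by scientific labs | Platform Computing, Globus | Startup | N/A
SETI@home | Academic research | Closed | Academic | cancer@home... | Government | Hardware sales plus screen saver
Groove | Product | Purchasing, sales, inventory | N/A | Magi | IPO | Alternative to Lotus collaboration tools
Magi | Product and open source | File sharing, messaging, chat | Global e-technology, media, software | Groove | Startup | N/A
Freenet | Open source | File sharing | Public | N/A | Startup | N/A
Gnutella | Open source | File sharing | Public | N/A | Public domain | Alternative P2P algorithm
JXTA | Open source and proprietary extensions | File sharing, event notification | Many P2P ports | .NET/My Services | Public domain, supported by Sun | General-purpose P2P platform
.NET/My Services | Proprietary and open-source standards | Microsoft Office, other | Large Microsoft base | AOL/J2EE/JXTA | Microsoft internal | Pervasive platform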

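System and application requirements

Solution comparison 1: by system type

Requirement | Centralized | C/S | Peer to Peer
Decentralization | Low (none) | High | Very high
Ad hoc behavior | None | Medium | High
Cost of ownership | Very high | High | Low
Anonymity | Low (none) | Medium | Very high
Scalability | Low | High | High
Performance | Individually high, aggregate low | Medium | Individually low, aggregate high
Fault tolerance | Individually high, aggregate low | Medium | Individually low, aggregate high
Self-organization | Medium | Medium | Medium
Transparency | Low | Medium | Medium
Security | Very high | High | Low
Interoperability | Standardized | Standardized | In progress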

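Solution comparison 2: by system type

Goal | Criterion | Centralized | C/S | P2P
User | Pervasiveness | Low | Medium | High
User | Technical level | Low | High | Medium
User | Complexity | High | Low | Medium
User | Trust and reputation | High | Medium | Low
Developer | Complexity | High | Straightforward | Typical - N0
Developer | Supportability | Low | High | Medium
Developer | Tools | Medium (proprietary) | High (standard) | Low (few)
Developer | Compatibility | Medium | High | Low
IT | Accountability | High | Medium | Low
IT | Being in control | High (full) | Medium | Low
IT | Manageability | Medium | High | Low
IT | Standards | Medium (proprietary) | High | Low (none)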

Main references

• Eng Keong Lua et al., "A Survey and Comparison of Peer-to-Peer Overlay Network Schemes," IEEE Communications Surveys and Tutorials, Vol 7, No 2 (Second Quarter, 2005), pp. 72-93.

• Ion Stoica, Robert Morris, et al., "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications," Proceedings of ACM SIGCOMM 2001, San Diego, CA, August 2001, pp. 149-160.

• Diego Doval and Donal O'Mahony, "Overlay networks: a scalable alternative for P2P," IEEE Internet Computing, Vol 7, No 4 (July-August 2003), pp. 79-82.

References

• Distributed Computing
  • Distributed (www.distributed.net)
  • SETI@home (www.seti.org)
  • Genome@home (gah.stanford.edu)
  • Folding@home (www.stanford.edu/group/pandegroup/folding)
  • Global Grid Forum (www.globalgridforum.org)
  • Globus Project (www.globus.org)
• File sharing
  • Napster (www.napster.com)
  • Gnutella (gnutella.wego.com)
  • Kazaa (www.kazaa.com)

References

• Distributed hash tables
  • CAN (www.acm.org/sigs/sigcomm/sigcomm2001/p13-ratnasamy.pdf)
  • Pastry (research.microsoft.com/~antr/Pastry)
  • Chord (www.pdos.lcs.mit.edu/chord)
  • Tapestry (www.cs.berkeley.edu/~ravenben/tapestry)
  • Freenet (freenet.sourceforge.net)
  • Kademlia (kademlia.scs.cs.nyu.edu)
• Ad hoc networking
  • AODV (www.ietf.org/internet-drafts/draft-ietf-manet-aodv-13.txt)
  • OLSR (www.ietf.org/internet-drafts/draft-ietf-manet-olsr-10.txt)
  • Tribe (rp.lip6.fr/site_rp/_publications/350-79Viana.ps.gz)

References

• Platforms
  • JXTA (www.jxta.org)
  • .NET (www.microsoft.com/net)
• Collaboration
  • Groove (www.groove.net)
  • Endeavors (www.endeavors.com)
• IPv6 as a p2p overlay
• Working Groups
  • p2p.internet2.edu
  • www.openp2p.com

Slides borrowed

• Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
  http://pdos.csail.mit.edu/~rtm/slides/sigcomm01.ppt

• P2P-SIP: Peer to peer Internet telephony using SIP
  http://www1.cs.columbia.edu/~kns10/research/p2p-sip/

Working groups et al.

• A generic site on p2p from O'Reilly
  • www.openp2p.com
• P2P working group
  • www.peer-to-peerwg.org/
• Internet2 p2p working group
  • p2p.internet2.edu
• Peer-to-peer development (p2p-hackers)
  • zgp.org/mailman/listinfo/p2p-hackers
• Interesting meeting
  • www.codecon.org

Reading

• CAN
• Chord
• Tapestry
• Pastry