CS252/Kubiatowicz Lec 1.1 8/25/03 CS294-4 Peer-to-Peer systems Introduction August 25, 2003 John Kubiatowicz/Anthony Joseph http://www.cs.berkeley.edu/~kubitron/courses/cs294-4-F03

CS252/Kubiatowicz Lec 1.1
8/25/03

CS294-4
Peer-to-Peer systems

Introduction

August 25, 2003

John Kubiatowicz/Anthony Joseph
http://www.cs.berkeley.edu/~kubitron/courses/cs294-4-F03


What is Peer-to-Peer?

• P2P is a communications model in which each party has the same capabilities and either party can initiate a communication session.
    Whatis.com
• P2P is a class of applications that takes advantage of resources – storage, cycles, content, human presence – available at the edges of the internet.
    Clay Shirky, openp2p.com
• A type of network in which each workstation has equivalent capabilities and responsibilities.
    Webopedia.com
• A P2P computer network refers to any network that does not have fixed clients and servers, but a number of peer nodes that function as both clients and servers to other nodes on the network.
    Wikipedia.org


Is Peer-to-peer new?

• Certainly doesn’t seem like it
  – What about Usenet? News groups: first truly decentralized system
  – DNS? Handles huge number of clients
  – Basic IP? Vastly decentralized, many equivalent routers
• One view: P2P is a reversion to the old internet
  – Remember? (Perhaps you don’t)
  – Once upon a time, all members on the internet were trusted.
  – Every machine had an IP address.
  – Every machine was a client and server.
  – Many machines were routers.
• What is new?
  – Scale: people are envisioning much larger scale
  – Security: Systems must deal with privacy and integrity
  – Anonymity: Protect identity and prevent censorship
  – (In)Stability: Deal with unstable components at the edges
    » But, can systems designed this way be more stable?


Why the hype???

• File Sharing: Napster (+Gnutella, KaZaA, etc)
  – Is this peer-to-peer? Hard to say.
  – Suddenly people could contribute to active global network
    » High coolness factor
  – Served a high-demand niche: online jukebox
• Anonymity/Privacy/Anarchy: FreeNet, Publius, etc
  – Libertarian dream of freedom from the man
    » (ISPs? Other 3-letter agencies)
  – Extremely valid concern of Censorship/Privacy
  – In search of copyright violators, RIAA challenging rights to privacy
• Computing: The Grid
  – Scavenge the numerous free cycles of the world to do work
  – SETI@home most visible version of this
• Management: Businesses
  – Suddenly Businesses have discovered extreme distributed computing
  – Does P2P mean “self-configuring” from equivalent resources?
  – Bound up in “Autonomic Computing Initiative”?


Who am I (Kubi)? OceanStore Pundit

• Computing everywhere:
  – Desktop, Laptop, Palmtop
  – Cars, Cellphones
  – Shoes? Clothing? Walls?
• Connectivity everywhere:
  – Rapid growth of bandwidth in the interior of the net
  – Broadband to the home and office
  – Wireless technologies such as CDMA, satellite, laser
• Where is persistent data????


Utility-based Infrastructure

[Figure: federation of storage providers; labels: Pac Bell, Sprint, IBM, AT&T, Canadian OceanStore, IBM]

• Data service provided by storage federation
• Cross-administrative domain
• Pay for Service


OceanStore: Everyone’s Data, One Big Utility
“The data is just out there”

• How many files in the OceanStore?
  – Assume 10^10 people in world
  – Say 10,000 files/person (very conservative?)
  – So 10^14 files in OceanStore!
  – If 1 gig files (ok, a stretch), get 1 mole of bytes!

Truly impressive number of elements…
… but small relative to physical constants

Aside: new results: 1.5 Exabytes/year (1.5×10^18)
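
The counting argument on this slide can be checked with a few lines of arithmetic. The sketch below is only a back-of-envelope aid: the constant names are made up for illustration, "1 gig" is taken to mean 10^9 bytes, and Avogadro's number is included just to show how close 10^23 bytes comes to "1 mole".

```python
# Back-of-envelope check of the slide's numbers.
# All inputs are the slide's own assumptions; the names are illustrative.
PEOPLE = 10**10            # assumed number of people in the world
FILES_PER_PERSON = 10**4   # "10,000 files/person"
BYTES_PER_FILE = 10**9     # "1 gig" per file, taken here as 10^9 bytes
AVOGADRO = 6.022e23        # entities per mole, for comparison only


def total_files(people=PEOPLE, files_per_person=FILES_PER_PERSON):
    """Number of files if every person stores files_per_person files."""
    return people * files_per_person


def total_bytes(bytes_per_file=BYTES_PER_FILE):
    """Total bytes if every file has the same assumed size."""
    return total_files() * bytes_per_file


if __name__ == "__main__":
    print(f"files: {total_files():.0e}")
    print(f"bytes: {total_bytes():.0e}")
    print(f"fraction of a mole: {total_bytes() / AVOGADRO:.2f}")
```

Under these assumptions the total is 10^23 bytes, roughly a sixth of Avogadro's number, so "1 mole of bytes" holds to within an order of magnitude.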


OceanStore Assumptions

• Untrusted Infrastructure:
  – The OceanStore is comprised of untrusted components
  – Individual hardware has finite lifetimes
  – All data encrypted within the infrastructure
• Responsible Party:
  – Some organization (i.e. service provider) guarantees that your data is consistent and durable
  – Not trusted with content of data, merely its integrity
• Mostly Well-Connected:
  – Data producers and consumers are connected to a high-bandwidth network most of the time
  – Exploit multicast for quicker consistency when possible
• Promiscuous Caching:
  – Data may be cached anywhere, anytime


Some Basic Questions about Peer-to-Peer


Does Full Symmetry Make Sense?

“All nodes are created equal”

• Most distributed algorithms need points of serialization
  – UseNet may be one of few classes of algorithms that don’t
• Nodes have distinguished capabilities
  – Better Connectivity, More memory, Better management, …


Possible advantages of Full Symmetry

• We will call this “purist peer-to-peer”
• Anonymity/Deniability
  – When combined with cryptographic techniques, no distinguished nodes to go after
  – This is an important property today!
• Algorithms Easier to Analyze
  – Differentiation always makes things harder to analyze
  – Almost all recent P2P papers use Purist Assumption
• How does (should?) Hierarchy and Equality interact??
  – This is a question we should answer by end of term


The Path of an OceanStore Update

[Figure: update path; labels: Clients, Inner-Ring Servers, Multicast trees, Second-Tier Caches]


Can P2P Overlay Networking Buy you Something?

[Figure: two panels. Traditional System: Client and Server over the IP Network. P2P Tunneling System: Client and Server over an Overlay on top of the IP Network]


Enabling Technology: DOLR (Decentralized Object Location and Routing)

[Figure: DOLR layer routing to objects labeled GUID1, GUID1, GUID2]
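
To make the DOLR idea concrete, here is a toy sketch of a publish/locate interface keyed by GUID. It is not Tapestry's algorithm and not the OceanStore API: the class `ToyDOLR`, the helper `guid`, the 32-bit SHA-1 identifiers, and the "numerically closest node is the root" rule are all assumptions made for illustration. A real DOLR reaches the root by decentralized hop-by-hop routing; this toy uses a global view of the nodes.

```python
import hashlib


def guid(name: str) -> int:
    """Hypothetical GUID: the first 32 bits of SHA-1 over a name."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest()[:4], "big")


class ToyDOLR:
    """Toy location service: each object GUID maps to a 'root' node, the node
    whose ID is numerically closest, and the root keeps location pointers."""

    def __init__(self, node_names):
        self.nodes = {guid(n): n for n in node_names}   # node ID -> node name
        self.pointers = {nid: {} for nid in self.nodes}  # per-node pointer table

    def root_of(self, object_guid: int) -> int:
        # Global-view shortcut; a real system would route toward this node.
        return min(self.nodes, key=lambda nid: abs(nid - object_guid))

    def publish(self, object_guid: int, server_name: str) -> None:
        """Record at the root that server_name holds a replica of the object."""
        root = self.root_of(object_guid)
        self.pointers[root].setdefault(object_guid, set()).add(server_name)

    def locate(self, object_guid: int):
        """Return the sorted list of servers known to hold the object."""
        root = self.root_of(object_guid)
        return sorted(self.pointers[root].get(object_guid, set()))


if __name__ == "__main__":
    dolr = ToyDOLR([f"node{i}" for i in range(8)])
    g1 = guid("object-1")
    dolr.publish(g1, "node3")   # two replicas of the same GUID,
    dolr.publish(g1, "node5")   # echoing the repeated GUID1 in the figure
    print(dolr.locate(g1))
```

The point of the interface is that clients name objects by GUID only; where replicas live is the location layer's concern.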


Planetlab DOLR Service

(May 2003: 1.5 TB over 4 hours)

Could experiment with Tapestry as a real service

DOLR Model generalizes to many simultaneous apps


Can Replication be used more Effectively in Peer-to-Peer systems?

• Exploit law of large numbers for durability!
• 6 month repair, FBLPY:
  – Replication: 0.03
  – Fragmentation: 10^-35
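
The gap between the two FBLPY figures (read here as fraction of blocks lost per year) comes from the binomial tail. The sketch below compares whole-block replication with an m-of-n erasure code at the same storage overhead. It assumes servers fail independently with probability `p_fail` between repairs; the function names and the example parameters (10% failure, 2 replicas versus 16-of-32 fragments) are illustrative and are not the parameters behind the slide's numbers, which the slide does not give.

```python
from math import comb


def loss_replication(p_fail: float, replicas: int) -> float:
    """A replicated block is lost only if every replica fails before repair."""
    return p_fail ** replicas


def loss_erasure(p_fail: float, n: int, m: int) -> float:
    """An m-of-n coded block is lost if fewer than m of its n fragments survive."""
    p_ok = 1.0 - p_fail
    return sum(comb(n, k) * p_ok**k * p_fail ** (n - k) for k in range(m))


if __name__ == "__main__":
    p = 0.1  # assumed per-server failure probability within one repair epoch
    # Same 2x storage overhead in both cases.
    print("replication (2 copies):", loss_replication(p, 2))
    print("fragments (16 of 32):  ", loss_erasure(p, 32, 16))
```

Spreading the same overhead across many fragments means loss requires many simultaneous failures, which is the "law of large numbers" effect the slide refers to.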


Archival Dissemination of Fragments


The Dissemination Process: Achieving Failure Independence

[Figure: dissemination pipeline; boxes: Model Builder, Set Creator, Introspection, Human Input, Network Monitoring, Inner Ring (twice); edge labels: model, set, probe, type, fragments]


Peer-to-peer Goal: Stable, large-scale systems

• State of the art:
  – Chips: 10^8 transistors, 8 layers of metal
  – Internet: 10^9 hosts, terabytes of bisection bandwidth
  – Societies: 10^8 to 10^9 people, 6-degrees of separation
• Complexity is a liability!
  – More components ⇒ Higher failure rate
  – Chip verification > 50% of design team
  – Large societies unstable (especially when centralized)
  – Small, simple, perfect components combine to generate complex emergent behavior!
• Can complexity be a useful thing?
  – Redundancy and interaction can yield stable behavior
  – Better figure out new ways to design things…
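
The two sides of this slide can be put into one small calculation. If every component is required, more components make failure more likely; if components are interchangeable, more of them make failure less likely. The sketch assumes independent failures with a common probability; the function names and numbers are illustrative only.

```python
def p_any_fails(p_fail: float, n: int) -> float:
    """Chance that at least one of n independent components fails
    (the system breaks if every component is required)."""
    return 1.0 - (1.0 - p_fail) ** n


def p_all_fail(p_fail: float, n: int) -> float:
    """Chance that all n independent, interchangeable components fail
    (the system breaks only if no redundant copy is left)."""
    return p_fail ** n


if __name__ == "__main__":
    p = 0.001  # assumed per-component failure probability
    for n in (1, 10, 100, 1000):
        print(f"n={n:5d}  any fails: {p_any_fails(p, n):.4f}")
    for n in (1, 2, 3):
        print(f"n={n:5d}  all fail:  {p_all_fail(p, n):.1e}")
```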


Oceanstore Observation: Want Automatic Maintenance

• Can’t possibly manage billions of servers by hand!
• System should automatically:
  – Adapt to failure
  – Exclude malicious elements
  – Repair itself
  – Incorporate new elements
• System should preserve data over the long term (accessible for 1000 years):
  – Geographic distribution of information
  – New servers added from time to time
  – Old servers removed from time to time
  – Everything just works


The Thermodynamic Analogy

• Large Systems have a variety of latent order
  – Connections between elements
  – Mathematical structure (erasure coding, etc)
  – Equivalent/interchangeable elements
  – Distributions peaked about some desired behavior
• Permits “Stability through Statistics”
  – Exploit the behavior of aggregates (redundancy)
• Subject to Entropy
  – Servers fail, attacks happen, system changes
• Requires continuous repair
  – Apply energy (i.e. through servers) to reduce entropy
  – Introspection restores distributions
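
A small simulation shows why repair has to be continuous. The sketch tracks one erasure-coded object with n fragments, of which any m suffice. In each epoch every surviving fragment is lost independently with probability `p_fail`, and with repair enabled the object is re-spread to n fragments whenever it is still recoverable. The model, the function name `epochs_survived`, and the parameters are assumptions for illustration, not a description of OceanStore's repair process.

```python
import random


def epochs_survived(n, m, p_fail, epochs, repair, seed=0):
    """Return how many epochs the object stayed recoverable (at most `epochs`).

    n: fragments after a full repair, m: fragments needed to reconstruct,
    p_fail: per-fragment loss probability per epoch (independent losses assumed),
    repair: if True, restore to n fragments at the end of each surviving epoch.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    alive = n
    for epoch in range(epochs):
        # Entropy: each surviving fragment is lost with probability p_fail.
        alive = sum(1 for _ in range(alive) if rng.random() >= p_fail)
        if alive < m:
            return epoch       # too few fragments left: object is lost
        if repair:
            alive = n          # energy: regenerate the missing fragments
    return epochs


if __name__ == "__main__":
    print("with repair:   ", epochs_survived(32, 16, 0.1, 200, repair=True))
    print("without repair:", epochs_survived(32, 16, 0.1, 200, repair=False))
```

Without repair the fragment count only decays, so redundancy buys a delay; with repair the distribution is pulled back toward n every epoch.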


Statistical Advantage of Fragments

• Latency and standard deviation reduced:
  – Memory-less latency model
  – Rate ½ code with 32 total fragments

[Figure: Time to Coalesce vs. Fragments Requested (TI5000); x-axis: Objects Requested, 16 to 31; y-axis: Latency, 0 to 180]
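
One way to see why both latency and its spread drop is to work the memory-less model by hand. The sketch assumes each requested fragment arrives after an independent exponential delay with the same mean, and that a rate ½ code with 32 fragments can reconstruct from any 16. The time to coalesce is then the 16th-fastest arrival among the fragments requested, whose mean and standard deviation have closed forms. The function name and the unit mean latency are illustrative; this is not the measurement behind the slide's plot.

```python
from math import sqrt

NEEDED = 16   # rate 1/2 code with 32 fragments: any 16 are assumed to suffice
TOTAL = 32


def coalesce_stats(requested, needed=NEEDED, mean_latency=1.0):
    """Mean and standard deviation of the time until `needed` of `requested`
    fragments have arrived, with i.i.d. exponential (memory-less) delays.

    Uses the order-statistic result for exponentials: the gaps between
    successive arrivals are independent exponentials with means
    mean_latency / (requested - i) for i = 0 .. needed - 1.
    """
    if requested < needed:
        raise ValueError("must request at least as many fragments as needed")
    gaps = [1.0 / (requested - i) for i in range(needed)]
    mean = mean_latency * sum(gaps)
    std = mean_latency * sqrt(sum(g * g for g in gaps))
    return mean, std


if __name__ == "__main__":
    for requested in range(NEEDED, TOTAL):
        mean, std = coalesce_stats(requested)
        print(f"requested={requested:2d}  mean={mean:.3f}  std={std:.3f}")
```

Requesting exactly 16 means waiting for the slowest of them; each extra fragment requested lets the reader ignore one more straggler.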


The Biological Inspiration

• Biological Systems are built from (extremely) faulty components, yet:
  – They operate with a variety of component failures ⇒ Redundancy of function and representation
  – They have stable behavior ⇒ Negative feedback
  – They are self-tuning ⇒ Optimization of common case
• Introspective (Autonomic) Computing:
  – Components for performing
  – Components for monitoring and model building
  – Components for continuous adaptation

[Figure: cycle labeled Adapt, Dance, Monitor]


ThermoSpective

• Many Redundant Components (Fault Tolerance)
• Continuous Repair (Entropy Reduction)

[Figure: cycle labeled Adapt, Dance, Monitor]


What does this really mean?

• Redundancy, Redundancy, Redundancy:
  – Many components that are roughly equivalent
  – System stabilized by consulting multiple elements
  – Voting/signature checking to exclude bad elements
  – Averaged behavior/Median behavior/First Arriving
• Passive Stabilization
  – Elements interact to self-correct each other
  – Constant resource shuffling
• Active Stabilization
  – Reevaluate and Restore good properties on wider scale
  – System-wide property validation
  – Negative feedback/chaotic attractor
• Observation and Monitoring
  – Aggregate external information to find hidden order
  – Use to tune functional behavior and recognize dysfunctional behavior.
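
Two of the redundancy tactics listed above, voting and median behavior, fit in a few lines. The sketch assumes replies from roughly equivalent elements have already been collected; `majority_vote` and `robust_estimate` are hypothetical helper names, and signature checking is left out.

```python
from collections import Counter
from statistics import median


def majority_vote(replies):
    """Return the value reported by a strict majority of elements, or None
    if there is no such value (or no replies at all)."""
    if not replies:
        return None
    value, count = Counter(replies).most_common(1)[0]
    return value if count > len(replies) / 2 else None


def robust_estimate(samples):
    """Median of numeric reports: a minority of wild values cannot drag it
    far, unlike a plain average."""
    return median(samples)


if __name__ == "__main__":
    # One faulty or malicious element out of three is outvoted.
    print(majority_vote(["v7", "v7", "v3"]))
    # One absurd latency report barely moves the median.
    print(robust_estimate([10, 11, 12, 1000]))
```

Both helpers only work while bad elements stay a minority, which ties back to the threshold concern raised later in the lecture.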


Problems?

• Most people don’t know how to think about this
  – Requires new way of thinking
  – Some domains closer to thermodynamic realm than others: peer-to-peer networks fit well
• Stability?
  – Positive feedback/oscillation easy to get accidentally
• Cost?
  – Power, bandwidth, storage, …
• Correctness?
  – System behavior achieved as aggregate behavior
  – Need to design around fixed point or chaotic attractor behavior (How does one think about this?)
  – Strong properties harder to guarantee
• Bad case could be quite bad!
  – Poorly designed ⇒ Fragile to directed attacks
  – Redundancy below threshold ⇒ failure rate increases drastically


What about Game Theory?

• If the individual components are selfish can we:
  – Somehow get good aggregate behavior?
  – The search landscape for Game Theoretic and Thermodynamic systems is different!
• Are there aggregate mechanisms for enforcing correctness among selfish peers?


Course Information

• Time: Mon/Wed 10:30 – 12:00, 405 Soda Hall
• Faculty: John Kubiatowicz, Anthony Joseph
• Web: http://www.cs.berkeley.edu/~adj/cs294-4.f03
• Format: Two papers/day + occasional guest
• Course Prereqs: CS122 (networking), CS162 (OS)
• Breakdown (tentative):
  – Class presentation: 30% (two presentations each worth 15%)
  – Class project: 70%
• Projects: team-based and open ended.
  – Choose your own
  – Pick one from our projects page (still not totally decided)


What is a presentation?

• Presentations are in PowerPoint
• Target a 30-minute presentation
  – That’s about 15 slides
• You will be graded on several things:
  – Do you manage to present the high-level points of the paper?
  – Do you manage to get a good discussion going?
  – Do you relate this paper to other papers/topics in the course? (CONTEXT!)
• For every paper find:
  – 3 most important points
  – 3 flaws with the assumptions/methodology/etc.


Why have Students Present?

• This is a lot of fun
• You get more viewpoints than just ours
• We (Kubi and Anthony) get to learn something from you!!!!
• I (Kubi) will consider this a great success if I learn something new from you this term.


List of Topics

• Study existing systems:
  – Gnutella, Freenet, Tapestry, RON, OceanStore, FarSite, Ivy, PAST, Pastiche, …
• Study Fundamentals:
  – Byzantine Agreement, Logical Clocks, Reliable Group communication, Quorum systems, Game Theory, Authentication, Security, …
• Study Behaviors
  – Measurements of existing systems
• Answer Questions:
  – Is peer-to-peer something new?
  – Can peer-to-peer philosophy offer something new?
  – Should we give up and do something else?