Logically Centralized, Physically Distributed

Mark Stuart Day

Cisco Systems

Standard disclaimer

No matter what I say in this talk, I’m not making any Lotus product commitments.

Cisco

Outline

• What people want

• What people can have

• An ancient example:

  – Replicated mail repository

• A recent example:

  – Content distribution network

• Conclusions

What people want

• Single name/location for single logical service

• Service never goes down

• Service grows/shrinks smoothly

What people can have

• Single name/location for single logical service

• Service never goes down

• Service grows/shrinks smoothly

• Occasional weird errors that violate user expectations

Some ancient history

MIT-LCS-TR-376, Date: May 1987

REPLICATION AND RECONFIGURATION IN A DISTRIBUTED MAIL REPOSITORY

Author(s): Day, M.S.

Pages: 110

Price: $18.00

AD Number: A186967

Keywords: data replication, software reconfiguration, availability, reliability, scalable systems, distributed programs, electronic mail repositories, programming languages

Mail system architecture (think of Grapevine)

[Diagram: a Client, a Directory, and Mailboxes 1 through 4]

Highly available email

[Diagram: a Client, a Directory, and three copies each of Mailbox 1 and Mailbox 2]

How did it work?

• Systems success

  – Nice capability for quorum adjustment

  – New directory algorithm for deletions

  – Cool dynamic reconfiguration

• User failure

  – “What do you mean I can’t delete that message?”

  – “Where’s that message gone?”
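The "quorum adjustment" bullet above can be illustrated with a minimal sketch. This is not the algorithm from the technical report; it only shows the standard overlap rule for quorum replication (every read set must intersect every write set), and the names `QuorumConfig` and `adjust` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical quorum settings for one replicated mailbox."""
    replicas: int      # N: number of copies of the mailbox
    read_quorum: int   # R: replicas that must answer a read
    write_quorum: int  # W: replicas that must acknowledge a write

    def is_consistent(self) -> bool:
        # A read sees the latest write only if read and write sets
        # always overlap (R + W > N); two writes cannot both succeed
        # on disjoint sets of replicas if 2W > N.
        return (
            1 <= self.read_quorum <= self.replicas
            and 1 <= self.write_quorum <= self.replicas
            and self.read_quorum + self.write_quorum > self.replicas
            and 2 * self.write_quorum > self.replicas
        )


def adjust(config: QuorumConfig, read_quorum: int, write_quorum: int) -> QuorumConfig:
    """Return an adjusted config, refusing changes that break overlap."""
    candidate = QuorumConfig(config.replicas, read_quorum, write_quorum)
    if not candidate.is_consistent():
        raise ValueError("read and write quorums would no longer overlap")
    return candidate
```

With three replicas, moving from R=2, W=2 to R=1, W=3 makes reads cheaper at the cost of writes that need every replica up; R=1, W=2 is rejected because a read could miss the latest write, which is one way a user ends up asking where a message went.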

A recent example: Content distribution networks

• Akamai, Digital Island, Mirror Image, Adero, …

• $Millions in revenue

• $Billions in market capitalization

• Might be worth knowing something about

The bad old days (without content distribution)

[Diagram: the Client sends "GET some/piece/o/content" directly to the Origin Server]

New and improved (with content distribution)

[Diagram: Client, Origin Server, three Delivery Nodes, two Request Routers, and two Content Routers; two GET requests are shown]

Virtues

• Client unchanged

• Origin server mostly unchanged

  – Content URLs may be modified

• Add delivery nodes transparently

• Move content around transparently

Caveats

• Lots of detail missing

  – Request routing: HTTP redirection, DNS interception, IP hijacking

  – Content routing: application-level multicast, IP multicast

• Both request routing and content routing are nontrivial problems
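As a rough illustration of the simplest request-routing option listed above, HTTP redirection, here is a minimal sketch. The selection policy (prefer a live node in the client's region, else any live node, else the origin) and all names and hostnames are assumptions for illustration, not a description of any vendor's router.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DeliveryNode:
    host: str           # hypothetical hostname of a delivery node
    region: str         # hypothetical coarse location label
    alive: bool = True  # whether the router believes the node is up


def route_request(path: str, client_region: str,
                  nodes: List[DeliveryNode], origin_host: str) -> Tuple[int, str]:
    """Return (status, Location) for an HTTP redirect answering a GET."""
    live = [n for n in nodes if n.alive]
    # Prefer a live node in the client's region, else any live node.
    nearby = [n for n in live if n.region == client_region]
    candidates = nearby or live
    # Fall back to the origin server when no delivery node is usable.
    host = candidates[0].host if candidates else origin_host
    return 302, f"http://{host}/{path.lstrip('/')}"


nodes = [
    DeliveryNode("dn1.example.net", "us"),
    DeliveryNode("dn2.example.net", "eu"),
]
status, location = route_request(
    "some/piece/o/content", "eu", nodes, "origin.example.com")
```

Even this toy version hints at why the problem is nontrivial: the router's view of `alive` and `region` can be stale or wrong, and a wrong answer sends the client to a box that cannot serve it.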

Weird user-visible errors

• Routed to failed box

  – Content fails to appear

  – Depending on routing/caching, maybe no content from that domain ever appears again for that client

Making weird errors into not-so-weird errors

• Deploy “next-click failover”

  – Delivery nodes clustered into “supernodes” with switch

  – Supernode monitors failures

  – IP addresses of failed nodes remapped onto live nodes

• Result is similar to common Web behavior

  – “What the hey?” [click] “Oh, OK.”
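The remapping step described above can be sketched as follows. This is a minimal model under stated assumptions: the `Supernode` class, the round-robin reassignment, and the example addresses are all hypothetical, and a real deployment would do this in the switch rather than in application code.

```python
from typing import Dict


class Supernode:
    """Hypothetical cluster of delivery nodes behind one switch."""

    def __init__(self, assignments: Dict[str, str]):
        # Maps each service IP address to the node currently answering for it.
        self.owner = dict(assignments)
        self.live = set(self.owner.values())

    def mark_failed(self, node: str) -> None:
        """Remap the failed node's IP addresses onto the surviving nodes."""
        self.live.discard(node)
        if not self.live:
            raise RuntimeError("no live nodes left in supernode")
        survivors = sorted(self.live)
        orphaned = sorted(ip for ip, n in self.owner.items() if n == node)
        # Spread the orphaned addresses round-robin over the survivors.
        for i, ip in enumerate(orphaned):
            self.owner[ip] = survivors[i % len(survivors)]


sn = Supernode({"192.0.2.1": "a", "192.0.2.2": "b", "192.0.2.3": "c"})
sn.mark_failed("b")
```

The request that was in flight when node "b" died still fails, but the client's next click goes to the same IP address and is answered by a live node, which is the behavior the slide calls next-click failover.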

Conclusion

• People want something that’s logically centralized, physically distributed

• But they don’t want the weird errors that come with distribution

• A great thing about the Web:

  – People are already used to some weird errors
