We're all distributed systems devs now: a crash course in distributed programming

Post on 16-Apr-2017

We're All Distributed Systems Developers Now

By Aaron Stannard, Founder & CEO

Petabridge

[Diagram: User Requests arrive at a Load Balancer in front of five WebServer nodes. A PRIVATE ZONE holds SQL (Master), SQL (Slave), and OLAP, which serves Internal Admin, Reports, and BI.]

[Diagram: the same layout (Load Balancer, five WebServer nodes, OLAP serving Internal Admin, Reports, and BI), now with two SQL (Master) / SQL (Slave) pairs around the PRIVATE ZONE.]

Obvious Solution: Sharding

[Diagram: five shards, each a Master with two Slaves. A Coordinator sits between the shards and three Clients.]
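As a rough illustration of what the Coordinator in a layout like this has to do, here is a minimal sketch that routes a key to one of five shards. It assumes hash-based routing, which the slides do not specify; the shard names and the `shard_for` function are hypothetical.

```python
import hashlib

# Hypothetical shard names: one per Master in the diagram.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3", "shard-4"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Pick a shard for a key using a stable hash.

    hashlib is used instead of the built-in hash(), which is salted
    per process for strings and would route differently after a restart.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

With plain modulo routing like this, changing the number of shards remaps most keys, which is one reason a setup of this kind can be hard to grow or repair.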

Brittle

[Diagram: the same sharded layout (five Masters with two Slaves each, a Coordinator, and three Clients), with four additional nodes labeled Master.]

Scenario 2: Real-time User Interactivity

User generates event

Has user produced events 0-3?

Yes: Send message back to user

No: Wait for more events
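The decision in this flow can be sketched as a small in-memory tracker. This is an illustration only: the `EventTracker` class and its method names are hypothetical, and it assumes every event for a given user reaches the same process.

```python
from collections import defaultdict

# The four events the flow waits for.
REQUIRED_EVENTS = frozenset({0, 1, 2, 3})


class EventTracker:
    """Remembers which events each user has produced so far."""

    def __init__(self):
        self._seen = defaultdict(set)

    def record(self, user_id: str, event_id: int) -> bool:
        """Store the event; return True once events 0-3 have all been seen."""
        self._seen[user_id].add(event_id)
        return REQUIRED_EVENTS <= self._seen[user_id]
```

A caller would send the message back to the user when `record` returns True and otherwise keep waiting. Arrival order does not matter here, but the check only works if one place holds all of a user's events.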

Obvious Solution: Read-after-Write

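[Diagram: USER sends Events 0...3 through the HTTP LOAD BALANCER to the Web Tier, which writes each event to the Data Tier and reads back what is stored. After event 0 the read returns 0; after event 1, 0,1; after event 2, 0,1,2; after event 3, 0,1,2,3, and the MESSAGE is sent to the user.]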

Reality

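[Diagram: USER sends Events 0...3 through the HTTP LOAD BALANCER, which spreads them across the Web Tier. Events 0 and 3 take one path and events 1 and 2 another, and what each path sees in the Data Tier is uncertain (0?? 3?? and 1?? 2??).]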

Distributed Systems 101

Distributed Systems Theories and Concepts

Decentralized Architectures

Event & Message Driven Programming

Stateful Applications

CAP Theorem

Fault & Resource Isolation

Elastic (Leave)

[Diagram: cluster of nodes A, B, C, D, E, F]

All other nodes are notified of F leaving

Recover from Failures

[Diagram: cluster of nodes A, B, C, D, E, F. F is labeled (Critical State) and D is labeled (takeover).]

All other nodes are notified of failure

Availability through Replication

[Diagram: cluster of nodes A (replica 1), B, C, D (replica 2), E, F (replica 3)]

Event and Message Driven Programming

RPC / Web Service Call

Front End to Application Server: HTTP POST ....

Application Server to Front End: HTTP 201 ....

Message Passing

Front End to Application Server: Serialized Message

Application Server to Front End: 0-N Response Messages

Properties of Messages

Always comprised of two parts: Payload (data) and Reply-to address

Always asynchronous

Can be serialized and stored

Can be ordered and re-ordered (deferrals)

Can be forwarded and delegated

Can be received by multiple parties
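A few of these properties can be shown with a small envelope type. This is a sketch, not any particular framework's message format: the `Message` class, its field names, and the choice of JSON for serialization are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Message:
    """Two parts: a payload (data) and a reply-to address."""

    payload: Any
    reply_to: Optional[str] = None  # hypothetical address format, e.g. "node-a/inbox"

    def serialize(self) -> str:
        """Turn the message into text so it can be stored or sent over the wire."""
        return json.dumps(asdict(self))

    @staticmethod
    def deserialize(raw: str) -> "Message":
        data = json.loads(raw)
        return Message(payload=data["payload"], reply_to=data["reply_to"])

    def forwarded(self) -> "Message":
        """Copy for forwarding: reply_to stays, so replies go to the original sender."""
        return Message(payload=self.payload, reply_to=self.reply_to)
```

Because the reply-to address travels with the message, a node that forwards it does not need to sit in the return path.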

Messaging Patterns

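[Diagram: messaging patterns between nodes]

Broadcast: one Node sends to several Nodes

Proxy / Forward: a Node passes a message on to another Node

Pub-sub: Nodes subscribe (Subs) to a Node, which then sends each of them the msg

One-way: Node to Node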
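The pub-sub pattern can be sketched in a few lines of in-process code. The `PubSub` class below is hypothetical and ignores networking entirely; it only shows the subscribe-then-deliver shape.

```python
from collections import defaultdict
from typing import Any, Callable


class PubSub:
    """In-process publish/subscribe: handlers register per topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, msg: Any) -> int:
        """Deliver msg to every subscriber of topic; return how many received it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(msg)
        return len(handlers)
```

Broadcast is the special case where every node is subscribed, and one-way is a publish with a single subscriber and no reply.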

Messaging Protocols

[Diagram: cluster of nodes A, B (R1), C (R2), D, E (R3), F. A Write arrives, "Can we accept?" is asked twice, "Yes" comes back twice, and COMMIT is then sent three times.]
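The ask-then-commit exchange in this diagram can be sketched as follows. It is a simplified, two-phase-style illustration with hypothetical names (`Replica`, `replicated_write`); it ignores timeouts and crashes between the two phases, which real protocols must handle.

```python
class Replica:
    """A replica that can vote on a write and then commit it."""

    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy
        self.committed = []

    def can_accept(self, value) -> bool:
        # Assumption for this sketch: a replica says yes whenever it is healthy.
        return self.healthy

    def commit(self, value) -> None:
        self.committed.append(value)


def replicated_write(value, replicas) -> bool:
    """Phase 1: ask every replica "Can we accept?". Phase 2: COMMIT only if all said yes."""
    if not all(r.can_accept(value) for r in replicas):
        return False
    for r in replicas:
        r.commit(value)
    return True
```

Nothing is committed anywhere unless every replica agrees first, which is what makes this style of protocol consistent and also what makes it slower and less available.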

Gossip: How Nodes Discover Each Other

[Diagram: nodes A (Seed), B, C]

1. Join

2. Share gossip about other nodes

3. Join

4. Share gossip about other nodes

5. A --> C: did you know about B?

5. A --> B: did you know about C?

6. Connect
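The join sequence above can be sketched as a one-shot simulation. Real gossip protocols exchange membership periodically with randomly chosen peers; this sketch collapses that into a single pass, and the `Node` class and `cluster` lookup table are hypothetical.

```python
class Node:
    """A cluster member that tracks which other members it knows about."""

    def __init__(self, name: str):
        self.name = name
        self.known = {name}

    def receive_gossip(self, members) -> None:
        self.known |= set(members)

    def join(self, seed: "Node", cluster: dict) -> None:
        """Join through a seed node; cluster maps node name to Node (stands in for the network)."""
        # Steps 1 / 3: the new node contacts the seed.
        seed.known.add(self.name)
        # Steps 2 / 4: the seed shares gossip about other nodes.
        self.receive_gossip(seed.known)
        # Step 5: the seed tells existing members about the newcomer.
        for name in seed.known - {seed.name, self.name}:
            cluster[name].receive_gossip(seed.known)
```

After B and then C join through A, all three nodes know about each other and can connect directly (step 6).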

Stateful Apps Serve Results from Memory

[Diagram: Web Server (Stateless), App Server (Stateful), Database Server (Stateful)]

1. Request

2. Get or Update

3. Response

4. Response

Async writes, async reads
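One way to picture a stateful app server is a process that answers from its own memory and pushes writes to the database later. The sketch below assumes a write-behind approach, which the slide does not spell out; the `StatefulAppServer` class is hypothetical, a plain dict stands in for the database, and `flush` stands in for a background writer.

```python
import queue


class StatefulAppServer:
    """Serves reads from memory and defers database writes."""

    def __init__(self, database: dict):
        self._db = database          # stand-in for the Database Server
        self._memory = {}            # state held by the app server
        self._pending = queue.Queue()

    def get(self, key):
        """Answer from memory; touch the database only on first access."""
        if key not in self._memory:
            self._memory[key] = self._db.get(key)
        return self._memory[key]

    def update(self, key, value) -> None:
        """Update memory immediately and queue the database write."""
        self._memory[key] = value
        self._pending.put((key, value))

    def flush(self) -> int:
        """Apply queued writes to the database; return how many were written."""
        written = 0
        while not self._pending.empty():
            key, value = self._pending.get_nowait()
            self._db[key] = value
            written += 1
        return written
```

The response to the web server does not wait on the database, at the cost of the database lagging behind memory until the queued writes land.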

Fastest Response Time?

[Diagram labels: App Server (Stateless), App Server (Stateful), Database Server (Stateful), State, Request / Response]

Which of these two service architectures will produce the fastest response time for the same data?

State Makes Protocols Work

At Most Once (Sender, Receiver; no State): message may be lost

At Least Once (Sender, Receiver; State on one side): message may be duplicated, message ordering issues

Exactly Once (Sender, Receiver; State on both sides): message ordering issues, expensive
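The role of state in these guarantees can be sketched with a retrying sender and a de-duplicating receiver. The names below are hypothetical, and `deliver` is an assumed stand-in for a transport that returns True only when an acknowledgement makes it back.

```python
class DedupReceiver:
    """Receiver-side state: remembers message ids so duplicates are processed once."""

    def __init__(self):
        self._seen_ids = set()
        self.processed = []

    def receive(self, msg_id: str, payload) -> bool:
        if msg_id not in self._seen_ids:
            self._seen_ids.add(msg_id)
            self.processed.append(payload)
        return True  # always acknowledge, even for a duplicate


def send_at_least_once(receiver, msg_id, payload, deliver, max_attempts=5) -> int:
    """Sender-side state: keep the message and retry until it is acknowledged.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        if deliver(receiver, msg_id, payload):
            return attempt
    raise RuntimeError(f"no acknowledgement for {msg_id} after {max_attempts} attempts")
```

Without the retry loop a lost message is simply gone (at most once). With retries but no receiver state, a lost acknowledgement causes the payload to be processed twice. Keeping state on both sides lets duplicates be filtered out, which is the extra cost the slide refers to.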

CAP Theorem

C: Consistency

A: Availability

P: Partition Tolerance

CAP Gradient: have to trade off between all three.

CAP Terminologies

Consistency: All nodes see the same data at the same time

Availability: Guarantee that every request receives an explicit response

Partition Tolerance: System is able to continue despite arbitrary partitioning due to network failures

CAP Trade-offs

Want higher consistency?
  Be less available
  More latency
  Or run on fewer machines

Want higher availability?
  Be less consistent
  Potential for stale reads
  Run on more machines

Want more partition tolerance?
  Design a way for nodes to work when disconnected (This is hard)
  Requires consistency / availability compromises

Highest Consistency?

[Diagram: five App Server (Stateful) nodes and two alternative write sequences]

1. Write {X}
2. Accept {X}? (asked twice)
3. Agree (answered twice)
4. Commit {X}

OR

1. Write {X}
2. Commit {X}
3. Notify: {X} committed

Scale: Greater Availability, Greater Consistency

Consistency vs. Availability

High Availability
  Low latency
  Multiple nodes all able to handle same request
  Less consistent!

High Consistency
  Multiple nodes all agree on what the current "state" of something is
  More predictable behavior
  Less available
  Higher latency

Fault and Resource Isolation with Microservices

WebCrawler Microservices

[Diagram: cluster roles and the connections between them]

WebCrawler.Web (ASP.NET MVC, SignalR), Cluster Role: Web, two instances (All Web Roles)

WebCrawler.TrackingService (Windows Service), Cluster Role: Tracker, two instances (All Tracker Roles)

WebCrawler.CrawlService (Windows Service), Cluster Role: Crawler, two instances (All Crawler Roles)

Lighthouse Role

Connection labels: Join cluster, Receive gossip (three times); Run jobs, get progress reports; Cluster-deploy processing hierarchies

State labels: Stateless, Stateless, Stateless, Stateful

WebCrawler Network Topology

[Diagram: nodes A [Lighthouse], B [Crawler], C [WEB], D [WEB], E [Crawler], F [Tracker]]

Try to make CPU / Memory-intensive tasks into stateless services

Stateful services should increase CPU / memory utilization slowly