Building Distributed Systems


Transcript of Building Distributed Systems

Page 1: Building Distributed Systems

Building Distributed Systems

decision guide

Page 2: Building Distributed Systems

Oleksiy Kurnenkov

Core Infrastructure Domain lead, OnApp

Page 4: Building Distributed Systems

Introduction

Part I

Page 5: Building Distributed Systems

Distributed system

is a set of interconnected functional units with separate runtime context

Page 6: Building Distributed Systems

MONOLITH : DISTRIBUTED

1 Runtime context

Coroutines

Threads

Process

N Runtime contexts

Nodes

Components

Clusters

Grids

Page 7: Building Distributed Systems

SCALE

Small ->

Middle ->

Global ->

OS, LAN

LAN

WAN

Page 8: Building Distributed Systems

TYPES

CDN

Grids

Clouds

Clusters

Enterprise

Page 9: Building Distributed Systems

CDN

Page 10: Building Distributed Systems

Grid

Page 11: Building Distributed Systems

Globus Toolkit 6

BOINC

x-Torrent

Page 12: Building Distributed Systems

Cloud

Page 13: Building Distributed Systems

Cluster

Page 14: Building Distributed Systems

Enterprise

Page 15: Building Distributed Systems

TASKS

MMOG

Analysis

Sensors

Real Time Enterprise

Media

Storage

Computation

Transactions

Page 16: Building Distributed Systems

Challenges

Availability

Scalability

Security

Openness

Heterogeneity

Concurrency

Transparency

Feasibility

Page 17: Building Distributed Systems

It IS there

99.9999999%

Availability

Page 18: Building Distributed Systems

It grows and shrinks

Scalability

Page 19: Building Distributed Systems

U ain’ gonna crack it >:|

Security

Page 20: Building Distributed Systems

It is open for development and integration

Openness

Page 21: Building Distributed Systems

Everything changes

Heterogeneity

Page 22: Building Distributed Systems

Parallel workloads

Concurrency

Page 23: Building Distributed Systems

The less you know, the less you depend

Transparency

Page 24: Building Distributed Systems

Simplify as much as possible

Feasibility

Page 25: Building Distributed Systems

Fallacies

The network is reliable

Latency is zero

Bandwidth is infinite

The network is secure

Topology doesn't change

There is one administrator

Transport cost is zero

The network is homogeneous

Page 26: Building Distributed Systems

Mechanics

Part II

Page 27: Building Distributed Systems

Mechanics I

Membership

Concurrency / Parallelism

Synchronization

Coordination

Consensus

Clustering

Page 28: Building Distributed Systems

Who’s there?

Membership

Page 29: Building Distributed Systems

Membership

Heartbeats

Up-to-date list of nodes
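A minimal sketch of heartbeat-based membership, assuming every node periodically reports in and is dropped from the list after a period of silence. The class name, the timeout value and the injected timestamps are illustrative assumptions, not part of the talk.

```ruby
# Hypothetical in-memory membership table driven by heartbeats.
class MembershipList
  def initialize(timeout:)
    @timeout = timeout   # seconds of silence before a node is considered gone
    @last_seen = {}      # node id => timestamp of its latest heartbeat
  end

  # Record a heartbeat from +node+ received at time +now+.
  def heartbeat(node, now = Time.now.to_f)
    @last_seen[node] = now
  end

  # Nodes whose latest heartbeat is no older than the timeout.
  def alive(now = Time.now.to_f)
    @last_seen.select { |_, seen| now - seen <= @timeout }.keys.sort
  end
end

members = MembershipList.new(timeout: 5)
members.heartbeat("node-a", 0)
members.heartbeat("node-b", 0)
members.heartbeat("node-a", 4)   # node-a keeps reporting, node-b goes silent
p members.alive(8)               # prints ["node-a"]
```

Passing the clock in as an argument keeps the sketch deterministic; a real implementation would also have to spread this list between nodes.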

Page 30: Building Distributed Systems

We are Legion

Concurrency / Parallelism

Page 31: Building Distributed Systems

Concurrency / Parallelism

Actors

Process Calculi

Shared memory

Page 32: Building Distributed Systems

MY precious!

Synchronization

Page 33: Building Distributed Systems

Synchronization

Mutex | Semaphore

Distributed Lock
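One common way to build a distributed lock is a lease: the lock expires on its own, so a crashed holder cannot block everyone forever. The sketch below keeps the lock state in one process purely for illustration; in practice that state would live in a shared store. `LeaseLock` and its methods are hypothetical names.

```ruby
# Hypothetical lease-based lock. The state is local here only to keep the
# example runnable; a real distributed lock keeps it in a shared service.
class LeaseLock
  def initialize(ttl:)
    @ttl = ttl          # lease duration in seconds
    @owner = nil
    @expires_at = 0
  end

  # Try to take (or renew) the lock for +owner+ at time +now+.
  def acquire(owner, now)
    return false if @owner && @owner != owner && now < @expires_at

    @owner = owner
    @expires_at = now + @ttl
    true
  end

  # Release only if +owner+ still holds the lock.
  def release(owner)
    return false unless @owner == owner

    @owner = nil
    true
  end
end

lock = LeaseLock.new(ttl: 10)
results = [
  lock.acquire("worker-1", 0),    # lock was free
  lock.acquire("worker-2", 5),    # worker-1's lease is still valid
  lock.acquire("worker-2", 11),   # worker-1's lease has expired
  lock.release("worker-1")        # worker-1 no longer owns the lock
]
p results   # prints [true, false, true, false]
```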

Page 34: Building Distributed Systems

Roger that!

Coordination

Page 35: Building Distributed Systems

Coordination

Leader election

Orchestration

the ‘Two Generals Problem’
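A minimal sketch of the decision rule behind a simple leader election: the live node with the highest id wins, which is the outcome the bully algorithm arrives at. The message exchange and failure detection are left out, and `Node` and `elect_leader` are illustrative names.

```ruby
# Hypothetical node record: an id plus the liveness verdict of a failure detector.
Node = Struct.new(:id, :alive)

# The leader is the live node with the highest id (nil if nobody is alive).
def elect_leader(nodes)
  nodes.select(&:alive).max_by(&:id)
end

nodes = [Node.new(1, true), Node.new(2, true), Node.new(3, true)]
first_leader = elect_leader(nodes).id
nodes[2].alive = false                   # node 3 stops answering
second_leader = elect_leader(nodes).id
puts "leader: #{first_leader}, after failure: #{second_leader}"
# prints "leader: 3, after failure: 2"
```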

Page 36: Building Distributed Systems

Deal

Consensus

Page 37: Building Distributed Systems

Consensus

Quorum

Consistency Availability Partition tolerance
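A quorum is usually a strict majority, so that any two quorums share at least one node. A minimal sketch of that arithmetic follows; the method names are illustrative, and the read/write overlap rule (R + W > N) is the usual condition from quorum-based replication rather than something stated on the slide.

```ruby
# Smallest number of nodes that forms a strict majority of the cluster.
def majority_quorum(cluster_size)
  cluster_size / 2 + 1
end

# A read quorum of r nodes and a write quorum of w nodes are guaranteed to
# share at least one of n replicas when r + w > n.
def quorums_overlap?(n, r, w)
  r + w > n
end

puts majority_quorum(5)           # prints 3
puts majority_quorum(4)           # prints 3
puts quorums_overlap?(3, 2, 2)    # prints true
puts quorums_overlap?(3, 1, 1)    # prints false
```

Note that a four-node cluster needs the same three votes as a five-node one, so it tolerates only one failure instead of two.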

Page 38: Building Distributed Systems

Altogether

Clustering

Page 39: Building Distributed Systems

Clustering

Load Balancing

Redundancy

Availability

Replication

Partition tolerance
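Load balancing and redundancy work together: requests rotate over the nodes, and a failed node is simply skipped. A minimal round-robin sketch, assuming some external health check calls `mark_down`; all names are hypothetical.

```ruby
# Hypothetical round-robin balancer that skips backends marked as down.
class RoundRobinBalancer
  def initialize(backends)
    @healthy = backends.dup
    @index = 0
  end

  def mark_down(backend)
    @healthy.delete(backend)
  end

  def mark_up(backend)
    @healthy << backend unless @healthy.include?(backend)
  end

  # Return the next healthy backend in rotation.
  def next_backend
    raise "no healthy backends" if @healthy.empty?

    backend = @healthy[@index % @healthy.size]
    @index += 1
    backend
  end
end

balancer = RoundRobinBalancer.new(%w[app-1 app-2 app-3])
first_round = 3.times.map { balancer.next_backend }
balancer.mark_down("app-2")                      # redundancy: the rest keep serving
after_failure = 4.times.map { balancer.next_backend }
p first_round     # prints ["app-1", "app-2", "app-3"]
p after_failure
```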

Page 40: Building Distributed Systems

Mechanics II

Scaling

Fault Detection

Failover

Recovery

Failback

Page 41: Building Distributed Systems

Scaling

Vertical

Horizontal

Automatic

Page 42: Building Distributed Systems

Fault Detection

Monitoring

Supervision

Activity Analysis
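Supervision can be sketched as a loop that restarts a failing unit of work a bounded number of times before giving up. This is an in-process illustration only; `supervise` and its `max_restarts` parameter are hypothetical names.

```ruby
# Run the block, restarting it after a failure up to +max_restarts+ times.
# The block receives the number of restarts performed so far.
def supervise(max_restarts:)
  restarts = 0
  begin
    yield(restarts)
  rescue StandardError
    raise if restarts >= max_restarts   # give up and propagate the failure

    restarts += 1
    retry
  end
end

result = supervise(max_restarts: 3) do |restarts|
  raise "crashed" if restarts < 2       # simulate two crashes in a row
  "ok after #{restarts} restarts"
end
puts result   # prints "ok after 2 restarts"
```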

Page 43: Building Distributed Systems

Failover

Data | State Accessibility

Resource allocation

Spare Capacity

Business Continuity

Page 44: Building Distributed Systems

Recovery

State Reconstruction

Data backup

State dump

Page 45: Building Distributed Systems

Failback

Data consistency

Conflict resolution

Page 46: Building Distributed Systems

Apache YARN

Apache Zookeeper

Apache Mesos

Apache Drill

Google Chubby

CoreOS

Google Kubernetes

Google Dremel

Page 47: Building Distributed Systems

Architecture

Part III

Page 48: Building Distributed Systems

Client - Server

Page 49: Building Distributed Systems

N-tier

Page 50: Building Distributed Systems

Service oriented Architecture

Page 51: Building Distributed Systems

Event Driven Architecture

Page 52: Building Distributed Systems

P2P

Page 53: Building Distributed Systems

Context

Page 54: Building Distributed Systems
Page 55: Building Distributed Systems

System Allocation

Page 56: Building Distributed Systems

Architectural entities decomposition

Subsystem

Service

Component

Process

Object

Middleware

Page 57: Building Distributed Systems

Fractal

Page 58: Building Distributed Systems
Page 59: Building Distributed Systems

Subsystem -> Component -> Process -> Object

Page 60: Building Distributed Systems

Subsystem -> Component -> Process -> Object

Page 61: Building Distributed Systems

Subsystem -> Component -> Process -> Object

Page 62: Building Distributed Systems

Communication Infrastructure

aka Middleware

IPC

-----------------------------

Remote Invocation

Group Communication

Messaging

Page 63: Building Distributed Systems

IPC

Share data

Send data

Page 64: Building Distributed Systems

Remote Invocation

Request - Reply

RPC

RMI
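A minimal sketch of request-reply remote invocation, assuming a hand-rolled line-delimited JSON format over a socket pair within one process. It is not any particular RPC framework; `Calculator`, `serve` and `call` are hypothetical names.

```ruby
require "socket"
require "json"

# Hypothetical service object whose public methods are exposed remotely.
class Calculator
  def add(a, b)
    a + b
  end
end

# Server side: read one JSON request per line, dispatch it, reply with JSON.
def serve(io, service)
  while (line = io.gets)
    request = JSON.parse(line)
    name = request["method"]
    reply =
      if service.class.public_instance_methods(false).include?(name.to_sym)
        { "result" => service.public_send(name, *request["params"]) }
      else
        { "error" => "unknown method #{name}" }
      end
    io.puts(JSON.generate(reply))
  end
end

# Client side: one blocking request-reply round trip.
def call(io, name, *params)
  io.puts(JSON.generate({ "method" => name, "params" => params }))
  JSON.parse(io.gets)
end

client_io, server_io = UNIXSocket.pair
server = Thread.new { serve(server_io, Calculator.new) }

sum = call(client_io, "add", 2, 3)
unknown = call(client_io, "launch")

client_io.close   # EOF ends the server loop
server.join

puts sum["result"]      # prints 5
puts unknown["error"]   # prints "unknown method launch"
```

Only methods defined directly on the service class are callable, which keeps inherited methods such as `instance_eval` off the wire.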

Page 65: Building Distributed Systems

Group Communication

{ Uni | Multi | Broad } cast

Publish | Subscribe
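Publish | subscribe decouples senders from receivers: a publisher names a topic, and whoever subscribed to it gets the message. A minimal in-process sketch; `Broker` and its methods are hypothetical names, and a real broker would sit on the network.

```ruby
# Hypothetical in-process topic broker.
class Broker
  def initialize
    @subscribers = Hash.new { |hash, topic| hash[topic] = [] }
  end

  # Register a handler block for +topic+.
  def subscribe(topic, &handler)
    @subscribers[topic] << handler
  end

  # Deliver +message+ to every subscriber of +topic+; returns the delivery count.
  def publish(topic, message)
    handlers = @subscribers.fetch(topic, [])
    handlers.each { |handler| handler.call(message) }
    handlers.size
  end
end

broker = Broker.new
received = { audit: [], metrics: [] }
broker.subscribe("logs") { |message| received[:audit] << message }
broker.subscribe("logs") { |message| received[:metrics] << message }

delivered = broker.publish("logs", "disk full")   # both subscribers get it
dropped = broker.publish("billing", "invoice")    # nobody listens on this topic
puts "delivered to #{delivered}, then to #{dropped}"
# prints "delivered to 2, then to 0"
```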

Page 66: Building Distributed Systems

Messaging

Message Queue

Message Broker

Pipe | Filter | Translator
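A minimal sketch of the pipe | filter | translator idea: messages flow through a chain of small stages, each of which either converts a message or drops it. The `key=value` wire format and all names here are assumptions made for the example.

```ruby
# Translator: convert the wire format ("key=value") into a Hash.
parse = lambda do |raw|
  key, value = raw.split("=", 2)
  { key: key.strip, value: value.to_s.strip }
end

# Filter: drop messages that carry no value.
has_value = ->(message) { !message[:value].empty? }

# Translator: normalise the key for the downstream consumer.
upcase_key = ->(message) { message.merge(key: message[:key].upcase) }

# Pipe: connect the stages in order.
def run_pipeline(messages, parse, keep, transform)
  messages.map(&parse).select(&keep).map(&transform)
end

output = run_pipeline(["cpu = 0.93", "mem=", "disk=71%"], parse, has_value, upcase_key)
output.each { |message| puts "#{message[:key]}: #{message[:value]}" }
# prints "CPU: 0.93" and "DISK: 71%"
```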

Page 67: Building Distributed Systems

Implementation

Part IV

Page 68: Building Distributed Systems

Microcosm -> Macrocosm

Middleware

[ * ] Process

Host

Network

Page 69: Building Distributed Systems

Within a process

Threads

Shared memory

Mutex | Monitor

Page 70: Building Distributed Systems

require 'thread'

class RaceCondition
  def initialize(resource, concurrency_level)
    @resource = resource
    @concurrency_level = concurrency_level
  end

  def perform
    @concurrency_level.times do
      threads << Thread.new { action }
    end

    threads.each(&:join)
  end

  private

  def threads
    @threads ||= []
  end

  # Unsynchronized: the increment and the output of different threads can interleave.
  def action
    @resource += 1
    puts @resource
  end
end

RaceCondition.new(0, 10).perform

# 12
# 3
# 5
# 6
# 7
# 8
# 910
# 4

Page 71: Building Distributed Systems

require 'thread'

class Synchronizer
  def initialize(resource, concurrency_level)
    @resource = resource
    @concurrency_level = concurrency_level
  end

  def perform
    @concurrency_level.times do
      threads << Thread.new { action }
    end

    threads.each(&:join)
  end

  private

  def threads
    @threads ||= []
  end

  # Only one thread at a time may increment and print.
  def action
    lock.synchronize do
      @resource += 1
      puts @resource
    end
  end

  def lock
    @lock ||= Mutex.new
  end
end

Synchronizer.new(0, 10).perform

# 1
# 2
# 3
# 4
# 5
# 6
# 7
# 8
# 9
# 10

Page 72: Building Distributed Systems

Microcosm -> Macrocosm

Process

[ * ] Host

Network

Middleware

Page 73: Building Distributed Systems

Within a host

Processes

MQ | Shared memory | Pipes | UNIX Socket

Semaphore
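Ruby's standard library has no cross-process semaphore, so this sketch swaps in an advisory file lock (`File#flock`), which behaves like a binary semaphore between processes on one host. The counter file name and the process and iteration counts are arbitrary choices for the example.

```ruby
require "tmpdir"

# Shared state lives in a file; the lock on that file serialises access to it.
path = File.join(Dir.tmpdir, "counter-#{Process.pid}")
File.write(path, "0")

pids = 4.times.map do
  fork do
    25.times do
      File.open(path, "r+") do |file|
        file.flock(File::LOCK_EX)   # blocks while another process holds the lock
        value = file.read.to_i
        file.rewind
        file.write((value + 1).to_s)
        # Closing the file at the end of the block flushes the write
        # and releases the lock.
      end
    end
  end
end
pids.each { |pid| Process.wait(pid) }

total = File.read(path).to_i
File.delete(path)
puts total   # prints 100
```

Without the `flock` call, concurrent read-modify-write cycles could overwrite each other and the total would usually fall short of 100.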

Page 74: Building Distributed Systems

# http://bogomips.org/ruby_posix_mq/

require 'posix_mq'

class Producer
  attr_reader :mq

  def initialize(mq_name)
    # Open the queue named by the caller instead of a hard-coded name.
    @mq = POSIX_MQ.new(mq_name, :rw)
  end

  def send(message, prio = 0)
    puts "Send: #{message}. Priority #{prio}"
    mq.send("#{message} #{prio}", prio)
  end
end

producer = Producer.new("/test_ipc")

producer.send("Hello from #{Process.pid}", 10)
producer.send("Hello from #{Process.pid}", 2)
producer.send("Hello from #{Process.pid}", 0)
producer.send("Hello from #{Process.pid}", 1)
producer.send("Hello from #{Process.pid}", 20)

# ruby posix_mq/producer.rb
#
# Send: Hello from 12635. Priority 10
# Send: Hello from 12635. Priority 2
# Send: Hello from 12635. Priority 0
# Send: Hello from 12635. Priority 1
# Send: Hello from 12635. Priority 20

Page 75: Building Distributed Systems

# http://bogomips.org/ruby_posix_mq/

require 'posix_mq'

class Consumer
  attr_reader :mq

  def initialize(mq_name)
    # Must be the same queue name the producer uses.
    @mq = POSIX_MQ.new(mq_name, :rw)
  end

  # Blocking receive: returns the message body.
  def receive
    mq.receive.first
  end

  def receive_non_block
    mq.nonblock = true

    begin
      receive
    rescue Errno::EAGAIN
      mq.nonblock = false
      puts "Nothing"
    end
  end

  # Non-blocking receive: returns nil when the queue is empty.
  def shift
    mq.tryshift
  end
end

c = Consumer.new("/test_ipc")

while m = c.shift
  puts "got: #{m}"
end

# ruby posix_mq/consumer.rb

# got: Hello from 12635 10
# got: Hello from 12635 20
# got: Hello from 12635 2
# got: Hello from 12635 1
# got: Hello from 12635 0

Page 76: Building Distributed Systems

# https://github.com/pmahoney/process_shared

require 'process_shared'

mutex = ProcessShared::Mutex.new
mem = ProcessShared::SharedMemory.new(:int)
mem.put_int(0, 0)

pid1 = fork do
  puts "in process 1 (#{Process.pid})"
  10.times do
    sleep 0.01
    mutex.synchronize do
      value = mem.get_int(0)
      sleep 0.01
      puts "process 1 (#{Process.pid}) incrementing"
      mem.put_int(0, value + 1)
    end
  end
end

pid2 = fork do
  puts "in process 2 (#{Process.pid})"
  10.times do
    sleep 0.01
    mutex.synchronize do
      value = mem.get_int(0)
      sleep 0.01
      puts "process 2 (#{Process.pid}) decrementing"
      mem.put_int(0, value - 1)
    end
  end
end

Process.wait(pid1)
Process.wait(pid2)

# Ten increments and ten decrements under the mutex cancel out.
puts "value should be zero: #{mem.get_int(0)}"

# ruby shm.rb
#
# in process 1 (8038)
# in process 2 (8041)
# process 1 (8038) incrementing
# process 2 (8041) decrementing
# process 1 (8038) incrementing
# process 2 (8041) decrementing
# process 1 (8038) incrementing
# process 2 (8041) decrementing
# process 1 (8038) incrementing
# process 2 (8041) decrementing
# process 1 (8038) incrementing
# process 2 (8041) decrementing
# process 1 (8038) incrementing
# process 2 (8041) decrementing
# process 1 (8038) incrementing
# process 2 (8041) decrementing
# process 1 (8038) incrementing
# process 2 (8041) decrementing
# process 1 (8038) incrementing
# process 2 (8041) decrementing
# process 1 (8038) incrementing
# process 2 (8041) decrementing
# value should be zero: 0

Page 77: Building Distributed Systems

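# One pipe per direction: parent -> child and child -> parent.
rd_child, wr_parent = IO.pipe
rd_parent, wr_child = IO.pipe

pid = fork do
  # The child closes the parent's ends.
  rd_parent.close
  wr_parent.close

  wr_child.puts "sent from child process"
  puts rd_child.gets
end

# The parent closes the child's ends.
rd_child.close
wr_child.close

# puts (not write) terminates the line so the child's gets returns right away.
wr_parent.puts "sent from parent process"
puts rd_parent.gets

Process.wait(pid)

# ruby pipes/pipes.rb
#
# sent from child process
# sent from parent process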

Page 78: Building Distributed Systems

require 'eventmachine'

module UnixServer
  def receive_data(data)
    puts data

    EM.stop if data.chomp == "exit"

    send_data("Server #{Process.pid}: Got #{data.chomp} from you")
    close_connection_after_writing
  end
end

EM.run do
  puts "Started UNIX socket server on /tmp/sock"

  EM::start_unix_domain_server("/tmp/sock", UnixServer)
end

# ruby server.rb
#
# Started UNIX socket server on /tmp/sock
#
# HELLO! My pid is 13847

Page 79: Building Distributed Systems

require 'socket'

UNIXSocket.open("/tmp/sock") do |c|
  c.write("HELLO! My pid is #{Process.pid}")

  puts c.read
end

# ruby client.rb
#
# Server 13843: Got HELLO! My pid is 13847 from you

Page 80: Building Distributed Systems

Microcosm -> Macrocosm

Process

Host

[ * ] Network

Middleware

Page 81: Building Distributed Systems

Over the Network

{ TCP | UDP } Socket

HTTP Client - Server

Async messaging
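A minimal TCP request-reply sketch over the loopback interface, with the server running in a thread of the same process only to keep the example self-contained. The `echo:` reply format is an assumption made for the illustration.

```ruby
require "socket"

server = TCPServer.new("127.0.0.1", 0)   # port 0: let the OS pick a free port
port = server.addr[1]

# Server side: accept one connection, answer one line, hang up.
echo = Thread.new do
  client = server.accept
  line = client.gets
  client.puts("echo: #{line.chomp}")
  client.close
end

# Client side: connect, send one line, wait for the reply.
reply = TCPSocket.open("127.0.0.1", port) do |socket|
  socket.puts("ping")
  socket.gets.chomp
end

echo.join
server.close
puts reply   # prints "echo: ping"
```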

Page 82: Building Distributed Systems

# Consumer: waits for one message on the "hello" queue

require "bunny"

conn = Bunny.new
conn.start

ch = conn.create_channel
q = ch.queue("hello")

puts " [*] Waiting for messages in #{q.name}. To exit press CTRL+C"
q.subscribe(:block => true) do |delivery_info, properties, body|
  puts " [x] Received #{body}"

  # cancel the consumer to exit
  delivery_info.consumer.cancel
end

# Producer: publishes one message to the "hello" queue

require "bunny"

conn = Bunny.new(:hostname => "rabbit.local")
conn.start

ch = conn.create_channel

q = ch.queue("hello")
ch.default_exchange.publish("Hello World!", :routing_key => q.name)
puts " [x] Sent 'Hello World!'"

conn.close

Page 83: Building Distributed Systems

# Publisher: sends a message to the "logs" fanout exchange

require "bunny"

conn = Bunny.new
conn.start

ch = conn.create_channel
x = ch.fanout("logs")

msg = ARGV.empty? ? "Hello World!" : ARGV.join(" ")

x.publish(msg)
puts " [x] Sent #{msg}"

conn.close

# Subscriber: binds its own exclusive queue to the "logs" exchange

require "bunny"

conn = Bunny.new
conn.start

ch = conn.create_channel
x = ch.fanout("logs")
q = ch.queue("", :exclusive => true)

q.bind(x)

puts " [*] Waiting for logs. To exit press CTRL+C"

begin
  q.subscribe(:block => true) do |delivery_info, properties, body|
    puts " [x] #{body}"
  end
rescue Interrupt => _
  ch.close
  conn.close
end

Page 84: Building Distributed Systems
Page 85: Building Distributed Systems

Apache Kafka

RabbitMQ

Page 86: Building Distributed Systems

Next time

To be continued:

Algorithms, patterns,

code!

Page 87: Building Distributed Systems

Thank you!

Q&A