Orchestrator on Raft: internals, benefits and considerations

Shlomi Noach GitHub

PerconaLive 2018

About me

@github/database-infrastructure

Author of orchestrator, gh-ost, freno, ccql and others.

Blog at http://openark.org

@ShlomiNoach

Agenda

Raft overview

Why orchestrator/raft

orchestrator/raft implementation and nuances

HA, fencing

Service discovery

Considerations

Raft

Consensus algorithm

Quorum based

In-order replication log

Delivery, lag

Snapshots
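
A worked example of the quorum arithmetic, as a minimal sketch: a raft group of n nodes needs a strict majority to elect a leader and to commit log entries. The function names are illustrative and not part of any raft library.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes may be lost while a majority still remains."""
    return nodes - quorum_size(nodes)


for n in (3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

A 3-node group tolerates one lost node; a 5-node group tolerates two.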

HashiCorp raft

golang raft implementation

Used by Consul

Recently hit 1.0.0

github.com/hashicorp/raft

orchestrator

MySQL high availability solution and replication topology manager

Developed at GitHub

Apache 2 license

github.com/github/orchestrator

Why orchestrator/raft

Remove MySQL backend dependency

DC fencing

And then good things happened that were not planned:

Better cross-DC deployments

DC-local KV control

Kubernetes friendly

orchestrator/raft

n orchestrator nodes form a raft cluster

Each node has its own dedicated backend database (MySQL or SQLite)

All nodes probe the topologies

All nodes run failure detection

Only the leader runs failure recoveries
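
The split between detection and recovery can be sketched as a toy model. This is not orchestrator's code: the Node class, its tick method and the cluster name are hypothetical, and raft leadership is reduced to a boolean flag.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    is_leader: bool = False
    detected: list = field(default_factory=list)
    recovered: list = field(default_factory=list)

    def tick(self, failed_masters):
        # Every node probes the topologies and records the failures it sees.
        self.detected = sorted(failed_masters)
        # Only the raft leader acts on a detected failure.
        if self.is_leader:
            for master in self.detected:
                if master not in self.recovered:
                    self.recovered.append(master)


nodes = [Node("o1"), Node("o2", is_leader=True), Node("o3")]
for node in nodes:
    node.tick({"cluster-a-master"})
```

After one tick all three nodes have detected the failure, but only o2 has a recovery on record.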

Implementation & deployment @ GitHub

5 nodes (2xDC1, 2xDC2, 1xDC3)

1 second raft polling interval

step-down

raft-yield

SQLite-backed log store

MySQL backend (SQLite backend use case in the works)
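
A quick check of why a 2+2+1 layout is attractive, as a sketch: losing any single DC still leaves a majority of the 5 nodes. The node names o1..o5 are hypothetical.

```python
# Node placement as on the slide: 2 nodes in DC1, 2 in DC2, 1 in DC3.
PLACEMENT = {"o1": "DC1", "o2": "DC1", "o3": "DC2", "o4": "DC2", "o5": "DC3"}


def survives_dc_loss(placement, lost_dc):
    """True if the nodes outside lost_dc still form a strict majority."""
    total = len(placement)
    remaining = sum(1 for dc in placement.values() if dc != lost_dc)
    return remaining >= total // 2 + 1


results = {dc: survives_dc_loss(PLACEMENT, dc)
           for dc in sorted(set(PLACEMENT.values()))}
print(results)
```

By contrast, a 3+2 layout over two DCs would lose quorum whenever the DC holding 3 nodes goes away.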

A high availability scenario

o2 is leader of a 3-node orchestrator/raft setup

[Diagram: orchestrator nodes o1, o2, o3]

Injecting failure

master: killall -9 mysqld

o2 detects failure. About to recover, but…

Injecting 2nd failure

o2: DROP DATABASE orchestrator;

o2 freaks out. 5 seconds later it steps down
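
One way to picture the step-down, as a sketch only: the leader tracks when its backend last answered and gives up leadership once failures persist for a grace period. The 5 seconds come from the slide; the class and method names are hypothetical and do not mirror orchestrator's internals.

```python
class BackendHealth:
    """Tracks how long the local backend database has been failing."""

    def __init__(self, started_at: float, grace_seconds: float = 5.0):
        self.grace_seconds = grace_seconds
        self.last_ok = started_at

    def observe(self, backend_ok: bool, now: float) -> bool:
        """Record one health check; return True when the leader should step down."""
        if backend_ok:
            self.last_ok = now
            return False
        return now - self.last_ok >= self.grace_seconds


health = BackendHealth(started_at=0.0)
checks = [(1, True), (2, False), (4, False), (6, False)]  # (time, backend_ok)
decisions = [health.observe(ok, t) for t, ok in checks]
```

Here the backend last answered at t=1, so the check at t=6 is the first one to ask for a step-down.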

orchestrator recovery

o1 grabs leadership

MySQL recovery

o1 detected failure even before stepping up as leader.

o1, now leader, kicks off recovery and fails over the MySQL master

orchestrator self health tests

Meanwhile, o2 panics and bails out.

puppet

Some time later, puppet kicks orchestrator service back on o2.

orchestrator startup

orchestrator service on o2 bootstraps, creates orchestrator schema and tables.

Joining raft cluster

o2 recovers from raft snapshot, acquires raft log from an active node, rejoins the group
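
The catch-up step can be sketched as "restore the snapshot, then apply only the log entries newer than it". The data layout below (a dict snapshot and (index, key, value) log tuples) is an assumption made for illustration, not the format used by orchestrator or by the HashiCorp raft library.

```python
def restore(snapshot, log):
    """Rebuild state from a snapshot, then apply only newer log entries."""
    state = dict(snapshot["state"])
    applied = snapshot["last_index"]
    for index, key, value in sorted(log):
        if index <= applied:
            continue  # already reflected in the snapshot
        state[key] = value
        applied = index
    return state, applied


snapshot = {"last_index": 2, "state": {"main": "db-1"}}
log = [(2, "main", "db-1"), (3, "main", "db-2"), (4, "reports", "db-9")]
state, applied = restore(snapshot, log)
```

Entry 2 is skipped because the snapshot already covers it; entries 3 and 4 bring the node up to date.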

Grabbing leadership

Some time later, o2 grabs leadership

DC fencing

Assume this 3 DC setup

One orchestrator node in each DC

Master and a few replicas in DC2

What happens if DC2 gets network partitioned?

i.e. no network in or out of DC2

[Diagram: DC1, DC2, DC3]

DC fencing

From the point of view of DC2 servers, and in particular from the point of view of DC2’s orchestrator node:

Master and replicas are fine.

DC1 and DC3 servers are all dead.

No need for fail over.

However, DC2’s orchestrator is not part of a quorum, hence not the leader. It doesn’t call the shots.

DC fencing

In the eyes of either DC1’s or DC3’s orchestrator:

All DC2 servers, including the master, are dead.

There is a need for failover.

DC1’s and DC3’s orchestrator nodes form a quorum. One of them will become the leader.

The leader will initiate failover.
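
The fencing argument can be sketched as a small decision function: a side of the partition acts only if it holds a quorum, regardless of what it believes about the master. The function names and return strings are hypothetical.

```python
NODES = {"DC1", "DC2", "DC3"}  # one orchestrator node per DC


def has_quorum(side, all_nodes=NODES):
    """True if this side of the partition holds a strict majority."""
    return len(side) > len(all_nodes) / 2


def decide(side, sees_master_alive):
    """What the orchestrator nodes on one side of the partition do."""
    if not has_quorum(side):
        return "no leader, no action"
    if sees_master_alive:
        return "no failover needed"
    return "leader initiates failover"


isolated = decide({"DC2"}, sees_master_alive=True)
majority = decide({"DC1", "DC3"}, sees_master_alive=False)
```

DC2's node sees a healthy master but has no quorum, so it stays passive; DC1 and DC3 together hold the quorum and fail over.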

DC fencing

Depicted: a potential failover result. The new master is from DC3.

[Diagram: DC1, DC2, DC3]

orchestrator/raft & consul

orchestrator is Consul-aware

Upon failover orchestrator updates Consul KV with identity of promoted master

Consul @ GitHub is DC-local, no replication between Consul setups

orchestrator nodes update Consul locally in each DC
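
A DC-local update can be pictured as each orchestrator node issuing a PUT against its own DC's Consul agent through Consul's KV HTTP endpoint (/v1/kv/<key>). The sketch below only builds the request and does not send it; the agent address, key layout and host name are assumptions for illustration, not orchestrator's actual values.

```python
import urllib.request


def consul_kv_put(consul_addr, key, value):
    """Build (but do not send) a PUT request against Consul's KV HTTP API."""
    url = f"{consul_addr}/v1/kv/{key}"
    return urllib.request.Request(url, data=value.encode("utf-8"), method="PUT")


# Hypothetical local agent address, key name and promoted master identity.
req = consul_kv_put(
    "http://127.0.0.1:8500",
    "mysql/master/mycluster",
    "db-2.example.net:3306",
)
# To actually write the key: urllib.request.urlopen(req)
```

Because every node writes to its local Consul, no cross-DC Consul replication is needed for the new master's identity to reach each DC.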

Considerations, watch out for

Eventual consistency is not always your best friend

What happens if, upon replay of raft log, you hit two failovers for the same cluster?

NOW() and other time-based assumptions

Reapplying snapshot/log upon startup
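
A sketch of why time-based assumptions bite on replay. Suppose, hypothetically, that a second failover for the same cluster is ignored when it falls within a blocking window of the previous one. If the timestamp is taken at apply time (the NOW() style), two genuine failovers replayed back to back at startup look like flapping; if each log entry carries its own timestamp, replay is deterministic. The window length, class and field names are assumptions, not orchestrator's implementation.

```python
from dataclasses import dataclass

BLOCK_SECONDS = 3600.0  # hypothetical anti-flapping window


@dataclass(frozen=True)
class Failover:
    cluster: str
    promoted: str
    happened_at: float  # stamped by the leader when the entry was written


def replay(entries, clock=None):
    """Apply failover entries in log order.

    With clock=None the entry's own timestamp is used; otherwise clock() is
    called at apply time, which is what a NOW()-style assumption amounts to.
    """
    last_seen = {}
    masters = {}
    for entry in entries:
        when = entry.happened_at if clock is None else clock()
        previous = last_seen.get(entry.cluster)
        if previous is not None and when - previous < BLOCK_SECONDS:
            continue  # looks like flapping: the entry is skipped
        last_seen[entry.cluster] = when
        masters[entry.cluster] = entry.promoted
    return masters


log = [
    Failover("main", "db-2", happened_at=1_000.0),
    Failover("main", "db-3", happened_at=9_000.0),  # a genuine later failover
]
by_entry_time = replay(log)
by_apply_time = replay(log, clock=lambda: 50_000.0)  # both applied at startup
```

With entry timestamps the replayed state ends at db-3; with apply-time timestamps the second failover is dropped and the node is left believing db-2 is the master.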

orchestrator/raft roadmap

Kubernetes

ClusterIP-based configuration in progress

Already container-friendly via auto-reprovisioning of nodes via Raft

Thank you!

Questions?

github.com/shlomi-noach

@ShlomiNoach