Akka Cluster in Production
• Reactive?
• Actors?
• Akka?
• Clustering?
• Online shopping?
• Online payment?
Device Ident
• Detect fraudulent devices in real time
• Retail, telco, finance, …
• Integrated via snippet into webshop
• Analyses client devices
• Results obtained via REST API
• > 300 sites
• > 100M devices
• 24/7, 3 nines
Reactive Manifesto
Responsive
Message Driven
Elastic
Resilient
Actors
"The actor model in computer science is a mathematical model of concurrent computation that treats 'actors' as the universal primitives of concurrent computation. In response to a message that it receives, an actor can:
• make local decisions,
• create more actors,
• send more messages, and
• determine how to respond to the next message received.
Actors may modify private state, but can only affect each other through messages (avoiding the need for any locks)."
https://en.wikipedia.org/wiki/Actor_model
Akka Cluster Components
Remoting
Clustering
Distributed PubSub
Cluster Singleton
Sharding
Distributed Data
Persistence
Cluster Client
Cluster Aware Routers
Actors
Akka Remote
• Replace LocalActorRefProvider with RemoteActorRefProvider
• ActorRef: akka://systemName/user/parent/actorName
• Remote ActorRef: akka.tcp://systemName@hostName:1234/user/parent/actorName
• Look up remote actors
• Start remote actors
• Cluster-aware routing
• Death watch
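Enabling remoting is mostly configuration; a minimal sketch for the classic Netty TCP transport (hostname and port are placeholders):

```hocon
akka {
  actor {
    # Swap the local provider for the remote one
    provider = "akka.remote.RemoteActorRefProvider"
  }
  remote {
    netty.tcp {
      hostname = "hostName"  # address other nodes use to reach this one
      port     = 1234
    }
  }
}
```

With this in place, the `akka.tcp://systemName@hostName:1234/...` paths shown above become resolvable from other systems.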
Failure Detector
• The Phi Accrual Failure Detector
• phi = -log10(1 - F(timeSinceLastHeartbeat))
• Each node is monitored by a small number of other nodes, determined using a hash ring
• Output: confidence that a node is unreachable
• Also notices when it becomes reachable again
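The phi formula above can be sketched in plain Scala, assuming heartbeat inter-arrival times are normally distributed (a simplification; Akka's detector estimates mean and standard deviation from a sliding window of heartbeat history):

```scala
object PhiAccrual {
  // Error function approximation (Abramowitz & Stegun 7.1.26),
  // accurate to ~1.5e-7, good enough for an illustration.
  private def erf(x: Double): Double = {
    val t = 1.0 / (1.0 + 0.3275911 * math.abs(x))
    val y = 1.0 - (((((1.061405429 * t - 1.453152027) * t) + 1.421413741) * t
      - 0.284496736) * t + 0.254829592) * t * math.exp(-x * x)
    if (x >= 0) y else -y
  }

  // Cumulative distribution function F of a normal distribution.
  private def normalCdf(x: Double, mean: Double, stdDev: Double): Double =
    0.5 * (1.0 + erf((x - mean) / (stdDev * math.sqrt(2.0))))

  // phi = -log10(1 - F(timeSinceLastHeartbeat))
  def phi(timeSinceLastHeartbeatMs: Double, meanMs: Double, stdDevMs: Double): Double =
    -math.log10(1.0 - normalCdf(timeSinceLastHeartbeatMs, meanMs, stdDevMs))
}
```

For a heartbeat stream with mean 1000 ms and standard deviation 100 ms, phi is about 0.3 when a heartbeat is exactly on time and climbs steeply once the silence exceeds a few standard deviations, eventually crossing the configured suspicion threshold.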
Akka Cluster
• Cluster membership managed using a gossip protocol
• Dynamo-based system
• Subscribe to cluster state events
• Roles
• Restrictions on the number of nodes possible (also per role)
• When gossip convergence is reached, a leader can be determined deterministically
• Head of the list of nodes in alphanumeric order
• Leader joins / removes members
• Leader can auto-down members
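The leader rule ("head of the sorted node list") can be sketched in plain Scala (hypothetical Node type, not the Akka API):

```scala
// Hypothetical stand-in for a cluster member's address.
final case class Node(address: String)

object LeaderSelection {
  // After gossip convergence every node sees the same member set,
  // so sorting by address and taking the head is deterministic:
  // all nodes independently agree on the same leader, with no election.
  def leader(members: Set[Node]): Option[Node] =
    if (members.isEmpty) None
    else Some(members.toSeq.sortBy(_.address).head)
}
```

The key point is that no messages are exchanged to pick the leader; determinism over a converged view replaces an election protocol.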
• Join manually or via seed nodes
• The first seed node joins itself
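Seed nodes are typically listed in configuration; the addresses here are placeholders:

```hocon
akka.cluster {
  seed-nodes = [
    "akka.tcp://systemName@host1:2552",
    "akka.tcp://systemName@host2:2552"
  ]
}
```

A new node tries the seed nodes in order; only the first listed seed node may join itself, which is why that list should be identical on all nodes.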
case class Gossip(
  members: immutable.SortedSet[Member], // sorted set of members with their status, sorted by address
  overview: GossipOverview = GossipOverview(),
  version: VectorClock = VectorClock()) // vector clock version

case class GossipOverview(
  seen: Set[UniqueAddress] = Set.empty,
  reachability: Reachability = Reachability.empty)

case class VectorClock(
  versions: TreeMap[VectorClock.Node, Long] = TreeMap.empty[VectorClock.Node, Long]) {

  /**
   * Compare two vector clocks. The outcome will be one of the following:
   *
   * 1. Clock 1 is SAME (==) as Clock 2 iff for all i c1(i) == c2(i)
   * 2. Clock 1 is BEFORE (<) Clock 2 iff for all i c1(i) <= c2(i)
   *    and there exists a j such that c1(j) < c2(j)
   * 3. Clock 1 is AFTER (>) Clock 2 iff for all i c1(i) >= c2(i)
   *    and there exists a j such that c1(j) > c2(j)
   * 4. Clock 1 is CONCURRENT (<>) to Clock 2 otherwise.
   */
  def compareTo(that: VectorClock): Ordering =
    compareOnlyTo(that, FullOrder)
}
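The four outcomes documented above can be implemented in a few lines of self-contained Scala (a simplification of the Akka class, with plain string node names):

```scala
object VectorClocks {
  sealed trait Ord
  case object Same extends Ord
  case object Before extends Ord
  case object After extends Ord
  case object Concurrent extends Ord

  // Compare per-node counters; a missing entry counts as 0.
  def compare(c1: Map[String, Long], c2: Map[String, Long]): Ord = {
    val nodes   = c1.keySet ++ c2.keySet
    val less    = nodes.exists(n => c1.getOrElse(n, 0L) < c2.getOrElse(n, 0L))
    val greater = nodes.exists(n => c1.getOrElse(n, 0L) > c2.getOrElse(n, 0L))
    (less, greater) match {
      case (false, false) => Same       // equal everywhere
      case (true, false)  => Before     // c1 strictly behind somewhere, ahead nowhere
      case (false, true)  => After      // c1 strictly ahead somewhere, behind nowhere
      case (true, true)   => Concurrent // diverged: behind and ahead at once
    }
  }
}
```

The Concurrent case is what lets gossip detect that two cluster views diverged and must be merged rather than one overwriting the other.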
Akka Cluster Lifecycle
Cluster Singletons
• e.g. single point of entry, centralized routing logic, …
• Lives on the oldest node
• ClusterSingletonManager started on each node
• ClusterSingletonProxy for accessing the current singleton
[Diagram: a ClusterSingletonManager runs on every node (Node1, Node2, Node3); the Singleton itself lives next to one manager, and Singleton Proxies on all nodes route messages to it.]
Cluster Singletons: Caveats
• Single point of bottleneck
• Must recover state on migration
• In case of split brain, multiple singletons
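"Lives on the oldest node" can be illustrated with a plain-Scala sketch (hypothetical Member type, not the Akka API; Akka orders members by how long they have been up):

```scala
// Hypothetical stand-in for a cluster member: upNumber grows
// monotonically as nodes join, so the smallest value is the oldest node.
final case class Member(address: String, upNumber: Int)

object SingletonHost {
  // The singleton is (re)started on the oldest member; when that node
  // leaves the cluster, the next-oldest member takes over.
  def host(members: Set[Member]): Option[Member] =
    if (members.isEmpty) None else Some(members.minBy(_.upNumber))
}
```

This also explains the restart-order advice later in the deck: restarting youngest-to-oldest keeps the oldest node stable for as long as possible and minimizes singleton migrations.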
Distributed PubSub
• DistributedPubSubMediator started on all nodes
• Subscriptions are gossiped, eventually consistent
• Modes: Publish, Group Publish, Send
• Used e.g. for cluster-wide config, chat system, …
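Why gossiped subscriptions converge can be shown with a toy model (not the Akka data structure): each node's view is a topic-to-subscribers map, and merging two views is a per-topic set union.

```scala
object PubSubGossip {
  type Subscriptions = Map[String, Set[String]] // topic -> subscriber names

  // Set union per topic is commutative, associative and idempotent,
  // so no matter in which order gossip rounds arrive, all nodes
  // eventually converge on the same subscription table.
  def merge(a: Subscriptions, b: Subscriptions): Subscriptions =
    (a.keySet ++ b.keySet).map { topic =>
      topic -> (a.getOrElse(topic, Set.empty) ++ b.getOrElse(topic, Set.empty))
    }.toMap
}
```

The eventual-consistency caveat follows directly: a freshly subscribed actor may miss publishes until its subscription has gossiped to the publishing node.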
Cluster Sharding
• Distribute work
• Workload partitioned by a shard key derived from the message
• Messages must be serializable
• Each node is responsible for n shards, and each shard is allocated to one node
• ShardRegion is the entry point for messages and controls workers
• ShardCoordinator singleton assigns shards
• Shards distributed by number of workers by default
• Shards migrate for rebalancing or on failure
• Shard assignments can be persisted
• Running workers per shard can be remembered
• Workers must step down
• Workers must persist state if they need it after migration
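Deriving the shard from the message key is a pure function. The division by 100 below mirrors the worked example on these slides (keys 1234 and 1299 both land in shard 12) and is an arbitrary illustrative choice; a hash modulo the number of shards is more typical in practice:

```scala
object Sharding {
  // Message with a routing key, as in the slides' example.
  final case class Envelope(key: Int, payload: String)

  // Shard id derived deterministically from the key: key / 100 here,
  // so 1234 -> "12" and 1299 -> "12". A real extractor would usually
  // use something like (math.abs(key.hashCode) % numberOfShards).
  def shardId(msg: Envelope): String = (msg.key / 100).toString
}
```

Determinism is the important property: every ShardRegion must compute the same shard id for the same message, or routing falls apart.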
Cluster Sharding
[Diagram sequence: ShardRegions on Node1 and Node2, a ShardRegion Proxy on Node3, and the ShardCoordinator singleton]
• A message ("Hello World", key = 1234, shard = 12) arrives at a Shard Region (or at a Shard Region Proxy on a node without workers)
• The Shard Region asks the Shard Coordinator which node owns shard 12
• The Coordinator allocates shard 12 to Node1; the Region caches the answer
• Entity 1234 is created on Node1 and receives the message
• The other Shard Regions learn the allocation and route subsequent messages for shard 12 directly
• A message with another key in the same shard ("Happy Day", key = 1299, shard = 12) creates Entity 1299 next to Entity 1234
• Over time, shards (e.g. 3, 9, 12) and their entities spread across the nodes
• When a node fails without remembered shards, its shards are reallocated, but entities are only recreated when the next message for them arrives (see also ddata mode, https://github.com/akka/akka/issues/19003)
• With remembered shards / persistence, the failed node's shards and their entities are restarted on the surviving nodes
Sharding and Persistence
• Persist ShardCoordinator state:
  akka.cluster.sharding {
    state-store-mode = "persistence"
    journal-plugin-id = "akka-contrib-mongodb-persistence-journal-sharding"
    snapshot-plugin-id = "akka-contrib-mongodb-persistence-snapshot-sharding"
  }
• Alternatively: akka.cluster.sharding.state-store-mode = "ddata"
• Remember shard entities: akka.cluster.sharding.remember-entities = "on"
• Step down of workers: context.parent ! Passivate(StopMessage)
Distributed Data
• KV store based on Conflict-free Replicated Data Types (CRDTs)
  • Counters: GCounter, PNCounter
  • Sets: GSet, ORSet
  • Maps: ORMap, ORMultiMap, LWWMap, PNCounterMap
  • Registers: LWWRegister, Flag
• Not intended for Big Data: in memory, full state replicated
• Start ddata.Replicator

val Counter1Key = PNCounterKey("counter1")
replicator ! Update(Counter1Key, PNCounter(), WriteLocal)(_ + 1)

val readFrom3 = ReadFrom(n = 3, timeout = 1.second)
replicator ! Get(Counter1Key, readFrom3)

• Consistency levels
  • ReadLocal / WriteLocal
  • ReadFrom(n) / WriteTo(n)
  • ReadMajority / WriteMajority
  • ReadAll / WriteAll
• Can subscribe to changes
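Why CRDTs can replicate without coordination is easiest to see with a minimal grow-only counter in plain Scala (an illustration, not the Akka implementation):

```scala
// Grow-only counter: one monotonically increasing slot per node.
final case class GCounter(slots: Map[String, Long] = Map.empty) {
  def increment(node: String, by: Long = 1L): GCounter =
    copy(slots = slots.updated(node, slots.getOrElse(node, 0L) + by))

  def value: Long = slots.values.sum

  // Merge is a per-slot max: commutative, associative and idempotent,
  // so replicas that applied concurrent updates on different nodes
  // always converge to the same state, in any merge order.
  def merge(that: GCounter): GCounter =
    GCounter((slots.keySet ++ that.slots.keySet).map { n =>
      n -> math.max(slots.getOrElse(n, 0L), that.slots.getOrElse(n, 0L))
    }.toMap)
}
```

A PNCounter is simply a pair of these (one for increments, one for decrements), which is why Distributed Data can offer counters without any locking.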
Putting everything together
[Diagram: example webshop at http://shop.com with the Device Ident snippet integrated]
Live Example...
• Config
• Scaling horizontally
• Cluster client
• Multi-JVM test
• Singleton monitoring throughput and lifecycle
Caveats and Lessons learned
• Remoting setup
  • TLS certificate rolling update
    • Difficult to test whether new settings work while the old ones are still there
    • In our case: export-restricted crypto
    • Not too critical if noticed early in the rolling upgrade
  • Adjust failure detector settings to your environment
    • Quite strict for us
    • Must accept higher latencies & short interruptions in cloud environments
  • Configure internal & external hostname in containerized / NATed / … environments
    • Hostname and IP are completely different for Akka!
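Both points translate into configuration. The values below are illustrative, not recommendations, and the bind-* keys assume a reasonably recent Akka version:

```hocon
akka {
  cluster.failure-detector {
    threshold = 12.0                   # phi suspicion threshold (default 8.0)
    acceptable-heartbeat-pause = 5 s   # tolerate short interruptions
  }
  remote.netty.tcp {
    hostname      = "external.host"    # address other nodes connect to
    port          = 2552
    bind-hostname = "0.0.0.0"          # address actually bound inside the container
    bind-port     = 2552
  }
}
```

Raising the threshold and acceptable pause trades faster failure detection for fewer false unreachability verdicts, which matters in cloud environments with jittery latency.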
Caveats and Lessons learned
• Cluster setup
  • Currently a rather static hardware environment
  • Joining via a list of seed nodes
  • Had a split brain once; needed to carefully restart the right part of the cluster
  • Log the cluster state each node sees (or use JMX)!
• Preventing split brain
  • Maybe disable auto-down
  • Use a split brain resolver
  • Adjust failure detector settings to your environment
• Restart order youngest to oldest to minimize singleton migrations
Caveats and Lessons learned
• Sharding
  • Recovery of persistent actors should be planned (might take time)
  • If the shard coordinator fails to recover, you're doomed
    • Use a separate journal for internal sharding state
    • It can be cleaned if the cluster is shut down, i.e. delete the journal
    • Will allow for ShardCoordinator recovery
    • Will make shard allocation state in ShardRegions inconsistent
    • Fix state by rolling restart
Shutdown
// Play cannot stop accepting requests.
// Fail the healthcheck, so the load balancer removes this node.
GlobalHealthcheck.fail(reason = shuttingDown)

// Migrate all shards to other nodes, stop accepting new ones.
val cluster = Cluster(context.system)
context.watch(region)
region ! ShardRegion.GracefulShutdown

// After shutdown of the ShardRegion, shut down the ActorSystem.
case Terminated(`region`) =>
  cluster.leave(cluster.selfAddress)
  cluster.registerOnMemberRemoved { system.terminate() }

// After shutdown of the ActorSystem, shut down the app.
system.registerOnTermination {
  System.exit(0)
}
Q&A
https://riskident.com/en/about/jobs/