Akka in Production - ScalaDays 2015

Akka in Production Evan Chan Scala Days 2015 March 17, 2015


Page 1: Akka in Production - ScalaDays 2015

Akka in Production

Evan Chan

Scala Days 2015 March 17, 2015

Page 2: Akka in Production - ScalaDays 2015

Who is this guy?

• Principal Engineer, Socrata, Inc.

• http://github.com/velvia

• Author of multiple open source Akka/Scala projects - Spark Job Server, ScalaStorm, etc.

• @evanfchan

Page 3: Akka in Production - ScalaDays 2015

A plug for a few projects…

• http://github.com/velvia/links - my stash of interesting Scala & big data projects

• http://github.com/velvia/filo - a new, extreme vector serialization library for fast analytics

• Talk to me later if you are interested in fast serialization or columnar/analytics databases

Page 4: Akka in Production - ScalaDays 2015

Who is Socrata?

We are a Seattle-based software startup.

We make data useful to everyone.

Open, Public Data

Consumers

Apps

Page 5: Akka in Production - ScalaDays 2015

Socrata is… the most widely adopted Open Data platform

Page 6: Akka in Production - ScalaDays 2015

Scala at Socrata

• Started with old monolithic Java app

• Started writing new features in Scala - 2.8

• Today - 100% backend development in Scala, 2.10 / 2.11, many microservices

• Custom SBT plugins, macros, more

• socrata-http

• rojoma-json

Page 7: Akka in Production - ScalaDays 2015

Want Reactive?

Event-driven, scalable, resilient and responsive

Page 8: Akka in Production - ScalaDays 2015
Page 9: Akka in Production - ScalaDays 2015

Agenda

• How does one get started with Akka?

• To be honest, Akka is what drew me into Scala

• Examples of Akka use cases

• Compared with other technologies

• Tips on using Akka in production

• Including back pressure, monitoring, VisualVM usage, etc.

Page 10: Akka in Production - ScalaDays 2015

Ingestion Architectures with Akka

Page 11: Akka in Production - ScalaDays 2015

Akka Stack

• Spray - high performance HTTP

• SLF4J / Logback

• Yammer Metrics

• spray-json

• Akka 2.x

• Scala 2.10

Page 12: Akka in Production - ScalaDays 2015

Ingesting 2 Billion Events / Day

[Diagram: Consumer watches video; Nginx; Raw Log Feeder; Kafka; Storm; New Stuff]

Page 13: Akka in Production - ScalaDays 2015

Livelogsd - Akka/Kafka file tailer

Current File

Rotated File

Rotated File 2

File Reader Actor

File Reader Actor

Kafka Feeder

Coordinator

Kafka

Page 14: Akka in Production - ScalaDays 2015

Storm - with or without Akka?

Kafka Spout

Bolt

Actor

Actor

• Actors talking to each other within a bolt for locality

• Don’t really need Actors in Storm

• In production, found Storm too complex to troubleshoot

• It’s 2am - what should I restart? Supervisor? Nimbus? ZK?

Page 15: Akka in Production - ScalaDays 2015

Akka Cluster-based Pipeline

[Diagram: five identical groups, each showing Kafka Consumer, Spray endpoint, Cluster Router, Processing Actors]

Page 16: Akka in Production - ScalaDays 2015

Lessons Learned

• Still too complex -- would we want to get paged for this system?

• Akka cluster in 2.1 was not ready for production (newer 2.2.x version is stable)

• Mixture of actors and futures for HTTP requests became hard to grok

• Actors were much easier for most developers to understand

Page 17: Akka in Production - ScalaDays 2015

Simplified Ingestion Pipeline

Kafka Partition 1 --> Kafka SimpleConsumer --> Converter Actor --> Cassandra Writer Actor

Kafka Partition 2 --> Kafka SimpleConsumer --> Converter Actor --> Cassandra Writer Actor

• Kafka used to partition messages

• Single process - super simple!

• No distribution of data

• Linear actor pipeline - very easy to understand

Page 18: Akka in Production - ScalaDays 2015

Stackable Actor Traits

Page 19: Akka in Production - ScalaDays 2015

Why Stackable Traits?

• Repeatedly adding monitoring, logging, metrics, and tracing code gets pretty ugly and repetitive

• We want some standard behavior around actors -- but we need to wrap the actor Receive block:

class someActor extends Actor {
  def wrappedReceive: Receive = {
    case x => blah
  }
  def receive = {
    case x =>
      println("Do something before...")
      wrappedReceive(x)
      println("Do something after...")
  }
}

Page 20: Akka in Production - ScalaDays 2015

Start with a base trait...

trait ActorStack extends Actor {
  /** Actor classes should implement this partialFunction for standard
   *  actor message handling
   */
  def wrappedReceive: Receive

  /** Stackable traits should override and call super.receive(x) for
   *  stacking functionality
   */
  def receive: Receive = {
    case x => if (wrappedReceive.isDefinedAt(x)) wrappedReceive(x) else unhandled(x)
    // or: (wrappedReceive orElse unhandled)(x)
  }
}

Page 21: Akka in Production - ScalaDays 2015

Instrumenting Traits...

trait Instrument1 extends ActorStack {
  override def receive: Receive = {
    case x =>
      println("Do something before...")
      super.receive(x)
      println("Do something after...")
  }
}

trait Instrument2 extends ActorStack {
  override def receive: Receive = {
    case x =>
      println("Antes...")
      super.receive(x)
      println("Despues...")
  }
}

Page 22: Akka in Production - ScalaDays 2015

Now just mix the Traits in....

class DummyActor extends Actor with Instrument1 with Instrument2 {
  def wrappedReceive = {
    case "something" => println("Got something")
    case x => println("Got something else: " + x)
  }
}

• Traits add instrumentation; Actors stay clean!

• The order in which traits are mixed in matters

Antes...
Do something before...
Got something
Do something after...
Despues...
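
To see why mixin order matters, here is a minimal sketch in plain Scala with no Akka dependency. `Stack`, `Worker`, `record`, `OneThenTwo` and `TwoThenOne` are hypothetical stand-ins introduced only for this illustration; output is collected in a `log` buffer instead of being printed, so the two orders can be compared.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for ActorStack: no Akka, output is recorded in `log`.
trait Stack {
  val log: ListBuffer[String] = ListBuffer.empty[String]
  def record(line: String): Unit = log += line

  def wrappedReceive: PartialFunction[Any, Unit]

  def receive: PartialFunction[Any, Unit] = {
    case x =>
      if (wrappedReceive.isDefinedAt(x)) wrappedReceive(x)
      else record("unhandled: " + x)
  }
}

trait Instrument1 extends Stack {
  override def receive: PartialFunction[Any, Unit] = {
    case x =>
      record("Do something before...")
      super.receive(x)
      record("Do something after...")
  }
}

trait Instrument2 extends Stack {
  override def receive: PartialFunction[Any, Unit] = {
    case x =>
      record("Antes...")
      super.receive(x)
      record("Despues...")
  }
}

// The "business logic", kept free of instrumentation.
trait Worker extends Stack {
  def wrappedReceive: PartialFunction[Any, Unit] = {
    case "something" => record("Got something")
    case x           => record("Got something else: " + x)
  }
}

// Same traits, opposite mixin order.
class OneThenTwo extends Worker with Instrument1 with Instrument2
class TwoThenOne extends Worker with Instrument2 with Instrument1
```

The trait mixed in last wraps everything before it: `OneThenTwo` logs "Antes..." first, while `TwoThenOne` logs "Do something before..." first.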

Page 23: Akka in Production - ScalaDays 2015

Productionizing Akka

Page 24: Akka in Production - ScalaDays 2015

On distributed systems: “The only thing that matters is visibility”

Page 25: Akka in Production - ScalaDays 2015

Akka Performance Metrics

• We define a trait that adds two metrics for every actor:

• frequency of messages handled (1min, 5min, 15min moving averages)

• time spent in receive block

• All metrics exposed via a Spray route /metricz

• Daemon polls /metricz and sends to metrics service

• Would like: mailbox size, but this is hard

Page 26: Akka in Production - ScalaDays 2015

Akka Performance Metrics

trait ActorMetrics extends ActorStack {
  // Timer includes a histogram of wrappedReceive() duration as well as moving avg of rate of invocation
  val metricReceiveTimer = Metrics.newTimer(getClass, "message-handler",
                                            TimeUnit.MILLISECONDS, TimeUnit.SECONDS)

  override def receive: Receive = {
    case x =>
      val context = metricReceiveTimer.time()
      try {
        super.receive(x)
      } finally {
        context.stop()
      }
  }
}

Page 27: Akka in Production - ScalaDays 2015

Performance Metrics (cont’d)

Page 28: Akka in Production - ScalaDays 2015

Performance Metrics (cont’d)

Page 29: Akka in Production - ScalaDays 2015

VisualVM and Akka

• Bounded mailboxes = time spent enqueueing msgs

Page 30: Akka in Production - ScalaDays 2015

VisualVM and Akka

• My dream: a VisualVM plugin to visualize Actor utilization across threads

Page 31: Akka in Production - ScalaDays 2015

Tracing Akka Message Flows

• Stack trace is very useful for traditional apps, but for Akka apps, you get this:

at akka.dispatch.Future$$anon$3.liftedTree1$1(Future.scala:195) ~[akka-actor-2.0.5.jar:2.0.5]
at akka.dispatch.Future$$anon$3.run(Future.scala:194) ~[akka-actor-2.0.5.jar:2.0.5]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:94) [akka-actor-2.0.5.jar:2.0.5]
at akka.jsr166y.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1381) [akka-actor-2.0.5.jar:2.0.5]
at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259) [akka-actor-2.0.5.jar:2.0.5]
at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975) [akka-actor-2.0.5.jar:2.0.5]
at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479) [akka-actor-2.0.5.jar:2.0.5]
at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104) [akka-actor-2.0.5.jar:2.0.5]

• What if you could get an Akka message trace?

--> trAKKAr message trace <--
  akka://Ingest/user/Super --> akka://Ingest/user/K1: Initialize
  akka://Ingest/user/K1 --> akka://Ingest/user/Converter: Data

Page 32: Akka in Production - ScalaDays 2015

Tracing Akka Message Flows

Page 33: Akka in Production - ScalaDays 2015

Tracing Akka Message Flows

• Trait sends an Edge(source, dest, messageInfo) to a local Collector actor

• Aggregate edges across nodes, graph and profit!

trait TrakkarExtractor extends TrakkarBase with ActorStack {
  import TrakkarUtils._

  val messageIdExtractor: MessageIdExtractor = randomExtractor

  override def receive: Receive = {
    case x =>
      lastMsgId = (messageIdExtractor orElse randomExtractor)(x)
      Collector.sendEdge(sender, self, lastMsgId, x)
      super.receive(x)
  }
}
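
The "aggregate edges, graph" step could look roughly like the sketch below. It is not the trAKKAr implementation: the `Edge` field types, `EdgeAggregator`, `countByPath` and `toDot` are assumptions made for illustration, and Graphviz DOT is just one possible output format.

```scala
// Hypothetical edge shape; the slide only names Edge(source, dest, messageInfo).
final case class Edge(source: String, dest: String, messageInfo: String)

object EdgeAggregator {
  /** Counts how many messages travelled along each (source, dest) pair. */
  def countByPath(edges: Seq[Edge]): Map[(String, String), Int] =
    edges.groupBy(e => (e.source, e.dest)).map { case (path, es) => path -> es.size }

  /** Renders the counts as Graphviz DOT text, one labelled arrow per path. */
  def toDot(counts: Map[(String, String), Int]): String = {
    val lines = counts.toSeq.sortBy(_._1).map {
      case ((src, dst), n) =>
        "  \"" + src + "\" -> \"" + dst + "\" [label=\"" + n + "\"];"
    }
    (Seq("digraph trace {") ++ lines ++ Seq("}")).mkString("\n")
  }
}
```

Edges collected from several nodes can simply be concatenated before calling `countByPath`, since the aggregation only depends on the actor paths.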

Page 34: Akka in Production - ScalaDays 2015

Akka Service Discovery

• Akka remote - need to know remote nodes

• Akka cluster - need to know seed nodes

• Use ZooKeeper or etcd

• http://blog.eigengo.com/2014/12/13/akka-cluster-inventory/ - Akka cluster inventory extension

• Be careful - Akka is very picky about IP addresses. Beware of AWS, Docker, etc. Test, test, test.

Page 35: Akka in Production - ScalaDays 2015

Akka Instrumentation Libraries

• http://kamon.io

• Uses AspectJ to “weave” in instrumentation. Metrics, logging, tracing.

• Instruments Akka, Spray, Play

• Provides StatsD / Graphite and other backends

• https://github.com/levkhomich/akka-tracing

• Zipkin distributed tracing for Akka

Page 36: Akka in Production - ScalaDays 2015

Backpressure and Reliability

Page 37: Akka in Production - ScalaDays 2015

Intro to Backpressure

• Backpressure - ability to tell senders to slow down/stop

• Must look at entire system.

• Individual components (e.g. TCP) having flow control does not mean the entire system behaves well

Page 38: Akka in Production - ScalaDays 2015

Why not bounded mailboxes?

• By default, actor mailboxes are unbounded

• Using bounded mailboxes

• When mailbox is full, messages go to DeadLetters

• mailbox-push-timeout-time: how long to wait when mailbox is full

• Doesn’t work for distributed Akka systems!

• Real flow control: pull, push with acks, etc.

• Works anywhere, but more work
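
A toy model of the bounded-mailbox behaviour described above, in plain Scala. `ToyBoundedMailbox` is a hypothetical class, not Akka's mailbox: a real bounded mailbox also waits up to mailbox-push-timeout-time before giving up, whereas this sketch assumes a wait of zero.

```scala
import scala.collection.mutable

// Toy model only: the push timeout is assumed to be zero, so a full
// mailbox sends the message to deadLetters immediately.
final class ToyBoundedMailbox[A](capacity: Int) {
  private val queue = mutable.Queue.empty[A]
  val deadLetters: mutable.ListBuffer[A] = mutable.ListBuffer.empty[A]

  /** Returns true if the message was enqueued, false if it went to deadLetters. */
  def offer(msg: A): Boolean =
    if (queue.size < capacity) { queue.enqueue(msg); true }
    else { deadLetters += msg; false }

  /** Takes the oldest message, if any. */
  def poll(): Option[A] = if (queue.isEmpty) None else Some(queue.dequeue())

  def size: Int = queue.size
}
```

The toy `offer` at least returns a Boolean. An Akka `!` returns Unit, so the sender gets no signal that the message was dropped, which is why bounding the mailbox alone is not flow control.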

Page 39: Akka in Production - ScalaDays 2015

Backpressure in Action

• A working back pressure system causes the rate of all actor components to be in sync.

• Witness this message flow rate graph of the start of event processing:

Page 40: Akka in Production - ScalaDays 2015

Akka Streams

• Very conservative (“pull based”)

• Consumer must first give permission to Publisher to send data

• How does it work for fan-in scenarios?
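
The "consumer must first give permission" idea can be sketched in a few lines of plain Scala. This models demand signalling only and is not the Akka Streams API; `ToyPublisher`, `request` and `emit` are hypothetical names.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical names; this models the demand idea only, not the Akka Streams API.
final class ToyPublisher[A](source: Iterator[A]) {
  private var demand = 0L

  /** Consumer grants permission for n more elements. */
  def request(n: Long): Unit = {
    require(n > 0, "demand must be positive")
    demand += n
  }

  /** Emits only while there is outstanding demand; returns what was sent. */
  def emit(): List[A] = {
    val out = ListBuffer.empty[A]
    while (demand > 0 && source.hasNext) {
      out += source.next()
      demand -= 1
    }
    out.toList
  }
}
```

With no demand nothing is emitted, so a slow consumer can never be overrun. The cost is one extra message from consumer to publisher for each grant.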

Page 41: Akka in Production - ScalaDays 2015

Backpressure for fan-in

• Multiple input streams go to a single resource (DB?)

• May come and go

• Pressure comes from each stream and from the number of streams

Stream 1

Stream 2

Stream 3

Stream 4

Writer Actor

DB

Page 42: Akka in Production - ScalaDays 2015

Backpressure for fan-in

• Same simple model, can control number of clients

• High overhead: lots of streams to notify “Ready”

Stream 1

Stream 2

Writer Actor

Register

Ready for data

Data

Page 43: Akka in Production - ScalaDays 2015

At Least Once Delivery

What if you can’t drop messages on the floor?

Page 44: Akka in Production - ScalaDays 2015

At Least Once Delivery

• Let every message have a unique ID.

• Ack returns with unique ID to confirm message send.

• What happens if you don’t get an ack?

Actor A

Actor B

Msg 100 Msg 101 Msg 102

Ack 100 Ack 101?

Page 45: Akka in Production - ScalaDays 2015

At Least Once Delivery

• Resend unacked messages until confirmed == “at least once”

Actor A

Actor B

Msg 100 Msg 101 Msg 102

Ack 100 Ack 101?

Resend 101

Ack timeout
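
The sender side of this resend loop might be tracked as in the sketch below. `ResendTracker` and its methods are hypothetical, and time is passed in explicitly (as milliseconds) so the logic does not depend on a scheduler.

```scala
import scala.collection.mutable

// Hypothetical sketch: time is passed in explicitly so the logic is easy to test.
final class ResendTracker[A](ackTimeoutMillis: Long) {
  // id -> (payload, time of last send)
  private val unacked = mutable.LinkedHashMap.empty[Long, (A, Long)]

  /** Record that message `id` was sent at time `now`. */
  def sent(id: Long, payload: A, now: Long): Unit =
    unacked.update(id, (payload, now))

  /** Record that the receiver confirmed message `id`. */
  def acked(id: Long): Unit = unacked -= id

  /** Messages whose ack is overdue; marks them as re-sent at `now`. */
  def dueForResend(now: Long): List[(Long, A)] = {
    val due = unacked.toList.collect {
      case (id, (payload, lastSent)) if now - lastSent >= ackTimeoutMillis => (id, payload)
    }
    due.foreach { case (id, payload) => unacked.update(id, (payload, now)) }
    due
  }

  def pending: Int = unacked.size
}
```

Because a resend can race with a late ack, the receiver may see message 101 twice. That is exactly what "at least once" permits.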

Page 46: Akka in Production - ScalaDays 2015

At Least Once Delivery & Akka

• Resending messages requires keeping message history around

• Unless your source of messages is Kafka - then just replay from the last successful offset + 1

• Use Akka Persistence - has at-least-once semantics + persistence of messages for better durability

• Exactly Once = at least once + deduplication

• Akka Persistence has this too!
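
The "at least once + deduplication" formula can be illustrated on the receiving side with a plain-Scala sketch. `DedupReceiver` is a hypothetical class, not how Akka Persistence does it, and it assumes every message carries a unique `Long` id.

```scala
import scala.collection.mutable

// Hypothetical receiver: remembers the ids it has already processed.
final class DedupReceiver[A](process: A => Unit) {
  private val seen = mutable.Set.empty[Long]

  /** Processes each id only once, but always returns the id so it can be acked. */
  def receive(id: Long, payload: A): Long = {
    if (seen.add(id)) process(payload) // add returns false for a duplicate
    id
  }
}
```

Duplicates are still acked so the sender stops resending. In this sketch the set of seen ids grows without bound; a real system would need to expire old ids or persist them.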

Page 47: Akka in Production - ScalaDays 2015

Backpressure and at-least-once

• How about a system that works for fan-in, and handles back pressure and at-least-once too?

• Let the client have an upper limit of unacked messages

• Server can reject new messages

Stream 1

Stream 2

Writer Actor

Msg 100

Ack 100

Msg 101

Msg 200

Reject!
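
A minimal plain-Scala sketch of the two limits above: the client caps its own unacked messages, and the writer rejects work beyond a fixed number in flight. `ToyWriter`, `ToyClient`, `Ack` and `Reject` are hypothetical names chosen to match the diagram labels, not an existing API.

```scala
import scala.collection.mutable

sealed trait Reply
final case class Ack(id: Long) extends Reply
final case class Reject(id: Long) extends Reply

// Hypothetical writer that refuses work beyond a fixed number of in-flight messages.
final class ToyWriter(maxInFlight: Int) {
  private val inFlight = mutable.Set.empty[Long]

  /** Accepts the message (None) or rejects it (Some(Reject)) when full. */
  def offer(id: Long): Option[Reject] =
    if (inFlight.size >= maxInFlight) Some(Reject(id))
    else { inFlight += id; None }

  /** Called when the write for `id` has finished. */
  def complete(id: Long): Ack = { inFlight -= id; Ack(id) }
}

// Hypothetical client that caps its own unacked messages.
final class ToyClient(maxUnacked: Int) {
  private val unacked = mutable.Set.empty[Long]

  def canSend: Boolean = unacked.size < maxUnacked

  def sent(id: Long): Unit = {
    require(canSend, "too many unacked messages")
    unacked += id
  }

  def onReply(reply: Reply): Unit = reply match {
    case Ack(id)   => unacked -= id
    case Reject(_) => () // stays unacked so it can be retried later
  }
}
```

A rejected message keeps occupying a slot in the client's window, so a client that keeps getting rejected slows down on its own. The writer never has to notify every stream that it is ready.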

Page 48: Akka in Production - ScalaDays 2015

Backpressure and Futures

• Use an actor to limit # of outstanding futures

class CommandThrottlingActor(mapper: CommandThrottlingActor.Mapper, maxFutures: Int) extends BaseActor {
  import CommandThrottlingActor._
  import context.dispatcher // for future callbacks

  val mapperWithDefault = mapper orElse ({
    case x: Any => Future { NoSuchCommand }
  }: Mapper)
  var outstandingFutures = 0

  def receive: Receive = {
    case FutureCompleted =>
      if (outstandingFutures > 0) outstandingFutures -= 1
    case c: Command =>
      if (outstandingFutures >= maxFutures) {
        sender ! TooManyOutstandingFutures
      } else {
        outstandingFutures += 1
        val originator = sender // sender is a function, don't call in the callback
        // Note: onSuccess fires only for successful futures, so a failed
        // future never sends FutureCompleted and its slot is not released
        mapperWithDefault(c).onSuccess {
          case response: Response =>
            self ! FutureCompleted
            originator ! response
        }
      }
  }
}

Page 49: Akka in Production - ScalaDays 2015

Good Akka development practices

• Don't put things that can fail into Actor constructor

• Default supervision strategy stops an Actor which cannot initialize itself

• Instead use an Initialize message

• Put your messages in the Actor’s companion object

• Namespacing is nice

Page 50: Akka in Production - ScalaDays 2015

Couple more random hints

• Learn Akka Testkit.

• Master it! The most useful tool for testing Akka actors.

• Many examples in spark-jobserver repo

• gracefulStop()

• TestKit.shutdownActorSystem(system)

Page 51: Akka in Production - ScalaDays 2015

Thank you!

• Queues don’t fix overload

• Stackable actor traits - see ActorStack in spark-jobserver repo

Page 52: Akka in Production - ScalaDays 2015

Extra slides

Page 53: Akka in Production - ScalaDays 2015

Putting it all together

Page 54: Akka in Production - ScalaDays 2015

Akka Visibility, Minimal Footprint

trait InstrumentedActor extends Slf4jLogging with ActorMetrics with TrakkarExtractor

object MyWorkerActor {
  case object Initialize
  case class DoSomeWork(desc: String)
}

class MyWorkerActor extends InstrumentedActor {
  def wrappedReceive = {
    case Initialize =>
    case DoSomeWork(desc) =>
  }
}

Page 55: Akka in Production - ScalaDays 2015

Using Logback with Akka

• Pretty easy setup

• Include the Logback jar

• In your application.conf:

event-handlers = ["akka.event.slf4j.Slf4jEventHandler"]

• Use a custom logging trait, not ActorLogging

• ActorLogging does not allow adjustable logging levels

• Want the Actor path in your messages?

• org.slf4j.MDC.put("actorPath", self.path.toString)

Page 56: Akka in Production - ScalaDays 2015

Using Logback with Akka

trait Slf4jLogging extends Actor with ActorStack {
  val logger = LoggerFactory.getLogger(getClass)
  private[this] val myPath = self.path.toString

  logger.info("Starting actor " + getClass.getName)

  override def receive: Receive = {
    case x =>
      org.slf4j.MDC.put("akkaSource", myPath)
      super.receive(x)
  }
}