POC d'une architecture distribuee de calculs financiers

Distributed system for financial calculations, by Xavier Bucchiotty

description

Presentation given at the Open-XKE of Xebia France. This is the result of a POC on building a distributed architecture for financial calculations. It covers Scala, functional programming, Stream, the Iteratee pattern, Akka Actors and Akka Cluster.

Transcript of POC d'une architecture distribuee de calculs financiers

Page 1: POC d'une architecture distribuee de calculs financiers

Distributed system for financial calculations

By Xavier Bucchiotty

Page 2: POC d'une architecture distribuee de calculs financiers

ME

@xbucchiotty

https://github.com/xbucchiotty

http://blog.xebia.fr/author/xbucchiotty

Page 3: POC d'une architecture distribuee de calculs financiers

Build a testable,

composable and scalable

cash-flow system

Page 4: POC d'une architecture distribuee de calculs financiers

Step 1: Stream API

Step 2: Iteratees

Step 3: Akka actor

Step 4: Akka cluster

Page 5: POC d'une architecture distribuee de calculs financiers

Use case: Financial debt management

Page 6: POC d'une architecture distribuee de calculs financiers

CAUTION

Page 8: POC d'une architecture distribuee de calculs financiers

initial = 1000 €
duration = 5 years
fixed interest rate = 5%

Date        Amort   Interests   Outstanding
2013-01-01  200 €   50 €        800 €
2014-01-01  200 €   40 €        600 €
2015-01-01  200 €   30 €        400 €
2016-01-01  200 €   20 €        200 €
2017-01-01  200 €   10 €        0 €

1000 €

Page 9: POC d'une architecture distribuee de calculs financiers


date = last date + (1 year)

Page 10: POC d'une architecture distribuee de calculs financiers


amort = initial / duration

Page 11: POC d'une architecture distribuee de calculs financiers


outstanding = last outstanding - amort

Page 12: POC d'une architecture distribuee de calculs financiers


interests = last outstanding * rate

Page 13: POC d'une architecture distribuee de calculs financiers

val f = (last: Row) => new Row {
  def date = last.date + (1 year)
  def amortization = last.amortization
  def outstanding = last.outstanding - amortization
  def interests = last.outstanding * fixedRate
}
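The four rules above can be checked against the table with a small self-contained sketch. It is only an illustration under stated assumptions: `Row` is modelled as a case class with `Int` years and `BigDecimal` amounts instead of the talk's date and amount types, and `Schedule`, `initial`, `duration`, `rate` and `rows` are hypothetical names, not the author's code.

```scala
// Hypothetical, simplified model: years as Int, amounts as BigDecimal.
final case class Row(year: Int,
                     amortization: BigDecimal,
                     interests: BigDecimal,
                     outstanding: BigDecimal)

object Schedule {
  val initial  = BigDecimal(1000)
  val duration = 5
  val rate     = BigDecimal("0.05")

  // First row: derived from the initial outstanding amount.
  val first: Row = {
    val amort = initial / duration
    Row(2013, amort, initial * rate, initial - amort)
  }

  // Every following row depends only on the previous one.
  val f: Row => Row = last => Row(
    year         = last.year + 1,
    amortization = last.amortization,
    interests    = last.outstanding * rate,
    outstanding  = last.outstanding - last.amortization
  )

  // Apply f repeatedly, starting from the first row, and keep `duration` rows.
  val rows: List[Row] = Iterator.iterate(first)(f).take(duration).toList
}
```

With these inputs, `Schedule.rows` reproduces the five lines of the amortization table: interests of 50, 40, 30, 20 and 10, and an outstanding amount falling from 800 to 0.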

Page 14: POC d'une architecture distribuee de calculs financiers

Step 1: Stream API

Page 16: POC d'une architecture distribuee de calculs financiers

Date        Amort   Interests   Outstanding
2013-01-01  200 €   50 €        800 €         first
2014-01-01  200 €   40 €        600 €         f(first)
2015-01-01  200 €   30 €        400 €         f(f(first))
2016-01-01  200 €   20 €        200 €
2017-01-01  200 €   10 €        0 €

Page 17: POC d'une architecture distribuee de calculs financiers

case class Loan( ... ) {
  def first: Row
  def f: (Row => Row)
  def rows = Stream.iterate(first)(f).take(duration)
}

Page 18: POC d'une architecture distribuee de calculs financiers

case class Portfolio(loans: Seq[Loan]) {
  def rows =
    loans.toStream.flatMap(_.rows)
}

Page 19: POC d'une architecture distribuee de calculs financiers

Date        Amort   Interests   Total paid

Loan 1
2013-01-01  200 €   50 €        250 €
2014-01-01  200 €   40 €        240 €
2015-01-01  200 €   30 €        230 €
2016-01-01  200 €   20 €        220 €
2017-01-01  200 €   10 €        210 €

Loan 2
2013-01-01  200 €   50 €        250 €
2014-01-01  200 €   40 €        240 €
2015-01-01  200 €   30 €        230 €
2016-01-01  200 €   20 €        220 €
2017-01-01  200 €   10 €        210 €

Loan 3
2013-01-01  200 €   50 €        250 €
2014-01-01  200 €   40 €        240 €
2015-01-01  200 €   30 €        230 €
2016-01-01  200 €   20 €        220 €
2017-01-01  200 €   10 €        210 €

Total: 3450 €

Page 20: POC d'une architecture distribuee de calculs financiers

// Produce rows
val totalPaid = portfolio.rows
  // Transform rows to amount
  .map(row => row.interests + row.amortization)
  // Consume amount
  .foldLeft(0 EUR)(_ + _)
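The produce / transform / consume pipeline can be tried end to end with the sketch below. It is a minimal version under assumptions: amounts are plain `Int` euros rather than the talk's amount type, the `Loan` constructor parameters (`initial`, `duration`, `ratePercent`) and the `TotalPaid` object are hypothetical, and `Stream` is the Scala 2.10-era lazy collection used in the talk (on Scala 2.13 and later it is deprecated in favour of `LazyList`).

```scala
// Hypothetical simplified types: amounts are plain Ints in euros.
final case class Row(year: Int, amortization: Int, interests: Int, outstanding: Int)

final case class Loan(initial: Int, duration: Int, ratePercent: Int) {
  private val amort = initial / duration

  def first: Row =
    Row(2013, amort, initial * ratePercent / 100, initial - amort)

  def f: Row => Row = last =>
    Row(last.year + 1, amort, last.outstanding * ratePercent / 100, last.outstanding - amort)

  // Lazy: a row is only computed when a consumer asks for it.
  def rows: Stream[Row] = Stream.iterate(first)(f).take(duration)
}

final case class Portfolio(loans: Seq[Loan]) {
  // Rows of all loans, one loan after the other (sequential).
  def rows: Stream[Row] = loans.toStream.flatMap(_.rows)
}

object TotalPaid {
  // Three identical loans, as in the table of the previous slide.
  val portfolio = Portfolio(Seq.fill(3)(Loan(1000, 5, 5)))

  val total: Int = portfolio.rows                 // produce rows
    .map(row => row.interests + row.amortization) // transform rows to amounts
    .foldLeft(0)(_ + _)                           // consume amounts
}
```

Each loan pays 250 + 240 + 230 + 220 + 210 = 1150 €, so the three loans together give the 3450 € total shown on the slide.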

Page 21: POC d'une architecture distribuee de calculs financiers


type RowProducer = Iterable[Row]
type RowTransformer[T] = (Row => T)
type AmountConsumer[T] = (Iterable[Amount] => T)

Page 22: POC d'une architecture distribuee de calculs financiers

RowProducer (Iterable[Row])

// Loan
Stream.iterate(first)(f) take duration

// Portfolio
loans => loans flatMap (loan => loan.rows)

+ on demand computation
- sequential computation

Page 23: POC d'une architecture distribuee de calculs financiers

RowTransformer (Row => T)

object RowTransformer {
  val totalPaid = (row: Row) =>
    row.interests + row.amortization
}

+ function composition
- type limited to «map»

Page 24: POC d'une architecture distribuee de calculs financiers

AmountConsumer (Iterable[Amount] => T)

object AmountConsumer {
  def sum = (rows: Iterable[Amount]) => rows.foldLeft(Amount(0, EUR))(_ + _)
}

+ function composition
- synchronism

Page 25: POC d'une architecture distribuee de calculs financiers

                      5000 loans, 50 rows
Step 1: Stream API    ~ 560 ms

Page 26: POC d'une architecture distribuee de calculs financiers

Pros:
+ On demand computation
+ Function composition

Cons:
- Sequential computation
- Synchronism
- Transformation limited to «map»

Page 27: POC d'une architecture distribuee de calculs financiers

Step 2: Iteratees

Page 28: POC d'une architecture distribuee de calculs financiers

Integrating Play iteratees

libraryDependencies ++= Seq(
  "com.typesafe.play" %% "play-iteratees" % "2.2.0-RC2")

Page 29: POC d'une architecture distribuee de calculs financiers

Diagram: the Enumerator (producer) sends Input to the Iteratee (consumer), which answers with a Status.

Page 30: POC d'une architecture distribuee de calculs financiers

Diagram: Enumerator, Input, Iteratee, Status.

Iteratees are immutable

Asynchronous by design

Type safe

Page 31: POC d'une architecture distribuee de calculs financiers

Enumerator

enumerate and interleave

Page 32: POC d'une architecture distribuee de calculs financiers

Data producer

case class Loan(initial: Amount, duration: Int, rowIt: RowIt) {
  def rows(implicit ctx: ExecutionContext) =
    Enumerator.enumerate(
      Stream.iterate(first)(f).take(duration)
    )
}

Page 33: POC d'une architecture distribuee de calculs financiers

Producers can be combined

case class Portfolio(loans: Seq[Loan]) {
  def rows(implicit ctx: ExecutionContext) =
    Enumerator.interleave(loans.map(_.rows))
}

Page 34: POC d'une architecture distribuee de calculs financiers

Date        Amort   Interests   Total paid

2013-01-01  200 €   50 €        250 €
2014-01-01  200 €   40 €        240 €
2015-01-01  200 €   30 €        230 €
2016-01-01  200 €   20 €        220 €
2017-01-01  200 €   10 €        210 €

2013-01-01  200 €   50 €        250 €
2014-01-01  200 €   40 €        240 €
2015-01-01  200 €   30 €        230 €
2016-01-01  200 €   20 €        220 €
2017-01-01  200 €   10 €        210 €

2013-01-01  200 €   50 €        250 €
2014-01-01  200 €   40 €        240 €
2015-01-01  200 €   30 €        230 €
2016-01-01  200 €   20 €        220 €
2017-01-01  200 €   10 €        210 €

Total: 3450 €

Page 35: POC d'une architecture distribuee de calculs financiers

Iteratee

Consumer as a state machine

Page 36: POC d'une architecture distribuee de calculs financiers

Iteratees consume Input

Page 37: POC d'une architecture distribuee de calculs financiers

object Input {
  case class El[+E](e: E)
  case object Empty
  case object EOF
}

Page 38: POC d'une architecture distribuee de calculs financiers

and propagates a state

Page 39: POC d'une architecture distribuee de calculs financiers

object Step {
  case class Done[+A, E](a: A, remaining: Input[E])
  case class Cont[E, +A](k: Input[E] => Iteratee[E, A])
  case class Error[E](msg: String, input: Input[E])
}

Page 40: POC d'une architecture distribuee de calculs financiers

Diagram: the Enumerator sends the Input El(...) to an Iteratee (def step = ..., val count = 0). The Iteratee computes a new Iteratee (def step = ..., val count = 1) and answers with the Status Continue.

Page 41: POC d'une architecture distribuee de calculs financiers

Diagram: the Enumerator sends the Input EOF to the Iteratee (def step = ..., val count = 1). The Iteratee computes an Iteratee with the same state (val count = 1) and answers with the Status Done.

Page 42: POC d'une architecture distribuee de calculs financiers

Diagram: the Enumerator sends the Input El(...) to the Iteratee (def step = ..., val count = 1). The computation produces an Iteratee holding val error = "Runtime Error" and the Status is Error.

Page 43: POC d'une architecture distribuee de calculs financiers

val last: RowConsumer[Option[Row]] = {
  def step(last: Option[Row]): K[Row, Option[Row]] = {
    case Input.Empty => Cont(step(last))
    case Input.EOF => Done(last, Input.EOF)
    case Input.El(e) => Cont(step(Some(e)))
  }
  Cont(step(Option.empty[Row]))
}
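To make the state-machine idea concrete without the Play dependency, here is a deliberately simplified, synchronous model of the same `last` consumer. It is not the Play iteratees API (which is asynchronous and also carries an Error state); `MiniIteratee` and `run` are hypothetical names introduced only for this illustration.

```scala
// Simplified, synchronous model of the iteratee idea; NOT the Play API.
sealed trait Input[+E]
object Input {
  final case class El[+E](e: E) extends Input[E]
  case object Empty extends Input[Nothing]
  case object EOF extends Input[Nothing]
}

// A consumer is either finished with a result, or waiting for more input.
sealed trait Step[E, A]
final case class Done[E, A](a: A) extends Step[E, A]
final case class Cont[E, A](k: Input[E] => Step[E, A]) extends Step[E, A]

object MiniIteratee {
  // Keeps the last element seen. Each input yields a NEW immutable state.
  def last[E]: Step[E, Option[E]] = {
    def step(acc: Option[E]): Input[E] => Step[E, Option[E]] = {
      case Input.Empty => Cont(step(acc))
      case Input.EOF   => Done(acc)
      case Input.El(e) => Cont(step(Some(e)))
    }
    Cont(step(None))
  }

  // Hypothetical driver playing the producer role: feeds every element,
  // then EOF, and returns the result if the consumer reached Done.
  def run[E, A](elements: Seq[E], consumer: Step[E, A]): Option[A] = {
    val afterAll = elements.foldLeft(consumer) {
      case (Cont(k), e) => k(Input.El(e))
      case (done, _)    => done
    }
    afterAll match {
      case Done(a) => Some(a)
      case Cont(k) => k(Input.EOF) match {
        case Done(a) => Some(a)
        case Cont(_) => None
      }
    }
  }
}
```

For example, `MiniIteratee.run(Seq(1, 2, 3), MiniIteratee.last[Int])` walks through three Cont states and ends in Done with `Some(3)` as the last element.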

Page 44: POC d'une architecture distribuee de calculs financiers

object AmountConsumer {
  val sum: AmountConsumer[Amount] =
    (rows: Iterable[Amount]) => rows.foldLeft(Amount(0, EUR))(_ + _)
}

Page 45: POC d'une architecture distribuee de calculs financiers

object AmountConsumer {
  val sum: AmountConsumer[Amount] =
    Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)
}

Page 46: POC d'une architecture distribuee de calculs financiers

import RowTransformer.totalPaid
import AmountConsumer.sum

val totalPaidComputation: Future[Amount] = portfolio.rows.run(sum)

Page 47: POC d'une architecture distribuee de calculs financiers

import RowTransformer.totalPaid
import AmountConsumer.sum

val totalPaidComputation: Future[Amount] = portfolio.rows |>>> sum

Page 48: POC d'une architecture distribuee de calculs financiers

Enumeratee

map and filter

Page 50: POC d'une architecture distribuee de calculs financiers

Diagram: the Enumerator (producer) sends Input[A] to the Enumeratee (transformation), which passes Input[B] on to the Iteratee (consumer); the Iteratee answers with a Status.

Page 51: POC d'une architecture distribuee de calculs financiers

Data transformation

object RowTransformer {
  val totalPaid =
    Enumeratee.map[Row](row =>
      row.interests + row.amortization
    )
}

Page 52: POC d'une architecture distribuee de calculs financiers

Data filtering

def until(date: DateMidnight) = Enumeratee.filter[Row](
  row => !row.date.isAfter(date)
)

Page 53: POC d'une architecture distribuee de calculs financiers

Before:
type RowProducer = Iterable[Row]
type RowTransformer[T] = (Row => T)
type AmountConsumer[T] = (Iterable[Amount] => T)

After:
type RowProducer = Enumerator[Row]
type RowTransformer[T] = Enumeratee[Row, T]
type AmountConsumer[T] = Iteratee[Amount, T]

Page 54: POC d'une architecture distribuee de calculs financiers

Futures are composable

map, flatMap, filter

onComplete, onSuccess, onFailure, recover
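As a small illustration of Future composition using only the Scala standard library, the sketch below combines two asynchronous amounts and recovers from a failure. The object and method names (`FutureComposition`, `totalPaid`, `orZero`, `await`) and the 5 second timeout are assumptions made for this example; they do not come from the talk.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureComposition {
  // flatMap / map (written here as a for-comprehension) combine two
  // asynchronous results without blocking a thread.
  def totalPaid(interests: Future[Int], amortization: Future[Int]): Future[Int] =
    for {
      i <- interests
      a <- amortization
    } yield i + a

  // recover replaces a failure with a fallback value.
  def orZero(amount: Future[Int]): Future[Int] =
    amount.recover { case _: RuntimeException => 0 }

  // Blocking happens only at the edge, for instance in a test.
  def await[T](future: Future[T]): T = Await.result(future, 5.seconds)
}
```

A caller would typically keep composing Futures and only call something like `await` at the very end, as the test on the next slide does with `Await.result`.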

Page 55: POC d'une architecture distribuee de calculs financiers

// Produce rows
val totalPaidComputation: Future[Amount] =
  portfolio.rows &> totalPaid |>>> sum

// Blocking the thread to wait for the result
val totalPaid =
  Await.result(totalPaidComputation, atMost = defaultTimeout)

totalPaid should equal(3480 EUR)

Page 56: POC d'une architecture distribuee de calculs financiers

We still have function composition, and the code is prepared for asynchronism

Page 57: POC d'une architecture distribuee de calculs financiers

RowProducer

// Loan
Enumerator.enumerate(Stream.iterate(first)(f).take(duration))

// Portfolio
Enumerator.interleave(loans.map(_.rows))

+ on demand computation
+ parallel computation

Page 58: POC d'une architecture distribuee de calculs financiers

RowTransformer

val totalPaid = Enumeratee.map[Row](row =>
  row.interests + row.amortization
)

+ Function composition
+ map, filter, ...

Page 59: POC d'une architecture distribuee de calculs financiers

AmountConsumer

def sum = Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)

+ Function composition
+ Asynchronism

Page 60: POC d'une architecture distribuee de calculs financiers

                      5000 loans, 50 rows
Step 1: Stream API    ~ 560 ms
Step 2: Iteratees     ~ 3500 ms ?

Page 61: POC d'une architecture distribuee de calculs financiers

simple test

complex test: Thread.sleep(((Math.random() * 1000) % 2).toLong)

Page 62: POC d'une architecture distribuee de calculs financiers

                      5000 loans, 50 rows   with pause
Step 1: Stream API    ~ 560 ms              ~ 144900 ms
Step 2: Iteratees     ~ 3500 ms             ~ 157285 ms

?

Page 63: POC d'une architecture distribuee de calculs financiers

The cost of using this implementation of iteratees is greater than the gain from interleaving for such small operations.

Page 64: POC d'une architecture distribuee de calculs financiers

Bulk interleaving

Page 65: POC d'une architecture distribuee de calculs financiers

// Portfolio
val split = loans.map(_.stream).grouped(loans.size / 4)
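The idea of bulk interleaving, grouping loans into a few chunks so that each unit of parallel work is large enough to amortize the overhead, can be sketched with standard-library Futures instead of Play enumerators. This is a swapped-in technique for illustration only: each loan is represented by a hypothetical `Seq[Int]` of amounts paid, and `BulkInterleaving`, `totalPaid`, `parts` and `await` are assumed names.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object BulkInterleaving {
  // Each loan is represented by the amounts it pays (hypothetical stand-in).
  // The portfolio is split into about `parts` chunks; every chunk is summed
  // sequentially inside one Future, and the chunks run in parallel.
  def totalPaid(loans: Seq[Seq[Int]], parts: Int = 4): Future[Int] = {
    // grouped requires a positive size, so small portfolios get chunks of 1.
    val chunkSize = math.max(1, loans.size / parts)
    val partialSums: List[Future[Int]] =
      loans.grouped(chunkSize).toList.map { chunk =>
        Future(chunk.flatten.sum)
      }
    Future.sequence(partialSums).map(_.sum)
  }

  def await[T](future: Future[T]): T = Await.result(future, 10.seconds)
}
```

Note that `grouped` rejects a size of zero, so an expression like `loans.size / 4` needs a guard such as the `math.max(1, ...)` above when a portfolio may hold fewer than four loans.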

Page 66: POC d'une architecture distribuee de calculs financiers

                      5000 loans, 50 rows   with pause
Step 1: Stream API    ~ 560 ms              ~ 144900 ms
Step 2: Iteratees     ~ 4571 ms             ~ 39042 ms

Page 68: POC d'une architecture distribuee de calculs financiers

Pros:
+ On demand computation
+ Function composition

Cons:
- Sequential computation
- Synchronism

Page 69: POC d'une architecture distribuee de calculs financiers

Pros:
+ On demand computation
+ Function composition
+ Parallel computation
+ Asynchronism

Cons:
- No error management
- No elasticity
- No resilience

Page 70: POC d'une architecture distribuee de calculs financiers

Step 3: Akka actor

Page 71: POC d'une architecture distribuee de calculs financiers

Integrating Akka

libraryDependencies ++= Seq(
  "com.typesafe.akka" %% "akka-actor" % "2.2.0")

Page 72: POC d'une architecture distribuee de calculs financiers

Actors are objects

They communicate with each other by messages

asynchronously

Page 73: POC d'une architecture distribuee de calculs financiers

case class Compute(loan: Loan)

class Backend extends Actor {
  def receive = {
    case Compute(loan) =>
      sender.tell(msg = loan.stream.toList, sender = self)
  }
}

Page 74: POC d'une architecture distribuee de calculs financiers

case class Loan( ... ) {
  def rows(implicit calculator: ActorRef, ctx: ExecutionContext) = {
    val responseFuture = ask(calculator, Compute(this))
    val rowsFuture = responseFuture.mapTo[List[Row]]
    rowsFuture.map(Enumerator.enumerate(_))
  }
}

Page 75: POC d'une architecture distribuee de calculs financiers

val system = ActorSystem.create("ScalaIOSystem")

val calculator = system.actorOf(
  Props[Backend].withRouter(RoundRobinRouter(nrOfInstances = 10)),
  "calculator")

Page 76: POC d'une architecture distribuee de calculs financiers

Supervision

val simpleStrategy = OneForOneStrategy() {
  case _: AskTimeoutException => Resume
  case _: RuntimeException => Escalate
}

system.actorOf(Props[Backend]....withSupervisorStrategy(simpleStrategy)), "calculator")

Page 77: POC d'une architecture distribuee de calculs financiers

Diagram: a Router dispatches Compute messages to Routee 1, Routee 2 and Routee 3.
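The round-robin dispatch performed by the router can be pictured with the plain-Scala sketch below. It only illustrates the selection rule and is not Akka's implementation; `RoundRobin` and `next` are hypothetical names, and the routees are represented by arbitrary values.

```scala
import java.util.concurrent.atomic.AtomicLong

// Plain-Scala illustration of round-robin selection; NOT Akka's router.
final class RoundRobin[A](routees: Vector[A]) {
  require(routees.nonEmpty, "at least one routee is needed")

  // Thread-safe counter of how many messages have been dispatched so far.
  private val counter = new AtomicLong(0)

  // Each call returns the next routee, wrapping around after the last one.
  def next(): A = {
    val index = (counter.getAndIncrement() % routees.size).toInt
    routees(index)
  }
}
```

With three routees, successive calls to `next()` return routee 1, 2, 3, then 1 again, so consecutive Compute messages are spread evenly.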

Page 78: POC d'une architecture distribuee de calculs financiers

Diagram: Router with Routee 1, Routee 2 and Routee 3; an AskTimeoutException is handled with Resume.

Page 79: POC d'une architecture distribuee de calculs financiers

Diagram: the Router and Routee 1, Routee 2 and Routee 3 all live in one Actor System.

Page 80: POC d'une architecture distribuee de calculs financiers

RowProducer

// Loan
ask(calculator, Compute(this))
  .mapTo[List[Row]]
  .map(Enumerator.enumerate(_))

// Portfolio
Enumerator.interleave(loans.map(_.rows))

+ parallel computation
- on demand computation

Page 81: POC d'une architecture distribuee de calculs financiers

RowTransformer

val totalPaid = Enumeratee.map[Row](row =>
  row.interests + row.amortization
)

+ Nothing changed

Page 82: POC d'une architecture distribuee de calculs financiers

AmountConsumer

def sum = Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)

+ Nothing changed

Page 83: POC d'une architecture distribuee de calculs financiers

                      5000 loans, 50 rows   with pause
Step 1: Stream API    ~ 560 ms              ~ 144900 ms
Step 2: Iteratees     ~ 4571 ms             ~ 39042 ms
Step 3: Akka actor    ~ 4271 ms             ~ 40882 ms

Page 84: POC d'une architecture distribuee de calculs financiers

Pros:
+ On demand computation
+ Function composition
+ Parallel computation
+ Asynchronism

Cons:
- No error management
- No elasticity
- No resilience

Page 86: POC d'une architecture distribuee de calculs financiers

Pros:
+ Function composition
+ Parallel computation
+ Asynchronism
+ Error management

Cons:
- No on demand computation
- No elasticity
- No resilience

Page 87: POC d'une architecture distribuee de calculs financiers

Step 4: Akka cluster

Page 88: POC d'une architecture distribuee de calculs financiers

Integrating Akka Cluster

libraryDependencies ++= Seq(
  "com.typesafe.akka" %% "akka-cluster" % "2.2.0")

Page 89: POC d'une architecture distribuee de calculs financiers

Cluster Router: ClusterRouterConfig

Can create actors on different nodes of the cluster:
  Role
  Local actors or not
  Control the number of actors per node, per system

Page 90: POC d'une architecture distribuee de calculs financiers

Cluster Router: AdaptiveLoadBalancingRouter

Collects metrics (CPU, heap, load) via JMX or Hyperic Sigar and balances the load accordingly

Page 91: POC d'une architecture distribuee de calculs financiers

Before:
val calculator = system.actorOf(
  Props[Backend].withRouter(RoundRobinRouter(nrOfInstances = 10)),
  "calculator")

After:
val calculator = system.actorOf(
  Props[Backend].withRouter(
    ClusterRouterConfig(local = localRouter, settings = clusterSettings)),
  "calculator")

Page 92: POC d'une architecture distribuee de calculs financiers

Elasticity

Diagram: one Router with routees spread across three Actor Systems, two routees per system.

Page 93: POC d'une architecture distribuee de calculs financiers

application.conf

cluster {
  seed-nodes = [
    "akka.tcp://[email protected]:2551",
    "akka.tcp://[email protected]:2552"]
  auto-down = on
}

Page 94: POC d'une architecture distribuee de calculs financiers

Resilience

Diagram: one Router with routees spread across three Actor Systems, two routees per system.

Page 95: POC d'une architecture distribuee de calculs financiers

RowProducer

// Loan
ask(calculator, Compute(this))
  .mapTo[List[Row]]
  .map(Enumerator.enumerate(_))

// Portfolio
Enumerator.interleave(loans.map(_.rows))

+ Nothing changed

Page 96: POC d'une architecture distribuee de calculs financiers

RowTransformer

val totalPaid = Enumeratee.map[Row](row =>
  row.interests + row.amortization
)

+ Nothing changed

Page 97: POC d'une architecture distribuee de calculs financiers

AmountConsumer

def sum = Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)

+ Nothing changed

Page 98: POC d'une architecture distribuee de calculs financiers

Pros:
+ Function composition
+ Parallel computation
+ Asynchronism
+ Error management

Cons:
- No on demand computation
- No elasticity
- No resilience

Page 100: POC d'une architecture distribuee de calculs financiers

Pros:
+ Function composition
+ Parallel computation
+ Asynchronism
+ Error management
+ Elasticity
+ Resilience

Cons:
- Network serialization
- No on demand computation

Page 101: POC d'une architecture distribuee de calculs financiers

                       5000 loans, 50 rows   with pause
Step 1: Stream API     ~ 560 ms              ~ 144900 ms
Step 2: Iteratees      ~ 4571 ms             ~ 39042 ms
Step 3: Akka actor     ~ 4271 ms             ~ 40882 ms
Step 4: Akka cluster   ~ 6213 ms             ~ 77957 ms

Akka cluster: 1 node / 2 actors

Page 102: POC d'une architecture distribuee de calculs financiers

                       5000 loans, 50 rows   with pause
Step 1: Stream API     ~ 560 ms              ~ 144900 ms
Step 2: Iteratees      ~ 4571 ms             ~ 39042 ms
Step 3: Akka actor     ~ 4271 ms             ~ 40882 ms
Step 4: Akka cluster   ~ 5547 ms             ~ 39695 ms

Akka cluster: 2 nodes / 4 actors

Page 103: POC d'une architecture distribuee de calculs financiers

Conclusion

Page 104: POC d'une architecture distribuee de calculs financiers

Step 1: Stream API
  powerful library
  low memory
  performance when single threaded

Step 2: Iteratees
  elegant API
  enable asynchronism and parallelism

Step 3: Akka actor
  error management
  control on parallel execution via configuration

Step 4: Akka cluster
  elasticity
  resilience
  monitoring

Page 105: POC d'une architecture distribuee de calculs financiers

It’s all about trade-offs

Page 106: POC d'une architecture distribuee de calculs financiers

But do you really need distribution?

Page 107: POC d'une architecture distribuee de calculs financiers

Hot subject

Recent blog post from «Mandubian» on Scalaz stream machines and iteratees [1]

Recent presentation from «Heather Miller» on spores (distributable closures) [2]

Recent release of Scala 2.10.3 and performance optimization of Promise

Release candidate of play-iteratee module with performance optimization

Lots of stuff in the roadmap of Akka cluster 2.3.0

Page 109: POC d'une architecture distribuee de calculs financiers

THANK YOU FOR watching

Merci!