On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next...

53
pICKLES & sPORES HEATHER MILLER On IMPROVING SCALA’S SUPPORT FOR DISTRIBUTED PROGRAMMING

Transcript of On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next...

Page 1: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

pICKLES&sPORES

HEATHER MILLER

On

IMPROVING SCALA’S SUPPORT FOR DISTRIBUTED PROGRAMMING

Page 2: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

with:PHILIPP HALLER

EUGENE BURMAKO

MARTIN ODERSKY

TYPESAFE

EPFL

EPFL/TYPESAFE

Page 3: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

wHAT IS THIS TALK ABOUT?

Making distributed programming

easier in scala

Page 4: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

This kind of distributed system

Page 5: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

This kind of distributed system

insert social network of your choice

here

ALso!

Page 6: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Bottomline:Machines CommunicatingHow can we simplify distribution at the language-level?

Page 7: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

sCALA pICKLINGAgenda

Spores

Page 8: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Pickles!scala.pickling

https://github.com/scala/pickling

Page 9: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

What is it?PICKLING == SERIALIZATION == MARSHALLING

very different from java serialization

https://github.com/scala/pickling

Page 10: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Closed!Slow!wait, why do we care?

not serializable exceptions at runtime

can’t retroactively make

classes serializable

https://github.com/scala/pickling

Page 11: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Enter: Scala Picklingfast: Serialization code generated at compile-

time and inlined at the use-site.

Flexible:Using typeclass pattern, retroactively make types serializable

NO BOILERPLATE:Typeclass instances generated at compile-time

pluggable formats:Effortlessly change format of serialized data: binary, JSON, invent your own!

typesafe:Picklers are type-specialized. Catch errors at compile-time! https://github.com/scala/pickling

Page 12: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

What does it look like?scala>  import  scala.pickling._  import  scala.pickling._  

https://github.com/scala/pickling

!scala>  import  json._  import  json._  !scala>  case  class  Person(name:  String,  age:  Int)  defined  class  Person  !scala>  Person("John  Oliver",  36)  res0:  Person  =  Person(John  Oliver,36)

scala>  res0.pickle  res1:  scala.pickling.json.JSONPickle  =    JSONPickle({      "tpe":  "Person",      "name":  "John  Oliver",      "age":  36  })  

Page 13: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

and... it’s pretty fast

https://github.com/scala/pickling

Page 14: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

collections: Time Benchmarks

Page 15: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Benchmarkscollections: free Memory

(more is better)

Page 16: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Benchmarkscollections: size

Page 17: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Benchmarksgeotrellis: time

Page 18: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Benchmarksevactor: time

Java runs out of memory

Page 19: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Benchmarksevactor: time (no java, more events)

Page 20: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

that’s just thebtw,

default behavior...

you can really customize scala pickling too.

https://github.com/scala/pickling

Page 21: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Previous examples used default behavior

Customizing Pickling

Pickling is very customizable

Before we can show these things,let's have a look at the building block of the framework...

Generated picklers Standard pickle format

Custom picklers for specific types Custom pickle format

Page 22: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

What about subclasses?Wait,

https://github.com/scala/pickling

abstract  class  Person  {        def  name:  String  }      case  class  Firefighter(name:  String,  since:  Int)  extends  Person  case  class  Doctor(name:  String,  since:  Int)  extends  Person  !class  Position(title:  String,  person:  Person)    

Scala is object-oriented too, remember?

To generate a pickler for Position, combine picklers for String and Person.

but person could dynamically be a

firefighter or doctor!

Page 23: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Goal: modular composition of picklers !

Pickling is very customizable

A `Pickler[Person]` is not good enough, it can only pickle instances whose dynamic type is `Person`. !

Generated picklers Standard pickle format

Custom picklers for specific types Custom pickle format

Subtyping

Page 24: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Pickler Combinators

trait  Pickler[T]  {    //  returns  next  write  position    def  pickle(arr:  Array[Byte],  i:  Int,  x:  T):  Int    //  returns  result  plus  next  read  position    def  unpickle(arr:  Array[Byte],  i:  Int):  (T,  Int) }

Elegant programming pearl that comes from functional programming.

A composable and "constructive" way to think about persisting data.

Compose picklers for simple types to build picklers for more complicated types

What is a pickler? simplified version of what’s actually used in scala- pickling

https://github.com/scala/pickling

Page 25: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

https://github.com/scala/pickling

Pickler CombinatorsWe need 2 things:

fully-implemented picklers for some basic types like primitives1 Picklers for base types

Functions that combine existing picklers to build compound picklers2

example: combinator that takes a Pickler[T] and returns a Pickler[List[T]]

Page 26: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Pickler CombinatorsBuild a pickler for pairs (Int, String), combine an Int pickler and a String pickler

val  myPairPickler  =  tuple2Pickler(intPickler,  stringPickler)

What’s the type?

Can we combine them automatically?

Pickler[T], can pickle objects of type T

def pickle(implicit pickler: Pickler[(Int, String)]) = { pickler.pickle((32, “yay!”)) }

Goal:

Can take intPickler and stringPickler as implicit parameterstuple2Pickler can be an implicit def

https://github.com/scala/pickling

Page 27: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Implicit Picklers

case  class  Person(name:  String,  age:  Int,  salary:  Int)  !class  CustomPersonPickler(implicit  val  format:  PickleFormat)  extends  SPickler[Person]  {      def  pickle(picklee:  Person,  builder:  PBuilder):  Unit  =  {          builder.beginEntry(picklee)          builder.putField("name",  b  =>              b.hintTag(FastTypeTag.ScalaString).beginEntry(picklee.name).endEntry())          builder.putField("age",  b  =>              b.hintTag(FastTypeTag.Int).beginEntry(picklee.age).endEntry())          builder.endEntry()      }  }  !implicit  def  genCustomPersonPickler(implicit  format:  PickleFormat)  =      new  CustomPersonPickler  

customize what you pickle!

https://github.com/scala/pickling

Page 28: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Pickle Formatoutput any format!

       trait  PickleFormat  {                  type  PickleType  <:  Pickle                  def  createBuilder():  PBuilder                  def  createReader(pickle:  PickleType,  mirror:  Mirror):  PReader              }  !        trait  PBuilder  extends  Hintable  {              def  beginEntry(picklee:  Any):  this.type              def  putField(name:  String,  pickler:  this.type  =>  Unit):  this.type              def  endEntry():  Unit              def  beginCollection(length:  Int):  this.type              def  putElement(pickler:  this.type  =>  Unit):  this.type              def  endCollection(length:  Int):  Unit              def  result():  Pickle          }  

https://github.com/scala/pickling

Page 29: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Pickle Format

https://gist.github.com/heathermiller/5760171

example

Output edn, Clojure’s data transfer format.

talk to a clojure app

toy builder implementation:

scala>  import  scala.pickling._  import  scala.pickling._  !scala>  import  edn._  import  edn._  !scala>  case  class  Person(name:  String,  kidsAges:  Array[Int])  defined  class  Person  !scala>  val  joe  =  Person("Joe",  Array(3,  4,  13))  joe:  Person  =  Person(Joe,[I@3d925789)  !scala>  joe.pickle.value  res0:  String  =  #pickling/Person  {  :name  "Joe"  :kidsAges  [3,  4,  13]  }

Page 30: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

What can be pickled?if there is an implicit of type Pickler[Foo] in scope, you can pickle instances of it

types for which an implicit pickler is in scope

types for which our framework can generate picklers

classes case classes generic classes singleton objects primitives & primitive arrays ...

IMPORTANT: The implicit picklers are used in the generation!

can’t (yet): instances of inner classes, Types https://github.com/scala/pickling

Page 31: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Status

Scala 2.11 as targetGoal:Plan:

1.0 release within the next few months

SIP for Scala 2.11

Integration with sbt, Spark, and Akka, ...

Experiment: use Scala-pickling to speed up Scala compiler

Release 0.8.0 for Scala 2.10.2

No support for inner classes, yet

ScalaCheck tests

Very soon: support for cyclic object graphs (for release 0.9.0)

https://github.com/scala/pickling

Page 32: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

And now onto something completely different.

Page 33: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

spores!scala

http://docs.scala-lang.org/sips/pending/spores.html

Page 34: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

butwecan’treallydistributethem.

Closures are great.

Page 35: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

we can’t really distribute themBut,

WHY?

because they capture stuff that’s not serializable.oFTEN NOT SERIALIZABLE

Easy to reference something and unknowingly capture it. ACCIDENTAL CAPTURE.

...enclosing

instead of compile-time checks.runtime errors

for a user, often unclear whether it’s a user-error or the framework

Who’s fault is it?

Page 36: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

we can’t really distribute themBut,

Consequences that follow from

these problems...

!...not just in their public APIs, but private ones too.

framework builders avoid them

Users shoot themselves in the foot and blame framework.

rightfully so.

When picking battles, framework designers tend to avoid issues with closures.

Page 37: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

eNTER:

spores

Page 38: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

wHAT ARE THEY?

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

behavior

small units of possibly mobile

functional

Page 39: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Spores

what are they?A closure-like abstraction for use in distributed or concurrent environments.

goal:Well-behaved closures with controlled environments that can avoid various hazards.

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

Potential hazards when using closures incorrectly: • memory leaks • race conditions due to capturing mutable references • runtime serialization errors due to unintended

capture of references

Page 40: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

sparkmotivating example:

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

class MyCoolRddApp { val param = 3.14 val log = new Log(...) ... def work(rdd: RDD[Int]) { rdd.map(x => x + param) .reduce(...) } }

Problem:

not serializable because it captures this of type MyCoolRddApp which is itself not serializable

(x => x + param)

Page 41: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Akka/futuresmotivating example:

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

def  receive  =  {    case  Request(data)  =>        future  {            val  result  =  transform(data)            sender  !  Response(result)        }  }

Problem: Akka actor spawns future to concurrently process incoming results

akka actor spawns a

future to concurrently

process incoming reqs

not a stable value! it’s a method call!

Page 42: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Serializationmotivating example:

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

case class Helper(name: String) !class Main { val helper = Helper("the helper") ! val fun: Int => Unit = (x: Int) => { val result = x + " " + helper.toString println("The result is: " + result) } }

Problem:fun not serializable. Accidentally captures this since helper.toString is really this.helper.toString, and Main (the type of this) is not serializable.

Page 43: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

We need safer closuresOk. Got it.

for concurrent &distributed scenarios.sure.

what do sporeslook like?

Page 44: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

What do spores look like?

Basic usage:val  s  =  spore  {      val  h  =  helper      (x:  Int)  =>  {          val  result  =  x  +  "  "  +  h.toString          println("The  result  is:  "  +  result)      }  }

THE BODY OF A SPORE CONSISTS OF 2 PARTS

2 a closure

a sequence of local value (val) declarations only (the “spore header”), and1

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

Page 45: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Spore

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

1. All captured variables are declared in the spore header, or using capture

2. The initializers of captured variables are executed once, upon creation of the spore

3. References to captured variables do not change during the spore’s execution

vsclosures( )A Guarantees...

Page 46: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

Spores&

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

closures

Evaluation semantics:Remove the spore marker, and the code behaves as before

spores & closures are related:

You can write a full function literal and pass it to something that expects a spore. (Of course, only if the function literal satisfies the spore rules.)

Page 47: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

How can you use a spore?

In APIs

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

def  sendOverWire(s:  Spore[Int,  Int]):  Unit  =  ...  //  ...  sendOverWire((x:  Int)  =>  x  *  x  -­‐  2)  

If you want parameters to be spores, then you can write it this way

Page 48: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

How can you use a spore?

for-comprehensions

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

def  lookup(i:  Int):  DCollection[Int]  =  ...  val  indices:  DCollection[Int]  =  ...  !for  {  i  <-­‐  indices              j  <-­‐  lookup(i)  }  yield  j  +  capture(i)  !trait  DCollection[A]  {    def  map[B](sp:  Spore[A,  B]):  DCollection[B]    def  flatMap[B](sp:  Spore[A,  DCollection[B]]):  DCollection[B]  }

Page 49: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

get you?

what doesall of that

Page 50: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

what does all of that

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

get you?Since...

Captured expressions are evaluated upon spore creation.

!

Spores are like function values with an immutable environment.

Plus, environment is specified and checked, no accidental capturing.

That means...

Page 51: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

what does all of that

http://docs.scala-lang.org/sips/pending/spores.htmlProposed for inclusion in Scala 2.11

get you?or, graphically...

During execution

Right after creation

Spores closures

1

2

5 ‘a’

5 ‘a’

??

I’m in ur stuff

draggin around ur object graf

Page 53: On pICKLES - Miller · 2014-01-23 · Goal: Scala 2.11 as target Plan: 1.0 release within the next few months SIP for Scala 2.11 Integration with sbt, Spark, and Akka, ... Experiment:

StatusSIP-21 Spores: on docs.scala-lang.org/sips now!

Get involved in the discussion!

Pull request for Scala 2.11 and Akka 2.2.1 in preparation

Integration with Scala-pickling planned

http://docs.scala-lang.org/sips/pending/spores.html